Dirk Kutscher

Personal web page

Archive for the ‘multimedia’ tag

COMETS accepted at IEEE TMM


Our paper on COMETS: Coordinated Multi-Destination Video Transmission with In-Network Rate Adaptation has been accepted for publication by IEEE Transactions on Multimedia (TMM).

Abstract

Large-scale video streaming events attract millions of simultaneous viewers, stressing existing delivery infrastructures. Client-driven adaptation reacts slowly to shared congestion, while server-based coordination introduces scalability bottlenecks and single points of failure. We present COMETS, a coordinated multi-destination video transmission framework that leverages information-centric networking principles such as request aggregation and in-network state awareness to enable scalable, fair, and adaptive rate control. COMETS introduces a novel range-interest protocol and distributed in-network decision process that aligns video quality across receiver groups while minimizing redundant transmissions. To achieve this, we develop a lightweight distributed optimization framework that guides per-hop quality adaptation without centralized control. Extensive emulation shows that COMETS consistently improves bandwidth utilization, fairness, and user-perceived quality of experience over DASH, MoQ, and ICN baselines, particularly under high concurrency. The results highlight COMETS as a practical, deployable approach for next-generation scalable video delivery.

Introduction to COMETS

Large streaming events now typically attract millions of viewers, and the demand for concurrent video consumption is expanding rapidly. For example, the number of monthly sports streaming viewers has grown from 57 million in 2021 to more than 90 million in 2025, with more than 17% of users participating in multiple streams simultaneously. This growth exposes a fundamental challenge for existing video delivery architectures: how to maintain consistent, fair Quality of Experience (QoE) when thousands of users compete for shared bottleneck resources.

Existing infrastructures are not designed for effective coordination and resource sharing among large numbers of simultaneous viewers. As a result, concurrent requests for the same content segments are managed inefficiently, and network resource allocation is poorly coordinated among users of the shared infrastructure. These inefficiencies lead to redundant data transmission and suboptimal bandwidth utilization, ultimately impairing user QoE through increased network congestion, unstable bitrates, and more frequent buffering, especially during peak usage. To address these challenges, an ideal video delivery system must be coordinated, scalable, and adaptive, so that it maximizes bandwidth utilization while ensuring a fair, high-quality experience for all users. Such a system should aggregate requests for the same content to eliminate redundancy, make intelligent in-network decisions, and distribute computational load to avoid bottlenecks.

Figure 1: Performance comparison between baseline MoQ and server-optimized MoQ under increasing user load. Panels: (a) latency vs. user load; (b) mean bitrate vs. user load.

Current solutions exhibit fundamental limitations with respect to coordination and scalability. Client-adaptive approaches like Dynamic Adaptive Streaming over HTTP (DASH) enable individual clients to select video representations independently. However, their uncoordinated decisions, based on delayed and localized network views, lag behind the actual state of shared network bottlenecks, leading to bandwidth contention and bitrate oscillations. Server-side approaches address these limitations by centralizing adaptation logic, enabling optimal resource allocation through comprehensive network and user demand assessments. However, managing state and control interactions for numerous users introduces scalability challenges, and centralized decision architectures create single points of failure that compromise real-time performance. Our experiments (Figure 1) demonstrate that even state-of-the-art server-optimized Media over QUIC (MoQ) ultimately encounters the same scalability barriers as baseline approaches under high concurrency.

Key Insights

We observe that effective multi-user video streaming requires two properties: (I) aggregation-aware delivery, where identical requests are merged to eliminate redundant transmissions, and (II) distributed coordination, where adaptation decisions are made at points of request convergence rather than at centralized endpoints. This leads us to consider Information-Centric Networking (ICN). ICN provides inherent advantages for multi-user content distribution through in-network caching and request aggregation in systems like CCNx/NDN. While these features reduce redundant transmissions by merging duplicate requests at forwarders, existing ICN-based solutions focus on hop-by-hop adaptation rather than coordinated multi-user rate adaptation. They suffer from decision lag and fail to ensure efficient convergence toward stable, fair rate allocations (i.e., equitable QoE distribution). To address these limitations, we present COMETS (Coordinated Multi-Destination Video Transmission with In-Network Rate Adaptation), a scalable, ICN-based multi-destination video streaming framework engineered to resolve three challenges in large-scale video delivery: redundant data transmission, lack of scalable coordination, and inefficient system convergence.
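The request aggregation mentioned above can be illustrated with a toy Pending Interest Table (PIT), the structure NDN-style forwarders use to merge duplicate requests. This is a minimal sketch and not code from COMETS or from any NDN implementation: the `Forwarder` class, its method names, and the content name `/live/match/seg42` are all hypothetical.

```python
from collections import defaultdict


class Forwarder:
    """Toy forwarder with a Pending Interest Table (hypothetical, for illustration)."""

    def __init__(self):
        # content name -> set of downstream faces waiting for that content
        self.pit = defaultdict(set)
        self.upstream_requests = 0

    def on_interest(self, name, face):
        """Record the requester; forward upstream only for the first request."""
        first = name not in self.pit
        self.pit[name].add(face)
        if first:
            self.upstream_requests += 1
        return first

    def on_data(self, name):
        """Consume the PIT entry and return the faces that receive the data."""
        return sorted(self.pit.pop(name, set()))


fwd = Forwarder()
for client in ["c1", "c2", "c3"]:
    fwd.on_interest("/live/match/seg42", client)

print(fwd.upstream_requests)             # 1
print(fwd.on_data("/live/match/seg42"))  # ['c1', 'c2', 'c3']
```

Three clients ask for the same segment, but only one request travels upstream, and the single returning data packet is fanned out to all three.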

Design Philosophy

COMETS is based on three principles that distinguish it from prior work:

(I) Group-aware rather than individual optimization. Instead of each client independently selecting bitrates, COMETS groups receivers with similar capabilities and network conditions, then aligns video quality across each group. This transforms the combinatorial complexity of individual decisions into tractable group-level optimization.
(II) Proactive rather than reactive adaptation. Unlike existing ICN approaches that react to congestion signals, COMETS uses a distributed Lagrangian framework where forwarders exchange dual variables (price signals) to anticipate upstream constraints. This enables proactive coordination without centralized state collection.
(III) Deployable overlay architecture. COMETS is architecturally flexible and can be deployed as an application-layer overlay network over existing Internet protocols (e.g., HTTP/QUIC over UDP), similar to Content Delivery Networks (CDNs) like Akamai or Cloudflare. It requires no modifications to network infrastructure and assumes trusted intermediate nodes under the same administrative domain, enabling immediate integration into today’s networks without network-layer changes.

While COMETS shares MoQ’s vision of moving intelligence into the network, it avoids central bottlenecks by enabling per-hop optimization via ICN primitives, and is deployable over MoQ-capable infrastructures as an overlay.
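To give a feel for how price signals (dual variables) can coordinate rates without a central controller, here is a generic dual-decomposition sketch for a single shared link. It is not the COMETS algorithm, whose formulation is given in the paper. The logarithmic utility, the step size, and the function names `best_response`, `update_price`, and `negotiate` are assumptions chosen only to keep the example small.

```python
def best_response(price, max_rate):
    """Rate that maximizes log(x) - price * x, capped at max_rate."""
    return min(max_rate, 1.0 / price)


def update_price(price, demand, capacity, step=0.05):
    """Dual gradient step: the price rises when demand exceeds capacity."""
    return max(1e-6, price + step * (demand - capacity))


def negotiate(n_groups, capacity, max_rate=10.0, iterations=200):
    """Iterate price updates for one link shared by n_groups receiver groups."""
    price = 1.0
    for _ in range(iterations):
        rates = [best_response(price, max_rate) for _ in range(n_groups)]
        price = update_price(price, sum(rates), capacity)
    rates = [best_response(price, max_rate) for _ in range(n_groups)]
    return price, rates


price, rates = negotiate(n_groups=4, capacity=8.0)
print(round(price, 3), [round(r, 3) for r in rates])
```

In this toy setting, four groups sharing a link of capacity 8 settle at a price of about 0.5 and an equal rate of about 2 each. Each group only needs the current price, not the other groups' state. The step size is tuned for this example; other capacities or group counts may need a different value to converge.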

Our Approach. COMETS transforms video streaming from isolated endpoint control into coordinated in-network negotiation, with four key contributions:

Range-interest protocol for coordinated adaptation. We introduce a novel protocol where clients express resolution ranges rather than specific quality levels. This enables forwarders to aggregate requests and optimize resolution assignments across user groups, shifting adaptation logic from endpoints to the network fabric.
Scalable architecture without central bottlenecks. COMETS distributes adaptation logic across forwarders, combining request aggregation with per-hop decision-making.
Distributed optimization with closed-form solutions. We formalize coordinated multi-destination video transmission as a unified Integer Linear Programming (ILP) problem and develop a two-stage distributed algorithm. Unlike prior ICN approaches that rely on heuristics or reactive congestion signals, our method derives analytical closed-form solutions for per-hop quality decisions, enabling proactive, group-aware rate allocation with provable convergence guarantees.
Implementation and evaluation. Through extensive emulation on Mini-NDN with up to 300 concurrent clients, we demonstrate that COMETS achieves consistent QoE scores above 0.7 across all tested scales, while baselines degrade below 0.5 at high concurrency. COMETS maintains near-perfect fairness (Jain’s index ≥ 0.93) and achieves optimization convergence within 50 ms, up to 3.7× faster than centralized approaches.
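The range-interest idea from the first contribution can be sketched as follows: each client states an acceptable resolution range, and a forwarder merges requests for the same segment by intersecting those ranges. This is only an illustration under stated assumptions. The `RangeInterest` type, the use of vertical resolution in pixels, the `link_cap_res` parameter, and the "pick the highest feasible rung" rule are all hypothetical; the actual protocol and the decision process are defined in the paper.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class RangeInterest:
    """Hypothetical range-interest: acceptable vertical resolutions in pixels."""
    name: str
    min_res: int
    max_res: int


def aggregate(interests: Sequence[RangeInterest]) -> Optional[RangeInterest]:
    """Merge interests for one name into the intersection of their ranges.

    Returns None if the names differ or no resolution suits every requester.
    """
    if not interests or len({i.name for i in interests}) != 1:
        return None
    low = max(i.min_res for i in interests)
    high = min(i.max_res for i in interests)
    if low > high:
        return None
    return RangeInterest(interests[0].name, low, high)


def pick_resolution(merged: RangeInterest, ladder: Sequence[int],
                    link_cap_res: int) -> Optional[int]:
    """Highest ladder rung inside the merged range that the link can carry."""
    upper = min(merged.max_res, link_cap_res)
    candidates = [r for r in ladder if merged.min_res <= r <= upper]
    return max(candidates) if candidates else None


ladder = [360, 480, 720, 1080]
merged = aggregate([
    RangeInterest("/live/seg42", 360, 1080),
    RangeInterest("/live/seg42", 480, 720),
    RangeInterest("/live/seg42", 480, 1080),
])
print(merged.min_res, merged.max_res)                       # 480 720
print(pick_resolution(merged, ladder, link_cap_res=1080))   # 720
```

Because the clients leave the final choice open, the forwarder can serve all three with a single 720p transmission instead of three separate streams.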

References

Yulong Zhang, Ying Cui, Zili Meng, Abhishek Kumar, Dirk Kutscher; COMETS: Coordinated Multi-Destination Video Transmission with In-Network Rate Adaptation; IEEE Transactions on Multimedia; 2026; pre-print: https://arxiv.org/abs/2601.18670

Written by dkutscher

January 28th, 2026 at 4:57 am

Posted in Publications


INDS Accepted at ACM Multimedia


Our paper on INDS: Incremental Named Data Streaming for Real-Time Point Cloud Video has been accepted at ACM Multimedia 2025.

Abstract:

Real-time streaming of point cloud video – characterized by high data volumes and extreme sensitivity to packet loss – presents significant challenges under dynamic network conditions. Traditional connection-oriented protocols such as TCP/IP incur substantial retransmission overhead and head-of-line blocking under lossy conditions, while reactive adaptation approaches such as DASH lead to frequent quality fluctuations and a suboptimal user experience. In this paper, we introduce INDS (Incremental Named Data Streaming), a novel adaptive transmission framework that exploits the inherent layered encoding and hierarchical object structure of point cloud data to enable clients to selectively request enhancement layers based on available bandwidth and decoding capabilities. Built on Information-Centric Networking (ICN) principles, INDS employs a hierarchical naming scheme organized by time windows and Groups of Frames (GoF), which enhances cache reuse and facilitates efficient data sharing, ultimately reducing both network and server load. We implemented a fully functional prototype and evaluated it using emulated network scenarios. The experimental results demonstrate that INDS reduces end-to-end delay by up to 80%, boosts effective throughput by 15%–50% across diverse operating conditions, and increases cache hit rates by 20%–30% on average.
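A hierarchical naming scheme of the kind the abstract describes might look like the sketch below, where a name encodes the time window, the Group of Frames (GoF) within it, and the layer. The component layout (`w`, `g`, `L`), the window and GoF durations, the `/pcv/demo` prefix, and the helpers `gof_name` and `names_for` are assumptions for illustration, not the naming convention actually used by INDS.

```python
def gof_name(prefix: str, timestamp_ms: int, layer: int,
             window_ms: int = 1000, gof_ms: int = 250) -> str:
    """Build a name of the form <prefix>/w<window>/g<gof>/L<layer>."""
    window = timestamp_ms // window_ms
    gof = (timestamp_ms % window_ms) // gof_ms
    return f"{prefix}/w{window}/g{gof}/L{layer}"


def names_for(prefix: str, timestamp_ms: int, max_layer: int) -> list:
    """Base layer (0) plus enhancement layers up to max_layer, in request order."""
    return [gof_name(prefix, timestamp_ms, layer) for layer in range(max_layer + 1)]


# A well-provisioned client asks for the base layer and two enhancement layers.
fast = names_for("/pcv/demo", 2300, max_layer=2)
# A constrained client in the same GoF asks for the base layer only.
slow = names_for("/pcv/demo", 2450, max_layer=0)

print(fast)  # ['/pcv/demo/w2/g1/L0', '/pcv/demo/w2/g1/L1', '/pcv/demo/w2/g1/L2']
print(slow)  # ['/pcv/demo/w2/g1/L0']
```

Both clients produce the same name for the base layer, so a cache that has served one of them can answer the other without contacting the server. That shared prefix is what makes incremental, layer-by-layer requests cache-friendly.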

References

Ruonan Chai, Yixiang Zhu, Xinjiao Li, Jiawei Li, Zili Meng, Dirk Kutscher; INDS: Incremental Named Data Streaming for Real-Time Point Cloud Video; accepted for publication at ACM Multimedia 2025; October 2025

Written by dkutscher

July 7th, 2025 at 11:51 am

ViFusion accepted at ACM ICMR


Our paper on ViFusion: In-Network Tensor Fusion for Scalable Video Feature Indexing has been accepted at the ACM International Conference on Multimedia Retrieval 2025 (CCF-B).

Abstract:
Large-scale video feature indexing in datacenters is critically dependent on efficient data transfer. Although in-network computation has emerged as a compelling strategy for accelerating feature extraction and reducing overhead in distributed multimedia systems, harnessing advanced networking resources at both the switch and host levels remains a formidable challenge. These difficulties are compounded by heterogeneous hardware, diverse application requirements, and complex multipath topologies. Existing methods focus primarily on optimizing inference for large neural network models using specialized collective communication libraries, which often face performance degradation in network congestion scenarios.

To overcome these limitations, we present ViFusion, a communication-aware tensor fusion framework that streamlines distributed video indexing by merging numerous small feature tensors into consolidated and more manageable units. By integrating an in-network computation module and a dedicated tensor fusion mechanism within datacenter environments, ViFusion substantially improves the efficiency of video feature indexing workflows. The deployment results show that ViFusion improves the throughput of the video retrieval system by 8–22x with the same level of latency as state-of-the-art systems.
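The basic idea of merging many small tensors into one larger unit can be shown with a host-side sketch that packs flat tensors into a single buffer plus an index, so they can be sent as one transfer and unpacked later. This is not ViFusion's mechanism, which also involves in-network computation. The `fuse` and `unfuse` helpers and the `(shape, values)` tensor representation are assumptions made to keep the example dependency-free.

```python
from array import array
from math import prod


def fuse(tensors):
    """Pack (shape, values) pairs into one float64 buffer plus an offset index."""
    buffer = array("d")
    index = []  # one (offset, shape) entry per tensor
    for shape, values in tensors:
        if prod(shape) != len(values):
            raise ValueError("shape does not match number of values")
        index.append((len(buffer), tuple(shape)))
        buffer.extend(values)
    return buffer, index


def unfuse(buffer, index):
    """Recover the original (shape, values) pairs from a fused buffer."""
    return [(shape, list(buffer[offset:offset + prod(shape)]))
            for offset, shape in index]


small = [((2,), [1.0, 2.0]), ((1, 3), [3.0, 4.0, 5.0])]
buffer, index = fuse(small)
print(len(buffer), index)  # 5 [(0, (2,)), (2, (1, 3))]
print(unfuse(buffer, index) == small)  # True
```

Sending one five-element buffer instead of two separate messages amortizes per-message overhead, which is the motivation for fusion when feature tensors are numerous and small.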

Stay tuned for the pre-print.

References

Yisu Wang, Yixiang Zhu, Dirk Kutscher; ViFusion: In-Network Tensor Fusion for Scalable Video Feature Indexing; The 15th ACM International Conference on Multimedia Retrieval; June 2025; Preprint

Written by dkutscher

April 22nd, 2025 at 3:25 pm