Dirk Kutscher

Personal web page

Information-Centric Networking RFCs


In the Information-Centric Networking Research Group (ICNRG) of the Internet Research Task Force (IRTF) we have recently published a set of new RFCs:

RFC 9510: Alternative Delta Time Encoding for Content-Centric Networking (CCNx) Using Compact Floating-Point Arithmetic

Content-Centric Networking (CCNx) utilizes delta time for a number of functions. When using CCNx in environments with constrained nodes or bandwidth-constrained networks, it is valuable to have a compressed representation of delta time. In order to do so, either accuracy or dynamic range has to be sacrificed. Since the current uses of delta time do not require both simultaneously, one can consider a logarithmic encoding. This document updates RFC 8609 to specify this alternative encoding.
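As a rough illustration of the idea (explicitly not the RFC 9510 wire format; the 4-bit exponent/mantissa split and the millisecond time base are assumptions made for this sketch), a one-byte logarithmic encoding trades accuracy for dynamic range:

```python
# Illustrative sketch only: a one-byte logarithmic delta time encoding with a
# hypothetical 4-bit exponent / 4-bit mantissa split; not the RFC 9510 format.

def encode_delta(ms: int) -> int:
    """Compress a non-negative delta time (milliseconds) into one byte."""
    exponent = 0
    while ms > 0x0F:
        ms >>= 1          # halve the value, losing low-order precision
        exponent += 1
    if exponent > 0x0F:
        raise OverflowError("delta time exceeds the encodable range")
    return (exponent << 4) | ms

def decode_delta(b: int) -> int:
    """Recover the approximate delta time from the one-byte encoding."""
    return (b & 0x0F) << ((b >> 4) & 0x0F)

print(decode_delta(encode_delta(100)))     # 96: ~100 ms, small relative error
print(decode_delta(encode_delta(60_000)))  # 57344: ~1 min, coarser resolution
```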

RFC 9531: Path Steering in Content-Centric Networking (CCNx) and Named Data Networking (NDN)

Path steering is a mechanism to discover paths to the producers of Information-Centric Networking (ICN) Content Objects and steer subsequent Interest messages along a previously discovered path. It has various uses, including the operation of state-of-the-art multi-path congestion control algorithms and for network measurement and management. This specification derives directly from the design published in https://dl.acm.org/doi/10.1145/3125719.3125721 (4th ACM Conference on Information-Centric Networking) and, therefore, does not recapitulate the design motivations, implementation details, or evaluation of the scheme. However, some technical details are different, and where there are differences, the design documented here is to be considered definitive.

RFC 9508: Information-Centric Networking (ICN) Ping Protocol Specification

This document presents the design of an Information-Centric Networking (ICN) Ping protocol. It includes the operations of both the client and the forwarder.

Ascertaining data plane reachability to a destination and taking coarse performance measurements of Round-Trip Time (RTT) are fundamental facilities for network administration and troubleshooting. In IP, where routing and forwarding are based on IP addresses, ICMP Echo Request and ICMP Echo Reply packets are the protocol mechanisms used for this purpose, generally exercised through the familiar ping utility. In Information-Centric Networking (ICN), where routing and forwarding are based on name prefixes, the ability to ascertain the reachability of names is required.

In order to carry out meaningful experimentation and deployment of ICN protocols, new tools analogous to ping and traceroute used for TCP/IP are needed to manage and debug the operation of ICN architectures and protocols. This document describes the design of a management and debugging protocol analogous to the ping protocol of TCP/IP; this new management and debugging protocol will aid the experimental deployment of ICN protocols. As the community continues its experimentation with ICN architectures and protocols, the design of ICN Ping might change accordingly. ICN Ping is designed as a "first line of defense" tool to troubleshoot ICN architectures and protocols. As such, this document is classified as an Experimental RFC. Note that a measurement application is needed to make proper use of ICN Ping in order to compute various statistics, such as average, maximum, and minimum Round-Trip Time (RTT) values, variance in RTTs, and loss rates.
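As a minimal illustration of that last point, the statistics side of such a measurement application could be as simple as the following bookkeeping over collected RTT samples (the sample values here are made up):

```python
# Minimal sketch of the measurement-application side of ICN Ping: compute
# summary statistics over collected RTT samples. Sample values are invented.
import statistics

def summarize(rtts_ms, sent):
    """Summarize per-reply RTT samples from `sent` ping requests."""
    received = len(rtts_ms)
    return {
        "min_ms": min(rtts_ms),
        "avg_ms": statistics.mean(rtts_ms),
        "max_ms": max(rtts_ms),
        "rtt_variance": statistics.variance(rtts_ms) if received > 1 else 0.0,
        "loss_rate": 1.0 - received / sent,
    }

print(summarize([12.1, 11.8, 13.4, 12.6], sent=5))  # one request went unanswered
```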

RFC 9507: Information-Centric Networking (ICN) Traceroute Protocol Specification

This document presents the design of an Information-Centric Networking (ICN) Traceroute protocol. This includes the operation of both the client and the forwarder.

In TCP/IP, routing and forwarding are based on IP addresses. To ascertain the route to an IP address and to measure the transit delays, the traceroute utility is commonly used. In Information-Centric Networking (ICN), routing and forwarding are based on name prefixes. To this end, the ability to ascertain the characteristics of at least one of the available routes to a name prefix is a fundamental requirement for instrumentation and network management. These characteristics include, among others, route properties such as which forwarders were transited and the delay incurred through forwarding.

In order to carry out meaningful experimentation and deployment of ICN protocols, new tools analogous to ping and traceroute used for TCP/IP are needed to manage and debug the operation of ICN architectures and protocols. This document describes the design of a management and debugging protocol analogous to the traceroute protocol of TCP/IP; this new management and debugging protocol will aid the experimental deployment of ICN protocols. As the community continues its experimentation with ICN architectures and protocols, the design of ICN Traceroute might change accordingly. ICN Traceroute is designed as a tool to troubleshoot ICN architectures and protocols.

Written by dkutscher

April 6th, 2024 at 11:26 am

Posted in IETF,IRTF


HKUST Internet Research Workshop 2024


On March 15, 2024, in the week before the IETF-119 meeting in Brisbane, Zili Meng and I organized the 1st HKUST Internet Research Workshop, which brought together researchers in computer networking and systems from around the globe for a live forum discussing innovative ideas at their early stages. The workshop took place at HKUST's Clear Water Bay campus in Hong Kong.

We ran the workshop like a "one-day Dagstuhl seminar", focusing on discussion and idea exchange rather than conference-style presentations. The objective was to identify topics and connect like-minded people for potential future collaboration, which worked out really well.

The agenda was:

  1. Dirk Kutscher: Networking for Distributed ML
  2. Zili Meng: Overview of the Low-Latency Video Delivery Pipeline
  3. Jianfei He: The philosophy behind computer networking
  4. Carsten Bormann: Towards a device-infrastructure continuum in IoT and OT networks
  5. Zili Meng: Network Research – Academia, Industry, or Both?

Dirk Kutscher: Networking for Distributed ML

With the ever-increasing demand for compute power from large-scale machine learning training, we have started to realize not only that Moore's Law no longer absorbs the growing performance demand automatically, but also that the growth in training FLOPs for transformers and other large-scale machine learning models exhibits far larger exponential factors.

This has been well illustrated by presentations in an AI data center side meeting at IETF-118, for example by Omer Shabtai, who talked about distributed training in data centers.

With increasing scale, communication over networks becomes a bottleneck, and the question arises what good system designs, protocols, and in-network support strategies could improve performance.

Current distributed machine learning systems typically use a technology called Collective Communication that was developed as a Message Passing Interface (MPI) abstraction for high-performance computing (HPC). Collective Communication is the combination of standardized aggregation and reduction functions with communication abstractions, e.g., for "broadcasting" or "unicasting" results.

Collective Communication is implemented by a few popular libraries such as OpenMPI and Nvidia's NCCL. When used in IP networks, the communication is usually mapped to iterations of peer-to-peer interactions, e.g., organizing nodes in a ring and sending data for aggregation within such rings. One potential way to achieve better performance would be to perform the aggregation "in the network", as in HPC systems, e.g., using the Scalable Hierarchical Aggregation and Reduction Protocol (SHArP). Previous work has attempted to do this with P4-based data plane programming; however, such approaches are typically limited due to the mostly stateless operation of the corresponding network elements.
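To make the ring pattern concrete, here is a single-process sketch of ring all-reduce (sum reduction). It only simulates the data movement that libraries like OpenMPI and NCCL distribute across actual nodes; real implementations additionally pipeline segments and overlap communication with computation:

```python
# Single-process sketch of ring all-reduce: reduce-scatter followed by
# all-gather over a logical ring. Data-movement logic only, no networking.

def ring_allreduce(vectors):
    """Sum-reduce one equal-length vector per node over a logical ring."""
    n = len(vectors)
    assert all(len(v) % n == 0 for v in vectors), "vector length must divide by n"
    seg = len(vectors[0]) // n
    # segs[i][j]: node i's current copy of segment j
    segs = [[list(v[j * seg:(j + 1) * seg]) for j in range(n)] for v in vectors]

    # Reduce-scatter: after n-1 steps, node i holds the fully reduced
    # segment (i+1) % n.
    for step in range(n - 1):
        for i in range(n):
            j = (i - step) % n
            nxt = (i + 1) % n
            segs[nxt][j] = [a + b for a, b in zip(segs[nxt][j], segs[i][j])]

    # All-gather: circulate the completed segments once around the ring.
    for step in range(n - 1):
        for i in range(n):
            j = (i + 1 - step) % n
            segs[(i + 1) % n][j] = segs[i][j]

    return [x for j in range(n) for x in segs[0][j]]

print(ring_allreduce([[1, 2, 3], [10, 20, 30], [100, 200, 300]]))
# -> [111, 222, 333]
```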

In large-scale training sessions running over shared infrastructure in multi-tenant data centers, communication needs to respond to congestion, packet loss, server overload, etc., i.e., the features of typical transport protocols are needed.

I had previously discussed corresponding challenges and requirements in these Internet Drafts:

In my talk at HKIRW, I discussed ideas for corresponding transport protocols. There are interesting challenges in bringing together reliable communication, congestion control, flow control, single-destination as well as multi-destination communication, and in-network processing.

Zili Meng: Overview of the Low-Latency Video Delivery Pipeline

Zili talked about requirements for ultra-low latency in interactive streaming for the next generation of immersive applications. Some applications have really stringent low-latency requirements, combined with the need for consistent service quality over many hours, and the talk suggested better coordination between all elements of the streaming and rendering pipeline.

There was a discussion as to how achievable these requirements are in the Internet and whether applications might be redesigned to provide acceptable user experience even without guaranteed high-bandwidth, low-latency service, for example by employing technologies such as semantic communication, prediction, and local control loops.

Jianfei He: The philosophy behind computer networking

In his talk, Jianfei He asked how the field of computer networking can be defined more precisely and how a more systematic approach could help with the understanding and design of future networked systems.

Specifically, he suggested basing designs on a solid understanding of the potentials and absolute constraints of a given field, such as Shannon's theory and limits, and on the notion of tradeoffs, i.e., the consequences of certain design decisions, as represented by the CAP theorem in distributed systems. He mentioned two examples: 1) routing protocols and 2) transport protocols.

For routing protocols, there are well-known tradeoffs between convergence time, scaling limits, and required bandwidth. With changed network properties (bandwidth), can we reason about options for shifting the tradeoffs?

For transport protocols, there are goals such as reliability and congestion control, and tradeoff relationships between packet loss, line utilization, delay, and buffer size. How would designs change if we changed the objective, e.g., to shortest flow completion times or shortest message completion times (or if we looked at collections of flows)? What if we added fairness to these objectives?

Jianfei asked whether it is possible to develop these tradeoffs and constraints into a more consistent theory.

Carsten Bormann: Towards a device-infrastructure continuum in IoT and OT networks

Carsten talked about requirements and available technologies for the secure management of IoT devices in a device-infrastructure continuum in IoT and OT networks, where scale demands a high degree of run-time automation and only limited individual device configuration (at installation time only). It is no longer possible to manually track each new "Thing" species.


Carsten mentioned technologies such as

  • RFC 8520: Manufacturer Usage Description (MUD);
  • W3C Web of Things description model; and
  • IETF Semantic Definition Format (SDF).

In his talk, Carsten formulated the goal of "Well-Informed Networking", i.e., an approach where networks can obtain sufficient information about the existing devices, their legitimate communication requirements, and their current status (device health).

Zili Meng: Network Research – Academia, Industry, or Both?

Zili discussed the significance of the consistently high number of industry and industry-only papers at major networking conferences. Such papers are often based on operational experience that can only be obtained by companies actually operating the corresponding systems.

Sometimes papers seem to get accepted not necessarily on the basis of their technical merits but because they report on "large-scale deployments".

When academics get involved in such work, it is often not in a driving position but rather through students who intern at the corresponding companies. Naturally, such papers do not question the status quo and are generally not critical of the systems they discuss.

At the workshop, we discussed the changes in the networking research field over the past years, as well as the challenges of successful collaborations between academia and industry.

Written by dkutscher

April 6th, 2024 at 10:55 am

Data-oriented, Decentralized, Daring: Opportunities and Research Challenges for an Information-Centric Web


Research and development in ICN has led to different communication patterns such as Sync and API implementations such as CNL. It is now time to think about how to leverage Information-Centric principles for providing better foundations for hypermedia applications in the future web. At NDNComm-2024 I talked about how ICN could possibly help, what could be fruitful future research directions, and why web3 and dweb are not the answer.

Material

Presentation

Written by dkutscher

March 7th, 2024 at 7:05 am

Posted in Publications,Talks


Content Retrieval on the Decentralised Web


Trends and Emerging Technologies for Content Retrieval on the Decentralized Web

The control, governance, and management of the web have become increasingly centralised, resulting in security, privacy, and censorship concerns. Decentralised initiatives have emerged to address these issues, beginning with decentralised file systems. These systems have gained popularity, with major platforms serving millions of content requests daily. Complementing the file systems are decentralised search engines and name registry infrastructures, together forming the basis of a decentralised web. We have published a survey paper that analyses research trends and emerging technologies for content retrieval on the decentralised web, encompassing both academic literature and industrial projects.

Challenges

Several challenges hinder the realisation of a fully decentralised web. Achieving performance comparable to centralised systems without compromising decentralisation is a key challenge. Hybrid infrastructures, blending centralised components with verifiability mechanisms, show promise for improving decentralised initiatives. While decentralised file systems have seen more mature deployments, they still face challenges such as usability, performance, privacy, and content moderation. Integrating these systems with decentralised name registries offers potential for improved usability through human-readable and persistent names for content. Further research is needed to address security concerns in decentralised name registries and to enhance governance and crypto-economic incentive mechanisms.

References

Navin V. Keizer, Onur Ascigil, Michał Król, Dirk Kutscher, and George Pavlou; A Survey on Content Retrieval on the Decentralised Web; ACM Computing Surveys; March 2024; https://doi.org/10.1145/3649132

Written by dkutscher

March 7th, 2024 at 6:51 am

Posted in Publications


Towards a Unified Transport Protocol for In-Network Computing in Support of RPC-based Applications


The emerging term In-Network Computing (INC) [inc] in particular refers to applying on-path programmable networking devices (e.g., switches and routers between clients and servers) as accelerators or function offloaders to boost throughput, reduce server load, or improve latency, typically in a well-controlled data center network environment.

Some INC implementations evolved from programmable data plane systems and align with the trend of network programmability at large. In recent years, INC has been shown to support many promising applications (e.g., caching, aggregation, and agreement). For example, in distributed machine learning (DML), training nodes produce data (gradients) that needs to be aggregated or reduced, and the result can be distributed to one or multiple consumers. As another example, the NetClone system [netclone] uses in-network forwarders to replicate RPC invocation messages and to perform more informed forwarding based on observed latencies, thereby accelerating RPC communication.

While it is possible to achieve this kind of operation purely with end-to-end communication between worker nodes, performance can be improved dramatically by offloading both the operation processing and the data dissemination to nodes in the network. These in-network processors are often conceived as semi-transparent, performance-enhancing on-path elements, i.e., they are not the actual endpoints of transport protocol sessions but would intercept packets carrying application data and potentially generate new data that they would have to transmit.

In our Internet Draft draft-song-inc-transport-protocol-req-01.txt, we discuss this problem and formulate requirements for the design of future transport protocols in this space.

References

Written by dkutscher

January 25th, 2024 at 7:02 am

IETF Datatracker Document Metadata Processing


I have created two tools for fetching and formatting metadata for IETF documents (RFCs and Internet Drafts). I sometimes want to create publication lists or just reference IETF documents in other publications, and these tools are intended to automate the process as much as possible.

  1. tracker-doc: for fetching document metadata by user-id (datatracker ID)
  2. bibdoc: for formatting document metadata in text or bibtex format

These are two Clojure scripts that are executed by Babashka – a native Clojure interpreter for scripting.

Install: datatracker-publications on GitHub.
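For readers who prefer not to install Babashka, the underlying lookup can also be done directly against the Datatracker's public JSON API. The endpoint and field names below reflect the API as I understand it, so verify them against https://datatracker.ietf.org/api/ before relying on this sketch:

```python
# Sketch of the same kind of metadata lookup against the IETF Datatracker's
# public JSON API. Endpoint and field names are assumptions to be verified.
import json
import urllib.request

def fetch_doc(name: str):
    """Fetch metadata for an RFC or Internet Draft by its datatracker name."""
    url = ("https://datatracker.ietf.org/api/v1/doc/document/"
           f"?name={name}&format=json")
    with urllib.request.urlopen(url) as resp:
        objects = json.load(resp)["objects"]
    return objects[0] if objects else None

doc = fetch_doc("rfc9510")  # drafts work too, by their draft name
if doc:
    print(doc["title"], doc["rev"])
```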

Written by dkutscher

January 5th, 2024 at 7:47 pm

Special Interest Group on Sustainable Network Operations


IEEE Comsoc

We started an IEEE special interest group on Sustainable Network Operations (SNO), chaired by Alex Clemm and Carlos Pignataro.

With respect to sustainability, communication networks can provide opportunities as well as challenges. Invariably, they play a very important role: from resource and energy efficiency optimization opportunities, enablement of applications that reduce the need for physical travel including teleworking and remote operations, to smarter agriculture, more efficient factory floors, and shifting workloads to compute powered with renewable energy.

Many of today's improvements are driven by general advances in computing hardware as well as in transmission technologies (both fixed and wireless). While it is critical to capitalize on this hardware- and transmission-driven opportunity, the complexity and interdependence of this problem call for a holistic approach: it is important to extend questions of "greenness" to other layers in the networking stack, to different planes (data, control, and management), to routing and traffic forwarding, and to the ways in which networks are organized and deployed.

Research Questions

The Special Interest Group (SIG) on Sustainable Network Operations (SNO) aims to encourage and facilitate discussion, exchange of ideas, and development of solutions related to questions such as:

  • Can data planes be designed in ways that make them more energy-efficient?
  • What protocol advances could enable greener networking solutions?
  • How can networks be optimized not just for QoS or utilization but for minimizing greenhouse gas emissions and maximizing energy and power efficiency?
  • What is the role of a sustainability orchestration system?
  • What novel tools are needed to operate networks more sustainably?
  • How can peak demand be flattened to minimize waste due to overprovisioning?
  • How can operators take advantage of traffic seasonality?
  • How can we even properly account for energy usage and other sustainability parameters to be optimized?
  • In which ways can network programmability, faster control loops, and AI- or intent-based networking help?

SNO aims to provide a cooperative and open forum for researchers, complementing other vendor-focused fora.

Topics

The areas of interests include, but are not limited to, the following:

  • Network optimization for sustainability and power consumption
  • Carbon-aware internet protocols
  • Network operations and orchestration for sustainability
  • Network instrumentation for energy consumption
  • Energy-efficient VNF placement and Service Function Chaining
  • Pollution-aware / energy-aware / power source-aware routing
  • AI/ML techniques for optimization of energy efficiency
  • Analytical models of network sustainability
  • Decentralized power source management
  • Virtual energy and sustainability in virtualized environments
  • Sleep-mode aware orchestration
  • Protocols for rapid resource commissioning/decommissioning
  • Carbon-based accounting for networking services
  • Solution benchmarking with respect to sustainability
  • Cloud-Edge Continuum from a sustainability viewpoint
  • Sustainable capacity planning to minimize overprovisioning
  • Holistic cost optimization, with energy cost as one factor
  • Computing as an element of end-to-end communications
  • Workload adaptivity in the presence of federated learning workloads

Learn more about the SNO SIG at https://cnom.committees.comsoc.org/sustainable-network-operations-sno/.

Written by dkutscher

December 30th, 2023 at 5:10 pm

Posted in Posts

AINTEC-2023


Written by dkutscher

December 20th, 2023 at 4:18 pm

Posted in Publications


AINTEC Panel on 6G Research



I had the pleasure of moderating a panel on 6G Research Challenges at AINTEC-2023. The panelists were Serge Fdida, Abhimanyu Gosain, Jim Kurose, and George Michaelson.


Opportunities and Challenges for Future Network Systems Design?

The panel discussed opportunities and challenges for future network systems design and tried to shed some light on what 6G might actually mean and what interesting research could and should be done.

5G Hype vs Reality

While many people are speculating about possible 6G features, it is quite instructive to review the adoption of current 5G technology. The panel discussed this from different perspectives. It was noted that many advanced 5G features, although specified, are not yet available, such as new core designs, low-latency communication, positioning, and network slicing.

There may be different reasons for that. One reason that was mentioned is the lack of demand: 5G seems to be used mostly as a reasonably fast bitpipe, i.e., as an access technology for mobile broadband. Economically, this means that it is difficult to monetize the network beyond that.

The panel discussed whether WiFi and 5G will integrate as just two "localized" link-level wireless technologies at the Internet edge, or whether 5G will actually provide a global end-to-end network, interconnected to the Internet.

Centralization and new Deployment Models

Another interesting topic is the evolution of deployment models and the changing nature of service providers and infrastructure providers. Not only are hyperscalers providing most of the "over-the-top" functionality and infrastructure today, they are also increasingly providing the cloud infrastructure and telco software functions, such as Microsoft with its "Azure for Operators" platform. The panel also discussed the issues of commercial consolidation and concentration in this regard.

Key Enablers for 6G

We discussed potential key enablers for 6G, and the following topics were mentioned:

  • AI/ML Native Interface
  • New Spectrum Technologies: 7-24 GHz, 300 GHz-1 THz
  • Networking as a Sensor: shift from radio KPIs to a system and service focus
  • Communication-Compute-Data Centric
  • Zero Trust Architecture (ZTA): Security and Trust
  • Open Radio Access Networks

With respect to "Communication-Compute-Data Centricity", we discussed whether the mobile network infrastructure itself would provide features in this direction, e.g., a better integration of computing and networking, or whether the network would just provide the access service and computing etc. would continue being an application (also see my invited talk on computing in the network at AINTEC-2023). The panel expressed some preference for maintaining a separation of concerns, layering, and the end-to-end principle.

Another topic that was discussed was the continuing "softwarization" of networks and the application of Software-Defined Networking (SDN) principles. Future systems may see more management support for applications (and application-related infrastructure), and there is certainly a trend towards more autonomous management and the use of machine learning for it.


Written by dkutscher

December 20th, 2023 at 4:11 pm

Posted in Publications,Talks

Tagged with , ,

Computing in the Network – Lessons Learned and New Opportunities


The Internet is a distributed system that enables distributed computing applications, from client-server web applications to collaborative multi-media applications. The evolution of both compute server and network infrastructure platforms has fueled the development of new approaches for building more programmable networks and of application support functions in the network.

At the same time, new applications such as IoT data processing, distributed machine learning, decomposed application architectures such as microservices, and distributed computing frameworks introduce new opportunities for developing more principled approaches to Computing in the Network.

In my invited talk at AINTEC-2023, I reviewed some promising use cases, highlighted recent relevant research results, and discussed several research challenges for conceiving Computing in the Network from an Internet perspective, for example the meaning of "end-to-end communication" and "permissionless innovation" in the light of these new developments.

From "In-Network Computing"...

"In-Network Computing" is a popular but also relatively poorly defined term that comes up a lot in recent research studies. I discussed the different facets such as traditional networked computing, middlebox-like packet processing, active networking, programmable dataplane, Network Functions Virtualization and Service Function Chaning as depicted in the figure below.

In general, we can distinguish two main directions:

  1. Computing on the Network: general distributed computing using Internet technologies for communication, such as the Web and related overlay networks such as CDNs.
  2. Middlebox-like packet processing: intercepting, manipulating, generating, and steering packets has been applied to production networks in data centers and telco networks, often as a performance enhancing approach.

What about Programmable Data Plane?

Programmable Data Plane approaches such as the P4 programming language are often used to implement certain elements of either of these two categories, for example traffic steering and load balancing. There are some point solutions for more application-layer-oriented functionality, such as NetCache, support for distributed consensus protocols, and support for distributed machine learning training, but these typically operate under very specific assumptions and are often at odds with end-to-end semantics and security. One example of a productive use of the Programmable Data Plane, in my opinion, was the SIGCOMM-2023 paper NetClone: Fast, Scalable, and Dynamic Request Cloning for Microsecond-Scale RPCs by Gyuyeong Kim. In this work, programmable switches were used to implement request forwarding strategies based on relatively simple packet meta information and observed performance, i.e., without requiring application-layer knowledge.
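NetClone's actual mechanism is implemented in switch hardware; purely to illustrate the flavor of such latency-aware forwarding decisions, here is a toy sketch in which a forwarder keeps a smoothed latency estimate per replica and clones a request only when the best two replicas are nearly tied. The EWMA estimator and the threshold are my own illustrative choices, not the paper's algorithm:

```python
# Toy sketch of latency-aware request cloning (inspired by, but not
# reproducing, NetClone's switch-level algorithm).

class ReplicaStats:
    def __init__(self, alpha=0.2):
        self.alpha = alpha     # EWMA smoothing factor
        self.ewma_ms = None    # smoothed RTT estimate

    def observe(self, rtt_ms):
        """Fold a newly observed RTT sample into the estimate."""
        if self.ewma_ms is None:
            self.ewma_ms = rtt_ms
        else:
            self.ewma_ms = (1 - self.alpha) * self.ewma_ms + self.alpha * rtt_ms

def pick_targets(stats, clone_threshold_ms=5.0):
    """Return the fastest replica, plus a clone target if it is near-tied."""
    ranked = sorted(stats, key=lambda r: stats[r].ewma_ms)
    targets = [ranked[0]]
    if len(ranked) > 1:
        gap = stats[ranked[1]].ewma_ms - stats[ranked[0]].ewma_ms
        if gap < clone_threshold_ms:
            targets.append(ranked[1])  # near-tie: hedge against tail latency
    return targets

stats = {"a": ReplicaStats(), "b": ReplicaStats()}
for rtt in (10.0, 12.0, 11.0):
    stats["a"].observe(rtt)
for rtt in (11.5, 10.5, 12.5):
    stats["b"].observe(rtt)
print(pick_targets(stats))  # ['a', 'b']: estimates are close, so clone
```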

... To "Computing in the Network"

There are many relevant use cases of distributed computing that can benefit from (and urgently need) support from networking, and where distributing processing, aggregation etc. with awareness of network topologies, current utilization etc. would make a real difference. We have built such a system before and called it Compute-First Networking: Distributed Computing meets ICN (see https://dirk-kutscher.info/publications/distributed-computing-icn/ for background). I talked about relevant applications such as distributed stream processing and distributed machine learning. Today, these systems typically run on the network but could definitely benefit from better support and better awareness of the network, so I asked whether there is the possibility of a confluence of existing and emerging capabilities of modern hardware and the requirements of relevant distributed computing applications.

Questions I raised included:

  • How can we conceive such a confluence?
  • How can we support distributed computing without giving up layering and principles such as the end-to-end principle?
  • What features do we need from transport protocols to support diverse use cases?

Distributed Machine Learning

Distributed machine learning, e.g., federated learning, is an application that is currently perceived as a major driver for in-network computing. Large-scale training networks are expected to enable higher degrees of parallelization and the handling of larger model sizes. How would we run such workloads as distributed systems, within data centers but potentially also across the Internet?

It is important to understand the performance requirements of such systems. Initial systems were built with bespoke High-Performance Computing (HPC) architectures and communication technologies such as InfiniBand. Such systems used in-network aggregation functions and defined corresponding architectures such as SHArP.

Today's data center systems employ RDMA and RDMA over Converged Ethernet (RoCE) as a low-layer abstraction for efficient packet-based communication on layer 2, without addressing higher-layer transport and system design aspects.

Collective Communications

In parallel computing architectures, the Message Passing Interface (MPI) is typically used to provide efficient and portable inter-process communication for high-performance computing. One of the concepts developed in MPI is Collective Communication, a set of bespoke data aggregation and distribution patterns for different data-oriented distributed computing scenarios (see the sketch after this list), such as:

  • Broadcasting, e.g., for distributing configuration data or common ML models
  • Scattering: a single process sends distinct pieces of data to each process
  • Gathering: one process collecting and combining data pieces from other processes
  • All-to-all communications: every process sends data to every other process
  • Reduction: collecting data from all processes, aggregating it, and sending the result
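A minimal mpi4py sketch of these five patterns, assuming mpi4py and an MPI runtime are installed:

```python
# Minimal mpi4py illustration of the collective patterns listed above.
# Run with e.g.: mpirun -n 4 python collectives.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

model = comm.bcast({"weights": [0.0]} if rank == 0 else None, root=0)  # broadcast
shard = comm.scatter(list(range(size)) if rank == 0 else None, root=0) # scatter
gathered = comm.gather(rank * rank, root=0)                            # gather
exchanged = comm.alltoall([rank] * size)                               # all-to-all
total = comm.reduce(rank, op=MPI.SUM, root=0)                          # reduction

if rank == 0:
    print(model, gathered, total)
```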

Today's Collective Communication implementations realize these patterns for different underlying networks and inter-process facilities. For GPU-based Collective Communication in today's networks, a ring-based communication pattern is often applied, leading to considerable inefficiencies with respect to communication overhead and idle times of the different processors. See this presentation from Tencent at the recent AIDC side meeting at IETF-118. Other implementations use peer-to-peer communication models.

Collective Communication in the Network

From a networking perspective, the question is how to map Collective Communication better to Internet technology-based networked systems: avoiding unnecessary duplication, providing typical transport protocol features such as reliability and congestion control, and enabling an optimal placement of the corresponding aggregation functions.

This incurs a set of challenges, such as:

  • Transport
    • Reliability: underlying network lacks communication reliability
    • Application data units instead of packets
    • Blocking & non-blocking communication modes
    • Security (potentially)
  • Multi-destination delivery
    • IP-Multicast possibly not the best fit
  • Computing in the Network Framework
    • Generic operations as primitives (at least per application domain)
    • Stringent performance requirements
  • Control, Optimizations, Management
    • Topology and utilization awareness
    • Scheduling communication and computation for optimal performance

We discussed these challenges in two recently submitted Internet Drafts on Transport for Collective Communications, and I went into these issues in more detail during the talk.

Data-Oriented Collective Communications

I proposed the direction of data-oriented Collective Communication and discussed how concepts from Information-Centric distributed computing could possibly be employed to achieve efficient and practical multi-destination transport, reliability and congestion control, and flexible placement of aggregation functions with a name-based identity scheme.

Promising features would include:

  • Data-oriented communication model
  • Locator-less model conducive to data production and consumption at different places in the network (computing)
  • Multi-destination delivery included
  • In-network retransmission and caching could help with reliability and performance

However, I also mentioned some challenges:

  • Receiver-driven transport results in polling – is it efficient enough?
  • RDMA-like communication is unexplored
  • Security concept: data-oriented security is good, but it is unclear whether it can be afforded
  • Exact scheduling may be at odds with current ICN system design – more work needed

In summary, this seems to be a rich field for future systems research. Distributed machine learning drives the development of new concepts for communication and computing. It clearly needs efficient multi-destination communication and an efficient mapping of MPI-inspired Collective Communication. The current abstractions do not fit well, and pure IP packet-level communication is too limited. Connection-oriented transport seems to be at odds with the communication semantics, which makes data-oriented communication attractive. Such an approach could work with a name-based scheme, i.e., without addresses, which is conducive to data production and consumption at different places in the network. Certainly, the challenging performance requirements call for more research and possibly an evolution of current ICN protocols.


Written by dkutscher

December 20th, 2023 at 3:31 pm