Archive for the ‘IETF’ Category
Towards a Unified Transport Protocol for In-Network Computing in Support of RPC-based Applications
The emerging term In-Network Computing (INC) [inc] refers in particular to applying on-path programmable networking devices (e.g., switches and routers between clients and servers) as accelerators or function offloaders to boost throughput, reduce server load, or improve latency, typically in a well-controlled data center network environment.
Some INC implementations evolved from programmable data plane systems and align with the trend of network programmability at large. In recent years, INC has been shown to support many promising applications (e.g., caching, aggregation, and agreement). For example, in distributed machine learning (DML), training nodes produce data (gradients) that needs to be aggregated or reduced -- and the result could be distributed to one or multiple consumers. As another example, the NetClone system [netclone] uses in-network forwarders to replicate RPC invocation messages and to perform more informed forwarding based on observed latencies, thereby accelerating RPC communication.
While it is possible to achieve this kind of operation purely with end-to-end communication between worker nodes, performance can be dramatically improved by offloading both the operation processing and the data dissemination to nodes in the network. These in-network processors are often conceived as semi-transparent performance enhancing on-path elements, i.e., they are not the actual endpoints in transport protocol sessions and would intercept packets with application data and potentially generate new data that they would have to transmit.
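The aggregation case can be sketched as follows. This is a minimal illustration, not the mechanism from the draft: it assumes a hypothetical on-path element that sees each worker's gradient vector for a training round and emits a single summed vector once all expected workers have contributed. The class name, method name, and parameters are invented for the example.

```python
from typing import Dict, List, Optional


class InNetworkAggregator:
    """Hypothetical on-path element that sums gradient vectors per round."""

    def __init__(self, expected_workers: int) -> None:
        self.expected_workers = expected_workers
        # round id -> {worker id -> gradient vector}
        self.pending: Dict[int, Dict[str, List[float]]] = {}

    def on_packet(self, round_id: int, worker: str,
                  gradient: List[float]) -> Optional[List[float]]:
        """Absorb one contribution; return the element-wise sum when complete."""
        contributions = self.pending.setdefault(round_id, {})
        # Keyed by worker, so a retransmission overwrites instead of double-counting.
        contributions[worker] = gradient
        if len(contributions) < self.expected_workers:
            return None  # packet is consumed, nothing is forwarded yet
        del self.pending[round_id]
        return [sum(values) for values in zip(*contributions.values())]


agg = InNetworkAggregator(expected_workers=3)
first = agg.on_packet(1, "w1", [1.0, 2.0])
second = agg.on_packet(1, "w2", [3.0, 4.0])
result = agg.on_packet(1, "w3", [5.0, 6.0])
```

Here the consumer receives one vector instead of three. The sketch also shows why such an element is only semi-transparent: it consumes some packets and originates new data, which is exactly the behavior that end-to-end transport sessions do not anticipate.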
In our Internet Draft draft-song-inc-transport-protocol-req-01.txt, we are discussing this problem and are formulating some requirements for the design of future transport protocols in this space.
References
- Collective Communication: Better Network Abstractions for AI
- Computing in the Network – Lessons Learned and New Opportunities
- [I-D.yao-tsvwg-cco-problem-statement-and-usecases] Yao, K., Shiping, X., Li, Y., Huang, H., and D. KUTSCHER, "Collective Communication Optimization: Problem Statement and Use cases", Work in Progress, Internet-Draft, draft-yao-tsvwg-cco-problem-statement-and-usecases-00, 23 October 2023, https://datatracker.ietf.org/doc/html/draft-yao-tsvwg-cco-problem-statement-and-usecases-00.
- [inc] Klenk, B., et al., "An In-Network Architecture for Accelerating Shared-Memory Multiprocessor Collectives", ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA), 2020, https://dx.doi.org/10.1109/ISCA45697.2020.00085
- [netclone] Kim, G., "NetClone: Fast, Scalable, and Dynamic Request Cloning for Microsecond-Scale RPCs", In Proceedings of the ACM SIGCOMM 2023 Conference (ACM SIGCOMM '23). Association for Computing Machinery, New York, NY, USA, 195-207, 2023 https://dl.acm.org/doi/10.1145/3603269.3604820
Collective Communication: Better Network Abstractions for AI
We have submitted two new Internet Drafts on Collective Communication:
- Kehan Yao, Xu Shiping, Yizhou Li, Hongyi Huang, Dirk Kutscher; Collective Communication Optimization: Problem Statement and Use cases; Internet Draft draft-yao-tsvwg-cco-problem-statement-and-usecases-00; work in progress; October 2023
- Kehan Yao, Xu Shiping, Yizhou Li, Hongyi Huang, Dirk Kutscher; Collective Communication Optimization: Requirement and Analysis; Internet Draft draft-yao-tsvwg-cco-requirement-and-analysis-00; work in progress; October 2023
Collective Communication refers to communication between a group of processes in distributed computing contexts, for example involving interaction types such as broadcast, reduce, all-reduce. This data-oriented communication model is employed by distributed machine learning and other data processing systems, such as stream processing. Current Internet network and transport protocols (and corresponding transport layer security) make it difficult to support these interactions in the network, e.g., for aggregating data on topologically optimal nodes for performance enhancements. These two drafts discuss use cases, problems, and initial ideas for requirements for future system and protocol design for Collective Communication. They will be discussed at IETF-118.
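For readers unfamiliar with the interaction types named above, their semantics can be sketched in a few lines. This is an illustrative model only: the function names are assumptions made for this example, ranks are plain integers, and everything runs in one process instead of over a network.

```python
from functools import reduce as fold
from typing import Callable, Dict, List

Vector = List[float]


def add(a: Vector, b: Vector) -> Vector:
    """Element-wise sum, the usual reduction operator for gradients."""
    return [x + y for x, y in zip(a, b)]


def broadcast(root_value: Vector, ranks: List[int]) -> Dict[int, Vector]:
    """One process's value is delivered to every process in the group."""
    return {rank: list(root_value) for rank in ranks}


def reduce_to_root(inputs: Dict[int, Vector],
                   op: Callable[[Vector, Vector], Vector] = add) -> Vector:
    """Every process contributes a value; only the root gets the combined result."""
    return fold(op, (inputs[rank] for rank in sorted(inputs)))


def all_reduce(inputs: Dict[int, Vector],
               op: Callable[[Vector, Vector], Vector] = add) -> Dict[int, Vector]:
    """Reduce followed by broadcast: every process ends up with the result."""
    return broadcast(reduce_to_root(inputs, op), sorted(inputs))


inputs = {0: [1.0, 2.0], 1: [3.0, 4.0], 2: [5.0, 6.0]}
reduced = reduce_to_root(inputs)
everywhere = all_reduce(inputs)
```

In a real deployment the interesting question is where the reduction step runs; performing it on a topologically well-placed node is the optimization that current network and transport protocols make hard to express.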
RFC 7927: Information-Centric Networking (ICN) Research Challenges
We (ICNRG) published RFC 7927 on Information-Centric Networking (ICN) Research Challenges.
This memo describes research challenges for Information-Centric Networking (ICN), an approach to evolve the Internet infrastructure to directly support information distribution by introducing uniquely named data as a core Internet principle. Data becomes independent from location, application, storage, and means of transportation, enabling or enhancing a number of desirable features, such as security, user mobility, multicast, and in-network caching. Mechanisms for realizing these benefits are the subject of ongoing research in the IRTF and elsewhere. This document describes current research challenges in ICN, including naming, security, routing, system scalability, mobility management, wireless networking, transport services, in-network caching, and network management.
Information-Centric Networking (ICN) is an approach to evolve the Internet infrastructure to directly support accessing Named Data Objects (NDOs) as a first-order network service. Data objects become independent of location, application, storage, and means of transportation, allowing for inexpensive and ubiquitous in-network caching and replication. The expected benefits are improved efficiency and security, better scalability with respect to information/bandwidth demand, and better robustness in challenging communication scenarios.
ICN concepts can be deployed by retooling the protocol stack: name-based data access can be implemented on top of the existing IP infrastructure, e.g., by allowing for named data structures, ubiquitous caching, and corresponding transport services, or it can be seen as a packet-level internetworking technology that would cause fundamental changes to Internet routing and forwarding. In summary, ICN can evolve the Internet architecture towards a network model based on named data with different properties and different services.
This document presents the ICN research challenges that need to be addressed in order to achieve these goals. These research challenges are seen from a technical perspective, although business relationships between Internet players will also influence developments in this area. We leave business challenges for a separate document, however. The objective of this memo is to document the technical challenges and corresponding current approaches and to expose requirements that should be addressed by future research work.
RFC 7778: Mobile Communication Congestion Exposure
Mobile network designs have to meet several requirements that at first sight contradict each other: maximize resource utilization, provide optimal performance (user-perceived quality of experience), enable operator-defined "fair usage" policies, maintain user privacy, and minimize management complexity.
For 5G networks, virtual network slicing is often mentioned as one of the desirable properties, i.e., the ability to run virtual networks for different application classes (service slicing) or different customer groups (MVNOs etc.) over the same physical infrastructure. Virtualizing networks over a larger set of shared resources (radio networks, backhaul, data centers) requires effective and efficient means for capacity sharing.
Capacity sharing can be done in different ways: traditionally, telco network capacity sharing has been inspired by telephony network architectures with an emphasis on control plane-based monitoring, resource allocation, and configuration. Such approaches often involve traffic management systems that monitor performance, load etc. of network elements, analyze traffic properties (for example, DPI-based traffic inspection), and configure network elements such as base stations and gateways to implement certain rate limits based on operator policies.
Three trends make this difficult in present and future networks:
- with virtualization, slicing etc., the effort of analyzing every single tenant's flows can be increasingly prohibitive;
- encryption-by-default with HTTP/2 and other protocols that employ connection-based encryption renders DPI-based approaches costly (at best -- if not impossible); and
- Internet protocols and applications such as TCP (transport layer) and DASH-based video streaming over HTTP (application layer) are themselves adaptive to congestion, delay and overall observed performance. New protocols with specific requirements are invented all the time (think IoT, Virtual Reality). Interfering with their control loops through network traffic management may yield bad performance, suboptimal user preference and higher cost overall.
The idea of enabling an effective capacity sharing with a productive cooperation of operator policy decision making and dynamic application/user resource utilization has driven the work in the IETF ConEx working group. Based on earlier work by Bob Briscoe on Re-Feedback, the ConEx WG has defined concepts and (experimental) mechanisms for congestion exposure, enabling a form of capacity sharing that incentivizes senders to respond to congestion signals, while still enabling operators with hooks for auditing and enforcing correct behavior.
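To make the enforcement hook more concrete, here is a deliberately simplified sketch of a congestion-volume policer. It is an assumption-laden illustration, not the ConEx specification: the class, its parameters, and the boolean `congestion_marked` flag are hypothetical stand-ins for whatever signal the sender exposes. The idea shown is that only bytes declared as contributing to congestion are charged against a per-user token bucket, so traffic on an uncongested path costs nothing.

```python
class CongestionVolumePolicer:
    """Hypothetical token bucket that is drained only by congestion-marked bytes."""

    def __init__(self, fill_rate_bytes_per_s: float, bucket_depth_bytes: float) -> None:
        self.fill_rate = fill_rate_bytes_per_s
        self.depth = bucket_depth_bytes
        self.tokens = bucket_depth_bytes  # start with a full allowance
        self.last_seen = 0.0

    def on_packet(self, now: float, size_bytes: int, congestion_marked: bool) -> bool:
        """Return True if the packet is within the user's congestion allowance."""
        # Refill according to elapsed time, capped at the bucket depth.
        elapsed = now - self.last_seen
        self.last_seen = now
        self.tokens = min(self.depth, self.tokens + elapsed * self.fill_rate)
        # Unmarked packets are free; marked packets are charged by size.
        if congestion_marked:
            self.tokens -= size_bytes
        return self.tokens >= 0


policer = CongestionVolumePolicer(fill_rate_bytes_per_s=100.0, bucket_depth_bytes=1000.0)
unmarked_ok = policer.on_packet(0.0, 1500, congestion_marked=False)
marked_ok = policer.on_packet(0.0, 800, congestion_marked=True)
over_limit = policer.on_packet(0.0, 800, congestion_marked=True)
recovered = policer.on_packet(10.0, 100, congestion_marked=False)
```

Note that the policer never inspects application class or payload, which is why this style of capacity sharing remains workable with encrypted traffic. What an operator does with an out-of-allowance packet (drop, delay, deprioritize) is a policy decision left open here.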
RFC 7778 describes how the ConEx mechanisms can be applied to current LTE (EPS) networks, considering their specifics regarding QoS and network architecture. For example, RFC 7778 describes how ConEx can
- enable or enhance flow policy-based traffic management;
- reduce the need for complex DPI by allowing for a bulk packet traffic management system that does not have to consider either the application classes flows belong to or the individual sessions; and how it can
- be used to more effectively trigger the offload of selected traffic to a non-3GPP network.
More experiments with ConEx and related capacity sharing mechanisms are needed, but the questions behind ConEx remain important for 5G (and beyond...): How to achieve an effective collaboration of networks and their users (senders and receivers) considering increased need for capacity sharing, increased demand for user privacy (connection encryption) and the permissionless innovation feature of the Internet, i.e., not expecting the network to know all possible application classes and their traffic management requirements.
URIs for Named Information
URIs [RFC3986] are used in various protocols for identifying resources. In many deployments those URIs contain strings that are hash function outputs in order to ensure uniqueness in terms of mapping the URI to a specific resource, or to make URIs hard to guess for security reasons. However, there is no standard way to interpret those strings and so today in general only the creator of the URI knows how to use the hash function output.
In the context of information-centric networking and elsewhere there is value in being able to compare a presented resource against the URI that was de-referenced in order to access that resource. If a cryptographically-strong comparison function can be used then this allows for many forms of in-network storage, without requiring as much trust in the infrastructure used to present the resource. The outputs of hash functions can be used in this manner, if presented in a standard way. There are also many other potential uses for these hash outputs, for example, in terms of binding the URI to an owner via signatures and public keys, mapping between names, handling versioning etc. Many such uses can be based on "wrapping" the object with meta-data, e.g. including signatures, public key certificates etc.
We therefore define the "ni" URI scheme that allows for, but does not insist upon, checking of the integrity of the URI/resource mapping.
The "ni" URI scheme is specified in draft-farrell-ni-00.
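The comparison of a presented resource against its name can be sketched as below. This is an illustration under stated assumptions: it uses the textual form `ni://authority/sha-256;value` with an unpadded base64url digest, as published later in RFC 6920, and the exact syntax in draft-farrell-ni-00 may differ. The helper names are invented for the example.

```python
import base64
import hashlib


def make_ni_uri(content: bytes, authority: str = "") -> str:
    """Build a name whose hash part is the SHA-256 digest of the content."""
    digest = hashlib.sha256(content).digest()
    value = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"ni://{authority}/sha-256;{value}"


def matches_ni_uri(uri: str, content: bytes) -> bool:
    """Check that the presented content hashes to the value carried in the URI."""
    prefix, separator, value = uri.partition("/sha-256;")
    if not separator or not prefix.startswith("ni://"):
        return False  # not a sha-256 name in the assumed form
    value = value.split("?", 1)[0]  # ignore any query part
    padded = value + "=" * (-len(value) % 4)  # restore base64 padding
    try:
        expected = base64.urlsafe_b64decode(padded)
    except ValueError:
        return False
    return expected == hashlib.sha256(content).digest()


uri = make_ni_uri(b"Hello World!")
genuine = matches_ni_uri(uri, b"Hello World!")
tampered = matches_ni_uri(uri, b"Hello World?")
```

Because the check depends only on the name and the bytes received, a copy served from any cache or in-network store can be validated without trusting that store. The scheme allows for this check but does not insist on it, so a receiver may also skip `matches_ni_uri` entirely.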
Towards an Information-Centric Internet with more Things
The Internet is already made of things. However, we expect there
to be many more less-capable things, such as sensors and
actuators, connected to the Internet in years to come. In
parallel, Internet applications are more and more being used to
perform operations on named (information) objects, and various
Information-Centric Networking (ICN) approaches are being
researched in order to allow such applications to work
effectively at scale and with various forms of mobility and in
networking environments that are more challenging than a
traditional access network and data center. In a recent position
paper, we outline some benefits that may accrue, and issues that
arise, should the Internet, with many more things, make use of
the ICN approach to networking and we argue that ICN concepts
should be considered when planning for increases in the number of
things connected to the Internet.
Venue: Interconnecting Smart Objects with the Internet Workshop, Prague, Friday, 25th March 2011
Paper: http://www.iab.org/about/workshops/smartobjects/papers/Kutscher.pdf
Presentation: http://www.iab.org/about/workshops/smartobjects/slides/Kutscher.pdf
DECADE Architecture
We have submitted a new version of the DECADE architecture draft, which is now a work item of the IETF DECADE WG.
Abstract:
Peer-to-peer (P2P) applications have become widely used on the
Internet today and make up a large portion of the traffic in many
networks. One technique to improve the network efficiency of P2P
applications is to introduce storage capabilities within the
networks. The DECADE Working Group has been formed with the goal of
developing an architecture to provide this capability. This document
presents an architecture, discusses the underlying principles, and
identifies core components and protocols supporting the architecture.
The Internet Draft: draft-ietf-decade-arch
Mobile Communication Congestion Exposure Scenario
We have written a description of how congestion exposure (as being worked on in the IETF CONEX WG) can be used in mobile communication networks such as LTE.
Abstract:
This memo describes a mobile communications use case for congestion
exposure (CONEX) with a particular focus on mobile communication
networks such as 3GPP LTE. The draft describes the architecture of
these networks (access and core networks), current QoS mechanisms and
then discusses how congestion exposure concepts could be applied.
Based on this, this memo suggests a set of requirements for CONEX
mechanisms that particularly apply to mobile networks.
The Internet Draft: draft-kutscher-conex-mobile
Bundle Protocol Query Extension Block
Internet Draft on a Bundle Protocol Query Extension Block (draft-farrell-dtnrg-bpq-00)
Abstract:
The Bundle Protocol (BP) provides store-and-forward networking for
Delay- and Disruption-Tolerant Networks. This document defines the
BP query extension block (BPQ) which allows applications to query the
stores of nodes on the path along which a bundle containing a bundle
query extension block is routed.
Requirements for accessing data in network storage
Internet Draft on requirements for accessing data in network storage.
Abstract:
The DECoupled Application Data Enroute (DECADE) working group is
specifying standardized interfaces for accessing in-network storage
from applications to store, retrieve and manage data. The main
objective is to provide a framework that is useful to P2P
applications, without excluding other, possibly related applications
that can benefit from accessing in-network storage. This memo
presents Internet TV as a specific application scenario where access
to in-network storage would be required and lists a set of concrete
requirements that should be considered for the DECADE architecture
and protocol specifications.