Compute First Networking (CFN): Distributed Computing meets ICN
Edge- and, more generally, in-network computing is receiving a lot of attention in research and industry fora. What are the interesting research questions from a networking perspective? In-network computing can be conceived in many different ways – from active networking, data plane programmability, running virtualized functions, and service chaining, to distributed computing. Modern distributed computing frameworks and domain-specific languages provide a convenient and robust way to structure large distributed applications and deploy them in either data center or edge computing environments. The current systems suffer, however, from the need for a complex underlay of services to allow them to run effectively on existing Internet protocols. These services include centralized schedulers, DNS-based name translation, stateful load balancers, and heavy-weight transport protocols.
Over the past years, we have been working on alternative approaches, exploring new ways of integrating networking and computing so that distributed computing can leverage networking capabilities directly and so that the usage of networking and computing resources can be optimized in a holistic fashion.
From Application-Layer Overlays to In-Network Computing
Domain-specific distributed computing languages like Lasp have gained popularity for their ability to simply express complex distributed applications like replicated key-value stores and consensus algorithms. Associated with these languages are execution frameworks like Sapphire and Ray that deal with implementation and deployment issues such as execution scheduling, layering on the network protocol stack, and auto-scaling to match changing workloads. These systems, while elegant and generally exhibiting high performance, are hampered by the daunting complexity hidden in the underlay of services that allow them to run effectively on existing Internet protocols. These services include centralized schedulers, DNS-based name translation, stateful load balancers, and heavy-weight transport protocols.
We claim that, especially for compute functions in the network, it is beneficial to design distributed computing systems in a way that allows for a joint optimization of computing and networking resources by aiming for a tighter integration of computing and networking. For example, leveraging knowledge about data location, available network paths and dynamic network performance can improve system performance and resilience significantly, especially in the presence of dynamic, unpredictable workload changes.
The above goals, we believe, can be met through an alternative approach to network and transport protocols: adopting Information-Centric Networking (ICN) as the paradigm. ICN is conceived as a networking architecture based on the principle of accessing named data, and specific systems such as NDN and CCNx have accommodated distributed computation by adding support for remote function invocation, for example in Named Function Networking (NFN) and RICE (Remote Method Invocation in ICN), as well as distributed data set synchronization schemes such as PSync.
Introducing Compute First Networking (CFN)
We propose CFN, a distributed computing environment that provides a general-purpose programming platform with support for both stateless functions and stateful actors. CFN can lay out compute graphs over the available computing platforms in a network to perform flexible load management and performance optimizations, taking into account function/actor location and data location, as well as platform load and network performance.
We have published a paper about CFN at the ACM ICN-2019 Conference that is being presented in Macau today by Michał Król. The paper makes the following contributions:
- CFN marries a state-of-the-art distributed computing framework to an ICN underlay through RICE, Remote Method Invocation in ICN. This allows the framework to exploit important properties of ICN such as name-based routing and immutable objects with strong security properties.
- We adopted the rigorous computation graph approach to representing distributed computations, which allows all inputs, state, and outputs (including intermediate results) to be directly visible as named objects. This enables flexible and fine-grained scheduling of computations, caching of results, and tracking state evolution of the computation for logging and debugging.
- CFN maintains the computation graph using Conflict-free Replicated Data Types (CRDTs) and realizes them as named ICN objects. This enables the implementation of an efficient and failure-resilient, fully distributed scheduler (a toy CRDT sketch follows this list).
- Through evaluations using ndnSIM simulations, we demonstrate that CFN is applicable to a range of different distributed computing scenarios and network topologies.
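To make the CRDT idea concrete, here is a toy sketch (not CFN's actual implementation; all names are invented): a grow-only set CRDT holding computation graph entries. Because the merge operation is a set union, which is commutative, associative, and idempotent, scheduler replicas can exchange their state as named, immutable objects in any order and still converge without coordination.

```python
# Toy sketch (not CFN's actual code): a grow-only set (G-Set) CRDT
# holding computation graph entries, e.g., names of (sub)computation results.

class GSetCRDT:
    """Grow-only set; merge is set union, which is commutative,
    associative, and idempotent -- the CRDT convergence conditions."""

    def __init__(self):
        self.entries = set()

    def add(self, name):
        # 'name' could be the ICN name of a computation result object
        self.entries.add(name)

    def merge(self, other):
        # Invoked when a replica retrieves another replica's state,
        # e.g., published as a named, immutable ICN object.
        self.entries |= other.entries

# Two schedulers updating the graph independently:
a, b = GSetCRDT(), GSetCRDT()
a.add("/cfn/graph/task/17/result")
b.add("/cfn/graph/task/42/result")
a.merge(b)
b.merge(a)
assert a.entries == b.entries  # replicas converge to the same graph view
```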
Resources and Links
- My keynote at IEEE CCNC-2019 on Compute-First Networking (CFN): New Perspectives on Integrating Computing and Networking
- Internet Draft draft-kutscher-coinrg-dir-00 with Jörg Ott and Teemu Kärkkäinen on Directions for Computing in the Network (COIN) and the corresponding presentation.
- The CFN paper at ACM ICN-2019: Compute First Networking: Distributed Computing meets ICN. The author team: Michał Król, Spyridon Mastorakis, David Oran, Dirk Kutscher
- CFN builds on our earlier work on RICE, Remote Method Invocation in ICN, authored by: Michał Król, Karim Habak, David Oran, Dirk Kutscher, Ioannis Psaras
- 1st ACM CoNEXT Workshop on Emerging in-Network Computing Paradigms (ENCP)
- IRTF Computing in the Network Proposed Research Group (COINRG)
Computing in the Network – Lessons Learned and New Opportunities
The Internet is a distributed system that enables distributed computing applications, from client-server web applications to collaborative multi-media applications. The evolution of both compute server and network infrastructure platforms has fueled the development of new approaches for building more programmable networks and of application support functions in the network.
At the same time, new applications such as IoT data processing and distributed machine learning, decomposed application architectures such as microservices, and distributed computing frameworks introduce new opportunities for the development of more principled approaches towards Computing in the Network.
In my invited talk at AINTEC-2023, I reviewed some promising use cases, highlighted recent relevant research results and discussed several research challenges for conceiving Computing in the Network from an Internet perspective, for example discussing the meaning of "end-to-end communication" and "permissionless innovation" in the light of these new developments.
From "In-Network Computing"...
"In-Network Computing" is a popular but also relatively poorly defined term that comes up a lot in recent research studies. I discussed the different facets such as traditional networked computing, middlebox-like packet processing, active networking, programmable dataplane, Network Functions Virtualization and Service Function Chaning as depicted in the figure below.
In general, we can distinguish two main directions:
- Computing on the Network: general distributed computing using Internet technologies for communication, such as the Web and related overlay networks such as CDNs.
- Middlebox-like packet processing: intercepting, manipulating, generating, and steering packets has been applied to production networks in data centers and telco networks, often as a performance enhancing approach.
What about Programmable Data Plane?
Programmable Data Plane approaches such as the P4 programming language are often used to implement certain elements of either of these two categories, for example, traffic steering, load balancing etc. There are some point solutions for more application-layer-oriented functionalities such as NetCache, support for distributed consensus protocols, support for distributed machine learning training etc., but these typically operate under very specific assumptions and are often at odds with end-to-end semantics and security. One example of a productive use of the Programmable Data Plane, in my opinion, was the SIGCOMM-2023 paper on NetClone: Fast, Scalable, and Dynamic Request Cloning for Microsecond-Scale RPCs by Gyuyeong Kim. In this work, programmable switches were used to implement request forwarding strategies based on relatively simple packet meta-information and observed performance, i.e., without requiring application-layer knowledge.
... To "Computing in the Network"
There are many relevant use cases of distributed computing that can benefit from (and urgently need) support from networking and where distributing processing, aggregation etc. with awareness of network topologies, current utilization etc. would make a real difference. We have earlier built such a system and called it Compute-First Networking: Distributed Computing meets ICN (see https://dirk-kutscher.info/publications/distributed-computing-icn/ for background).
I talked about relevant applications such as distributed stream processing and distributed machine learning. Today, these systems typically run on top of the network but could definitely benefit from better support by, and better awareness of, the network – so I asked whether there is the possibility of a confluence of existing and emerging capabilities of modern hardware and the requirements of relevant distributed computing applications.
Questions I raised included:
- How can we conceive such a confluence?
- How can we support distributed computing without giving up layering and principles such as the end-to-end principle?
- What features do we need from transport protocols to support diverse use cases?
Distributed Machine Learning
Distributed machine learning, e.g., federated learning, is an application that is currently perceived as a major driver for in-network computing. Large-scale training networks are expected to enable higher degrees of parallelization and handling of larger model sizes. How would we run such workloads as distributed systems, within data centers but potentially also across the Internet?
It is important to understand the performance requirements of such systems. Initial systems were built with bespoke High-Performance Computing (HPC) architectures and communication technologies such as InfiniBand. Such systems used in-network aggregation functions and defined corresponding architectures such as SHArP.
Today's data center systems employ RDMA and RDMA over Converged Ethernet (RoCE) as a low-layer abstraction for efficient packet-based communication on layer 2, without addressing higher-layer transport and system design aspects.
Collective Communications
In parallel computing architectures, the Message Passing Interface (MPI) is typically used to provide efficient and portable inter-process communication for high-performance computing. One of the concepts developed in MPI is Collective Communication, a set of bespoke data aggregation and distribution patterns for different data-oriented distributed computing scenarios, such as the following (a minimal code sketch follows the list):
- Broadcasting, e.g., for distributing configuration data or common ML models
- Scattering: a single process sends distinct pieces of data to each process
- Gathering: one process collecting and combining data pieces from other processes
- All-to-all communications: every process sends data to every other process
- Reduction: collecting data from all processes, aggregating it, and delivering the result
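As a concrete illustration, the following minimal sketch expresses these patterns with mpi4py (a common Python MPI binding); the payloads are toy values, and the script would be launched with an MPI runner, e.g., `mpirun -n 4 python collectives.py`:

```python
# Minimal sketch of MPI collective patterns using mpi4py (toy payloads).
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# Broadcast: the root distributes a common object (e.g., a model config).
config = comm.bcast({"lr": 0.01} if rank == 0 else None, root=0)

# Scatter: the root sends a distinct piece of data to each process.
chunk = comm.scatter([[i] * 2 for i in range(size)] if rank == 0 else None, root=0)

# Gather: the root collects one piece from every process.
pieces = comm.gather(sum(chunk), root=0)

# All-to-all: every process sends one distinct item to every other process.
exchanged = comm.alltoall([f"{rank}->{dst}" for dst in range(size)])

# Reduce: aggregate values from all processes (here: sum) at the root.
total = comm.reduce(rank, op=MPI.SUM, root=0)

if rank == 0:
    print(config, pieces, total)
```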
Today's Collective Communication implementations realize these patterns for different underlying networks and inter-process facilities. For GPU-based Collective Communications in today's networks, a ring-based communication scheme is often applied, leading to considerable inefficiencies with respect to communication overhead and idle times of the different processors. See this presentation from Tencent at the recent AIDC side meeting at IETF-118. Other implementations use peer-to-peer communication models.
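To make the ring approach and its cost concrete, here is a self-contained simulation (illustrative only, not any vendor's implementation) of ring allreduce with its two phases, reduce-scatter and allgather. With p processes it needs 2(p-1) communication steps, and each step is paced by the slowest link, which is one source of the idle time mentioned above.

```python
# Simulation of ring allreduce: reduce-scatter followed by allgather.
def ring_allreduce(data):
    """data[i][c]: chunk c of process i's local vector; returns the state
    after the 2*(p-1) ring steps, with every process holding the sums."""
    p = len(data)
    chunks = [row[:] for row in data]
    # Phase 1 (reduce-scatter): in step s, process i sends chunk (i - s) % p
    # to its ring neighbor (i + 1) % p, which accumulates it.
    for s in range(p - 1):
        sends = [(i, (i - s) % p, chunks[i][(i - s) % p]) for i in range(p)]
        for i, c, value in sends:          # all sends of a step are concurrent
            chunks[(i + 1) % p][c] += value
    # Process i now owns the fully reduced chunk (i + 1) % p.
    # Phase 2 (allgather): forward the reduced chunks around the ring.
    for s in range(p - 1):
        sends = [(i, (i + 1 - s) % p, chunks[i][(i + 1 - s) % p]) for i in range(p)]
        for i, c, value in sends:
            chunks[(i + 1) % p][c] = value
    return chunks

result = ring_allreduce([[10, 11, 12], [20, 21, 22], [30, 31, 32]])
assert all(row == [60, 63, 66] for row in result)  # element-wise sums everywhere
```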
Collective Communication in the Network
From a networking perspective, the question is how to map collective communication better to Internet technology-based networked systems, avoiding unnecessary duplication, providing typical transport protocol features such as reliability and congestion control, and enabling an optimal placement of corresponding aggregation functions.
This raises a set of challenges, such as:
- Transport
- Reliability: underlying network lacks communication reliability
- Application data units instead of packets
- Blocking & non-blocking communication modes
- Security (potentially)
- Multi-destination delivery
- IP-Multicast possibly not the best fit
- Computing in the Network Framework
- Generic operations as primitives (at least per application domain)
- Stringent performance requirements
- Control, Optimizations, Management
- Topology and utilization awareness
- Scheduling communication and computation for optimal performance
We discussed these challenges in two recently submitted Internet Drafts on Transport for Collective Communications, and I discussed these issues in more detail during the talk.
Data-Oriented Collective Communications
I proposed the direction of data-oriented Collective Communication and discussed how concepts from Information-Centric distributed computing could possibly be employed to achieve efficient and practical multi-destination transport, reliability and congestion control, and flexible placement of aggregation functions with a name-based identity scheme; a rough sketch of this direction follows the feature list below.
Promising features would include:
- Data-oriented communication model
- Locator-less model conducive to data production and consumption at different places in the network (computing)
- Multi-destination delivery included
- In-network retransmission and caching could help with reliability and performance
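As a thought experiment (names and API are purely hypothetical, not a specified protocol), a reduction in this model could amount to retrieving a named aggregate: an aggregation point resolves the result name by fetching the named inputs, possibly from caches or nearby replicas, and publishes an immutable result object that any number of consumers can then share.

```python
# Hypothetical sketch: a reduction expressed as retrieval of a named aggregate.
store = {  # named, immutable data objects published by workers
    "/worker/0/grad/step/7": [0.1, 0.2],
    "/worker/1/grad/step/7": [0.3, 0.4],
}

def fetch(name):
    # Stands in for an Interest/Data exchange; a cache hit could satisfy
    # the request without reaching the original producer.
    return store[name]

def aggregate(result_name, input_names):
    vectors = [fetch(n) for n in input_names]
    result = [sum(col) for col in zip(*vectors)]   # element-wise sum
    store[result_name] = result                    # publish named, cacheable result
    return result

aggregate("/allreduce/grad/step/7",
          ["/worker/0/grad/step/7", "/worker/1/grad/step/7"])
print(fetch("/allreduce/grad/step/7"))  # every consumer retrieves the same object
```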
However, I also mentioned some challenges:
- Receiver-driven transport results in polling – efficient enough?
- RDMA-like communication unexplored
- Security concept: data-oriented security good – unclear whether it can be afforded
- Exact scheduling may be at odds with current ICN system design – more work needed
In summary, this seems to be a rich field for future systems research. Distributed machine learning drives the development of new concepts for communication and computing. It clearly needs efficient multi-destination communication and an efficient mapping of MPI-inspired Collective Communication. The current abstractions do not fit well, and pure IP packet-level communication is too limited. Connection-oriented transport seems to be at odds with the communication semantics, which makes data-oriented communication attractive. Such a system could work with a name-based scheme, i.e., without addresses, which is conducive to flexible data production and consumption. Certainly, the challenging performance requirements call for more research and possibly an evolution of current ICN protocols.
References
- [CFN-ICN] Compute-First Networking: Distributed Computing meets ICN
- [DISTCOMPICN] Distributed Computing in ICN
- [IETFCollectiveCommunications] Collective Communication: Better Network Abstractions for AI
- [IETF118AIDC] Side meeting at IETF-118 on AI in Data Centers
- [IETF118CC] Side meeting at IETF-118 on Collective Communications
- [NETCLONE] NetClone: Fast, Scalable, and Dynamic Request Cloning for Microsecond-Scale RPCs
- [RoCE] RDMA over Converged Ethernet (RoCE)
- [SHARP] Richard L. Graham, Devendar Bureddy, Pak Lui, Hal Rosenstock, Gilad Shainer, Gil Bloch, Dror Goldenberg, Mike Dubman, Sasha Kotchubievsky, Vladimir Koushnir, Lion Levi, Alex Margolin, Tamir Ronen, Alexander Shpiner, Oded Wertheim, and Eitan Zahavi. 2016. Scalable hierarchical aggregation protocol (SHArP): a hardware architecture for efficient data reduction. In Proceedings of the First Workshop on Optimization of Communication in HPC (COM-HPC '16). IEEE Press, 1–10.
Distributed Computing in Information-Centric Networking
This is an introduction to our paper:
- Wei Geng, Yulong Zhang, Dirk Kutscher, Abhishek Kumar, Sasu Tarkoma, Pan Hui; SoK: Distributed Computing in ICN; 10th ACM Conference on Information-Centric Networking (ACM ICN '23); October 9-10, 2023, Reykjavik, Iceland; https://doi.org/10.1145/3623565.3623712; pre-print available at https://arxiv.org/abs/2309.08973.
Distributed computing is the basis for all relevant applications on the Internet. Based on well-established principles, different mechanisms, implementations, and applications have been developed that form the foundation of the modern Web.
The Internet, with its stateless forwarding service and end-to-end communication model, promotes certain types of communication for distributed computing. For example, IP addresses and/or DNS names provide different means for identifying computing components. Reliable transport protocols (e.g., TCP, QUIC) promote interconnecting modules. Communication patterns such as REST and protocol implementations such as HTTP enable certain types of distributed computing interactions, and security frameworks such as TLS and the Web PKI constrain the use of public-key cryptography for different security functions.
From Distributed Computing...
Distributed computing has different facets, for example, client-server computing, web services, stream processing, distributed consensus systems, and Turing-complete distributed computing platforms. There are also different perspectives on how distributed computing should be implemented on servers and network platforms, a research area that we refer to as Computing in the Network. Active Networking, one of the earliest works on computing in the network, intended to inject programmability and customization of data packets into the network itself; however, security and complexity considerations proved to be major limiting factors, preventing its wider deployment.
Data plane programmability refers to the ability to program behavior, including application logic, on network elements and SmartNICs, thus enabling some form of in-network computing. In addition, different types of server platforms and light-weight execution environments enable other forms of distributing computation in networked systems, e.g., architectural patterns such as edge computing.
... To Computing in the Network
With currently available Internet technologies, we can observe a relatively strict layering of networking and distributed computing, i.e., distributed computing is typically implemented in overlays, with Content Distribution Networks (CDNs) being a prominent and ubiquitous example. Recently, there has been growing interest in revisiting this relationship, for example by the IRTF Computing in the Network Research Group (COINRG) – motivated by advances in network and server platforms, e.g., the development of programmable data plane platforms, and by the development of different types of distributed computing frameworks, e.g., stream processing and microservice frameworks.
This is also motivated by the recent development of new distributed computing applications such as distributed machine learning (ML); emerging applications such as the Metaverse suggest new levels of scale, both in terms of data volume for distributed computing and in the pervasiveness of distributed computing tasks in such systems. Two research questions stem from these developments:
- How can we build distributed computing systems in the network that can leverage the on-path location of compute functions, e.g., optimally aligning stream processing topologies with networked computing platform topologies?
- How can the network support distributed computing in general, so that the design and operation of such systems can be simplified, but also so that different optimizations can be achieved to improve performance and robustness?
Issues in Legacy Distributed Computing
Although there are many distributed computing applications, it is also worth noting that there are many limitations and performance issues. Factors such as network latency, data skew, checkpoint overhead, back pressure, garbage collection overhead, and serialization/deserialization overhead, as well as general memory management issues, can all reduce efficiency. Various optimization techniques can be applied to alleviate these issues, including memory tuning, refining the checkpointing process, and adopting efficient data structures and algorithms.
Some performance problems and complexity issues stem from the overlay nature of current systems and their way of realizing the above-mentioned mechanisms with workarounds based on TCP/IP and associated protocols such as DNS. For example, Network Service Mesh has been characterized as architecturally complex because of the so-called sidecar approaches and their implementation problems.
In systems that are layered on top of HTTP or TCP (or QUIC), compute nodes typically cannot assess network performance directly – only indirectly through observed throughput and buffer under-runs. Information-centric data-flow systems such as IceFlow intend to provide better visibility, and thus better joint optimization potential, through more direct access to data-oriented communication resources. Moreover, some coordination tasks that are based on exchanging updates of shared application state can be elegantly mapped to named data publication in a hierarchical namespace, as the different dataset synchronization (Sync) protocols in NDN have demonstrated.
Information-Centric Distributed Computing
In our paper on Distributed Computing in ICN at ACM ICN-2023, we focus on distributed computing and on how information-centricity in the network and application layers can support the development and operation of such systems. The rich set of distributed computing systems in ICN suggests that ICN provides benefits for distributed computing, such as better performance, security, and productivity when building corresponding applications.
ICN, with its data-oriented operation and generally more powerful forwarding layer, provides an attractive platform for distributed computing. Several different distributed computing protocols and systems have been proposed for ICN, with different feature sets and different technical approaches, including Remote Method Invocation (RMI) as an interaction model as well as more comprehensive distributed computing platforms. RMI systems such as RICE leverage the fundamental name-based forwarding service in ICN systems and map requests to Interest messages and method names to content names (although the actual implementation is more intricate). Method parameters and results are also represented as content objects, which provides an elegant platform for such interactions.
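The core naming idea can be illustrated with a small sketch (illustrative only; RICE's actual protocol involves a multi-step handshake and reflexive forwarding, as noted above): an invocation becomes an Interest whose name identifies the function, and the parameters are bound to the request by a digest.

```python
# Illustrative sketch of RMI-style naming in ICN (not the RICE wire format).
import hashlib
import json

def invocation_name(prefix, method, params):
    # Parameters are serialized into an immutable object; its digest becomes
    # part of the Interest name, binding the request to its exact inputs.
    blob = json.dumps(params, sort_keys=True).encode()
    digest = hashlib.sha256(blob).hexdigest()[:16]
    return f"{prefix}/{method}/params-sha256={digest}", blob

name, params_object = invocation_name(
    "/cfn/nodeA", "resize", {"img": "/data/cat.png", "width": 64})
print(name)  # e.g., /cfn/nodeA/resize/params-sha256=...
# An Interest with this name is forwarded toward a node serving /cfn/nodeA;
# the result returns as a signed Data object under the same name and can be
# cached to satisfy identical future invocations.
```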
ICN generally attempts to provide a more useful service to data-oriented applications but can also be leveraged to support distributed computing specifically.
Names
Accessing named data in the network as a native service can remove the need for mapping application logic identifiers such as function names to network and process identifiers (IP addresses, port numbers), thus simplifying implementation and run-time operation, as demonstrated by systems such as Named Function Networking (NFN), RICE, and IceFlow. It is worth noting that, although ICN does not generally require an explicit mapping of names to other domain identifiers, such networks require suitable forwarding state, e.g., obtained from configuration, dynamic learning, or routing.
Data-orientedness
ICN's notion of immutable data with strong name-content binding through cryptographic signatures and hashes seems to be conducive to many distributed computing scenarios, as both static data objects and dynamic computation results (such as input parameters and result values) can be sent directly as ICN data objects, as first demonstrated by NFN.
Securing distributed computing can also be supported better insofar as ICN does not require additional dependencies on public-key infrastructure or channel ("pipe") security, as keys and certificates are simply named data objects, and centralized trust anchors are not necessarily needed. Larger data collections can be aggregated and re-purposed through manifests (FLIC), enabling "small" and "big data" computing in a single framework that is congruent with the packet-level communication in a network. IceFlow uses such an aggregation approach to share identical stream processing result objects in multiple consumer contexts.
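A rough sketch of the manifest idea (not the actual FLIC encoding; names are invented): a large result is split into packet-sized chunks, each named by its content hash, and the manifest is itself a small named object listing those chunk names, so identical chunks can be re-used across collections and consumer contexts.

```python
# Sketch of manifest-based aggregation of a larger data collection.
import hashlib

def publish_chunks(data, chunk_size=8):
    chunks = {}
    for i in range(0, len(data), chunk_size):
        piece = data[i:i + chunk_size]
        # The name is bound to the content by its hash (immutability).
        name = "/chunk/sha256=" + hashlib.sha256(piece).hexdigest()[:16]
        chunks[name] = piece
    return chunks

payload = b"intermediate stream processing result"
chunks = publish_chunks(payload)
manifest = {"name": "/iceflow/stage2/out/42", "chunks": list(chunks)}
# A consumer fetches the (signed) manifest, then the chunks -- each of which
# may be served by a cache or shared with other consumers.
reassembled = b"".join(chunks[n] for n in manifest["chunks"])
assert reassembled == payload
```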
Data-orientedness eliminates the need for connections; even reliable communication in ICN is completely data-oriented. If higher-layer (distributed computing) transactions can be mapped to network-layer data retrieval, then server complexity can be reduced (no need to maintain several connections), and consumers get direct visibility into network performance. This can enable performance optimizations, such as linking network and computing flow control loops (one realization of joint optimization), as shown by IceFlow.
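One way to picture such linked control loops (an illustrative reading, not IceFlow's actual algorithm): a consumer issues Interests only when both the network congestion window and the downstream processing capacity have room, so fetching and computing stay jointly paced.

```python
# Sketch of coupling network and compute flow control in one credit scheme.
class JointPacer:
    def __init__(self, cwnd, compute_slots):
        self.cwnd = cwnd                    # network credit: Interests in flight
        self.compute_slots = compute_slots  # compute credit: items queued for processing

    def can_request(self):
        return self.cwnd > 0 and self.compute_slots > 0

    def on_interest_sent(self):
        self.cwnd -= 1
        self.compute_slots -= 1

    def on_data_received(self):
        self.cwnd += 1                      # network credit returns immediately

    def on_item_processed(self):
        self.compute_slots += 1             # compute credit returns after processing

pacer = JointPacer(cwnd=8, compute_slots=4)
sent = 0
while pacer.can_request():
    pacer.on_interest_sent()
    sent += 1
print(sent)  # 4: the compute loop, not the network, is the bottleneck here
```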
Location independence and data sharing
Embracing the principle of accessing named and authenticated data also enables location independence, i.e., corresponding data can be obtained from any place in the network, such as replication points (repos) and caches. This fundamentally enables better multi-source/multi-path capabilities as well as data sharing, i.e., multiple data retrieval operations for one named data object by different consumers can potentially be satisfied by a cache, repo, or peer in the network.
Stateful Forwarding
ICN provides stateful, symmetric forwarding, which enables general performance optimizations such as in-network retransmissions, more control over multipath forwarding, and load balancing. This concept could be extended to support distributed computing specifically, for example, if load balancing is performed based on RTT observations for idempotent remote-method invocations.
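A hypothetical forwarding-strategy sketch (face names and the smoothing choice are mine, not a specified NDN strategy): with symmetric forwarding, a forwarder observes the RTT of each Interest/Data exchange per next hop and can steer idempotent invocations toward the currently fastest replica.

```python
# Sketch of RTT-based next-hop selection for idempotent remote invocations.
class RTTBalancer:
    def __init__(self, faces, alpha=0.2):
        self.alpha = alpha
        self.rtt = {f: None for f in faces}  # smoothed RTT per next hop

    def record(self, face, rtt_ms):
        # Exponentially weighted moving average of observed RTTs.
        old = self.rtt[face]
        self.rtt[face] = rtt_ms if old is None else (1 - self.alpha) * old + self.alpha * rtt_ms

    def choose(self):
        # Probe unmeasured faces first, then prefer the lowest smoothed RTT.
        for face, rtt in self.rtt.items():
            if rtt is None:
                return face
        return min(self.rtt, key=self.rtt.get)

lb = RTTBalancer(["face-to-replicaA", "face-to-replicaB"])
lb.record("face-to-replicaA", 12.0)
lb.record("face-to-replicaB", 30.0)
print(lb.choose())  # -> face-to-replicaA
```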
More Networking, less Management
The combination of data-oriented, connection-less operation and stateful (more powerful) forwarding in ICN shifts functionality from management and orchestration layers (back) to the network layer, enabling complexity reductions that can be especially pronounced in distributed computing. For example, legacy stream processing and service mesh platforms typically must manage connectivity between deployment units (pods in Kubernetes). In Apache Flink, a central orchestrator manages the connections between task managers (node agents). Systems such as IceFlow have demonstrated a more self-organized and decentralized stream-processing approach, and the presented principles are applicable to other forms of distributed computing.
In summary, we can observe that ICN's general approach of having the network provide a more natural (data retrieval) platform for applications benefits distributed computing in similar ways as it benefits other applications. One particularly promising aspect is the elimination of layer barriers, which enables certain optimizations.
In addition to NFN, there are other approaches that jointly optimize the utilization of network and computing resources: network service mesh-like platforms, such as edge intelligence using federated learning; advanced CDNs whose nodes can dynamically adapt to user demands according to content popularity, such as iCDN and OpenCDN; and general computing systems, such as Compute-First Networking, IceFlow, and ICedge.
Our paper on Distributed Computing in ICN at ACM ICN-2023 provides a comprehensive analysis and understanding of distributed computing systems in ICN, based on a survey of more than 50 papers. Naturally, these different efforts cannot be directly compared due to their differences in nature. We categorized the different ICN distributed computing systems and individual approaches, and highlighted their specific properties.
The scope of this study is technologies for ICN-enabled distributed computing. Specifically, we divide the different approaches into four categories, as shown in the figure above: enablers, protocols, orchestration, and applications. The contributions of this study are as follows:
- A discussion of the benefits and challenges of distributed computing in ICN.
- A categorization of different proposed distributed computing systems in ICN.
- A discussion of lessons learned from these systems.
- A discussion of existing challenges and promising directions for future work.
Recent Research on Distributed Computing in ICN
I am providing some pointers to my previous research on distributed computing in ICN below.
The paper that has led to this article:
- Wei Geng, Yulong Zhang, Dirk Kutscher, Abhishek Kumar, Sasu Tarkoma, Pan Hui; SoK: Distributed Computing in ICN; 10th ACM Conference on Information-Centric Networking (ACM ICN '23); October 9-10, 2023, Reykjavik, Iceland; https://doi.org/10.1145/3623565.3623712; pre-print available at https://arxiv.org/abs/2309.08973.
Current work in the Computing in the Network Research Group of the IRTF:
- Dirk Kutscher, Teemu Kärkkäinen, Jörg Ott; Directions for Computing in the Network; Internet Draft draft-irtf-coinrg-dir-00, Work in Progress; August 2023
Reflexive Forwarding and Remote Method Invocation
Providing a unified remote computation capability in ICN presents some unique challenges, among which are timer management, client authorization, and binding to state held by servers, while maintaining the advantages of ICN protocol designs like CCN and NDN. In the RICE work, we developed a unified approach to remote function invocation in ICN that exploits the attractive ICN properties of name-based routing, receiver-driven flow and congestion control, flow balance, and object-oriented security, while presenting a natural programming model to the application developer. The RICE protocol leverages an ICN extension called Reflexive Forwarding that provides ICN-idiomatic method parameter transmission.
- RICE: Remote Method Invocation in ICN (best paper award at ACM ICN-2018)
- Reflexive Forwarding in ICN
Distributed Computing Frameworks
Leveraging RICE as a mechanism, we have developed Compute-First Networking (CFN) in ICN, a Turing-complete distributed computing platform. IceFlow is a proposal for decentralized dataflow processing in ICN.
- Compute-First Networking (CFN): Distributed Computing Meets ICN
- IceFlow: Information-Centric Dataflow: Re-Imagining Reactive Distributed Computing
Applications
Based on Reflexive Forwarding, we have developed a concept for RESTful ICN that leverages CCNx key exchange for setting up security contexts and keys that could then be used for secure, data-oriented REST-like communication.
Delay-Tolerant LoRa leveraged Reflexive Forwarding to enable constrained LoRa nodes to "phone home" when they want to transmit data, thus enabling new ways (without central network and application servers) for connecting LoRa networks to the Internet.
Reflexive Forwarding in Named Data Networking
Current Information-Centric Networking protocols such as CCNx and NDN have a wide range of useful applications in content retrieval and other scenarios that depend only on a robust two-way exchange in the form of a request and response (represented by an Interest-Data exchange in the case of the two protocols noted above). A number of important applications, however, require placing large amounts of data in the Interest message, and/or more than one two-way handshake.
While these can be accomplished using independent Interest-Data exchanges by reversing the roles of consumer and producer, such approaches can be both clumsy for applications and problematic from a state management, congestion control, or security standpoint. Reflexive Forwarding is a proposed extension to the CCNx and NDN protocol architectures that eliminates the problems inherent in using independent Interest-Data exchanges for such applications.
The protocol is specified in draft-oran-icnrg-reflexive-forwarding and has been used in a few of our research projects such as:
- RICE: Remote Method Invocation in ICN (best paper award at ACM ICN-2018)
- Compute-First Networking (CFN): Distributed Computing Meets ICN
- RESTful ICN
- Delay-Tolerant LoRa ICN Networking
My student intern Xinchen Jin from ShanghaiTech has implemented the Reflexive Forwarding specification in NDN (with modifications to ndn-cxx and NFD) and set up a testbed in Mini-NDN for experiments over multiple forwarders.
Recruiting PostDocs, PhD and MPhil Students for Networked Systems Research
I am looking for PostDocs, PhD students, and MPhil students for joining my Networked Systems team at The Hong Kong University of Science and Technology in Guangzhou, China.
HKUST is a leading international research university ranked 1st by Times Higher Education Young University Rankings 2020 and 27th by QS World University Rankings 2021. Our new HKUST(GZ) campus in Guangzhou synergizes with and maintains the same academic standard as the original Hong Kong Clear Water Bay campus.
HKUST(GZ) follows a new, innovative cross-disciplinary approach, where computer science research interacts with hard and natural sciences, systems engineering, and socio-economic research.
Research Areas
I am pursuing systems research on topics such as:
- Distributed Computing and Networking (Compute-First Networking, Computing in the Network);
- Information-Centric Networking (ICN); and
- Internet architecture and decentralized communication.
We are addressing different applications such as:
- Enabling new networked systems such as next-generation Web, network-supported AR/VR ("Metaverse");
- Advancing the Internet and the Web to a more secure, privacy-preserving, and overall more user-centric infrastructure;
- Secure and scalable edge computing;
- Infrastructure for data science; and
- Data-oriented IoT.
Expected Qualifications and Background
- Ability to build software systems;
- Knowledge in computer networking and distributed systems; and
- Ambition to combine excellent research with building systems and artefacts that matter.
If you are interested in joining HKUST(GZ) as a postdoc, PhD student, or MPhil student, please feel free to reach out to me. My e-mail address at HKUST: dku@ust.hk
Dagstuhl Seminar on Compute-First Networking
Eve Schooler, Jon Crowcroft, Phil Eardley, and I organized an online Dagstuhl seminar on Compute-First Networking earlier in June that was attended by an excellent group of researchers from the distributed computing, networking, and data analytics communities.
Dagstuhl has now published the seminar report, which discusses new perspectives on Computing in the Network and its use cases, and which includes many references to relevant literature and ongoing projects in the field.
Executive Summary
Edge computing and, more generally, in-network computing are key elements in many traditional content distribution services today, typically connecting cloud-based computing to consumers. The advent of new programmable hardware platforms, research on and wide deployment of distributed computing technologies for data processing, as well as exciting new use cases such as distributed Machine Learning and Metaverse-style ubiquitous computing are now inspiring research into more fine-grained and more principled approaches to distributed computing in the "Edge-To-Cloud Continuum".
The Compute-First Networking Dagstuhl seminar has brought together researchers and practitioners in the fields of distributed computing, network programmability, Internet of Things, and data analytics to explore the potential, possible technological components, as well as open research questions in an exciting new field that will likely induce a paradigm shift for networking and its relationship with computing.
Traditional overlay-based in-network computing is typically limited to quite specific purposes, for example CDN-style edge computing. At the same time, network programmability approaches such as Software-Defined Networking and corresponding languages such as P4 are often perceived as too limited for application-level programming. Compute-First Networking (CFN) views networking and computing holistically and aims at leveraging network programmability, server- and serverless in-network computing, and modern distributed computing abstractions to develop a new systems approach for an environment where computing is not merely an add-on to existing networks, but where networking is re-imagined with a broader and ubiquitous notion of programmability.
We expect this approach to enable several benefits: it can help to unlock distributed computing from the existing silos of individual cloud and CDN platforms – a necessary condition to enable Keiichi Matsuda's vision of Hyper-Reality and Metaverse concepts where the physical world, human users and different forms of analytics, and visual rendering services constantly engage in information exchanges, directly at the edges of the network. It can also help to provide reliable, scalable, privacy-preserving and universally available platforms for Distributed Machine Learning applications that will play a key role in future large-scale data collection and analytics.
CFN's integrated approach allows for several optimizations, for example a more informed and more adaptive resource optimization that can take into account dynamically changing network conditions, availability and utilization of compute platforms, as well as application requirements and adaptation boundaries, thus enabling more responsive and better-performing applications.
Several interesting research challenges have been identified that should be addressed in order to realize the CFN vision: How should the different levels of programmability in today's systems be integrated into a consistent approach? What would programming and communication abstractions look like? How do orchestration systems need to evolve in order to be usable in these potentially large-scale scenarios? How can we guarantee security and privacy properties of a distributed computing slice without having to rely on just location attributes? How would the special requirements and properties of relevant applications such as Distributed Machine Learning best be mapped to CFN – or should distributed data processing for federated or split Machine Learning play a more prominent role in designing CFN abstractions?
This seminar was an important first step in identifying the potential and a first set of interesting new research challenges for re-imagining distributed computing through CFN – an exciting new topic for networking and distributed computing research.
ACM ICN-2020 Highlights
ACM ICN-2020 took place online from September 29th to October 1st 2020. This is a quick summary of the main technical highlights from my personal perspective. Overall, it was a high-quality event, and it was great to see the progress that is being made by different teams. Here, I am focusing specifically on Architecture, Content Distribution, Programmability, and Performance. If you are interested in the complete program, all papers, presentation material, and presentation videos are available on the conference website.
Architecture
The Information-Centric Networking concept can be implemented in different ways (and some people would argue that some overlay systems for content distribution and data processing are essentially information-centric). ICN systems have often been associated with clean-slate approaches, requiring a difficult-to-imagine fork-lift replacement of larger parts of the infrastructure. While this has never been the case (because you can always run ICN protocols over different underlays or directly map the semantics to IPv6), it is still interesting to learn about new approaches and to compare existing data-oriented frameworks to pure ICN systems.
Named-Data Transport
In their paper Named-Data Transport: An End-to-End Approach for an Information-Centric IP Internet (Presentation), Abdulazaz Albalawi and J. J. Garcia-Luna-Aceves have developed an alternative implementation of the accessing-named-data concept called Named-Data Transport (NDT) that can leverage existing Internet routing and DNS, while still providing the general ICN properties (accessing named data securely, in-network caching, receiver-driven operation).
The system is based on three components: 1) A connection-free reliable transport protocol, called Named Data Transport Protocol (NDTP), 2) a DNS extension (my-DNS) for manifest records that describe content items and their chunks, and 3) NDT Proxies that act as transparent caches and that track pending requests, similar to ICN forwarders, but at the transport layer.
In NDT, content names are based on DNS domain names, and each name is mapped to an individual manifest record (in the DNS). These records provide a mapping to a list of IP addresses hosting content replicas. When requesting such records, the idea is that the system would be able to apply similar traffic steering as today's CDNs, i.e., provide the requestor with a list of topologically close locations. Producers would be responsible for producing and publishing such manifests.
The Named Data Transport Protocol (NDTP) is a receiver-driven transport protocol (on top of UDP) used by consumers and NDT Proxies which behave logically like ICN forwarders. There is more to the whole approach (such as security, name privacy etc.).
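Piecing the described components together, a consumer-side flow might look roughly like this (field names and calls are guesses for illustration, not the paper's actual record or wire format):

```python
# Rough reconstruction of the NDT flow described above (all names invented).
manifest_record = {                # hypothetical my-DNS manifest record
    "name": "video.example.com/clip1",
    "chunks": ["sha256:aa...", "sha256:bb..."],
    "replicas": ["198.51.100.7", "203.0.113.9"],  # ordered by topological closeness
}

def request_chunk(replica, chunk_id):
    # Stub for one receiver-driven NDTP request/response over UDP; an on-path
    # NDT proxy holding the chunk could answer without contacting the replica.
    return f"<payload {chunk_id} from {replica}>"

def fetch_content(record):
    replica = record["replicas"][0]          # closest replica first, CDN-style
    return [request_chunk(replica, c) for c in record["chunks"]]

print(fetch_content(manifest_record))
```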
In my view, NDT is an example of a resolution-based ICN system with interesting ideas for deployability. In principle, resolution-based ICN has been pursued by other approaches before (such as NetInf). In general, these systems have a better initial deployment story at the cost of requiring additional infrastructure (and resolution steps during operation).
RESTful Information-Centric Web of Things
In the Internet of Things, ICN has demonstrated many benefits in terms of reduced code complexity, better data availability, and reduced communication overhead compared to many vertically integrated IoT stacks and location/connection-based protocols.
In their paper Toward a RESTful Information-Centric Web of Things: A Deeper Look at Data Orientation in CoAP (presentation), Cenk Gündoğan, Christian Amsüss, Thomas C. Schmidt, and Matthias Wählisch compare a CoAP- and OSCORE-based (Object Security for Constrained RESTful Environments) network of CoAP clients, servers, and proxies with a corresponding NDN setup.
The authors investigated the possibility of building a RESTful Web of Things that adheres to ICN first principles using the CoAP protocol suite (instead of a native ICN protocol framework). The results showed that, since CoAP is quite modular and can be used in different ways, this is indeed possible if one is willing to give up strict end-to-end semantics and to introduce proxies that mimic ICN forwarder behavior. (The paper reports on many other things, such as extensive performance measurements and comparisons.)
In my view, this is an interesting Gedankenexperiment, and there was a lively discussion at the conference. One of the discussion topics was the question of how accurate the comparison really is. For example, while it is possible to construct a CoAP proxy chain that mimics ICN behavior, real-world scenarios would require additional functionality in the CoAP network (routing, dealing with disruptions etc.) that might lead to a different level of complexity (one that would possibly be less pronounced in a native ICN environment).
Still, the important take-away of this paper is that some applications of CoAP & OSCORE exhibit information-centric properties, and it is an interesting question whether, for a green-field deployment, the user would not be better served by a native ICN approach.
Content Distribution
Content Distribution and ICN have a long history, sometimes challenged by some misunderstandings. Because one of the early ICN approaches was called Content-Centric Networking (CCN), it was often assumed that ICN would disrupt or replace Content Distribution Networks (CDNs) or that it was a CDN-like technology.
While ICN will certainly help with large-scale content distribution and potentially also change/simplify CDN operations, the core idea is actually about accessing named data securely as a principal network service -- for all applications (that's why Named Data Networking -- NDN -- is a better name).
Managed content distribution as such will continue to be important, even in an ICN world. Surely, it will enjoy better support from the network than today's CDNs can expect, thus enabling exciting new applications and simplifying operations, but I prefer avoiding the notion of ICN replacing CDN.
When looking at actual networks and applications today, it is fair to say that almost nothing works without CDNs. What we are seeing today is hyperscalers and essentially all the (so-called) OTT video providers extending their systems into ISP networks, simply by shipping standalone edge caches, such as Netflix OCA servers, to ISPs.
Each of these providers has its own special requirements for how to map customers to edge caches, how to implement traffic steering etc., which is painful enough for operators already. I expect this to become even more pressing as we shift more and more linear live TV to the Internet. Flash-crowd audiences such as viewers of UEFA Champions League matches will require a massive extension of the already extensive edge caching infrastructure, entailing massive investments but also significant complexity with respect to traffic steering and guaranteeing a decent viewing experience.
In that context, it is no wonder that people resort to IP-Multicast for ensuring more scalable last-mile distribution, such as this proposal by Akamai and others. Marrying IP-Multicast with a CDN overlay is (IMO) not exactly complexity reduction, so I think we are now at a tipping point where the Internet, in terms of concepts and deployable physical infrastructure, can provide many cool services, but where the limited features of the network layer require a prohibitive amount of complexity -- to an extent where people start looking for better solutions.
At ICN-2020, CDN was thus discussed quite extensively again -- with many interesting, complementary contributions.
Keynote by Bruce Maggs on The Economics of Content Distribution
We were extremely happy to have Bruce Maggs (Emerald Innovations, on leave from Duke University, ex-NEC researcher, one of the founding employees of Akamai) deliver his keynote on the Economics of Content Delivery. In his talk, Bruce explained different economic aspects (flow of payments, cost of goods sold) but also challenges for different CDN services such as live-streaming.
The take-aways for ICN were:
- Incentives and cost must be aligned
- Performance benefits from caching
- Reducing latency is valuable to content providers
- Reducing network traffic is valuable to ISPs.
- If there were caching at the core (in addition to the edge):
- What is the additional benefit?
- Who pays for that?
- Protocol innovation is still possible
- In the past, people thought that HTTP/TLS/TCP/IP would be difficult to overcome
- QUIC demonstrates that new protocols can be introduced
The socio-economic discussion resonated quite well with me, as some of the earlier ICN projects in Europe tried to address these aspects relatively early, starting in 2008. I believe this was due to the operator and vendor influence at the time. In retrospect, I would say that the approaches at that time were possibly too top-down and premature (trying to revert value chains and find new business models). It is only now that we understand the economics of CDN, its complexity, and the real costs that (in my view) represent barriers to innovation -- and that we can start to imagine actually implementing different systems.
Far Cry: Will CDNs Hear NDN's Call?
In their paper Far Cry: Will CDNs Hear NDN's Call? (presentation), Chavoosh Ghasemi, Hamed Yousefi, and Beichuan Zhang compare NDN with an enterprise CDN (a particular variant of CDN) with respect to caching and retrieval of static content.
In their work, the authors deployed an adaptive video streaming service over three different networks: Akamai, Fastly, and the NDN testbed. They had users on four different continents and conducted a two-week experiment, comparing quality of experience, origin workload, failure resiliency, and content security.
I cannot summarize all of the results here, but the authors' conclusions were:
- CDNs outperform the current NDN testbed deployment in terms of QoE (achievable video resolution in a DASH-setting)
- Origin workload and failure resiliency are mainly the products of the network design -- and the NDN testbed outperforms current CDNs
- More as an interpretation: NDN can realize a resilient, secure, and scalable content network given appropriate software and protocol maturity and hardware resources.
The paper was discussed intensively at the conference; for example, it was debated how comparable the plain NDN testbed and its network service really are to a production-level CDN.
In my view, the value of this paper lies in the created experiment facilities and the attempt to establish some ground truth (based on current NDN maturity). I hope that this work can be leveraged by more experiments in the future.
iCDN: An NDN-based CDN
In their paper iCDN: An NDN-based CDN (presentation), Chavoosh Ghasemi, Hamed Yousefi, and Beichuan Zhang (i.e., the same authors) pursue a more forward-looking approach. In this paper, they develop a CDN service based on ICN mechanisms, i.e., they try to conceive a future CDN system that does not need to take the current network's limitations into account.
One of the interesting ICN properties is that the main service of accessing named data does not require any notion of location. Sometimes people assume that an Information-Centric system always needs to map names to locators such as IP addresses, but this is a rather limited view. Instead, it is possible to build the network solely on forwarding INTERESTs for named data based on forwarding information in that same namespace. A forwarder may have more than one forwarding information base (FIB) entry for the same name -- from a consumer (application) perspective, these are completely equivalent.
Because of intrinsic object security, it does not matter from which particular host a content object is served. There can be several copies -- all equivalent. When creating copies of original content, e.g., by cloning a data repository, the new copy needs to be announced (by injecting routing information), and from that point on, it is reachable without any additional management, configuration, or other out-of-band mechanisms.
When applying this notion to CDN scenarios, it is easy to understand the simplification opportunities. In ICN, content repositories can be added to the network, and in-network name-based forwarding will find the closest copy automatically.
For iCDN, the authors have leveraged this basic notion and built an ICN-based CDN that does not need any client-to-cache mapping and overlay routing mechanisms. Based on that, iCDN features logical partitions and cache hierarchies for content namespaces (for acknowledging that there may be different CDN providers, hosting different content services).
iCDN employs cache hierarchies to exploit on-path and off-path caches without relying on application-layer routing functions. The idea was to provide a scalable, adaptive solution that can cope with dynamic network changes as well as dynamic changes in content popularity.
There are more details to this approach, and of course the debate on the best ICN-based CDN design has just started. Still, this paper is an interesting contribution in my view, because it nicely illustrates the opportunities for rethinking CDN.
Programmability
Programmability and ICN have two facets: 1) implementing distributed computing with ICN (for example, as in CFN -- Compute-First Networking) and 2) implementing ICN with programmable infrastructure. ACM ICN-2020 saw contributions in both directions.
Result Provenance in Named Function Networking
In their paper Result Provenance in Named Function Networking (presentation), Claudio Marxer and Christian Tschudin have leveraged their previous work on Named Function Networking (NFN) and developed a result provenance framework for distributed computing in NFN.
In this work, the authors augmented NFN with a data structure that creates transparency about the genesis of every evaluation result so that entities in the system can ascertain result provenance. The main idea is the introduction of so-called provenance records that capture metadata about the genesis of a computation result. The paper discusses the integration of these records into NDN and procedures for provenance checks and trust computation.
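The flavor of such a record can be sketched as follows (a toy structure with invented fields; the paper's actual record format, signing, and trust computation differ): every published result carries a signed record naming the function and the inputs it was derived from, so a verifier can walk the chain before trusting a cached result.

```python
# Toy provenance record linking a result to its function and inputs.
import hashlib
import json

def provenance_record(result_name, function_name, input_names, producer):
    record = {
        "result": result_name,        # name of the published result object
        "function": function_name,    # e.g., /nfn/wordcount
        "inputs": input_names,        # names of the input data objects
        "producer": producer,         # identity/key name of the executing node
    }
    blob = json.dumps(record, sort_keys=True).encode()
    # Stand-in for a real cryptographic signature over the record:
    record["signature"] = f"{producer}:{hashlib.sha256(blob).hexdigest()[:16]}"
    return record

rec = provenance_record("/nfn/results/wc-42", "/nfn/wordcount",
                        ["/data/book1", "/data/book2"], "/keys/nodeA")
print(rec["signature"])
```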
In my view, the interesting contribution of this work is the illustration of how the general concept of provenance verification can be implemented in a data-oriented system such as the ICN-based Named Function Networking framework. The results may be transferable (to some extent) to other ICN-based in-network computing systems, so I hope this paper will start a thread of activities on this subject.
ENDN: An Enhanced NDN Architecture with a P4-programmable Data Plane
In their paper ENDN: An Enhanced NDN Architecture with a P4-programmable Data Plane (presentation), Ouassim Karrakchou, Nancy Samaan, and Ahmed Karmouch present an NDN system that is implemented in a P4-programmable data plane, i.e., a system in which applications can interact with a control plane that configures the data plane according to the required services.
The work in this paper is based on the notion that applications specify their content delivery requirements to the network, i.e., to the control plane of a network. The control plane provides a catalogue of content delivery services, which are then translated into data plane configurations that ultimately get installed on P4 switches.
Examples of such services include Content Delivery Pattern services (whether the system is based on INTEREST/DATA or some stateful data forwarding), Content Name Rewrite services (enabling the network to rewrite certain names in INTERESTs), Adaptive Forwarding services (next-hop selection) etc.
In my view, this paper is interesting because it provides a relatively advanced perspective of how applications specify required behavior to a programmable ICN network. Moreover, the authors implemented this successfully on P4 switches and described relevant lessons learned and achievements in the paper.
Performance
Performance has historically always been an interesting topic in ICN. On the one hand, ICN can provide substantial performance benefits in the network due to its forwarding and caching features. On the other hand, it has been shown that implementing an ICN forwarder that operates at modern network line-speeds is challenging.
NDN-DPDK: NDN Forwarding at 100 Gbps on Commodity Hardware
In their paper NDN-DPDK: NDN Forwarding at 100 Gbps on Commodity Hardware (presentation), Junxiao Shi, Davide Pesavento, and Lotfi Benmohamed present their design of a DPDK-based forwarder.
The authors have developed a complete NDN implementation that runs on real hardware and that supports the complete NDN protocol and name matching semantics.
This work is interesting because the authors describe the different optimization techniques, including better algorithms and more efficient data structures, as well as making use of the parallelism offered by modern multi-core CPUs and multiple hardware queues with user-space drivers for kernel bypass.
This work represents the first software forwarder implementation that is able to achieve 100 Gbps without compromises in NDN protocol semantics. The authors have published the source code at https://github.com/usnistgov/ndn-dpdk.
ACM CoNEXT Workshop on Emerging In-Network Computing Paradigms (ENCP)
Edge- and, more generally, in-network computing is receiving a lot of attention in research and industry fora. The ability to decentralize computing, to achieve low-latency communication to distributed application logic, and the potential for privacy-preserving analytics are just a few examples that motivate a new approach to looking at computing and networking.
What are the interesting research questions from a networking and distributed computing perspective? In-network computing can be conceived in many different ways – from active networking, data plane programmability, running virtualized functions, and service chaining, to distributed computing. What abstractions do we need to program, optimize, and manage such systems? What is the relationship to cloud networking?
These questions will be discussed at the first Workshop on Emerging In-Network Computing Paradigms (ENCP), which takes place at ACM CoNEXT-2019 on December 9th in Orlando.
We have received many interesting submissions and were able to put together a really interesting program that covers both Network Programmability and In-Network Computing Architectures and Protocols. Check out the full program here.
Many thanks to my co-organizers Spyros Mastorakis and Abderrahmen Mtibaa, to our steering committee members Jon Crowcroft, Satyajayant (Jay) Misra, and Dave Oran, and to our great Technical Program Committee for putting this together.
ACM ICN-2019 Highlights
ACM ICN-2019 took place in the week of September 23 in Macau, SAR China. The conference was co-located with two Information-Centric-Networking-related side events: the TouchNDN Workshop on Creating Distributed Media Experiences with TouchDesigner and NDN before the conference, and an IRTF ICNRG meeting after it. In the following, I provide a summary of some highlights of the whole week from my (naturally very subjective) perspective.
Applications
ICN, with its paradigm of accessing named data in the network, is supposed to provide a different, hopefully better, service to applications compared to the traditional stack of TCP/IP, DNS, and application-layer protocols. Research in this space often addresses one of two interesting research questions: 1) What is the potential for building or re-factoring applications that use ICN, and what is the impact on existing designs? And 2) what requirements can be learned for the evolution of ICN, what services are useful on top of an ICN network layer, and/or how should the ICN network layer be improved?
Network Management
The best paper at the conference, Lessons Learned Building a Secure Network Measurement Framework using Basic NDN by Kathleen Nichols, took the approach of investigating how a network measurement system can be implemented without inventing new features for the NDN network layer. Instead, Kathleen's work explored the features and usability support mechanisms that would be needed for implementing her Distributed Network Measurement Protocol (DNMP) in terms of frameworks and libraries leveraging existing NDN. DNMP is a secure, role-based framework for requesting, carrying out, and collecting measurements in NDN forwarders. As such, it represents a class of applications where applications both send and receive data that is organized by hierarchical topics in a namespace. This implies a conceptual approach where applications do not (want to) talk to specific producers but really operate in an information-centric style.
Communication in such a system involves one-to-many, many-to-one, and any-to-any communications about information (not data objects hosted at named nodes). DNMP employs a publish/subscribe model inspired by protocols such as MQTT, where publishers and subscribers communicate through hierarchically structured topics. Instead of using existing frameworks for data set reconciliation, the DNMP work includes the development of a lightweight pub/sub sync protocol called syncps that uses Difference Digests, solving the multi-party set reconciliation problem with prior context.
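To illustrate the difference-digest idea behind such sync protocols, here is a minimal sketch of an invertible Bloom filter over integer content IDs. This is a generic textbook-style construction, not the actual syncps implementation.

```python
import hashlib

K = 3  # cells per key

def _cells(key: int, m: int):
    # derive K cell indices for a key
    for i in range(K):
        d = hashlib.sha256(f"{i}:{key}".encode()).digest()
        yield int.from_bytes(d[:8], "big") % m

def _check(key: int) -> int:
    # per-key checksum used to recognize "pure" cells
    return int.from_bytes(hashlib.sha256(f"c:{key}".encode()).digest()[:8], "big")

class Digest:
    """Invertible Bloom filter over integer content IDs."""
    def __init__(self, m: int = 80):
        self.m = m
        self.count = [0] * m
        self.keysum = [0] * m
        self.checksum = [0] * m

    def _toggle(self, key: int, d: int) -> None:
        c = _check(key)
        for i in _cells(key, self.m):
            self.count[i] += d
            self.keysum[i] ^= key
            self.checksum[i] ^= c

    def add(self, key: int) -> None:
        self._toggle(key, +1)

    def subtract(self, other: "Digest") -> "Digest":
        # the cell-wise difference encodes the symmetric set difference
        out = Digest(self.m)
        for i in range(self.m):
            out.count[i] = self.count[i] - other.count[i]
            out.keysum[i] = self.keysum[i] ^ other.keysum[i]
            out.checksum[i] = self.checksum[i] ^ other.checksum[i]
        return out

    def decode(self):
        # peel "pure" cells (holding exactly one key) until nothing changes
        mine, theirs = set(), set()
        changed = True
        while changed:
            changed = False
            for i in range(self.m):
                if self.count[i] in (1, -1) and _check(self.keysum[i]) == self.checksum[i]:
                    key = self.keysum[i]
                    (mine if self.count[i] == 1 else theirs).add(key)
                    self._toggle(key, -self.count[i])
                    changed = True
        return mine, theirs

a, b = Digest(), Digest()
for k in (1001, 1002, 1003):
    a.add(k)
for k in (1002, 1003, 1004):
    b.add(k)
print(a.subtract(b).decode())  # expect ({1001}, {1004}) with high probability
```

The appeal for sync protocols is that the digest size depends on the number of differing items, not on the total data set size, so two parties can reconcile large collections by exchanging small digests.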
In a role-based system such as DNMP that uses secure Named-Data-based communication, automating authentication and access control is typically a major challenge. DNMP leverages earlier work on Trust Schema but extends it with a Versatile Security Toolkit (VerSec) that integrates with the transport framework to simplify the integration of trust rules. VerSec is about to be released under GPL.
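As a rough illustration of name-based trust rules (the rule syntax below is invented; DNMP and VerSec define their own rule language), a checker only has to verify that the signing key's name is one that the schema authorizes for the data name:

```python
import re

# Invented rules in the spirit of NDN trust schemas: each rule pairs a
# data-name pattern with a key-name prefix template, and the <node>
# component must match across both names.
RULES = [
    (re.compile(r"^/dnmp/(?P<node>[^/]+)/report/.+$"),
     "/dnmp/{node}/KEY/operator"),
]

def signing_allowed(data_name: str, key_name: str) -> bool:
    """True if some rule authorizes key_name to sign data_name."""
    for data_pattern, key_template in RULES:
        m = data_pattern.match(data_name)
        if m and key_name.startswith(key_template.format(**m.groupdict())):
            return True
    return False

# routerA's operator key may sign routerA's reports, but not routerB's:
assert signing_allowed("/dnmp/routerA/report/cpu", "/dnmp/routerA/KEY/operator/42")
assert not signing_allowed("/dnmp/routerB/report/cpu", "/dnmp/routerA/KEY/operator/42")
```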
I found this paper really interesting to read because it is a nice illustration of what kind of higher-layer services and APIs non-trivial applications require. Also, the approach of using the NDN network layer as is, but implementing additional functionality as libraries and frameworks, seems promising with respect to establishing a stable network-layer platform where innovation can happen independently on top. Moreover, the paper embraces Information-Centric thinking nicely and demonstrates the concept with a relevant application. Finally, I am looking forward to seeing the VerSec software, which could make it easier for developers to implement rigorous security and validation in their applications.
Distributed Media Experiences
Jeff Burke and Peter Gusev organized the very cool TouchNDN workshop on Creating Distributed Media Experiences with TouchDesigner and NDN at the School of Creative Media at the City University of Hong Kong (summary presentation). The background is that video distribution/access has evolved significantly from linear TV broadcast to today's applications. Yet, many systems still seem to be built in a way that optimizes for linear video streaming to consumer eyeballs, with a frame-sequence abstraction.
Creative media applications such as Live Show Control (example) exhibit a much richer interaction with digital video, often combining 3D modelling with flexible, non-sequential access to video based on (for example) semantics, specific time intervals, quality layers, or spatial coordinates.
Combine this with dynamic lighting, sound control and instrumentation of theater effects, and you get an idea of an environment where various pieces of digital media are mixed together creatively and spontaneously. Incidentally, a famous venue for such an installation is the Spectacle at MGM Cotai, close to the venue of ACM ICN-2019 in Macau.
Derivative's TouchDesigner is a development platform for such realtime user experiences. It is frequently used for projection mapping, interactive visualization and other applications. The Center for Research in Engineering, Media and Performance (REMAP) has developed an integration of NDN with TouchDesigner's realtime 3D engine via the NDN-Common-Name-Library stack as a platform for experimenting with data-centric media. The objective is to provide a more natural networked media platform that does not have to deal with addresses (L2 or L3) but enables applications to publish and request media assets in namespaces that reflect the structure of the data. Combining this with other general ICN properties such as implicit multicast distribution and in-network caching results in a much more adequate platform for creating realtime multimedia experiences.
The TouchNDN workshop was one of REMAP's activities on converging their NDN research with artistic and cultural projects, trying to get NDN out of the lab and into the hands of creators in arts, culture, and entertainment. It is also an eye-opener for the ICN community for learning about trends and opportunities in real-time rendering and visual programming, which seem to bear lots of potential for innovation, both from the artistic and from the networking perspective.
Personally, I think it's a great, inspiring project that teaches us a lot about more interesting properties and metrics (flexible access, natural APIs, usability, utility for enabling innovations) compared to the usual quantitative performance metrics from the last century.
Inter-Server Game State Synchronization
Massive Multiplayer Online Role-Playing Games (MMORPGs) allow up to thousands of players to play in the same shared virtual world. Those worlds are often distributed over multiple servers of a server cluster, because a single server would not be able to handle the computational load caused by the large number of players interacting in a huge virtual world. This distribution of the world over a server cluster requires synchronizing relevant game state information among the servers: every server has to send updated game state information to the other servers in the cluster, resulting in redundantly sent traffic when utilizing current IP infrastructure.
In their paper Inter-Server Game State Synchronization using Named Data Networking, Philipp Moll, Sebastian Theuermann, Natascha Rauscher, Hermann Hellwagner, and Jeff Burke started from the assumption that ICN's implicit multicast support and the ability to decouple the game state information from the producing server could reduce the amount of redundant traffic and also help with robustness and availability in the presence of server failures.
They built an ICNified version of Minecraft and developed protocols for synchronizing game state in a server cluster over NDN. Their evaluation results indicated the benefits of an ICN-based approach for inter-server game state synchronization despite larger packet overheads (compared to TCP/IP). The authors made all the artefacts required for reproducing the results available on GitHub.
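A toy sketch of the underlying idea, with a hypothetical namespace rather than the paper's actual protocol: once state updates are named and versioned, any number of consumers can fetch the same update by name, and caches or implicit multicast deliver one copy per link instead of one per consumer.

```python
from typing import Dict

published: Dict[str, bytes] = {}  # origin server's versioned state deltas
cache: Dict[str, bytes] = {}      # stands in for an on-path NDN content store

def update_name(region: str, server: str, seq: int) -> str:
    # hypothetical namespace for versioned game-state updates
    return f"/game/{region}/{server}/update/{seq}"

def publish(region: str, server: str, seq: int, delta: bytes) -> None:
    published[update_name(region, server, seq)] = delta

def fetch(name: str) -> bytes:
    # repeated requests for the same name are served from the cache,
    # so the origin server is asked only once per update
    if name not in cache:
        cache[name] = published[name]
    return cache[name]

publish("spawn", "server1", 7, b"block updates ...")
for _consumer in ("server2", "server3", "server4"):
    fetch(update_name("spawn", "server1", 7))  # one upstream copy, then hits
```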
Panel on Industry Applications of ICN
I had the pleasure of moderating a panel on industry applications of ICN, featuring Richard Chow (Intel), Kathleen Nichols (Pollere Inc.), and Kent Wu (Hong Kong Applied Science and Technology Research Institute). Recent ICN research has produced various platforms for experimentation and application development. One welcome development consists of initial ICN deployment mechanisms that do not require a forklift replacement of large parts of the Internet. At the same time, new technologies and use cases, such as edge computing, massively scalable multiparty communication, and linear video distribution, impose challenges on the existing infrastructure. This panel with experts from different application domains discussed pain points with current systems, opportunities and promising results for building specific applications with ICN, and challenges, shortcomings, and ideas for future evolution of ICN.
What was interesting to learn was how different groups pick up the results and available software to build prototypes for research and industry applications and what they perceive as challenges in applying ICN.
Decentralization
Growing concerns about centralization, surveillance and loss of digital sovereignty are currently fuelling many activities around P2P-inspired communication and storage networks and decentralized web ("web3") efforts, as well as groups such as the IRTF Research Group on Decentralized Internet Infrastructure (DINRG). One particular concern is the almost universal dependency on central cloud platforms for anchoring trust in applications that are actually of a rather local nature, such as smart home platforms. Since such platforms often entail rent-seeking or surveillance-based business models, it is becoming increasingly important to investigate alternatives.
NDN/CCN-based ICN with its built-in PKI system provides some elements for an alternative design. In NDN/CCN it is possible to set up secure communication relationships without necessarily depending on third-party platforms, which could be leveraged for more decentralized designs of IoT systems, social media and many other applications.
Decentralized and Secure Multimedia Sharing
A particularly important application domain is multimedia sharing, where surveillance and manipulation campaigns by the dominant platforms have led to the development of alternative federated social media applications such as Mastodon and Diaspora. In their paper Decentralized and Secure Multimedia Sharing Application over Named Data Networking, Ashlesh Gawande, Jeremy Clark, Damian Coomes, and Lan Wang described their design and implementation of npChat (NDN Photo Chat), a multimedia sharing application that provides similar functionality as today's media-sharing social networking applications without requiring any centralized service providers.
The major contributions of this work include identifying the specific requirements for a fully decentralized application, and designing and implementing NDN-based mechanisms to enable users to discover other users in the local network and through mutual friends, build friendship via multi-modal trust establishment mirrored from the real world, subscribe to friends’ multimedia data updates via pub-sub, and control access to their own published media.
This paper is interesting in my view because it illustrates the challenges and some design options nicely. It also suggests further research in terms of namespace design, name privacy and trust models. The authors developed an NDN-based prototype for Android systems that is supposed to appear in the Google Play store soon.
Exploring the Relationship of ICN and IPFS
We were happy to have David Dias, Adin Schmahmann, Cole Brown, and Evan Miyazono from Protocol Labs at the conference who held a tutorial on IPFS that also touched upon the relationship of IPFS and some ICN approaches.
Protocol Labs' InterPlanetary File System (IPFS) is a peer-to-peer content-addressable distributed filesystem that seeks to connect all computing devices with the same system of files. It is an open-source, community-driven project, with reference implementations in Go and JavaScript, and a global community of millions of users. IPFS resembles past and present efforts to build and deploy Information-Centric Networking approaches to content storage, resolution, distribution and delivery. IPFS and libp2p, the modular network stack of IPFS, are based on name-resolution-based routing. The resolution system is based on a Kademlia DHT, and content is addressed by flat hash-based names. IPFS sees significant real-world usage already and is projected to become one of the main decentralised storage platforms in the near future. The objective of the tutorial was to make the audience familiar with IPFS and able to use the tools provided by the project for research and development.
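The essence of IPFS-style content addressing can be sketched in a few lines: names are flat hashes of the content itself, and a DHT (here shrunk to a dict) maps names to providers. This is a deliberate simplification of IPFS's multihash-based CIDs and Kademlia routing.

```python
import hashlib
from typing import Dict, Set

def content_id(data: bytes) -> str:
    # simplified stand-in for IPFS's multihash-based CIDs: the name is a
    # flat hash of the content, so receivers can verify what they fetched
    return "sha256-" + hashlib.sha256(data).hexdigest()

dht: Dict[str, Set[str]] = {}  # toy provider records; IPFS uses a Kademlia DHT

def provide(peer: str, data: bytes) -> str:
    cid = content_id(data)
    dht.setdefault(cid, set()).add(peer)
    return cid

def find_providers(cid: str) -> Set[str]:
    return dht.get(cid, set())

cid = provide("peerA", b"hello icn")
assert find_providers(cid) == {"peerA"}
```

Because the name is derived from the content, any peer can serve a copy and the receiver can still verify integrity, which is what makes name-based retrieval from untrusted peers workable.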
Interestingly, IPFS bears quite some similarities to earlier ICN systems such as NetInf, but it uses traditional transport and application-layer protocols for the actual data transfer. Interesting research questions in that space are how the IPFS system could be improved with today's ICN technology (as an underlay), but also how the design of a future IPFS-like system could leverage additional ICN mechanisms such as Trust Schema, data set reconciliation protocols, and remote method invocation. The paper Towards Peer-to-Peer Content Retrieval Markets: Enhancing IPFS with ICN by Onur Ascigil, Sergi Reñé, Michał Król et al. explored some of these options.
IoT
IoT is one of the interesting application areas for ICN, especially IoT in constrained environments, where the more powerful forwarding model (stateful forwarding and in-network caching) and the associated possibility of more fine-grained control of storage and communication resources offer significant optimization potential (which was also a topic at this year's conference).
QoS Management in Constrained NDN Networks
Quality of Service (QoS) in the IP world mainly manages forwarding resources, i.e., link capacities and buffer spaces. In addition, Information Centric Networking (ICN) offers resource dimensions such as in-network caches and forwarding state. In constrained wireless networks, these resources are scarce with a potentially high impact due to lossy radio transmission. In their paper Gain More for Less: The Surprising Benefits of QoS Management in Constrained NDN Networks Cenk Gündoğan, Jakob Pfender, Michael Frey, Thomas C. Schmidt, Felix Shzu-Juraschek, and Matthias Wählisch explored the two basic service qualities (i) prompt and (ii) reliable traffic forwarding for the case of NDN. The resources that were taken into account are forwarding and queuing priorities, as well as the utilization of caches and of forwarding state space. The authors treated QoS resources not only in isolation, but also correlated their use on local nodes and between network members. Network-wide coordination is based on simple, predefined QoS code points. The results indicate that coordinated QoS management in ICN is more than the sum of its parts and exceeds the impact QoS can have in the IP world.
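As a minimal sketch of one ingredient, prioritized forwarding driven by predefined code points, consider the following strict-priority Interest scheduler. The class names and values are invented for illustration; the paper's QoS management additionally coordinates cache and PIT resources across nodes.

```python
import heapq
import itertools

# Invented code points: lower value = higher forwarding priority.
PRIO = {"alarm": 0, "control": 1, "telemetry": 2, "bulk": 3}

class QoSScheduler:
    """Strict-priority Interest queue; FIFO within a class."""
    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker preserving arrival order

    def push(self, name: str, code_point: str) -> None:
        heapq.heappush(self._heap, (PRIO[code_point], next(self._seq), name))

    def pop(self) -> str:
        return heapq.heappop(self._heap)[2]

s = QoSScheduler()
s.push("/sensors/room1/temp", "telemetry")
s.push("/alarms/room1/smoke", "alarm")
print(s.pop())  # the alarm Interest is forwarded first
```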
What I found interesting about this paper is the validation in real-world experiments that demonstrated impressive improvements based on the coordinated QoS management approach. This work is timely considering the current ICN QoS discussion in ICNRG, for example in draft-oran-icnrg-qosarch. Also, the authors made their artefacts available on GitHub to enable reproduction of their results.
How Much ICN Is Inside of Bluetooth Mesh?
Bluetooth mesh is a new mode of Bluetooth operation for low-energy devices that offers group-based publish-subscribe as a network service with additional caching capabilities. These features resemble concepts of information-centric networking (ICN), and the analogy to ICN has been repeatedly drawn in the Bluetooth community. In their paper Bluetooth Mesh under the Microscope: How much ICN is Inside? Hauke Petersen, Peter Kietzmann, Cenk Gündoğan, Thomas C. Schmidt, and Matthias Wählisch compared Bluetooth mesh with ICN both conceptually and in real-world experiments. They contrasted both architectures and their design decisions in detail. They conducted experiments on an IoT testbed using NDN/CCNx and Bluetooth Mesh on constrained RIOT nodes.
Interestingly, the authors found that the implementation of ICN principles and mechanisms in Bluetooth Mesh is rather limited. In fact, Bluetooth Mesh performs flooding without content caching and merely uses the equivalent of multicast addresses as a surrogate for names. Based on these findings, the authors discuss options for what ICN support for Bluetooth could or should look like, so the paper is interesting both for understanding the actual workings of Bluetooth Mesh and for ideas on improving it. The authors made their artefacts available on GitHub to enable reproduction of their results.
ICN and LoRa
LoRa is an interesting technology for its usage of license-free sub-gigahertz spectrum and bi-directional communication capabilities. We were happy to have Kent Wu and Xiaoyu Zhao from ASTRI at the conference and the ICNRG meeting who talked about their LoRa prototype development for a smart metering system for water consumption in Hong Kong. In addition to that, the ICNRG also discussed different options for integrating ICN and LoRa and got an update by Peter Kietzmann on the state of LoRa support in the RIOT OS. This is an exciting area for innovation, and we expect more work and interesting results in the future.
New Frontiers
Applying ICN to big data storage and processing and to distributed computing are two really promising research directions that were explored by papers at the conference.
NDN and Hadoop
The Hadoop Distributed File System (HDFS) is a network file system used to support multiple widely used big data frameworks that can scale to run on large clusters. In their paper On the Power of In-Network Caching in the Hadoop Distributed File System, Eric Newberry and Beichuan Zhang evaluate the effectiveness of using in-network caching on switches in HDFS-supported clusters in order to reduce per-link bandwidth usage in the network.
They discovered that some applications featured large amounts of data requested by multiple clients and that, by caching read data in the network, the average per-link bandwidth usage of read operations in these applications could be reduced by more than half. They also found that the choice of cache replacement policy could have a significant impact on caching effectiveness in this environment, with LIRS and ARC generally performing the best for larger and smaller cache sizes, respectively. The authors also developed a mechanism to reduce the total per-link bandwidth usage of HDFS write operations by replacing write pipelining with multicast.
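The bandwidth-saving mechanism is easy to sketch: an on-path cache absorbs repeated reads of the same block, so only misses cross the upstream link. The toy simulation below uses plain LRU as a stand-in for the LIRS and ARC policies evaluated in the paper.

```python
from collections import OrderedDict
from typing import Iterable

class LRUCache:
    """Plain LRU as a stand-in for the LIRS/ARC policies the paper evaluates."""
    def __init__(self, capacity_blocks: int):
        self.cap = capacity_blocks
        self.store = OrderedDict()  # block_id -> True, in recency order

    def access(self, block_id: str) -> bool:
        if block_id in self.store:
            self.store.move_to_end(block_id)
            return True                      # hit: served by the switch
        self.store[block_id] = True
        if len(self.store) > self.cap:
            self.store.popitem(last=False)   # evict least recently used
        return False                         # miss: fetched upstream

def upstream_bytes(trace: Iterable[str], cache: LRUCache,
                   block_size: int = 128 * 2**20) -> int:
    """Bytes crossing the upstream link; only misses are fetched upstream."""
    return sum(block_size for block in trace if not cache.access(block))

# two clients reading the same HDFS block: the second read is a cache hit
trace = ["blk_1", "blk_1"]
print(upstream_bytes(trace, LRUCache(capacity_blocks=4)) // 2**20, "MiB upstream")
```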
Overall, the evaluation results are promising, and it will be interesting to see how the adoption of additional ICN concepts and mechanisms and caching could be useful for big data storage and processing.
Compute-First Networking
Although, as a co-author, I am clearly biased, I am quite convinced of the potential for distributed computing and ICN that we described in a paper co-authored by Michał Król, Spyridon Mastorakis, David Oran, and myself.
Here is a summary of our latest paper.
Projects
The following list provides a selection of research projects and other activities I have previously been involved in.
Named Data Microverse
Our project proposal on Named Data Microverse was selected as a winner of the Future of Data Challenge.
The Named Data Microverse project explores how Information-Centric Networking (ICN) can enable a free, open and decentralized approach to “the metaverse”. The project aims to balance scalability and market-based innovation with democratization, trustworthiness, and equitable empowerment of individuals. ICN provides an architectural foundation for secure, distributed applications to be created more easily and provides resilience in natural disasters, better mobility support, cloud-optional local communication, improved privacy, and other benefits that are not addressed solely by “Web3” technologies.
This is a joint project with Jeff Burke and Lixia Zhang at UCLA.
MAVERIC: In-Network Computing for 5G Campus Networks
The MAVERIC project will develop a mobile 5G campus network system with a special focus on automated deployment and monitoring as well as flexible and digitally sovereign in-network computing. The main use cases within the project are processes and tasks on shipyards. This environment is particularly harsh and has very high requirements regarding availability, security and confidentiality.
Piccolo: In-Network Computing
The Piccolo research project is developing new solutions for in-network computing that remove known and emerging deficiencies of edge/fog computing. Starting from a set of innovative, industry-relevant use cases, we are creating a distributed computing platform that can leverage different kinds of underlying infrastructure, cater to various business needs and user preferences, and provide an open platform for future applications.
Our motivation is that the centralised cloud computing model in use today has difficulty handling new and emerging applications. Ever more powerful user and IoT devices are producing enormous amounts of data, too much to send into the cloud for centralised processing; furthermore, the round-trip time is too large for the stringent latency requirements of some applications. There are also increasing concerns about leaving data privacy at the mercy of big cloud operators. Shifting from centralized to in-network compute can alleviate these concerns, thereby opening up new horizons for application development and creating new infrastructure markets.
Compute-First Networking
Please read the online article for more information and links to papers.
OPNFV
OPNFV is a carrier-grade, integrated, open source platform to accelerate the introduction of new NFV products and services. As an open source project, OPNFV is uniquely positioned to bring together the work of standards bodies, open source communities and commercial suppliers to deliver a de facto standard open source NFV platform for the industry.
By integrating components from upstream projects, the community can perform performance and use-case-based testing to ensure the platform’s suitability for NFV use cases. OPNFV will also work upstream, with other open source communities, to bring the learnings from its work directly to those communities in the form of blueprints, patches, and code contributions.
The scope of OPNFV’s initial release is focused on building NFV Infrastructure (NFVI) and Virtualized Infrastructure Management (VIM) by integrating components from upstream projects such as OpenDaylight, OpenStack, Ceph Storage, KVM, Open vSwitch, and Linux. These components, along with application programmable interfaces (APIs) to other NFV elements, form the basic infrastructure required for Virtualized Network Functions (VNF) and Management and Network Orchestration (MANO) components. OPNFV’s goal is to increase performance and power efficiency; improve reliability, availability, and serviceability; and deliver comprehensive platform instrumentation.
Fostering a diverse community of developers who bring different needs, ideas and knowledge to the table means faster time to market and stronger code. We hope you’ll join OPNFV as we work together to effect the game-changing networking transformation that is NFV.
More information on OPNFV: www.opnfv.org.
SSICLOPS
The Scalable and Secure Infrastructures for Cloud Operations (SSICLOPS, pronounced “cyclops”) project focuses on techniques for the management of federated private cloud infrastructures, in particular cloud networking techniques within software-defined data centres and across wide-area networks. SSICLOPS is funded by the European Commission under the Horizon 2020 programme.
SSICLOPS will empower enterprises to create and operate high-performance private cloud infrastructure that allows flexible scaling through federation with other private clouds without compromising on their service level and security requirements. SSICLOPS federation will support the efficient integration of clouds, no matter whether they are geographically collocated or spread out, belong to the same or different administrative entities or jurisdictions: in all cases, SSICLOPS will deliver maximum performance for inter-cloud communication, enforce legal and security constraints, and minimize the overall resource consumption. In such a federation, individual enterprises will be able to dynamically scale their private cloud services in and out: they can offer their own spare resources (when available) and take in resources from others when needed, maximizing their own infrastructure utilization while minimizing excess capacity needs for each federation member.
SSICLOPS-powered private clouds will offer fine-grained monitoring and tuning capabilities along with workload planning and optimization tools to maximize performance across a broad spectrum of workloads and across a wide operational scale, as we will demonstrate using four highly diverse use cases. The SSICLOPS solution will be based upon state-of-the-art open source products used broadly in private cloud deployments today to provide enterprises with full control over their own deployment.
More information on SSICLOPS: ssiclops.eu.
GreenICN
Information Centric Networking (ICN) is a new paradigm where the network provides users with named content, instead of communication channels between hosts. Research on ICN is at an early stage, with many key issues still open, including naming, routing, resource control, security, privacy and a migration path from the current Internet. Also missing for efficient information dissemination is seamless support of content-based publish/subscribe. Further, and importantly, current proposals do not sufficiently address energy efficiency. GreenICN aims to bridge this gap, addressing how the ICN network and devices can operate in a highly scalable and energy-efficient way. The project exploits the designed infrastructure to support two exemplary application scenarios:
1. The aftermath of a disaster, e.g., a hurricane or tsunami, when energy and communication resources are at a premium and it is critical to efficiently distribute disaster notification and critical rescue information. Key to this is the ability to exploit fragmented networks with only intermittent connectivity.
2. Scalable, efficient pub/sub video delivery, a key requirement in both normal and disaster situations.
GreenICN will also expose a functionality-rich API to spur the creation of new applications and services expected to drive EU and Japanese industry and consumers towards ICN adoption. Our team, comprising researchers with diverse expertise, system and network equipment manufacturers, device vendors, a startup, and mobile telecommunications operators, is very well positioned to design, prototype and deploy GreenICN technology, and to validate the usability and performance of real-world GreenICN applications, contributing to the creation of a new, low-energy, Information-Centric Internet. Our expertise and experience in standardization will enable us to make major contributions to standards bodies. Our efforts will foster continued close cooperation between the industrial and research communities of Europe and Japan.
More information on GreenICN: www.greenicn.org.
SAIL
SAIL (Scalable & Adaptive Internet Solutions) is aiming at designing architectures for the Networks of the Future, as part of the European Commission’s 7th Framework Programme. SAIL has three main technical strands: Network of Information (information-centric networking), Cloud Networking (combining virtual networking with cloud computing), and Open Connectivity Services (transport and routing services that can be controlled and orchestrated over various technologies).
My main interest is the research on information-centric networking. The main idea is to move from a host-based communication paradigm, where host addresses/IDs are the principal communication objects, to a paradigm that is based on named content. In some current application areas such as content distribution and peer-to-peer communication, we can observe that communication is actually no longer about setting up end-to-end connections to origin servers in order to access a certain service/content. Instead, users are interested in named content (represented by, for instance, Torrents or URLs), and a corresponding distribution system provides lookup and distribution services that enable interested receivers to obtain the content (copies of the content or content chunks). So far, this paradigm is applied to isolated, mostly overlaid, applications or distribution platforms. The intention in SAIL is to generalize these concepts for a ubiquitous communication platform where named content, in-network storage, and efficient distribution are available to any application. Several research questions are related to this: 1) how to design a naming framework that allows naming all information objects and is scalable in terms of lookup table size and lookup latency while still meeting security requirements; 2) how to efficiently move content to appropriate locations in the network; 3) how to manage mobility, multi-interface nodes and disruption tolerance; and 4) how to evolve the socio-economics, with potential new roles for content providers/consumers as well as network/cache operators.
More information on SAIL: http://www.sail-project.eu/.
CHIANTI
CHIANTI is a small or medium-scale focused research project (STREP) and part of the ICT initiative of the 7th EU Framework Programme. CHIANTI is developing technologies for enabling effective, robust, and cost-efficient communication services in challenging network environments, e.g., for providing productive and stable Internet access to passengers in high-speed trains. Unlike many existing approaches, CHIANTI is developing technologies that do not require complete network coverage. Instead, CHIANTI will provide perceived seamless connectivity despite disruptions, changing network characteristics, etc., and will thus enable users on the move to use today's and future networks more productively.
More information on CHIANTI: http://www.chianti-ict.org/.
ScaleNet
Within ScaleNet, academia and industry jointly work on the scalable and converged multi-access operator's network of tomorrow, focusing on 2010 onwards. ScaleNet is addressing both service and network convergence. The multi-play of services in ScaleNet embraces voice and video telephony, Mobile TV, massively multiplayer online gaming and Internet access. Network convergence is seen as the migration of heterogeneous physical and logical network elements of fixed and mobile networks into one single (IP-based) infrastructure. TZI has been developing a robust Mobile TV application for the converged ScaleNet network infrastructure.
More information on ScaleNet: http://www.scalenet.de/.
Network Service Maps
Network Service Maps are an enabling technology for facilitating network access in heterogeneous, potentially challenged networks, such as sparsely distributed WLAN hotspots. Network Service Maps are based on the notion that future heterogeneous wireless networks will encompass different link layer technologies and allow selecting the most appropriate network depending on different criteria. To support mobile nodes in the selection process, network information services are developed that provide the mobile node with sufficient information about its network neighborhood, typically focusing on the optimization of handover processes. In this research, we take a more general approach towards network information services, which is needed to support mobile communications in the existing environments of WLAN hotspots and wide-area mobile communications networks. We introduce the notion of service maps, a mobile data management approach allowing a mobile user to obtain a detailed view of available networks and the services they offer, depending on the user context such as geographic position, mobility paths, and application requirements.
More information on Network Service Maps is available at: http://service-maps.net
Kasuari Emulation Framework
The Kasuari framework is mainly intended to help with (IP) protocol development and testing. One of its features is the possibility to run unmodified real-world networked applications on a virtual host under simulated network conditions. The framework is based on Xen 3.0 and comes with scripts to run the virtual machines, a pre-configured filesystem image (with DTN and AODV implementations), a copy-on-write driver and a few other tools. It can be used for testing almost any kernel module or networked application that runs on Linux, and it allows simulating complex and realistic (wireless) networks using a slightly adapted version of the ns2 network simulator.
More information on the Kasuari Emulation Framework is available at: http://www.kasuari.org/
Drive-thru Internet
The Drive-thru Internet project investigates the usability of IEEE 802.11 technology for providing network access to mobile users in moving vehicles. The idea of Drive-thru Internet is to provide hot spots along the road -- within a city, on a highway, or even on high-speed freeways such as autobahns. They need to be placed in a way that a vehicle driving by will obtain WLAN access for some (relatively short) period of time; if located in rest areas, the driver may exit and pass by slowly or even stop to prolong the connectivity period. One or more locally interconnected access points form a so-called connectivity island that may provide local services as well as Internet access. Several of these connectivity islands along a road or in the same geographic area may be interconnected and cooperate to provide network access with intermittent connectivity for a larger area.
More information about the Drive-thru Internet project: http://www.drive-thru-internet.org/
Internet Media Guides
Internet Media Guides (IMGs) are a generalization of Electronic Program Guides (EPGs) as known from digital video broadcasting (DVB). They are independent of specific metadata formats and thus are able to support a broad range of applications, including EPG distribution for TV networks and distribution of session descriptions for Internet-based multimedia sessions. Unlike most existing approaches, the IMG framework is also completely independent of specific delivery networks for the media content described in media guides -- and it is also independent of the distribution mechanisms for the media guides themselves: IMGs can be distributed in unidirectional broadcast networks, they can also be retrieved over established query/response protocols such as HTTP, and they allow for asynchronous change notifications to interested subscribers.
At TZI we have developed IMG distribution implementations that are available for download. More information on the IMG work is available at: https://prj.tzi.org/cgi-bin/trac.cgi/wiki/TZI-IMG
Mbus
The Message Bus (Mbus) is a light-weight local coordination protocol for developing component-based distributed applications that has been developed by Universität Bremen and University College London. Mbus provides a simple and flexible message-oriented communication channel for a group of components that may be distributed on multiple hosts in a local network. The Mbus transport services include useful features such as peer location, point-to-point and group communication, and security. The protocol specification has been published as RFC 3259.
Mbus implementations have been developed for different programming languages and platforms, including small one-chip computers. The protocol has been applied to different application domains, e.g., for coordinating application components in decomposed multimedia conferencing applications and for providing coordination services for pervasive computing environments such as home networks. This web site provides some details on the Mbus protocol itself as well as on extensions, implementations and applications: http://www.mbus.org/
6WINIT
The 6WINIT project, which concluded in January 2003, validated the introduction of the new Mobile Wireless Internet in Europe. It investigated and validated the set-up of one of the first European operational IPv6-3G Mobile Internet initiatives, providing the 6WINIT project customers with native IPv6 access points and native IPv6 services in a 3G environment.
More information about 6WINIT: Local 6WINIT description (German)
MECCANO
The objective of the MECCANO project, which concluded in May 2000, was to provide all the technology components, other than the data network itself, to support collaborative research and technical development through the deployment of enhanced tools for multimedia collaboration in Europe. The project improved and deployed existing conferencing toolsets, with a particular application aim of distance education and conferencing.
MECCANO homepage
Winspect
The Winspect project (Wearable Computing in Inspection) has developed a system to support the maintenance staff dealing with the inspection of industrial cranes at a Bremen steel plant. We have investigated the use of wireless, wearable computers in industrial environments and have developed different applications, e.g., multimedia conferencing and data inspection support applications on PC-platform-based wearable computers.
More information about Winspect: Winspect homepage (German)
CONTRABAND
The CONTRABAND project (Conferencing for Transport Breakdown and Accident Management and Networking of Dispatchers) has developed a multimedia multiparty conferencing system that is tailored for application in both engineering and accident management usage situations. For the latter type of application a mobile conferencing component has been developed that is based on a wearable, wireless computer.
MEDUSA
The MEDUSA sensor network (Multispectral Environment Data Unit for Surveillance Application) enables regular monitoring of waters regardless of optical visibility, inspection of reported oil pollution, securing of evidence regarding polluters, and support for the ships assigned to combat pollution. To be able to operate regardless of the time of day and weather, several types of sensors, e.g., radar, infrared and ultraviolet line scanners, and video or low-light-level cameras, are used. With the help of this equipment it is possible to detect pollution (e.g., oil or algae) on or below the sea surface, in part even at a distance of up to 50 km, to subsequently classify it in overflight, and to determine its amount.
MEDUSA homepage