Hedge 120: Information Centric Networking
I was on The Hedge Podcast with Russ White and Alvaro Retana to discuss Information-Centric Networking and the future of the Internet.
Connecting the Metaverse: In-Network Computing as Infrastructure
Ubiquitous virtual reality environments such as Metaverse have been described as the future mobile Internet, alluding to their expected profound impact on the way information is retrieved, processed, rendered, and consumed. While detailed designs are still emerging, early visions such as Keiichi Matsuda’s Hyper-Reality project have already outlined usage models and expectations on connectivity and data availability to enable rich interactions with the physical world and blending it with dynamically computed artefacts.
Metaverse systems will challenge traditional client-server-inspired web models, centralized security trust anchors and server-style distributed computing. The new network will be based on dynamic interactions between humans, the physical world, and computing processes in an edge-to-cloud continuum. This talk will outline the associated challenges, review recent work in distributed computing and suggest some approaches for evolving networking and computing to enable Metaverse – not as a dystopian vision but as an opportunity for societies and their citizens.
Information-Centric Networking Research Group Update December 2021
The Information-Centric Networking Research Group (ICNRG) of the Internet Research Task Force (IRTF) has recently published two RFCs and held a research meeting on December 10th 2021.
Recent RFC Publications
ICNRG published two RFCs recently:
RFC 9139: ICN Adaptation to Low-Power Wireless Personal Area Networks (LoWPANs)
RFC 9139 defines a convergence layer for Content-Centric Networking (CCNx) and Named Data Networking (NDN) over IEEE 802.15.4 Low-Power Wireless Personal Area Networks (LoWPANs). A new frame format is specified to adapt CCNx and NDN packets to the small MTU size of IEEE 802.15.4. For that, syntactic and semantic changes to the TLV-based header formats are described.
To support compatibility with other LoWPAN technologies that may coexist on a wireless medium, the dispatching scheme provided by IPv6 over LoWPAN (6LoWPAN) is extended to include new dispatch types for CCNx and NDN. Additionally, the fragmentation component of the 6LoWPAN dispatching framework is applied to Information-Centric Network (ICN) chunks.
In its second part, the document defines stateless and stateful compression schemes to improve efficiency on constrained links. Stateless compression reduces TLV expressions to static header fields for common use cases. Stateful compression schemes elide states local to the LoWPAN and replace names in Data packets by short local identifiers.
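The idea behind stateful compression, replacing a long name with a short identifier that only has meaning inside the LoWPAN, can be illustrated with a small toy model. This is a sketch under stated assumptions, not the RFC 9139 wire format: the class `NameCompressionContext`, the one-byte identifier, and the frame layout are all hypothetical.

```python
from typing import Dict, Tuple


class NameCompressionContext:
    """Toy context shared by two LoWPAN neighbours (hypothetical, not RFC 9139)."""

    def __init__(self) -> None:
        self._id_by_name: Dict[str, int] = {}
        self._name_by_id: Dict[int, str] = {}

    def register(self, name: str) -> int:
        """Assign a one-byte local identifier to a name (at most 256 entries)."""
        if name in self._id_by_name:
            return self._id_by_name[name]
        if len(self._name_by_id) >= 256:
            raise RuntimeError("context table full")
        local_id = len(self._name_by_id)
        self._id_by_name[name] = local_id
        self._name_by_id[local_id] = name
        return local_id

    def compress(self, name: str, payload: bytes) -> bytes:
        """Build a frame that carries the local identifier instead of the name."""
        return bytes([self._id_by_name[name]]) + payload

    def decompress(self, frame: bytes) -> Tuple[str, bytes]:
        """Recover the full name from the identifier in the first byte."""
        return self._name_by_id[frame[0]], frame[1:]


ctx = NameCompressionContext()
name = "/building/floor3/room12/temperature"
ctx.register(name)
frame = ctx.compress(name, b"21.5")
restored_name, payload = ctx.decompress(frame)
print(len(name.encode()) + len(payload), "bytes uncompressed,", len(frame), "bytes compressed")
```

The saving grows with name length, which is why eliding names matters on links with the small MTU of IEEE 802.15.4.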
The ICN LoWPAN specification is a great platform for future experiments with ICN in constrained networking environments, including but not limited to LoWPAN networks.
RFC 9138: Design Considerations for Name Resolution Service in Information-Centric Networking (ICN)
RFC 9138 provides the functionalities and design considerations for a Name Resolution Service (NRS) in Information-Centric Networking (ICN). The purpose of an NRS in ICN is to translate an object name into some other information such as a locator, another name, etc. in order to forward the object request.
Since naming data independently from its current location (where it is stored) is a primary concept of ICN, how to find any Named Data Object (NDO) using a location-independent name is one of the most important design challenges in ICN. Such ICN routing may comprise three steps:
- Name resolution: matches/translates a content name to the locator of the content producer or source that can provide the content.
- Content request routing: routes the content request towards the content's location based either on its name or locator.
- Content delivery: transfers the content to the requester.
Among the three steps of ICN routing, this document investigates only the name resolution step, which translates a content name to the content locator. In addition, this document covers various possible types of name resolution in ICN such as one name to another name, name to locator, name to manifest, name to metadata, etc.
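To make the name resolution step concrete, here is a minimal sketch of a name-to-locator lookup. The class `NameResolutionService`, the example names and locators, and the use of longest-prefix matching are assumptions made for illustration; RFC 9138 discusses design considerations and does not prescribe this structure.

```python
from typing import Dict, Optional


class NameResolutionService:
    """Toy NRS: maps name prefixes to locators (hypothetical illustration)."""

    def __init__(self) -> None:
        self._table: Dict[str, str] = {}

    def register(self, prefix: str, locator: str) -> None:
        """Bind a name prefix to a locator (e.g., a node identifier)."""
        self._table[prefix.rstrip("/")] = locator

    def resolve(self, name: str) -> Optional[str]:
        """Return the locator of the longest registered prefix of `name`."""
        components = [c for c in name.split("/") if c]
        for end in range(len(components), 0, -1):
            prefix = "/" + "/".join(components[:end])
            if prefix in self._table:
                return self._table[prefix]
        return None  # no mapping known for this name


nrs = NameResolutionService()
nrs.register("/videos", "node-a")
nrs.register("/videos/live", "node-b")
print(nrs.resolve("/videos/live/cam1"))   # longest match is /videos/live
print(nrs.resolve("/videos/archive/x"))   # falls back to /videos
```

The same lookup shape would work for the other resolution types mentioned above (name to name, name to manifest, name to metadata) by changing what the table stores.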
ICNRG Meeting on December 10th 2021
Agenda
1 | Chairs’ Presentation: Status, Updates | Chairs | 05 min |
2 | Zenoh - The Edge Data Fabric | Carlos Guimarães | 30 min |
3 | The SPAN Network Architecture | Rhett Sampson | 30 min |
4 | NDNts API Design | Junxiao Shi | 30 min |
6 | Wrap-Up, Next Steps | Chairs | 5 min |
Zenoh
- Carlos Guimarães
- Presentation
- Video Recording
Carlos Guimarães presented zenoh – The Edge Data Fabric. zenoh is an ICN-inspired data distribution and processing system that aims at unifying data in motion, data at rest and computations. It blends traditional pub/sub with geo-distributed storage, queries and computations, adopting a hierarchical naming scheme and other ICN properties. zenoh provides a high-level API for high-performance pub/sub and distributed queries, data representation transcoding, an implementation of geo-distributed storage and distributed computed values.
SPAN Network Architecture
- Rhett Sampson and Jaime Llorca
- Presentation
- Video Recording
Rhett Sampson and Jaime Llorca presented GT Systems' SPAN-AI content distribution system (CDN as a service) that uses ICN/CCN/NDN for the implementation of their distributed content delivery system, leveraging name-based routing.
NDNts API Design
- Junxiao Shi
- Presentation
- Video Recording
Junxiao Shi presented the NDNts API Design. NDNts is an NDN implementation in TypeScript, aiming to facilitate NDN application development in browsers and on the Node.js platform.
The development of NDNts led to some insights on NDN low-level API design (packet decoding, fragmentation, notion of "faces", retransmission logic etc.) that Junxiao shared in his presentation.
Dagstuhl Seminar on Compute-First Networking
Eve Schooler, Jon Crowcroft, Phil Eardley, and I organized an online Dagstuhl seminar on Compute-First Networking earlier in June that was attended by an excellent group of researchers from the distributed computing, networking and data analytics communities.
Dagstuhl has now published the seminar report, which discusses new perspectives on Computing in the Network and use cases, and includes many references to relevant literature and ongoing projects in the field.
Executive Summary
Edge- and more generally In-Network Computing are key elements in many traditional content distribution services today, typically connecting cloud-based computing to consumers. The advent of new programmable hardware platforms, research and wide deployment of distributed computing technologies for data processing, as well as new exciting use cases such as distributed Machine Learning and Metaverse-style ubiquitous computing are now inspiring research of more fine-granular and more principled approaches to distributed computing in the "Edge-To-Cloud Continuum".
The Compute-First Networking Dagstuhl seminar has brought together researchers and practitioners in the fields of distributed computing, network programmability, Internet of Things, and data analytics to explore the potential, possible technological components, as well as open research questions in an exciting new field that will likely induce a paradigm shift for networking and its relationship with computing.
Traditional overlay-based in-network computing is typically limited to quite specific purposes, for example CDN-style edge computing. At the same time, network programmability approaches such as Software-Defined Networking and corresponding languages such as P4 are often perceived as too limited for application-level programming. Compute-First Networking (CFN) views networking and computing holistically and aims at leveraging network programmability, server- and serverless in-network computing and modern distributed computing abstractions to develop a new systems approach for an environment where computing is not merely an add-on to existing networks, but where networking is re-imagined with a broader and ubiquitous notion of programmability.
We expect this approach to enable several benefits: it can help to unlock distributed computing from the existing silos of individual cloud and CDN platforms – a necessary condition to enable Keiichi Matsuda's vision of Hyper-Reality and Metaverse concepts where the physical world, human users and different forms of analytics, and visual rendering services constantly engage in information exchanges, directly at the edges of the network. It can also help to provide reliable, scalable, privacy-preserving and universally available platforms for Distributed Machine Learning applications that will play a key role in future large-scale data collection and analytics.
CFN's integrated approach allows for several optimizations, for example a more informed and more adaptive resource optimization that can take into account dynamically changing network conditions, availability and utilization of compute platforms as well as application requirements and adaptation boundaries, thus enabling more responsive and better-performing applications.
Several interesting research challenges have been identified that should be addressed in order to realize the CFN vision: How should the different levels of programmability in today's systems be integrated into a consistent approach? What would programming and communication abstractions look like? How do orchestration systems need to evolve in order to be usable in these potentially large-scale scenarios? How can we guarantee security and privacy properties of a distributed computing slice without having to rely on just location attributes? How would the special requirements and properties of relevant applications such as Distributed Machine Learning best be mapped to CFN – or should distributed data processing for federated or split Machine Learning play a more prominent role in designing CFN abstractions?
This seminar was an important first step in identifying the potential and a first set of interesting new research challenges for re-imagining distributed computing through CFN – an exciting new topic for networking and distributed computing research.
Addressing in the Internet
There was a side meeting on Internet Addressing at IETF-112 this week, discussing potential gaps in Internet Addressing and potential use cases that would suggest new addressing structures.
Looking at the realities in the Internet today, I do not think that actual relevant use cases and current issues in the Internet are served well by just a new addressing approach for the Internet Protocol. Instead I believe that there needs to be architectural discussion first – and addressing might eventually fall out as a result.
Information-Centric Dataflow: Re-Imagining Reactive Distributed Computing
The Dataflow paradigm is a popular distributed computing abstraction that is leveraged by several popular data processing frameworks such as Apache Flink and Google Dataflow. Fundamentally, Dataflow is based on the concept of asynchronous messaging between computing nodes, where data controls program execution, i.e., computations are triggered by incoming data and associated conditions. This typically leads to very modular system architectures that enable re-use, re-composition, and parallel execution naturally. Most of the popular distributed processing frameworks today are implemented as overlays, i.e., they allow for instantiating computations and for inter-connecting them, for example by creating and maintaining communication channels between nodes such as system processes and microservices.
Connections and Overlays
The connection-based approach incurs several architectural problems and inefficiencies, for example: application logic is concerned with receiving and producing data as a result of computation processes but connections imply transport endpoint addresses that are typically not congruent. This typically implies a mapping or orchestration system. One key goal for Dataflow systems is to enable parallel execution, i.e., one computation is run in parallel, which also affects the communication relationships with upstream producers and downstream consumers. For example, when parallelizing a computation step, it typically implies that each instance is consuming a partition of the inputs instead of all the inputs. An indirection- and connection-based approach makes it harder to configure (and especially to dynamically re-configure) such dataflow graphs.
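The partitioning step mentioned above can be sketched in a few lines. This is a generic illustration of key-based partitioning, not code from any particular Dataflow framework; the function names `partition_for` and `split_stream` and the choice of CRC32 as the hash are assumptions.

```python
import zlib
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, int]


def partition_for(key: str, num_instances: int) -> int:
    """Map a record key to one of the parallel instances (stable across runs)."""
    return zlib.crc32(key.encode("utf-8")) % num_instances


def split_stream(records: Iterable[Record], num_instances: int) -> Dict[int, List[Record]]:
    """Give each parallel instance only its own partition of the input."""
    parts: Dict[int, List[Record]] = defaultdict(list)
    for key, value in records:
        parts[partition_for(key, num_instances)].append((key, value))
    return parts


records = [(f"sensor-{i % 4}", i) for i in range(20)]
parts = split_stream(records, num_instances=3)
for instance, items in sorted(parts.items()):
    print(instance, len(items))
```

In a connection-based overlay, changing `num_instances` at runtime means tearing down and re-creating connections to upstream producers, which is the re-configuration burden described above.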
In some variants of Dataflow, for example stream processing, it can be attractive if one computation output can be consumed by multiple downstream functions. Connection-based overlays typically require duplicating the data for each such connection, incurring significant overheads. In large-scale scenarios, the computation functions may be distributed to multiple hosts that are inter-connected in a network. Orchestrators may have visibility into compute resource availability but typically have to treat the TCP/IP network as a blackbox. As a result, the actual data flow is locked into a set of overlay connections that do not necessarily follow optimal paths, i.e., the communication flows are incongruent with the logical data flows.
IceFlow: Information-Centric Dataflow
In our ACM ICN 2021 paper Vision: Information-Centric Dataflow – Re-Imagining Reactive Distributed Computing, we present IceFlow – an Information-Centric Dataflow system approach that supports traditional Dataflow with Information-Centric principles and that can be used as a drop-in replacement for existing Dataflow-based frameworks.
In addition to the paper, we also show a live demo of a joint optimization of computing and networking resources in IceFlow: Decentralized ICN-based dataflow system implementation.
IceFlow’s objectives are:
- reducing complexity in Dataflow systems by removing connection-based overlays and corresponding orchestration requirements;
- enabling efficient communication by reducing data duplication; and
- enabling additional improvements through more direct communication and caching in the network.
IceFlow employs access to authenticated data in the network as per CCNx/NDN-based ICN for the communication between computation functions and provides additional features such as flow control, partitions for data streaming, and a window concept for synchronizing computations in streaming pipelines. The contributions of this paper are:
- an ICN naming scheme for Dataflow;
- a concept for receiver-driven flow control in IceFlow-based Dataflow systems and for dealing with parallel processing in IceFlow-based Dataflow systems; and
- a prototype implementation.
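To give a feel for what receiver-driven flow control means in general, the following toy simulation lets a consumer pull sequence-numbered items while never keeping more than `window` requests outstanding. It is a generic sketch and not IceFlow's design: the names `ToyProducer` and `consume`, the `prefix/seq` naming pattern, and the end-of-stream handling are all hypothetical.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple


class ToyProducer:
    """Holds results under sequence-numbered names; only answers requests."""

    def __init__(self, prefix: str, items: List[str]) -> None:
        self.store = {f"{prefix}/{seq}": item for seq, item in enumerate(items)}

    def fetch(self, name: str) -> Optional[str]:
        return self.store.get(name)  # None signals "no such item"


def consume(producer: ToyProducer, prefix: str, window: int) -> Tuple[List[str], int]:
    """Pull all items, keeping at most `window` requests outstanding."""
    received: List[str] = []
    outstanding: Deque[str] = deque()
    next_seq = 0
    max_outstanding = 0
    end_of_stream = False
    while True:
        # The receiver decides how many requests are in flight.
        while not end_of_stream and len(outstanding) < window:
            outstanding.append(f"{prefix}/{next_seq}")
            next_seq += 1
        max_outstanding = max(max_outstanding, len(outstanding))
        if not outstanding:
            break
        data = producer.fetch(outstanding.popleft())
        if data is None:
            end_of_stream = True
            outstanding.clear()  # drop requests beyond the end of the stream
        else:
            received.append(data)
    return received, max_outstanding


producer = ToyProducer("/pipeline/stage1", ["a", "b", "c", "d", "e"])
received, max_outstanding = consume(producer, "/pipeline/stage1", window=2)
print(received, max_outstanding)
```

Because the consumer sets the pace, a slow downstream function naturally throttles how fast it draws data, without the producer tracking per-consumer connection state.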
Censorship on the Internet
In the new episode of our podcast Neulich im Netz, we turn to a somewhat more delicate topic: censorship on the Internet.
In particular, we look at the "Great Firewall of China" (GFW), which we have analyzed with respect to its technical implementation and problems.
Based on publications and our own experience, we analyze roughly how the GFW works, how it is continuously being developed further, and how effective different tools such as VPNs, shadowsocks, etc. are.
These and other aspects of Internet censorship are covered in the third episode of Neulich im Netz.
Can the DNS Still Be Saved?
The new episode of our podcast Neulich im Netz is about names on the Internet, i.e., the Domain Name System (DNS). We talk about basic DNS functions, the importance of the DNS for the Internet and the Web, and applications such as tracking and traffic steering that one might not necessarily associate with name resolution.
We discuss to what extent the technical design of the DNS and its current use lead to security problems, and we assess some proposed improvements. Can the DNS in its current form still be saved? What are the chances? These and other questions are covered in the second episode of Neulich im Netz.
Neulich im Netz – the Internet Technologies Podcast
I am pleased to announce that I have teamed up with Rolf Winter for a new bi-weekly video podcast series: Neulich im Netz.
We are covering current and relevant developments in Internet technologies and networked systems in general. We are starting with the German-language channel.
Our first episode has been released today: We are talking about Covid-19 and the Internet.
- Website: https://www.neulich-im.net/
- YouTube
Information-Centric Networking Research Update December 2020
The IRTF Information-Centric Networking Research Group (ICNRG) held a meeting on December 1st 2020. Here is a summary of the research highlights. You can find all the presentations and the meeting minutes on the IETF datatracker.
Big Data Processing
Edmund Yeh (Northeastern University) presented an overview of recent and current research on supporting Data-Centric Ecosystems for Large-Scale Data-Intensive Science through ICN in the NSF SANDIE Project (SDN-Assisted NDN for Data Intensive Experiments) and in the NSF N-DISE project (NDN for Data Intensive Science Experiments).
Data-intensive science applications such as processing of LHC and genomics data pose interesting challenges to system design and efficient resource usage: from an application perspective, these systems require accessing named data, independent of location, transport mechanisms, etc.
The underlying infrastructures however typically focus on addresses, processes, servers, and connections, which also has repercussions on the security architectures (securing containers and delivery pipes).
The research work in the SANDIE and N-DISE projects is applying a data-centric approach to system and network design through the whole data lifecycle, i.e., data is uniquely named and authenticated/encrypted directly at the production phase and then delivered, replicated, stored and made available under that name.
Using basic ICN mechanisms (accessing named data, opportunistic caching, receiver-driven operation, and implicit multicast), accessing, processing and re-using data for data-intensive applications can be optimized substantially.
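Two of these mechanisms, opportunistic caching and implicit multicast through request aggregation, can be illustrated with a toy forwarder. This is a simplified model for illustration, not the NDN forwarding pipeline: the class `ToyForwarder` and its methods are hypothetical, and lifetimes, nonces and forwarding strategy are omitted.

```python
from typing import Dict, List, Optional, Set


class ToyForwarder:
    """Minimal model of a content store plus pending-request table."""

    def __init__(self) -> None:
        self.content_store: Dict[str, bytes] = {}  # name -> cached data
        self.pending: Dict[str, Set[str]] = {}     # name -> waiting consumers
        self.upstream_requests = 0

    def on_interest(self, name: str, consumer: str) -> Optional[bytes]:
        """Handle a request; returns data immediately on a cache hit."""
        if name in self.content_store:
            return self.content_store[name]
        if name in self.pending:
            self.pending[name].add(consumer)  # aggregate, do not forward again
            return None
        self.pending[name] = {consumer}
        self.upstream_requests += 1           # forward a single request upstream
        return None

    def on_data(self, name: str, data: bytes) -> List[str]:
        """Cache the data and return every consumer that was waiting for it."""
        self.content_store[name] = data
        return sorted(self.pending.pop(name, set()))


fwd = ToyForwarder()
for consumer in ("a", "b", "c"):
    fwd.on_interest("/lhc/run42/chunk0", consumer)
served = fwd.on_data("/lhc/run42/chunk0", b"payload")
later = fwd.on_interest("/lhc/run42/chunk0", "d")
print(served, fwd.upstream_requests, later)
```

Three consumers are served by one upstream request, and the fourth is answered from the cache, which is the kind of redundancy elimination that benefits large shared data sets.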
Further optimizations can be achieved through:
- joint optimization of forwarding and caching resources as described in Jointly Optimal Routing and Caching for Arbitrary Network Topologies; and
- high-speed DPDK-based forwarding as described in NDN-DPDK: NDN Forwarding at 100 Gbps on Commodity Hardware.
The team has applied this to accelerating XRootD for scalable fault-tolerant data access and demonstrated throughput rates of over 6.7 Gbps and 10 times acceleration.
The newly started N-DISE project will continue this research, aiming at developing a production-ready NDN-based petascale data distribution, caching, access, and computation platform that could serve major science programs, with LHC high energy physics as a primary target use case. Technically, the work will focus on creating NDN-DPDK consumer and producer applications, packaging NDN-DPDK and applications into containers for diverse platforms, and advancing ICN data integrity and provenance mechanisms.
Broker-Based Publish/Subscribe
Namseok Ko of ETRI presented a design for a Broker-based Pub/Sub System for NDN.
Pub/sub is a popular communication pattern for loosely coupled producers and consumers, supporting one-to-many asynchronous push-based communication. In principle, ICN is amenable to broker-less, distributed implementations of the Pub/sub pattern, for example through dataset synchronization techniques a la Psync.
The presented design is addressing constrained environments such as IoT with low-performance producers, potentially connected to larger systems with scalability and naming flexibility requirements that are difficult to meet with existing approaches. For these environments, the ETRI team has developed a multi-broker based approach, where brokers act as rendezvous points for publishers and subscribers and as gateways to other brokers.
Technically, the system is based on
- a logical separation of topic data management (brokers map the topic name to topic rendezvous node names through hashing);
- topic manifests that list rendezvous nodes holding named data streams; and
- data manifests describing data names for a data stream.
This system is supposed to be easily scalable and offloads constrained publishers and subscribers, thus supporting IoT environments that are connected to less constrained infrastructure.
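The hashing step that maps a topic name to a rendezvous node could look roughly like the sketch below. The summary above only says that the mapping is done "through hashing"; the use of SHA-256, the modulo selection, and the node names are assumptions made for illustration.

```python
import hashlib
from typing import List


def rendezvous_node(topic: str, nodes: List[str]) -> str:
    """Pick the rendezvous node for a topic by hashing the topic name."""
    digest = hashlib.sha256(topic.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


# Hypothetical rendezvous node names known to every broker.
nodes = ["/rn/alpha", "/rn/beta", "/rn/gamma"]
for topic in ("/home/kitchen/temperature", "/home/garage/door"):
    print(topic, "->", rendezvous_node(topic, nodes))
```

Since every broker computes the same mapping locally, publishers and subscribers of a topic meet at the same rendezvous node without a central directory. A plain modulo reshuffles most topics when the node list changes; a consistent-hashing variant would reduce that, at the cost of more code.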
NDN-Based Ethereum Blockchain
Quang Tung Thai of ETRI presented results from experiments with an NDN-based Ethereum Blockchain implementation.
Data communication in today's blockchain networks is known to be highly redundant due to the significant amount of duplication that occurs by implementing gossip protocols in connection-oriented overlays. In Ethereum, blocks and transactions are broadcast over such a P2P overlay that is based on a Kademlia-like DHT for finding peers and on TCP communication between peers.
Small objects are pushed directly to all managed peers, whereas large objects are pushed to a few managed peers and are then announced to the remaining peers for subsequent downloading with obvious redundancy and inefficiency.
While the blocks/transaction broadcasting seems to be a good fit for ICN dataset synchronization techniques such as Psync, it turns out that it cannot directly replace the complete Gossip system in Ethereum, as the P2P overlay is still needed for data validation according to the ETRI team.
In the presented work, this has been addressed by designing an NDN-based P2P system for data announcements that is paired with an NDN-based data retrieval that could still provide most of the efficiency gains. The design is based on the following ideas:
- blockchain nodes have routable prefixes (node names);
- all data objects (blocks/transactions) have globally unique names (so that regular ICN forwarding/caching benefits can apply);
- object names are mapped to node names through forwarding hints;
- the existence of new objects is announced through the P2P overlay, and the object is then retrieved using regular ICN Interest/Data; and
- validation still takes place in overlay nodes.
The ETRI team has implemented a fully functional NDN-based Ethereum blockchain client based on geth, the official Go-based client, where the TCP/IP P2P module has been replaced by an NDN module. First testbed-based experiments yielded promising efficiency gains, i.e., the reduction in traffic redundancy can be translated to higher throughput.
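The announce-then-retrieve pattern from the design ideas above can be mimicked in a toy simulation. This is not the ETRI implementation: the classes `ToyNetwork` and `Peer`, the `/eth/obj/` name prefix, and the hash-based validation check are hypothetical stand-ins, and the in-network cache is reduced to a single dictionary.

```python
import hashlib
from typing import Dict, List


def object_name(data: bytes) -> str:
    """Globally unique name derived from the content (assumed naming scheme)."""
    return "/eth/obj/" + hashlib.sha256(data).hexdigest()


class ToyNetwork:
    """Stand-in for an ICN with one shared in-network cache."""

    def __init__(self) -> None:
        self.producers: Dict[str, Dict[str, bytes]] = {}
        self.cache: Dict[str, bytes] = {}
        self.origin_fetches = 0

    def fetch(self, name: str, forwarding_hint: str) -> bytes:
        if name in self.cache:
            return self.cache[name]       # served from the network
        self.origin_fetches += 1          # only the first request reaches the node
        data = self.producers[forwarding_hint][name]
        self.cache[name] = data
        return data


class Peer:
    def __init__(self, node_name: str, network: ToyNetwork) -> None:
        self.node_name = node_name
        self.network = network
        self.objects: Dict[str, bytes] = {}
        network.producers[node_name] = self.objects

    def publish(self, data: bytes, peers: List["Peer"]) -> str:
        name = object_name(data)
        self.objects[name] = data
        for peer in peers:                # announcement via the P2P overlay
            peer.on_announcement(name, self.node_name)
        return name

    def on_announcement(self, name: str, forwarding_hint: str) -> None:
        if name in self.objects:
            return
        data = self.network.fetch(name, forwarding_hint)
        if object_name(data) == name:     # validation at the overlay node
            self.objects[name] = data


net = ToyNetwork()
alice = Peer("/node/alice", net)
others = [Peer(f"/node/p{i}", net) for i in range(3)]
block_name = alice.publish(b"block #1", others)
print(net.origin_fetches)
```

Only the small announcement is sent per peer; the object itself leaves the publishing node once and is then served from the network.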
Producer Anonymity based on Onion Routing in Named Data Networking
Toru Hasegawa of Osaka University has presented a scheme for Producer Anonymity based on Onion Routing in NDN.
Baseline ICN provides a somewhat asymmetric flavor of anonymity: in general, consumers enjoy anonymity because CCNx/NDN-based ICN does not have the notion of source addresses, and because Interests can be aggregated in the network, which could provide additional (opportunistic) anonymity.
In many applications though, endpoints will be both consumers and producers at the same time, especially when providing information to others that needs to be requested through Interest/Data exchanges. In addition, the baseline consumer anonymity does not provide very strong content-consumer unlinkability, so additional measures are required.
The authors have developed a system that is
- achieving producer anonymity against adversaries who analyze content names, signatures and packet routes; and is
- leveraging mostly baseline NDN mechanisms.
The design is based on the Hidden Service in Tor and is employing so-called self-certifying names as producer pseudonyms so that consumers can talk to producers through a rendezvous point without exposing a routable name. In order to prevent en-route information leakage, producers communicate with other nodes only through circuits. Additional anonymity for rendezvous communication is achieved through RICE.
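The general idea of a self-certifying name is that the name is derived from a public key, so anyone holding the key can check that it belongs to the name without a trusted directory. The sketch below shows that idea only; the exact construction used in the presented system is not described here, and the `/hidden/` prefix, the use of SHA-256, and the random bytes standing in for a real public key are assumptions.

```python
import hashlib
import hmac
import os


def self_certifying_name(public_key: bytes) -> str:
    """Derive a pseudonymous name from the hash of a public key."""
    return "/hidden/" + hashlib.sha256(public_key).hexdigest()


def name_matches_key(name: str, public_key: bytes) -> bool:
    """Check that a presented public key really belongs to the name."""
    return hmac.compare_digest(name, self_certifying_name(public_key))


# Random bytes stand in for an encoded public key in this sketch.
producer_key = os.urandom(32)
other_key = os.urandom(32)
pseudonym = self_certifying_name(producer_key)

print(pseudonym)
print(name_matches_key(pseudonym, producer_key))
print(name_matches_key(pseudonym, other_key))
```

The pseudonym reveals nothing about where the producer is attached to the network, which is why it can be handed to consumers while the routable name stays hidden behind the rendezvous point.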
The system has been implemented using the ndn-cxx library, with AES-128 for encryption and HMAC-SHA-256 for message digests. One advantage of the system is that it can provide the same level of anonymity as Tor's Hidden Service with fewer anonymizing routers, which results in reduced latency and higher throughput.
A Data-Centric View on the Web of Things
Cenk Gündoğan provided a presentation on a Data-centric View on the Web of Things which followed up on his paper at ACM ICN-2020 on Toward a RESTful Information-Centric Web of Things: A Deeper Look at Data Orientation in CoAP.
The presentation discussed the adoption of information-centric properties in the CoAP-based IoT technology stack, for example:
- request-response semantics (through regular CoAP GET method semantics);
- stateful forwarding and caching (could be achieved through CoAP proxy chaining); and
- content object security (OSCORE).
General ICN principles can be found in different protocols, at different layers. For example DASH-based video streaming is essentially ICN on top of HTTP from an application perspective. Similar comparisons could be made in other domains, namely IoT, specifically for the CoAP technology stack.
The general question here is whether a corresponding CoAP system with application-layer proxying and object security would be comparable to an ICN-based system with respect to feature completeness and efficiency (communication- and implementation-wise).
Other questions that the authors are currently investigating include how relevant ICN features such as the implicit multicast ability could be added/mapped to CoAP and how ICN's name-based routing and forwarding strategies (that could work without dedicated routing protocols in some scenarios) could be matched by CoAP systems (without completely re-implementing ICN on top of CoAP).