Dirk Kutscher

Personal web page

Information-Centric Networking Research Update December 2020

without comments

The IRTF Information-Centric Networking Research Group (ICNRG) held a meeting on December 1st 2020. Here is a summary of the research highlights. You can find all the presentation and the meetings minutes on the IETF datatracker.


Big Data Processing

Edmund Yeh (Northeastern University) presented an overview of recent and current research on supporting Data-Centric Ecosystems for Large-Scale Data-Intensive Science through ICN in the NSF SANDIE Project (SDN-Assisted NDN for Data Intensive Experiments) and in the NSF N-DISE project (NDN for Data Intensive Science Experiments).

Data-intensive science applications such as processing of LHC and genomics data pose interesting challenges to system design and efficient resource usage: from an application perspective these system require accessing named data, independent of location, transport mechanisms etc.

The underlying infrastructures however typically focus on addresses, processes, servers, and connections, which also has repercussions on the security architectures (securing containers and delivery pipes).

The research work in the SANDIE and N-DISE project is applying a data-centric approach to system and network design through the whole data lifecycle, i.e., data is uniquely named and authenticated/encrypted directly at the production phase and then delivered, replicated, stored and made available under that name.

Using basic ICN mechanisms (accessing named data, opportunistic caching, receiver-driven operation, and implicit multicast), accessing, processing and re-using data for data-intensive applications can be much optimized.

Further optimizations can be achieved through:

The team has applied this to accelerating XRootD for scalable
fault-tolerant data access and demonstrated throughput rates of over 6.7 Gbps and 10 times acceleration.

The newly started N-DISE project will continue this research, aiming at developing a production-ready NDN-based petascale data distribution, caching, access, and computation platform that could server major science programs, with LHC high energy physics as a primary target use case. Technically, the work will focus on created NDN-DPDK consumer and producer applications, packaging NDN-DPDK and applications into containers for diverse platforms, and advancing ICN data integrity and provenance mechanisms.


Broker-Based Publish/Subscribe

Nameseok Ko of ETRI presented a design for a Broker-based Pub/Sub System for NDN.

Pub/sub is a popular communication pattern for loosely coupled producers and consumers, supporting one-to-many asynchronous push-based communication. In principle, ICN is amenable to broker-less, distributed implementations of the Pub/sub pattern, for example through dataset synchronization techniques a la Psync.

The presented design is addressing constrained environments such as IoT with low-performance producers, potentially connected to larger systems with scalability and naming flexibility requirements that are difficult to meet with existing approaches. For these environments, the ETRI team has developed a multi-broker based approach, where brokers act as rendezvous points for publishers and subscribers and as gateways to other brokers.

Technically, the system is based on

  • a logical separation of topic data management (brokers map the topic name to topic rendezvous nodes names through hashing);
  • topic manifests that list rendezvous nodes holding named data streams; and
  • data manifests describing data names for a data stream.

This system is supposed to be easily scalable and offloads constrained publishers and subscribers, thus supporting IoT environments that are connected to less constrained infrastructure.


NDN-Based Ethereum Blockchain

Quang Tung Thai of ETRI presented results from experiments with an NDN-based Ethereum Blockchain implementation.

Data communication in today's blockchain networks is known to be highly redundant due to the significant amount of duplication that occurs by implementing gossip protocols in connection-oriented overlays. In Ethereum blocks and transaction are broadcast over a such a P2P overlay that is based on a Kademlia-like DHT for finding peers and on TCP communication between peers.

Small objects are pushed directly to all managed peers, whereas large objects are pushed to a few managed peers and are then announced to the remaining peers for subsequent downloading with obvious redundancy and inefficiency.

While the blocks/transaction broadcasting seems to be a good fit for ICN dataset synchronization techniques such as Psync, it turns out that it cannot directly replace the complete Gossip system in Ethereum, as the P2P overlay is still needed for data validation according to the ETRI team.

In the presented work, this has been addressed this by designing an NDN-based P2P system for data announcements that is paired with a NDN-based data retrieval that could still provide most of the efficiency gains. The design is based on the following ideas:

  • blockchain nodes have routable prefixes (node names);
  • all data objects (blocks/transactions) have globally unique names (so that regular ICN forwarding/caching benefits can apply);
  • object names are mapped to nodes names through forwarding hints;
  • the existence of new objects is announced through the P2P overlay, and the object is then retrieved using regular ICN Interest/Data; and
  • validation still takes place in overlay nodes.

The ETRI team has implemented a fully functional NDN-based Ethereum blockchain client based on geth, the official go-based client, where the TCP/IP P2P module has been replaced by an NDN module. First testbed-based experiments yielded promising efficiency gains, i.e., the traffic redundancy can be translated to higher throughput.


Producer Anonymity based on Onion Routing in Named Data Networking

Toru Hasegawa of Osaka University has presented a scheme for Producer Anonymity based on Onion Routing in NDN.

Baseline ICN provides a somewhat asymmetric flavor of anonymity: in general, consumers enjoy anonymity because CCNx/NDN-based ICN does not have the notion of source addresses, and because INTEREST can be aggregated in the network which could provide additional (opportunistic) anonymity.

In many applications though, endpoints will be both consumers and producers at the same time, especially when providing information to others that needs to be requests through Interest/Data exchanges. In addition, the baseline consumer anonymity does not provide very strong content-consumer unlinkability – so that additional measures are required.

The authors have developed a system that is

  • achieving producer anonymity against adversaries who analyze content names, signatures and packet routes; and is
  • leveraging mostly baseline NDN mechanisms.

The design is based on the Hidden Service in
Tor
and is employing so-called self-certifying names as producer pseudonyms so that consumers can talk to producers through rendezvous point without exposing a routable name. In order to prevent en-route information leakage, producers communicate with other other nodes only through circuits. Additional anonymity for rendezvous communication is achieved through RICE.

The system has been implemented using the ndn-cxx library, with AES-128 for encryption and HMAC-SHA-256 for message digests. One advantage of the system is that it can provide the same level of anonymity as Tor's Hidden Service with less of anonymizing routers, which results in reduced latency and higher throughput.


A Data-Centric View on the Web of Things

Cenk Gündoğan provided a presentation on a Data-centric View on the Web of Things which followed up on his paper at ACM ICN-2020 on Toward a RESTful Information-Centric Web of Things: A Deeper Look at Data Orientation in CoAP.

This presentation was discussing the adoption of information-centric properties in the CoAP-based IoT technology stack, for example:

  • request-response semantics (through regular CoAP GET method semantics);
  • stateful forwarding and caching (could be achieved through CoAP proxy chaining); and
  • content object security (OSCORE).

General ICN principles can be found in different protocols, at different layers. For example DASH-based video streaming is essentially ICN on top of HTTP from an application perspective. Similar comparisons could be made in other domains, namely IoT, specifically for the CoAP technology stack.

The general question here is whether a corresponding CoAP system with application-layer proxying and object security would be comparable to an ICN-based system with respect to feature completeness and efficiency (communication- and implementation-wise).

Other questions that the authors are currently investigating include how relevant ICN features such as the implicit multicast ability could be added/mapped to CoAP and how ICN's name-based routing and forwarding strategies (that could work without dedicated routing protocols in some scenarios) could be matched by CoAP systems (without completely re-implementing ICN on top of CoAP).

Written by dkutscher

December 18th, 2020 at 12:21 am

Posted in IRTF

Tagged with , ,