Dirk Kutscher

Personal web page

Censorship on the Internet


In the new episode of our podcast Neulich im Netz, we turn to a somewhat more delicate topic: censorship on the Internet.

In particular, we look at the "Great Firewall of China" (GFW), which we analyze with respect to its technical implementation and its problems.

Based on publications and our own experience, we analyze how the GFW roughly works, how it is continuously being developed further, and how effective different circumvention tools such as VPNs, Shadowsocks, and others are.

These and other aspects of Internet censorship in the third episode of Neulich im Netz.

Written by dkutscher

June 23rd, 2021 at 9:56 am

Can the DNS Still Be Saved?


The new episode of our podcast Neulich im Netz is about names on the Internet, i.e., the Domain Name System (DNS). We talk about basic DNS functions, the importance of the DNS for the Internet and the web, and about applications such as tracking and traffic steering that one might not necessarily associate with name resolution.

We discuss to what extent the technical design of the DNS and the way it is used today lead to security problems, and we assess some proposed improvements. Can the DNS in its current form still be saved? What are the chances? These and other questions in the second episode of Neulich im Netz.

Written by dkutscher

June 9th, 2021 at 11:01 am

Neulich im Netz – the Internet Technologies Podcast


I am pleased to announce that I have teamed up with Rolf Winter for a new bi-weekly video podcast series: Neulich im Netz.

Neulich im Netz

We cover current and relevant developments in Internet technologies and networked systems in general. We are starting with a German-language channel.

Our first episode has been released today: We are talking about Covid-19 and the Internet.

Written by dkutscher

May 26th, 2021 at 10:44 am

Posted in personal, Posts


Information-Centric Networking Research Update December 2020


The IRTF Information-Centric Networking Research Group (ICNRG) held a meeting on December 1st, 2020. Here is a summary of the research highlights. You can find all the presentations and the meeting minutes on the IETF datatracker.


Big Data Processing

Edmund Yeh (Northeastern University) presented an overview of recent and current research on supporting Data-Centric Ecosystems for Large-Scale Data-Intensive Science through ICN in the NSF SANDIE Project (SDN-Assisted NDN for Data Intensive Experiments) and in the NSF N-DISE project (NDN for Data Intensive Science Experiments).

Data-intensive science applications such as processing of LHC and genomics data pose interesting challenges to system design and efficient resource usage: from an application perspective, these systems require accessing named data, independent of location, transport mechanisms, etc.

The underlying infrastructures however typically focus on addresses, processes, servers, and connections, which also has repercussions on the security architectures (securing containers and delivery pipes).

The research work in the SANDIE and N-DISE projects applies a data-centric approach to system and network design throughout the whole data lifecycle, i.e., data is uniquely named and authenticated/encrypted directly at the production phase and then delivered, replicated, stored, and made available under that name.

Basic ICN mechanisms (accessing named data, opportunistic caching, receiver-driven operation, and implicit multicast) allow accessing, processing, and re-using data in data-intensive applications to be optimized considerably.

Further optimizations are possible on top of these basic mechanisms.

The team has applied this approach to accelerating XRootD for scalable, fault-tolerant data access and demonstrated throughput rates of over 6.7 Gbps and a tenfold acceleration.

The newly started N-DISE project will continue this research, aiming at developing a production-ready NDN-based petascale data distribution, caching, access, and computation platform that could serve major science programs, with LHC high-energy physics as a primary target use case. Technically, the work will focus on creating NDN-DPDK consumer and producer applications, packaging NDN-DPDK and these applications into containers for diverse platforms, and advancing ICN data integrity and provenance mechanisms.
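To illustrate the basic idea of accessing large science datasets as named, segmented, cacheable data (rather than as files on specific servers), here is a minimal, self-contained Python sketch. The naming scheme, segment size, and the in-memory "content store" are my own illustrative assumptions, not the actual SANDIE/N-DISE design:

```python
# Illustrative sketch: fetching a large dataset as individually named,
# cacheable segments. Names, segment size, and the cache are assumptions
# for illustration only -- not the actual SANDIE/N-DISE naming scheme.

SEGMENT_SIZE = 8 * 1024  # bytes per named segment (assumption)

ORIGIN = {}         # stand-in for the producer/repository: named data
CONTENT_STORE = {}  # a forwarder's cache: any copy with the right name is equivalent

def publish(dataset: str, payload: bytes) -> list[str]:
    """Split a dataset into named segments, e.g. /sandie/<dataset>/seg=<i>."""
    names = []
    for i in range(0, len(payload), SEGMENT_SIZE):
        name = f"/sandie/{dataset}/seg={i // SEGMENT_SIZE}"
        ORIGIN[name] = payload[i:i + SEGMENT_SIZE]
        names.append(name)
    return names

def fetch(name: str) -> bytes:
    """Receiver-driven retrieval: serve from the cache if possible."""
    if name in CONTENT_STORE:
        return CONTENT_STORE[name]   # cache hit, no origin access
    data = ORIGIN[name]              # otherwise fetch from a producer
    CONTENT_STORE[name] = data       # opportunistically cache
    return data

if __name__ == "__main__":
    names = publish("lhc/run42", b"x" * 50_000)
    blob = b"".join(fetch(n) for n in names)    # first consumer: origin fetches
    blob2 = b"".join(fetch(n) for n in names)   # second consumer: all cache hits
    assert blob == blob2 and len(blob) == 50_000
```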


Broker-Based Publish/Subscribe

Nameseok Ko of ETRI presented a design for a Broker-based Pub/Sub System for NDN.

Pub/sub is a popular communication pattern for loosely coupled producers and consumers, supporting one-to-many asynchronous push-based communication. In principle, ICN is amenable to broker-less, distributed implementations of the pub/sub pattern, for example through dataset synchronization techniques such as PSync.

The presented design is addressing constrained environments such as IoT with low-performance producers, potentially connected to larger systems with scalability and naming flexibility requirements that are difficult to meet with existing approaches. For these environments, the ETRI team has developed a multi-broker based approach, where brokers act as rendezvous points for publishers and subscribers and as gateways to other brokers.

Technically, the system is based on

  • a logical separation of topic data management (brokers map the topic name to topic rendezvous node names through hashing);
  • topic manifests that list rendezvous nodes holding named data streams; and
  • data manifests describing data names for a data stream.

This system is supposed to be easily scalable and offloads constrained publishers and subscribers, thus supporting IoT environments that are connected to less constrained infrastructure.
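A rough sketch of the hashing-based rendezvous selection and the two manifest types could look like the following. The names, the number of brokers, and the manifest fields are my own illustrative assumptions, not ETRI's concrete design:

```python
# Illustrative sketch of broker-based pub/sub: topic names are hashed to
# rendezvous brokers; topic manifests list rendezvous nodes; data manifests
# list the data names of a stream. Field names are assumptions.

import hashlib

BROKERS = ["/broker/0", "/broker/1", "/broker/2", "/broker/3"]

def rendezvous_for(topic: str) -> str:
    """Map a topic name to a rendezvous broker by hashing (stable choice)."""
    digest = hashlib.sha256(topic.encode()).digest()
    return BROKERS[int.from_bytes(digest[:4], "big") % len(BROKERS)]

def topic_manifest(topic: str) -> dict:
    """Topic manifest: which rendezvous node(s) hold the named data streams."""
    return {"topic": topic, "rendezvous": [rendezvous_for(topic)]}

def data_manifest(topic: str, stream_id: str, count: int) -> dict:
    """Data manifest: the data names that make up one stream under a topic."""
    prefix = f"{rendezvous_for(topic)}{topic}/{stream_id}"
    return {"stream": stream_id,
            "data_names": [f"{prefix}/seq={i}" for i in range(count)]}

if __name__ == "__main__":
    t = "/iot/building-a/temperature"
    print(topic_manifest(t))
    print(data_manifest(t, "sensor-17", 3))
```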


NDN-Based Ethereum Blockchain

Quang Tung Thai of ETRI presented results from experiments with an NDN-based Ethereum Blockchain implementation.

Data communication in today's blockchain networks is known to be highly redundant due to the significant amount of duplication that occurs when implementing gossip protocols in connection-oriented overlays. In Ethereum, blocks and transactions are broadcast over such a P2P overlay, which is based on a Kademlia-like DHT for finding peers and on TCP communication between peers.

Small objects are pushed directly to all managed peers, whereas large objects are pushed to a few managed peers and are then announced to the remaining peers for subsequent downloading with obvious redundancy and inefficiency.

While block/transaction broadcasting seems to be a good fit for ICN dataset synchronization techniques such as PSync, it turns out that these cannot directly replace the complete gossip system in Ethereum: according to the ETRI team, the P2P overlay is still needed for data validation.

In the presented work, this has been addressed by designing an NDN-based P2P system for data announcements, paired with NDN-based data retrieval, which can still provide most of the efficiency gains. The design is based on the following ideas:

  • blockchain nodes have routable prefixes (node names);
  • all data objects (blocks/transactions) have globally unique names (so that regular ICN forwarding/caching benefits can apply);
  • object names are mapped to node names through forwarding hints;
  • the existence of new objects is announced through the P2P overlay, and the object is then retrieved using regular ICN Interest/Data; and
  • validation still takes place in overlay nodes.

The ETRI team has implemented a fully functional NDN-based Ethereum blockchain client based on geth, the official Go-based client, where the TCP/IP P2P module has been replaced by an NDN module. First testbed-based experiments yielded promising efficiency gains, i.e., the reduced traffic redundancy translates into higher throughput.
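The announce-then-fetch pattern behind this design can be sketched in a few lines. The names, the Announcement fields, and the in-memory store below are illustrative assumptions, not the actual ETRI protocol encoding:

```python
# Illustrative sketch of the announce-then-fetch pattern: blocks/transactions
# get globally unique names, and announcements carry a forwarding hint that
# maps the object name to a routable node prefix. Names and fields are
# assumptions, not the ETRI implementation.

from dataclasses import dataclass

@dataclass
class Announcement:
    object_name: str       # e.g. /eth/block/<hash> -- globally unique
    forwarding_hint: str   # routable node prefix, e.g. /node/peer-7

STORE = {}   # named objects held by producers (and, potentially, caches)

def announce(node_prefix: str, object_name: str, payload: bytes) -> Announcement:
    """A node announces a new block/transaction through the P2P overlay."""
    STORE[object_name] = payload
    return Announcement(object_name, node_prefix)

def fetch(ann: Announcement) -> bytes:
    """Regular Interest/Data retrieval; the forwarding hint guides forwarding,
    but any cached copy under the same object name would be equivalent."""
    return STORE[ann.object_name]

if __name__ == "__main__":
    ann = announce("/node/peer-7", "/eth/block/0xabc123", b"<block body>")
    assert fetch(ann) == b"<block body>"
```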


Producer Anonymity based on Onion Routing in Named Data Networking

Toru Hasegawa of Osaka University has presented a scheme for Producer Anonymity based on Onion Routing in NDN.

Baseline ICN provides a somewhat asymmetric flavor of anonymity: in general, consumers enjoy anonymity because CCNx/NDN-based ICN does not have the notion of source addresses, and because Interests can be aggregated in the network, which can provide additional (opportunistic) anonymity.

In many applications, though, endpoints will be both consumers and producers at the same time, especially when providing information to others that needs to be requested through Interest/Data exchanges. In addition, the baseline consumer anonymity does not provide very strong content-consumer unlinkability, so additional measures are required.

The authors have developed a system that is

  • achieving producer anonymity against adversaries who analyze content names, signatures and packet routes; and is
  • leveraging mostly baseline NDN mechanisms.

The design is based on the Hidden Service concept in Tor and employs so-called self-certifying names as producer pseudonyms, so that consumers can talk to producers through rendezvous points without exposing a routable name. In order to prevent en-route information leakage, producers communicate with other nodes only through circuits. Additional anonymity for rendezvous communication is achieved through RICE.
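The basic idea of a self-certifying pseudonym can be illustrated in a few lines: the pseudonym is derived from the producer's public key, so a consumer can verify the binding without ever learning a routable producer name. This is a simplified sketch; the key handling and the exact name format are my assumptions, not the paper's specification:

```python
# Simplified sketch of a self-certifying producer pseudonym: the name component
# is the hash of the producer's public key, so anyone holding the key can check
# the binding. Key format and name layout are illustrative assumptions.

import hashlib
import os

def make_pseudonym(public_key: bytes) -> str:
    """Derive a self-certifying name component from a public key."""
    return "/anon/" + hashlib.sha256(public_key).hexdigest()[:32]

def verify_pseudonym(pseudonym: str, public_key: bytes) -> bool:
    """A consumer checks that the pseudonym really belongs to this key."""
    return pseudonym == make_pseudonym(public_key)

if __name__ == "__main__":
    producer_key = os.urandom(32)        # placeholder for a real public key
    name = make_pseudonym(producer_key)
    assert verify_pseudonym(name, producer_key)
    assert not verify_pseudonym(name, os.urandom(32))
```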

The system has been implemented using the ndn-cxx library, with AES-128 for encryption and HMAC-SHA-256 for message digests. One advantage of the system is that it can provide the same level of anonymity as Tor's Hidden Service with fewer anonymizing routers, which results in reduced latency and higher throughput.


A Data-Centric View on the Web of Things

Cenk Gündoğan provided a presentation on a Data-centric View on the Web of Things which followed up on his paper at ACM ICN-2020 on Toward a RESTful Information-Centric Web of Things: A Deeper Look at Data Orientation in CoAP.

This presentation was discussing the adoption of information-centric properties in the CoAP-based IoT technology stack, for example:

  • request-response semantics (through regular CoAP GET method semantics);
  • stateful forwarding and caching (could be achieved through CoAP proxy chaining); and
  • content object security (OSCORE).
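To make the analogy concrete, the following self-contained sketch models a chain of caching proxies that forward a GET upstream on a cache miss, which is roughly how stateful forwarding and caching can be approximated in a CoAP deployment. This is a conceptual model (freshness, observe, and security are ignored), not actual CoAP protocol code:

```python
# Conceptual sketch: a chain of caching proxies approximating ICN-style
# stateful forwarding and caching for request/response (GET) semantics.
# This models the idea only; it is not CoAP protocol code.

class Origin:
    def __init__(self, resources: dict):
        self.resources = resources
        self.hits = 0
    def get(self, uri: str) -> bytes:
        self.hits += 1
        return self.resources[uri]

class CachingProxy:
    """Forwards a request upstream on a miss and caches the response,
    similar to an ICN forwarder's content store (freshness ignored here)."""
    def __init__(self, upstream):
        self.upstream = upstream
        self.cache = {}
    def get(self, uri: str) -> bytes:
        if uri not in self.cache:
            self.cache[uri] = self.upstream.get(uri)
        return self.cache[uri]

if __name__ == "__main__":
    origin = Origin({"coap://sensor/temp": b"21.5"})
    edge = CachingProxy(CachingProxy(origin))   # two-proxy chain
    for _ in range(10):
        edge.get("coap://sensor/temp")
    assert origin.hits == 1   # nine requests were answered from caches
```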

General ICN principles can be found in different protocols, at different layers. For example DASH-based video streaming is essentially ICN on top of HTTP from an application perspective. Similar comparisons could be made in other domains, namely IoT, specifically for the CoAP technology stack.

The general question here is whether a corresponding CoAP system with application-layer proxying and object security would be comparable to an ICN-based system with respect to feature completeness and efficiency (communication- and implementation-wise).

Other questions that the authors are currently investigating include how relevant ICN features such as the implicit multicast ability could be added/mapped to CoAP and how ICN's name-based routing and forwarding strategies (that could work without dedicated routing protocols in some scenarios) could be matched by CoAP systems (without completely re-implementing ICN on top of CoAP).

Written by dkutscher

December 18th, 2020 at 12:21 am

Posted in IRTF


Piccolo Project on In-Network Computing


We started a new project on in-network computing.

Nine partners from leading companies and universities in the UK and Germany (Arm, Robert Bosch GmbH, BT, Fluentic Networks Ltd., InnoRoute GmbH, Peer Stritzinger GmbH, Sensing Feeling, the Technical University Munich, and the University of Applied Sciences Emden/Leer) kicked off the Piccolo research project on October 15th, aiming to set a shining example of European research collaboration in challenging times.

Piccolo develops new solutions for in-network computing that remove known and emerging deficiencies of edge and fog computing. Piccolo aims to provide new levels of support for innovative applications such as highly scalable vision processing and automotive edge computing.

The research direction in the Piccolo project is about developing in-network computing platforms that are secure and ethical by design, support fine-granular modularisation, are independent of specific network architectures and that provide new levels of performance and robustness by applying a joint optimisation approach for both networking and computing resources.

The Piccolo project is a two-year CELTIC-NEXT project and is funded by BMWi in Germany and Innovate-UK in the UK, as well as the partners themselves.

Please have a look at the press release on our website for more information.

Written by dkutscher

November 11th, 2020 at 11:35 am

Posted in Projects


Re-Thinking LoRaWAN


Low-power, long-range radio systems such as LoRaWAN represent one of the few remaining networked system domains that still feature a complete vertical stack with special link- and network layer designs independent of IP. Similar to local IoT systems for low-power networks (LoWPANs), the main service of these systems is to make data available at minimal energy consumption, but over longer distances. LoRaWAN (the system that comprises the LoRa PHY and MAC) supports bi-directional communication, if the IoT device has the energy budget. Application developers interface with the system using a centralized server that terminates the LoRaWAN protocol and makes data available on the Internet.

While LoRaWAN applications are typically providing access to named data, the existing LoRaWAN stack does not support this way of communicating. LoRaWAN is device-centric and is generally designed as a device-to-server messaging system – with centralized servers that serve as rendezvous point for accessing sensor data. The current design imposes rigid constraints and does not facilitate accessing named data natively, which results in many point solutions and dependencies on central server instances.

In our demo paper & presentation at ACM ICN-2020, we therefore describe how Information-Centric Networking could provide a more natural communication style for LoRa applications and how ICN could help to conceive LoRa networks in a more distributed fashion compared to today's mainstream LoRaWAN deployments. For LoWPANs (e.g., 802.15.4 networks), ICN has already been demonstrated to be an attractive and viable alternative to legacy integrated special-purpose stacks; we believe that LoRa communication provides similar opportunities.
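As a simple illustration of the difference in communication style, the sketch below names sensor readings directly and lets consumers request them by name, instead of routing device messages through a central network server. The name hierarchy and the in-memory store are assumptions for illustration, not a proposed standard:

```python
# Illustrative sketch: LoRa sensor readings published as named data that any
# gateway or in-network cache can answer, instead of device-to-server messages.
# The name hierarchy is an assumption, not a proposed naming standard.

READINGS = {}   # stand-in for data held by a producer or gateway cache

def publish_reading(site: str, sensor: str, seq: int, value: float) -> str:
    name = f"/lora/{site}/{sensor}/temperature/seq={seq}"
    READINGS[name] = value
    return name

def latest(site: str, sensor: str) -> tuple[str, float]:
    """A consumer asks for the newest reading under a name prefix."""
    prefix = f"/lora/{site}/{sensor}/temperature/"
    name = max((n for n in READINGS if n.startswith(prefix)),
               key=lambda n: int(n.rsplit("=", 1)[1]))
    return name, READINGS[name]

if __name__ == "__main__":
    for seq, value in enumerate([20.1, 20.4, 20.3]):
        publish_reading("field-3", "node-42", seq, value)
    print(latest("field-3", "node-42"))   # newest reading under the prefix
```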

Watch Peter Kietzmann's talk about it here:

Written by dkutscher

October 6th, 2020 at 10:39 pm

Posted in Events, IRTF, Projects, Talks


ACM ICN-2020 Highlights


ACM ICN-2020 took place online from September 29th to October 1st 2020. This is a quick summary of the main technical highlights from my personal perspective. Overall, it was a high-quality event, and it was great to see the progress that is being made by different teams. Here, I am focusing specifically on Architecture, Content Distribution, Programmability, and Performance. If you are interested in the complete program, all papers, presentation material, and presentation videos are available on the conference website.

Architecture

The Information-Centric Networking concept can be implemented in different ways (and some people would argue that some overlay systems for content distribution and data processing are essentially information-centric). ICN systems have often been associated with clean-slate approaches, requiring a difficult-to-imagine fork-lift replacement of large parts of the infrastructure. While this has never been the case (because you can always run ICN protocols over different underlays or directly map the semantics to IPv6), it is still interesting to learn about new approaches and to compare existing data-oriented frameworks to pure ICN systems.

Named-Data Transport

In their paper Named-Data Transport: An End-to-End Approach for an Information-Centric IP Internet (Presentation) Abdulazaz Albalawi and J. J. Garcia-Luna-Aceves have developed an alternative implementation of the accessing named data concept called Named-Data Transport (NDT) that can leverage existing Internet routing and DNS, while still providing the general properties (accessing named-data securely, in-network caching, receiver-driven operation).

The system is based on three components: 1) A connection-free reliable transport protocol, called Named Data Transport Protocol (NDTP), 2) a DNS extension (my-DNS) for manifest records that describe content items and their chunks, and 3) NDT Proxies that act as transparent caches and that track pending requests, similar to ICN forwarders, but at the transport layer.

In NDT, content names are based on DNS domain names, and each name is mapped to an individual manifest record (in the DNS). These records provide a mapping to a list of IP addresses hosting content replicas. When requesting such records, the idea is that the system would be able to apply similar traffic steering as today's CDNs, i.e., provide the requestor with a list of topologically close locations. Producers would be responsible for producing and publishing such manifests.
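The following sketch shows what such a manifest-based resolution step could look like conceptually. The record fields and the "closest replica" selection are my own simplifications for illustration, not the my-DNS record format from the paper:

```python
# Conceptual sketch of NDT-style resolution: a content name maps to a manifest
# record that lists chunks and replica addresses; the resolver returns a
# topologically "close" replica first. Record fields are simplified assumptions.

from dataclasses import dataclass, field

@dataclass
class ManifestRecord:
    content_name: str                                # DNS-based content name
    chunks: list = field(default_factory=list)       # chunk identifiers
    replicas: list = field(default_factory=list)     # (ip, distance) pairs

MY_DNS = {
    "video.example.org/news/clip1": ManifestRecord(
        content_name="video.example.org/news/clip1",
        chunks=["chunk-0", "chunk-1", "chunk-2"],
        replicas=[("198.51.100.7", 30), ("203.0.113.9", 5), ("192.0.2.44", 12)],
    )
}

def resolve(content_name: str) -> ManifestRecord:
    record = MY_DNS[content_name]
    # Return replicas sorted by "distance", mimicking CDN-style traffic steering.
    record.replicas.sort(key=lambda r: r[1])
    return record

if __name__ == "__main__":
    rec = resolve("video.example.org/news/clip1")
    print(rec.replicas[0])   # closest replica
    print(rec.chunks)
```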

The Named Data Transport Protocol (NDTP) is a receiver-driven transport protocol (on top of UDP) used by consumers and NDT Proxies which behave logically like ICN forwarders. There is more to the whole approach (such as security, name privacy etc.).

In my view, NDT is an example of a resolution-based ICN system with interesting ideas for deployability. In principle, resolution-based ICN has been pursued by other approaches before (such as NetInf). In general, these systems have a better initial deployment story at the cost of requiring additional infrastructure (and resolution steps during operation).

RESTful Information-Centric Web of Things

In the Internet of Things, ICN has demonstrated many benefits in terms of reduced code complexity, better data availability, and reduced communication overhead compared to many vertically integrated IoT stacks and location/connection-based protocols.

In their paper Toward a RESTful Information-Centric Web of Things: A Deeper Look at Data Orientation in CoAP (presentation), Cenk Gündoğan, Christian Amsüss, Thomas C. Schmidt, and Matthias Wählisch compare a CoAP and OSCORE (Object Security for Constrained RESTFul Environments) based network of CoAP clients, servers, and proxies with a corresponding NDN setup.

The authors investigated the possibility of building a RESTful Web of Things that adheres to ICN first principles using the CoAP protocol suite (instead of a native ICN protocol framework). The results showed that, since CoAP is quite modular and can be used in different ways, this is indeed possible if one is willing to give up strict end-to-end semantics and to introduce proxies that mimic ICN forwarder behavior. (The paper reports on many other things, such as extensive performance measurements and comparisons.)

In my view, this is an interesting Gedankenexperiment, and there was a lively discussion at the conference. One of the discussion topics was the question of how accurate the comparison really is. For example, while it is possible to construct a CoAP proxy chain that mimics ICN behavior, real-world scenarios would require additional functionality in the CoAP network (routing, dealing with disruptions, etc.) that might lead to a different level of complexity (that would possibly be less pronounced in a native ICN environment).

Still, the important take-away of this paper is that some applications of CoAP & OSCORE exhibit information-centric properties, and it is an interesting question whether, for a green-field deployment, the user would not be better served by a native ICN approach.

Content Distribution

Content Distribution and ICN have a long history, sometimes challenged by some misunderstandings. Because one of the early ICN approaches was called Content-Centric Networking (CCN), it was often assumed that ICN would disrupt or replace Content Distribution Networks (CDNs) or that it was a CDN-like technology.

While ICN will certainly help with large-scale content distribution and potentially also change/simplify CDN operations, the core idea is actually about accessing named data securely as a principal network service -- for all applications (that's why Named Data Networking -- NDN -- is a better name).

Managed content distribution as such will continue to be important, even in an ICN world. Surely, it will enjoy better support from the network than today's CDNs can expect, thus enabling new exciting applications and simplifying operations, but I prefer avoiding the notion of ICN replacing CDN.

When looking at actual networks and applications today, it is fair to say that almost nothing works without CDN. What we are seeing today is hyperscalers and essentially all the (so-called) OTT video providers extending their systems into ISP networks by shipping edge caches such as Netflix OCA servers as standalone systems to ISPs.

Each of these providers has its own special requirements for how to map customers to edge caches, how to implement traffic steering, etc., which is painful enough for operators already. I expect this to become even more pressing as we shift more and more linear live TV to the Internet. Flash-crowd audiences such as viewers of UEFA Champions' League matches will require a massive extension of the already extensive edge caching infrastructure and massive investments, but will also add significant complexity with respect to traffic steering and guaranteeing a decent viewing experience.

In that context, it is no wonder that people try to resort to IP multicast for ensuring a more scalable last-mile distribution, such as this proposal by Akamai and others. Marrying IP multicast with a CDN overlay is (IMO) not exactly complexity reduction, so I think we are now at a tipping point where the Internet, in terms of concepts and deployable physical infrastructure, can provide many cool services, but where the limited features of the network layer require a prohibitive amount of complexity -- to an extent where people start looking for better solutions.

At ICN-2020, CDN was thus discussed quite extensively again -- with many interesting, complementary contributions.

Keynote by Bruce Maggs on The Economics of Content Distribution

We were extremely happy to have Bruce Maggs (Emerald Innovations, on leave from Duke University, ex NEC researcher, one of the founding employees of Akamai) delivering his keynote on the Economics of Content Delivery. In his talk Bruce explained different economic aspects (flow of payments, cost of goods sold) but also challenges for different CDN services such as live-streaming.

The take-aways for ICN were:

  • Incentives and cost must be aligned
  • Performance benefits from caching
    • Reducing latency is valuable to content providers
    • Reducing network traffic is valuable to ISPs.
  • If there was caching at the core (in addition to the edge)
    • What is the additional benefit?
    • Who pays for that?
  • Protocol innovation is still possible
    • In the past, people thought that HTTP/TLS/TCP/IP would be difficult to overcome
    • QUIC demonstrates that new protocols can be introduced

The socio-economic discussion resonated quite well with me, as some of the earlier ICN projects in Europe tried to address these aspects as early as 2008. I believe this was due to the operator and vendor influence at the time. In retrospect, I would say that the approaches at that time were possibly too top-down and premature (trying to revert value chains and find new business models). It is only now that we understand the economics of CDN, its complexity, and the real costs that (in my view) represent barriers to innovation -- and that we can start to imagine actually implementing different systems.

Far Cry: Will CDNs Hear NDN's Call?

In their paper Far Cry: Will CDNs Hear NDN's Call? (presentation), Chavoosh Ghasemi, Hamed Yousefi, and Beichuan Zhang tried to compare NDN with enterprise CDN (a particular variant of CDN) with respect to caching and retrieval of static content.

In their work, the authors deployed an adaptive video streaming service over three different networks: Akamai, Fastly, and the NDN testbed. They had users on four different continents and conducted a two-week experiment, comparing quality of experience, origin workload, failure resiliency, and content security.

I cannot summarize all of the results here, but the conclusions by the authors were:

  • CDNs outperform the current NDN testbed deployment in terms of QoE (achievable video resolution in a DASH-setting)
  • Origin workload and failure resiliency are mainly the products of the network design -- and the NDN testbed outperforms current CDNs
  • More as an interpretation: NDN can realize a resilient, secure, and scalable content network given appropriate software and protocol maturity and hardware resources.

The paper was discussed intensively at the conference; for example, it was debated how comparable the plain NDN testbed and its network service really are to a production-level CDN.

In my view, the value of this paper lies in the created experiment facilities and the attempt to establish some ground truth (based on current NDN maturity). I hope that this work can be leveraged by more experiments in the future.

iCDN: An NDN-based CDN

In their paper iCDN: An NDN-based CDN (presentation), Chavoosh Ghasemi, Hamed Yousefi, and Beichuan Zhang (i.e., the same authors), pursue a more forward-looking approach. In this paper, they develop a CDN service based on ICN mechanisms, i.e., trying to conceive a future CDN system that does not need to take the current network's limitations into account.

One of the interesting ICN properties is that the main service of accessing named data does not require any notion of location. Sometimes people assume that an Information-Centric system always needs to map names to locators such as IP addresses, but this is a really limited view. Instead, it is possible to build the network solely on forwarding Interests for named data based on forwarding information of that same namespace. A forwarder may have more than one forwarding information base (FIB) entry for the same name -- from a consumer (application) perspective, these are completely equivalent.

Because of intrinsic object security, it does not matter from which particular host a content object is served. There can be several copies -- all equivalent. When creating copies of original content, e.g., by cloning a data repository, the new copy needs to be announced (by injecting routing information), and from that point on, it is reachable without any additional management, configuration, or other out-of-band mechanisms.
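A tiny sketch of what this means for forwarding: the FIB below holds several equivalent entries for the same name prefix, and adding a new repository is just another route announcement. The cost metric, face names, and prefixes are illustrative assumptions, not iCDN's actual mechanisms:

```python
# Illustrative sketch: a name-based FIB with several equivalent entries per
# prefix. Announcing a cloned repository is just adding another entry; the
# forwarder picks the lowest-cost next hop. Costs and names are assumptions.

FIB: dict[str, list[tuple[str, int]]] = {}   # prefix -> [(next_hop, cost), ...]

def announce(prefix: str, next_hop: str, cost: int) -> None:
    FIB.setdefault(prefix, []).append((next_hop, cost))

def forward(name: str) -> str:
    """Longest-prefix match, then choose the cheapest of the equivalent copies."""
    best_prefix = max((p for p in FIB if name.startswith(p)), key=len)
    return min(FIB[best_prefix], key=lambda e: e[1])[0]

if __name__ == "__main__":
    announce("/icdn/provider-a/movies", "face-origin", 20)
    print(forward("/icdn/provider-a/movies/m1/seg=0"))   # face-origin
    # A new replica repository is announced -- nothing else to configure:
    announce("/icdn/provider-a/movies", "face-edge-cache", 3)
    print(forward("/icdn/provider-a/movies/m1/seg=0"))   # face-edge-cache
```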

When applying this notion to CDN scenarios, it is easy to understand the simplification opportunities. In ICN, content repositories can be added to the network, and in-network name-based forwarding will find the closest copy automatically.

For iCDN, the authors have leveraged this basic notion and built an ICN-based CDN that does not need any client-to-cache mapping and overlay routing mechanisms. Based on that, iCDN features logical partitions and cache hierarchies for content namespaces (for acknowledging that there may be different CDN providers, hosting different content services).

iCDNs employ cache hierarchies to exploit on-path and off-path caches without relying on application-layer routing functions. The idea was to provide a scalable, adaptive solution that can cope with dynamic network changes as well as dynamic changes in content popularity.

There are more details to this approach, and of course the debate on what is the best ICN-based CDN design has just started. Still, this paper is an interesting contribution in my view, because it illustrates the opportunities for rethinking CDN nicely.

Programmability

Programmability and ICN has two facets: 1) Implementing distributed computing with ICN (for example as in CFN -- Compute-First Networking) and 2) implementing ICN with programmable infrastructure. ACM ICN-2020 has seen contributions in both directions.

Result Provenance in Named Function Networking

In their paper Result Provenance in Named Function Networking (presentation), Claudio Marxer and Christian Tschudin have leveraged their previous work on Named Function Networking (NFN) and developed a result provenance framework for distributed computing in NFN.

In this work, the authors augmented NFN with a data structure that creates transparency about the genesis of every evaluation result so that entities in the system can ascertain result provenance. The main idea is the introduction of so-called provenance records that capture metadata about the genesis of the computation result. The paper discusses the integration of these records into NDN and procedures for provenance checks and trust computation.
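Conceptually, such a provenance record could look like the sketch below, which binds a result name to the function and input names it was derived from and signs the record (an HMAC is used here as a stand-in for a real signature). The field names are my assumptions, not the paper's record format:

```python
# Conceptual sketch of a provenance record for a computation result: it records
# the function and input names the result was derived from, plus a digest of
# the result, and is signed (HMAC as a stand-in for a real signature).
# Field names are illustrative assumptions.

import hashlib
import hmac
import json

KEY = b"producer-signing-key"   # placeholder for real key material

def provenance_record(result_name: str, function: str,
                      inputs: list, result: bytes) -> dict:
    record = {
        "result": result_name,
        "function": function,
        "inputs": inputs,
        "result_digest": hashlib.sha256(result).hexdigest(),
    }
    payload = json.dumps(record, sort_keys=True).encode()
    record["signature"] = hmac.new(KEY, payload, hashlib.sha256).hexdigest()
    return record

def verify(record: dict, result: bytes) -> bool:
    unsigned = {k: v for k, v in record.items() if k != "signature"}
    payload = json.dumps(unsigned, sort_keys=True).encode()
    sig_ok = hmac.compare_digest(
        record["signature"], hmac.new(KEY, payload, hashlib.sha256).hexdigest())
    return sig_ok and record["result_digest"] == hashlib.sha256(result).hexdigest()

if __name__ == "__main__":
    result = b"42"
    rec = provenance_record("/nfn/results/word-count/doc1",
                            "/nfn/func/word-count", ["/data/doc1"], result)
    assert verify(rec, result)
```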

In my view, the interesting contribution of this work is the illustration of how the general concept of provenance verification can be implemented in a data-oriented system such as the ICN-based Named Function Networking framework. The results may be (to some extent) transferable to other ICN-based in-network computing systems, so I hope this paper will start a thread of activities on this subject.

ENDN: An Enhanced NDN Architecture with a P4-programmable Data Plane

In their paper ENDN: An Enhanced NDN Architecture with a P4-programmable Data Plane (presentation), Ouassim Karrakchou, Nancy Samaan, and Ahmed Karmouch present an NDN system that is implemented in a P4-programmable data plane, i.e., a system in which applications can interact with a control plane that configures the data plane according to the required services.

The work in this paper is based on the notion that applications specify their content delivery requirements to the network, i.e., to the control plane of a network. The control plane provides a catalogue of content delivery services, which are then translated into data plane configurations that ultimately get installed on P4 switches.

Examples of such services include Content Delivery Pattern services (whether the system is based on INTEREST/DATA or some stateful data forwarding), Content Name Rewrite services (enabling the network to rewrite certain names in INTERESTs), Adaptive Forwarding services (next-hop selection) etc.
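As a rough mental model of this control-plane/data-plane split (not the actual ENDN API or generated P4 code), one can think of the control plane as translating a per-prefix service selection into match-action table entries that get installed in the data plane:

```python
# Rough mental model: applications pick a content delivery service per name
# prefix, and the control plane turns that into match-action table entries.
# Catalogue entries, parameters, and actions are assumptions, not ENDN's API.

SERVICE_CATALOGUE = {
    "interest-data": {"action": "stateful_forward", "cache": True},
    "name-rewrite":  {"action": "rewrite_name",     "cache": False},
    "adaptive-fwd":  {"action": "select_next_hop",  "cache": True},
}

DATA_PLANE_TABLE = []   # entries the switch would match on

def request_service(prefix: str, service: str, **params) -> dict:
    """Control plane: translate a service request into a data plane entry."""
    entry = {"match_prefix": prefix, **SERVICE_CATALOGUE[service], **params}
    DATA_PLANE_TABLE.append(entry)
    return entry

if __name__ == "__main__":
    request_service("/video/live", "adaptive-fwd", next_hops=["p1", "p2"])
    request_service("/legacy/cam1", "name-rewrite", rewrite_to="/cams/building-a/1")
    for e in DATA_PLANE_TABLE:
        print(e)
```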

In my view, this paper is interesting because it provides a relatively advanced perspective of how applications specify required behavior to a programmable ICN network. Moreover, the authors implemented this successfully on P4 switches and described relevant lessons learned and achievements in the paper.

Performance

Performance has historically always been an interesting topic in ICN. On the one hand, ICN provides substantial performance increases in the network due to its forwarding and caching features. On the other hand, it has been shown that implementing an ICN forwarder that operates at modern network line-speeds is challenging.

NDN-DPDK: NDN Forwarding at 100 Gbps on Commodity Hardware

In their paper NDN-DPDK: NDN Forwarding at 100 Gbps on Commodity Hardware (presentation), Junxiao Shi, Davide Pesavento, and Lotfi Benmohamed present their design of a DPDK-based forwarder.

The authors have developed a complete NDN implementation that runs on real hardware and that supports the complete NDN protocol and name matching semantics.

This work is interesting because the authors describe the different optimization techniques, including better algorithms and more efficient data structures, as well as making use of the parallelism offered by modern multi-core CPUs and multiple hardware queues with user-space drivers for kernel bypass.
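One of the underlying ideas, dispatching packets to parallel forwarding threads so that each thread owns its shard of forwarding state, can be illustrated in a few lines. This is a conceptual illustration of the parallelization idea only, not NDN-DPDK's actual dispatching logic:

```python
# Conceptual illustration of sharding forwarding work across cores: Interests
# are dispatched by a hash of the name, so the same name always lands on the
# same worker and its PIT entry stays local (no cross-core locking).
# Not NDN-DPDK's actual dispatch logic.

import hashlib
from collections import defaultdict

N_WORKERS = 4
WORKER_PIT = [defaultdict(list) for _ in range(N_WORKERS)]  # per-worker PIT

def dispatch(name: str) -> int:
    """Same name always maps to the same worker."""
    digest = hashlib.sha256(name.encode()).digest()
    return int.from_bytes(digest[:4], "big") % N_WORKERS

def on_interest(name: str, ingress_face: str) -> None:
    worker = dispatch(name)
    WORKER_PIT[worker][name].append(ingress_face)   # record/aggregate pending Interest

if __name__ == "__main__":
    for i in range(8):
        on_interest(f"/video/live/seg={i}", ingress_face="face-0")
    on_interest("/video/live/seg=0", ingress_face="face-1")  # aggregates on one worker
    print([sum(len(faces) for faces in pit.values()) for pit in WORKER_PIT])
```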

This work represents the first software forwarder implementation that is able to achieve 100 Gbps without compromises in NDN protocol semantics. The authors have published the source at https://github.com/usnistgov/ndn-dpdk.

Written by dkutscher

October 4th, 2020 at 12:28 am

Posted in Events


Reflexive Forwarding for Information-Centric Networking


In most Internet (two-party) communication scenarios, we have to deal with connection setup protocols, for example for TCP (three-way handshake), TLS (three-way key agreement), and HTTP (leveraging TLS/TCP before GET-RESPONSE). The most important concern is to make sure that both parties know that they have successfully established a connection and agree on its parameters.

In client-server communication, there are other, application-layer requirements as well, for example authenticating and authorizing peers and checking input parameters. Web applications today typically serve a mix of static and dynamic content, and the generation of such dynamic content requires a considerable amount of client input (as request parameters), which results in considerable amounts of data (Google: "Request headers today vary in size from ~200 bytes to over 2KB.", SPDY Whitepaper).

When designing connection establishment protocols and their interaction with higher layer protocols, there are a few, sometimes contradicting objectives:

  • fast connection setup: calls for minimizing the number of round-trips;
  • reliable connection and security context setup: reliable state synchronization requires a three-way handshake; and
  • robustness against attacks from unauthorized or unwanted clients: could be done by filtering connection attempts, by authentication checks, or other parameter checks on the server.

The goal to minimize the number of round-trips can conflict with robustness: for example, in a dynamic web content scenario, spawning a server worker thread for processing a malicious client request that will have to be declined can be a huge waste of resources and thus make the service susceptible to DoS attacks.

These are general trade-offs in many distributed computing and web-based systems. In Information-Centric Networking (ICN), there can be additional objectives such as maintaining client (consumer) anonymity (to the network) to avoid finger-printing and tracking (ICN does not have source addresses).

Current ICN protocols such as CCNx and NDN have a wide range of useful applications in content retrieval and other scenarios that depend only on a robust two-way exchange in the form of a request and response (represented by an Interest-Data exchange in the case of the two protocols noted above).

A number of important applications, however, require placing large amounts of data in the Interest message and/or more than one two-way handshake. While these can be accomplished using independent Interest-Data exchanges by reversing the roles of consumer and producer, such approaches can be both clumsy for applications and problematic from a state management, congestion control, or security standpoint.
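To see why this is clumsy, consider the workaround pattern in the sketch below: because the consumer cannot push its (large) input parameters in a single Interest, it publishes them as named data and waits for the producer to reverse roles and fetch them, which requires extra state and a routable name at the consumer. This is a schematic illustration of the problem, not the protocol specified in the draft:

```python
# Schematic illustration of the role-reversal workaround that reflexive
# forwarding is meant to avoid: the consumer publishes its input parameters as
# named data and the producer turns consumer to fetch them. This sketches the
# problem, not the protocol in draft-oran-icnrg-reflexive-forwarding.

PUBLISHED = {}   # named data reachable in the network

def consumer_invoke(method_name: str, parameters: bytes) -> str:
    """Consumer: publish parameters under its own (routable!) prefix and send an
    Interest that merely points at them."""
    param_name = "/consumer-7/params/call-0001"       # consumer must be routable
    PUBLISHED[param_name] = parameters
    return producer_handle(method_name, param_name)   # "Interest" carrying a pointer

def producer_handle(method_name: str, param_name: str) -> str:
    """Producer: reverse roles, fetch the parameters, then compute the result."""
    params = PUBLISHED[param_name]                     # second Interest/Data exchange
    return f"result({method_name}, {len(params)} bytes of input)"

if __name__ == "__main__":
    print(consumer_invoke("/service/transcode", b"\x00" * 4096))
```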

For RICE, Remote Method Invocation for ICN, we developed a corresponding scheme that addresses the different objectives mentioned above.

In draft-oran-icnrg-reflexive-forwarding we have now provided a formal specification of a corresponding Reflexive Forwarding extension to the CCNx and NDN protocol architectures that eliminates the problems inherent in using independent Interest-Data exchanges for such applications. It updates RFC8569 and RFC8609.

The approach that we have taken here is to extend the ICN forwarding node requirements, so in addition to the general state synchronization problems, this Internet Draft raises the question of evolvability of core ICN protocols.

Discussion on the ICNRG mailing list.

Written by dkutscher

April 3rd, 2020 at 5:06 pm

Posted in Blogroll, IRTF


Back to Humboldt — or How to Organize your Teaching in covid-19 Times


Many university-level teachers have switched or will have to switch to online teaching and coaching. My university switched relatively seamlessly a few weeks ago already, and I have received quite a few requests for advice, so let me share some thoughts here.

The TL;DR summary:

  • keep calm and carry on;
  • understand your objectives and teaching methodology;
  • balance technology and didactics concerns;
  • avoid tool chaos;
  • use existing infrastructure;
  • understand scalability requirements and infrastructure constraints;
  • record everything;
  • leverage new possibilities;
  • and if you have to pick one online teaching tool, use BigBlueButton (see below).

First of all, it's interesting to see how different universities in different countries approach the covid-19 crisis. Some have switched immediately to online teaching (or just extended their already existing online courses). Others announced extended Easter breaks, and there are even discussions of just canceling the summer term.

While the prospect of Corona holidays (or just more time for other work) may sound attractive, I would strongly advise against it for two reasons:

  1. Extended breaks, suspended periods of teaching etc., will most likely result in more stress for everybody (professors, students, admin staff) later. In a situation with many uncertainties that may well hurt us more in the end.
  2. The lock-down (in whatever regional variant) is necessary, but it's obvious that you cannot lock down everything (hospitals, food production etc.). Every social and business activity that is locked down will hurt society in some way. There are some activities that cannot continue right now and absolutely have to be suspended to avoid community transmission, which is causing enough problems (small shops, artists but also larger scale factories). Luckily, there are some professions that can be re-organized and continue in some way -- university-level teaching is one of those. These activities should continue just to minimize societal damage.

In my university, the term started on March 1st. Luckily, the executive leadership had been quite up to speed regarding Covid-19-related measures in the weeks before in terms of communication, sanitation, travel advice, etc. When the federal state of Lower Saxony in Germany announced the suspension of presence-based teaching in universities and schools on March 13th, it did not come unexpectedly, and we continued most courses online in the following weeks.

Obviously, not everything went perfectly, and there were a few lessons learned that I will summarize in the following. I do research and teaching in Networked Systems (Computer Science), so there is a certain technology bias here.

Avoid Tool Chaos -- Use Existing Infrastructure as Much as Possible

Most universities are using some kind of learning management or e-learning system such as Moodle. They are never perfect of course, but it is really a good idea to use them as much as possible, because:

  • your students are already enrolled and typically know the system well;
  • Moodle and similar systems provide a ton of collaboration features that you may have ignored so far but that are really useful, such as Wikis, forums, etc. They may not always be as super-fancy as some individual externally-hosted services -- but think about your priorities in crisis times…
  • Your learning management and e-learning systems are production tools that contribute to your university's core business, so there is a good probability that they are actually well-provisioned and well maintained -- good in times of fast-growing demand.

Of course everyone has their favorite Wiki, shared editing, online collaboration tool etc., but just going for these incurs cost on two sides:

  • You have to select, assess, set-up and configure them. When things break because of exploding demand, you have to re-iterate etc.
  • The combinatorial explosion for students that have to deal with all the different preferred tools is significant.

Understand Your Objectives and Teaching Methodology

When presence-based teaching is suspended, many people would probably think: "OK, I have to do Zoom lectures now".

First of all, translating all presence-based activities directly to online lectures will most likely be extremely stressful for both you and your students. You would not imagine the fatigue that sets in after a full day of different online courses. So, it's not unreasonable to scale down both the density of individual lectures as well as their cadence.

Moreover, not everything has to be done in synchronous online meetings. Luckily, there are many ways to share knowledge and engage in discussions that are also used productively in non-Covid-19 times, such as shared editing, Wikis, and forums (see above).

Finding a good balance between synchronous and asynchronous teaching/collaboration can also really transform your courses from traditional teaching & examination to something more interesting.

Also, reach out to the didactics experts in your organization (or elsewhere). They may actually have good tips for unconventional methods that were never quite practical but might just be useful now.

In that context, most people would probably agree that good education ("Bildung" in German) is more than programming skills into brains, so the crisis could be a good opportunity to double down on (r)evolving education towards the Humboldtian model of higher education.

Specifically, I am referring to promoting self-determination and responsibility, encouraging self-motivated learning, and combining research and teaching.

So when thinking about methodology and tools, do not only think about the tools that you use -- also think about what you offer to enable students to study, discuss, and research without you. Luckily, most students don't need to be taught about good online collaboration tools, but it may still be useful to provide a space on a reliable platform, at least for kickstarting things.

With respect to tools, personally, I have converged to:

  • asynchronous collaboration tools (forum, Wiki, tests) in Moodle;
  • file sharing, collaborative software development with gitlab;
  • online teaching with BigBlueButton (see below);
  • online discussion (smaller groups, chat rooms for students) with Jitsi Meet (see below);

I have been using many of those before anyway, so no big change.

As a meta-remark, I would also recommend managing everyone's expectations (especially your own): crisis means change for everyone, lots of improvisation, and hopefully lots of volunteer effort. There is really no need to expect (and demand) perfection. While it's good to carry on and make sure students get their education and degrees, nobody will be angry if there is a slow start, some slack in between, and some prioritization of fewer topics -- maybe quite the opposite.

Online Teaching and Tools

Having sorted out asynchronous vs. synchronous collaboration above, there is still a lot to be said about online teaching tools. Again, it's quite important to understand your objectives -- and what the individual tools are actually intended for. Also, when deciding on a particular tool or platform, it does not hurt to understand some technical basics, such as how Internet multimedia works, what scalability means, and how infrastructure constraints may affect your experience.

It's very natural: when we have worked with some online communication/collaboration tool, and it worked OK-ish, we tend to use it again, sometimes also for unintended purposes. However, it's important to understand two things:

  1. Online teaching can mean different things (making video available vs. interactive online classrooms), and although you can stretch things sometimes, there is not one tool that fits all purposes.
  2. Just the fact that a tool worked once for you, and the user experience was not completely horrible, does not imply that it will work well in your lecture.

Different Forms of Online Teaching

This might be self-explanatory to most, but let me just quickly explain the basics:

  • Lecture Streaming/Broadcasting (on YouTube, Twitch and similar platforms) is great for distributing recorded or live content to audiences. Although there are chat-based feedback mechanisms, it's not the same as virtual classrooms with interactive discussion, collaborative editing etc. Don't get me wrong, I have used YouTube for live lectures myself and it's OK if you can accept the constraints. I would use it for public, pre-recorded content mostly.
  • General-purpose online multimedia communication (with Skype, Jitsi Meet, Google Hangouts, WebEx, Zoom etc.) is great to discuss in your team, family etc., and sometimes these tools can also scale to larger conference calls, but they are not primarily intended for online teaching. For example, they would often lack collaborative editing, integrated document sharing, integration with learning management and e-learning systems etc. Of course, some of them are also quite feature-rich and certainly usable (I have done lectures with tools like that without problems), but it's better to treat them as a fallback -- for example in crisis times, when you need a fast solution.
  • Online Teaching tools (for example BigBlueButton) are specialized multimedia conferencing tools that provide extra functionality to ensure a good teaching and learning experience. For example, they make sure that presentation material sharing works really well (and is not just window sharing in a video stream), they use reasonable resolution and bandwidth settings that balance quality and resource efficiency, somebody has thought about UX design for teachers' view, they make it easy to share session recordings, and they integrate with your LMS.

The point here is not that one of these is better than the others -- these are really just different categories for different purposes. If you can, pick the right one for your needs.

Technical Constraints

Believe it or not, thanks to relentless research and engineering efforts by the networking community, namely the IETF and IRTF, (interactive) multimedia real-time communication is technically a solved problem. Still, sometimes things get screwed up badly -- why?

With most hosted online communication tools, there is really no point in extrapolating one's one-time experience to general applicability in a classroom scenario. Even if tool A worked well in your class today, it does not have to mean it will work well for your colleague tomorrow. There are different factors that affect, for example, scalability and usability, especially in crisis times.

For example, when talking about scalability, there are two dimensions:

  1. How many participants can you have in one session? Obviously, this also depends on what you do, e.g., is there one video sender or 100? But independent of that, some tools may have design characteristics (video formats and encoding options, protocols, scalability of the server software) that make them support larger crowds better or worse than others.
  2. How many conferences can you have at the same time? Assuming you are a university, this would be an interesting question. For hosted systems, this fundamentally depends on the available (or allocated) resources, i.e., servers at your provider.

In other words, there can be systems that are great, a pleasure to use from a software design perspective, but in order to make a credible statement on applicability to your online teaching, you need to consider:

  • Who is hosting the system?
  • How does performance isolation work?
  • How oversubscribed is the service (now and at peak times)?
  • What is the latency between you (your participants) and the server(s), i.e., where are they hosted?

Some conference systems work with static resource allocation, for example one virtual machine per personalized conference server. This can work well, depending on how many VMs are allocated to one physical server. Others may use modern cloud-native auto-scaling. In general, great -- but it still depends on how generous you are with respect to resource allocation.

The point I am trying to make is that it is often not very helpful to recommend your popular tool if you cannot say anything about the deployment parameters and the particular scaling approach.

Third-party Hosting vs. Self-Hosting

With all these uncertainties with externally hosted services one might ask: isn't it better to run a self-hosted conferencing server (farm), for example a licensed commercial system or an Open-Source system?

Well, this depends a lot on your infrastructure: assuming you are using interactive online teaching with at least one video stream at a time, with today's technology you would need about 1 Mbit/s per participant, i.e., 100 Mbit/s for a class of a hundred (on average -- it can easily be double or more, depending on video quality). A university has many simultaneous lectures, so you might be reaching 1 Gbit/s with 10 simultaneous lectures already. That's not necessarily too much -- it depends on your institution's internal network and access to the Internet.
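The back-of-the-envelope calculation is simply per-participant bitrate times participants times parallel lectures (the 1 Mbit/s figure is the rough average assumed above; real values depend heavily on video quality):

```python
# Back-of-the-envelope capacity estimate, using the rough 1 Mbit/s per
# participant assumed above; actual rates depend heavily on video quality.

MBIT_PER_PARTICIPANT = 1.0   # average; can easily be 2x or more

def required_mbit(participants: int, lectures: int) -> float:
    return MBIT_PER_PARTICIPANT * participants * lectures

if __name__ == "__main__":
    print(required_mbit(100, 1))    # one class of 100 -> ~100 Mbit/s
    print(required_mbit(100, 10))   # ten parallel lectures -> ~1000 Mbit/s (1 Gbit/s)
```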

Video servers are also relatively resource-hungry. Nothing that cannot be handled by a few powerful servers, but you would have to set them up with a load balancer, maintain them etc.

This can all be done, and your typical sysadmins should be able to do that -- but it's probably not the right choice when your provost tells you that everybody has to switch to online teaching tomorrow.

My Recommendations

I am using a mix of asynchronous and different synchronous collaboration tools, i.e., as mentioned above:

  • Moodle-based management and collaboration (mailing lists, basic Wiki, forum, tests etc.)
  • gitlab (git repository plus Wiki mostly)
  • Self-hosted Jitsi Meet for online meetings
  • University-hosted BigBlueButton for online teaching

In the spirit of full transparency, I am also using WebEx as a fallback (kindly sponsored by Cisco) just to have some redundancy.

I am considering using some form of instant messaging system (probably Jabber) to create a more inclusive, connected community for courses, but have not found the time yet to set this up. I used Slack before for university projects, but I don't want to make it a rule for all courses.

Jitsi Meet

Jitsi Meet is an Open Source video conferencing tool that you can use via Jitsi's public server or host on your own infrastructure. It's using WebRTC (i.e., media streaming in your browser). Communication between your browser and the server is encrypted. The server is not mixing video (like some systems do) but is using selective forwarding (i.e., switching the main video stream depending on configuration and on who is currently talking).

It's a great tool that is sometimes underestimated because you don't see the different options when you just use the public service. For example, Jitsi can do recording, YouTube live streaming, and collaborative editing (through Etherpad). There are also ways to run it with load balancers for better scalability and availability, and you can even network the video bridges for a better experience in global, large-scale conferences.

Load of our Jitsi VM with three parallel conferences (lectures, meetings)

There is still work to do with respect to video codecs (at least the transmission rates can be quite high sometimes), the way that recording and streaming is implemented (through a pseudo client that grabs video from a Chrome client) and usability of the server software (nothing crazy).

My recommendations for running your own Jitsi Meet server:

  • Use the Docker installation option that runs the different server components in Docker containers and makes the initial setup really easy (including automatic Let's Encrypt certificate installation);
  • Encourage your users to use the desktop client application (instead of running it in a browser). The desktop client contains the same WebRTC code in a package. It works a bit better (performance- and reliability-wise) compared to using your average Chrome or Firefox -- I suspect because of potential feature interaction with add-ons (and I use quite a few).
  • Use the options for muting participants on joining and enforce certain rules for larger meetings (i.e., mute everybody except the main presenter, turn off video unless needed etc.)
  • Jitsi will happily use the best video quality that your camera can produce. Often, that is not needed -- you can configure lower quality (which can reduce server/network load significantly).
  • The bottleneck in a Jitsi server is typically the network interface, so try to run it on a server with a 10 Gbit/s interface for better results.

In my group, we set up our own server in the week before presence-based teaching got suspended, and it has proven to be super-valuable, especially when many of the centralized server-based systems failed on day one. Some of my colleagues are using it regularly for their courses.

BigBlueButton

BigBlueButton is a web conferencing system designed for online teaching and learning:

  • You can maintain several rooms (say one per class, each of which with its own configuration, recordings etc.). Every room has a unique URI -- that's all students have to know.
  • the UI is designed to enable tracking the video, shared material, the participant roster, and chat windows;
  • presenting slides etc. is not done via screen/application sharing but through a dedicated distribution channel: you can upload PDFs and other formats to the server, which then distributes them to the clients -- this works much better than screen sharing in a video stream;
  • live multi-user whiteboards;
  • user polling;
  • recordings and replay on the platform; and
  • learning management system integration.

I am really convinced by the overall look and feel of the system and its performance and scalability. So far, I have used it in lectures with up to 60 participants (on a spare, not high-end server) without any problems. The resource requirements seem lower compared to Jitsi Meet, probably also because of a more careful configuration of default video sending rates.

BigBlueButton Management Console (in Firefox)

From a security perspective, BigBlueButton uses encryption for all communication between your browser and the server (but the server still needs to have access, like the Jitsi Meet server).

My recommendations for running your own BigBlueButton server:

  • The client software runs well in a number of browsers (tested with Safari, Chrome, and Firefox) so far. If you want to use WebRTC desktop/application window sharing, make sure you use it with Firefox or Chrome.
  • For university-scale deployments, there is a load balancer for BigBlueButton that allows you to add more servers as you grow.

In summary, I am convinced that BigBlueButton addresses most if not all online teaching requirements. If you find a way to run it (or have it hosted), you should use it.

Recording

Many of us had experimented with recording lectures before (for example, for flipped classroom setups). Now, with the general shift to online lectures, recording essentially becomes a "by-product", i.e., you can just turn it on (if your system supports it).

In the current crisis period, recording is actually not only nice-to-have -- I would even say that it's a crucial feature:

  • Having the possibility to provide access to recorded lectures can remove a lot of pressure in times of distress. There is a lot to process for each of us and just knowing that there will be recordings creates additional assurance.
  • Online teaching sessions are peak-utilization periods. Having videos for asynchronous consumption can help distribute the load because not everybody has to join the simultaneous live stream.
  • Depending on the region your students live in, access networks may not be perfect, especially not if you have to share a low-bandwidth link with another student who is supposed to follow online lectures.
  • In times of lock-downs and travel restrictions, some of your students may actually be out of the country without a chance to return any time soon. They may not even be in the same timezone…
  • Although you would expect that everyone is currently practicing home-sheltering and should have lots of time on their hands, don't forget that crisis times can actually mean real crisis for individual people: they may have to take care of family members or themselves, queue at supermarkets or doctors' offices, etc. -- so just because you are sitting in your home office does not have to imply that everybody is.

In summary, consider recording everything (with participant consent) and make it available whenever possible.

To Zoom or not to Zoom?

Zoom has been getting a lot of bad press recently and I'm getting many questions from colleagues and friends about it.

First of all, Zoom is a modern video conferencing service with excellent scaling properties, so performance is typically good, and it's also very easy to use.

Should you use it?

No

While some of the recent online articles are hyperbole and based on an incorrect understanding of how these systems typically work, there are, in my opinion, some strong arguments to stay away from Zoom:

  • Zoom is presenting itself as a (paid) conferencing service. However, it has turned out that they also work with quite a few infamous tracking systems, i.e., they share data about you (and everyone who is using it) with the online tracking industry. Many websites do that because it's their main business model, and many users weren't aware of it before GDPR at least forced them to obtain your consent. It's not obvious why a commercial conferencing service has to do that, though.
  • For some websites and services, we have gotten used to ubiquitous tracking, and we may either accept it or not, or find ways to contain it (difficult). Personally, we may even be OK with tracking. However, it's a different thing for online teaching where you have a captive audience. By using a tracking-encumbered system in your lecture, you are essentially forcing your students to use it, too -- and to become subjects of tracking and surveillance themselves.
  • The Zoom client runs in a web browser (where expert users may be able to contain the tracking to some extent), however Zoom is trying to force users to install the standalone application.
  • Unfortunately, Zoom has demonstrated time and again that they do not understand basic system security, for example: the webcam hack, the malware-style installation trick on macOS, and the use of a single AES-128 key in ECB mode.

Bruce Schneier has summarized the most critical issues. Zoom is apparently another company that adopted the "grow fast, apologize later" approach and is now trying to surf on the Covid-19 wave to accelerate its growth at whatever cost.

These models used to be standard in the web and advertisement industry. Times are changing though, and as more people understand the problems of opaque, uncontrolled surveillance, aggregation, and unlimited storage, these business ethics will become increasingly unacceptable. In a few years, we will look back at it bewildered -- like we look at Weinstein-type misogyny in #MeToo times.

I am hoping that Zoom as a company gets the message, but I am not confident to be honest.

Luckily, for online teaching, we don't have to care because there are better alternatives anyway.

So, make wise choices.

Updates

  • 2020-04-04: Fixed formatting and other nits; added Zoom AES-128-ECB vulnerability and links to Citizen Lab's and Bruce Schneier's blog postings about it.

Written by dkutscher

April 2nd, 2020 at 11:01 pm

Keynote at IEEE HotICN-2019


I had the pleasure of being invited for a keynote at IEEE HotICN-2019 in Chongqing. I talked about key ICN properties (from my perspective), about general research areas, and three specific topics: Quality of Service, Forwarding Plane Interaction with the Routing System and Applications, and In-Network Computing.

HotICN-2019

Written by dkutscher

December 16th, 2019 at 9:47 pm

Posted in Events
