Can the DNS Still Be Saved?
The new episode of our podcast Neulich im Netz is about names on the Internet, i.e., the Domain Name System (DNS). We talk about fundamental DNS functions, the significance of the DNS for the Internet and the Web, and applications such as tracking and traffic steering that one might not immediately associate with name resolution.
We discuss to what extent the technical design of the DNS and the way it is used today lead to security problems, and we assess some proposed improvements. Can the DNS still be saved in its current form? What are the chances? These and other questions in the second episode of Neulich im Netz.
Neulich im Netz – the Internet Technologies Podcast
I am pleased to announce that I have teamed up with Rolf Winter for a new bi-weekly video podcast series: Neulich im Netz.
We cover current and relevant developments in Internet technologies and networked systems in general. We are starting with a German-language channel.
Our first episode was released today: we talk about Covid-19 and the Internet.
- Website: https://www.neulich-im.net/
- YouTube
Great Expectations
Protocol Design and Socioeconomic Realities
The Internet and the Web as a whole qualify as wildly successful technologies, each empowered by wildly successful protocols per RFC 5218's definition [1]. As the Internet and the Web became critical infrastructure and business platforms, most of the originally articulated design goals and features, such as global reach, permissionless innovation, and accessibility [5], were overshadowed by the trade-offs they incur. For example, global reach, intended to enable global connectivity, can also mean global reach for infiltration, regime change, and infrastructure attacks by state actors. Permissionless innovation, motivated by the intention to overcome the lack of innovation options in traditional telephone networks, has also led to permissionless surveillance and mass-manipulation-based business models that have been characterized as detrimental from a societal perspective.
Most of these developments cannot be directly ascribed to Internet technologies alone. For example, most user surveillance and data extraction technologies are actually based on web protocol mechanisms and particular web protocol design decisions. While it has been documented that some of these technology and standards developments have been motivated by particular economic interests [2], it is unclear whether different Internet design decisions could have led to a different, "better" outcome. Fundamentally, economic drivers in different societies (and on a global scale) cannot be controlled through technology and standards development alone.
This memo therefore focuses on specific protocol design and evolution questions, in particular on how technical design decisions relate to socio-economic effects. It aims to provide input for future design discussions, leveraging experience from 50 years of Internet evolution, 30 years of Web evolution, observations of economic realities, and years of Future Internet research.
IP Service Model
The IP service model was clearly designed to provide a minimal layer over different link layer technologies to enable inter-networking at low implementation cost [3]. Starting off as an experiment, looking for feasible initial deployment strategies, this was clearly a reasonable approach. The IP service model of packet-switched, end-to-end, best-effort communication between hosts (host interfaces) over a network of networks was implemented by:
- an addressing scheme that allows specifying source and destination host (interface) addresses in a topologically structured address space; and
- minimal per-hop behavior (stateless forwarding of individual packets).
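To make the minimal per-hop behavior concrete, the following sketch shows stateless, per-packet longest-prefix-match forwarding, which is essentially the forwarding decision an IP router makes for each datagram. The tiny FIB and the interface names are invented for this illustration.

```python
import ipaddress

# Toy FIB: prefix -> outgoing interface. Entries are invented for illustration.
FIB = {
    ipaddress.ip_network("10.0.0.0/8"): "if0",
    ipaddress.ip_network("10.1.0.0/16"): "if1",
    ipaddress.ip_network("0.0.0.0/0"): "if-upstream",  # default route
}

def forward(dst: str) -> str:
    """Stateless per-packet decision: the longest matching prefix wins.
    No per-flow state is kept between packets."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in FIB if addr in net]
    best = max(matches, key=lambda net: net.prefixlen)
    return FIB[best]

print(forward("10.1.2.3"))   # -> if1 (the more specific /16 beats the /8)
print(forward("192.0.2.7"))  # -> if-upstream (default route)
```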
The minimal model implied punting many functions to other layers, to encapsulation, and/or to "management" services (transport, dealing with names, security). Multicast was not excluded by the architecture, but it was not well supported either, so IP multicast (and the required inter-domain multicast routing protocols) did not find much deployment outside well-controlled local domains (for example, telco IPTV).
The resulting system of end-to-end transport over a minimal packet forwarding service has served many applications and system implementations. However, over time, technical application requirements as well as business requirements have led to additional infrastructure, extensions, and new ways of using Internet technologies, for example:
- in-network transport performance optimization to provide better control loop localization in mobile networks;
- massive CDN infrastructure to provide more scalable popular content distribution;
- (need for) access control, authorization based on IP and transport layer identifiers;
- user-tracking based on IP and transport layer identifiers; and
- usage of DNS for localization, destination rewriting, and user tracking.
It can be argued that some of these approaches and developments have also led to some of the centralization/consolidation issues that are discussed today, especially with respect to CDNs, which are essentially inevitable for any large-scale content distribution (both static and live content). Looking at the original design, the commercial needs that were understood only later, and the outcome today, one could ask: how would a different Internet service model and different network capabilities affect the tussle balance [5] between the different actors and interests in the Internet?
For example, a more powerful forwarding service with a more elaborate (and more complex) per-hop behavior could employ (soft-)stateful forwarding, enabling certain forms of in-network congestion control. Some form of caching could help make services such as local retransmission and data sharing at the edge a network service function, removing the need for some middleboxes.
Other systems such as the NDN/CCNx variants of ICN employ the principle of accessing named-data in the network, where each packet must be requested by INTEREST messages that are visible to forwarders. Forwarders can aggregate INTERESTs for the same data, and in conjunction with in-network storage, this can implement an implicit multicast distribution service for near-simultaneous transmissions.
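As an illustration of Interest aggregation, here is a minimal, hypothetical forwarder sketch; the names, faces, and in-memory content store are invented for this example and do not correspond to any particular CCNx/NDN implementation.

```python
from collections import defaultdict

class Forwarder:
    """Toy ICN forwarder: aggregates Interests for the same name and
    answers later requests from an in-network content store."""

    def __init__(self):
        self.pit = defaultdict(set)   # name -> set of requesting faces
        self.content_store = {}       # name -> data (opportunistic cache)

    def on_interest(self, name, face):
        if name in self.content_store:           # cache hit: answer locally
            return {face: self.content_store[name]}
        first = name not in self.pit
        self.pit[name].add(face)                 # remember who asked
        if first:
            self.forward_upstream(name)          # only the first Interest goes upstream
        return {}                                # later Interests are aggregated

    def on_data(self, name, data):
        self.content_store[name] = data          # cache for future requests
        faces = self.pit.pop(name, set())
        return {f: data for f in faces}          # implicit multicast to all requesters

    def forward_upstream(self, name):
        print(f"forwarding Interest for {name} upstream")
```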
In ICN, receiver-driven operation could eliminate certain DoS attack vectors, and the lack of source addresses (due to stateful forwarding) could provide some form of anonymity. The use of expressive, possibly application-relevant names could give the network better visibility, potentially enabling both more robust access control and (on the negative side) more effective hooks for censoring communication and monitoring user traffic.
This short discussion alone illustrates how certain design decisions can play out in the real world later and that even small changes in the architecture and protocol mechanisms can shift the tussle balance between actors, possibly in unintended ways. As Clark argued in [3], it is important to understand the corresponding effects of architectural changes, let alone of bigger redesign efforts.
The Internet's design choices were motivated by requirements that were valid at the time but may not all still hold today. Today's networking platforms are far more powerful and more programmable. The main applications are totally different, as are the business players and the governance structures. This process of change may continue in the future, which adds another level of difficulty to any change of architecture elements and core protocols. However, this does not mean that we should not try.
Network Address Translation
Network Address Translation (NAT) has been criticized for impeding transport layer innovation, adding brittleness, and delaying IPv6 adoption. At the same time, NAT was deemed necessary for growing the Internet ecosystem and for enabling local network extensions at the edge without administrative configuration. It also provides a limited form of protection against certain types of attacks. As such, it addressed shortcomings of the system.
The implicit client-initiated port forwarding (the technical reason for the limited attack protection mentioned above) obviously blocks both unwanted and wanted communication, which makes it difficult to run servers in homes, at enterprise sites etc. in a sound way (manual configuration of port forwarding still comes with limitations). This, however, can be seen as one of the drivers for the centralization of servers in data centers ("cloud") that is a concern in some discussions today [4].
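The following toy sketch illustrates why only client-initiated flows traverse a NAT: a mapping is created on outbound traffic, and unsolicited inbound packets find no mapping and are dropped. Addresses and ports are invented, and real NATs additionally track protocols, timeouts, and more.

```python
class Nat:
    """Toy NAT: mappings exist only for client-initiated (outbound) flows."""

    def __init__(self, public_ip):
        self.public_ip = public_ip
        self.next_port = 40000
        self.out_map = {}  # (int_ip, int_port, dst_ip, dst_port) -> public port
        self.in_map = {}   # public port -> (int_ip, int_port)

    def outbound(self, int_ip, int_port, dst_ip, dst_port):
        key = (int_ip, int_port, dst_ip, dst_port)
        if key not in self.out_map:
            self.out_map[key] = self.next_port        # mapping created on demand
            self.in_map[self.next_port] = (int_ip, int_port)
            self.next_port += 1
        return (self.public_ip, self.out_map[key], dst_ip, dst_port)

    def inbound(self, public_port):
        # Unsolicited traffic (no prior outbound mapping) is dropped --
        # the "limited attack protection", but also the obstacle to running servers.
        return self.in_map.get(public_port)  # None means: drop

nat = Nat("203.0.113.7")
print(nat.outbound("192.168.1.10", 51515, "198.51.100.1", 443))
print(nat.inbound(40000))   # known mapping -> forwarded to 192.168.1.10
print(nat.inbound(12345))   # no mapping -> None (dropped)
```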
What does this mean for assessing and potentially evolving previous design decisions? The NAT use cases and their technical realization are connected to several trade-offs that impose non-trivial challenges for potential architecture and protocol evolution: 1) Easy extensibility at the edge vs. scalable routing; 2) Threat protection vs. decentralized nature of the system; 3) Interoperability vs. transport innovation.
In a positive light, use cases such as local communication and dynamic Internet extension at the edge (with the associated security challenges) represent interesting requirements that can help find the right balance in the design space for future network designs.
Encryption
Pervasive monitoring is an attack [7], and it is important to assess existing protocol and security frameworks with respect to changes in the way the Internet is being used by corporations and state-level actors, and to develop new protocols where needed. QUIC encrypts transport headers in addition to application data, intending to make user tracking and other monitoring attacks harder to mount.
Economically, however, the more consequential form of user tracking today is the systematic surveillance of individuals on the web through a massive network of tracking, aggregation, and analytics entities [6]. Ubiquitous encryption of transport and application protocols does not prevent this at all; on the contrary, it makes it more difficult to detect, analyze, and, where needed, prevent such tracking. This does not render connection encryption useless (especially since surveillance in the network and on web platforms complement each other through aggregation and commercial trading of personally identifying information (PII)), but it requires careful consideration of the trade-offs.
For example, perfect protection against on-path monitoring is only effective if it covers the complete path between a user agent and the corresponding application server. This shifts the tussle balance between confidentiality and network control (enterprise firewalls, parental control etc.) significantly. Specifically for QUIC, which is intended to run in user space, i.e., without the potential for OS control, users may end up in situations where they have to trust the application service providers (who typically control the client side as well, through apps or browsers, as well as parts of the CDN and network infrastructure) to transfer information without leaking PII irresponsibly.
If the Snowden revelations led to a better understanding of the nature and scope of pervasive monitoring and to best current practices for Internet protocol design, what is the adequate response to the continuous revelations of the workings and extent of the surveillance industry? Which protocol mechanisms and APIs should we develop, and which should we rather avoid?
DNS encryption is another example that illustrates the trade-offs. Unencrypted DNS (especially with the EDNS0 client subnet option, depending on prefix length and network topology) can increase the exposure to privacy violations through on-path/intermediary monitoring.
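As a rough illustration of why the client subnet option matters, the sketch below, which assumes the dnspython library, builds a query that carries a truncated client prefix; the resolver address, query name, and the chosen /24 prefix are invented for this example. The longer the prefix a resolver forwards, the more an on-path observer learns about the end user.

```python
import dns.edns
import dns.message
import dns.query

# Hypothetical client address; a resolver would typically truncate it
# (here to /24) before adding it to the upstream query via EDNS0 client subnet.
client_prefix = dns.edns.ECSOption("192.0.2.0", srclen=24)

query = dns.message.make_query("www.example.com", "A")
query.use_edns(edns=0, options=[client_prefix])

# Sent in the clear over UDP/53, the query name *and* the client subnet
# are visible to every on-path observer between resolver and authority.
response = dns.query.udp(query, "203.0.113.53", timeout=2.0)
print(response)
```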
DNS encryption can counter certain on-path monitoring attacks, but it could effectively make the privacy situation for users worse if it is implemented by centralizing servers (so that application service providers, in addition to tracking user behaviour for one application, can now also monitor DNS communication for all applications). This has been recognized in current proposals, e.g., by limiting the scope of DNS encryption to stub-to-resolver communication. While this can be enforced by architectural oversight in standards development, we do not yet know how to enforce it in actual implementations, for example for DNS over QUIC.
Future Challenges: In-Network Computing
Recent advances in platform virtualization, link layer technologies, and data plane programmability have led to a growing set of use cases where computation near users or data-consuming applications is needed, for example to address minimal latency requirements of compute-intensive interactive applications (networked Augmented Reality, AR), to address privacy sensitivity (avoiding raw data copies outside a perimeter by processing data locally), and to speed up distributed computation by putting computation at convenient places in a network topology.
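As a toy illustration of the placement aspect, the sketch below picks, for each function, the closest site that still has capacity, which is the kind of decision an in-network computing framework has to make. The site names, latencies, and capacities are invented and do not represent any particular edge-computing API.

```python
# Toy latency-aware placement: pick the lowest-latency site with spare capacity.
sites = {                      # hypothetical compute sites near the access network
    "edge-pop-1": {"latency_ms": 3,  "capacity": 2},
    "metro-dc":   {"latency_ms": 12, "capacity": 4},
    "central-dc": {"latency_ms": 45, "capacity": 100},
}

def place(function, demand=1):
    """Greedy placement: the closest site (lowest latency) that can host the function."""
    for name, site in sorted(sites.items(), key=lambda kv: kv[1]["latency_ms"]):
        if site["capacity"] >= demand:
            site["capacity"] -= demand
            return name
    raise RuntimeError(f"no capacity left for {function}")

for fn in ["ar-renderer", "sensor-aggregator", "batch-analytics"]:
    print(fn, "->", place(fn))
```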
So far, in-network computing has mainly appeared in four variants: 1) Active Networking, adapting the per-hop behavior of network elements with respect to packets in flows, 2) Edge Computing as an extension of virtual-machine (VM) based platform-as-a-service to access networks, 3) programming the data plane of SDN switches (leveraging powerful programmable switch CPUs and programming abstractions such as P4), and 4) application-layer data processing frameworks.
Active Networking has not found much deployment due to its problematic security properties and its complexity. Programmable data planes can be used in data centers with uniform infrastructure, good control over the infrastructure, and the feasibility of centralized control over function placement and scheduling. Due to the still limited, packet-based programmability model, most applications today are point solutions that demonstrate benefits for particular optimizations, often without addressing the transport protocol services or data security that would be required for most applications running on shared infrastructure today.
Edge Computing (just like traditional cloud computing) has a fairly coarse-grained (VM-based) computation model and hence typically relies on centralized positioning/scheduling through virtual infrastructure management (VIM) systems. Application-layer data processing frameworks such as Apache Flink, on the other hand, provide attractive dataflow programming models for event-based stream processing and lightweight fault-tolerance mechanisms; however, systems such as Flink are not designed for dynamic scheduling of compute functions.
Ongoing research efforts (for example in the proposed IRTF COIN RG) have started exploring this space and the potential role that future network and transport layer protocols can play. Is it feasible to integrate networking and computing beyond overlays? What would be a minimal service (like IP today) that has the potential for broad reach, permissionless innovation, and evolution paths that avoid early ossification?
Conclusions
Although the impact of Internet technology design decisions may be smaller than we would like to think, it is nevertheless important to assess the trade-offs in the past and the potential socio-economic effects that different decisions could have in the future. One challenge is the depth of the stack and the interactions across the stack (e.g., the perspective of CDN addressing shortcomings of the IP service layer, or the perspective of NAT and centralization). The applicability of new technology proposals therefore needs a far more thorough analysis — beyond proof-of-concepts and performance evaluations.
References
[1] D. Thaler, B. Aboba; What Makes for a Successful Protocol?; RFC 5218; July 2008
[2] S. Greenstein; How The Internet Became Commercial; Princeton University Press; 2017
[3] David Clark; Designing an Internet; MIT Press; October 2018
[4] Jari Arkko et al.; Considerations on Internet Consolidation and the Internet Architecture; Internet Draft https://tools.ietf.org/html/draft-arkko-iab-internet-consolidation-01; March 2019
[5] Internet Society; Internet Invariants: What Really Matters; https://www.internetsociety.org/internet-invariants-what-really-matters/; February 2012
[6] Shoshana Zuboff; The Age of Surveillance Capitalism; PublicAffairs; 2019
[7] Stephen Farrell, Hannes Tschofenig; Pervasive Monitoring is an Attack; RFC 7258; May 2014
Change Log
- 2019-06-07: fixed several typos and added clarification regarding EDNS0 client subnet (thanks to Dave Plonka)
5G: It’s the Network, Stupid
Current 5G network discussions often focus on providing more comprehensive and integrated orchestration and management functions in order to improve "end-to-end" manageability and programmability, derived from NGMN and similar requirements. While these are important challenges, this memo takes the perspective that in order to arrive at a more powerful network, it is important to understand the pain points of today's networks and the reasons for certain design choices. Understanding the drivers for traffic management systems, middleboxes, CDNs and other application-layer overlays should be the basis for analyzing 5G use cases and their requirements. In this memo, I am making the point that many of today's business needs and the ambitious 5G use cases do call for a more powerful data forwarding plane, taking ICN as an example. Features of such a forwarding plane would include better support for heterogeneous networks (access networks and whole network deployments), multi-path communication, in-network storage, and the implementation of operator policies. This would help to avoid overlay silos and ultimately simplify network management.
Introduction
5G is the current title for much of the network system work in the telco industry these days. All companies, SDOs and industry fora are now working on 5G technologies. There seems to be a rough consensus on requirements and use cases, and first proposals seem to suggest that the design and the implementation will have something to do with SDN/NFV. In general, the assumptions are that 5G will be faster (thanks to better radio), larger (intended to cover more connected devices due to IoT, smart city, new markets), more flexible (network programmability), and converged (unification of mobile and fixed network access/core).
The implications for the actual system architecture and the way we will communicate in the future are not that clear. Some of the current proposals seem to suggest a network platform that is going to provide significant application support in the network. Other proposals seem to be targeted at trying to apply SDN/NFV to the design.
In this memo, I am making the point that the design of a new system architecture and the formulation of its requirements should be based on a good understanding of the realities and problems of today's networks. I claim that we should use the tools we have now developed, like NFV and SDN, but also our knowledge about efficient transport, content distribution, and security, to rethink the network and system architecture.
I will start with a discussion of pain points in today's networks, before I address popular 5G use cases as proposed by NGMN. I then assess a few 5G design options and conclude with a constructive proposal.
Notes:
- The views presented here are my own.
- This is based on a recent presentation I did on this topic. If you are interested, please find the presentation material here: "Security and Transport Performance in 5G".
Today's Networks
The commercial success of today's mobile and fixed networks is clearly based on the success of the Internet and the web. Web applications are hugely popular, especially web-based video services. It's a bit ironic that these applications are sometimes called Over-the-Top (OTT) applications (from a network operator perspective) -- clearly these are the applications -- there are essentially no other applications that are of interest to users (except for audio telephony, which is still treated as a special application).
Current networks are largely leveraging Internet technologies (namely IP) -- however we had to develop a large set of additional gear to make a useful service out of it.
For example, mobility management: based on the "seamless connectivity" service requirement, LTE employs an anchor-point-based mobility approach, implemented through tunneling (either GTP- or proxy-MIP-based). This concept lends itself to a centralized design with the usual inefficiency and scalability problems -- hence people started inventing technologies like Selected IP Traffic Offload when they figured out that most users actually just want to access a web resource -- for which seamless IP connectivity is not necessarily required. Current work in the IETF DMM WG and in 3GPP is concerned with generalizing this principle towards decentralized mobility management.
Most extra work needs to be done on the performance side: TCP proxies, traffic management systems, application traffic optimizers, CDNs.
Figure 1 contrasts the theoretical architecture with a more typical implementation.
Figure 1: Mobile Network Performance Enhancing Functions (Copyright 2015 NEC)
The motivation for this extra functionality is as follows:
- TCP proxies are tools for mobile operators to tune network performance with respect to their requirements. TCP's end-to-end congestion control does not work so well when it has to bridge heterogeneous networks with different causes of delay and packet loss. One of the reasons it works especially poorly in mobile networks is actually the design of the system as a virtual-circuit-like service: significant buffering, variable latency, no AQM, no congestion notification. So, as a result, you end up with proxies that manipulate the flow and generation of ACKs to trick senders etc. This typically really helps performance -- otherwise, I hope, these boxes would not be deployed, because they also create some problems [Honda-2011].
- Traffic management systems have a similar motivation: give operators a tool for implementing performance and capacity sharing policies. There are quite different implementations of this concept, but in general these systems work like this: a centralized traffic management system collects real-time and long-term load and performance-related data from base stations, routers etc. and uses that to configure policers on gateways, base stations etc. The policies may be flow-specific (e.g., to reduce the current congestion contribution of a specific flow) or application-type-specific (enforce specific treatment of a group of flows) etc. Admittedly, this does not sound like a terribly elegant or scalable approach -- but it is done nevertheless because IP itself does not provide sufficient traffic management features that would allow some of this to be done in-band. Another reason is that without AQM and ECN, such management-based approaches are perceived as the only option for having the network react to overload. A minimal policer sketch follows after this list.
- Application-traffic optimizers are mainly video optimizers these days. Their job is caching, pacing, and transcoding of video traffic, e.g., YouTube. There may be other purposes such as user behavior analytics, statistics etc. These systems are implemented as a transparent chain of traffic classifiers, load balancers and the actual application function. TCP/IP per se does not offer caching on the network/transport layer, and explicit HTTP proxies have interoperability problems, which motivates this implementation approach. Obviously, this will all become more difficult/expensive as more encryption is deployed, e.g., through HTTP/2.
- Network/application server cooperation. An extended variant of traffic management is the Mobile Traffic Throughput Guidance proposal. This is about sharing base station and other relevant information to application servers outside the operator domain to enable applications (video senders) to adapt faster and more proactively. Again, this is done because of a perceived lack of corresponding network/transport layer functionality.
- CDN deployment is ubiquitous these days. No major web service is deployed without it. CDNs are large-scale content distribution/management networks that provide functions such as pro-active distribution, caching, transcoding, filtering etc. There are different CDN providers, and some operators actually own or cooperate closely with CDN providers. A typical deployment is to run CDN nodes close to the operator network, e.g., in a co-location point, although there is a trend to move CDNs deeper into the network. CDNs are essentially large-scale application-traffic optimizers. But since CDN nodes are normally not on the direct path of all user traffic, they require explicit redirection, which is done through DNS-based resolution of names to nearby (telco or CDN provider) CDN nodes (a toy illustration follows after this list). Like on-path application-traffic optimizers, CDNs have problems with respect to encryption, i.e., they normally cannot intercept TLS communication between a user and an origin server without impersonating that server. The reason that TLS and CDNs still work together today is that CDN nodes are configured with their own certificates for a certain domain (e.g., "cdn.example.com") that are linked to a valid trust chain, so that users' browsers accept those certificates. While this works, it should be mentioned that it is still problematic from an e2e encryption perspective. The user actually expected an encrypted communication channel between her application and the application server at the origin, but what she gets is merely an encrypted connection to the next CDN node.
- Transport encryption will proliferate very fast due to the integration of TLS into HTTP/2 and the "always encrypt" policy in major web browsers. It will see a significant uptake once CDNs start deploying it, i.e., also as adapters to legacy HTTP/1.1 servers. As mentioned above, it will render most of the existing traffic management and application-traffic optimizers useless, or at least make them more expensive to use. This is creating quite some concerns on the mobile operator side -- which fuels current discussions on whether and how the network, user applications, and application servers should cooperate to exchange at least some traffic management information in the presence of ubiquitous encryption (cf. the IAB/GSMA MARNEW workshop). Unfortunately, according to some views at least, such management information cannot be (reliably) transferred in an IP or TCP header today, so there is discussion about creating overlay solutions with better support for signaling management and other meta information.
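To make the policer part of the traffic management item above concrete, here is a minimal sketch of a token-bucket policer whose rate a central traffic manager could adjust based on reported load. The rates and the overload trigger are invented; real systems also distinguish flow classes, aggregate reports, and much more.

```python
import time

class TokenBucketPolicer:
    """Toy token-bucket policer as configured on a gateway or base station."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8.0          # bytes per second
        self.burst = burst_bytes
        self.tokens = burst_bytes
        self.last = time.monotonic()

    def allow(self, packet_len):
        now = time.monotonic()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= packet_len:
            self.tokens -= packet_len
            return True                     # conforming: forward
        return False                        # exceeding: drop or remark

policer = TokenBucketPolicer(rate_bps=5_000_000, burst_bytes=64_000)

cell_overloaded = True                      # e.g., reported by base stations to the manager
if cell_overloaded:
    policer.rate = 2_000_000 / 8.0          # throttle this flow class to 2 Mbit/s
print(policer.allow(1500))
```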
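The DNS-based redirection mentioned in the CDN item above can be pictured with this tiny sketch; the edge-node table and the resolver-to-region mapping are invented for illustration, and real CDNs use far richer request-routing logic.

```python
import ipaddress

# Toy CDN request routing: the authoritative DNS server for a CDN hostname
# returns a different edge node depending on where the query (i.e., the user's
# recursive resolver) appears to come from. All addresses are documentation
# addresses, invented for this example.
EDGE_NODES = {"eu": "198.51.100.10", "us": "203.0.113.20"}
RESOLVER_REGION = {"192.0.2.0/25": "eu", "192.0.2.128/25": "us"}

def resolve_cdn_name(resolver_ip, default_region="us"):
    addr = ipaddress.ip_address(resolver_ip)
    for prefix, region in RESOLVER_REGION.items():
        if addr in ipaddress.ip_network(prefix):
            return EDGE_NODES[region]
    return EDGE_NODES[default_region]        # unknown resolver: fall back

print(resolve_cdn_name("192.0.2.10"))        # -> 198.51.100.10 (EU edge node)
print(resolve_cdn_name("198.18.0.1"))        # -> 203.0.113.20 (default region)
```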
Summarizing, it is not surprising that we need a significant amount of gear in today's networks to make them work and perform well: IP forwarding concepts and the whole network architecture were not designed for this scale of commercial deployment, for specific business needs, and for these performance requirements.
Unfortunately, we had to hack the system to some extent to get this functionality integrated: localized congestion control loops require transparent (and brittle) TCP proxies. The lack of in-network visibility of imminent congestion on multiple bottlenecks made us resort to management-based approaches, and the lack of network/transport-layer caching as well as the lack of policy-based request forwarding gave us CDNs. I did not mention much about problems, but the lack of true end-to-end security in the presence of connection-based security and CDNs is certainly a big one. The fact that CDNs and their DNS-based cache selection are essentially just an overlay over the network becomes apparent when we try to do multipath communication in a CDN network. I could go on.
These things are really normal when systems grow over time and people learn what is needed, what did work well, what did not work so well etc. At some point, you have learned enough that you can build a new system.
5G Use Cases and Requirements
The mobile operator industry has been trying to approach the 5G topic by formulating the following use cases in the NGMN 5G White Paper:
- Broadband access in dense areas ("Pervasive Video")
- Broadband access everywhere ("50+ Mbps Everywhere")
- Higher user mobility ("High-Speed Train")
- Massive Internet of Things ("Sensor Networks")
- Extreme real-time communications ("Tactile Internet")
- Lifeline communications ("Natural Disaster")
- Ultra-reliable communications ("E-Health Services")
- Broadcast-like services ("Broadcast Services")
The NGMN White Paper does not claim this list to be exhaustive. I would add Affordable Access as another use case, i.e., along the lines of what is discussed in the Global Access to the Internet for All (GAIA) community.
Also not explicitly mentioned are industry networks (also known as Industry 4.0 in some communities), i.e., the concept of 1) using Internet and virtual networking technology and platforms for factory networks and the like, and 2) interconnecting industry sites. Obviously, in both cases the challenge would be guaranteeing upper latency bounds and reliability -- while running over a multiplexed network infrastructure.
Finally, I would like to add "The Next Use Case" to the list, i.e., I would like to emphasize the need to keep the network open for future innovations that cannot be planned or imagined today. This has to do with permissionless innovation, avoiding in-network silos, and creating a powerful general-purpose platform.
Everyone has their own interpretation when it comes to deriving requirements, but in my view the following can be inferred:
- 5G access will be much more heterogeneous with respect to link layer properties, bandwidth, latency, availability. For example, extremely high frequency communication such as mmWave communication is sometimes mentioned. This would offer super-high throughput and low latency, however only in very small cells. It has interesting challenges, for example, ramping up sending rates in a TCP session with peers on the Internet or managing connectivity for mobile users. Then there are very constrained networks in IoT scenarios, or cheap but low-bandwidth radios in GAIA scenarios. Finding a good network abstraction for all these different kinds of networks seems to be an interesting challenge.
- Use cases such as broadband access everywhere and the tactile Internet require super-low latency -- especially the latter would not tolerate full path delays and would thus need some local communication options (e.g., through caching or edge computing).
- Related to the increased heterogeneity, I also predict that mobile devices will have more simultaneous access options, i.e., they'd be able to select interfaces, or decide how to use them in parallel, depending on performance requirements and cost constraints.
- Lifeline communication e.g., in disaster scenarios would call for a network that is able to provide useful services in the presence of fragmentation, loss of core network connectivity etc. The GreenICN project has investigated this intensively. Clearly, centralized control and gateways would not lend themselves to such scenarios.
5G Design Options
In the current design discussions I am aware of, there are a few ideas that come up frequently:
- Data/control plane split through SDN: this is essentially the idea to design switch capabilities and a programmable interface for enabling controllers to program GTP processing behavior. It's a straightforward idea for generalizing PDN-GW platforms, but it's clearly orthogonal to the requirements listed above.
- Simplified mobile core: Accepting the fact that not all mobile applications would need perfect mobility management and seamless connectivity, one idea is to simplify the architecture in a way that it provides a layered service stack, i.e., with a minimal baseline layer that is less complicated and less costly to operate. This could actually help with performance improvement goals.
- Sliced network architecture: Potentially based on the simplified mobile network idea, there are also proposals to apply virtualization to the mobile network and to offer separate slices. There are two variants to this: Multitenancy for MVNOs and Quality of Service Slicing. Multitenancy for MVNOs is relatively straightforward and is essentially about allowing MVNOs deeper access to a physical network operator's network through virtualizing most core and access network functions, including base stations.
- Quality of Service Slicing: another view on slicing is to offer individual Quality-of-Service slices (like the not so frequently used QCI classes in UMTS and LTE). For example, there would be the best-effort slice, the interactive multimedia slice, the IoT slice etc. It's really like mapping traditional QoS to virtual network concepts -- with similar problems: how would an operator know which slice configurations will be needed in the future? How would an application or a user select slices? How would such a system correspond to network neutrality requirements -- how would it maintain the permissionless innovation feature of the Internet?
- (Deep) In-network caching and computing: For use cases such as "Tactile Internet", but also for more profane applications such as IoT gateways and caching in the access network, there are many ideas for moving such functions deeper into the network. Industry initiatives such as Mobile Edge Computing are pursuing this in a limited fashion today. Technically, this would be about managing IaaS and about shifting function containers to the right place in the network. More future-looking proposals assume arbitrary application-layer compute functions in arbitrary places in the network. There are different motivations by different players: network operators see this as an opportunity to "create value" for their networks, i.e., offering platforms that can host such functions. CDN providers see this as an opportunity to extend their platform, both in terms of reach as well as functionality (Akamai). If you extend a CDN massively, you could effectively run an overlay multicast distribution network. Again, the shortcomings of the underlying network and transport layer are motivating factors for doing this as an overlay.
- Network service programmability and orchestration: Extending the in-network compute concept, you could also envision a distributed programmable platform that would offer more flexible programmability than just pushing containers to specified locations. For example, a next-generation Mobile-TV provider could operate a multicast overlay in an operator network, with function chains for caching, transcoding etc. The distribution, run-time management etc. would then be subject to an application-independent orchestration function. This idea is also motivated by the "value creation" proposition, i.e., network operators would provide the platform and orchestration functions to application service developers/providers (cf. the SONATA project).
Silos in the Network
The last two approaches raise interesting questions as to how manageable this would be in the end. There are most likely many CDN providers who would want to run their functions deeper in the network. Then there are also specific applications that require some caching but would not want to use external CDN platforms, for example video-on-demand services. As a result, you could end up with a collection of silos, each with their specific overlay, as depicted in figure 2.
Figure 2: Overlay Silos (Copyright 2015 NEC)
The "deep silo" approach is also motivated by the connection-based communication and security model. Because it is not really possible to share data (while maintaining security properties such as access control rights, authenticity) in the network, we tend to build silos that are centered around the model of enabling a connection to a named server.
There is a particular risk associated with the "deep silo" approach and security. Assume a large number of virtualized CDN nodes, each of them maintaining certificates and public keys for the overall CDN service. Hardening these platforms so that none of these keys eventually leaks seems to be a major challenge. In general, running services on massively distributed software functions deep in the network carries risks like this -- which makes the overall approach appear questionable in my opinion.
Centralized Control and Orchestration
The orchestration topic highlights a particular problem: the existing shortcomings of the network infrastructure with respect to its forwarding and self-management capabilities already require a worrying collection of management functions as explained in section "Today's Networks". Instead of empowering the network, removing the need for transparent middleboxes, overlays etc., we might be taking the need for network management to the extreme -- by adding more overlays, more application-layer functions in the network etc.
This is exemplified by misapplying the SDN paradigm towards complete "end-to-end" network control with a network management mindset. Let me explain this: SDN (OpenFlow in particular) was once created as a programmatic interface to enterprise/campus networks that would allow implementing consistent security policies (isolating nodes on a (virtualized) network). That was motivated by the fact that this is difficult to achieve with the traditional control plane and network management tool set. Also, as mentioned above, IP is really limited with respect to traffic management support, hooks for policy implementation etc.
With OpenFlow, a controller in a local domain is enabled to program forwarding and limited transformation rules into switches in a network so that they could be treated as a virtual switch. This can be done in well-controlled domains (enterprise/campus networks, data centers) and remove the need for some distributed control plane functions and protocols. Since larger parts of mobile networks run in data centers, this is also a valid technology for 5G -- as a tool to implement network control to achieve better network flexibility and policy implementation.
What (in my opinion) does not work so well is to elevate the SDN centralized control paradigm to a mantra for network architecture by applying the centralized control concept to the Internet. For example (slightly exaggerating) creating a powerful centralized controller for controlling base station radio communication, transport network, core network, middleboxes, application servers is likely to create a complex and soon ossified system with a gigantic control overhead. Not only will you have to master the timing issues if you want to achieve fine granular control across layers, you will also have to think about domain-to-domain controller interaction ("east/west interfaces") etc. Anyone remembering "Intelligent Networks"?
Instead, it would be more productive to think about desirable forwarding plane features and proper network abstractions for that -- and then use SDN to control networks in a programmatic fashion, i.e., without fine-grained re-active control and without tying network management & orchestration to network programmability.
Way Forward
So, what to do about 5G? First of all, it is important to understand that "5G" is not going to be a sudden major fork-lift upgrade of the network. It is actually the title of an innovation effort, and we are going to see changes in phases.
- Optimizing LTE system implementation through NFV and SDN is happening right now. I would also list "Data/control plane split through SDN" in this category. I would not call this core 5G work -- it would not change the system architecture and interfaces -- but it would be useful in a sense that we improve the infrastructure and explore the potential for more fundamental architecture changes.
- Introduce modern AQM, ECN and transport protocols NOW. A lot of progress has been made in past years (experimenting with ECN, improving fair queueing and AQM), and it's about time to get these technologies deployed, especially in the presence of ubiquitous encryption, when DPI-based traffic management has less leverage (a minimal AQM sketch follows after this list). It's really important to reduce latency further and to enable applications to respond and adapt to congestion faster. One work item here is to get the interworking of IP and link layer protocols right. In that context, it would also be useful to rethink capacity sharing and traffic management. For example, try to learn from the (experimental) IETF ConEx effort to find ways to combine performance, smarter ways of capacity sharing than traditional TCP fairness, and incentives for applications to cooperate better -- without requiring a complicated traffic management system to enforce this.
- Enable competition and innovation on the network service provider side: This may sound odd at first, but in order to move towards 5G and the anticipated use cases, also including GAIA-inspired services, it would be good if it were easier to start new services, not only as virtual services on top of existing networks. In that context, the FCC efforts for spectrum sharing between incumbents and new players are interesting.
- Avoid "Intelligent Networks-2.0". It may sound tempting to create super-powerful platforms for in-network services, APIs for service creation etc. to create a more valuable network. There may be even a case for certain applications, for example IoT gateways. But be careful when defining use case and requirements without actually talking to stake holders that are building Internet and Web Services. For example, services like youTube would best benefit from an efficient, low-latency bitpipe -- not from a network service platform. The fundamental risk is that we are building a very elaborate service platform with powerful orchestration etc. that is just too complicated and costly to use, or may impede innovation by enforcing certain communication forms -- so that application service providers would refrain from using it -- and do everything "over-the-top". Or worse, they would start their own network services. If you don't think this is possible, I recommend taking a look at Project Fi, Google's MVNO approach. BTW, this is what happened to Intelligent Networks. Their problem was not that you could not build and operate networks that way -- IN was just too inflexible for innovation, one of the factors that led to the development of SIP-based VoIP "over the top".
- Innovate on the forwarding plane. In order to address the performance requirements, especially considering increased access technology heterogeneity and more flexibility with respect to network deployment options thanks to NFV and SDN, we need a more powerful forwarding plane that enables the network to better deal with local bottlenecks, multipath communication opportunities, and in-network storage for local repair, data sharing and rate adaptation. This would enable us to let the network handle many important optimizations itself -- without requiring fine-grained control from network management. It would enable us to provide such functions in an efficient, application-independent way -- without creating different silos with similar functionality that is entangled with application specifics.
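As a pointer to what modern AQM and ECN mean mechanically (referenced from the AQM/ECN item above), here is a toy RED-style sketch; the thresholds and weights are invented, and real algorithms such as CoDel or PIE differ in important ways. Instead of waiting for buffers to overflow, the queue starts marking or dropping early as the average occupancy grows, which gives endpoints the faster congestion signal mentioned above.

```python
import random

class RedQueue:
    """Toy RED-style AQM: drop/mark probability grows with the average queue size."""

    def __init__(self, min_th=5, max_th=15, max_p=0.1, weight=0.2):
        self.min_th, self.max_th, self.max_p = min_th, max_th, max_p
        self.weight = weight      # EWMA weight for the average queue size
        self.avg = 0.0

    def on_enqueue(self, current_qlen):
        # Exponentially weighted moving average of the instantaneous queue length.
        self.avg = (1 - self.weight) * self.avg + self.weight * current_qlen
        if self.avg < self.min_th:
            return 0.0                               # no early drop/mark
        if self.avg >= self.max_th:
            return 1.0                               # drop (or ECN-mark) everything
        # Linear ramp between the thresholds.
        return self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)

q = RedQueue()
for qlen in [2, 6, 10, 14, 20]:
    p = q.on_enqueue(qlen)
    action = "mark/drop" if random.random() < p else "enqueue"
    print(f"qlen={qlen:2d} avg={q.avg:5.2f} p={p:.3f} -> {action}")
```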
Powerful Forwarding Plane
The last point is the motivation for people to look into Information-Centric Networking (ICN) as a 5G forwarding plane. ICN is based on the notion of providing "access to named data" as the fundamental network service. Named data can be packets, application data units, chunks, or objects. Data is secured (cryptographically bound to a name and/or origin) so that it does not need connection-based security. This facilitates application-independent caching in the network and other functions that are today done in application-specific silos.
ICN routers have better visibility of performance because they can measure interface/path performance in correlation with requested names -- for every hop where this is needed. This enables a forwarding plane that is powerful enough to handle challenges such as intermittent connectivity, multiple local bottlenecks, and varying path performance -- without adding too much complexity. Operators can configure different, powerful forwarding strategies on individual routers, which is the key to supporting the different 5G use cases and heterogeneous access networks.
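A forwarding strategy of the kind mentioned here could, very roughly, look like the sketch below; the prefixes, faces and the scoring are invented for illustration, and real NDN/CCNx strategies are considerably more sophisticated. Per name prefix, the forwarder keeps simple performance measurements per face and prefers the currently best-performing one.

```python
from collections import defaultdict

class StrategyForwarder:
    """Toy per-prefix forwarding strategy based on measured face performance."""

    def __init__(self):
        self.fib = {}   # name prefix -> candidate faces
        # prefix -> face -> EWMA RTT in ms (pessimistic default for unmeasured faces)
        self.rtt = defaultdict(lambda: defaultdict(lambda: 100.0))

    def add_route(self, prefix, faces):
        self.fib[prefix] = faces

    def choose_face(self, name):
        # Longest matching prefix first, then the face with the best measured RTT.
        prefix = max((p for p in self.fib if name.startswith(p)), key=len)
        return min(self.fib[prefix], key=lambda f: self.rtt[prefix][f])

    def report_rtt(self, prefix, face, sample_ms, alpha=0.3):
        # Update the per-prefix, per-face measurement on each Data arrival.
        self.rtt[prefix][face] = (1 - alpha) * self.rtt[prefix][face] + alpha * sample_ms

fwd = StrategyForwarder()
fwd.add_route("/video/", ["wifi", "lte"])
fwd.report_rtt("/video/", "wifi", 20)
fwd.report_rtt("/video/", "lte", 60)
print(fwd.choose_face("/video/movie1/seg42"))   # -> "wifi"
```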
Especially for 5G, ICN would make mobility management much easier -- in a way that it would not need the current anchor-based mobility management schemes. For example, requestor hand-over is just a matter of (re-) requesting named data on new network attachment points. ICN forwarding strategies and in-network caching would make this as seamless as today's managed mobility.
There are different specific ideas on how to make use of ICN in 5G (e.g., Cisco's). There are also other benefits, such as having a unified communication abstraction for both the mobile network part of 5G and IoT networks, that would be better discussed in a separate posting. The important notion is as follows:
- We have learned much about the functionality required to make the Internet useful for diverse sets of commercial and non-commercial applications. For many of these functions we had to resort to application-layer overlays and elaborate network management support. With that knowledge we can now redesign the interplay of network layer, transport layer, applications, and network management to build better networks.
- The key question to me is finding a suitable network and forwarding plane abstraction, i.e., defining the capabilities of nodes in the network and finding a good function split between the forwarding plane, SDN control, and network management (the latter two are two different things). The general approach to simplification should be to only do things in network management that you cannot do on the network layer. ICN is just an example of how to design such a function split -- and it illustrates the benefits.
- The named data approach is a better fit to modern communication requirements. It provides object security and enables data consumption independent of the current source of the bits, which in turn is a prerequisite for in-network caching, device-to-device communication, and delay-tolerant communication, all of which are deemed critical for 5G. We have moved from physical circuits to TCP connections -- it's now time to go one step further from telephony towards networked computing.
You might ask what this has to do with SDN and NFV. As mentioned above, SDN and NFV are really network implementation approaches and infrastructure operation improvements. NFV is obviously an enabler for innovation in the sense that it enables and automates the deployment of software in the network, including ICN functions. ICN could very well be implemented with SDN.
In fact, ICN may actually enable an interesting evolution of today's OpenFlow model. In SDN for IP (take OpenFlow as an example), you have to deal with the fact that endpoint identity and next-hop forwarding information are entangled in IP addresses. Consequently, SDN applications typically implement the desired forwarding behavior through header rewriting in order to interwork with existing infrastructure (and to encode additional information in packet headers). Software-defined ICN would rather be about programming Forwarding Information Bases and configuring forwarding strategies and caching policies -- so a more pro-active, actual programming-like approach. IP SDN and ICN SDN could well coexist, for example in separate slices of a shared infrastructure.
Again, the important notion for 5G is to emphasize networking capabilities and abstraction -- with a focus on performance, application-independence and openness to innovation. The question is not so much whether we should do that or not -- but rather who is going to do it. Cisco’s Paul Mankiewich, SP Mobility CTO, has expressed this as follows:
If the network operator industry fails to create an ICN-like architecture then someone like Google will and they will put it behind the SP's IP transport network.
In fact Google has many ingredients for that already: Project Fi as virtual bitpipe across service providers' networks, QUIC as a vehicle for redesigning transport and application layer protocols, Google CDN and the whole Google cloud as the infrastructure platform.
In that sense, it might not be too unreasonable to say that those who refuse to learn from the history of Intelligent Networks are doomed to repeat it.
Privacy, Performance, Protocols: ICN Researchers meet in Prague
The IRTF Information-Centric Networking Research Group (ICNRG) had another ICN research fest with two meetings this week in Prague where IETF-93 is taking place.
ICN is an approach to evolve the Internet infrastructure to directly support information distribution by introducing uniquely named data as a core Internet principle. Data becomes independent from location, application, storage, and means of transportation, enabling in-network caching and replication. This enables the design of more robust, secure and better performing networked systems.
This week, more than 100 researchers got together to discuss recent advances in protocol development, performance optimizations, user privacy and new use cases.
One of the protocols developed in ICNRG is CCNx, a network protocol that provides requests (Interests) for named data and Content Object responses. The protocol semantics are specified in draft-irtf-icnrg-ccnxsemantics, and the protocol format is specified in draft-irtf-icnrg-ccnxmessages. The ICN community is currently discussing several extensions to the protocol, including support for "manifest" objects, which would facilitate the distribution of larger, chunked objects and add performance and flexibility to ICN systems.
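To give an idea of what such a manifest could contain, here is a simplified, hypothetical sketch; the naming convention and the structure are invented for this example and do not follow the actual manifest specifications. A large object is split into chunks, each chunk gets its own name and digest, and the manifest lists them so that a consumer can fetch and verify the pieces independently.

```python
import hashlib
import json

def build_manifest(name, data, chunk_size=1024):
    """Split `data` into chunks and describe them in a simple manifest object."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    entries = []
    for i, chunk in enumerate(chunks):
        entries.append({
            "name": f"{name}/chunk/{i}",                  # per-chunk name
            "digest": hashlib.sha256(chunk).hexdigest(),  # integrity binding
            "size": len(chunk),
        })
    return {"name": f"{name}/manifest", "total_size": len(data), "chunks": entries}

manifest = build_manifest("/videos/lecture1", b"x" * 3000)
print(json.dumps(manifest, indent=2)[:300], "...")
# A consumer first requests the manifest, then issues Interests for each chunk
# name and verifies every Content Object against the digest listed here.
```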
Another highlight of the meeting was a presentation by Ioannis Psaras from UCL on Solving the Congestion Problem using ICN Principles, an approach that uses Resource Pooling as a tool to manage uncertainty in congestion management.
Vasilis Sourlas (UCL) presented Information Resilience through User-Assisted Caching in Disruptive Content-Centric Networks. The corresponding paper won the IFIP 2015 best paper award and describes work from the GreenICN project. The approach relies on a modified NDN router design that features a “Satisfied Interest Table” (SIT) that enables user-assisted caching.
Bengt Ahlgren (SICS) presented on the Applicability and Tradeoffs of ICN for Efficient IoT (draft-lindgren-icnrg-efficientiot). This document outlines the trade-offs involved in utilizing Information-Centric Networking (ICN) for Internet of Things (IoT) scenarios. It describes the contexts and applications where the IoT would benefit from ICN, and where a host-centric approach would be better. The requirements imposed by the heterogeneous nature of IoT networks are discussed (e.g., in terms of connectivity, power availability, computational and storage capacity). Design choices are then proposed for an IoT architecture to handle these requirements while providing efficiency and scalability. An objective is to not require any IoT-specific changes of the ICN architecture per se, but the authors do indicate some potential modifications of ICN that would improve efficiency and scalability for IoT and other applications.
Dirk Trossen (InterDigital) presented IPoverICN – the Better IP?, a presentation of the EU H2020 POINT project, which is developing an IP-over-ICN system. The hypothesis of this project is that IP over ICN has the potential to run IP services better than standard IP networks do.
Mark Stapp (Cisco) presented on Private Communication in ICN. The presentation asks whether ICN needs better privacy protection to achieve parity with IP for user privacy in the presence of ubiquitous encryption. It initiated an intensive discussion on privacy requirements for ICN that will continue in upcoming meetings.
Jan Seedorf (NEC) presented on Using ICN in Disaster Scenarios (draft-seedorf-icn-disaster). This presentation came out of the GreenICN project and summarized some research challenges for coping with natural or human-generated, large-scale disasters. Further, the document discusses potential directions for applying Information-Centric Networking (ICN) to address these challenges.
All presentations and detailed notes can be found at the ICNRG Wiki.
This summer will see a series of additional ICN events.
The Beauty of ICN in IoT
One of the papers recently presented at ICN-2014 described an interesting IoT implementation and a corresponding experiment with CCN-Lite on the RIOT platform.
Previously, ICN has been perceived as providing conceptual benefits such as
- simplified, natural APIs to developers;
- increased robustness through caching;
- facilitating data fusion through hop-by-hop replication;
- reduced network stack layering; and
- inherent auto-configuration.
The authors describe their implementation of CCN-Lite on RIOT and their approach to realizing IoT communication in a 60-node testbed. The idea is to apply Reactive Optimistic Name-based Routing (RONR), i.e., an ICN name-based forwarding scheme that sends requests for named information in an IoT network using a hybrid flooding/unicast approach.
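The following sketch captures the reactive flood-then-remember idea very roughly; the node and face names are invented, and this is neither the paper's actual algorithm nor the CCN-Lite API. The first Interest for an unknown prefix is flooded, and once matching data returns, the face it arrived on is recorded so that subsequent Interests for that prefix are unicast.

```python
class RonrNode:
    """Toy reactive name-based forwarding: flood once, then unicast."""

    def __init__(self, faces):
        self.faces = faces          # e.g. ["radio-0", "radio-1"]
        self.fib = {}               # learned: name prefix -> face

    def send_interest(self, name):
        prefix = "/".join(name.split("/")[:3])      # coarse prefix, e.g. /sensors/room1
        if prefix in self.fib:
            return [self.fib[prefix]]               # unicast to the learned face
        return list(self.faces)                     # unknown prefix: optimistic flood

    def on_data(self, name, arriving_face):
        prefix = "/".join(name.split("/")[:3])
        self.fib[prefix] = arriving_face            # remember where the data came from

node = RonrNode(["radio-0", "radio-1"])
print(node.send_interest("/sensors/room1/temp"))    # -> flood on both faces
node.on_data("/sensors/room1/temp", "radio-1")
print(node.send_interest("/sensors/room1/hum"))     # -> ["radio-1"] (unicast)
```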
Some results of their comparison:
- 70% less ROM, 80% less RAM usage by the stack implementation (compared to a RPL/6lowpan implementation);
- 50% reduction of transmitted packets thanks to RONR and ICN caching.
Some pointers for further reading:
- Emmanuel Baccelli, Christian Mehlis, Oliver Hahm, Thomas C. Schmidt, Matthias Wählisch; Information Centric Networking in the IoT: Experiments with NDN in the Wild; ACM ICN-2014; September 2014; [paper], [presentation]
- CCN-Lite
- RIOT -- the friendly OS for the IoT
- ACM SIGCOMM ICN-2014