content at Dirk Kutscher

Archive for the ‘content’ tag

Next Steps for Content Syndication

This is a follow-up on Mark Nottingham's blog post What RSS Needs, which I read with some interest.

RSS and Atom enable non-mediated feeds of website updates. Such feeds are very useful and were once quite popular, until the Web took a different direction. Mark discusses several areas that should be addressed to revitalize such feeds, based on what we know today: Community, User Agency, Interoperability Tests, Best Practices for Feeds, Browser Integration, Authenticated Feeds, and Publisher Engagement. Check out his blog post for details.

I would like to offer some additional thoughts:

Features that should be maintained from RSS/Atom

Receiver-driven operation

The user device ("client") should generally be in control and fetch updates based on its own schedule and requirements. This fits well with typical web interactions, i.e., HTTP GET. See the section "Protocol independence" below for additional ideas.
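To make the receiver-driven model a bit more concrete, here is a minimal Python sketch of a client that polls a feed on its own schedule using HTTP conditional GET. The names FeedState, conditional_headers, and poll are hypothetical and chosen for illustration only; the sketch assumes the server supports the standard ETag and Last-Modified validators.

```python
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Optional


@dataclass
class FeedState:
    """Validators remembered from the previous successful fetch (hypothetical helper)."""
    etag: Optional[str] = None
    last_modified: Optional[str] = None


def conditional_headers(state: FeedState) -> dict:
    """Build request headers that ask the server to reply only if the feed changed."""
    headers = {}
    if state.etag:
        headers["If-None-Match"] = state.etag
    if state.last_modified:
        headers["If-Modified-Since"] = state.last_modified
    return headers


def poll(url: str, state: FeedState) -> Optional[bytes]:
    """Fetch the feed once; return its body, or None if it is unchanged (304)."""
    request = urllib.request.Request(url, headers=conditional_headers(state))
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            # Remember the validators for the next poll.
            state.etag = response.headers.get("ETag")
            state.last_modified = response.headers.get("Last-Modified")
            return response.read()
    except urllib.error.HTTPError as err:
        if err.code == 304:  # Not Modified: nothing new since the last fetch
            return None
        raise
```

The client decides when to call poll, so the publisher never needs to know who is subscribed or how often they check.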

Aggregation

Aggregation, i.e., the combination of different input feeds to form a new feed, is a feature of RSS and Atom. This should obviously be maintained. It may need some additional security (authentication) mechanisms – see below under "Data-oriented security".
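As a rough illustration of aggregation, the following Python sketch merges entries from several input feeds into one, keeps only the newest copy of each entry, and orders the result newest first. The Entry type and the aggregate function are assumptions made for this example; they are not part of RSS or Atom.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Entry:
    """Simplified feed entry (hypothetical; real entries carry more fields)."""
    entry_id: str
    title: str
    updated: datetime
    source: str


def aggregate(feeds: Iterable[Iterable[Entry]], limit: int = 50) -> List[Entry]:
    """Combine several feeds into one, de-duplicated by entry ID, newest first."""
    newest: Dict[str, Entry] = {}
    for feed in feeds:
        for entry in feed:
            seen = newest.get(entry.entry_id)
            # Keep the most recently updated copy of each entry.
            if seen is None or entry.updated > seen.updated:
                newest[entry.entry_id] = entry
    ordered = sorted(newest.values(), key=lambda e: e.updated, reverse=True)
    return ordered[:limit]
```

Keeping the source field on each entry is what makes authentication of aggregated content a question at all: the aggregator is not the original publisher.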

User-controlled interaction with feed content

Mark mentioned some features such as feedback from feed readers to content providers, e.g., using so-called "privacy-preserving measurement". This should be clearly optional, and users should have to opt in, i.e., it should not be the default.

New Ideas

Learn from ActivityPub

In general, it would be good to study ActivityPub and see what features and design elements would be useful. ActivityPub is a decentralized social networking protocol based on the ActivityStreams JSON data format. It does a lot more than one would need for syndication (notably it is designed for bi-directional updates), but some properties are, in my opinion, useful for syndication, too.

Modularization

In RSS, a feed is typically a single XML document that contains a channel with items for the individual updates. When a feed is updated, the entire document is regenerated, and the receiver then has to filter out updates that it has already received. Atom has a feed paging concept that allows clients to navigate through paginated feed entries, but each of those pages is still a standalone document.

To enable better sharing, re-use of feed updates in different contexts, and more scalable distribution, feed updates could have a more modular structure, similar to what ActivityPub provides.
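One way to picture such a modular structure is a small index that only references updates by ID, so that each update is its own object that can be fetched, cached, and re-used separately. The Python sketch below borrows the OrderedCollection type and the orderedItems and totalItems properties from the ActivityStreams vocabulary; the example.org URLs and the unseen_updates helper are hypothetical, and this is not a proposal for an actual format.

```python
from typing import List, Set

# Hypothetical feed index: an ActivityStreams-style collection that references
# each update by ID instead of embedding it. The URLs are placeholders.
INDEX = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "type": "OrderedCollection",
    "id": "https://example.org/feed",
    "totalItems": 3,
    "orderedItems": [
        "https://example.org/updates/3",
        "https://example.org/updates/2",
        "https://example.org/updates/1",
    ],
}


def unseen_updates(index: dict, seen: Set[str]) -> List[str]:
    """Return the IDs of updates the client has not fetched yet, in feed order."""
    return [ref for ref in index["orderedItems"] if ref not in seen]
```

A client that already holds update 1 would only need to retrieve updates 3 and 2, and the same update objects could be referenced from several aggregated feeds.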

Protocol independence

RSS and Atom are technically not bound to HTTP, although that is of course the dominant way of using them. However, it is theoretically possible to disseminate feed updates through other means, e.g., e-mail, and I think this should be considered for a future syndication system as well.

More specifically, push-based operation should be enabled (beyond e-mail). For example, it should be possible to receive feed updates via broadcast/multicast channels.

Another example may be publish/subscribe-based updates. There is a W3C Recommendation called WebSub that specifies an HTTP-based pub/sub framework for feed updates. I am suggesting this as an example, but not necessarily as the only way to do pub/sub and pushed updates.
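For readers unfamiliar with WebSub, a subscription is requested by sending a form-encoded HTTP POST to a hub, with the hub.mode, hub.topic, and hub.callback parameters. The Python sketch below only builds that request body; the function name subscription_body and the example URLs are hypothetical, and hub discovery, callback verification, and content delivery are left out.

```python
from typing import Optional
from urllib.parse import urlencode


def subscription_body(topic: str, callback: str,
                      lease_seconds: Optional[int] = None) -> bytes:
    """Build the form-encoded body of a WebSub subscription request."""
    params = {
        "hub.mode": "subscribe",
        "hub.topic": topic,        # the feed URL the subscriber is interested in
        "hub.callback": callback,  # where the hub should deliver updates
    }
    if lease_seconds is not None:
        # Optional: how long the subscriber would like the subscription to last.
        params["hub.lease_seconds"] = str(lease_seconds)
    return urlencode(params).encode("ascii")
```

The body would be POSTed to the hub with the Content-Type application/x-www-form-urlencoded. Note that this model requires the subscriber to expose a reachable callback URL, which is one reason to treat it as an example rather than the only pub/sub option.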

Moreover, it should be possible to use the syndication framework in "local-first" environments, i.e., with non-public-facing servers.

Data-oriented security

These use cases have some security implications. It must be possible to authenticate feed updates independently of the communication channel.
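The basic idea can be sketched in a few lines of Python: the authentication tag is computed over a canonical serialization of the update itself, so it can be checked no matter whether the update arrived via HTTP, e-mail, or multicast. Note the assumptions: this sketch uses an HMAC with a shared key purely because it is available in the standard library. A real syndication system would more likely use public-key signatures so that anyone can verify an update without being able to forge one. All function names here are hypothetical.

```python
import hashlib
import hmac
import json


def canonical_bytes(update: dict) -> bytes:
    """Serialize an update deterministically so sender and receiver hash the same bytes."""
    text = json.dumps(update, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def sign_update(update: dict, key: bytes) -> str:
    """Compute an authentication tag over the update itself, not over the channel.

    Assumption: HMAC-SHA256 with a shared key stands in for a real signature scheme.
    """
    return hmac.new(key, canonical_bytes(update), hashlib.sha256).hexdigest()


def verify_update(update: dict, tag: str, key: bytes) -> bool:
    """Check the tag in constant time; works regardless of how the update was delivered."""
    return hmac.compare_digest(sign_update(update, key), tag)
```

Because the tag travels with the update, an aggregator can pass it along unchanged and the receiver can still verify the original publisher's data.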

Written by dkutscher

August 25th, 2024 at 3:24 pm

Posted in Posts

Tagged with ActivityPub, Atom, content, RSS, syndication, web

Content Retrieval on the Decentralised Web

Trends and Emerging Technologies for Content Retrieval on the Decentralised Web

The control, governance, and management of the web have become increasingly centralised, resulting in security, privacy, and censorship concerns. Decentralised initiatives have emerged to address these issues, beginning with decentralised file systems. These systems have gained popularity, with major platforms serving millions of content requests daily. Complementing the file systems are decentralised search engines and name registry infrastructures, together forming the basis of a decentralised web. We have published a survey paper that analyses research trends and emerging technologies for content retrieval on the decentralised web, encompassing both academic literature and industrial projects.

Challenges

Several challenges hinder the realisation of a fully decentralised web. Achieving performance comparable to centralised systems without compromising decentralisation is a key challenge. Hybrid infrastructures, blending centralised components with verifiability mechanisms, show promise for improving decentralised initiatives. While decentralised file systems have seen more mature deployments, they still face challenges such as usability, performance, privacy, and content moderation. Integrating these systems with decentralised name registries offers potential for improved usability through human-readable and persistent names for content. Further research is needed to address security concerns in decentralised name registries and to enhance governance and crypto-economic incentive mechanisms.

References

Navin V. Keizer, Onur Ascigil, Michał Król, Dirk Kutscher, and George Pavlou; A Survey on Content Retrieval on the Decentralised Web; ACM Computing Surveys; March 2024; https://doi.org/10.1145/3649132

Written by dkutscher

March 7th, 2024 at 6:51 am

Posted in Publications

Tagged with content, decentralized, web
