06.11.2011

Decoupling of the Google Reader features

The main issue I see is that a website offering the full feature set of Google Reader must store a lot of posts. That feature set includes reading feeds, generating feeds of shared posts, and searching in posts, including posts that have since disappeared from their source feeds. To shrink the database, a replacement for Google Reader could store only the identifiers of posts. Then it would no longer be a feed reader, of course. Thus we have 2 options:

  • A0) Store posts.
  • A1) Store post identifiers.

Below I consider choice A1. It offers a radical decoupling of features but requires many changes in formats and existing programs. Every post must have an identifier that is unique across the whole web. This identifier consists of the identifier of a feed and the identifier-in-the-feed. The identifier-in-the-feed is supported at least in the Atom feed format. If a feed does not contain post identifiers, IMO we should just give up: it is too much of a hassle to assign identifiers if the feed author did not bother to do so.
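As a rough illustration of such a two-part identifier, here is a minimal sketch that reads an Atom document and pairs the feed's `<id>` with each entry's `<id>`. The names `PostId` and `post_ids` are made up for this example, and entries without an identifier are simply skipped, in line with the "just give up" approach above.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List

ATOM_NS = "{http://www.w3.org/2005/Atom}"


@dataclass(frozen=True)
class PostId:
    """Hypothetical web-wide post identifier."""
    feed_id: str   # identifier of the feed
    entry_id: str  # identifier-in-the-feed


def post_ids(atom_xml: str) -> List[PostId]:
    """Return the identifiers of all posts in an Atom feed document."""
    root = ET.fromstring(atom_xml)
    feed_id = root.findtext(ATOM_NS + "id")
    if not feed_id:
        return []  # the feed itself has no identifier: give up
    result = []
    for entry in root.findall(ATOM_NS + "entry"):
        entry_id = entry.findtext(ATOM_NS + "id")
        if entry_id:  # skip entries without an identifier
            result.append(PostId(feed_id.strip(), entry_id.strip()))
    return result
```

A post sharer built along these lines would only ever need to store such pairs, never the post bodies.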

Identifiers of feeds are surprisingly tricky. Feeds often intersect, i.e. 1 post may appear in several feeds:

  • News websites often have distinct feeds for different topics, and some posts belong to several topics at once.
  • Often 1 feed is published in 2 formats: RSS and Atom.
  • A feed may have several variants, each offering a different number of recent posts (10, 20, 50).

IMHO the aforementioned variants of a feed must somehow be joined.
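One possible way to join variants is an alias table that maps every known feed URL to a single canonical feed identifier. The sketch below is only an assumption about how this could look: the class `FeedAliases`, its methods, and the example URLs are invented here, and deciding which feeds are actually variants of each other is left to whoever calls `join`.

```python
from typing import Dict


class FeedAliases:
    """Hypothetical alias table: many feed URLs, one canonical identifier."""

    def __init__(self) -> None:
        self._parent: Dict[str, str] = {}

    def canonical(self, url: str) -> str:
        """Follow alias links until a URL that is its own representative."""
        self._parent.setdefault(url, url)
        while self._parent[url] != url:
            url = self._parent[url]
        return url

    def join(self, a: str, b: str) -> None:
        """Declare that feeds a and b are variants of the same feed."""
        ra, rb = self.canonical(a), self.canonical(b)
        if ra != rb:
            # Arbitrary but deterministic choice: the smaller string wins.
            self._parent[max(ra, rb)] = min(ra, rb)


aliases = FeedAliases()
aliases.join("http://example.org/feed.rss", "http://example.org/feed.atom")
aliases.join("http://example.org/feed.atom?n=50", "http://example.org/feed.atom")
```

After these calls all three URLs resolve to the same canonical identifier, so a post shared from the RSS variant and the same post seen in the 50-post Atom variant get the same feed identifier.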

Thus we need the following programs:

  • A feed reader. There are a lot of them.
  • A post sharer. It stores and publishes only the identifiers of shared posts. A GUI is not necessary, as it is more convenient to share from a feed reader, where the user can read a post before sharing it.
  • A post source. Since the post sharer gives only the identifiers of shared posts, the user must be able to obtain the contents of posts somewhere else, namely from the post source. The post source may be:
    • A standalone blog. It’s a source of one feed.
    • A blogosphere. It’s a source of many feeds which share a common prefix.
    • A database that searches in blogs, like Google Blog Search, Yandex Blog Search, etc. It’s a source of a lot of feeds.

Thus we need the following features:

  • B0) The feed reader sends a post identifier to the post sharer.
  • B1) The feed reader takes a feed of identifiers of shared posts from the post sharer and loads the contents of each shared post from the post source. Consequently, every post source must offer an API for requesting the post with a given identifier.
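The interplay of B0 and B1 can be sketched with in-memory stand-ins. Everything here is hypothetical: the classes `PostSharer` and `PostSource`, the method names, and the function `load_shared_posts` are invented for illustration, and a real post source would answer over the network rather than from a dictionary.

```python
from typing import Dict, List, Optional, Tuple

# (feed identifier, identifier-in-the-feed)
PostId = Tuple[str, str]


class PostSharer:
    """Stores and publishes only identifiers of shared posts."""

    def __init__(self) -> None:
        self._shared: List[PostId] = []

    def share(self, post_id: PostId) -> None:
        """B0: the feed reader sends a post identifier to the sharer."""
        if post_id not in self._shared:
            self._shared.append(post_id)

    def shared_feed(self) -> List[PostId]:
        """The published feed of identifiers, oldest first."""
        return list(self._shared)


class PostSource:
    """Returns post contents for a given identifier (assumed API)."""

    def __init__(self, posts: Dict[PostId, str]) -> None:
        self._posts = dict(posts)

    def get_post(self, post_id: PostId) -> Optional[str]:
        return self._posts.get(post_id)


def load_shared_posts(
    sharer: PostSharer, sources: List[PostSource]
) -> List[Tuple[PostId, Optional[str]]]:
    """B1: resolve each shared identifier against the known post sources."""
    result = []
    for post_id in sharer.shared_feed():
        content = None
        for source in sources:
            content = source.get_post(post_id)
            if content is not None:
                break
        result.append((post_id, content))  # None if no source has the post
    return result
```

Note that a post which has disappeared from every post source comes back as `None`: with choice A1 the sharer alone cannot bring it back.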

B1 requires profound changes in existing programs, namely the feed reader and the post source. Perhaps in the short run the post sharer will just publish a feed that contains both the identifiers and the contents of shared posts, i.e. the post sharer incorporates the post source. This solution is essentially equivalent to A0.

The above does not cover:

  • Searching in shared posts. Subscribe to your own feed of shared posts in your feed reader and search there.
  • Comments. These belong to newsgroups/web forums. Not all users of Google Reader read and/or write comments.
  • Restricted access. Dealing with groups of users is a complex issue of its own and is similar to what a social network does. You may just go to Google Plus.
