17th level Hacker


Steve Gillmor was over at the EBig RSS and Blogging SIG, talking about attention.xml. The format was pretty free form, very participatory. He was pulling people out of the audience and asking them questions. The audio should be up on ITConversations. Here are some notes.

He asked a question about how many people use aggregators. About 50 percent of the people did. Bill Flitter introduced what RSS is and what role it’s playing. Then Steve goes into a description of RSS and how it’s feeding into attention.xml. Microsoft had the first syndication format, CDF - channel definition format. It was supposed to work with the Windows desktop. There were questions about Atom, but Steve thinks that’s just reinventing the wheel. The publishing part was valuable, but eventually Winer will merge in the good parts and it’ll become a new version, but it’ll still be RSS.

Netscape was the original source of RSS, but they got whomped by Microsoft. Dave Winer took over custodianship of the spec. Then blogs come in, and tools for blogging start emitting RSS, and we start to get lots of XML objects floating around. As the blogosphere grew it was hard to maintain some sort of overview and keep on top of the feeds. You have problems of using search, false positives, dead ends. RSS changed that equation by creating a publish/subscribe mechanism. An alert mechanism in addition to a syndication mechanism.
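To make the "alert mechanism" point concrete, here is a minimal sketch of what an aggregator does on each poll: parse the feed, compare item identifiers against what it has already seen, and surface only the new items. This is my own illustration, not something shown at the talk. The sample feed, the URLs and the `new_items` helper are all made up, and it assumes a plain RSS 2.0 feed with `guid` or `link` elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, made up for illustration.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>http://example.com/</link>
    <description>Illustrative feed</description>
    <item>
      <title>First post</title>
      <link>http://example.com/2005/01/first</link>
      <guid>http://example.com/2005/01/first</guid>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/2005/01/second</link>
      <guid>http://example.com/2005/01/second</guid>
    </item>
  </channel>
</rss>"""


def new_items(feed_xml, seen_guids):
    """Return (title, link) pairs for unseen items and record their ids in seen_guids."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        # Fall back to the link when the feed has no guid (an assumption of this sketch).
        guid = item.findtext("guid") or item.findtext("link")
        if guid is None or guid in seen_guids:
            continue
        seen_guids.add(guid)
        fresh.append((item.findtext("title"), item.findtext("link")))
    return fresh


seen = set()
first_poll = new_items(SAMPLE_FEED, seen)   # both items are new
second_poll = new_items(SAMPLE_FEED, seen)  # nothing new, so no alert
```

The subscriber only hears about a post once, which is the difference from re-running a search and wading through the same results again.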

Question about who owns the RSS spec, who controls it. Steve says that the law of unintended consequences is always in play. Who owns HTML? How transparent is the technology is the real question. The great thing that Dave Winer did was allowing RSS to reach a disruptive critical mass. There are next generation questions, like how to manage the flood, how to get monetary recompense back to the authors. SOAP is an example of how standards bodies aren’t an important factor to look at when thinking about formats. Winer says that SOAP was coopted by vendors and made more complex. Wonderful speech by Adam Bosworth on this issue talking about the properties of keeping things simple, and why simple wins, and why simple RSS will win out. Also good commentary on the dynamics of attention.

Now we’re in a stage that Steve calls the inforouter. Radio Userland raised the number of feeds on the network very quickly, that was the tipping point for RSS itself. The other major factor was the permalink and The New York Times. The permalink is a distinct URL for an individual post, so that it can still be referenced as the site evolves. The New York Times supported this, and other publishers were forced to do the same. (This might change with The New York Times, they might pull stuff behind a paywall).

Effectively the browser is dead, it’ll become integrated into the inforouter. NetNewsWire exploits the WebKit toolkit to have a browser built into the client. You can go from captured RSS info and browse out of that. We’re going to move more and more into a full text feed. That raises concerns about business models, and that’s where Pheedo and others come in. And it’s one of the primary focuses of attention.xml. Fundamentally RSS is about time. It is more efficient in terms of consuming information, and it will win. Also, you can store on a local system, something you can’t do with the normal browser.

There is always a need for a persistent local store, to some degree. How do you preserve important information on the local client? There is a need for this on both ends, client and server. Once you can capture enough info, you have an information management problem.

The blogosphere developed the A-List, very popular bloggers. This was the first boundary intersection. People who exploited the low barrier to entry. People who hadn’t been in the field very long, but were PR or engineers. The A-List developed like a star system, it was keyed into pagerank, based on links for authority. It creates an aristocracy that some people find annoying, and it’s destabilizing to the incumbents in the media. Transition away from the publisher being the authority to the author being the authority (like Dan Gillmor with grassroots journalism). Enterprise whuffie is also developing, like Schwartz for Sun. He uses his blog to set the tone and dominate the conversation. Keeping two steps ahead of the trade media. He’s created a lot of momentum when he comes on the Gillmor Gang, waves of executives taking blogging as a new PR/marketing tool.

Finally, go read the Long Tail article at Wired if you haven’t already. It describes the effects of the network quite well. For example, 60 percent of Amazon revenue comes from offerings that aren’t from the blockbuster list. Sometimes servicing the tail is more efficient than servicing the head. The development of the tail gives rise to new forms of interaction that support what people are looking for.

Sidepoint: any metadata that can flow out of an interaction is important. There should be a cloud of attention information that can flow around between different points of access.

NetNewsWire is an inforouter, it embeds the browser and has a database store. Take a look at the Mozilla team also, they’re committed to supporting attention.xml.

Technorati was built around Dave Sifry trying to find out who was talking about him, it was the first attention engine. Feedburner also provides some attention info, and according to the info from Feedburner Bloglines has about 50% of the market. (Correction from Steve: he was talking about Bloglines not Feedburner at this point) Bloglines exposes very little info, but it does tell you how many people are subscribed to a feed. Using some of these services we can get some info about the RSS space, in terms of readership. If we are moving to a new model around RSS, this is the new pagerank. Rojo is also leaning in this direction. Currently in beta. It’s a social network plus an aggregator, provides attention info, and tracks at a more granular level.

Attention metadata is who you read, what parts you read, and how long you read it for. If you know who you read and a timestamp you can figure out a priority. You need some metadata to scan to be able to figure out what to read. The most important thing is how much the news will affect us, not how well written it is. The factors of importance are highly variable and personal, and they can be tuned and improved.
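As a rough illustration of "who you read plus a timestamp gives you a priority", here is a small sketch. It is not the attention.xml format and not anything Steve showed. The `AttentionRecord` fields, the `source_priorities` function and the one-week half-life are my own assumptions, chosen only to show how the three pieces of metadata could be combined and tuned.

```python
from dataclasses import dataclass


@dataclass
class AttentionRecord:
    source: str          # who you read (feed or author)
    item_url: str        # what part you read
    seconds_read: float  # how long you read it for
    read_at: float       # when you read it, in epoch seconds


def source_priorities(records, now, half_life_days=7.0):
    """Score each source by time spent reading, discounting older reads.

    The exponential decay and its half-life are assumptions of this sketch;
    the half-life is the knob you would tune to personal taste.
    """
    half_life = half_life_days * 86400.0
    scores = {}
    for r in records:
        age = max(0.0, now - r.read_at)
        weight = 0.5 ** (age / half_life)
        scores[r.source] = scores.get(r.source, 0.0) + r.seconds_read * weight
    # Highest score first: these are the sources to scan first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


now = 1_000_000.0
records = [
    AttentionRecord("feed-a", "http://example.com/a/1", 60.0, now),
    AttentionRecord("feed-b", "http://example.com/b/1", 60.0, now - 7 * 86400),
    AttentionRecord("feed-a", "http://example.com/a/2", 30.0, now),
]
ranked = source_priorities(records, now)
```

A source you read a week ago counts for half as much as one you read just now, so the ranking drifts as your reading habits change.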

The fundamental of RSS is that you can take less time, and that really requires full text feeds. If someone in one post refers to another, that should also be cached on your machine so you can follow the conversations. Dave Sifry cited that the pagerank/whuffie/readership is greater for full text feeds, people gravitate toward them. That’s why publishers are having issues, because people vote with their feet, but monetizing content in RSS is hard. It’s a push and pull issue, and hasn’t settled down yet. The feed of headlines is really commoditized, there is little difference in info between feeds from different sources.

So if full text feeds are the way to go, how do people make money off of it? He pulled up Eric Rice and asked him how RSS affected him. Eric didn’t remember his first encounter or reaction, but he knows he signed up and started playing with it at some point. But he does remember that his information consumption went up radically. Eric says he’ll probably have 750 messages when he gets home, there is a lot of information there. But he just scans sometimes. Sometimes he goes straight to specific sources, sometimes it’s all scanning. Steve calls this the window of opportunity, there is a finite window in which to deal with the information you get. To read it, respond to it, share it. And increasing the hit rate is very valuable. Reference to podcasting: taking downtime and putting it back into the window.

Asked Elle about the post where she said that what’s more important to her is not what she’s interested in, but filtering out what she’s not interested in. Elle said there were two points to what she reads. There’s Google News type stuff, that has an important effect of serendipity. And stuff for work, which is very pointed. Steve called out letting the system pull out false positives, as contrasted with only giving you information that you already know about. You learn more from what you didn’t expect than what you do expect.

As soon as we make it to “the world of RSS”, our biggest problem is the attention info, and making sure that it’s used. The tools need to know about the pagerank/attention information, so that you know who has linked to what and can put the important stuff in your window. The problem of information overload is intractable, we’ll never have the time necessary to do all that we want. The more that we throw out automatically the more time we have to deal with what remains.

OPML is one of the use cases for attention, a simple case. Attention.xml would add information into that OPML style format. Assume we have some tools that track attention and exchange it. Steve has some people he reads early and often (Doc, Jon Udell, Dave Winer), those people would be high on his attention list. If all those people comment on a post, that post is probably of interest. You should be able to harness the human filtering. Technorati, Feedster and others should be able to make use of that information also, they should be able to infer results based on this information as well.
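Here is a hedged sketch of that human filtering idea: give each person on your attention list a weight, then score a post by adding up the weights of everyone who linked to or commented on it. The weights, post names and the `rank_posts` function are invented for illustration. This is not how attention.xml, Technorati or Feedster actually work.

```python
def rank_posts(attention_weights, links):
    """Rank posts by the summed attention weight of the people who linked to them.

    attention_weights: dict of author -> weight (higher means read earlier and more often)
    links: iterable of (author, post) pairs, meaning the author linked to or commented on post
    """
    linkers = {}
    for author, post in links:
        # A set, so one person linking twice is only counted once.
        linkers.setdefault(post, set()).add(author)
    scores = {
        post: sum(attention_weights.get(author, 0.0) for author in authors)
        for post, authors in linkers.items()
    }
    # Highest score first, ties broken by post name for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical weights for the people Steve says he reads early and often.
weights = {"Doc": 1.0, "Jon Udell": 0.9, "Dave Winer": 0.8}
links = [
    ("Doc", "post-1"), ("Jon Udell", "post-1"), ("Dave Winer", "post-1"),
    ("Doc", "post-2"),
    ("someone-unknown", "post-3"),
]
ranked_posts = rank_posts(weights, links)
```

The post that all three picked up lands at the top, and the post mentioned only by someone outside the attention list sinks to the bottom without any editor deciding so.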

Question from Steve Tennent, is attention.xml providing an editorial function? Steve said no, that’s the A-List bloggers. In terms of the editorial function, the value of attention is in surfacing the editorial info that’s already present in how other humans consume information.