I think we can all agree that finding information on the Web is hard. I mean, sure, Google can turn up a few high-quality links, but any rigorous and deep use of the Web remains difficult -- particularly finding information of which we're not aware. One of the joys of researching at a library is to look up a particular book, find its place in the stacks, and then develop a crick in your neck as you drink in all the topically related books gathered around the one you originally sought.
Many have sought to replicate this experience online through visualizations of Web spaces. The Atlas of Cyberspaces is littered with attempts at overlaying some meaningful spatial framework in order to highlight semantic relationships and strengthen recall.
Yet none of these have really caught on. Systems like WebMap only make the process of finding information more difficult, because they require a user to understand yet another layer of abstraction, and a highly arbitrary one at that. The main problem here is that visualizing semantic relationships is nearly impossible. There are far too many dimensions of meaning, and the dimensions that are particularly meaningful to me might have little relevance to others.
In an essay I just finished on information visualization, I wrote this passage:
World Wide Web visualizations begin to make sense when documents are no longer treated semantically, with attempts at assaying their meaning, but socially, by following how they are connected through links. Visualization tools like TouchGraph, which borrow from network analysis diagramming to depict web site connections, benefit from the proven effectiveness of social network visualizations.
Which made me think of Andrew Odlyzko's oft-cited notion that communication, not content, is the internet's killer app. And it led to the realization that visualizing connections between sites helps overcome many of the problems that visualizing semantics faces. Those connections doubtless encompass many semantic qualities (thus wrapping up n dimensions of semantics into a single link), and the act of connecting sites is an extremely meaningful act in itself, whereas the passive comparing of semantic meaning is far less likely to be relevant. Now, all I'm doing here is retreading Google's ground (the order of their results is guided by the act of linking, not simply semantic meaning), but all that does is bolster the thesis.
Anyway. Some food for thought on a Monday afternoon.
3 comments so far.
Hi Peter, your comment about giving preference to the social rather than the semantic qualities of documents relates to my own ideas about automated vs. human categorization.
What does it mean to treat a document semantically? To me it means that the document has been programmatically examined for major themes/keywords. Or perhaps it was a person, the author of the document or a human categorizer, who assigned the keywords, but in this case the keywords reflect the opinions of a single person, who has limited knowledge and a domain-specific vocabulary.
Social classification, on the other hand, implies that the documents are categorized by the many people who somehow link groups of documents (and other items) together. Perhaps they cite the documents in their own writing. Or, as is the case with Amazon.com, users indicate a relationship between books by buying one book as well as another. Such methods are likely to be noisy (i.e. show similarity when there isn't any), but clear patterns begin to emerge when there are a large number of users.
Thus on the one hand the categorization is the result of a computer algorithm (the semantic understanding of which cannot approach that of a human) or of the efforts of a single person, and on the other it is the result of the combined valuations of hundreds of individuals with their own unique backgrounds and interests (which is what Google uses to determine page similarity).
I believe it is the underlying collaborative filtering effect, the basis of social categorization, that makes it superior to semantic classification by a computer or a single individual.
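As a rough illustration of the co-purchase idea, here is a small Python sketch that counts how often two items turn up in the same basket and keeps only the pairs seen often enough to rise above the noise. The basket data, the function name `co_occurrence`, and the `min_support` threshold are all made up for the example; this is not a description of how Amazon or Google actually compute similarity.

```python
from collections import Counter
from itertools import combinations


def co_occurrence(baskets, min_support=2):
    """Count item pairs that share a basket; keep pairs seen at least min_support times.

    `baskets` is a hypothetical list of purchases, one list of item names per buyer.
    """
    counts = Counter()
    for basket in baskets:
        # Deduplicate and sort so each pair is counted once per basket,
        # always in the same (alphabetical) order.
        for a, b in combinations(sorted(set(basket)), 2):
            counts[(a, b)] += 1
    # Rare pairs are treated as noise and dropped.
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Invented example data: four buyers and the books each one bought.
baskets = [
    ["tufte", "card", "norman"],
    ["tufte", "card"],
    ["tufte", "cookbook"],
    ["card", "tufte", "norman"],
]
pairs = co_occurrence(baskets)
```

With only four buyers the threshold does very little, but the same counting over many users is where the clear patterns would be expected to emerge.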
Posted by Alex Shapiro @ 06/24/2002 08:59 PM PST [link to this comment]
Interesting stuff, chaps. I work as an Internet Researcher in London and we have trialled apps like TouchGraph for the purposes of mapping online communities and identifying spheres of influence for sites. Unfortunately we found TouchGraph's basis on Google Similar Pages hopelessly unreliable. I tried to investigate how Google Similar Pages operated but came to the conclusion that it must be keywords alone. But Alex here suggests that it is much more than that. Could you or anyone else please explain further?
We had more joy with Net Locator, a piece of freeware from Holland. It was developed by GOVCOM.ORG, a coalition of designers, academics and government research scientists based in Amsterdam. The idea is very simple (and therefore rather flawed): it presupposes that Webmasters use linklists to frame the position of other organisations on the Internet. In this way links to sites constitute a politics of relevance, and therefore provide a means to demarcate or map issue debates/communities online. With the software you plug in up to eight link lists from sites you consider to represent the most authoritative and publicly visible reference point for a given issue, e.g. GM Crops. The software then identifies common sites from these linklists and begins to sort the wheat from the chaff in terms of sites relevant to the debate. After a lot of iteration you have identified a set of interdependent sites that, apparently, comprises the web community on that issue.
Unfortunately the software doesn't create the map for you; you have to do that yourself. But it is a very interesting tool anyway. What do others think about this?
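For anyone who wants to see the common-sites step spelled out, here is a minimal Python sketch of one pass. It is my own guess at the general idea, not Net Locator's actual code: the function name `common_links`, the `min_lists` threshold, and the `.example` site names are all invented for illustration.

```python
from collections import Counter


def common_links(link_lists, min_lists=2):
    """Return the sites that appear on at least min_lists of the starting link lists.

    `link_lists` maps each starting site to the list of sites it links to.
    """
    counts = Counter()
    for links in link_lists.values():
        # Use a set so a site linked twice from one list still counts once.
        counts.update(set(links))
    return {site for site, n in counts.items() if n >= min_lists}


# Invented starting points for an issue, each with its link list.
starting_points = {
    "a.example": ["x.example", "y.example", "z.example"],
    "b.example": ["x.example", "y.example"],
    "c.example": ["x.example", "w.example"],
}
shared = common_links(starting_points)
```

Iterating would mean collecting the link lists of the sites in `shared` and running the same step again until the set stops changing much.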
Posted by Iain Ferguson @ 06/25/2002 04:11 AM PST [link to this comment]
You got to sit down and spend more time with architects, talking about this and that, but a hell of a lot about spaces and how to layer them. I really believe doing so would be very valuable to you.
All the best,
Posted by Frederik Andersen @ 06/25/2002 03:01 PM PST [link to this comment]