September 28, 2005

Adding tag clouds to Drupal

I meant to post this quite some time ago but there’s no time like the present.

Kent Bye from Echo Chamber Project sent me some links describing the tag-cloud implementation for Drupal he’s been speccing out:

I finished a couple of posts that further specify a Drupal tag cloud – the font distribution algorithm has lots of graphics of my Power Law distribution of tags.

I created a couple of posts that specify some of my ideas for tagadelic additions (let me know of others who might be able to help manifest an automatic tag cloud that can be personalized by user or identity):

The first post walks through the evolution of the font distribution algorithm for a tagadelic tag cloud.

There are a lot of graphs to help you see how I came up with the algorithm based upon my free-tagging data from my site.

I also came up with three flowcharts for creating different types of tag clouds.

The first one is the most basic implementation that uses all tags from all users.

The second takes into account specific category vocabularies as well as all of the nodes from specific user uids.

The third creates a tag cloud based upon the identity of a group of users (in my case pro-war and anti-war).
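To make the font distribution idea concrete, here is a minimal sketch in Python of one common way to size tags when counts follow a power law: scale on the logarithm of the count rather than the raw count. This is not Kent's algorithm or tagadelic's code; the function name, the pixel range and the sample tags are all assumptions made up for illustration.

```python
import math


def font_sizes(tag_counts, min_px=10, max_px=32):
    """Map raw tag counts (each >= 1) to font sizes on a logarithmic scale."""
    if not tag_counts:
        return {}
    logs = {tag: math.log(count) for tag, count in tag_counts.items()}
    lo, hi = min(logs.values()), max(logs.values())
    spread = hi - lo
    sizes = {}
    for tag, value in logs.items():
        # When every tag has the same count, fall back to the smallest size.
        weight = (value - lo) / spread if spread else 0.0
        sizes[tag] = round(min_px + weight * (max_px - min_px))
    return sizes


# Hypothetical counts: a few heavy tags and a long tail.
counts = {"war": 100, "media": 10, "blogs": 1}
print(font_sizes(counts))  # {'war': 32, 'media': 21, 'blogs': 10}
```

With linear scaling the middle tag would sit almost at the minimum size; the log scale spreads a power-law distribution more evenly across the available sizes.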

September 27, 2005

Rashmi Sinha on the cognitive process behind tagging

Rashmi Sinha has posted an interesting hypothesis on the cognitive psychology behind tagging (with easy illustrations for those of us who don’t remember psych class): A cognitive analysis of tagging (or how the lower cognitive cost of tagging makes it popular).

September 16, 2005

Ian Davis on Why Tagging Is Expensive

Last week Ian Davis wrote an interesting post on Why Tagging is Expensive:

On the surface tagging seems to offer a new paradigm of organising information, one that reduces the cost of entry and so enables a long tail of participation to emerge. I’ve come to realise that the cost isn’t removed, instead it’s displaced and possibly increased. Tagging bulldozes the cost of classification and piles it onto the price of discovery.

There’s a saying I’ve heard once or twice (I wish I could attribute it): “The cost of metadata is in its application, but the value of metadata is in its use.”

Not exactly something you’ll be quoting at dinner parties, but it nicely captures the cost/benefit gaps of metadata.

The arguments against professional classification (including Clay’s views on tagging) have almost always worked on the cost side of the equation. Automated indexing, search and now tagging are seen as ways to drive down classification costs. But as Davis explains, classification costs are only one part of the system:

In my view the total cost of an information retrieval system is the cost of classification plus the cost of discovery. In the formal classification world you have a very small number of people incurring a high cost in order to reduce the costs incurred by a very large number of people. In contrast the tagging world has the unit costs reversed: it’s cheap to classify, expensive to find. But the numbers of people involved are large in both cases so you end up with a lot of people paying a tiny cost to classify added to a lot of people paying a high price to discover. I think it’s pretty likely that the total cost is going to end up much higher than in the classification scenario.
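Davis's arithmetic can be sketched in a few lines of Python. Total cost is the number of classifiers times the unit cost of classifying, plus the number of searchers times the unit cost of discovery. Every number below is hypothetical and in arbitrary units; the point is only to show how the two scenarios add up, not to measure anything.

```python
def total_cost(classifiers, cost_per_classification, searchers, cost_per_discovery):
    """Total cost of a retrieval system: classification plus discovery."""
    return classifiers * cost_per_classification + searchers * cost_per_discovery


# Hypothetical formal scheme: few classifiers, expensive to classify, cheap to find.
formal = total_cost(classifiers=10, cost_per_classification=1000,
                    searchers=100_000, cost_per_discovery=1)

# Hypothetical tagging scheme: many classifiers, cheap to classify, expensive to find.
tagging = total_cost(classifiers=100_000, cost_per_classification=1,
                     searchers=100_000, cost_per_discovery=5)

print(formal)   # 110000
print(tagging)  # 600000
```

The outcome depends entirely on the unit cost of discovery, which is exactly the number the rest of this post argues about.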

Here’s an analogy. I visit a lot of thrift stores. The true cost of an item in a thrift store is a function of the time it takes me to find it, not the price (which is always cheap). A very large thrift store is more likely to have what I want, but at a greater discovery cost. Like a tagging system, a thrift store is great for serendipitous discovery but not so good for known-item retrieval. Put another way, a tagging system wouldn’t be your first choice if you needed articles on Rousseau and the French Revolution, just as the Sally Ann wouldn’t be your first choice if you needed a smoking jacket, size 42T.

Where I think Davis might be wrong is in suggesting that the discovery costs are shifted back to the user. In fact, the costs are shifted to search, blogs and other more efficient discovery tools. In large part this is because the domain of tagging systems has been the “big messy” web.

In that case, the “classical” economics of information retrieval don’t apply because there are often multiple ways of finding things. Or because Google can radically lower your discovery costs by selling keyword advertising to offset their infrastructure. Or because algorithms can do much of the heavy lifting. Or because users’ expectations are for “just good enough” results. Or because users are not interested in finding so much as tracking. And so on.

But I’d argue that once the domain is constrained–by subject, by context, by user population, by privacy/security, by business goals, or by those things in combination–the economic principles of classification and retrieval come back into play. Because other discovery tools are either not available or not optimal, poorly designed retrieval systems do shift the burden back to the user. (Karl Fast’s thoughts on problems in the middle are worth a read here.)

In that middle ground–and the “big messy” web contains probably millions of cases where local structure is valuable, not to mention information systems that aren’t part of the “big messy” web–I think there’s a large area where a mixture of emergent, algorithmic, formal and now social classification systems will make for optimal retrieval.

September 9, 2005

Paper: The Structure of Collaborative Tagging Systems

Here’s a good paper from HP Labs that analyzes tagging patterns: The Structure of Collaborative Tagging Systems (PDF). Lots of interesting stuff on tagging behaviour, frequency, semantics and categories.

(Via Ed Vielmetti.)

September 6, 2005

Tagging for Katrina

Nancy White is part of the community of bloggers who are putting their online networks to work in supporting disaster recovery efforts in the wake of Hurricane Katrina. Staci Kramer has suggested that tags can assist in this work by helping people find and organize different types of information related to the recovery effort. They e-mailed me to ask for a quick tagging “how to” that could help bloggers and online networkers use tags more effectively. My first crack at this is below; suggestions and improvements are welcomed here or on the wiki version.

Tagging for Katrina

The Internet is a crucial tool for people helping in the Katrina recovery, and for Katrina survivors looking for loved ones, food, shelter or other assistance. As the online community has joined the recovery effort, the explosion of online resources has made it harder and harder for people to find what they are looking for.

Tags can help to organize the wealth of online Katrina information so that survivors and supporters can work together more effectively and more quickly. For example, Andy Carvin has set up a Katrina blog that uses the tag hurricanekatrina to pull photos from Flickr.

Here’s a quick guide to how your web site or blog can use tags to make information more accessible, and on how tags can help people find the information they need.

What are tags?

Tags are just keywords (or categories) that describe some kind of online content so that it will be easier to find. I can use the tag chocolate to describe a blog post I’ve written, a web site I’ve discovered, a photo I have taken, or a dessert shop I liked. There are web sites that organize all these types of content (blog posts, web bookmarks, photos and restaurant reviews) so no matter what kind of information you are storing or you are looking for, tags can help you find it.

I’ve also written a more detailed introduction to tagging.

How can they help with the Katrina recovery?

Tags can help to organize information for hurricane survivors — but only if there is some consistency in how we use these tags. Tag your blog posts; tag and share your digital photos; tag and store useful web links so that others can find them too.

Which tags should I use?

There is no “right” tag for any topic, but it’s helpful to use tags that are similar to what other people are using so that related information will get pulled together in a few central web pages. The tag hurricanekatrina has emerged as the most common tag for people to use for any information related to the recovery. Use this tag in any blog post, photo, or web link that is related to the hurricane.

You can use more than one tag for any blog post (or photo, or web link) so use all the tags that you think would help someone find the information you are describing. For example, if you are posting a photo of a child who is looking for her parents, you might tag it hurricanekatrina, survivor, missing, found and child.

How to tag blog posts:

If you are writing a blog post about Hurricane Katrina you can make sure it’s included in the Katrina page on Technorati. (Technorati lets you search blog posts the way Google lets you search web pages; while there are other blog search engines out there Technorati makes a point of tracking blog posts by tag, and including other kinds of tagged content on its subject pages.)

If your blog uses categories, you might want to create a category called hurricanekatrina. Post all your Katrina-related posts to that category, and Technorati should convert your category to a tag; then use one (or more) of the tags below in the body of a specific post in order to describe it in greater detail.

To include the tag in a blog post, paste this bit of code into the post:

<a href="" rel="tag">hurricanekatrina</a>
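If you are tagging a lot of posts, you could generate that snippet rather than typing it. Here is a minimal sketch in Python. The function name is made up, and the base URL in the example is a placeholder assumption, not a real tag service address: substitute the address of whichever tag page you want the link to point at.

```python
from html import escape
from urllib.parse import quote


def rel_tag_link(tag, tag_base_url):
    """Build a rel="tag" link whose href ends in the (URL-encoded) tag name."""
    href = tag_base_url + quote(tag, safe="")
    return '<a href="{}" rel="tag">{}</a>'.format(escape(href, quote=True), escape(tag))


# "https://example.com/tag/" is a hypothetical placeholder base URL.
print(rel_tag_link("hurricanekatrina", "https://example.com/tag/"))
# <a href="https://example.com/tag/hurricanekatrina" rel="tag">hurricanekatrina</a>
```

Note that the snippet uses plain straight quotes; curly quotes pasted from a word processor can stop the link from working.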

How to tag and share photos:

If you are collecting photos, consider joining Flickr, a photo-sharing service that is accumulating a large collection of Katrina images under the tag hurricanekatrina. Flickr offers good instructions on how to upload, share and tag your photos; just tag them with hurricanekatrina in addition to any other tags that help to describe your photo.

How to tag and share online resources:

If you’ve found a web site or blog post that includes useful information or even inspiration, you can share it with other people. delicious is a “social bookmarking” service that lets people share their favourite web sites. (It also does a great job of helping you keep your favourite web sites organized, so you can find them again yourself.) It’s quite easy to use, but if you’d like a helping hand there is a good online introduction available. Once you are up and running on delicious, you can use it to share any resource you find that is related to Katrina; just use the tags below as a guideline to how you could tag the resources you find. Make sure to use the tag Katrina in addition to any other tags you use.

Suggested tags:

Here are some other tags you might use to describe a blog post, a web link to a useful resource, or a photo related to the disaster. Use these tags IN ADDITION to the hurricanekatrina tag.

I’m including the Technorati code for each tag; if you are writing a blog post and want to tag it with one or more of these tags, include the Technorati code for that tag IN ADDITION to using the hurricanekatrina code:

<a href="" rel="tag">hurricanekatrina</a>

Tag: hurricanekatrina
For: anything related to the hurricane
Technorati code: <a href="" rel="tag">hurricanekatrina</a>

Tag: children
For: lost and found kids (use with peoplefinder; see below); other news about children
Technorati code: <a href="" rel="tag">children</a>

Tag: donation
For: an offer of or story about financial assistance, food, or other support (not shelter)
Technorati code: <a href="" rel="tag">donation</a>

Tag: hero
For: people who survived; people who are making a difference
Technorati code: <a href="" rel="tag">hero</a>

Tag: housing
For: seeking or offering housing
Technorati code: <a href="" rel="tag">housing</a>

Tag: inspiration
For: good wishes, encouragement and uplifting news
Technorati code: <a href="" rel="tag">inspiration</a>

Tag: job
For: work available or work needed
Technorati code: <a href="" rel="tag">job</a>

Tag: neworleans
For: anything related to the city (note: no space in the city’s name)
Technorati code: <a href="" rel="tag">neworleans</a>

Tag: peoplefinder+found
For: news of people who have been found (use two tags: peoplefinder + found)
Technorati code: <a href="" rel="tag">peoplefinder</a> <a href="" rel="tag">found</a>

Tag: peoplefinder+lost
For: news of people who are missing (use two tags: peoplefinder + lost)
Technorati code: <a href="" rel="tag">peoplefinder</a> <a href="" rel="tag">lost</a>

Tag: pets+found
For: news of pets who have been found (use two tags: pets + found)
Technorati code: <a href="" rel="tag">pets</a> <a href="" rel="tag">found</a>

Tag: pets+lost
For: news of pets who are lost (use two tags: pets + lost)
Technorati code: <a href="" rel="tag">pets</a> <a href="" rel="tag">lost</a>

Tag: politics
For: issues around disaster preparedness and recovery
Technorati code: <a href="" rel="tag">politics</a>

Tag: rebuilding
For: efforts to rebuild the communities devastated by Katrina
Technorati code: <a href="" rel="tag">rebuilding</a>

Tag: rescue
For: anything related to immediate rescue of survivors
Technorati code: <a href="" rel="tag">rescue</a>

Tag: volunteer
For: a way you can help personally, or a reference to someone who is volunteering
Technorati code: <a href="" rel="tag">volunteer</a>

Tag: wanted
For: use this along with another tag like "job" or "housing" to indicate that you are looking for help
Technorati code: <a href="" rel="tag">wanted</a>

Tag: available
For: use this along with another tag like "job" or "housing" to indicate that you are offering help
Technorati code: <a href="" rel="tag">available</a>

These tags are just a suggestion. Use whatever you think will help you — and others — find the information you’re tagging.
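Since consistency is what makes tags useful, a site that collects tags from many contributors might tidy them before publishing. The following Python sketch shows one way to do that under my own assumptions: lowercase everything, remove spaces inside a tag (so "New Orleans" becomes neworleans), drop duplicates, and always put hurricanekatrina first. The function and constant names are invented for this example.

```python
CORE_TAG = "hurricanekatrina"


def normalize_tags(raw_tags):
    """Lowercase tags, remove spaces, drop duplicates, and put the core tag first."""
    cleaned = []
    for raw in raw_tags:
        # split() with no arguments also discards leading and trailing whitespace.
        tag = "".join(raw.lower().split())
        if tag and tag not in cleaned:
            cleaned.append(tag)
    if CORE_TAG in cleaned:
        cleaned.remove(CORE_TAG)
    return [CORE_TAG] + cleaned


print(normalize_tags(["New Orleans", "Housing", "wanted", "housing"]))
# ['hurricanekatrina', 'neworleans', 'housing', 'wanted']
```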

I have some more suggestions on tagging that may be helpful (especially for users). And please feel free to contact me personally (alex [at] alexandrasamuel [dot] com) with any questions.

September 2, 2005

Tom Coates on bubble-up folksonomies

Tom Coates has a great post on bubble-up folksonomies–using tags to augment conceptual hierarchies. The example he gives involves tagging songs (in his Phonetags project) and using those tags to understand broader categories like album and artist.

…because you have a semantic understanding of the relationship between concepts like a ‘song’, an ‘album’ and an ‘artist’ you can allow people to drill-down or move up through various hierarchies of data and track the changes in an artist’s style over time. For me, this is a pretty compelling argument that understanding semantic relationships between concepts makes folksonomic tagging even more exciting, rather than less so, and may indicate a changing role for librarians towards owning formal conceptual relationships rather than descriptive, evocative metadata. But that’s a post for another time. (My emphasis)

I like this idea for a couple of reasons. First, it recognizes the value of some semantics in the system (that is, the humans-at-both-ends-of-the-rope approach would be insufficient on its own). Second, it solves a real problem–helping people find good music that they’ll like.

And in the comments, Kevin Marks says that Technorati is using the bubble-up approach to recommend blogs by tag (rather than just posts).
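The bubble-up idea can be sketched in Python as a simple roll-up: count the tags on each item, then add those counts to the item's parent in the hierarchy. This is only an illustration of the concept, not how Phonetags or Technorati actually do it; the function name and all the song, album and artist identifiers are hypothetical.

```python
from collections import Counter, defaultdict


def bubble_up(item_tags, parent_of):
    """Aggregate each item's tag counts onto its parent (e.g. song -> album)."""
    totals = defaultdict(Counter)
    for item, tags in item_tags.items():
        # Counter.update adds counts, whether given a list of tags or a Counter.
        totals[parent_of[item]].update(tags)
    return dict(totals)


# Hypothetical data: tags applied to individual songs.
song_tags = {
    "song-a": ["mellow", "acoustic"],
    "song-b": ["mellow", "piano"],
    "song-c": ["loud"],
}
song_to_album = {"song-a": "album-1", "song-b": "album-1", "song-c": "album-2"}
album_to_artist = {"album-1": "artist-x", "album-2": "artist-x"}

album_tags = bubble_up(song_tags, song_to_album)
artist_tags = bubble_up(album_tags, album_to_artist)

print(album_tags["album-1"]["mellow"])          # 2
print(artist_tags["artist-x"].most_common(1))   # [('mellow', 2)]
```

The known song-album-artist relationships do the structural work here; the tags themselves stay as free-form as the people applying them like.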