May 24, 2005

Dynamic Growth of Tag Clouds

A persistent theme in the interest in folksonomy, and tagging in particular, is the threat of majority control. When the meaning of a link is determined by the masses, it can be inimical to anyone with a marginal point of view, and, the larger the pool the majority is pulled from, the greater the pressures towards lowest common denominator choices.

You can think of this as the ‘picking a restaurant’ problem. If you and one other person need to agree on a restaurant, you have at least a chance of selling them on the wonders of Uzbeki cuisine, and schlepping out to that hole in the wall in Queens. Once you need agreement among a large enough group, though, it’s Pizzeria Uno for you. In many social situations, scale drives the group to the lowest common denominator.

Last week, I had a chance to look at tagging at scale, related to my Ontology Is Overrated (OIO) piece, and wrote a little script to examine the development of the consensus view of tags. After OIO came out, it was tagged on del.icio.us fairly frequently over the following week. As of this writing, a little over 700 users have tagged it, with 450+ unique tags, roughly two-thirds of which were (of course) used by one and only one user. I used this data set to hack up a Python script that shows me the list of unique tags, sorted by popularity, after any given user tagged the post.

The output looks something like:

1. tagging mooch
2. tagging mooch fordrew
3. tagging ontology mooch fordrew folksonomy clayshirky
4. tagging ontology mooch fordrew folksonomy clayshirky
5. tagging ontology folksonomy clayshirky mooch fordrew

which is to say, after the first user tagged it, the total tag cloud was ‘tagging mooch’; after the second user tagged it, it became ‘tagging mooch fordrew’; and so on. I then sliced this data two ways — I first truncated the list to the top N tags, and then collapsed lines that were identical, so I could tell when a stable set of those N tags arose, and how long it lasted.
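The steps just described can be sketched in a few lines of Python. This is a minimal reconstruction, not the original script: the function names, the sample data, and the tie-breaking rule (tags with equal counts keep their order of first appearance) are all assumptions made for illustration.

```python
from collections import Counter

def cloud_history(taggings, top_n=None):
    """Return the ranked tag list after each user's tagging.

    taggings: one list of tags per user, in chronological order.
    Ties are broken by first appearance (an assumption; the post
    does not say how equally popular tags were ordered).
    """
    counts = Counter()
    history = []
    for tags in taggings:
        for tag in dict.fromkeys(tags):  # count each tag once per user
            counts[tag] += 1
        ranked = sorted(counts, key=lambda t: -counts[t])  # stable sort
        history.append(tuple(ranked[:top_n]))
    return history

def collapse(history):
    """Collapse runs of identical clouds into (first_user, last_user, cloud)."""
    runs = []
    for i, cloud in enumerate(history, start=1):
        if runs and runs[-1][2] == cloud:
            runs[-1] = (runs[-1][0], i, cloud)
        else:
            runs.append((i, i, cloud))
    return runs

# Hypothetical sample data, loosely modeled on the output shown above.
sample = [
    ["tagging", "mooch"],
    ["tagging", "fordrew"],
    ["ontology", "folksonomy", "clayshirky", "tagging"],
    ["ontology", "tagging"],
    ["ontology", "folksonomy", "clayshirky"],
]

for first, last, cloud in collapse(cloud_history(sample, top_n=3)):
    print(f"users {first}-{last}: {' '.join(cloud)}")
```

Each collapsed run shows how long a given top-N set survived, which is the measurement the rest of this post relies on.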

The current top 3 tags for OIO are ‘ontology tags folksonomy’ — these three tags were the top 3 after about 20% of the users tagged the piece. However, it took only 10 users (not quite 1.5% of the current total) before the top 3 tags were ‘tagging ontology folksonomy’, conveying much the same sense, with only the use of ‘tagging’ instead of ‘tags’ making this different from the current set of 3.

The current top 8 tags are: ontology tags folksonomy tagging classification del.icio.us shirky web. Interestingly, below the top 8, the list has never stabilized, with the 9th and 10th terms being, variously, taxonomy, metadata, toread, article, categories, and categorization, even after the top 8 became set.

I’m setting the script loose on some other frequently tagged links, to test the following hypotheses:

Popular tags get set quickly, but not in stone

It only took 10 users for ‘tagging ontology folksonomy’ to sort to the top of the tag list, meaning that even a small group of users can pretty quickly create much of the consensus value around a given link. This is in keeping with the idea of lowest common denominator tagging. However, though this consensus was established quickly, it was not frozen, with the positions among those three words varying, and with ‘tags’ eventually replacing ‘tagging’.

I’ve only found one popular link so far that violates this idea: the original Adaptive Path piece on Ajax. For this link, the tag ‘ajax’ is overwhelmingly #1, with 1171 occurrences from 2352 taggers. (Second place is ‘javascript’, with a mere 644 tags.) Yet over 800 people, more than a third of the total, tagged it before ‘ajax’ hit the #1 spot — it’s as if you can see Ajax becoming a real term as enough people read the article. The Ajax article may be a one-off, or there may be some small but instructive number of links whose consensus view changes slowly, documenting the rise of some new concept.
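One way to put a number on "how many people tagged it before a tag hit #1" is sketched below. The function name and the sample data are hypothetical, and "hit #1" is taken here to mean holding the top count alone, with no tie; that definition is an assumption, not something the data above settles.

```python
from collections import Counter

def first_time_at_top(taggings, tag):
    """Return the 1-based user index at which `tag` first holds the
    top count alone, or None if it never does."""
    counts = Counter()
    for i, tags in enumerate(taggings, start=1):
        for t in dict.fromkeys(tags):  # count each tag once per user
            counts[t] += 1
        if not counts:
            continue  # nobody has supplied a tag yet
        top_count = max(counts.values())
        leaders = [t for t, c in counts.items() if c == top_count]
        if leaders == [tag]:
            return i
    return None

# Hypothetical data: 'javascript' leads early, 'ajax' overtakes late.
sample = [
    ["javascript"],
    ["javascript", "web"],
    ["ajax", "javascript"],
    ["ajax"],
    ["ajax"],
    ["ajax"],
]

print(first_time_at_top(sample, "ajax"))
```

Dividing the returned index by the total number of taggers gives the share of users who had already tagged the link before the eventual winner took the lead.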

Beneath a fairly high threshold, tags remain in flux

This is not in keeping with fears of the lowest common denominator — the deeper into the tag list you go, the less stable the tags and their order are, suggesting that groups, even large groups, have simple consensus views but highly varied overall views of a particular link. More importantly, the larger the group, the larger that variability becomes. As you’d expect with this sort of distribution, the top few tags get ever more popular, even as the tail gets longer. Scale, even scale with strong consensus on a few tags, is not incompatible with variety — in fact, in this situation, scale supports variety.

And this is why the ‘majoritarian tyranny’ argument fails — the relevant unit of opinion is not the user, but the tag, and the variety of tags grows with the number of users. Tagging isn’t voting, in other words, with each user committed to one and only one choice, and the views people share widely are less numerous, considered as unique occurrences, than the views they share with only a few other people, or with no one else at all. Nor is tagging like getting a group to go to a restaurant, because there’s no requirement for the users to converge on a single opinion.

In the same way cities offer more varied experiences to their citizens as they get larger, the popularity of a link increases the tag variability, as well as increasing the likelihood that any tag you’d use will have been added by someone else already. Provided a tagging system is mainly for personal value, with social value as a second-order benefit (as del.icio.us is), then scale increases variability and reduces the constraints of consensus.
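The claim that the head gets more popular while the tail gets longer can be checked for any link by tracking three numbers as users accumulate: the count of the top tag, the number of distinct tags, and the number of tags used by exactly one person. The sketch below assumes the same chronological list of per-user tags as before; the function name and the toy data are hypothetical.

```python
from collections import Counter

def head_and_tail(taggings, checkpoints):
    """For each checkpoint (a user count), return a row of
    (users, top_tag_count, distinct_tags, single_use_tags)."""
    counts = Counter()
    wanted = set(checkpoints)
    rows = []
    for i, tags in enumerate(taggings, start=1):
        for t in dict.fromkeys(tags):  # count each tag once per user
            counts[t] += 1
        if i in wanted:
            top_count = max(counts.values(), default=0)
            singletons = sum(1 for c in counts.values() if c == 1)
            rows.append((i, top_count, len(counts), singletons))
    return rows

# Hypothetical data: everyone uses 'a', but each user also adds something new.
sample = [
    ["a", "x"],
    ["a", "y"],
    ["a", "b", "z"],
    ["a", "b", "w"],
]

for row in head_and_tail(sample, checkpoints=[2, 4]):
    print(row)
```

If both the top count and the single-use count rise from one checkpoint to the next, consensus and variety are growing together on that link.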

Readers are good at finding implicit themes

A number of readers tagged the piece with tags relating to information architecture or the Semantic Web, which are only noteworthy because I intentionally never used those phrases in the piece. The piece is an argument about information architecture and about the Semantic Web, but only by extension, since the idea of predictive classification is core to both of those efforts, not because I took on (or even mentioned) the particular efforts in either of those fields.

This is another answer to Tim Bray’s question: Taggers are good at characterizing material in ways that search engines are incapable of, and tags are thus good for letting you find material whose characterization does not appear in the text itself.

You can see subcultures cycle through the tag lists

During a period of about 120 users’ additions of OIO, 20 or so of them used the tag ‘ia’, putting it between #7 and #10 during that period. Now it is down to #17. This suggests that one or a few IA-oriented sites or mailing lists posted the link, and it got a flurry of attention from those taggers in a narrower window of time. This in turn suggests a conversationally tightly-knit IA community.
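A burst like this one can be located with a sliding window over the chronological list of taggers, counting how many users in each run of consecutive users applied a given tag. The sketch below is illustrative only: the function name, the window size, and the sample data are assumptions.

```python
def windowed_usage(taggings, tag, window):
    """Count the users applying `tag` in each run of `window`
    consecutive users; entry k covers users k+1 .. k+window."""
    flags = [1 if tag in tags else 0 for tags in taggings]
    if window <= 0 or window > len(flags):
        return []
    total = sum(flags[:window])
    usage = [total]
    for i in range(window, len(flags)):
        total += flags[i] - flags[i - window]  # slide the window by one user
        usage.append(total)
    return usage

# Hypothetical data: a short flurry of 'ia' tags, then none.
sample = [["x"], ["ia"], ["ia", "x"], ["ia"], ["x"], ["x"]]

usage = windowed_usage(sample, "ia", window=3)
peak_start = usage.index(max(usage)) + 1  # 1-based first user of the peak window
print(usage, peak_start)
```

A window whose count stands well above the tag's overall rate marks the stretch in which a particular community was passing the link around.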

Something similar happened with variations on ‘socialsoftware’ and my name. Since the first notices that the piece was out were the shirky.com RSS feed and Many-to-Many, people using either social software or my name as a reminder of the source of the information were disproportionately represented early on. As the piece started to get pointed to from elsewhere, those effects faded.

All of these are just hunches, of course, based on looking at a few well-trafficked links. Still to come: more data, and looking at tag growth for links tagged just a few times. But already, the time-slice view is exposing a degree of dynamism in tagging behavior not obvious from looking at representations of the current state of any given set of tags.

14 Comments »

  1. Nice post Clay - interesting results. Have you seen or looked at anything about tags as precursors, foreshadowing issues as they become more popular? Alex over at Future Now posted an interesting question about this last night on the blog there, and it seems to be something worth thinking about.

    Comment by gregburton — May 24, 2005 @ 4:23 pm

  2. No, haven’t done anything like that — the issue there is that predictive rather than descriptive analysis would require a global view.

    (unless, I suddenly think to myself, you were watching large scale RSS feeds — the feed for ‘design’, say, or ‘programming’, and were looking for precursors within those. hmmm…)

    Anyway, will check out Alex’s post, thanks for the heads up.

    -c

    Comment by cshirky — May 24, 2005 @ 4:36 pm

  3. I think, it would be very interesting to find out (i) if people read the full article before posting it on del.icio.us, and (ii) which bookmarklet they used.

    I’m user #11, and I for one didn’t read your article in full before I decided I wanted to bookmark it for later reference. So, my tags would only be an initial guess until I revisited the article.

    And, I did use the “experimental post to del.icio.us” javascript’let which gave me a hint what ideas other users had about your article.

    Maybe there’s a way for you to find out about this from more users… Because this really makes a difference in your, otherwise excellent, analysis. Thanks for sharing it with us!

    Comment by frankwestphal — May 24, 2005 @ 6:41 pm

  4. I have to say I never look at what tags other posters to delicious are using, and I wonder how many people do. You might be able to get a ratio of people using the experimental post to delicious interface. I bet Josh has that figure.

    It would give you an idea of how much follow the leader there is and how much people are just thinking the same. Your analysis suggests at least that there are groups who are perceiving things similarly within group and different across groups.

    Comment by budGibson — May 24, 2005 @ 9:39 pm

  5. […]

    You’re It!
    a blog on tagging

    « Dynamic Growth of Tag Clouds

    Tag Sets Bad, Tag Clouds Good
    by C […]

    Pingback by You’re It! » Blog Archive » Tag Sets Bad, Tag Clouds Good — May 25, 2005 @ 9:29 am

  6. […] efore a certain event, or after a certain event. Independently Clay Shirky was coming at a similar conclusion, although he more focused on temporal changes that seem more si […]

    Pingback by P.S.: » Tagclouds and cultural changes — May 28, 2005 @ 7:28 am

  7. […] us is), then scale increases varibility and reduces the constraints of consensus.” - http://tagsonomy.com/index.php/dynamic-growth-of-tag-clouds/#comments It& […]

    Pingback by getluky.net » Folksonomies Grow Out of Personal Value — June 3, 2005 @ 2:38 am

  8. […] us is), then scale increases varibility and reduces the constraints of consensus.” - http://tagsonomy.com/index.php/dynamic-growth-of-tag-clouds/ It’s my persona […]

    Pingback by getluky.net » Folksonomies Grow Out of Personal Value — June 3, 2005 @ 2:40 am

  9. […] to begin to use and understand the humanity of tags. Metatags are key to meta-knowledge Clay Shirky: “Taggers are good at characterizing material in ways that search […]

    Pingback by the sift everything experiment » Wheelbarrow: Metatags — June 20, 2005 @ 7:05 am

  10. Let’s talk about tagging …

    I have been a small part of some dicussi

    Trackback by aqualung — July 17, 2005 @ 9:38 am

  11. […] s. For more info see… Tagclouds and cultural changes and Clay Shirky Dynamic Growth of Tag Clouds How do you trust tags? Wuffie baby. Cory Doctorow&#82 […]

    Pingback by SmallBiz » Glossary of terms — August 18, 2005 @ 6:25 pm

  12. […] nity of tags. Metatags: first derivative of thought. Metatags are key to meta-knowledge Clay Shirky: “Taggers are good at characterizing material in ways that search […]

    Pingback by the sift everything experiment » Wheelbarrow: Metatags — December 12, 2005 @ 12:06 am

  13. […] s (das Script wird momentan überarbeitet) Artikel Haben Sie schon eine Tag Cloud? Dynamic Growth of Tag Clouds A Tag Cloud Epoch for Freshness Meine Frage beantwo […]

    Pingback by The Blog.ch.Blog » Blog Archive » Was macht eine Tag-Wolke in der Nacht? — December 13, 2005 @ 4:18 pm

