Last week Ian Davis wrote an interesting post on Why Tagging is Expensive:
On the surface tagging seems to offer a new paradigm of organising information, one that reduces the cost of entry and so enables a long tail of participation to emerge. I’ve come to realise that the cost isn’t removed, instead it’s displaced and possibly increased. Tagging bulldozes the cost of classification and piles it onto the price of discovery.
There’s a saying I’ve heard once or twice (I wish I could attribute it): “The cost of metadata is in its application, but the value of metadata is in its use.”
Not exactly something you’ll be quoting at dinner parties, but it nicely captures the cost/benefit gaps of metadata.
The arguments against professional classification (including Clay’s views on tagging) have almost always worked on the cost side of the equation. Automated indexing, search and now tagging are seen as ways to drive down classification costs. But as Davis explains, classification costs are only one part of the system:
In my view the total cost of an information retrieval system is the cost of classification plus the cost of discovery. In the formal classification world you have a very small number of people incurring a high cost in order to reduce the costs incurred by a very large number of people. In contrast the tagging world has the unit costs reversed: it’s cheap to classify, expensive to find. But the numbers of people involved are large in both cases so you end up with a lot of people paying a tiny cost to classify added to a lot of people paying a high price to discover. I think it’s pretty likely that the total cost is going to end up much higher than in the classification scenario.
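Davis’s arithmetic can be sketched in a few lines of Python. This is only a rough illustration: every number below is invented, the unit (minutes of effort) is an assumption, and the function name `total_cost` is hypothetical rather than anything from his post.

```python
def total_cost(classifiers, minutes_per_classifier, searchers, minutes_per_searcher):
    """Total system cost = cost of classification + cost of discovery."""
    classification = classifiers * minutes_per_classifier
    discovery = searchers * minutes_per_searcher
    return classification + discovery


# Formal classification (made-up figures): a few experts each pay a high
# cost so that many searchers each pay a low one.
formal = total_cost(
    classifiers=10, minutes_per_classifier=10_000,
    searchers=100_000, minutes_per_searcher=1,
)

# Tagging (made-up figures): many taggers each pay a tiny cost, and many
# searchers each pay a higher one.
tagging = total_cost(
    classifiers=100_000, minutes_per_classifier=1,
    searchers=100_000, minutes_per_searcher=5,
)

print(f"formal:  {formal} minutes")
print(f"tagging: {tagging} minutes")
```

With these invented figures the tagging system costs three times as much overall, which is the shape of Davis’s claim. The result depends entirely on the assumed per-search cost: the gap narrows, and at one minute per search disappears, as that figure drops.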
Here’s an analogy. I visit a lot of thrift stores. The true cost of an item in a thrift store is a function of the time it takes me to find it, not the price (which is always cheap). A very large thrift store is more likely to have what I want, but at a greater discovery cost. Like del.icio.us, a thrift store is great for serendipitous discovery but not so good for known item retrieval. Put another way, del.icio.us wouldn’t be your first choice if you needed articles on Rousseau and the French Revolution, just like the Sally Ann wouldn’t be your first choice if you needed a smoking jacket, size 42T.
Where I think Davis might be wrong is suggesting that the discovery costs are shifted back to the user. In fact, the costs are shifted to search, blogs and other more efficient discovery tools. In large part this is because the domain of tagging systems has been the “big messy” web.
In that case, the “classical” economics of information retrieval don’t apply because there are often multiple ways of finding things. Or because Google can radically lower your discovery costs by selling keyword advertising to offset their infrastructure. Or because algorithms can do much of the heavy lifting. Or because users’ expectations are for “just good enough” results. Or because users are not interested in finding so much as tracking. And so on.
But I’d argue that once the domain is constrained–by subject, by context, by user population, by privacy/security, by business goals, or by those things in combination–the economic principles of classification and retrieval come back into play. Because other discovery tools are either not available or not optimal, poorly designed retrieval systems do shift the burden back to the user. (Karl Fast’s thoughts on problems in the middle are worth a read here.)
In that middle ground–and the “big messy” web contains probably millions of cases where local structure is valuable, not to mention information systems that aren’t part of the “big messy” web–I think there’s a large area where a mixture of emergent, algorithmic, formal and now social classification systems will make for optimal retrieval.
Taxonomy (formal classification) has the same problem as tagging for the person who didn’t create the taxonomy: They don’t necessarily know what the terms in the tree mean, so the navigation process will be slow until they understand the unwritten structure behind the tree. As you say, search is the savior in either case. The taxonomy or tags can then be used as helpful clues for other, similar content.
Comment by jackvinson — September 20, 2005 @ 9:22 pm
Yeah, I agree with this for the most part, Gene. Right on.
The problem is that discovery of information with professionally classified content in large information retrieval systems (or even small ones) is not necessarily made easier for the average user. At LexisNexis we see only a small percentage of users interacting directly with our taxonomies. These are mostly information professionals (librarians, info brokers, researchers, etc). Why is that? The average user doesn’t know what a controlled vocabulary is and how it can help, even if we put it under their noses. The learning curve to understand taxonomies in IR is high.
How do you present a complex classification system to the average searcher so he or she can interact and use it effectively without much (or any) thought?
This is where the information science community, in my opinion, has failed to innovate and where IAs must excel. Take a look at any large online library catalogue, for instance. Terrible interaction and usability (usually). You need training (e.g. bibliographic instruction) and practice in order to understand it. In other words, the price of discovery of classified information is high for the end user. The contention that it is not is theoretical. Or do you have a practical or concrete example of easy-to-use taxonomies or controlled vocabularies?
Comment by Jim — September 25, 2005 @ 5:00 am
I disagree with Jim’s comments that, contrary to Ian Davis’ initial comments, the price of discovery via classified information is high for the end user.
Of course users have to understand what a controlled vocabulary is and to understand taxonomies before using them. However, this is a ‘one-off’ cost. Once the user has understood the need for such tools (via an information literacy session, for instance) they are better placed to take advantage of them. The price of resource discovery for the user henceforth declines.
Controlled vocabularies (or taxonomies) are tools; a means to an end. For a tool to be useful, one has to understand how it operates in order to take advantage of what it offers. Tagging harbours few rules and therefore its use as a ‘tool’ is limited. Indeed, in stark contrast to a controlled vocabulary, an information literacy session with tagging would not enable a user to improve his/her chances of discovering relevant resources, since the rules of discovery are neither predictable enough nor learnable. Tagging therefore contrasts with controlled vocabularies by imposing a ‘perpetual discovery cost’ on the user, rather than a ‘one-off’ cost. That is not to say that tagging cannot be useful in particular contexts, but high precision searching will never be its forte.
It is true that many users lack the skills necessary to take advantage of advanced searching / browsing options. However, the exponential growth in information literacy instruction and information literacy research reflects the international need for users to understand these tools and, as the IL literature reveals, understanding such tools need not be long or particularly arduous. There are many dedicated librarians and information professionals working extremely hard to ensure their user groups have the best IL skills possible.
It’s also worth noting that Ian’s contention is not merely theoretical, as Jim suggests, but is actually borne out by much empirical evidence and research within the fields of the library, information and computing sciences.
Comment by CDLR — October 17, 2005 @ 9:05 am