July 5, 2007

Introduction: Joe Lamantia

Hello, world! Joe Lamantia here, as the most recent addition to the Tag Team, with greetings and salutations for one and all. Readers of and contributors to this blog may know some of my writing on tagging and tag clouds, which is what brings me to this semi-structured thought collective.

I’ve been active in the user experience, information architecture, and internet technology communities for ten years, before the emergence of tagging as it’s now known, in the era when metadata of any sort was never mentioned in Business Week or the New York Times. Ironically, this history also places me nearer the ‘new’ end of the professional spectrum amongst the posters here at You’re It! It is a group I am privileged to join.

When not thinking about tagging, tagclouds, folksonomies, and social metadata, I enjoy crafting essays about enterprise information architecture, user experience, the design and structure of complex systems, mental models, organizational culture, and the like at joelamantia.com.

October 16, 2006

Jason Toal’s survey results

Hmm, I meant to blog this over a month ago. Jason Toal posted some results of a tagging study I participated in. Apparently SurveyMonkey is a security risk, so he can’t include his results in his academic work. Still, the results provide an interesting snapshot of one small group of people’s use of tags.

In the same email message where he announced the above blog post, Toal also pointed to the My Cassettes project which makes interesting use of tags through its Flickr group and Delicious id.

July 31, 2006

Review of physical tagging practices at this year’s IA Summit

You’re It contributor Gene Smith posted a fascinating study of physical tagging habits from this year’s IA Summit (A micro-study of tagging) on his Atomiq blog.

Not surprisingly, a few of the contributors to this blog (Don, Peter, myself) are found to have conceivably “overthought” things a bit, presenting variations on the tag cloud concept that people find as confusing with ink on paper as they do in the sidebar of many a website.

May 31, 2006

Tagging 2.0 panel at SXSW2006 now a podcast

The Tagging 2.0 panel I organized at South by SouthWest 2006 in March is now a Tagging 2.0 podcast among the many SXSW 2006 podcasts you can download. The Tagging 2.0 panel was one of the “highly-rated panels” this year, tied for first place with a number of other entertaining and informative panels, so check out their podcasts as they become available as well.

May 16, 2006

Social bookmarking in the enterprise

Stacy Surla, an information architect at MITRE Corp., posted to the IA Institute mailing list last week about a social bookmarking system they’ve implemented on their intranet.

She also plugs an upcoming (next week!) Collaborative Web Tagging Workshop in Edinburgh.

In a followup to that same post, Cody Burleson from IBM mentioned that in addition to the Dogear application they use internally at IBM for “delicious-style” page tagging, they’ve also implemented a collaborative “people tagging” tool called Fringe tied to their BluePages corporate directory.

April 28, 2006

Siderean’s tagged facets

Siderean, one of the interesting faceted classification companies, has announced some new capabilities that aim at automating the generation of metadata and that integrate tagging with facets.

The automation comes from entity extraction tools (plus the ability to integrate third party tools, because, frankly, Siderean is not in the entity extraction business) that isolate names of people, places, organizations, dates, etc. from a collection of pages. This addresses one of the real inhibitors of the use of faceted classification: The data has to already be well structured and well tagged. That makes it great for browsing databases but not as good for browsing big piles of unstructured data (= documents).

The system integrates tags in a useful way. Users can tag items and then use tags to further specify searches through the faceted interface. In fact, the tags can be “bucketed” and treated as facets. The tags can be marked as personal or public, and can be associated with groups and other contexts. Yes, the system does integrate with del.icio.us. (Siderean fooled around with this in a beta project called, wonderfully, fac.etio.us.)

Siderean also announced that it’s now using the faceted information to drive analytics. This is really “just” another way of displaying the faceted information. But it can be quite useful because a faceted system has so much data built into it. For example, a library system might know that (and this is a made-up example) there were fifteen times as many books about Iraq published in the past two years as in the previous twenty; it has to know this if it’s going to let users browse for books by subject and then by year (or vice versa). Siderean’s analytics offering follows that of Endeca.
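To make the point about analytics concrete, here is a minimal sketch of why a faceted system already holds those numbers. It is Python written for illustration only, not anything from Siderean or Endeca; the records, field names, and the `facet_counts` helper are all invented.

```python
from collections import Counter

# Hypothetical catalogue records; titles, subjects and years are invented.
books = [
    {"title": "A", "subject": "Iraq", "year": 2005},
    {"title": "B", "subject": "Iraq", "year": 2004},
    {"title": "C", "subject": "Iraq", "year": 1991},
    {"title": "D", "subject": "Gardening", "year": 2005},
]


def facet_counts(records, facet, **selected):
    """Count the values of `facet` among records matching the selected facets."""
    matching = [
        r for r in records
        if all(r[name] == value for name, value in selected.items())
    ]
    return Counter(r[facet] for r in matching)


# Browsing by subject first, then by year within one subject.
by_subject = facet_counts(books, "subject")
iraq_by_year = facet_counts(books, "year", subject="Iraq")
```

The same counts that label the browsing links ("Iraq (3)") are the raw material for a chart, which is the sense in which analytics is another way of displaying the facets.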

Faceted classification is young. It’s exciting watching imaginative companies like Siderean invent new twists and turns right under our eyes.

April 3, 2006

Interview with Gordon Luk (FreeTag)

Nearly ten months ago, at the suggestion of Andy Baio I interviewed Gordon Luk (via IM) about FreeTag, an “Open Source Tagging / Folksonomy module for PHP/MySQL applications” he originally created for Upcoming and announced almost a year ago in his blog.

In the meantime I’ve continually intended to edit the chat transcript into a coherent article and post it here. Unfortunately, a strange thing called “life” has intruded. Then, I ran into Andy in Austin at South By Southwest and my embarrassment over sitting on this dialogue returned to the surface, kicking the to-do back to the top of my list.

I started thinking I should touch base with Gordon again, and find out who else has adopted FreeTag lately and any other news updates or developments but then I realized this was just another form of procrastination. What the web wants me to do is post what I’ve got and then Gordon or anyone else can comment on it, or correct it, or update it, and so on.

So, without further ado, here is my interview with Gordon Luk:

xian: Can you tell me how you got the idea for freetag?

Gordon: Sure! It starts with a discussion of who I eat lunch with, actually. I am lucky enough to work with some really smart guys – among them, Andy Baio, Phil Fibiger, Greg Knauss, Christian Newton, and Jason Stuck.

We got to talking about tagging when the term folksonomy was coined.

I can’t remember exactly who had the idea, but we started discussing cross-site interactions between tags on different platforms.

In what sense?

The idea that you could be browsing puppies on flickr, and perhaps you could extract some of del.icio.us’s puppy-tagged links.

Was Technorati doing their pages yet that show items tagged by several different systems?

At that point, I don’t believe so. We got a few of our other friends involved, including the venerable Leonard Lin. Greg included Leonard Richardson on the email that he sent out that night by mistake, so we got some of his feedback too.

So when did it turn into a plan to actually do something?

Well, first it turned into a wiki.

Naturally…

I started off in the direction of creating a PHP class that would implement a standardized XML-RPC or REST communication layer. Greg was more of a proponent of the actual standard to be implemented by that layer.

At that point, we all got busy and it sat for a couple of months.

During another lunchtime conversation, I came up with the idea for eatlunch.at and made it that weekend.

I wanted to use it as a testbed so I could play with tagging, so instead of building it into the whole site, I made the tagging system generic.

One thing that interests me is the enabling or catalysing idea of not just pumping out yet another site or application but instead producing a plug-in that can be distributed across a whole class of projects.

It seems altruistic in the sense of it’s not yet another system trying to collect my contact info, but on the other hand, I’m surprised people don’t modularize like that more often.

Yeah, that’s absolutely very interesting – I wrote a post not too long ago about how I’m interested in the strange inversion of privacy preferences that we subject ourselves to on social services.

Especially public ones like del.icio.us.

We really wanted to enable cross-communication between sites, because it seemed like such a no-brainer once we started talking about it. Typically, when you’re dealing with hierarchies, every site dev has their own view of the world, and things don’t match too well. With freetagging (the term used back then), it doesn’t really matter, because the classification systems emerge from the utility of the application and data.

It’s interesting how tagging is emerging as a kind of meta-glue for the web (if it is – still not sure).

It’s interesting that tag clouds (and now del.icio.us’s recommended tags) are enforcing community standards for popular tags, because with a distributed system, you’d have that not only on a single site, but you could implement that across a wide range of sites.

There’s a tension there – still not clear where it’s going, but it’s fun to watch it emerge (or in your case, i suppose, help move it along). So, the wiki hosted the debate about how to implement or at what conceptual level to implement the idea?

Yes, it might actually still be around, too. It’s hard to say, because we all worked on it for about a week before getting too busy to do anything about it. It was mostly planning and RFC-style note-taking. It was a lot of design work, no coding involved.

Not even pseudocode?

Well, I guess it depends on your definition of that. I think there was some standard communication XML-RPC samples that were flying around, and there was also some API specs that I wrote up.

so did you just sit down and hack out the first version next?

I actually wrote it the same weekend as I wrote eatlunch.at’s core code. It was pretty crummy at first – had some serious issues with special chars, and just ignored quoted tags entirely, among other problems. But the core was there – the schema and a basic API.

Luckily, i’d been practicing with generalized module development through work. I owe Mike Benoit of phpGACL thanks for helping teach me generalized module style in PHP.

phpGACL is a generalized access control lists module that fits into PHP-MySQL apps. It’s an excellent module for anyone to start with. It’s pretty well separated and very generalized. I’d recommend looking at both that and Freetag, because each does things well in a different way. (I get nerdy when I talk about this stuff, so feel free to let me know if I go too far.)

OK, so was implementing it in Upcoming the next test case after eatlunch.at?

Yes, when Andy asked me if I’d like to help with Upcoming, I was chomping at the bit to implement Freetag and see how well it worked. I implemented the core Freetag API in Upcoming in about an hour and a half.

I had event tagging, listing of tags, and tag clouds all done within that timespan.

It made me really implement the trickier things about writing a tagging system, because Andy’s got such a big user base, I can’t get away with being lazy about certain bugs.

Specifically what did you have to nail down?

I really ended up polishing it up to support quoted tags, better ordering and limits on each API function, and normalization. I also had to rewrite the core to separate raw tags and normalized tags, because Andy wanted it to work like Flickr. But that wasn’t too hard once I understood what it meant.
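For readers wondering what "supporting quoted tags" involves, here is a small sketch in Python (Freetag itself is PHP). The pattern and the `split_tags` name are assumptions made up for illustration, not Freetag's actual parsing code.

```python
import re

# Either a double-quoted phrase (captured without the quotes) or a bare word.
TAG_PATTERN = re.compile(r'"([^"]+)"|(\S+)')


def split_tags(text):
    """Split a tag field on whitespace, keeping double-quoted phrases together."""
    return [quoted or bare for quoted, bare in TAG_PATTERN.findall(text)]


tags = split_tags('php "open source" folksonomy')
```

Here `"open source"` survives as one two-word tag instead of being split into `open` and `source`.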

When developing a generalized API, it’s important to provide as many parameters as possible to your core calls – such as offsets, limits, sort order, and sort direction.

So a limit on each API function in that sense means what exactly?

Such as, show me only 5 tags at once, and start 10 tags down in the list. In that case, 5 is the limit, and 10 is the offset.
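A rough sketch of that limit and offset idea, in Python for brevity. The `list_tags` function and its parameters are hypothetical and only mirror the kind of options Gordon describes; they are not Freetag's real signatures.

```python
def list_tags(tags, offset=0, limit=None, descending=False):
    """Return a sorted slice of tags: skip `offset` items, then take `limit`."""
    ordered = sorted(tags, reverse=descending)
    end = None if limit is None else offset + limit
    return ordered[offset:end]


# Twenty made-up tags, tag00 through tag19.
all_tags = [f"tag{i:02d}" for i in range(20)]

# "Show me only 5 tags at once, and start 10 tags down in the list."
page = list_tags(all_tags, offset=10, limit=5)
```

Exposing sort direction alongside limit and offset lets the calling site build paging, "top N" lists, and reverse listings from one call.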

I understand normalization in a database context but what does it mean when you talk about normalized tags?

It’s a tricky topic – if you look at flickr and upcoming, here’s what we do when someone tags something as “John’s First Movie!” We take that, and normalize it by removing any non-allowed characters, then we lowercase it. Then we store that as an independent tag in Upcoming.

I’m not sure how Flickr does theirs, but in each case, if you’re not the creator of that tag, you’ll see “johnsfirstmovie”. If you’re the actual creator, theoretically you wanted it to be “John’s First Movie,” at least so you can find it again later. So we keep that as a raw tag.
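The raw versus normalized distinction can be sketched like this. It is a Python illustration under stated assumptions: the allowed-character set, the record layout, and both function names are invented here, and the real Freetag and Upcoming code may differ.

```python
import re

# Assumed default: keep only ASCII letters and digits.
DEFAULT_ALLOWED = "a-zA-Z0-9"


def normalize_tag(raw, allowed=DEFAULT_ALLOWED):
    """Drop every character outside `allowed`, then lowercase the result."""
    return re.sub(f"[^{allowed}]", "", raw).lower()


def display_tag(record, viewer):
    """The tag's creator sees the raw form; everyone else sees the normalized one."""
    return record["raw"] if viewer == record["owner"] else record["normalized"]


raw = "John’s First Movie!"
record = {"owner": "john", "raw": raw, "normalized": normalize_tag(raw)}
```

Storing both forms means two people who type the tag differently still land on the same normalized tag, while each keeps the spelling they chose.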

Unfortunately, FreeTag doesn’t go completely normalized between raw and normalized tags, for performance reasons. So it’s not perfectly normalized, but it’s close.

I adjust most of the API functions to handle that so you don’t get duplicates, but that’s a bit technical, you probably don’t need to worry about that.

Sadly, Delicious doesn’t do that, so I have tags there called “foo and bar”

One of my recent Freetag releases implemented a feature where you can pass in all of your configuration parameters to the constructor of the class. That means you don’t have to go in and edit config files each time you upgrade.

One of the cool things that lets you do is keep around your custom valid characters pattern, so you can pick your normalization scheme for yourself.

That lets you keep dashes, underscores, spaces, or even high ascii (for internationalized sites) in the normalized format, if you want it.
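Here is one way the "pass your configuration to the constructor" idea might look, sketched in Python. The `TagNormalizer` class and its `valid_chars` and `lowercase` options are hypothetical names chosen for this example, not Freetag's actual option names.

```python
import re


class TagNormalizer:
    """Holds the normalization scheme, so no config file has to be edited."""

    def __init__(self, valid_chars="a-zA-Z0-9", lowercase=True):
        # Anything outside the valid set is stripped from the tag.
        self._strip = re.compile(f"[^{valid_chars}]")
        self.lowercase = lowercase

    def normalize(self, raw):
        cleaned = self._strip.sub("", raw)
        return cleaned.lower() if self.lowercase else cleaned


default = TagNormalizer()
# A site that wants to keep dashes, underscores and spaces in normalized tags.
permissive = TagNormalizer(valid_chars=r"a-zA-Z0-9_\- ")
```

Because the pattern lives in the calling code rather than inside the module, upgrading the module does not overwrite a site's chosen scheme.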

I wonder if the web helps force you to plan ahead that way, as it is such a moving target of an environment. It’s almost never a good idea to nail things down too literally.

It’s one of the biggest challenges of developing a generalized module like Freetag. You really need to think ahead and make sure that it’s as generic as possible, so that people don’t have to hack into it themselves and potentially lose their modifications every time they want to upgrade.

It’s all so meta-

Yeah, it’s definitely pretty meta and kinda hard. I have a newfound respect for open source software maintainers.

Has the Upcoming user base given any feedback to you or Andy?

Yes, they actually ended up filing a bug about the tag normalization on the wiki. I ended up explaining it, and they moved it to its own page.

Meaning they thought the feature was a bug?

Yes, that’s what happened. I know that a lot of people really liked the contributions I made to Upcoming, just based upon the press when we released.

So that is a bit of intelligence into what people expect and what confuses them (I’m thinking like a UI/IA guy now).

Hehe, yeah, it confuses people when their perspective doesn’t match that of others. But I think you’ll see that more and more on the web, especially as sites get more complex.

Yeah, for sure. User-experience is a series of tradeoffs. It’s easy to stand off to one side and say it should be optimized for users just like oneself.

The other major things I’ve worked on with Upcoming have been the REST-like API, and the invite feature.

REST-like, does that mean not 100% RESTful?

Hah, I’m specifically using that word, because I know guys who bring up all the time that our API isn’t fully RESTian. AFAIK, there are very few fully RESTful web applications out there that are popular.

Everyone makes tradeoffs – like what happened with Backpack and their $_GET and google web accel fiasco.

Yeah, fundamentalism is never pretty.

I made sure to use $_POST instead on the state-changing calls, which turned out to be the right move. However, I didn’t design with the verb/noun aspect of REST, so I hear that all the time.

People are always mailing in, who don’t understand POST. It’s hard, because everyone understands how to construct a url and make a GET request.

So as far as making an easy platform for beginners to write apps upon, GET is probably the way to go.

In the beginning, it was written, that the HTTP should have four verbs, and Tim Berners-Lee saw that it was good.

Yes, but not even cURL implements DELETE. That’s why I don’t fix that bug.

Yeah, I think I’d be wary of using DELETE outside of a totally secure web app environment, and even then I’d have second thoughts.

Well, I overload POST to DELETE for me, but you’ve got to authenticate, etc. But it’s a tricky subject, and I figure by saying REST-like instead of RESTful, I kind of avoid it.
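The GET versus POST tradeoff discussed above can be sketched as a tiny dispatcher. This is a Python illustration only: the action names, status codes, and the `dispatch` function are assumptions, not Upcoming's API.

```python
# Hypothetical actions that change state on the server.
STATE_CHANGING = {"add_tag", "remove_tag"}


def dispatch(http_method, action, authenticated):
    """Return an (HTTP status, message) pair for a request to a REST-like API."""
    if action in STATE_CHANGING:
        # Deletion is "overloaded" onto POST instead of using the DELETE verb.
        if http_method != "POST":
            return 405, "state-changing calls must use POST"
        if not authenticated:
            return 401, "authentication required"
        return 200, f"{action} done"
    # Read-only calls stay reachable with a plain GET URL.
    return 200, f"{action} ok"


status, message = dispatch("GET", "remove_tag", authenticated=True)
```

Keeping deletes off GET means anything that merely follows links, such as a prefetcher, cannot trigger them.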

REST-esque

That’s a good one.

It is interesting that you need to think about these things when you’re developing for such a wide potential base.

Yeah, it’s a lot more challenging, because I really want to do things the right way. That’s why i’m lucky to get emails from people smarter than me, telling me how to do things better.

Ok, so have there been any other (significant) implementations yet? I imagine that Upcoming really promoted the hell out of FreeTag, relatively speaking.

A few pretty cool ones – Blogskins implemented it over on their site really quickly too.

I’ve gotten some emails from people planning on using it, and when those go public I’ll be sure to announce it on the mailing list.

It could really speed up adoption of tagging.

OK, let’s take one step back and let me ask you where you think all this tagging is leading us, with the cross-platform tagging idea or maybe other things (that i can’t really imagine, yet) that might be built on top of a heavily tagged web.

Well, I think we’ll start to see tagging systems interoperate once the first person gets out the gate in implementing a tag communication standard. Maybe that will be me, I’m not sure.

But once that happens, I think we’ll see convergence on a wider scale into a really interesting set of tags.

What will that enable beyond the obvious ability to tag more than one kind of thing with the same gesture?

Really freakin big tag clouds.

I’m being a little facetious, but that is actually where you might see things go.

If you’ve ever seen Flittr, it kind of consolidates tagging systems in a one-off way, taking one tag and finding samples in different systems. It’s just kind of slow, unfortunately.

I’ll check it out – sounds interesting at least as a proof of concept.

I personally don’t have time to do this right now, but it would be awesome to have a tag thunderstorm, where you can browse a global tag cloud aggregated from many sites, and then dig down into individual ones.
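A toy version of the "tag thunderstorm", assuming each site can already report its own tag counts. It is sketched in Python; the site names, the counts, and both helper functions are invented for illustration.

```python
from collections import Counter

# Made-up per-site tag counts.
site_clouds = {
    "photos.example": Counter({"puppies": 40, "sunset": 12}),
    "links.example": Counter({"puppies": 7, "php": 30}),
}


def global_cloud(clouds):
    """Sum every site's counts into one aggregate cloud."""
    total = Counter()
    for counts in clouds.values():
        total.update(counts)
    return total


def drill_down(clouds, tag):
    """Show which sites contribute to a tag, and by how much."""
    return {site: counts[tag] for site, counts in clouds.items() if counts[tag] > 0}


storm = global_cloud(site_clouds)
puppy_sites = drill_down(site_clouds, "puppies")
```

The hard part in practice is the one Gordon names: getting the sites to agree on a way to exchange those counts in the first place.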

That does sound pretty cool! But don’t we already have problems with tag clouds (scaling, imposing norms on people vs. harnessing self-interest…)?

I don’t really mind tag clouds that much. In my API, the function that generates one is called silly_list.

Well, they are sort of a stab at the kinds of interfaces we’ve been waiting for for 20 years or so, with an almost 3-D sense of space, relative importance, closeness, etc.

Yeah, totally. I think sometimes it’s just popular to be contrarian.

I don’t think we’ll see the death of hierarchy anytime soon.

You just have to look at how hard it is sometimes to dig data out of niche wikis.

When there aren’t that many people tagging a set of stuff, it’s not really that useful.

Do you think folder-like hierarchies and free-tagging complement each other well?

Absolutely. Both are useful – in some ways, it’s kind of the opposition between Google and Yahoo.

I think tag systems are just the collapsed leaves of individual categorization trees, right? That’s totally my nutshell view of what’s going on.

Sure, in a sense, and they do overlapping well without a lot of either duplication or aliasing.

You’re basically flattening then merging personal hierarchies.
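The "flatten, then merge" picture can be written down in a few lines. This Python sketch assumes a personal hierarchy is a nested dictionary whose innermost values are lists of items; the folder names, items, and function names are all made up.

```python
def collapse(tree):
    """Map each leaf folder name to its items, discarding the path above it."""
    flat = {}
    for name, child in tree.items():
        if isinstance(child, dict):
            # A folder of folders: recurse and keep only the leaves.
            for leaf, items in collapse(child).items():
                flat.setdefault(leaf, set()).update(items)
        else:
            flat.setdefault(name, set()).update(child)
    return flat


def merge(*flattened):
    """Union several people's collapsed hierarchies into one tag space."""
    merged = {}
    for flat in flattened:
        for tag, items in flat.items():
            merged.setdefault(tag, set()).update(items)
    return merged


alice = {"animals": {"puppies": ["photo1", "photo2"]}, "travel": ["photo3"]}
bob = {"pets": {"puppies": ["photo2", "link9"]}}
tags = merge(collapse(alice), collapse(bob))
```

Alice files puppies under "animals" and Bob under "pets", but once the paths are dropped their items meet under the single tag "puppies".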

Well this is a lot for me to chew on. Thanks for taking the time out to talk to me.

Thanks for asking me to talk about it!

My pleasure, and we can thank Andy for suggesting it too. I’ll be keeping an eye on your stuff, I’m sure.

Sounds great. It was a lot of fun talking about it, and I’ll look forward to seeing what comes from it!

…and, scene.

Gordon, I apologize for taking so long on this. In the end I figured the conversation works better than any sort of “article” I could have turned it into.

March 29, 2006

Introduction: Thomas Vander Wal

I am Thomas Vander Wal and I am pleased to be invited to have a place to post and chat about the subject of tagging. I have spent much time pondering, playing with, researching, and developing tools with light metadata and tagging since at least the early 90s. I had seen tagging as only partially working, with too many faults to be practical. That was until I came across del.icio.us. Del.icio.us seemed to change the usefulness of tagging: where tagging had been completely messy before, having an identity tied to the tag applied to the object being tagged allowed for an easier means to derive clarity. Not long after, Flickr added tagging to their incubating photo sharing tool, although Flickr does not provide an easy means for everybody to tag information, nor an easy means to see all the items an identity has tagged.

It was around this change in tagging that Gene Smith on the Information Architecture Institute asked what we should call this tagging that is bottom-up, social, and emergent along the lines of del.icio.us and Flickr. At this point I chimed in folksonomy, which quickly turned into a meme after Gene blogged about it.

Since that point I have been keeping a collection of others’ tagging and folksonomy works bookmarked in my and writing about it on my personal blog Folksonomy :: Off the Top and my more formal blog Folksonomy :: Archives :: Personal InfoCloud.

March 11, 2006

Notes on Beyond Folksonomies at SXSW

I posted my notes on the Beyond Folksonomies panel at SXSW at my “The Power of Many” blog.

Update: Scot Hacker took notes on this panel too.

February 16, 2006

Tag Summit

This year’s Information Architecture Summit is chock full of sessions on tagging. Here are some highlights for those of you that haven’t yet spent your spring conference budget:

Tagging and Beyond: Personal, Social and Collaborative Information Architecture
I’m moderating this panel on social information architecture, looking at everything from tagging to collaborative filtering as ways groups of people can organize information. I was lucky to find excellent panelists: Rashmi Sinha, Scott Golder, Mimi Yin and danah boyd.

Tags and facets, tags and languages: a case study.
“Peter Van Dijck will discuss a project (a public website) that attempts 2 things: to combine a folksonomy with a faceted approach, and to localize a folksonomy.” (I think this is about Mefeedia, which has a nifty example of faceted tags.)

From Pace Layering to Resilience Theory: the Complex Implications of Tagging for Information Architecture
Karl Fast and Grant Campbell “will present a framework for adapting theories of complexity, pace layering and resilience to the question of tagging and folksonomies, and their influence on the practice of information architecture.”

Exploring the context of user, creator and intermediate tagging
This paper examines the differences in the context of user, author and intermediary assigned keywords or tags using the social bookmarking sites Citeulike and Connotea.

The life of tags
Anthony Charles and Jason Toal’s paper “will discuss the potential of adding metadata to the metadata through weighting of tags or sequencing of tags through more formal structures.”

And these are just the sessions that are directly about tags. Many others, like Rashmi’s session or Travis Wilson’s, will talk about tagging in the context of other subjects. I suspect tagging will also be on the agendas of keynoter David Weinberger and provocateur Peter Morville.

This year’s summit is in Vancouver, March 24 to 27. The price is great ($600 for three days of sessions) and if you’re a member of STC-ID, IAI, UPA, AIGA, or SIG CHI you can register for a steep discount.