Tagging is tedious
Imagine tagging everything digital that you own/use, so that it becomes part of a continuous data landscape that you navigate using your tagapps.
Sounds great but it isn't going to happen.
The tagapps will be there, and so will a range of data storage devices and a large collection of personal digital assets needing to be tagged. What isn't going to happen is the manual tagging of all of it.
Yes, we are all excited about the social possibilities of tagged data on the web. Flickr is fun, and actually useful if you have a lot of photos to share. Del.icio.us is useful even if you don't share your tags. But for folksonomies to become the dominant mechanism for data humanization, the act of tagging has just got to move beyond typing keywords at a computer. I am not suggesting we wait till someone perfects the Vulcan mind meld with the computer. There are simpler, more immediate ways to take the tedium out of tagging. Herewith some of my suggestions.
(A word from our sponsors ... Take the tedium out of the tag, use ExciTag(tm) - feels great, less typing ... )
Tagging by search terms.
Merge the search client and the tagging client so that search results come pre-tagged with the search terms - then you get to add/subtract what you want. The search terms typed in once get re-used. DRY - don't repeat yourself.
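A rough sketch of what that merge could look like, in Python. The function names and the idea of results being plain identifiers are my own assumptions for illustration, not any existing search or tagging API:

```python
def pretag_results(query, results, existing_tags=None):
    """Seed every search result with the terms that found it.

    `results` is assumed to be a list of item identifiers (file names,
    URLs, ...); `existing_tags` optionally maps an item to tags it
    already carries. Both shapes are hypothetical.
    """
    terms = [term.lower() for term in query.split()]
    existing_tags = existing_tags or {}
    tagged = {}
    for item in results:
        tags = set(existing_tags.get(item, ()))
        tags.update(terms)  # the search terms get re-used as tags
        tagged[item] = tags
    return tagged


def adjust_tags(tags, add=(), remove=()):
    """Let the user add/subtract tags after the automatic first pass."""
    return (set(tags) | set(add)) - set(remove)


tagged = pretag_results("Sunset Beach", ["img1.jpg", "img2.jpg"])
print(sorted(tagged["img1.jpg"]))  # ['beach', 'sunset']
print(sorted(adjust_tags(tagged["img1.jpg"], add=["hawaii"], remove=["beach"])))
# ['hawaii', 'sunset']
```

The point is only that the typing happens once: the query does double duty as the first draft of the tags.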
Voice Tagging.
I'd like to be able to say keywords when I look at a photo or listen to music or read a file and have my words translated into text tags. This includes words like "Oooh cool" which I might say when I look at a photo, read an email or look at a neat piece of code. I would never type these words as tags, but when they are captured from my speech, the emotion of the moment is captured too. They become "emotitags" - tagging data with snippets of emotion. This is feasible on the Mac today. Maybe someone is already working on it.
Vector Tagging.
When large quantities of text need to be tagged, one rough but feasible mechanism comes from text retrieval techniques - word frequencies. Generate the word frequencies for a chunk of text, eliminate the usual "junk words" - like "a", "an", "the" ... - and then pick the top, say, 5 most frequently occurring words as the tags. Very crude but a good first pass. Combine this with full text search and fielded search and you really have something ... I think.
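Here is a minimal Python sketch of that first pass. The stop-word list is a tiny illustrative sample I made up, and the tokenizer is deliberately naive (lowercase runs of letters only), so treat it as a starting point rather than a real text retrieval pipeline:

```python
import re
from collections import Counter

# A small, illustrative set of "junk words"; a real list would be much longer.
STOP_WORDS = {
    "a", "an", "the", "and", "or", "but", "of", "to", "in", "on", "at",
    "for", "with", "is", "are", "was", "were", "be", "it", "this", "that",
    "as", "by", "i", "you",
}


def vector_tags(text, top_n=5, stop_words=STOP_WORDS):
    """Return the top_n most frequent non-junk words in text as tags."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(
        word for word in words
        if word not in stop_words and len(word) > 1
    )
    return [word for word, _ in counts.most_common(top_n)]


sample = "The cat sat on the mat. The cat ate the fish. A cat is a cat."
print(vector_tags(sample, top_n=1))  # ['cat']
```

Words that tie on frequency come back in whatever order the counter saw them, which is one more reason to call this crude.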
Sample frame tagging.
How do you tag a video stream that may have semantically disparate chunks of content? Here's another wild-eyed suggestion. There's technology today that will digest a video stream in real time and output representative frames for large segments of video, in effect summarizing a video stream with a single image. A larger stream would have a number of images in sequence representing the segments in the stream.
Now tag the summary images like you would a Flickr photo, add another tag common to all that signifies the stream they all come from, and add emotitags if you wish.
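The tagging half of this is simple enough to sketch in Python. I am assuming the frame extraction has already happened elsewhere and that each summary image arrives with whatever tags a person gave it; the "stream:" prefix for the shared tag and both function names are made up for illustration:

```python
def tag_summary_frames(stream_id, frame_tags):
    """Attach a shared stream tag to each summary frame's own tags.

    `frame_tags` holds one list of tags per representative frame, in
    segment order. Producing the frames themselves is out of scope here.
    """
    stream_tag = f"stream:{stream_id}"  # hypothetical naming convention
    return [
        {"segment": index, "tags": set(tags) | {stream_tag}}
        for index, tags in enumerate(frame_tags)
    ]


def segments_with_tag(frames, tag):
    """Find which segments of the stream carry a given tag."""
    return [frame["segment"] for frame in frames if tag in frame["tags"]]


frames = tag_summary_frames(
    "vacation2005",
    [["beach", "sunset"], ["airport"], ["beach", "oooh cool"]],
)
print(segments_with_tag(frames, "beach"))  # [0, 2]
```

The shared tag is what lets you get from any one summary image back to the whole stream.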
Collective tagging by attention filtering.
What if a population of users is attracted to one particular image out of a collection of images, or to a URL on the net? Does it make sense to use their tags as a way to tag this item? Take the set of tags that is common to all the users interested in a photo. Does that common set of tags qualify as valid metadata to attach to the photo? There's an intriguing and possibly valid hypothesis here - to wit - "if a collection of observers filters an item out of a collection of items then the intersection of their metadata is valid metadata for that item". Simply put - if I pay attention to something it must resonate with my interests, i.e. its metadata must have something in common with my metadata. If a large group pays attention to something (i.e. collectively filters it out of the background noise) then it must resonate with something they all have in common (intersection of metadata). Is "resonate with" tantamount to "have the same tags as" - that is the question.
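The intersection itself is easy to compute, which makes the hypothesis cheap to test. A minimal Python sketch, assuming each interested user's tags are available as a set; the `min_share` knob is my own addition, not part of the hypothesis as stated, for when a strict intersection over a large group turns out empty:

```python
from collections import Counter


def common_tags(tags_by_user, min_share=1.0):
    """Tags shared by the users who paid attention to an item.

    With min_share=1.0 this is the strict intersection of all users'
    tag sets. A lower min_share (an assumption of this sketch) keeps
    tags used by at least that fraction of the users.
    """
    user_count = len(tags_by_user)
    if user_count == 0:
        return set()
    counts = Counter(
        tag for tags in tags_by_user.values() for tag in set(tags)
    )
    return {tag for tag, n in counts.items() if n / user_count >= min_share}


tags_by_user = {
    "ann": {"sunset", "beach", "hawaii"},
    "bob": {"sunset", "beach"},
    "cy": {"sunset", "surf"},
}
print(sorted(common_tags(tags_by_user)))                 # ['sunset']
print(sorted(common_tags(tags_by_user, min_share=0.6)))  # ['beach', 'sunset']
```

Whether the tags that survive actually describe the item is exactly the open question; the code only produces the candidates.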
I'm sure there's more I haven't thought of and that smarter people have been thinking about for a long time. Any thoughts?
Tags: voicetag vectortag emotitag searchtag

8 Comments:
It seems like you are describing two, possibly three dominant styles of "tagging":
internal
Term vectors, sample frames. The "tags" are inherent in the thing being tagged, regardless of the person doing the tagging.
personal/external
Search terms, voice emotive. The tags here are personal to the tagger but external to the thing being tagged. One is the path the person took to reach the resource, the other is the person's reaction.
public/external
Attention filtering, collaborative filtering. Tags here seem to be a factor of crowd behavior.
The personal & external ones interest me a great deal, and seem most closely aligned to what you mean by "Data 2.0." An individual's subjective experience -- their activities or feelings -- at the time they are moving past resources becomes a valuable search cue for finding those resources again. Time is already such a cue, as with hunting through e-mail archives near strongly-remembered dates. Location could be such a cue, if the $#@% mobile service carriers stopped hoarding geo data. Emotion still makes me a little uncomfortable. Not sure I want GSR recording equipment attached to me all day.
Michal,
Yes indeed! I had not thought of it that way but you are absolutely right to slice it in terms of scope - internal vs personal/external vs public/external.
Re: time - I did mention timestamps in the earlier blog ("Web 2.0 needs Data 2.0"), but location is also an obvious data attribute to use. Which leads to the question --
" What's the spatial equivalent of timestamp?"
"placestamp" ?? , "geostamp" ?? :-).
Nice post and a nice blog - I'm subscribing. I wrote something similar about the melt of search terms and tags here: Coming to terms with tags - so needless to say I agree on that part.
Voicetagging and other ideas are cool also. Clarifies the need for open systems and standards so that instead of writing about voicetagging you can just go ahead and write the voicetagging module itself for our benefit :)
Your ideas for taking the work out of tagging are good ones. Looking at people's del.icio.us tags, I've found most tags fall into these categories: attributes, actions, and identity/affinity.
attributes
Characteristics of the thing being tagged. Some are intrinsic ("internal" according to migurski above) and some are personal ("cool", etc). Like you said, it might be possible to automate intrinsic tags using text analysis.
actions
Labels things according to what I want to do with them (ex: for_project_x, read_later, to_do, etc.) Personally, I think existing tag systems need to support these "action tags" directly. Users shouldn't have to pollute the taxonomy with made up tags in order to support their workflow.
identity/affinity
The article's idea of "attention filtering" is what will drive tagging into the mainstream. If everything you tagged was automatically tagged with characteristics of who you are (40-ish software engineer living in Chicago) and your affinity groups (Texas-born, film buffs, graduated from UT Austin in 1983) then tagged objects become incredibly more valuable.
Ex: Today, searching for "chicago + pizza" returns Chicago-style pizza places anywhere. Instead, imagine how much better the results would be if you could look for "pizza + tagger-lives-in:chicago". Google can't touch this.
There's a load of money to be made by someone who offers subscribing/tagging based on these identity/affinity properties.
OK, but here's the catch: privacy
Users will not participate unless they can be assured, up front, that their personal metadata is shared privately. Any next-generation tag service that misuses personal metadata indiscriminately, say by pushing the wrong kind of "targeted ads", will scare away the public forever.
No one wants to be "outed" by Google Ads.
A few ideas that came up while reading your article:
AJAX and tagging
I think it is important to offer an extra benefit for the user that tags something. For instance if he adds tags, the app instantly gives "related items" (as Matt showed).
Distributed tagging (referring to your article "tagging is tedious")
Often I come to a website asking "what is this website all about?" Then I click my bookmarklet for delicious, which suggests a few common tags for this website. This should be a Firefox extension or similar: for each website (or even each element of that website), the app should search for tags for that item. This feature will not be implementable if all these tagging webservices are kept centralized as they are now. Another crucial thing: I want to have the tags of "my people". Therefore I need some kind of community building (or some web of trust thingie).
Nitin,
A late comment, but I just found this blog. Please have a look at Simpy, as it already implements some of the features you mentioned (features 1, 3, and 5), some of the features that previous commenters mentioned (e.g. related tags), and some other interesting ones, such as the notion of related users.
Philipp,
are you looking for this: http://del.icio.us/help/firefox/extension ? :)
Nitin, you put forth a lot of good ideas, and then some terrific discussion ensues. Rather than wait for someone else to take up these ideas, are you working on implementing any of them? I think you, and the community that comments to your posts, could potentially do something really interesting. I've done a bit on a tagging app, but it's still a pretty crude first pass.