Joho the Blog

What is a folksonomy anyway?

After poking around at Thomas Vander Wal’s site and writings (1 2) and at Wikipedia, it turns out that I think I’m wrong about what “folksonomy” means. (And, yes, I’ve poked around both before.)

Thomas seems to call a folksonomy any set of uncontrolled tags (no prior taxonomy, no controlled vocabulary) done by individuals and posted in public. Thomas likes the term especially if the taggers are supplying tags that make sense to them as individuals, rather than suggesting ones they think will work socially. I assume that means that he thinks that if (oddly) there were absolutely no shared tags at Delicious.com, he’d still count it as a folksonomy.

I had been thinking that a folksonomy is one way order emerges from such a set of tags: Some are more popular than others. In fact (I’d thought), if you really want a folksonomy to develop, you give people some feedback about how others are tagging the same or similar objects. (Thomas has a useful distinction between broad and narrow folksonomies that applies here.) That feedback causes further crystallization around particular terms. There are other, non-folksonomic ways order may emerge, including using heuristics to cluster tags.
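To make the feedback loop concrete, here is a minimal sketch of how a site might show a tagger the tags others have already applied to the same object. The triples, URLs, and the `popular_tags` helper are all made up for illustration; this is not how Delicious or any other real site is known to do it.

```python
from collections import Counter

# Hypothetical data: (user, object, tag) triples, as a free-tagging
# site might store them. All names and URLs are invented.
taggings = [
    ("ann", "http://example.com/a", "folksonomy"),
    ("bob", "http://example.com/a", "folksonomy"),
    ("cat", "http://example.com/a", "tagging"),
    ("dan", "http://example.com/a", "toread"),
    ("ann", "http://example.com/b", "taxonomy"),
]

def popular_tags(taggings, obj, limit=3):
    """Return up to `limit` tags for one object, most frequently used first."""
    counts = Counter(tag for _, o, tag in taggings if o == obj)
    return [tag for tag, _ in counts.most_common(limit)]

# Tags a site could offer as feedback while someone bookmarks the same object.
suggestions = popular_tags(taggings, "http://example.com/a", limit=2)
print(suggestions)
```

The raw list of triples is the uncontrolled tag set; the ranking is the emergent order. Showing the ranking back to taggers is what would drive the further crystallization.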

The difference is this: 1) Any site that enables free tagging is necessarily a folksonomy, or 2) a folksonomy is a way of developing a bottom-up taxonomy from the tags being generated at such a site.

If a folksonomy is #1 not #2, then we need a name for #2. And I have the perfect name for it: A folksonomy. :)

By the way, you can see the difference between the two in the Wikipedia article on folksonomy. It equates “folksonomy” with tagging. In the section of the article that presents problems with folksonomy, the criticisms are criticisms of tagging. For example, the first problem cited is that folksonomies “encourage idiosyncratic tagging…” But by what I thought was the definition, folksonomies actually help solve the problem of overly-idiosyncratic tagging. I was on the verge of “correcting” the Wikipedia article when I realized my definition was off.

19 Responses to “What is a folksonomy anyway?”

  1. Let me see if I can help…

    In short, a folksonomy is a set of uncontrolled tags that individuals provide for their own retrieval of an object and that are shared publicly. (I have blogged this: http://www.vanderwal.net/random/entrysel.php?blog=1750 )

    So if del.icio.us had no shared tags it would not be a folksonomy. They need to be shared, but it is also important that the tags a person applies to the information and/or media they are consuming are applied for their own retrieval. This means that Gmail, while tagged for one’s own purpose, is not a folksonomy, as these tags are not shared with others; they are just tags.

    The discussion that triggered the coining of folksonomy was about what made del.icio.us and flickr (the two initial examples we looked at in the IAI listserve) different from all the tagging efforts that came before them (tagging seems to have been around on the web since 1997 or ’98). The big thing that was different, from say Bitzi, was people tagging information in their own vocabulary for their own reuse. Tagging information for others as a priority seems to make it far less accurate, as a person may not understand the terms they are using (well, understand them as others may).

    There is a second element in this that seems to be growing more important to me the more I have looked at folksonomy. This second element is that the individual’s voice is important. A person’s understanding of the vocabulary terms they use is not wrong in their eyes; it may change over time (possibly to come closer in line with the social norm). But this is also how language changes and emerges. We know there are varying ways people request a can of Pepsi in the United States based on their regional linguistic inheritance or their cultural terminology adoption. If you were to ask that person to put a label on the Pepsi so they (themselves) could retrieve it and other like products out of an automated warehouse, that would generate their own personal usage.

    There are a couple of things that this point of reference provides in a folksonomy: the cultural perspective as well as the discipline perspectives the person brings with them. A person labeling that Pepsi may put the *pop* tag on it if they are from the Chicago area, but may also add *soda* because that is what the local culture calls it, she has been living there a few years, and she has been adopting that term out of practice/absorption so she is better understood. She also works in the food service industry, so she also labels it *carbonated beverage*, as these are stored differently than still beverages. She understands that the context from which she comes at this automated retrieval may influence the term she uses to make the request.

    In talking with people who seem to be connectors in del.icio.us, this is the approach that they use. They tag things with more than one tag as a thing may have many facets, but they also have more than one vocabulary they use because they interact with various cultures and/or disciplines. Shifting vocabularies, be it languages or terminology based on an academic or professional discipline, is sometimes tough. As I spent a fair amount of time at conferences in non-English language countries this Fall, I noticed this happening a lot in person. A person’s slides may be in English and they present in English, but their native language is Dutch. When they got to question and answer sessions they might get a question in Dutch and respond in Dutch. But if they were asked to repeat the answer in another language they had a difficult time shifting. Two or three people stated they needed to respond in the language the question was asked in, as that is how the answer formed in their head, and it would take some rough translating to provide the response in the other language. Each of these people was very bright. I immediately thought of folksonomies and people’s refindability problems with their own bookmarks.

    I have spent a lot of time informally talking to people who use social bookmarking tools. The people who have problems with refindability of their own bookmarks have used few tags (three or fewer) on each bookmark, and they have usually tagged it based on what they thought others would call it (after all, it is social bookmarking). Those that love the tools and have an easy time refinding their bookmarks often have five or more tags per bookmark, tag for their own recall, and tag from more than one perspective (slang they use, cultural perspectives that they participate in, and discipline). In our Pepsi retrieval example the person may professionally deal with the bar staff and wait staff that have their own slang, which she uses when she interacts with them, and that staff uses *p-carb* or *C* as that is its label on the bar hand wand (C for cola, and Pepsi is the only cola line they sell).

    These people, who bring different perspectives and tag for their own recall based on what may be at the forefront of their mind when they are trying to recall something, become the connectors in a folksonomy. These people are building the thesaurus for the rest of us.

    The crystallization that comes around the terms in a folksonomy may be representative of a few variables. One is the community that is tagging. Each tagging service seems to have a slightly different community associated with it, therefore giving value to the community itself. The homogeneity of the tagging could be representative of the community, or it could reflect the stability of the term associated with the object being tagged. The patterns built for the majority (the spike in network effect terms) are interesting to me. This could be the taxonomy, which may have already been built (the folksonomy reinforces the taxonomy).

    But what I find of more interest is the value of connecting people to objects they are interested in based on terms farther down (and out in long tail terms). The popular terms may be the ones that have made the object findable already.

    But with the folksonomy we are getting a broader view of the real terms that people use to find the information. If the individual who has tagged the object is kept distinct from everybody else, as in del.icio.us (it can be a real name or a cloaked name to protect the real person’s identity), we can use that to filter out people whose terms we personally do not agree with (or who could be attempting to spam), or choose to follow them just for that term. Taxonomies are limited, as there is never enough money or time to make them complete (particularly for a large collection of people). Even within an enterprise business your taxonomy will never be fully accurate. Let’s take the 80-20 rule, the Pareto Principle, as a rough guide. We built our taxonomy with a budget to build for success for 80 percent of the organization’s population with 20 percent of the effort needed to get to 100 percent. We nailed this according to the testing metrics we set. This means that 1 in 5 people will have difficulty finding the information they need easily. If I am running a business, that is a horrible ratio for efficiency. Fortunately we normally are better off than that, but still it is a long way from perfect.

    What the long tail of the folksonomy provides is two things. One is it fills the gaps in the taxonomy built on authority (authority is only valid for the group(s) it is working to support). The other is emergent vocabulary can be identified and used/integrated.

    I have worked on many intranets in the past 10 years. The biggest problem for findability is that the vocabulary does not work for everybody (this is also a problem behind the lack of use of libraries, as people know the book exists, but the library does not organize the book by what they call it). If we let people tag things by what they would call them so as to find them again, we all gain.

    So, in short, a site that enables free tagging is a folksonomy (if the tagging is for a person’s own recall) and it is a way to develop results that can be used to build a bottom-up taxonomy. A taxonomy is normally generated by authority. In a folksonomy the authority is the individual person and their understanding of their own vocabulary for their own recall. To turn the folksonomy into a taxonomy will take a little more work, but the laborious part of the work has already been done (the tagging/labelling/metadata).

    On Wikipedia… I have modified the entry to state that folksonomy is a subset of tagging. Tagging has been around a long time (at least a dog year). Not all tagging is a folksonomy. I agree that the Wikipedia criticisms of folksonomy are actually problems with tagging. Folksonomy has three data points that can be used to build solutions to the problems. The three data points are outlined in the Broad Folksonomy: 1) a clear understanding of the object being tagged, 2) the individual tag used, and 3) a distinct identity of the person tagging. Things look messy until you see there are patterns based on people tagging. The people tagging can be grouped based on the same tags applied to the same objects, and then we have a community of interest with a similar vocabulary; it can be used rather well to find information if your interests are close to theirs.

    I think your definition may be correct as what I just stated does solve the problem of overly-idiosyncratic tagging. But, idiosyncratic tagging fills the gaps that exist in authoritative taxonomies.
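The grouping described in the response above, where people are clustered by the same tags applied to the same objects, can be sketched with the three data points as plain triples. This is only an illustration under assumed data: the people, objects, and the `shared_vocabulary` helper are hypothetical, and a real service would likely use a more forgiving similarity measure than exact overlap.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (person, object, tag) triples: the three data points.
triples = [
    ("ann", "pepsi", "pop"),
    ("ann", "pepsi", "soda"),
    ("bob", "pepsi", "pop"),
    ("bob", "coke", "pop"),
    ("cat", "pepsi", "carbonated beverage"),
    ("cat", "coke", "carbonated beverage"),
    ("dan", "coke", "pop"),
]

def shared_vocabulary(triples):
    """Count, for each pair of people, the (object, tag) pairs both have used."""
    used = defaultdict(set)
    for person, obj, tag in triples:
        used[person].add((obj, tag))
    overlap = {}
    for a, b in combinations(sorted(used), 2):
        common = used[a] & used[b]
        if common:  # keep only pairs with at least one shared (object, tag)
            overlap[(a, b)] = len(common)
    return overlap

overlap = shared_vocabulary(triples)
print(overlap)
```

Pairs with a non-zero count are candidates for a community of interest with a similar vocabulary. Because the tagger's identity is kept as its own data point, the same structure also allows filtering out or following one person's use of one term.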

  2. Oh, hit post rather than preview. I was rearranging sections and editing in this wonderfully restrictive textbox for comments (good in that it normally restrains people from long responses).

  3. For me, ‘folksonomy’ is tarnished by very much the ambiguity you (DW) identify – it’s not always clear whether it stands for the practice of social tagging (neutral), or for a potentially interesting emergent property of social tagging (speculative), or for the way we can imagine social tagging working some way down the line (visionary), or for a belief that social tagging is the next big thing (marketing). (It’s similar to the uncertainty that rises, like a miasma, when anyone uses the words ‘Wikipedia’ and ‘authority’…) I tend to use Peter Merholz’s ‘ethnoclassification’ for the neutral & speculative meanings & reserve ‘folksonomy’ for when I’m commenting on the visionary/marketing stuff. See also here and here.

  4. David, I think that learning what others tag an item is secondary to capturing what an individual actually calls something. The enculturation process happens as the individual joins communities of understanding through vocabulary terms (a folksonomy can act as a thesaurus) and is quite powerful as that, but I really believe that is secondary to the real value.

  5. First, Thomas, thank you for the extended and very helpful post.

    Thomas, it seems to me that this is a discussion of the scope of the definition.

    I don’t think you want to build into the definition of “folksonomy” the idea that they have to be built out of tags created for an individual’s own use (i.e., not tags people create because they think others will find them useful) because then Delicious is no longer a folksonomy (because it influences us by showing us the most popular tags as we’re tagging a bookmark). Nor do I think you want to define it so that it equates to a tag space (such as all the tags that people create at Delicious) because then we have to say things such as “Flickr uses heuristic clustering techniques to order its folksonomy.” The lovely term “folksonomy” still seems to me most useful if it is ONE way that order is brought to a tag space.

    Now, back to the point you raise in your second msg. When you say “secondary,” in what sense? Do you mean that you think capturing tags done without regard to what tags others are using is a better way of tagging? In what ways? I can think of ways in which social tagging better serves the needs of the taggers. Or do you mean that it is secondary and thus not a part of the definition of “folksonomy”?

    If the definition of “folksonomy” includes that it results from individualist tagging, then we’re going to have to come up with another word for what happens when people tag socially. But that just seems silly. Don’t we want to be able to argue about which is the better way to do a folksonomy, but still admit both are folksonomies? If so, individualistic tagging cannot be part of the definition of folksonomy.

    So, you have my double plea: 1. I hope the def doesn’t equate folksonomies with tag spaces but instead refers to one particularly felicitous way of organizing a tag space; 2. I hope individualistic tagging isn’t built into the def, even if such tagging is far and away the better way to tag.

  6. “I don’t think you want to build into the definition of ‘folksonomy’ the idea that they have to be built out of tags created for an individual’s own use (i.e., not tags people create because they think others will find them useful) because then Delicious is no longer a folksonomy (because it influences us by showing us the most popular tags as we’re tagging a bookmark). Nor do I think you want to define it so that it equates to a tag space (such as all the tags that people create at Delicious) because then we have to say things such as ‘Flickr uses heuristic clustering techniques to order its folksonomy.’ The lovely term ‘folksonomy’ still seems to me most useful if it is ONE way that order is brought to a tag space.”

    I’ve worked on the assumption that what Tom Coates calls a bubble-up folksonomy is an analytic construct describing a social ordering of information. Tom summarizes it as follows:

    “…there are concepts in the world that can be loosely described as being made up of aggregations of other smaller component concepts. In such systems, if you encourage the tagging of the smallest component parts, then you can aggregate those tags up through the whole system. You get – essentially – free metadata on a whole range of other concepts.”

    As I understand Thomas Vander Wal’s point, it is that the properties of tags resulting from individuals attempting to keep track of information on the Web that they want to find again provides more information value to a folksonomy than the properties of tags done with someone else’s information concepts in mind. It is more of a “this is what I think about that” than a “this is what I think about that compared to what you think about that.” It seems to me that the information value of the first is that it aggregates across the Long Tail with no bias towards the tall end, since people are not second guessing their own concept of the content they tag; they aren’t “fitting” their tag into the tail. The latter approach to tagging, where the person doing the tagging does so with the fit of their tags to existing tags in mind, will bias the aggregation, clustering, process towards the tall end of the tail. Coates points to this problem in his discussion of the phonetags project at the BBC and the difficulties in discerning the brand (pop, alternative, etc.) of a specific radio show if the aggregation process counts all tags. His thinking is that popular songs are tagged a lot but do not necessarily represent the character of the show.

    I would also point to CiteULike, which aims to help academics keep track of the academic papers they read on the web. As people add readings to their libraries CiteULike extracts citation details and asks the person to create tags for the entry. People can make their personal libraries public and see who else is reading the same literature. CiteULike claims, “this can help you discover literature which is relevant to your field but you may not have known about.” If the aim of the tagging is to bring organization to the Long Tail, having people tag with other people in mind is more likely to reproduce the conventional understanding of what academics read and bias the tagging towards the tall end of the tail.
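The bubble-up aggregation quoted from Tom Coates in the response above, along with the bias toward heavily tagged songs, can be sketched roughly as follows. The show, songs, and tag data are invented, and the `per_song_once` option is just one illustrative way to dampen the bias; it is not presented as what the phonetags project actually did.

```python
from collections import Counter

# Hypothetical data: the songs one radio show played, and the tags
# listeners applied to each song (one list entry per tagging act).
show_songs = {"late_show": ["song1", "song2", "song3"]}
song_tags = {
    "song1": ["pop", "pop", "pop", "pop"],  # one heavily tagged hit
    "song2": ["alternative"],
    "song3": ["alternative", "indie"],
}

def bubble_up(show, per_song_once=False):
    """Aggregate song-level tags up to the show that played the songs."""
    totals = Counter()
    for song in show_songs[show]:
        tags = song_tags.get(song, [])
        # Optionally count each tag at most once per song, so a single
        # popular song cannot dominate the show's profile.
        totals.update(set(tags) if per_song_once else tags)
    return totals

raw = bubble_up("late_show")
per_song = bubble_up("late_show", per_song_once=True)
print(raw.most_common(1), per_song.most_common(1))
```

Counting every tagging act lets the one popular song define the show as "pop", while counting each tag once per song lets "alternative" surface, since two of the three songs carry it.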

  7. First, to be clear: Whether or not individualistic or social tagging is best, my point is that the term “folksonomy” should not refer exclusively to either.

    Second, I’m not saying that only social tagging is best. Larry, you point to great examples of where and why individualistic tagging is better. But there are times and ways when social is useful also. E.g., in a purely individualistic world, you end up with lots of “to_read” tags (i.e., personal reminders to get around to reading a particular bookmark) that are of little social, aggregative value. In corporate environments, where there is a conscious effort to build high-value tagstreams, social tagging can be crucial. So, it isn’t either/or. In fact, I wish Delicious et al. let me designate some tags as private and some as public.

  8. Dave, what would be the benefit of private tagging?

  9. >>what would be the benefit of private tagging?
    Such a collectivist! :-)

    Tripp, private tags benefit from not needing to be generic enough to be useful to the world at large. I can tag something ‘UofR_Buddies’ and know what it means years down the road, but it’s relatively opaque (or even possibly misleading) to anyone who didn’t attend the University of Richmond from 1988-1992, though of nontrivial use to me.

    In terms of the practice rather than the particulars of private tagging, one benefit is that stuff I decide to keep around for my own purposes and don’t intend to share doesn’t pollute the data others are intending to aggregate.

    For example, if I go around looking for, say pictures of lighthouses around the country so I can compare and contrast to lighthouses I’m likely to see during a forthcoming trip to Boston, then it makes sense for me to tag them “Boston,” but if I can’t keep my tagging private then I will be peeing in the public Boston tag pool if I don’t go out of my way to name them “pix_for_RCM_boston_trip_06” or something similarly painful and unambiguous.

    -Rich

  10. I am the collector. Are you the key master?

    Okay, enough with the marshmallow jokes.

    But, as a collectivist…and a folk musician, part of the enjoyment is the demanding work of sifting through the variety of permutations, reiterations and copies in order to sift out something akin to a source.

    So, as I understand what has been written here, narrowing a folksonomy too far keeps it from being “folksy” if you will. One can, yes, argue that it is created by folk, but the point is the aggregation, no? I mean, the aggregation becomes more useful as unexpected crossovers occur. For example, one may google (Is that a folksonomy aggregator?) “folksonomy, Weinberger” and this thread will appear. One may also google “Ghost Busters” and the same will appear…though with fewer overall connections and thus be number 1,876,364,765 out of 2,000,000,000 possible hits.

    It is the crossover that makes the internet and folksonomy powerful.

    I think that this is the place of the internet. It should be “folksy.” Intranets and old style ethernets are perhaps better places to narrow aggregations.

  11. I must disagree. The internet is bigger than any single purpose. It serves narrow and wide aggregation equally well. We might usefully argue the purpose of a site like Delicious, but arguing that the internet is “for” anything is like saying “air is for breathing.” It leaves out all sorts of other uses, and imposes a very myopic definition.

    The internet, for example, serves as a perfect vehicle for me to store a bunch of private information out on a server that I must access from several machines. A very private use. Then there’s e-commerce. A one-to-one conversation between a retailer and a customer. Private by necessity, and unaggregable thanks to the encryption involved.

  12. There are going to be so many permutations and approaches to making sense of private tags and public tags (where “private” = done for oneself, but still accessible by the public). This will all intersect with social networking software, analyzing tag usage among users who know one another and users who share interests and demographics. We need, imo, room for both public and private. That’s why I’d like social tagging sites to allow me to designate a tag as private.

    And that’s why I really hope that “folksonomy” does not come to mean only private tagging. It’s a much more useful term if it comes in flavors.
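As a rough sketch of what designating a tag as private could mean in practice: each tagging carries a flag, the public aggregate skips flagged entries, and the tagger still sees all of their own tags. The `Tagging` record, its `private` field, and the sample data are hypothetical; no existing site's feature is being described.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class Tagging:
    user: str
    url: str
    tag: str
    private: bool = False  # hypothetical flag, assumed for this sketch

taggings = [
    Tagging("rich", "http://example.com/lighthouse", "Boston", private=True),
    Tagging("dave", "http://example.com/lighthouse", "lighthouse"),
    Tagging("tripp", "http://example.com/lighthouse", "lighthouse"),
    Tagging("dave", "http://example.com/lighthouse", "to_read", private=True),
]

def public_counts(taggings):
    """Aggregate only the tags their authors chose to share."""
    return Counter(t.tag for t in taggings if not t.private)

def tags_for_user(taggings, user):
    """A user's own view includes both private and public tags."""
    return sorted(t.tag for t in taggings if t.user == user)

print(public_counts(taggings))
print(tags_for_user(taggings, "dave"))
```

Personal reminders such as "to_read" and trip-specific labels such as "Boston" stay useful to their authors without entering the shared pool.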

  13. David,

    Is the kind of social tagging you describe as high-value in corporate environments typically designed for knowledge management purposes? I would not include folksonomies as a part of knowledge management, though I suppose they might be useful to that challenge on intranets. I just don’t think you get emergent taxonomies that way, though I’m admittedly not very clear myself on what an emergent taxonomy looks like.

    I find the approach of Jill Walker intriguing myself. She treats folksonomies as instances of feral hypertext. Take the following quote from her essay, “Feral Hypertext: When Hypertext Literature Escapes Control.” She describes the point of view much better than I can:

    “What feral hypertexts have in common is that they have reverted to the wild, in one respect or another. They are no longer tame. They won’t do what we expect and they refuse to stay put within boundaries we’ve defined. They don’t follow standards – indeed, they appear to revel in the non-standard, while perhaps building new kinds of standard that we don’t yet understand.”

    That new kind of standard she refers to could very well be taxonomies resulting from the emergent order of folksonomies.

    Doesn’t a recognition that folksonomies are an emergent order bring the discussion back to the question of whether hypertext subverts hierarchy? Walker’s conception is that early hypertext systems were domesticated, bred in the captivity of mainframes, whereas on the web some forms of hypertext have gone feral. “Feral hypertext is no longer tame and domesticated, but is fundamentally out of our control.”

  14. Larry, is domesticated hypertext public? Meaning, can I link to it on my blog and have instant access?

  15. Delicious is a public folksonomy: When you enter a tag, it shows you the other popular tags for that bookmark. It also shows you your own, so it’s private as well. It works both ways.

  16. yes.this is my site http://yasamohuel.goldenelf.com/ultracet/pdr_ultracet.html Thanks.

  17. Tripp, you have to work to domesticate this stuff. Check out James Governor’s piece on tag gardening…

    http://www.redmonk.com/jgovernor/archives/001186.html

  18. Good idea about the private versus shared tags! I’ve struggled with the purpose of tagging because some people do it selfishly and others do it for the benefit of others. I’ve documented my thinking of folksonomies here.

    Let me ask you this…why would you want private tags that you would not share with others?

    My answer would be that I sometimes create tags that are used for blogging purposes. That is, I might create a “forblog” tag which is the label picked up when I send an rss feed to my blog. This label, however, benefits no one except myself. In fact, it muddies the waters a bit.

    Therefore, I think separating the tags has some benefit.

  19. i visit here rarely, because most of the content is too technical for this retired librarian. i have, however, followed the classification conversations with some interest. i was happy to see the NCSU (north carolina state) information, because i retired from there many years ago, having served my time in the interim before the arrival of the wondrous susan nutter.

    i am a regular daily kos visitor and have been amused at the ground-up folksonomy evolution there. i don’t know if david w is in touch with kos, but it would be nice to have him visit and post an explanation of how tagging develops, long tail and all.

