Monday, May 02, 2005

Reinventing the wheel

Librarians and information scientists have studied various methods of information organization and retrieval for a number of years. Thus, when articles like this are published, I am both discouraged that our work has been ignored and encouraged that the work we do continues to have value today:

'Tags' Ease Sifting of Digital Data

This concept of "tagging," though, is nothing other than indexing. Consider Lancaster's definition of the process of indexing: "...the indexer describes [a document's] contents by using one or several index terms, often selecting them from some form of controlled vocabulary. ...the terms assigned by an indexer serve as access points through which an item can be located and retrieved in a subject search..." (Indexing and Abstracting in Theory and Practice (2003)).

Traditionally, indexing has been in the hands of experts or been automated to a certain degree. So, is user indexing, as described in the "tagging" article, a new concept? No. Lancaster notes the following research: "Fidel (1994) uses the term 'user-centered indexing' to refer to the principle of indexing on the basis of requests expected from a particular audience. Hjorland (2001) agrees that indexing must be tailored to the needs of a particular clientele... writers such as Shatford (1986) and Enser (1995) point out that collections of images can be viewed quite differently by different groups of users. Thus, each group has different indexing needs. This led Brown et al. (1996) to suggest the need for a 'democratic' approach to indexing, with users of the images adding their own terms to a record where necessary and appropriate." Others Lancaster cites include Desser (1997) and Villarroel et al. (2002).

(The vocabularies that emerge from these user indexing systems are known as "folksonomies." This has been a hot topic in the blogging community this past spring.)

Lancaster has written:

"While new faces and new approaches are always welcome, it is unfortunate that many now working the field [of information retrieval] have absolutely no previous background and, thus, no firm foundation upon which to build. ... Many ideas appearing today have obvious antecedents in the literature of 30 or 40 years ago, yet these pioneering works are completely unknown to current investigators."

Perhaps the best example of this ignorance is the popularity of metadata. Many people think this is a new idea, but metadata, after all, is only structured information used to describe some resource. Catalogers/librarians have been creating "metadata" for years to describe books. As Milstead and Feldman (1999) write, "...librarians and indexers have been producing and standardizing metadata for centuries."

It is popular to portray librarians as graybeards and the work that they do as outdated in the information age. But if "metadata" and "tagging" and "folksonomies" are any indication, the time-tested principles of information organization practiced by librarians and the research into information retrieval are as relevant as ever today.

Might I recommend to everyone Lancaster’s excellent Indexing and Abstracting in Theory and Practice (2003)? He is one of my professional heroes. His historical perspective and comprehensive literature reviews are important for the work that librarians, "information architects," information scientists, and indexers do.

[If you want the complete citation information to any of the above references, let me know.]

Alec

4 comments:

Barbara said...

It's always amusing to see how people dream up ways to do what we've done for 100 years and say "hey, guys - guess what! we can make things easier to find!" And even more amusing to watch the language evolve. Cataloging - boring! Taxonomies, tagging, folksonomies - way cool. I mean, tagging is like guerilla art, right? Taxonomy sounds significant and deep, and folksonomies is one of those words you think you should know but you're not quite sure what it means so you nod and try to act as if you're hip to it. Whereas cataloging belongs in those libraries that for some odd reason are always described as either dusty or musty. (Achoo!)

I may just have to check out this hero of yours. I have to confess I tend to equate Lancaster with his notorious "paperless society" prediction. He was 90% right and 100% wrong.

Barbara

Alec said...

Lancaster graciously recanted in the article "Second Thoughts on the Paperless Society." Library Journal, Sept. 15, 1999. p. 48-51.

Barbara said...

Yes, I remember Lancaster's "oh, hang on a moment..." article. Another interesting take on this whole issue is the book The Myth of the Paperless Office by Abigail J. Sellen and Richard H. R. Harper (MIT, 2001) - has some very interesting things to say about paper (and about Melvil Dewey, by the way...)

As for what Brad said about disrupting librarians' authority - since we have no authority over things that we don't put into our catalogs (e.g. our subscription indexes, much less the Web) is it a big deal to disrupt it? I'm not being snarky here - just curious. Is the idea that a person can improve the findability of things threatening to librarians? I don't think so. It's just ... well, yes, actually, we had thought of that...

Now, whether we've adequately found ways of doing it well is another matter. Our catalogs are not easy for students to use, especially when they don't know what they are looking for. Which Umberto Eco says is the purpose of a library - to help you find things you don't know you are looking for.

Barbara

Alec said...

Brad, as I mentioned, this idea isn't new. Lancaster cited, for example, Brown et al. (1996), who proposed user indexing nearly a decade ago.

Alec