User-based tagging is the association of an item, record or set of metadata with a specific reference, adopting user vocabulary as opposed to controlled vocabulary. Generally the intention is to augment the metadata by associating it with specific headings or titles for the purpose of information seeking and retrieval. It is much debated whether this serves only to increase chaos, making it even harder to find the needle in the haystack and therefore making the evaluation of retrieved information less well defined. On the other hand, it is also argued that tagging serves the widespread need for speed and serendipity in information seeking, as the vocabulary used is relevant and concise. The aim of this paper was to compare and evaluate current research findings on user-based tagging that support each claim, and to provide some direction for further investigation.
There is a plethora of literature on business evolution and technological advances. Focusing instead on the social growth of the world wide web in the production and accessibility of information, it was relatively easy to find recent research covering the history of tagging, its growing validity, and where it may take the world of information organisation and retrieval. Current research centres on the following issues:
- Tagging has become a social phenomenon since 2003, and has evolved rapidly since then due to the advent and evolution of Web 2.0 tools and the development of technology such as smartphones, operating systems and accessibility
- Library of Congress Subject Headings (LCSH), as authority-controlled indexing, have become insufficient and ineffective as information seeking has evolved, and search facilities and techniques have adapted to the impatience of their users
- The validation of folksonomies (user vocabulary) by institutions, and the use of third generation applications such as LibraryThing to augment catalog metadata and welcome user enhancement, have expanded chance information retrieval
- The development of the semantic web, with multiple facets allowing some monitoring of tagging quality, is driving the development of metadata management
- Serendipity as a thing of the future?
- Metadata Management Systems as repositories located on the web as an answer to information organisation and resource discovery service effectiveness
RQ1: What evidence is there to suggest that controlled vocabulary has become insufficient and ineffective as an IR tool?
RQ2: As user vocabulary or folksonomies become validated by HE institutions, can the quality of tagging be monitored? What would it bring to the world of resource discovery? What problems could tagging potentially cause?
RQ3: Are we moving towards a semantic web? Is serendipity the information seeking of the future and how will resource discovery maintain its effectiveness?
There is a clear concern with the user augmentation of records. Tagging records with user keywords has positive possibilities both for information retrieval and for bibliographic records management as an industry (metadata management): in terms of cost effectiveness, user effectiveness and quality, it has the potential to bring constructive and encouraging growth to information management. However, given the limitations of folksonomies with respect to accuracy and relevancy, chaotic user headings may have a negative impact on resource discovery.
Pros of user-based tagging and its positive possibilities:
- Speed of searching
- Lack of knowledge or erudition is irrelevant
- Serendipity and its values and advantages
- Folksonomies can complement LCSH and be used in a library catalog
- Value added metadata augmentation creates extra usage in catalog
- Moves metadata management forwards with collaborative purpose, value and facilitation into social indexing on the web
As Kwan Yi (2009, p.897) postulates:
Collaborative tagging commonly relies on post-coordination and presents a user-centred view; professional indexing with controlled vocabularies involves pre- or post-coordination and a system-centred view. Thus, the linking of two such resources is valuable in that it can integrate the views of both the users and systems in indexing and information organization.
There is a strong leaning towards combining controlled indexing vocabulary or thesauri with user-generated tags to create hybrid metadata for LMS item records. Thomas (2009, p.431) notes that:
A hybrid catalog combining both LCSH and a folksonomy would result in richer metadata and be stronger than the sum of its parts, giving users the best of both worlds.
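The hybrid record described above can be illustrated with a minimal sketch. It assumes a hypothetical `CatalogRecord` structure that stores controlled headings and user tags side by side, and a hypothetical `search` function that reports which vocabulary produced each match. The record titles, field names and the substring matching rule are all illustrative assumptions rather than features of any particular LMS or of LibraryThing.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class CatalogRecord:
    """Hypothetical item record: controlled headings and user tags kept separately."""
    title: str
    lcsh: Set[str] = field(default_factory=set)       # controlled subject headings
    user_tags: Set[str] = field(default_factory=set)  # folksonomy tags added by users


def search(records, term):
    """Return (title, source) pairs where the term matches a heading or a tag.

    Matching is a case-insensitive substring test, chosen only for brevity.
    """
    needle = term.casefold()
    hits = []
    for record in records:
        in_lcsh = any(needle in h.casefold() for h in record.lcsh)
        in_tags = any(needle in t.casefold() for t in record.user_tags)
        if in_lcsh and in_tags:
            hits.append((record.title, "both"))
        elif in_lcsh:
            hits.append((record.title, "lcsh"))
        elif in_tags:
            hits.append((record.title, "tag"))
    return hits


# Invented sample data for illustration only.
records = [
    CatalogRecord(
        title="Example title A",
        lcsh={"Information retrieval"},
        user_tags={"search", "findability"},
    ),
    CatalogRecord(
        title="Example title B",
        lcsh={"Cooking"},
        user_tags={"retrieval"},
    ),
]

for title, source in search(records, "retrieval"):
    print(title, "matched via", source)
```

Keeping the two vocabularies in separate fields means the user tags augment the controlled headings without overwriting them, and a search can still show whether a hit came from the authority-controlled side or from the folksonomy.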
Cons of user-based tagging and its limitations:
- Lack of controlled language and potential impact on databases of LMS – can there be order in such chaos?
- Lack of authority control resulting in the absence of synonym and homograph control
- How is it better than LCSH? What are the benefits over using LCSH?
- Can it realistically be used effectively and without chaos in a library catalog?
- Is there any value to serendipity? What are the disadvantages?
- There are no set criteria or standardisation for tagging
- Can it be effective in resource discovery? Can applications such as LibraryThing bring anything to current metadata in the library catalog?
- Metadata access and augmentation v. ownership
- Question of accuracy
- Question of quality
- Resource discovery relevancy and evaluation
Quality and access are paramount in records creation and management, and maintaining a form of standardised indexing has been the crux of organising chaos. Within semantic metadata management, however, the developing concept is to collate item-level metadata from diverse institutional collections via OAI, enhance each record to improve its quality, and then make it accessible and available for augmentation within a central repository for use by resource discovery services. These records, rather than being owned or created at an institutional level, would be produced by the item creators and ‘pointed to’ by the service when required. The user augmentations would revolve around institutional needs in this case. This would benefit serendipitous information seeking by relevant users, as noted by Foster and Ford (2003, 59(3), p.337):
Despite the difficulties surrounding what is still a relatively fuzzy sensitising concept, serendipity would appear to be an important component of the complex phenomenon that is information seeking. In the present study, it emerged as an important aspect of how researchers encounter information and generate new ideas – from interviews which neither focused on nor anticipated it.
However, the question of the benefits of folksonomies over LCSH highlights the issue of criteria. Can any real criteria be used for tagging? Can suggested tagging be positive? Can it match up to the largest general indexing vocabulary in the English language: a rich vocabulary that covers all subject areas, contains cross-references across terms (rich links) and offers synonym and homograph control? A variety of studies reviewed in the literature found irregular usage of keywords in user-based tagging. Kwan Yi (2009, p.875) highlights this:
The prevalent use of single word tags over multiword tags, and that of noun tags over other grammatical forms ... irregular tags formats, such as singular and plural forms of noun tags, abbreviations, acronyms, and homographs ... [and that] the importance of formal guidelines for the purpose of instructing users in the creation of tags [would be stressed].
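Some of the irregularities listed in this quotation, such as case differences and singular and plural forms, can be partly reduced by normalising tags before they are compared. The following is a minimal sketch under stated assumptions: the function names `normalise_tag` and `group_variants` are hypothetical, and the plural rule is deliberately naive (it would mishandle words such as "series" or "analysis"). It does nothing for abbreviations, acronyms or homographs, which is where the authority control of a vocabulary such as LCSH would still be needed.

```python
import re
from collections import defaultdict


def normalise_tag(tag):
    """Reduce a raw tag to a crude comparison key (illustrative rules only)."""
    key = tag.strip().casefold()
    key = re.sub(r"[\s_\-]+", " ", key)  # treat spaces, underscores and hyphens alike
    # Naive plural handling, an assumption for illustration:
    # "libraries" -> "library", "books" -> "book", "glass" left unchanged.
    if key.endswith("ies") and len(key) > 4:
        key = key[:-3] + "y"
    elif key.endswith("s") and not key.endswith("ss") and len(key) > 3:
        key = key[:-1]
    return key


def group_variants(tags):
    """Group raw tags that share the same normalised key."""
    groups = defaultdict(set)
    for tag in tags:
        groups[normalise_tag(tag)].add(tag)
    return dict(groups)


# Invented sample tags for illustration only.
raw_tags = ["Books", "book", "BOOK ", "sci-fi", "Sci_Fi", "Libraries", "library"]
for key, variants in group_variants(raw_tags).items():
    print(key, "<-", sorted(variants))
```

Grouping by a normalised key leaves the users' original tags intact while letting a catalog treat obvious variants as one access point, in the spirit of the formal guidelines the quotation calls for.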
The exploration of these questions has generated tentative explanations that require further investigation. These hypotheses in themselves look towards examination and discovery:
H1: With the evolution of technology and its accessibility, speed rather than accuracy or even relevancy is the paramount requirement of the user, and the role of information providers is to facilitate those current user needs.
H2: Resource vocabulary is constantly evolving, and therefore the use of folksonomies to augment catalog metadata, rather than replace it, is a prudent move forwards in the world of resource description.
H3: Metadata access and augmentation versus ownership and rigidity is the new driving force in resource discovery and description.
The chaotic and subjective tagging of information by users is not only difficult to standardise, quantify and research qualitatively; matching it to the current LCSH controlled vocabulary is also complex.
There is not yet enough literature on semantic web development and knowledge organisation for third generation applications to explore thoroughly the issues and challenges facing contemporary knowledge organisation design and implementation.
Access to the world wide web has made the business of knowledge organisation more widespread, encompassing knowledge managers, information managers, social communities and web development teams. Their concerns do not lie solely in the storage and retrieval of information, but include the creation and dissemination of digital information.
Resource description is in a state of flux. With the continued delay of RDA, the branching into the semantic web, and the notion of metadata management systems to supplement library management systems, it is itself a chaotic ‘world’ of political correctness and professional politics that will need accuracy in its resolution.
To continue to be approachable by their users, library IR tools will need to adapt and align to social tools as well as to other indexing, communication and research tools. The W3C has developed the Resource Description Framework (RDF), a standardised model for encoding knowledge and information on the web so that it is readable by machines, and therefore by LMS. It is clear, then, that not only syntax but also semantics are now critical in resource discovery.
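To make the idea of machine-readable description concrete, the sketch below writes a few RDF statements about one item in the N-Triples syntax, using only plain string formatting. The item URI, title and subject values are invented for illustration, and the helper names `literal` and `describe` are hypothetical. The `title` and `subject` properties are taken from the Dublin Core Terms vocabulary. Recording controlled headings and user tags alike as plain-text subjects is a simplifying assumption, not a recommended cataloguing practice.

```python
# Dublin Core Terms namespace, used here for the title and subject properties.
DCTERMS = "http://purl.org/dc/terms/"


def literal(value):
    """Escape a Python string as an N-Triples plain literal."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def describe(item_uri, title, subjects):
    """Return N-Triples lines stating the title and subjects of one item."""
    lines = [f"<{item_uri}> <{DCTERMS}title> {literal(title)} ."]
    for subject in sorted(subjects):
        lines.append(f"<{item_uri}> <{DCTERMS}subject> {literal(subject)} .")
    return lines


# Hypothetical item and values, for illustration only.
for line in describe(
    "http://example.org/item/1",
    "Example title",
    ["tagging", "Information retrieval"],
):
    print(line)
```

Each line is a subject, predicate and object statement that software can parse without knowing anything about the layout of a catalog page, which is the sense in which semantics as well as syntax become part of resource discovery.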