April 08, 2005
FAST: A Subject Headings Schema Designed for the 21st Century
Welcome to the OCLC presentation. [ed. note - Must find new introduction]
OCLC Research Objectives
- Data Mining
- Understanding Users
Research Areas of Interest to Me
- Knowledge Organization and Semantic Web
- Authority Control
- Metadata Schema Transformation
FAST is a collaborative research project among OCLC, the Library of Congress (LoC), and ALA/SAC. FAST stands for Faceted Application of Subject Terminology.
Google is the 21st Century
Latent Semantic Indexing and Page Rank
Library Catalogs are the 19th Century
LCC, DDC, LCSH, Card Catalogs, Controlled Vocabularies
Library catalog technologies have been "digitized," and all of them are expensive compared to Google.
The question is which technology we should apply to reach the grey literature.
FAST is attempting to position itself as the answer: somewhere between these two technologies, leveraging the strengths of both.
FAST is a new approach to Subject Vocabularies
- It is cheaper and easier to use than LCSH for electronic objects, and compatible with a variety of the new metadata schemas
- It is simple in structure and syntax
- It is usable by non-catalogers in non-library environments
- It is designed for semantic interoperability
- It is an adaptation of an existing schema
LCSH is the obvious choice for the usual reasons (established, supported), but sucks for cataloging electronic, web-delivered resources. It was designed for pre-coordinated card catalogs.
One of the problems is that LCSH rules let catalogers construct new headings that are valid but not established in the authority file. Of the more than 8.5 million distinct topical headings in WorldCat:
- over 3 million are not established, but are valid and used in multiple bibliographic records
- over 5 million are not established, but are valid and used only once
- only 100K are established
This is as ridiculous as it sounds. The rules are human-derived and hard to explain to a computer. They allow headings to proliferate, which defeats the purpose of grouping similar items under common headings.
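A toy illustration of why pre-coordinated strings proliferate while faceted terms do not. The vocabulary below is made up for the example, and real LCSH rules do not permit every combination, so treat the first number as an upper bound rather than a measurement:

```python
from itertools import product

# Toy vocabulary; illustrative terms only, not drawn from LCSH or FAST.
topics = ["Railroads", "Labor unions", "Newspapers"]
places = ["France", "Ohio", "Japan"]
periods = ["19th century", "20th century"]

# Pre-coordinated: every topic--place--period combination is its own heading string.
precoordinated = {"--".join(parts) for parts in product(topics, places, periods)}

# Faceted: each term is established once, in its own facet.
faceted = set(topics) | set(places) | set(periods)

print(len(precoordinated))  # 18 possible heading strings
print(len(faceted))         # 8 terms to establish
```

The string count grows with the product of the facet sizes, while the faceted count grows with their sum.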
FAST contains far fewer established topical headings (about 400K).
FAST normalizes the form of heading for machine encoding.
[ed. - Are FAST headings defined via RDF triples?]
FAST will use the MARC 21 authority format.
FAST has 8 facets:
- Topical
- Geographic
- Form (Genre)
- Chronological
- Personal Names (Names as subjects)
- Corporate Names (Names as subjects)
- Conferences/Meetings
- Uniform Titles
FAST will keep general, or "X", subject subdivisions.
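A rough sketch of what faceting a pre-coordinated heading might look like. It relies on the standard MARC 21 subject subfield codes ($x general, $y chronological, $z geographic, $v form); the lookup table, the `facet_heading` function, and the sample heading are assumptions made for illustration, not the actual FAST conversion rules:

```python
from collections import defaultdict

# MARC 21 subject subfield code -> facet. A simplified assumption: the real
# conversion involves far more than a lookup table.
SUBFIELD_TO_FACET = {
    "a": "topical",
    "x": "topical",        # general subdivisions stay with the topical heading
    "z": "geographic",
    "y": "chronological",
    "v": "form",
}

def facet_heading(subfields):
    """Group the (code, value) pairs of one topical heading into facets."""
    facets = defaultdict(list)
    topical_parts = []
    for code, value in subfields:
        facet = SUBFIELD_TO_FACET.get(code)
        if facet is None:
            continue  # ignore subfields this sketch does not model
        if facet == "topical":
            topical_parts.append(value)
        else:
            facets[facet].append(value)
    if topical_parts:
        # Main heading and its general subdivisions form one topical string.
        facets["topical"].append("--".join(topical_parts))
    return dict(facets)

# Hypothetical heading: Railroads--France--History--19th century--Periodicals
heading = [("a", "Railroads"), ("z", "France"), ("x", "History"),
           ("y", "19th century"), ("v", "Periodicals")]
result = facet_heading(heading)
print(result)
```

Here the one long string becomes four short headings: a topical "Railroads--History", a geographic "France", a chronological "19th century", and a form "Periodicals".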
[ed. - Genre lists are always inadequate]
FAST is still hierarchical, but loses specificity.
FAST facilitates both pre- and post-coordination.
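To make the pre- versus post-coordination distinction concrete, here is a minimal sketch. The records, the `search` function, and the `display_string` function are all hypothetical. Post-coordination combines facets at query time; pre-coordination assembles them into one string ahead of time:

```python
# Hypothetical records, each carrying independent facet values.
records = [
    {"id": 1, "topical": ["Railroads"], "geographic": ["France"],
     "chronological": ["19th century"]},
    {"id": 2, "topical": ["Railroads"], "geographic": ["Japan"],
     "chronological": ["20th century"]},
    {"id": 3, "topical": ["Newspapers"], "geographic": ["France"],
     "chronological": ["19th century"]},
]

def search(records, **criteria):
    """Post-coordination: return ids of records matching every facet=term pair."""
    return [
        r["id"] for r in records
        if all(term in r.get(facet, []) for facet, term in criteria.items())
    ]

def display_string(record, order=("topical", "geographic", "chronological")):
    """Pre-coordination: join a record's facet terms into one heading string."""
    return "--".join(term for facet in order for term in record.get(facet, []))

print(search(records, geographic="France"))                       # [1, 3]
print(search(records, topical="Railroads", geographic="France"))  # [1]
print(display_string(records[0]))  # Railroads--France--19th century
```

The same faceted data serves both uses, which is the sense in which a faceted vocabulary can support either style.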
FAST is available as an OCLC SiteSearch database
The authority file is in beta
FAST enables cool faceted search and browsing technologies (not semantic web).
Posted by MetaMetadata at April 8, 2005 09:57 AM