
April 08, 2005

Redefining the Role of Catalogers in the Age of the Semantic Web

Last presentation of the day, and probably the most interesting (by which we (who's we?) mean the least practical (meaningful?)).

Slides available on web

Semantic Web

There's a new one coming through from the W3C, wait what the heck is the old one?

Machine-Understandable Data (as opposed to Machine-Readable Data)

Not machine storage, indexing and retrieval

But machines mimicking the operations of someone who understands what the data is.

Moving beyond the notions of word frequency in documents, link ranking and boolean queries.

Semantic Web depends on:

  • XML - eXtensible Markup Language
  • RDF - Resource Description Framework
  • OWL - Web Ontology Language


XML

XML allows data elements to be encoded according to their semantic meaning (not just their physical format like html). XML separates meaning from format, content from design.
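A minimal sketch of what "encoded according to their semantic meaning" buys you, using Python's standard library. The element names (book, title, author, year) and the values are invented for illustration and do not come from the talk or from any particular schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: tags and values are placeholders for illustration.
record = """
<book>
  <title>An Example Title</title>
  <author>A. N. Author</author>
  <year>2005</year>
</book>
""".strip()

root = ET.fromstring(record)

# The tags say what each value *is*, so a program can ask for the
# author directly instead of guessing from bold or italic text.
author = root.findtext("author")
year = int(root.findtext("year"))
print(author, year)
```

The same values wrapped only in presentational HTML tags would give a program nothing to ask for by name.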

XML provides access to the deep web, providing web-accessible structure for data contained in databases.

XML allows web pages to be both dynamic and cataloged. XML is a switching language, a translation language.

XML means multiple, overlapping markup standards.

RDF

RDF is a model, commonly expressed in XML, for making [Aristotelian] statements about resources.

RDF is subject-predicate-object triples

"Resource A has an attribute with the value B"

Subject = Resource A

Predicate = has an attribute with the value

Object = B

You'll notice that the above three identifiers are themselves triples (the predicate in this case is = (wait, this comment is a triple (you get the picture)))
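A rough sketch of the triple as a data structure, in plain Python rather than any RDF library. The Triple class and the example.org URIs are assumptions made up for illustration.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical URIs; example.org is a placeholder domain.
statement = Triple(
    subject="http://example.org/resource/A",
    predicate="http://example.org/terms/hasAttribute",
    obj="B",
)

# "Subject = Resource A" is itself a statement, so it fits the same shape.
meta = Triple(subject="Subject", predicate="=", obj=statement.subject)

print(statement)
print(meta)
```

Nothing in the structure stops you from writing triples about triples, which is where the recursion question comes from.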

Assumptions behind RDF:

  1. RDF can be expressed in XML; it is not, however, dependent on XML [RDF just sucks in XML]
  2. Everything--not just websites--can be assigned a Uniform Resource Identifier (URI), not necessarily a Uniform Resource Locator (URL) (even people, ideas, emotions, values)
  3. Information organization works best when organized from the ground up rather than the top down.
  4. Information organization is a matter of making "statements" about resources that preserve context and make sense within the information system

[My problems with RDF

Can we make Aristotelian Statements "that preserve context and make sense?"

Is there an end to the recursion of triples?

Who controls the dictionaries?

Who creates/controls the semantic equivalencies?]

A Semantic Web is really a web of triples

And looks like a web when you display it.

Subjects and Objects are Entities

Predicates are arrows

Predicates are the elements?
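To make the "web of triples" picture concrete, here is a small sketch that treats subjects and objects as nodes and predicates as labeled arrows. The ex: identifiers and the reachable helper are hypothetical, not taken from the slides.

```python
from collections import defaultdict

# Hypothetical triples; the "ex:" names are placeholders.
triples = [
    ("ex:Author1", "ex:wrote", "ex:Work1"),
    ("ex:Work1", "ex:hasEdition", "ex:Edition1"),
    ("ex:Work1", "ex:hasEdition", "ex:Edition2"),
]

# Entities become nodes; each predicate becomes a labeled arrow.
graph = defaultdict(list)
for s, p, o in triples:
    graph[s].append((p, o))


def reachable(start):
    """Return every entity reachable from start by following arrows."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for _, target in graph.get(node, []):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


print(sorted(reachable("ex:Author1")))
```

Starting from the author, following the arrows reaches the work and both of its editions.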

OWL

Ontologies define the predicates/arrows/elements

In CS, an ontology is a machine-readable expression of a shared conceptual framework.

Defining meaningful entities and their relationships with each other.

Ontologies are usually expressed as a combination of a classification scheme and controlled vocabularies.

Ontologies are designed to link similar concepts in different namespaces.

Ontologies are designed to increase a search agent's ability to interpret data in a different domain.

Ontologies link namespaces via equivalencies.

Certain processes of inferential logic are possible upon properly prepared RDF encoded documents linked via Ontologies.
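A toy sketch of how an equivalency lets a query cross namespaces. This is plain Python, not OWL: the lib: and shop: predicates, the EQUIVALENT table and the find function are all invented for illustration, loosely in the spirit of OWL's owl:equivalentProperty.

```python
# Hypothetical equivalency between predicates in two namespaces.
EQUIVALENT = {
    "shop:writtenBy": "lib:creator",
}


def canonical(predicate):
    """Map a predicate to its canonical form if an equivalency is declared."""
    return EQUIVALENT.get(predicate, predicate)


# Hypothetical statements from two different domains.
triples = [
    ("lib:record1", "lib:creator", "person:42"),
    ("shop:item9", "shop:writtenBy", "person:42"),
    ("shop:item9", "shop:price", "10"),
]


def find(predicate, obj):
    """Subjects whose predicate matches after applying equivalencies."""
    wanted = canonical(predicate)
    return [s for s, p, o in triples if canonical(p) == wanted and o == obj]


print(find("lib:creator", "person:42"))
```

A search phrased in the library's vocabulary also finds the bookshop's record, which is the simplest kind of inference an equivalency allows.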

This logic leads to proof and trust.

Why should libraries care about the Semantic Web?

How libraries currently use the WWW.

Libraries are using the web (most prominently via OAI PMH) as an irrigation system for swift transfer.

Detailed metadata within an otherwise closed domain are simplified and made available to other places. These other places also simplify their more detailed and otherwise closed domain specific metadata.
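A sketch of that simplification step as a crosswalk. OAI-PMH requires unqualified Dublin Core as a baseline format, and title, creator and date are real Dublin Core elements, but the local field names and the sample record below are made up for illustration.

```python
# Hypothetical local field names mapped to Dublin Core element names.
CROSSWALK = {
    "title_proper": "title",
    "uniform_title": "title",
    "personal_name_main_entry": "creator",
    "publication_date": "date",
}


def simplify(record):
    """Reduce a detailed local record to simple Dublin Core-style elements."""
    simple = {}
    for field_name, value in record.items():
        element = CROSSWALK.get(field_name)
        if element is None:
            continue  # local detail with no simple equivalent is dropped
        simple.setdefault(element, []).append(value)
    return simple


# Placeholder record; none of these values come from a real catalog.
detailed = {
    "title_proper": "An Example Title",
    "uniform_title": "Example",
    "personal_name_main_entry": "Author, A. N.",
    "publication_date": "2005",
    "local_shelf_mark": "XX 123",
}

print(simplify(detailed))
```

The two kinds of title collapse into one element and the shelf mark disappears, which is the loss of detail the talk describes.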

What's it like outside the library?

  1. The world is full of experts
  2. The world is full of enthusiasts
  3. Human culture is saturated with complex, nuanced, important relationships

What can libraries bring to the table?

Information resource description, organization and access is what we do (we ask the right questions that no one else thinks of)

Information evaluation (collection development, reference)

We do not realize what a rich, sophisticated body of theory on bibliographic relationships we have [Ranganathan]

A speculation on what bibliographic representation might look like in the Semantic Web (see slides)

Separate truly local information from non-local bibliographic information, author information, work [I assume in the FRBR sense] information and related works.

Cataloger Repositioning

We shift our cataloging role from taking the time to write out the limited amount of information we are able to write to finding reliable information and linking to it via RDF. We would create/control the semantic equivalencies, based on dictionaries we trust (we're good at finding trustworthy information). We would control the semantic web because we would take the time to build it. [Does this subvert the ground-up approach? Is it feasible?]

Our mind is full of ideas and these ideas depend on documents. The intensity of our ideas is dependent upon repeatedly encountering these ideas in documents. Things accrue meaning as you encounter them again, and again, and again.

What we do in libraries is make it possible to bump into texts again. We make it possible that you will encounter a document more than once: having seen something you like, you can put your hands on it again. Ideas can build upon each other in part because of the work in libraries. This role of the library will not go away in the near future.

What is the purpose of the catalog (Lubetzky)

To provide access to entities that accrue from the objects on our shelves

authors (and all their works)

works (and all their editions)

We can give to the Semantic Web (FRBR) an intellectual structure upon which we can build these meaningful semantic structures.

Work =>> Expression =>> Manifestation =>> Item
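One way to picture the FRBR chain as a data structure, sketched with Python dataclasses. The class fields (title, language, edition, shelf_mark), the items_of helper and the sample values are assumptions for illustration; FRBR itself defines far more attributes and relationships than this.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    shelf_mark: str  # truly local: one copy on one shelf


@dataclass
class Manifestation:
    edition: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    language: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    title: str
    expressions: List[Expression] = field(default_factory=list)


def items_of(work: Work) -> List[Item]:
    """Walk Work -> Expression -> Manifestation -> Item and collect the copies."""
    return [
        item
        for expression in work.expressions
        for manifestation in expression.manifestations
        for item in manifestation.items
    ]


# Placeholder data for illustration only.
work = Work(
    title="An Example Work",
    expressions=[
        Expression(
            language="en",
            manifestations=[
                Manifestation(edition="1st", items=[Item("XX 1"), Item("XX 2")]),
                Manifestation(edition="2nd", items=[Item("XX 3")]),
            ],
        )
    ],
)

print([item.shelf_mark for item in items_of(work)])
```

Only the Item level is local to one library; everything above it could be shared and linked to rather than re-described.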

Questions from the Audience

Can we really define the dictionaries/Do we really need to?

Two competing notions

Complexity and Emergence

Emergent Semantics

The Web came out of the scientific community enamored of the belief that:

If you want structure you have to be able to step back and let the interested parties duke it out.

Information Organization came out of the library community with a low tolerance of the mess that will accompany an emergent structure.

The $64,000 question is Can The Semantics Emerge?

Someone needs to do the study and find out: does meaningful, shared semantics emerge?

Complexity Theory from AI

Three situations posited.

1. A Change, nothing happens

2. A Change, all hell breaks loose

3. A Change, A Shift, the community continues to cohere in a new location

Maybe the library community is the part that holds everything together in times of change. Maybe we need to introduce enough of the concern for order and consistency to mix well with the excitement of the new. We hold the line and make the Semantic Web useful.

Posted by MetaMetadata at April 8, 2005 01:49 PM
Comments

I feel like I am sitting with you. I'm going to get a coke, you want anything while I'm up?

Posted by: J Commander at April 8, 2005 03:43 PM