Lifecycle Metadata for Digital Objects
November 13, 2006
Usage Metadata
What is Usage Metadata?
Internal users (with respect to the creator)
External users (with respect to the creator)
Internal users (with respect to the repository)
External users (with respect to the repository)
Creator Usage
The creator’s actual use of the object
The creator’s colleagues’ use of the object
Version control
As work product presented for accountability
Author reuse/recycling
To fulfill the object’s function
Object used for reference, template
Parts of object reused
The creator’s customers’ use of the object
Object’s function: mediates relationship
Object’s function: profit for creator
Repository Usage
Management usage
Object maintenance and preservation
Object analysis
Object use analysis
Designated user community
Object viewing
Object acquisition
What kind of usage to track
Server-side tracking
Server logs (direct Web searching)
Query logs etc. (from OPACs, repositories)
Fielded
Full-text
Client-side tracking
Cookies and other spyware
Explicit installations (like WebTracker)
Why and whether to track usage I (external)
Observe what individuals use
Library/archives ethics forbid identifiable tracking; observation can
still be done via proxy demographics
The law (FBI access—but thwarted by IP addresses in logs: who
designed that?)
Serve users better
Who
Designated user community
Everyone
How
Give ‘em what they want
Adjustments to acquisition policies, schedules
Adjustments to preservation policies, schedules
Reappraisal?
Why and whether to track usage II (internal)
Manage the repository
Protect the repository
Expansion of capacity
More efficient use of capacity
Detect malicious intrusions
Detect suspicious activity
Manage dissemination better
Manage objects better in the repository
Monitor user trends for storage management
Monitor user trends for preservation management
So is this usage data really metadata?
It depends on where you sit…
Some of this log material is metadata with
respect to the objects being managed
Popular objects will get “fat” (depending on how
longitudinal metadata is kept)
Popularity and selection may define new classes
of “archival bond” and hence new metadata for
objects and collections
Some of it (and some of the same) is data
with respect to the management task
How to track usage
The system can do it for you (see Covey,
Zawitz, Jones et al.--with some help)
OPACs and repositories (if the vendor will allow it
or the existing programming provides for it)
Web server logs
Firewall logs
Depends on what the logs track and how
deeply they identify individual resources
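As a rough illustration of what a Web server log can yield, here is a minimal sketch that counts successful requests per resource. It assumes the log is in Common Log Format and that each object has its own URL path; the sample lines, the /objects/ paths, and the function name are hypothetical. The client address is deliberately never stored, so the counts describe objects rather than identifiable users.

```python
import re
from collections import Counter

# Common Log Format: host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def count_object_requests(log_lines):
    """Count successful GET requests per resource path.

    The client host is matched but never kept, so the result
    says which objects were used, not who used them.
    """
    counts = Counter()
    for line in log_lines:
        match = CLF_PATTERN.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        if match["method"] == "GET" and match["status"].startswith("2"):
            counts[match["path"]] += 1
    return counts


# Hypothetical sample lines (addresses are from a documentation-only range)
sample_log = [
    '192.0.2.10 - - [13/Nov/2006:10:15:32 -0600] "GET /objects/1234 HTTP/1.1" 200 5120',
    '192.0.2.11 - - [13/Nov/2006:10:16:01 -0600] "GET /objects/1234 HTTP/1.1" 200 5120',
    '192.0.2.10 - - [13/Nov/2006:10:17:45 -0600] "GET /objects/9999 HTTP/1.1" 404 312',
    '192.0.2.12 - - [13/Nov/2006:10:18:09 -0600] "GET /objects/5678 HTTP/1.1" 200 2048',
]

usage = count_object_requests(sample_log)
for path, hits in usage.most_common():
    print(path, hits)
```

How useful such counts are depends on the point made above: if the log records only a search page or a generic viewer URL, per-object usage cannot be recovered from it.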
But transaction log analysis (TLA) isn’t the whole story
Qualitative investigations
Server log behavior needs further analysis
Patterns in logs need further identification
See Choo et al. for an example of
triangulation between behavior logs, surveys,
and interviews to test a model of categorizing
usage patterns
User-contributed metadata
Annotation
Tagging and Folksonomies
Discursive explanation (can it be parsed?)
Subject terms, or something like them
Del.icio.us, flickr
Is this democratized classification likely to
make cataloging obsolete?
Or is this something new?
How is it useful as metadata?
And now for something completely different
The Dreaded Final Assignment
What to create: parts
Develop a METS Profile Schema (in XML) suitable to
your data object type
Based on the DSpace schema
Include relevant namespaces to cover all the kinds of
metadata we will have discussed
Using the schema you have created as a template in
XMetaL, mark up the actual example you have been
working from
Make an annotated bibliography of the sources that you
used in researching your object type (gather the items
you posted, plus anything else)
Include all this in a document that describes why you
chose the metadata sets/namespaces that you chose
Metadata sets
Detailed description of elements, drawn from
sources but specifying what kind of data goes
into each element
Can be tabular or element by element; both
forms can help clarity
element by element to present detailed
description, cite comparative sources
tabular for a summary
DSpace and METS
The DSpace METS profile calls for the use of
three metadata sets
Descriptive metadata: MODS (as
crosswalked from Dublin Core)
Administrative metadata
Technical metadata: PREMIS
Rights metadata: Creative Commons
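To make the layout above concrete, here is a minimal sketch that builds an empty METS skeleton with Python's standard library. It is not the DSpace profile itself: the ID values, the helper names, and the use of MDTYPE="OTHER" with OTHERMDTYPE="CreativeCommons" for the rights section are assumptions for illustration, and a complete METS document needs further parts (such as a structMap) that are left out here.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def mets(tag):
    """Return a tag name qualified with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def add_wrapped_section(parent, section_tag, section_id, mdtype, other_type=None):
    """Add a metadata section holding an empty mdWrap/xmlData pair."""
    section = ET.SubElement(parent, mets(section_tag), {"ID": section_id})
    attrs = {"MDTYPE": mdtype}
    if other_type:
        attrs["OTHERMDTYPE"] = other_type
    wrap = ET.SubElement(section, mets("mdWrap"), attrs)
    ET.SubElement(wrap, mets("xmlData"))  # the actual metadata goes in here
    return section


def build_mets_skeleton():
    root = ET.Element(mets("mets"))
    # Descriptive metadata: MODS
    add_wrapped_section(root, "dmdSec", "dmd_1", "MODS")
    # Administrative metadata: technical and rights sections
    amd = ET.SubElement(root, mets("amdSec"), {"ID": "amd_1"})
    add_wrapped_section(amd, "techMD", "tech_1", "PREMIS")
    # OTHERMDTYPE value is an assumption, not taken from the profile
    add_wrapped_section(amd, "rightsMD", "rights_1", "OTHER", "CreativeCommons")
    return root


skeleton = build_mets_skeleton()
print(ET.tostring(skeleton, encoding="unicode"))
```

The point of the sketch is only the nesting: one descriptive section at the top level, and the technical and rights sections grouped inside the administrative section.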
What you need to know
What are the kinds of descriptive and administrative
metadata your object needs?
Check the basic DC-to-MODS crosswalk: do you
need more than this to describe your object?
Check PREMIS and the Creative Commons rights
schemas: you will find that PREMIS covers only generic
technical properties (such as format, size, and fixity) and
does not address format-specific technical metadata. Hence
you will need to round up the technical metadata
appropriate to your object, to be contained in the
techMD segment of the METS document
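One way to run the crosswalk check is to list what you want to record about your object and see what falls outside the mapping. The sketch below assumes a simplified table with one MODS target per Dublin Core element (the published crosswalk offers several alternatives for some elements); the sample record and the colorSpace property are hypothetical.

```python
# Simplified DC-to-MODS table: one plausible MODS target per DC element.
DC_TO_MODS = {
    "title": "titleInfo/title",
    "creator": "name/namePart",
    "subject": "subject/topic",
    "description": "abstract",
    "publisher": "originInfo/publisher",
    "date": "originInfo/dateIssued",
    "identifier": "identifier",
    "language": "language/languageTerm",
    "rights": "accessCondition",
}


def crosswalk(record):
    """Split a record into values the table can place and values it cannot."""
    mapped, unmapped = [], []
    for element, values in record.items():
        target = DC_TO_MODS.get(element)
        for value in values:
            if target is None:
                unmapped.append((element, value))
            else:
                mapped.append((target, value))
    return mapped, unmapped


# Hypothetical record for a digital image
sample_record = {
    "title": ["Courthouse, north elevation"],
    "creator": ["Example, Pat"],
    "date": ["2006"],
    "colorSpace": ["sRGB"],  # a technical property with no DC element
}

mapped, unmapped = crosswalk(sample_record)
print("Mapped:", mapped)
print("Needs another schema:", unmapped)
```

Anything that ends up in the second list is a candidate for either richer descriptive metadata or the technical metadata you will place in techMD.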
What is an annotated bibliography?
A list of the most important (most frequently cited)
resources on the subject of interest, with full citation
Annotated with a clear description of the contents of
the source
For our purposes the annotation should include
what YOU think about the source and why you
found it useful to your task or why it is totally useless