The University of Illinois Open Archives Initiative Metadata Harvesting Project
ASP OAI 2.0 Data Provider
File/Database -- ver. 1.5
Descriptive Metadata Stored in XML Files
Administrative Information Stored in Database
Disclaimer: The following is 'quick and dirty' documentation intended to get you started. It assumes a fair amount of
familiarity with configuring the Microsoft IIS web server and with other minutiae such as
editing text files. We hope to provide better documentation eventually. Thanks.
WHAT THIS IS:
This is an example of a metadata provider service as described in release 2.0 of the
Open Archives Initiative Protocol for Metadata Harvesting. It uses
- Microsoft Internet Information Server;
- Microsoft ASP with VBScript and JScript;
- Microsoft Windows Script;
- Microsoft XML Core Services (MSXML);
- Microsoft ActiveX Data Objects (ADODB) and ODBC-compliant Data Source;
- Microsoft Access.
This application can be downloaded onto an appropriately configured
Microsoft Windows NT/2000 system and used with minimal configuration changes.
Note: Related implementations for other system architectures (Database only,
File System only, File System/Database hybrid, etc.) and/or for other platforms
are also available on the SourceForge UILIB-OAI project web site.
INSTALLATION:
-
Unpack the downloaded zip file to install all files into the root directory of the C
drive (C:\) on a system running Microsoft IIS ver. 4 or later. (Installation into
a different directory is possible, but may require significant changes to the example
database and scripts. These dependencies may be removed in a future release.)
You must also preserve subdirectory names during unzipping.
A directory named C:\ASP_OAI_2.0_DP_FILEDB\ will be created. It contains
all necessary program files and sample data files.
-
Make the C:\ASP_OAI_2.0_DP_FILEDB\ directory available as an
IIS virtual application (Create a new Virtual Directory with an
active application. Make sure that this application allows scripting access.
For information on how to do this refer to the IIS documentation.)
Assume that the name of the virtual directory is ASPOAIDP-FILEDB.
-
Check the "Enable session state" option on the ASP configuration panel.
Session state is used only on the server side. It does
not affect the stateless operation of the OAI protocol. The client (i.e., the OAI harvester)
does not need to support or enable cookies in order to work with this
data provider.
-
The base URL of the ASP OAI 2.0 Data Provider for File/Database 1.5 running
on your local system is now:
http://<your hostname or IP>/ASPOAIDP-FILEDB/oai.asp
You should test the installation from browsers at local and remote sites
with the following command:
http://<your hostname or IP>/ASPOAIDP-FILEDB/oai.asp?verb=Identify
We have frequently found that firewalls must be re-configured to allow access
from remote sites.
-
This data provider can be configured through the XML configuration file:
RepositoryDescription.xml.
Repository identity and metadata properties can be customized.
-
If a database is involved in the processing, remember to permit the IIS user account
(usually, IUSR_<NetBIOS name of the host>) to read all the necessary data from
the database.
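The Identify test described in the installation steps can also be scripted. The following is a minimal Python sketch, not part of this package, that builds OAI request URLs against the base URL given above. The hostname localhost and the helper names build_oai_url and fetch are assumptions made for illustration only.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL: replace "localhost" with your hostname or IP.
BASE_URL = "http://localhost/ASPOAIDP-FILEDB/oai.asp"


def build_oai_url(base_url, verb, **params):
    """Return an OAI-PMH request URL for the given verb and arguments."""
    query = {"verb": verb}
    # Skip arguments that were not supplied so the URL stays minimal.
    query.update({k: v for k, v in params.items() if v is not None})
    return base_url + "?" + urlencode(query)


def fetch(url, timeout=10):
    """Fetch a URL and return the response body as text (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Prints the URLs only; call fetch(...) once the provider is running.
    print(build_oai_url(BASE_URL, "Identify"))
    print(build_oai_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc"))
```

Running the same script from a remote machine is a quick way to find out whether a firewall is blocking access.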
ASSUMPTIONS:
-
Each XML metadata item is described in a separate .xml text file.
-
For each object, the administrative information necessary to respond to
OAI-PMH requests is stored in a relational database. (We provide an example
Microsoft Access database.)
-
For each supported metadata format, the user must
provide a handler (in the form of an ASP file) which reads the metadata
records and converts or transforms these records into the specified metadata
format. In this release, we provide three handlers (metadata-oai_dc.asp,
metadata-marc.asp, and metadata-marc_direct.asp) as examples.
Both metadata-oai_dc.asp and metadata-marc.asp use the XSLT technology
to transform the metadata from the original format (we use MARC records as
examples) into the specified formats. metadata-marc_direct.asp is an example
handler which outputs the metadata directly without any processing.
-
To use this data provider, the user needs to create the following 3 tables
in the existing database as the interface between the data provider and the
database (the sets and setmap tables are needed only if "Set" is supported).
In addition, our example metadata handlers
assume that the complete file name (including the full path) of each
XML file is stored in a table named metadata.
- ids
-
Table ids contains the following fields:
- recID: (integer)
Numerical ID of each metadata record.
recID is used to connect to the existing data items.
- OAI_Identifier: (string)
Persistent string ID of each metadata record.
This ID has to remain identical for the same data item even
if recID of this data item changes.
- objectName *: (string)
Human readable string representation of a metadata record.
- lastModDate: (date)
Last modified date of the item.
- exportable *: (character)
"Y" is exportable; "N" is not exportable.
- deleted *: (character)
"Y" indicates that the item is deleted;
"N" means the item is available.
- sets
- [Optional]
If the repository supports "Set", set information is stored in this table.
Table sets contains the following fields:
- setID: (integer)
Numerical ID of each set.
setID is used in table setmap for establishing the relationship
between sets and metadata records.
- setName: (string)
Human readable description of the set.
- setSpec: (string)
String identifier of the set. setSpec must comply with the
OAI Protocol for Metadata Harvesting specification.
- lastModDate *: (date)
Last modified date of the set.
- setmap
- [Optional]
If the repository supports "Set", the many-to-many relation between metadata
and sets is stored in this table.
Table setmap contains the following fields:
- setID: (integer)
Numerical ID of each set.
setID is used in table setmap for establishing the relationship
between sets and metadata records.
- recID: (integer)
Numerical ID of each metadata record.
recID is used to connect to the existing data items.
- lastModDate *: (date)
Last modified date of this item.
The fields marked with (*) are not currently used in the provider.
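To make the three-table layout concrete, here is a minimal Python sketch that recreates it in an in-memory SQLite database and looks up the records belonging to a set. It is an illustration only. SQLite stands in for the Access database shipped with this package, the column types, keys, and defaults are assumptions, the sample rows are invented, and the metadata table is not modeled because its columns are not described here.

```python
import sqlite3

# Field names follow the table descriptions above; the types,
# keys, and defaults are assumptions made for this sketch.
SCHEMA = """
CREATE TABLE ids (
    recID          INTEGER PRIMARY KEY,
    OAI_Identifier TEXT NOT NULL UNIQUE,
    objectName     TEXT,
    lastModDate    TEXT,
    exportable     TEXT DEFAULT 'Y',
    deleted        TEXT DEFAULT 'N'
);
CREATE TABLE sets (
    setID       INTEGER PRIMARY KEY,
    setName     TEXT,
    setSpec     TEXT NOT NULL UNIQUE,
    lastModDate TEXT
);
CREATE TABLE setmap (
    setID       INTEGER REFERENCES sets(setID),
    recID       INTEGER REFERENCES ids(recID),
    lastModDate TEXT
);
"""


def build_example_db():
    """Create the three tables in memory and add a few invented rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO ids VALUES (1, 'oai:example.org:rec1', "
                 "'Sample record', '2002-06-01', 'Y', 'N')")
    conn.execute("INSERT INTO ids VALUES (2, 'oai:example.org:rec2', "
                 "'Other record', '2002-06-02', 'Y', 'N')")
    conn.execute("INSERT INTO sets VALUES (10, 'Maps', 'maps', '2002-06-01')")
    conn.execute("INSERT INTO setmap VALUES (10, 1, '2002-06-01')")
    conn.commit()
    return conn


def identifiers_in_set(conn, set_spec):
    """Return the OAI identifiers of all records mapped to the given setSpec."""
    rows = conn.execute(
        "SELECT ids.OAI_Identifier FROM ids "
        "JOIN setmap ON setmap.recID = ids.recID "
        "JOIN sets ON sets.setID = setmap.setID "
        "WHERE sets.setSpec = ? ORDER BY ids.recID",
        (set_spec,),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(identifiers_in_set(build_example_db(), "maps"))
```

The join through setmap is what allows one record to belong to several sets and one set to hold several records.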
ARCHITECTURE:
- Operating System / Platform
- Microsoft Windows NT 4 Server SP 6
- Microsoft Windows NT 4 Workstation SP 6
- Microsoft Windows 2000 Advanced Server
- Microsoft Windows 2000 Professional, SP 2
-
- Microsoft Internet Information Server (IIS), version 4 or higher
-
- Microsoft Active Server Pages (ASP)
- ASP modules included use VBScript and JScript
-
- Microsoft Windows Script, version 5.6 or higher
- The script support object library is available free from the Microsoft Website at
http://msdn.microsoft.com/nhp/Default.asp?contentid=28001169
-
- Microsoft XML Parser (MSXML) 4.0
- This parser is available free from the Microsoft Website at
http://msdn.microsoft.com/xml
-
- Microsoft ActiveX Data Objects (ADODB) and ODBC-compliant Data Source
- For illustration we include a Microsoft Access Database with Stored Procedures.
However, with minor changes, a Microsoft SQL Server database or other ODBC-compliant
data source could be used as well.
OAI PROTOCOL CONFORMANCE & XML Schema Definition Documents for Validation:
As installed locally, this system has been validated using version 1.45a of
the OAI Repository Explorer (available at
http://jingluo.dlib.vt.edu/~oai/cgi-bin/Explorer/2.0-1.45/testoai).
The OAI Repository Explorer tests for conformance to OAI Protocol release
2.0. When you have completed installing this tool and believe everything is set up properly, go
to this site and enter the URL for your data provider to validate your system.
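Before submitting your URL to the Repository Explorer, a quick local sanity check of the Identify response can catch obvious problems. The Python sketch below is an illustration only and is not a replacement for the validator. The sample response is hand-written (its repository name and hostname are invented), and summarize_identify is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace of OAI-PMH 2.0 response documents.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hand-written sample response; the values are invented for illustration.
SAMPLE_IDENTIFY = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2002-06-01T12:00:00Z</responseDate>
  <request verb="Identify">http://localhost/ASPOAIDP-FILEDB/oai.asp</request>
  <Identify>
    <repositoryName>Sample Repository</repositoryName>
    <baseURL>http://localhost/ASPOAIDP-FILEDB/oai.asp</baseURL>
    <protocolVersion>2.0</protocolVersion>
  </Identify>
</OAI-PMH>
""".strip()


def summarize_identify(xml_text):
    """Extract a few fields and any error codes from an Identify response."""
    root = ET.fromstring(xml_text)
    return {
        "protocolVersion": root.findtext(
            "oai:Identify/oai:protocolVersion", namespaces=OAI_NS),
        "baseURL": root.findtext(
            "oai:Identify/oai:baseURL", namespaces=OAI_NS),
        # OAI-PMH reports protocol errors as <error code="..."> elements.
        "errors": [e.get("code") for e in root.findall("oai:error", OAI_NS)],
    }


if __name__ == "__main__":
    print(summarize_identify(SAMPLE_IDENTIFY))
```

A protocolVersion other than 2.0, or a non-empty errors list, suggests the installation needs attention before validation.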
INCLUDED FILES:
- global.asa
- The global.asa file retrieves parameters from RepositoryDescription.xml to
configure the repository. It also builds a one-time list of all available sets and data, which is
shared by all harvesters. If sets or files are added or deleted, the application must be
stopped and restarted to refresh this list.
- RepositoryDescription.xml
- The configuration file for the OAI Data Provider.
It is XML-formatted and self-explanatory.
Please refer to the file for repository settings.
- README.html
- This HTML file.
- license.html
- The Open Source license for this code.
- functions.inc
-
This contains various functions that are needed by other script files.
This file is included in the scripts that need access to the functions.
functions.inc provides functions and subroutines such as parsing OAI identifiers,
generating UTC datestamps, creating and parsing resumption tokens, etc.
These are mostly functions that should be reusable across many different
OAI implementations and are not specific to this implementation.
- oai.asp
- The main Active Server Page script code. All OAI requests are dispatched by oai.asp.
Most of this code is written in VBScript; a small amount of JScript is also used.
- *.asp (other than oai.asp and metadata-*.asp)
-
Each of these ASP files corresponds to a single OAI request and is called by oai.asp.
- metadata-*.asp
-
User-provided handlers for supported metadata formats. In this release, we provide
three examples: metadata-oai_dc.asp, metadata-marc.asp, and metadata-marc_direct.asp
to demonstrate how this data provider works with user-provided handlers.
- Identity.xsl
-
Files with names ending in .xsl transform metadata from the storage XML format,
MARC in the case of the data samples provided with this package, into the format
requested by a harvester. Identity.xsl translates from the MARC storage format
into the MARC format expected by harvesters. As you can imagine,
this transformation is trivial.
- MARC21slim2OAIDC.xsl
- MARC21slimUtils.xsl
- These XSLT stylesheets transform *.xml object descriptive
metadata files to XML structures appropriate for transmittal in response
to OAI requests.
- ASPOAI_FILEDB_DB.mdb
- An Access database containing necessary OAI administrative metadata and
related information as discussed above. Refer to the database itself for
the data schema.
- ASPOAI_FILEDB_DB_empty.mdb
- An empty example of the database.
- Marc/*.xml
- These are sample object descriptive metadata files.
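For readers unfamiliar with the kind of helpers that functions.inc provides (OAI identifier parsing, UTC datestamps, resumption tokens), the following Python sketch shows what such functions might look like. It is not a translation of functions.inc. All function names are hypothetical, and the "prefix!offset" resumption token layout is invented for this example, since the token format used by this package is not described here.

```python
from datetime import datetime, timezone


def utc_datestamp(moment=None):
    """Format a timezone-aware datetime as an OAI UTC datestamp (YYYY-MM-DDThh:mm:ssZ)."""
    if moment is None:
        moment = datetime.now(timezone.utc)
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def parse_oai_identifier(identifier):
    """Split 'oai:<namespace>:<local id>' into (namespace, local id)."""
    parts = identifier.split(":", 2)
    if len(parts) != 3 or parts[0] != "oai" or not parts[1] or not parts[2]:
        raise ValueError("not an oai: identifier: %r" % identifier)
    return parts[1], parts[2]


def make_resumption_token(metadata_prefix, offset):
    """Encode list state as 'prefix!offset' (layout assumed for this sketch)."""
    return "%s!%d" % (metadata_prefix, offset)


def parse_resumption_token(token):
    """Decode a token produced by make_resumption_token."""
    prefix, separator, offset = token.rpartition("!")
    if not separator or not prefix:
        raise ValueError("malformed resumption token: %r" % token)
    return prefix, int(offset)


if __name__ == "__main__":
    print(utc_datestamp())
    print(parse_oai_identifier("oai:example.org:rec1"))
    print(parse_resumption_token(make_resumption_token("oai_dc", 100)))
```

Keeping such helpers free of repository-specific logic is what makes them reusable across different OAI implementations.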
All code provided in this illustrative implementation is being made available
under an Open Source license.
AUTHORS:
- Thomas G. Habing
- Research Programmer, Digital Library Initiative
University of Illinois at Urbana-Champaign
052 Grainger Engineering Library, MC-274
thabing@uiuc.edu
- Timothy W. Cole
- Mathematics Librarian
University of Illinois at Urbana-Champaign
214 Altgeld Hall, MC-382
t-cole3@uiuc.edu
- Ying-ping Chen
- Graduate Assistant
University of Illinois at Urbana-Champaign
052 Grainger Engineering Library, MC-274
ychen21@uiuc.edu
- Joanne Kaczmarek
- Project Coordinator
University of Illinois at Urbana-Champaign
052 Grainger Engineering Library, MC-274
jkaczmar@uiuc.edu