The University of Illinois Open Archives Initiative Metadata Harvesting Project
JSP OAI 2.0 Data Provider
Database -- ver. 1.3
Implemented to Store Metadata and Administrative Information
in a MySQL Database
Disclaimer: The following is 'quick and dirty' documentation intended to get you started. It assumes a fair amount of
familiarity with configuring the Apache Tomcat server and with other minutiae such as
manipulating MySQL databases and editing text files. We hope to have time for better documentation eventually. Thanks.
WHAT THIS IS:
This is an example of a metadata provider service as described in release 2.0 of the
Open Archives Initiative Protocol for Metadata Harvesting. It uses
- Java Server Pages technologies
- Apache web server with Tomcat (developed using version 4.1.12) as the servlet container
- MySQL database (developed using version Ver 11.18 Distrib 3.23.56)
- Java 2 SDK (developed using version 1.4.1)
- Java JDBC driver for MySQL database
This application was developed under Linux, and can be downloaded onto an appropriately configured system. As delivered it will serve MARC21 records in Simple Dublin Core format with minimal configuration changes, provided that you use the existing database structure.
Note: Related implementations for the Microsoft Windows 2000 platform, relying on ASP, are available for various metadata storage architectures (Database only, File System only, File System/Database hybrid). Source files for these Windows implementations are available on the SourceForge UILIB-OAI project website.
INSTALLATION:
- Change to the default Tomcat web application directory, webapps, and unpack the zip file. This step will create a directory named JSPOAI. You can rename this directory to suit your installation.
Read-only access for Tomcat is adequate for this directory and its contents.
Firewalls must frequently be reconfigured to allow access from remote sites.
- Inspect the contents of the configuration file OAI.ini in JSPOAI. You may change the username and password used to access MySQL, or leave them as found.
- Modify /JSPOAI/WEB-INF/web.xml to include your site name and administrator. You must create rights in MySQL for the oai user, using MySQL's GRANT command. See the documentation on this command at www.mysql.com. You will need MySQL administrative privileges, which are distinct from operating-system root privileges.
- If necessary, install MySQL from www.mysql.com
- Download and install the JDBC driver for MySQL from the www.mysql.com download page. It is currently listed as MySQL Connector/J on that page. According to the documentation accompanying the driver, it may be placed in several locations on your machine. We recommend creating a new directory /JSPOAI/WEB-INF/lib and placing it there.
- Create a MySQL database with the Unix command
mysql < MySqlDBCreateScript.sql
where MySqlDBCreateScript.sql
is a file included with this package. This command will create a database named oai and
populate it with 7 MARC21 records.
The structure of the database is given below.
- You can use the provided data to confirm that the installation has been successful. If you have installed the application in, for instance,
/var/tomcat/webapps/JSPOAI, and have left the default Tomcat service
port as 8080, you can test the installation with the following URL and verb:
http://<your hostname or IP>:8080/JSPOAI/oai.jsp?verb=Identify
If you have changed the directory name from JSPOAI to, for instance, DuVallCollection, then your site will be accessed as:
http://<your hostname or IP>:8080/DuVallCollection/oai.jsp
- If the provided database structure meets your needs, clean out the sample MARC records and load your MARC data.
Otherwise, use another database design and modify all routines in the program whose names begin with the letters db.
Changes in the design of the table metadata will also require changes to the subroutine OAI_DC_Handler.
This routine reads the database table metadata and outputs a metadata item in XML (Simple Dublin Core format).
- If your native metadata (as contained in the table metadata) is in a format other than MARC21,
you must also modify subroutine OAI_DC_Handler so that it correctly transforms your native metadata scheme to Simple DC.
- An included routine, OAI_OTHER_Handler (not used in the default sample implementation), can be developed to read the database table metadata and output your metadata in an additional format. To locate all places in the program that must be modified to support an additional, non-Simple DC format, search globally for the string OTHER.
- This data provider can be configured through the configuration file,
OAI.ini. This file is used to set repository-wide
values that will appear in OAI "Identify" responses and to
point to the location of your database. OAI.ini is also
used to set the number of records, identifiers, or set
names sent per batch. Larger batches reduce transmission
time but may overload the harvesting machine.
- Beginning with revision 1.3, the scheme has been hardcoded to "oai," and is no longer configurable in OAI.ini. You may still choose the namespace-identifier in the OAI.ini file, which is the second part of the oai identifier string. Also as of 1.3, the third part of the oai identifier, the unique database record identifier, may be chosen at your discretion. It no longer needs to be numeric, and it may contain colons.
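The three-part identifier described in the last step can be sketched in Java as follows. This is only an illustration and is not taken from oai.jsp: the class name OaiIdentifier, its methods, and the namespace example.org are hypothetical, and the rule that the namespace-identifier contains no colon is an assumption made so that the string can be split unambiguously. Because the record identifier may contain colons, the sketch splits on the first two colons only.

```java
/** Illustrative helper for identifiers of the form oai:<namespace-identifier>:<record id>. */
class OaiIdentifier {
    /** The scheme is fixed to "oai" as of revision 1.3. */
    static final String SCHEME = "oai";

    /** Joins the fixed scheme, the namespace-identifier, and the record identifier. */
    static String build(String namespaceIdentifier, String localId) {
        // Assumption: a colon in the namespace would make the identifier ambiguous.
        if (namespaceIdentifier.isEmpty() || namespaceIdentifier.indexOf(':') >= 0) {
            throw new IllegalArgumentException("namespace-identifier must be non-empty and contain no colon");
        }
        if (localId.isEmpty()) {
            throw new IllegalArgumentException("record identifier must be non-empty");
        }
        return SCHEME + ":" + namespaceIdentifier + ":" + localId;
    }

    /** Returns {scheme, namespace-identifier, record id}; the limit of 3 keeps colons inside the record id. */
    static String[] parse(String identifier) {
        String[] parts = identifier.split(":", 3);
        if (parts.length != 3 || !SCHEME.equals(parts[0]) || parts[1].isEmpty() || parts[2].isEmpty()) {
            throw new IllegalArgumentException("not an oai identifier: " + identifier);
        }
        return parts;
    }

    public static void main(String[] args) {
        String id = build("example.org", "marc:0001");
        System.out.println(id);
        System.out.println(parse(id)[2]);
    }
}
```

A non-numeric record identifier such as marc:0001 survives the round trip because only the first two colons act as separators.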
DATABASE TABLES:
- ids
-
Table ids contains the following fields:
- recID: (integer, primary key, auto-increment)
Numerical ID of each metadata record
used to connect record-level data to field-level data in table metadata. Each object in the repository has one and only one record in this table.
- OAI_Identifier: (string)
Persistent string ID of each metadata record.
This ID must remain identical for the same data item even
if the recID of this data item changes. Note, however, that tables setmap and metadata are tied to this table by recID, not OAI_Identifier.
- objectName *: (string)
Human readable string representation of a metadata record.
- lastModDate: (date)
Last modified date of the item. Set by whatever software creates or modifies the database record for this item. Read by program.
- exportable *: (character)
"Y" is exportable; "N" is not exportable. Used to prevent provisional or proprietary records from being exported.
- deleted *: (character)
"Y" indicates that the item is deleted;
"N" means the item is available.
- metadata
-
The primary data table, metadata, does not have a primary key. It contains the following fields:
- recID: (integer)
A foreign key into ids. This table contains one
record for each field of an object, so many metadata records will share the same recID.
recID is also used in table setmap for establishing the relationship
between sets and metadata records.
- varName: (string)
Name of the variable to follow in the next field, often name of a tag, such as "title"
- value: (string up to 255 characters long)
The value of the variable named in the previous field. For instance, if varName were "title", this field might contain "Old Tractors and the Men Who Love Them".
- scheme: (string)
Encoding scheme, see http://dublincore.org/documents/dcmi-terms/#H3, or analogous information for non-DC formats.
- subfield: (string)
Element refinement, see http://dublincore.org/documents/dcmi-terms/#H3, or analogous information for non-DC formats.
- lastModDate *: (datestamp)
Last modified date of this metadata row. This timestamp is set automatically by MySQL.
- sets
- If the repository supports sets, set information is stored in this table. If sets are not supported, this table should be present but contain no records.
Table sets contains the following fields:
- setID: (integer, primary key, auto-increment)
Numerical ID of each set.
setID is used in table setmap for establishing the relationship
between sets and metadata records.
- setName: (string)
Human readable description of the set.
- setSpec: (string)
String identifier of the set. setSpec must comply with the
OAI Protocol for Metadata Harvesting.
- lastModDate *: (datestamp)
Last modified date of the set. This timestamp is set automatically by MySQL.
- setmap
- If the repository supports sets, the many-to-many relation between metadata records
and sets is stored in this table. If the repository does not support sets, this table should be present but empty.
Table setmap contains the following fields:
- setID: (integer)
A foreign key into sets. The unique ID number of a set which contains the object designated by recID in the following field.
- recID: (integer)
A foreign key into ids, the ID number of an object in ids and metadata.
- lastModDate *: (datestamp)
Last modified date of this item. This timestamp is set automatically by MySQL.
The fields marked with (*) are not currently used in the provider.
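Because table metadata holds one row per field, the rows that share a recID must be gathered before a record can be written out. The following Java sketch shows that grouping and a very simple Dublin Core serialization. It is not the OAI_DC_Handler routine: the class MetadataRowsSketch and its methods are hypothetical, the rows are built in memory rather than read through JDBC, and varName is assumed to hold a Dublin Core element name already (the sample database stores MARC21, which would need a mapping step first). The scheme and subfield columns and the xsi:schemaLocation attribute are left out for brevity.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MetadataRowsSketch {
    /** One row of table metadata, reduced to the three columns used here. */
    static final class Row {
        final int recID;
        final String varName;
        final String value;

        Row(int recID, String varName, String value) {
            this.recID = recID;
            this.varName = varName;
            this.value = value;
        }
    }

    /** Groups field-level rows by recID, keeping the order in which records first appear. */
    static Map<Integer, List<Row>> groupByRecId(List<Row> rows) {
        Map<Integer, List<Row>> grouped = new LinkedHashMap<>();
        for (Row row : rows) {
            grouped.computeIfAbsent(row.recID, k -> new ArrayList<>()).add(row);
        }
        return grouped;
    }

    /** Escapes the characters that are not allowed in XML element content. */
    static String escapeXml(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Writes the fields of one record as a Simple Dublin Core element. */
    static String toOaiDc(List<Row> fields) {
        StringBuilder xml = new StringBuilder();
        xml.append("<oai_dc:dc xmlns:oai_dc=\"http://www.openarchives.org/OAI/2.0/oai_dc/\"")
           .append(" xmlns:dc=\"http://purl.org/dc/elements/1.1/\">\n");
        for (Row field : fields) {
            xml.append("  <dc:").append(field.varName).append('>')
               .append(escapeXml(field.value))
               .append("</dc:").append(field.varName).append(">\n");
        }
        xml.append("</oai_dc:dc>\n");
        return xml.toString();
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row(1, "title", "Old Tractors and the Men Who Love Them"));
        rows.add(new Row(1, "creator", "Example, Author"));
        rows.add(new Row(2, "title", "A Second Record"));
        for (List<Row> fields : groupByRecId(rows).values()) {
            System.out.print(toOaiDc(fields));
        }
    }
}
```

In a real deployment the rows would come from a query against metadata joined to ids on recID, with exportable and deleted checked as your installation requires.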
ARCHITECTURE:
- Operating System / Platform:
- Linux
- Apache web server with Tomcat 4 or higher
- Java Server Pages
- MySQL database
- MySQL JDBC Driver
OAI PROTOCOL CONFORMANCE & XML Schema Definition Documents for Validation:
As installed locally, this system has been validated using version 1.44 of
the OAI Repository Explorer (available at
http://oai.dlib.vt.edu/~oai/cgi-bin/Explorer/2.0b2-1.44/testoai).
The OAI Repository Explorer tests for conformance to OAI Protocol release
2.0. When you have completed installing this tool and believe everything is set up properly, go
to this site and enter the URL for your data provider to validate your system.
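Before submitting your site to the Repository Explorer, it may help to try one request per protocol verb by hand. The Java sketch below only builds the request URLs for the six OAI-PMH 2.0 verbs; it does not contact the server. The class name OaiRequestUrls, the host localhost, and the identifier oai:example.org:marc:0001 are assumptions for illustration, and oai_dc is used as the metadataPrefix because Simple Dublin Core is the format served by default.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;

class OaiRequestUrls {
    /** URL-encodes a query value; the colons in an OAI identifier become %3A. */
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    /** Returns one sample request URL for each of the six OAI-PMH 2.0 verbs. */
    static List<String> sampleRequests(String baseUrl, String identifier) {
        String prefix = "metadataPrefix=oai_dc";
        List<String> urls = new ArrayList<>();
        urls.add(baseUrl + "?verb=Identify");
        urls.add(baseUrl + "?verb=ListMetadataFormats");
        urls.add(baseUrl + "?verb=ListSets");
        urls.add(baseUrl + "?verb=ListIdentifiers&" + prefix);
        urls.add(baseUrl + "?verb=ListRecords&" + prefix);
        urls.add(baseUrl + "?verb=GetRecord&" + prefix + "&identifier=" + encode(identifier));
        return urls;
    }

    public static void main(String[] args) {
        // Hypothetical host and identifier; substitute your own values.
        String baseUrl = "http://localhost:8080/JSPOAI/oai.jsp";
        for (String url : sampleRequests(baseUrl, "oai:example.org:marc:0001")) {
            System.out.println(url);
        }
    }
}
```

Each printed URL can be pasted into a browser; a working provider should answer every one with an XML response rather than an HTTP error.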
INCLUDED FILES:
- oai.jsp
- The program file, written in JSP. It reads the database named "oai" and OAI.ini, the program initialization file.
- OAI.ini
- The configuration file for the OAI Data Provider.
- directory WEB-INF containing file web.xml
- This directory holds the configuration that Tomcat reads for
your application. Modify the file web.xml to record your
project name and web administrator. Also create a new subdirectory named lib. The JDBC driver for MySQL should be placed in WEB-INF/lib.
- README.html
- This HTML file.
- license.html
- The Open Source license for this code.
- DatabaseDescription.txt
- A screen shot of MySQL "describe <tablename>" commands for each table.
- MySqlDBCreateScript.txt
- A Unix script file which can be submitted to MySQL to re-create the example database named "oai". The script file contains commands to create the database and all four tables. Because its data was dumped with the mysqldump Unix command, this script also illustrates one method of loading data into tables. See also the MySQL LOAD DATA command at http://www.mysql.com/doc/en/index.html
All code provided in this illustrative implementation is being made available
under an Open Source license.
AUTHORS:
- Thomas G. Habing
- Research Programmer, Digital Library Initiative
University of Illinois at Urbana-Champaign
052 Grainger Engineering Library, MC-274
thabing@uiuc.edu
- Timothy W. Cole
- Mathematics Librarian
University of Illinois at Urbana-Champaign
214 Altgeld Hall, MC-382
t-cole3@uiuc.edu
- Ying-ping Chen
- Graduate Assistant
University of Illinois at Urbana-Champaign
052 Grainger Engineering Library, MC-274
ychen21@uiuc.edu
- John Lewis
- Visiting Research Programmer
University of Illinois at Urbana-Champaign
052 Grainger Engineering Library, MC-274
jslewis@uiuc.edu