|
Once the objects for a desired
digital collection have been sought, found and selected,
barcoded ‘pull sheets’ can be printed and
the objects gathered for digitization. With the physical
object in hand, preservationists can select a metadata
template, modify the template to perfectly fit the object,
print barcode metadata sheets and insert them in front
of the pages that they correspond to, then submit the
volume for scanning. Alternatively, all metadata work can be
postponed until after image treatment has been performed and OPUS
has made an automatic first pass at identifying metadata.
From exotic, sophisticated
automatic page-turning scanners to Bookeye, WideTEK
and even Epson scanners, OPUS supports virtually any
scanner that provides sufficient quality for the desired
output. OPUS either runs the scanner directly or can
be configured to automatically import images from the
scanner.
Automatic image treatment
consists of numerous functions, including book-fold
correction, ‘two up’ page splitting, de-skew,
content location, application of fixed margins, color
correction, and de-speckle.
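As a rough illustration of two of the functions named above, 'two up' page splitting and application of fixed margins, here is a minimal sketch in Python. It is not OPUS code: the function names are hypothetical, the image is a plain list of grayscale rows, and the gutter is assumed to be the darkest column in the middle third of the spread.

```python
from typing import List, Tuple

# Assumption: a grayscale image is a list of rows, 0 = black, 255 = white.
Image = List[List[int]]


def find_gutter(image: Image) -> int:
    """Return the index of the darkest column in the middle third of the spread."""
    width = len(image[0])
    lo, hi = width // 3, 2 * width // 3
    column_sums = [sum(row[x] for row in image) for x in range(lo, hi)]
    return lo + column_sums.index(min(column_sums))


def split_two_up(image: Image) -> Tuple[Image, Image]:
    """Split a two-page spread into left and right pages, dropping the gutter column."""
    gutter = find_gutter(image)
    left = [row[:gutter] for row in image]
    right = [row[gutter + 1:] for row in image]
    return left, right


def add_fixed_margin(image: Image, margin: int, fill: int = 255) -> Image:
    """Surround a page with a border of `margin` pixels filled with `fill`."""
    width = len(image[0]) + 2 * margin
    padded = [[fill] * margin + row + [fill] * margin for row in image]
    top = [[fill] * width for _ in range(margin)]
    bottom = [[fill] * width for _ in range(margin)]
    return top + padded + bottom
```

A production system would work on real scanner output and would likely combine this with de-skew and content location before deciding where to cut; the sketch only shows the shape of the operation.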
The manual image treatment
stage allows all pages to be reviewed, and any that are
not processed to the desired quality level can be reprocessed
using manual and semi-automatic processing. When the
page is acceptable, the user advances to the next page.
OPUS offers an automatic metadata
capture system that looks for common metadata such as
book title, author, publisher, table of contents, section
and chapter names, and page numbers.
Automatic metadata capture
is a wonderful concept, but expecting a software program
to know what’s what is a tall order, especially
with artificial intelligence still in its infancy. So,
the Digital Library Systems Group at Image Access invested
substantially in software engineering to produce a facility
for efficiently reviewing and editing the results of
automatic metadata capture. Not only can text be changed,
but the actual metadata elements can be rearranged or
deleted and new elements added.
Once an object has been processed
to this stage, derivative creation can be a fully automatic
process. It is at this stage that high-, medium- and low-resolution
PDF, JPEG, TIFF and other image files are
created. In addition, metadata is output to any number
of custom formats and
to standard formats such as METS XML.
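To make the METS XML output more concrete, the following sketch builds a very small METS document with Python's standard library. It does not reproduce OPUS output: the function name, the ID scheme, and the choice of Dublin Core for the descriptive metadata are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
DC_NS = "http://purl.org/dc/elements/1.1/"

ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)
ET.register_namespace("dc", DC_NS)


def build_mets(title, creator, page_files):
    """Return a minimal METS document (as a string) for one scanned volume."""
    mets = ET.Element(f"{{{METS_NS}}}mets")

    # Descriptive metadata, wrapped as Dublin Core (an assumption for this sketch).
    dmd = ET.SubElement(mets, f"{{{METS_NS}}}dmdSec", ID="DMD1")
    wrap = ET.SubElement(dmd, f"{{{METS_NS}}}mdWrap", MDTYPE="DC")
    xml_data = ET.SubElement(wrap, f"{{{METS_NS}}}xmlData")
    ET.SubElement(xml_data, f"{{{DC_NS}}}title").text = title
    ET.SubElement(xml_data, f"{{{DC_NS}}}creator").text = creator

    # File inventory and the physical structure that points into it.
    file_sec = ET.SubElement(mets, f"{{{METS_NS}}}fileSec")
    file_grp = ET.SubElement(file_sec, f"{{{METS_NS}}}fileGrp", USE="master")
    struct_map = ET.SubElement(mets, f"{{{METS_NS}}}structMap", TYPE="physical")
    book = ET.SubElement(struct_map, f"{{{METS_NS}}}div", TYPE="book", DMDID="DMD1")

    for order, path in enumerate(page_files, start=1):
        file_id = f"FILE{order:04d}"  # hypothetical ID scheme
        file_el = ET.SubElement(file_grp, f"{{{METS_NS}}}file", ID=file_id)
        ET.SubElement(
            file_el,
            f"{{{METS_NS}}}FLocat",
            {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": path},
        )
        page = ET.SubElement(book, f"{{{METS_NS}}}div", TYPE="page", ORDER=str(order))
        ET.SubElement(page, f"{{{METS_NS}}}fptr", FILEID=file_id)

    return ET.tostring(mets, encoding="unicode")
```

Calling `build_mets("Sample Title", "Sample Author", ["0001.tif", "0002.tif"])` yields one `dmdSec`, a `fileSec` listing both images, and a `structMap` with one page division per file.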
Release is the final stage
of OPUS. It can be as simple as copying the files to
a special directory structure or as complex as loading them into
a sophisticated relational database on a SQL server.
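The simple end of that range, copying files into a directory structure, might look like the sketch below. The layout `<release_root>/<object_id>/<kind>/` and the function name are hypothetical; the source does not describe the actual directory structure OPUS uses.

```python
import shutil
from pathlib import Path


def release_object(object_id, derivatives, release_root):
    """Copy derivative files into <release_root>/<object_id>/<kind>/.

    `derivatives` maps a derivative kind (e.g. "tiff", "pdf") to a list of
    source file paths. Returns the paths of the released copies.
    """
    released = []
    for kind, files in derivatives.items():
        target_dir = Path(release_root) / object_id / kind
        target_dir.mkdir(parents=True, exist_ok=True)
        for source in files:
            # copy2 preserves file timestamps along with the contents.
            released.append(Path(shutil.copy2(source, target_dir)))
    return released
```

A database-backed release would replace the copy step with inserts into tables for objects, files and metadata, but the per-object, per-derivative loop would stay much the same.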
|