A digital library, in the abstract, must minimally provide a means to store large volumes of heterogeneous data and a user interface through which users can query and retrieve that data. To locate and retrieve data efficiently, libraries usually contain a catalogue, which supports index structures and access methods. The library must also include some mechanism for ingesting new data. The four components of this project (interface, catalogue, storage, and ingest) will be connected over a network. Distribution of both the data and the interface components is essential for the success of any such library. In particular, the physical distribution of the data is usually dictated by the availability of the maps, images, and even text. The replication of digital data and the distributed nature of the user interface components are therefore crucial to the success of the design.
Figure 1: Alexandria Architecture
The proposed Alexandria distributed architecture is illustrated in Figure 1. It consists of a network interconnecting any number of each of the following four components. The interface component of the digital library provides library interactions and facilities to the user. In particular, users query, browse, and retrieve data through this interface. The catalogue component contains the basic index structures and provides the database services to the users. The storage component is primarily concerned with providing large storage capabilities through secondary and tertiary storage devices. The storage will contain the digital maps, images and text. Finally, the ingest component allows the librarians and system managers to incorporate new data into the system. Each of these components has an interface to the network providing protocols for communication within Alexandria.
The Alexandria interface is an all-encompassing term for any type of interface through which users may interact with the system. In particular, this interface may range from a dumb terminal with a simple textual interface to a powerful workstation with a graphical interface, large memory, and multiprocessor capabilities. A standard digital library interface should, however, provide a graphical interface for the visualization of images and maps. This interface would need a basic visualization library as well as basic capabilities for supporting query processing and browsing. Depending on the capabilities of the particular user interface, these functions could be supported locally or remotely at the catalogue. We initially envision most of the query processing taking place in the catalogue and most of the browsing capabilities residing in the storage components. Furthermore, each Alexandria interface will be associated with a particular catalogue component, which will handle all of its query processing. This will simplify the design of the protocols and the implementation of the system.
The Alexandria ingest component provides the primary methods for data acquisition, data capture, preprocessing, and storage. The ingest component is responsible for extracting any metadata or image features that will be used in content-based indexing. This information is transmitted directly to the relevant Alexandria catalogues. The ingest component is also responsible for constructing hierarchies of low resolution images for browsing and forwarding them to the Alexandria storage components.
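The two outputs of the ingest component can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the dictionaries `catalogue_index` and `storage_store` stand in for remote catalogue and storage components, `extract_features` is a placeholder for real content-based feature extraction, and `downsample` uses plain subsampling rather than any method specified by the design.

```python
def extract_features(pixels):
    """Placeholder feature extractor: image size and mean intensity."""
    flat = [value for row in pixels for value in row]
    return {"rows": len(pixels), "cols": len(pixels[0]),
            "mean": sum(flat) / len(flat)}


def downsample(pixels):
    """Halve the resolution by keeping every second pixel (crude stand-in)."""
    return [row[::2] for row in pixels[::2]]


def ingest(item_id, pixels, catalogue_index, storage_store, levels=2):
    """Send metadata to the catalogue and an image hierarchy to storage."""
    # Metadata and features go to the catalogue for indexing.
    catalogue_index[item_id] = extract_features(pixels)
    # The hierarchy of lower resolution images goes to storage for browsing.
    hierarchy = [pixels]
    for _ in range(levels):
        hierarchy.append(downsample(hierarchy[-1]))
    storage_store[item_id] = hierarchy  # index 0 is full resolution
    return len(hierarchy)


catalogue_index, storage_store = {}, {}
image = [[r * 4 + c for c in range(4)] for r in range(4)]
count = ingest("map-001", image, catalogue_index, storage_store)
```

In this sketch the catalogue never receives pixel data and the storage never receives features, which mirrors the separation of responsibilities described above.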
To use the Alexandria library, the user initiates a session on the Alexandria interface. Typically, this will involve searching the catalogue and browsing the storage. To search for or retrieve data, a user formulates a query and uses the network component to send it to the appropriate Alexandria catalogue. The catalogue contains a query processor with spatial data handling capabilities. This query processor uses the metadata extracted from the data as well as the index structures constructed during data ingestion. To respond to a query, the catalogue determines the location of the data corresponding to the query and returns a handle to the user. This handle identifies the Alexandria storage component as well as the physical properties of the requested image (e.g., physical address, format, etc.). We chose to pass this information back to the interface, instead of simply retrieving the actual data, in order to allow the user to retrieve the data selectively. Finally, the interface retrieves the actual images directly from the storage component.
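The handle-based exchange described above might be sketched as follows. This is an illustration under stated assumptions, not the Alexandria protocol: the class names, the fields of `Handle`, the keyword-based `query` method, and the in-process method calls standing in for network messages are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Handle:
    """Hypothetical handle: where an item lives and how it is stored."""
    storage_id: str  # which storage component holds the item
    address: str     # physical address within that component
    fmt: str         # storage format, e.g. "tiff"


class Storage:
    """Toy storage component: maps physical addresses to raw bytes."""

    def __init__(self, storage_id):
        self.storage_id = storage_id
        self._items = {}

    def put(self, address, data):
        self._items[address] = data

    def fetch(self, address):
        return self._items[address]


class Catalogue:
    """Toy catalogue: maps a keyword to handles, never to the data itself."""

    def __init__(self):
        self._index = {}

    def register(self, keyword, handle):
        self._index.setdefault(keyword, []).append(handle)

    def query(self, keyword):
        return list(self._index.get(keyword, []))


class Interface:
    """Toy interface bound to a single catalogue, as in the text."""

    def __init__(self, catalogue, storages):
        self.catalogue = catalogue
        self.storages = {s.storage_id: s for s in storages}

    def search(self, keyword):
        return self.catalogue.query(keyword)

    def retrieve(self, handle):
        # Data comes directly from storage, bypassing the catalogue.
        return self.storages[handle.storage_id].fetch(handle.address)


storage = Storage("store-1")
storage.put("blk/0042", b"map-bytes")
catalogue = Catalogue()
catalogue.register("santa barbara", Handle("store-1", "blk/0042", "tiff"))

ui = Interface(catalogue, [storage])
handles = ui.search("santa barbara")
selected = handles[0]  # the user chooses which results to fetch
data = ui.retrieve(selected)
```

Because `search` returns only handles, the user can inspect the results and fetch a subset, which is the selective retrieval that motivates returning a handle rather than the data.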
To browse through the library, the user obtains an initial handle to the data of interest and uses that handle to step through a sequence of low resolution images. Unlike traditional alphanumeric data, image data can be retrieved on a variety of features, e.g., image content, location, direction, time, etc. The Alexandria storage components provide powerful browsing capabilities, including the display of low resolution images for quick search. Multi-resolution decomposition using wavelet transforms facilitates this process. Once the user has located the particular images of interest, the higher resolution images are displayed. A typical user session will consist of a sequence of queries and browsing steps, each further narrowing the search until the particular image of interest is located.
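As an illustration of multi-resolution decomposition, the sketch below applies one level of a two-dimensional Haar transform, the simplest wavelet. The choice of Haar, the averaging normalization, and the function names are assumptions; the text does not specify which wavelet family is used. Each level yields a half-resolution approximation suitable for quick browsing, plus three detail bands from which the finer image can be rebuilt exactly.

```python
def haar_level(image):
    """One level of a 2-D Haar decomposition on an even-sized grid.

    Returns (approx, details): approx is the half-resolution image and
    details holds the three bands needed to rebuild the input exactly.
    """
    approx, horiz, vert, diag = [], [], [], []
    for r in range(0, len(image), 2):
        a_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, len(image[0]), 2):
            p, q = image[r][c], image[r][c + 1]
            s, t = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((p + q + s + t) / 4)  # block average
            h_row.append((p - q + s - t) / 4)  # left minus right
            v_row.append((p + q - s - t) / 4)  # top minus bottom
            d_row.append((p - q - s + t) / 4)  # diagonal difference
        approx.append(a_row)
        horiz.append(h_row)
        vert.append(v_row)
        diag.append(d_row)
    return approx, (horiz, vert, diag)


def reconstruct(approx, details):
    """Invert haar_level: rebuild the finer image from the four bands."""
    horiz, vert, diag = details
    image = []
    for i, a_row in enumerate(approx):
        top, bottom = [], []
        for j, a in enumerate(a_row):
            h, v, d = horiz[i][j], vert[i][j], diag[i][j]
            top += [a + h + v + d, a - h + v - d]
            bottom += [a + h - v - d, a - h - v + d]
        image += [top, bottom]
    return image


def pyramid(image, levels):
    """Coarsest-first list of approximations, ending with the original."""
    result = [image]
    for _ in range(levels):
        coarser, _details = haar_level(result[-1])
        result.append(coarser)
    return result[::-1]


image = [[1, 3, 5, 7],
         [1, 3, 5, 7],
         [2, 2, 6, 6],
         [4, 4, 8, 8]]
approx, details = haar_level(image)
views = pyramid(image, 2)  # 1x1, then 2x2, then the 4x4 original
```

A browsing client could display `views` in order, fetching the detail bands for a finer level only once the user has settled on an image of interest.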
The Alexandria network will use emerging broadband integrated services to offer a wide range of services to users. Locally, users would be connected by a network technology such as the Distributed Queue Dual Bus (DQDB) IEEE 802.6 metropolitan area network. This will allow fast response times for both local and remote users. Finally, the library services will be integrated into the OSI model [64].