
Alexandria Gazetteer

The Alexandria Digital Library's gazetteer was created by combining the gazetteer databases from the U.S. Geological Survey and the U.S. Defense Mapping Agency (now the National Imagery and Mapping Agency, NIMA). It consists of approximately 6 million placenames with latitude and longitude point locations. The placenames are categorized by a set of feature classes and types that were derived from the original datasets. To improve on this initial gazetteer, ADL has developed a Content Standard for gazetteers and a Thesaurus of Feature Types to provide a more extensive vocabulary for placename categorization.

This work has been supported by the Hughes Information Technologies Systems, EOSDIS Core System, as part of their Collaborative Prototyping Program: ``The ECS-UCSB Collaborative Prototype on Digital Libraries as ECS Value-added Providers: Gazetteer Implementation.'' Earlier work under this agreement included establishing the Content Standard and a relational database model for it. Phase II has two tasks. Task 1 is to develop a ``gazetteer classes/features list schema with authoritated list'' and Task 2 is to (1) ``assess the availability of and the need for classes of placenames for the gazetteer,'' and (2) ``provide an initial gazetteer for the ECS Client that contains placenames and associated footprints for the geographic locations identified in the first deliverable. These gazetteer entries will be categorized with terms from the feature type thesaurus developed in Task 1.'' Hughes provided $70,261 for Phase II work.

Progress report

The work we are doing on this gazetteer project will be integrated into both the Alexandria Digital Library and the NASA/Hughes Earth Core System Release B.1 for EOSDIS. Work is reported for the Thesaurus of Feature Types and for the new ADL gazetteer.

Thesaurus of Feature Types

Our work on the gazetteer implementation has resulted in the development of a new thesaurus of feature types for use with gazetteers. This thesaurus is based on the Guidelines for the Construction, Format, and Management of Monolingual Thesauri (ANSI/NISO Standard Z39.19-1993) and is designed primarily for information retrieval. The advantage of such a thesaurus lies in the relationships it represents between terms: hierarchical, associative, and equivalence. These relationships let users navigate among terms and also link the user's own terminology to the terms used by the system. The terms in the thesaurus have been drawn from feature types in existing gazetteers (USGS GNIS and NIMA's GNPS) and other sources such as NATO's Feature and Attribute Coding Catalog (FACC). The USGS Circular 1048 entitled ``An Enhanced Digital Line Graph Design,'' various earth science and regular dictionaries, and thesauri from related information indexing and retrieval systems have been used as references for terms and their definitions and relationships.
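The three relationship types can be illustrated with a minimal sketch. It is not the MultiTes data model or the ADL implementation; the class names and the sample terms below ``hydrographic features'' (``streams'', ``lakes'', ``creeks'') are hypothetical and chosen only to show how hierarchical, associative, and equivalence links support navigation and lookup from a user's own wording.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Term:
    name: str
    broader: Optional[str] = None                      # hierarchical: broader term
    narrower: List[str] = field(default_factory=list)  # hierarchical: narrower terms
    related: List[str] = field(default_factory=list)   # associative: related terms


class Thesaurus:
    def __init__(self) -> None:
        self.terms: Dict[str, Term] = {}   # valid (preferred) terms
        self.use: Dict[str, str] = {}      # equivalence: entry term -> valid term

    def add_term(self, name: str, broader: Optional[str] = None) -> None:
        self.terms[name] = Term(name, broader)
        if broader is not None:
            self.terms[broader].narrower.append(name)

    def add_related(self, a: str, b: str) -> None:
        # Associative links are recorded in both directions.
        self.terms[a].related.append(b)
        self.terms[b].related.append(a)

    def add_entry_term(self, variant: str, preferred: str) -> None:
        self.use[variant.lower()] = preferred

    def resolve(self, user_word: str) -> Optional[str]:
        """Map a user's word to a valid term, or None if it is unknown."""
        key = user_word.lower()
        if key in self.terms:
            return key
        return self.use.get(key)

    def path_to_top(self, name: str) -> List[str]:
        """Follow broader-term links up to the top term."""
        path = [name]
        while self.terms[path[-1]].broader is not None:
            path.append(self.terms[path[-1]].broader)
        return path


# Hypothetical sample terms, for illustration only.
thesaurus = Thesaurus()
thesaurus.add_term("hydrographic features")
thesaurus.add_term("streams", broader="hydrographic features")
thesaurus.add_term("lakes", broader="hydrographic features")
thesaurus.add_related("streams", "lakes")
thesaurus.add_entry_term("creeks", "streams")

print(thesaurus.resolve("Creeks"))
print(thesaurus.path_to_top("streams"))
```

In this sketch, an entry term such as ``creeks'' plays the role of an invalid term kept for reference: it is not used for categorization but leads the user to the valid term.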

The draft thesaurus has been reviewed by Giulietta Fargion (Hughes), UCSB faculty and students, staff of the Global Change Master Directory (Lola Olsen and her contractors), and NOAA staff (Gerry Barton and Kevin Vrieze). In all cases, the extensive comments received were reviewed and integrated into the thesaurus as appropriate. It has also been reviewed by Bob Rugg, who is the chair of the ISO working group within TC211 working on ``Feature Cataloguing - Methodology.'' I have also been appointed as a U.S. expert for TC211 Working Group 3 on Indirect Reference Systems (Gazetteers) as a result of this project work.

In addition, the feature type thesaurus is being compared to the feature types of the NATO Feature and Attribute Coding Catalog (FACC) used by NIMA, to the feature types used by the Canadian Permanent Committee on Geographical Names (CPCGN), and to the categories used by the Getty Museum's Thesaurus of Geographic Names. Terms are added and other adjustments are made in the process. Needed terminology is also discovered during the process of writing conversion rules to move the current ADL gazetteer feature types to the new set of terminology. We plan to continue this process of review, comparison, and frequent changes until we are ready for the actual conversion process.
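As a rough illustration of how writing conversion rules can surface missing terminology, the following sketch applies a rule table to gazetteer records and tallies the feature types that no rule covers. The rule format, the field names, the feature type codes, and the sample records are all hypothetical; the actual ADL conversion rules are not described here.

```python
from collections import Counter

# Hypothetical rules: (source gazetteer, legacy feature type) -> thesaurus term.
RULES = {
    ("GNIS", "stream"): "streams",
    ("GNIS", "lake"): "lakes",
    ("GNPS", "STM"): "streams",
}

# Hypothetical sample records, for illustration only.
SAMPLE = [
    {"name": "Example Creek", "source": "GNIS", "feature_type": "stream"},
    {"name": "Example River", "source": "GNPS", "feature_type": "STM"},
    {"name": "Example Reservoir", "source": "GNPS", "feature_type": "RSV"},
]


def convert(records, rules):
    """Apply conversion rules; return converted records and unmapped types."""
    converted = []
    unmapped = Counter()
    for rec in records:
        key = (rec["source"], rec["feature_type"])
        term = rules.get(key)
        if term is None:
            # No rule yet: a candidate for a new rule or a new thesaurus term.
            unmapped[key] += 1
        else:
            converted.append({**rec, "thesaurus_term": term})
    return converted, unmapped


converted, unmapped = convert(SAMPLE, RULES)
print(len(converted), "records converted")
for (source, feature_type), count in unmapped.most_common():
    print("no rule for", source, feature_type, "x", count)
```

Reviewing the unmapped tally after each pass is one way the rule-writing step could feed new terms back into the thesaurus before the actual conversion.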

As of this date (3/17/98), the thesaurus contains a total of 549 terms (191 valid terms and 358 invalid terms for reference). The size of the file is 231 Kbytes. It has been developed using the MultiTes Thesaurus software, which is designed to handle multilingual thesauri. So far, our thesaurus is in the English language only. An HTML version is available for review at the following URL: http://www.alexandria.ucsb.edu/ lhill/html/index.htm. The structure of six (6) top terms (administrative areas, hydrographic features, land parcels, manmade features, physiographic features, and regions) is working out very well as a basis for the organization of the terminology.

We have hired two graduate students to work on the project. Mary-Anna Rae is a Ph.D. student in the Graduate School of Education and has worked with the Alexandria Digital Library Project for the last year. She has worked on developing conversion rules from the current feature classes and types to the new thesaurus terms and has performed quality control. Rong Hua was a graduate student in Computer Science who established the new relational database structure for the gazetteer and loaded it with data from the USGS, NIMA, and other sources. Qi Zheng, who is the senior engineer for the ADL Project, is also spending part of his time in support of this project.






Terence R. Smith
Tue Jul 21 09:26:42 PDT 1998