A data model for paleontological species-level, specimen-based information


Paleontological data are complex. They involve many different classes of information, relating physical objects such as type specimens to collecting localities, to systematic concepts such as a genus, and to published statements about stratigraphic or systematic placement. As the paleobiological community seeks to build ever more refined databases from the literature, museum collections, and new stratigraphic collections, it has become clear that we need sophisticated data models to deal with this information. Furthermore, it is widely recognised that sources of paleontological data are highly variable in their quality, ranging from type specimens with well-documented provenance to offhand literature references stating that a particular species is included in the faunal list from some formation. It is therefore critically important that we be able to identify the source of each piece of information in a data set, and to reject information that we judge to be of inadequate quality for a particular purpose. This data model has been designed to address all of these attributes of species-level paleontological data.


Summary Diagram

Figure 1. This diagram, loosely in the form of an entity-relationship diagram, summarises the main classes of information incorporated in this model. The primary data are here seen as species-level taxonomic concepts, which can be associated with morphological descriptions and have been reported to occur in the real world. These data arise from some authority, be it a published reference or a museum lot. Of critical importance is the ability to verify the source and quality of both taxonomic and occurrence data. This diagram is intended to serve as a road map for the more detailed descriptions of the data model below.
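
The main classes of information named in this caption can be sketched in code. The Python below is only an illustration under stated assumptions, not the schema of this data model: the class names, field names, the three-level SourceQuality ranking, and the sample records are all hypothetical and were invented for the example. It shows one possible way of tying each taxonomic concept and each occurrence to an authority, so that records from sources judged inadequate for a given purpose can be rejected.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List, Optional


class SourceQuality(IntEnum):
    """Hypothetical ranking of sources; higher values mean better documentation."""
    FAUNAL_LIST_MENTION = 1
    PUBLISHED_DESCRIPTION = 2
    TYPE_SPECIMEN = 3


@dataclass(frozen=True)
class Authority:
    """The source of a piece of information: a published reference or a museum lot."""
    kind: str               # assumed labels: "reference" or "museum_lot"
    citation: str
    quality: SourceQuality


@dataclass(frozen=True)
class TaxonConcept:
    """A species-level taxonomic concept, as asserted by one authority."""
    genus: str
    species: str
    authority: Authority
    description: Optional[str] = None   # optional morphological description


@dataclass(frozen=True)
class Occurrence:
    """A report that a taxon concept occurs at a locality, with its own source."""
    taxon: TaxonConcept
    locality: str
    stratigraphic_unit: str
    authority: Authority


def usable_occurrences(occurrences: List[Occurrence],
                       minimum: SourceQuality) -> List[Occurrence]:
    """Keep only the occurrences whose source meets the minimum quality."""
    return [o for o in occurrences if o.authority.quality >= minimum]


# Invented sample records, for illustration only.
type_lot = Authority("museum_lot", "Hypothetical museum lot 0001",
                     SourceQuality.TYPE_SPECIMEN)
list_ref = Authority("reference", "Hypothetical faunal list",
                     SourceQuality.FAUNAL_LIST_MENTION)
concept = TaxonConcept("Examplegenus", "exemplaris", type_lot)
records = [
    Occurrence(concept, "Locality A", "Formation X", type_lot),
    Occurrence(concept, "Locality B", "Formation X", list_ref),
]

# Reject occurrences known only from offhand faunal-list mentions.
kept = usable_occurrences(records, SourceQuality.PUBLISHED_DESCRIPTION)
for occ in kept:
    print(occ.locality, "-", occ.authority.citation)
```

Because the occurrence and the taxon concept each carry their own authority, the two can be checked separately: an occurrence record might come from a weak source even when the concept it refers to is anchored by a type specimen.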


Thumbnail of E-R diagram

Figure 2. Summary Entity-Relationship diagram for the data model. Symbols follow the widely used convention of Chen (1976), except that arrowheads have been added to indicate the direction in which the phrase within each relationship construct should be read in this figure. A clear explanation of the symbols used here may be found in Teorey (1994). Most, but not all, entities and relationships are shown here.


In order to allow for greater portability of data between different databases, certain portions of this data model have been drawn from existing models. Most important is the basic structure of the portions of the data model that deal with literature references and with higher taxonomic ranks. Both of these are drawn from the PaleoBank data model developed at the Paleontological Institute at the University of Kansas. This model has also drawn from the Association of Systematics Collections draft data model for biological collections standards.

Documentation

Alphabetical List of Entities
Conceptual List of Entities
Relationship Cardinalities and Descriptions
Entity Documentation

References

Chen, P.P., 1976. The Entity-Relationship Model: Toward a unified view of data. ACM Transactions on Database Systems 1:9-36.
Teorey, T.J., 1994. Database Modeling and Design: The fundamental principles. Morgan Kaufmann, San Francisco.

All Material on this page and in associated images Copyright © Paul J. Morris, 1995
Paul J. Morris mole@bio.umass.edu

14 Dec 1995