A data model for paleontological species-level, specimen-based information

Paleontological data are complex. They involve many classes of information that relate physical objects, such as type specimens, to many different things, including collecting localities, systematic concepts such as a genus, and published statements about stratigraphic or systematic placement. As the paleobiological community seeks to build ever more refined databases from the literature, museum collections, and new stratigraphic collections, it has become clear that we need sophisticated data models to handle this information. Furthermore, it is widely recognised that sources of paleontological data vary greatly in quality, ranging from type specimens with well-documented provenance to offhand literature references stating that a particular species is included in the faunal list from some formation. It is therefore critical that we be able to identify the source of each piece of information in a data set, and to reject information that we judge to be of inadequate quality for a particular purpose. This data model has been designed to address all of these attributes of species-level paleontological data.

The heart of this model is the concept of an occurrence: a report that a species has been found at some location. All occurrences are vouchered by either literature references or museum specimens.
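The idea of a vouchered occurrence can be illustrated with a minimal Python sketch. This is not the model's actual schema: the class names, field names, and sample records are hypothetical, and the sketch assumes a single voucher per occurrence purely for simplicity. It shows only how tying each occurrence to its source lets a user reject records whose source is inadequate for a given purpose.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class LiteratureReference:
    """A published report of a species (hypothetical fields)."""
    citation: str


@dataclass(frozen=True)
class MuseumSpecimen:
    """A physical specimen held in a collection (hypothetical fields)."""
    catalog_number: str
    is_type: bool = False


# A voucher is either a literature reference or a museum specimen.
Voucher = Union[LiteratureReference, MuseumSpecimen]


@dataclass(frozen=True)
class Occurrence:
    """A report that a species has been found at some location."""
    species: str
    locality: str
    voucher: Voucher  # assumption: one voucher per occurrence


def specimen_vouchered(occurrences):
    """Keep only the occurrences backed by a museum specimen."""
    return [o for o in occurrences if isinstance(o.voucher, MuseumSpecimen)]


# Made-up placeholder records, for illustration only.
records = [
    Occurrence("Species A", "Locality 1", MuseumSpecimen("CAT-0001", is_type=True)),
    Occurrence("Species A", "Locality 2", LiteratureReference("Hypothetical faunal list")),
]

for occurrence in specimen_vouchered(records):
    print(occurrence.species, occurrence.locality)
```

A stricter analysis might keep only type specimens, while a broad survey might accept literature references as well; the point is that the choice stays with the user because the source of every record is preserved.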


Thumbnail of E-R diagram

Figure 1. Summary Entity-Relationship diagram for the data model. Symbols follow the IDEF1X convention. A clear explanation of the symbols used here may be found in Teorey (1994).


To allow greater portability of data between different databases, certain portions of this data model have been drawn from existing models. Large portions directly follow the Invertebrate Paleontological Collections Management Data Model, which itself draws upon several other models found in the community. The basic structure of the portions of the data model that deal with literature references and with higher taxonomic ranks is drawn from the PaleoBank data model developed at the Paleontological Institute at the University of Kansas. Concepts related to geographic and stratigraphic occurrence are drawn from the POSC Epicenter data model. This model has also drawn from the Association of Systematics Collections draft data model for biological collections standards.

The model was created in the CASE tool xCase, which uses a variant of the IDEF1X format. Documentation produced by xCase was converted to HTML with HotDog.

Documentation

Entity Relationship Diagram
Alphabetical List of Entities
Conceptual List of Entities
Relationship Cardinalities and Descriptions
Entity Documentation

References

Bruce, T.A., 1992. Designing Quality Databases with IDEF1X Information Models. Dorset House, New York.

Chen, P.P., 1976. The Entity-Relationship Model: Toward a unified view of data. ACM Transactions on Database Systems 1:9-36.

Elmasri, R., and S.B. Navathe. 1994. Fundamentals of Database Systems. Benjamin/Cummings, Menlo Park.

Krebs, J.W., et al., 1996. PaleoBank, a relational database for invertebrate paleontology: The data model. University of Kansas, Paleontological Contributions 8:1-7.

Shlaer, S., and S.J. Mellor. 1988. Object-Oriented Systems Analysis: Modeling the world in data. Yourdon Press, Englewood Cliffs, New Jersey.

Teorey, T.J., 1994. Database Modeling and Design: The fundamental principles. Morgan Kaufmann, San Francisco.


All Material on this page and in associated images Copyright © Paul J. Morris, 1996-1998
Paul J. Morris mole@morris.net

16 Apr 1998