Stages of DModlg

Conceptual vs. Logical vs. Physical: Stages of Data Modeling

It has become common to speak of levels of data models.

Conceptual is commonly thought of as a high-level, enterprise-wide, abstract data model. The conceptual model is the one to present to executives for an initial understanding of the data model. It may also be a preliminary model to which detail is added as the modeling process progresses.

Physical, at the bottom, is how the data is stored in a database implemented in some database management system (DBMS) or NoSQL tool, its physical realization.

Logical is something in between, adding detail to the conceptual model but free of physical implementation details which do not contribute to the logical understanding of the model. Sometimes a Relational or Entity-Relationship (ER) model is considered the logical model.

In preparation for this presentation, attendees are encouraged to view the 35 minute video from David Hay entitled “Kinds of Data Models -- and How to Name Them.” (Search on YouTube or go to: http://bit.ly/QwSkOY ). David is one of the pre-eminent thinkers and authors on data modeling. This video captures the essence of popular thinking regarding the characterization and distinctions between conceptual, logical, and physical data models. My presentation could be considered a rebuttal or contrary view to the one promulgated by David Hay.

All data modeling is driven or guided by some data modeling scheme. It tells the modeler what constructs to use in developing a data model and the rules for assembling those constructs. Examples of constructs are: entities (or objects), attributes, entity/attribute names, identifiers, keys, records, relationships, foreign keys, indexes, key-value pairs, partitions (in distributed systems), etc. Rules of normalization and integrity constraints are examples of construction rules in a modeling scheme. Some constructs are a step toward physical implementation and should not be introduced in the initial data model.

There are precedence orderings among the various constructs and construction rules. Here are some obvious examples. You must assign names to the entities or objects (and presumably some detailed description so people know what the terms mean) before they can be referenced in relationships or constraints. You must know about attributes before you can introduce records (or relations as in the relational data model). You must define keys before you can introduce foreign keys. You must define relationships before you can introduce roles or dependency constraints.

This presentation takes a closer look at the various constructs and constraints commonly used in data modeling. Then puts them into a precedence graph, from which we can identify some of the data modeling schemes in common use today.