Fact Modeling Fundamentals

Fact-based, Fact-oriented, or simply Fact Modeling is a different way of thinking about data modeling, as distinct from the popular ER or Relational (ERel) modeling scheme. It is based on the modeling of facts taken from narratives of user domain experts. Rather than modeling data we are actually modeling the user world. From the fact model we can derive the design for an ER or relational database with entities, attributes, and entity relationships.

Fact modeling begins with the basic construct of a fact. An elementary fact sentence consists of one predicate phrase and one or more objects. For example, "Person works in Department" is a binary fact. In language, the nouns become objects and the verbs or predicates become relationships. In fact modeling we start by finding subjects/objects and predicates in the user narratives, rather than entities and their attributes. A different sequence of steps makes it easier to accurately model very complex domains and express a much richer set of constraints.
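
As a rough illustration of the elementary fact construct, here is a minimal Python sketch. The class names `ObjectType` and `FactType` and the `{}` placeholder convention for the predicate phrase are assumptions made for this example; they are not the notation of any particular fact modeling tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectType:
    """A noun taken from the user narrative, e.g. Person or Department."""
    name: str


@dataclass(frozen=True)
class FactType:
    """One predicate phrase plus the object types that play roles in it."""
    predicate: str   # hypothetical convention: "{}" marks each role
    roles: tuple     # the ObjectType playing each role, in reading order

    @property
    def arity(self) -> int:
        # Number of roles: 1 = unary, 2 = binary, 3 = ternary, ...
        return len(self.roles)

    def reading(self) -> str:
        # Fill each placeholder with the name of the object type in that role.
        return self.predicate.format(*(r.name for r in self.roles))


person = ObjectType("Person")
department = ObjectType("Department")
works_in = FactType("{} works in {}", (person, department))

print(works_in.reading(), "has arity", works_in.arity)
```

Note that nothing in the sketch says whether Department is an entity or an attribute of Person; both are simply object types playing roles in a fact.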

EXTENDED DESCRIPTION

Entity Relationship (ER) data modeling (and its many variants) has some serious limitations; Relational modeling even more so. Many data modelers and data architects are not even aware of these limitations, and fall into traps when they try to model complex situations. In fact, many consider SQL and Relational to be the best approach to data modeling. Most of the languages, data modeling tools, DBMSs, and textbooks used today are based on the Relational model. It is important to the practice of data modeling that we recognize these limitations, even if we must ultimately implement using Relational-based tools.

This workshop will expose the problems stemming from a record-based approach to data modeling, and get at the root cause of those problems. With that understanding it is possible to develop appropriate alternatives to the traditional approach to data modeling.

In fact modeling (and variants such as ORM) we start with fact statements. The nouns represent objects and the verb phrases or predicates represent relationships. Fact modeling does not begin with tables to contain information about the major entities. Both entities and attributes are considered objects. Each entity/object type is represented only once in the model, and all types of relationships are represented the same way. Following the step-by-step process presented in this workshop, you can produce a data model with all the information needed to correctly put attributes into entity tables, and to formally define much richer integrity constraints, leading to higher quality data. You will learn how to generate tables by applying two simple transformation rules. You are also not left with a paper-and-pencil approach -- data modeling tools exist to support this modeling process and generate the relational tables automatically, guaranteed to be fully normalized.

In this tutorial, attendees will:

  1. Learn the basics of fact modeling and how it contrasts with the traditional ER/Relational approach to data modeling.
  2. Apply it in some small design problems and discover how much easier it is to arrive at a better final data model.
  3. See a demonstration of this process using an open source data modeling tool.

MORE DETAIL

To appreciate Fact Modeling (such as ORM) you must approach data modeling differently than in relational modeling. This tutorial will show you exactly how you need to change your thinking and your approach to data modeling. You will find it to be more intuitive and direct. In fact, you may discover that you are already thinking this way to some extent. Here we will formalize the approach so you can apply it in your data modeling projects immediately. You are no longer constrained by thinking about relational tables, what attributes or columns to include in those tables, wondering if you have done it “right” or not, and whether or not you have violated any of the rules of normalization. We will work through some simple exercises to solidify your understanding. After this you will never think about data modeling the same way again, even if you don’t formally adopt Fact Modeling or use a design support tool.

In ORM each object (whether entity or attribute) appears only once in the entire data model. NOTE: we never think about attributes per se. So what is an attribute? Here is the punch line: An ATTRIBUTE is an OBJECT playing a ROLE in a RELATIONSHIP with some other OBJECT. With this view, you can defer thinking about tables and hence having to figure out whether or not something is an attribute or an entity -- sometimes it is both.
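
The "sometimes it is both" point can be made concrete with a small Python sketch. The three facts and the helper names `describes` and `described_by` are hypothetical, chosen only for this illustration.

```python
from collections import namedtuple

# A binary fact: subject, predicate phrase, object (all illustrative).
Fact = namedtuple("Fact", "subject predicate object")

facts = [
    Fact("Person", "works in", "Department"),
    Fact("Department", "has", "Budget"),
    Fact("Person", "was born on", "Date"),
]


def describes(obj, facts):
    """Object types that `obj` describes; here `obj` looks like an attribute."""
    return sorted(f.subject for f in facts if f.object == obj)


def described_by(obj, facts):
    """Object types that describe `obj`; here `obj` looks like an entity."""
    return sorted(f.object for f in facts if f.subject == obj)


# Department plays both kinds of role, so calling it "an attribute" or
# "an entity" depends on which relationship you are looking at.
print("Department describes:", describes("Department", facts))
print("Department is described by:", described_by("Department", facts))
```

In this sketch Department describes Person (attribute-like) and is itself described by Budget (entity-like), yet it appears only once as an object type.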

Steps in this fact-based approach:

1- From what the user domain experts tell you (in fact sentences), first think about modeling the objects (the nouns), without thinking about relationships or clustering attributes into relational tables. In ER/Relational modeling, the entities are objects, but so are the attributes when considered by themselves. Name each of the object populations. This establishes a vocabulary to talk about the business. In the Fact Model, each object population is represented only once in the entire model.

2- Next think about and model relationships between objects (the verbs) free of any consideration of constraints. If an object serves as a descriptor of another object (hence an attribute), initially just record the fact that there is a relationship.

3- Then add integrity constraints (business rules) to the objects and relationships (e.g. characteristics such as multiplicity/exclusivity and dependency/optionality).

4- Finally, designate a lexical surrogate which is used to reference or identify individual members of each of the object populations. NOTE: the designation of identifiers is or should be secondary in any approach to data modeling. All too often data modelers get hung up on identifiers and lose sight of the underlying object populations.

Only then are you ready to put attributes (as related to entity objects) into entity tables.
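
The four steps, followed by the final grouping into tables, can be sketched in Python as below. All names, facts, and identifiers are invented for illustration. The grouping rule in `to_tables` is a simplified assumption (a fact in which each subject takes part at most once becomes a column of the subject's table; any other fact gets its own table) and is not necessarily the exact pair of transformation rules taught in the workshop.

```python
# Step 1: name the object populations, each exactly once.
objects = {"Person", "Department", "Date", "Skill"}

# Step 2: record relationships between objects, with no constraints yet.
facts = [
    ("Person", "works in", "Department"),
    ("Person", "was born on", "Date"),
    ("Person", "has", "Skill"),
]
assert all(s in objects and o in objects for s, _, o in facts)

# Step 3: add constraints. Here only one kind: the facts in which each
# subject instance takes part at most once (many-to-one).
functional = {facts[0], facts[1]}

# Step 4: designate a lexical identifier for each object population.
identifiers = {
    "Person": "person_nr",
    "Department": "dept_code",
    "Date": "date",
    "Skill": "skill_name",
}


def to_tables(facts, functional, identifiers):
    """Group facts into tables using the simplified rule described above."""
    tables = {}
    for fact in facts:
        subject, _, obj = fact
        if fact in functional:
            # Many-to-one fact: becomes a column of the subject's table.
            cols = tables.setdefault(subject, [identifiers[subject]])
            cols.append(identifiers[obj])
        else:
            # Many-to-many fact: gets a table of its own.
            tables[f"{subject}_{obj}"] = [identifiers[subject], identifiers[obj]]
    return tables


for name, cols in to_tables(facts, functional, identifiers).items():
    print(name, cols)
```

The point of the ordering is visible in the code: the decision about which object becomes a column and which gets its own table is made only at the end, from the constraints, rather than guessed up front.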