Book Review: Star Schema

May 13, 2015 — Larry Rockoff

Star Schema by Christopher Adamson is a clearly written and comprehensive book about the all-important topic of dimensional design. The ability to create a database with a dimensional design is essential to business intelligence work, and it underlies many analytical processes.

The term “star schema” refers to the star-like structure of a dimensional design, in which a central fact table is joined to any number of surrounding dimension tables. In most cases, the central fact table contains information on discrete events, and the dimensions provide attributes related to those events. Not only does this type of structure make it easier for a user to grasp the nature of the data, but it also maps well onto crosstabs, pivot tables and charts. Any time anyone uses an Excel pivot table, they are in essence viewing data in a dimensional way, with the values of the pivot table corresponding to elements in the fact tables, and the rows and columns corresponding to attributes in dimension tables. Furthermore, many BI reporting tools rely on data structures called cubes that take star schemas to a higher level, allowing for advanced capabilities such as the ability to drill down into data that’s been pre-arranged in a hierarchical manner.
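To make the structure concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not taken from the book; the table names (sales_fact, product_dim, date_dim), the columns and the sample rows are all hypothetical. One fact table is joined to two dimension tables, and the grouped totals correspond to the cells of a pivot table whose rows and columns come from dimension attributes.

```python
import sqlite3

# Hypothetical star schema: one fact table surrounded by two dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE date_dim    (date_key    INTEGER PRIMARY KEY, year INTEGER);
    CREATE TABLE sales_fact (
        product_key INTEGER REFERENCES product_dim(product_key),
        date_key    INTEGER REFERENCES date_dim(date_key),
        amount      REAL
    );
    INSERT INTO product_dim VALUES (1, 'Books'), (2, 'Games');
    INSERT INTO date_dim    VALUES (10, 2014), (11, 2015);
    INSERT INTO sales_fact  VALUES (1, 10, 20.0), (1, 11, 30.0),
                                   (2, 10, 5.0),  (2, 11, 15.0), (2, 11, 10.0);
""")


def sales_by_category_and_year(connection):
    """Return {(category, year): total amount}, i.e. the cells of a pivot table."""
    rows = connection.execute("""
        SELECT p.category, d.year, SUM(f.amount)
        FROM sales_fact f
        JOIN product_dim p ON p.product_key = f.product_key
        JOIN date_dim d    ON d.date_key = f.date_key
        GROUP BY p.category, d.year
    """).fetchall()
    return {(category, year): total for category, year, total in rows}


pivot = sales_by_category_and_year(conn)
for (category, year), total in sorted(pivot.items()):
    print(category, year, total)
```

In pivot-table terms, category would supply the row labels, year the column labels, and the summed amount the values.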

A distinguishing quality of this book is that it is not specific to the methodologies advocated by business intelligence gurus Bill Inmon or Ralph Kimball. In fact, the author spends a great deal of time going over the salient points of Inmon’s Corporate Information Factory and Kimball’s Dimensional Data Warehouse strategies, explaining the benefits of each approach. By avoiding the hype that is sometimes found in books by these self-declared experts, Mr. Adamson delivers a balanced and objective treatise on the subject.

For the BI practitioner, this book is much more than an overview, and is filled with the technical details that are needed to make sense of this complex topic. Early chapters explain the basics of dimension and fact tables, covering essential elements such as surrogate keys, dimensional hierarchies and the intricacies of slowly changing dimensions. The advantages and disadvantages of OLAP cubes are discussed, helping the reader understand the differences between cubes and relational data warehouses.

A later chapter on factless fact tables is particularly interesting, and is indicative of the well-thought-out design tactics advocated by the author. A common design problem for the data analyst is how to model events. The factless fact table provides a way to capture events that lack quantitative measures. In this type of situation, one merely wants to count the occurrences of various events. These can be anything from service requests to contacts with a customer to website clicks.
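As a rough illustration of the idea (again a sketch, not an example from the book), the hypothetical service_request_fact table below holds only foreign keys to its dimensions and no numeric measure. Analysis then comes down to counting rows grouped by dimension attributes. All table names, column names and rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_dim     (customer_key     INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE request_type_dim (request_type_key INTEGER PRIMARY KEY, name   TEXT);
    -- Factless fact table: one row per event, foreign keys only, no measures.
    CREATE TABLE service_request_fact (
        customer_key     INTEGER REFERENCES customer_dim(customer_key),
        request_type_key INTEGER REFERENCES request_type_dim(request_type_key)
    );
    INSERT INTO customer_dim     VALUES (1, 'East'), (2, 'West');
    INSERT INTO request_type_dim VALUES (1, 'Repair'), (2, 'Billing');
    INSERT INTO service_request_fact VALUES (1, 1), (1, 1), (1, 2), (2, 1);
""")


def count_requests(connection):
    """Count events per (region, request type) by counting fact rows."""
    rows = connection.execute("""
        SELECT c.region, t.name, COUNT(*)
        FROM service_request_fact f
        JOIN customer_dim c     ON c.customer_key = f.customer_key
        JOIN request_type_dim t ON t.request_type_key = f.request_type_key
        GROUP BY c.region, t.name
    """).fetchall()
    return {(region, name): n for region, name, n in rows}


request_counts = count_requests(conn)
```

Because there is nothing to sum, COUNT(*) stands in for the measure: each row's existence is the fact being recorded.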

Another particularly useful feature of the book is its final chapter on how to design and document dimensional models. Not only are the key steps in designing a dimensional model clearly outlined, but some useful templates are provided for documenting the results. This includes some nice ways to graphically depict fact and dimension tables, as well as how to utilize a matrix to portray the relationships between conformed dimensions and facts. For the motivated BI professional, this book is a superb resource that will provide the details necessary to create an understandable and useful dimensional design of data.

Star Schema: The Complete Reference
by Christopher Adamson
McGraw-Hill, July 2010
486 pages, $59.99
