Skip to content

Data Modeling Best Practices

Star Schema

  • Power BI likes Star schemas 1
  • Avoid One Big Table (no dimensional model), results in poor compression and unexpected measure results due to AUTOEXIST 2
  • Degenerate dimensions (e.g. Order Number) are acceptable; but need to consider AUTOEXIST behavior from SUMMARIZECOLUMN() 2
  • Avoid junk dimensions, only use when clearly beneficial

Data Minimization

  • Remove unnecessary columns, it is easy to add more columns to the future, it can be hard to remove them later. Extra columns can result in worse compression & segment ordering, increasing table size and slower scans/query performance.
  • Remove unnecessary rows, if you only need data for 2025, don't import previous years as well
  • Fact table grain: choose most atomic grain feasible for flexibility & correct handling of semi/non-additive facts

Aggregations

  • Can define aggregate fact tables for faster scans in cases where higher grain calculations can be leveraged

Relationships

  • Consider collapsing one-to-one relationships into single table 3
  • Bi-directional relationships should be avoided, they can lead to poor performance and model ambiguity 4 5
  • Hide foreign keys & fact table surrogate keys from report view; expose business fields only

Naming

  • Semantic naming: Use business-friendly table & column names (e.g. Products, Order Date) instead of technical names (dim_product). Capitalize first letter of object names; use consistent casing

Data Types

  • Use the narrowest appropriate data types for accuracy requirements (i.e. whole numbers over decimals when possible), to reduce cardinality of values for optimal compression
  • Split DateTime into separate Date and Time to reduce cardinality
  • Truncate time values (minute/hour) where possible to reduce cardinality

Date Dimensions

  • Disable auto date/time, this feature create a hidden date table for every date column in the model, a date dimension should be used instead.
  • Use a Date dimension as a role-playing dimension:
    • Define Calendar-based time intelligence on date dimension, this allows for more efficient query plans to be generated when filtering on year, month etc.
    • Mark as a date table
    • Can use USERELATIONSHIP() to activate inactive relationship to allow date dimension to act against non-default date field
    • Only create additional date dimensions when comparing two independent date contexts (i.e. [Sales Date] and [Order Date])