---
Brand: klarmetrics.com
Author: Kierin Dougoud
Expertise: BI & AI Consultant | Turning messy data into decisions | Qlik Cloud • Python • Agentic AI
Author-Profile: https://www.linkedin.com/in/mkierin/
Canonical-URL: https://klarmetrics.com/11-qlik-fact-dimension-design/
---

# Fact vs Dimension in Qlik Sense: Making the Right Design Decisions

*This is Article 11 of the [Qlik Sense Data Modeling Course](https://klarmetrics.com/qlik-sense-data-modeling-course/).*

# 📚 Qlik Sense Course – Article 11 of 28

← **Previous Article:** [Star Schema in Qlik – Performance & Clarity](10-qlik-star-schema)

→ **Next Article:** [Link Tables for Many-to-Many Relationships](12-qlik-link-tables)

**What is the difference between Fact and Dimension?** Facts contain transactional, measurable data (revenue, quantities) – they answer “What happened?”. Dimensions contain descriptive data (customer, product) – they answer “Who/What/Where/When?”.

# What will you learn about Fact vs Dimension design decisions?

After this guide you will be able to:

* Decide whether data belongs in a fact or dimension table

* Define the right granularity for facts

* Design performance-optimal data models

**Time investment:** 15 min reading + 2 hrs hands-on

**Prerequisite:** Knowledge of [Star Schema](10-qlik-star-schema)

**Quick win:** In 10 minutes you will know how to correctly classify any table

# The decision: Fact or Dimension?

**The problem:** You have loaded a table with revenue data and you are unsure: Should you keep all fields in one table or split them into facts and dimensions?

**The solution:** Use a simple decision tree: Is the data transactional and changes frequently? → Fact. Does it describe entities and is relatively stable? → Dimension.

# How do you apply the decision tree in practice?

Ask yourself these three questions:

**1. Is the data transactional?**

* Yes → Probably a fact table

* No → Continue to question 2

**2. Does the data contain numeric values that get aggregated?**

* Yes (Sum, Avg, Count) → Fact table

* No → Dimension table

**3. How many rows does the table have?**

* Millions of rows → typically a fact table

* Thousands to hundreds of thousands → typically a dimension table

(Row count is a supporting signal, not a hard rule – questions 1 and 2 take precedence.)

# How do you classify revenue data in an example?

```
// Original table: Sales.csv
// OrderID, CustomerID, CustomerName, Region, ProductID, ProductName,
// Category, Amount, Quantity, OrderDate

// What belongs where?
```

**Analysis:**

* **OrderID:** Unique key → Fact (Primary Key)

* **CustomerID:** Reference → Fact (Foreign Key to dimension)

* **CustomerName, Region:** Descriptive → Dimension

* **ProductID:** Reference → Fact (Foreign Key to dimension)

* **ProductName, Category:** Descriptive → Dimension

* **Amount, Quantity:** Measures for aggregation → Fact

* **OrderDate:** Transaction timestamp → Fact

# How do you choose the right structure for Fact and Dimension?

```
// Fact table: Only transactions + keys + measures
Facts_Sales:
LOAD
    OrderID,              // Primary Key
    CustomerID,           // Foreign Key
    ProductID,            // Foreign Key
    Date(OrderDate) as OrderDate,  // Foreign Key to Date dimension
    Amount,               // Measure
    Quantity              // Measure
FROM [DataSales.csv]
(txt, utf8, embedded labels, delimiter is ',');
```

**Explanation:** The fact table contains only the skeleton of the transaction – keys and numbers. No descriptive text.

```
// Dimension: Customer attributes
Dim_Customer:
LOAD DISTINCT
    CustomerID,           // Primary Key
    CustomerName,
    Region,
    Country,
    City
FROM [DataSales.csv]
(txt, utf8, embedded labels, delimiter is ',');
```

**Explanation:** The dimension contains all information ABOUT the customer. Once per customer, not repeated with every order.

```
// Dimension: Product attributes
Dim_Product:
LOAD DISTINCT
    ProductID,            // Primary Key
    ProductName,
    Category,
    SubCategory,
    Brand
FROM [DataSales.csv]
(txt, utf8, embedded labels, delimiter is ',');
```

**Explanation:** Same here: All information ABOUT the product, once per product.

**Performance tip:** This split can save 40-60% memory: strings like "Samsung Galaxy" and "Electronics" are stored once in the dimension instead of being repeated millions of times in the fact table.

# How do you understand and choose the right granularity?

Granularity determines the “detail level” of your fact table. This is one of the most important design decisions!

**Rule of thumb:** Load data as detailed as possible (atomic level). You can always aggregate later – but you cannot reconstruct details from aggregated data!

# What are the granularity levels in the Fact vs Dimension example?

**Option 1: Order level**

```
// One record = one order
Facts_Orders:
LOAD
    OrderID,              // Key
    CustomerID,
    OrderDate,
    OrderTotalAmount,     // Sum of all line items
    OrderItemCount        // Number of items
FROM Orders;
```

**What you can analyze:** Revenue per order, average order size

**What you CANNOT do:** Which products were purchased in this order?

**Option 2: Order line level**

```
// One record = one line item in an order
Facts_OrderLines:
LOAD
    OrderLineID,          // Unique key
    OrderID,              // Order reference
    ProductID,            // Which product?
    CustomerID,
    OrderDate,
    LineAmount,           // Amount for this line
    LineQuantity          // Quantity for this line
FROM OrderLines;
```

**What you can analyze:** Everything from Option 1 (through aggregation) PLUS product analyses, cross-selling, basket analyses

**Trade-off:** More rows = more memory (but maximum flexibility)

**Option 3: Daily aggregate (pre-aggregated)**

```
// One record = one product per day
Facts_DailySales:
LOAD
    ProductID,
    Date,
    Sum(Amount) as DailyAmount,
    Sum(Quantity) as DailyQuantity
FROM OrderLines
GROUP BY ProductID, Date;
```

**What you can analyze:** Daily revenue, trends

**What you CANNOT do:** Customer analyses, individual orders

**Best practice:** Start with the most detailed level (order line). If performance issues arise, you can always add an aggregated table later, in addition to (not instead of) the detail table.
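If the detail level ever does become a bottleneck, one common pattern is to build the aggregate from the detail table already in memory and store it as a QVD for a separate, lightweight app, so the two granularities never sit in the same data model. A minimal sketch, assuming Facts_OrderLines from Option 2 is loaded and a `lib://Data` connection exists (both are illustrative names):

```
// Build the daily aggregate from the detail table in memory
Facts_DailySales:
LOAD
    ProductID,
    OrderDate,
    Sum(LineAmount) as DailyAmount,
    Sum(LineQuantity) as DailyQuantity
RESIDENT Facts_OrderLines
GROUP BY ProductID, OrderDate;

// Store it for a separate dashboard app, then drop it here
// so it does not form a synthetic key with the detail table
STORE Facts_DailySales INTO [lib://Data/Facts_DailySales.qvd] (qvd);
DROP TABLE Facts_DailySales;
```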

# How do you avoid common mistakes in Fact vs Dimension design?

**Mistake 1: Measures in dimensions**

**Symptom:** Your totals don’t add up, numbers are counted multiple times.

**Wrong:**

```
Dim_Product:
LOAD
    ProductID,
    ProductName,
    Category,
    AveragePrice,         // ← WRONG! This is a measure
    TotalSold             // ← WRONG! This is a measure
FROM Products;
```

**Correct:**

```
Dim_Product:
LOAD
    ProductID,
    ProductName,
    Category,
    StandardPrice         // ← OK: A fixed attribute value, not an aggregate
FROM Products;

// In the frontend you then calculate:
// Average price: Avg(Price)
// Total sold: Sum(Quantity)
```

**Why does this matter?** If you put aggregated measures in a dimension, they are repeated for every associated fact row, so any Sum over them counts the same value multiple times and your totals come out wrong.

**Mistake 2: Too many fields in facts**

**Symptom:** Synthetic keys ($Syn tables) in the data model, performance issues.

**Wrong:**

```
Facts_Sales:
LOAD
    OrderID,
    CustomerID,
    CustomerName,         // ← Redundant! Belongs in dimension
    Region,               // ← Redundant! Belongs in dimension
    ProductID,
    ProductName,          // ← Redundant! Belongs in dimension
    Amount
FROM Sales;
```

**Correct:**

```
Facts_Sales:
LOAD
    OrderID,
    CustomerID,           // Only the ID
    ProductID,            // Only the ID
    Amount
FROM Sales;

// Names and attributes come from the dimensions
```

**Why does this matter?** If CustomerName appears in both the fact table and the customer dimension, Qlik creates a synthetic key on the shared fields. This wastes memory and can produce unexpected associations.

**Mistake 3: Wrong granularity**

**Symptom:** You cannot answer all business questions or you have exploding data volumes.

**Problem:** You load data at order level but need product details:

```
// Too coarse!
Facts_Orders:
LOAD
    OrderID,
    OrderTotalAmount      // Only total, no product info
FROM Orders;

// Question: "Which products were sold the most?"
// Answer: Impossible to answer!
```

**Solution:** Load at the most detailed level needed:

```
// Properly detailed
Facts_OrderLines:
LOAD
    OrderLineID,
    OrderID,
    ProductID,            // Now you can analyze by product
    LineAmount,
    LineQuantity
FROM OrderLines;
```

# How do you combine different granularities in advanced patterns?

Sometimes you have data at different detail levels. Example: Budget is at region level, sales at customer level.

**The challenge:** How do you compare budget (per region) with sales (per customer)?

# How do you use Generic Keys as the solution?

```
// Sales at customer level
Facts_Sales:
LOAD
    OrderID,
    CustomerID & '|Customer' as SalesKey,  // Generic key with type identifier
    ProductID,
    Date(OrderDate) as OrderDate,
    Amount,
    Quantity,
    0 as IsBudget                          // Flag: actual sales
FROM Sales;

// Budget at region level
CONCATENATE(Facts_Sales)
LOAD
    BudgetID,
    Region & '|Region' as SalesKey,        // Generic key with type identifier
    ProductID,
    Date(BudgetMonth) as OrderDate,
    BudgetAmount as Amount,
    Null() as Quantity,                    // No quantity for budget
    1 as IsBudget                          // Flag to differentiate
FROM Budget;
```

**Explanation:** Appending '|Customer' or '|Region' makes the key values type-specific, so customer keys and region keys cannot collide in the shared SalesKey field. This allows both granularities to coexist in one fact table.

```
// Extended customer dimension
Dim_Customer:
LOAD
    CustomerID & '|Customer' as SalesKey,  // Matching the fact
    CustomerID,
    CustomerName,
    Region,
    Country
FROM Customers;

// Add region level
CONCATENATE(Dim_Customer)
LOAD DISTINCT
    Region & '|Region' as SalesKey,        // Matching the budget fact
    Region,
    Country
RESIDENT Dim_Customer;
```

**Explanation:** The dimension now contains both customer entries (detailed) and region entries (aggregated). Both fact granularities can work with it.

**Result:** You can now compare sales and budget even though they have different granularities. Qlik aggregates correctly automatically!
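In the frontend you can then separate actuals from budget with set analysis. A minimal sketch, assuming the fact table carries an IsBudget flag with 0 on sales rows and 1 on budget rows (if sales rows had no flag at all, they would be null and the first expression would return nothing):

```
// Actual sales only
Sum({<IsBudget = {0}>} Amount)

// Budget only
Sum({<IsBudget = {1}>} Amount)

// Variance: actual minus budget
Sum({<IsBudget = {0}>} Amount) - Sum({<IsBudget = {1}>} Amount)
```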

# How do your design decisions affect performance?

Your design decisions have a massive impact on performance. For a deeper dive, see [Star schema design patterns](https://en.wikipedia.org/wiki/Star_schema): the foundational concepts apply directly to Qlik.

| Design | Load Time (1M rows) | Memory | Expression Speed |
|---|---|---|---|
| **Star Schema** (Facts + Dims) | 15-25 sec | ~150-200 MB | 0.5-2 sec |
| **Denormalized** (one table) | 8-12 sec | ~400-600 MB | 0.3-1 sec |
| **Snowflake** (over-normalized) | 25-40 sec | ~180-250 MB | 1-4 sec |

**Recommendation:** Star Schema is the best compromise – acceptable load time, optimal memory, good performance. For more details see [Star Schema in Qlik](10-qlik-star-schema). For additional guidance, refer to the official [Qlik data modeling best practices](https://help.qlik.com/en-US/cloud-services/Subsystems/Hub/Content/Sense_Hub/Scripting/data-modeling.htm).

# Which naming conventions improve clarity in Fact vs Dimension?

Clear names help you and your team immediately recognize what is what:

```
// Always prefix facts with "Facts_"
Facts_Sales:
Facts_Inventory:
Facts_Budget:

// Always prefix dimensions with "Dim_"
Dim_Customer:
Dim_Product:
Dim_Calendar:
```

**Benefits:**

* In the data model viewer you can see at a glance which tables are facts and which are dimensions

* Faster navigation when debugging

* New team members understand the structure instantly

# Checklist: Is your design optimal?

Check your data model with these questions:

**Facts:**

* [ ] Contains only keys, measures, and timestamps?

* [ ] No redundant attributes (names, descriptions)?

* [ ] Granularity as detailed as needed for analyses?

* [ ] Measures are numeric and aggregatable?

**Dimensions:**

* [ ] Contains only descriptive attributes?

* [ ] No measures (Sum, Avg, Count)?

* [ ] Primary key is unique (no duplicates)?

* [ ] Hierarchies are denormalized (Region + Country + Continent in one table)?

**Associations:**

* [ ] Facts and dims connect via IDs?

* [ ] No synthetic keys?

* [ ] No circular references?
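One item above, the unique primary key, can be sanity-checked directly in the load script. A minimal sketch, assuming Dim_Customer with key field CustomerID is already loaded:

```
// Sanity check: is CustomerID unique in Dim_Customer?
// FieldValueCount counts distinct values of the field,
// NoOfRows counts the rows of the table
Let vKeys = FieldValueCount('CustomerID');
Let vRows = NoOfRows('Dim_Customer');
If vRows > vKeys Then
    Trace WARNING: Dim_Customer contains duplicate CustomerID values!;
End If
```

The check uses `>` rather than `<>` because FieldValueCount spans the whole model: the fact table may contribute CustomerID values that are missing from the dimension, making the distinct count larger than the dimension's row count without indicating duplicates.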

# What are the next steps for Fact vs Dimension design decisions?

You now have the foundations for clean fact/dimension designs. Next steps:

**1. Many-to-many relationships:** When a customer has multiple addresses or a product has multiple categories – how do you model that? [Link Tables for complex scenarios](https://klarmetrics.com/12-qlik-link-tables/) shows you the solution.

**2. Historization:** A customer was previously “Bronze” and is now “Gold” – how do you keep the history? [Slowly Changing Dimensions](https://klarmetrics.com/13-qlik-scd-implementation/) explains how.

**3. Temporal data:** How do you model validity periods and point-in-time analyses? [Temporal Data & IntervalMatch](14-qlik-temporal-data) is the next step.

# What related topics are covered in the Fact vs Dimension course?

* [Star Schema in Qlik – Performance & Clarity](10-qlik-star-schema) – The foundation for facts and dimensions

* [Resolving Synthetic Keys & Circular References](09-qlik-data-modeling-problems) – Avoiding problems

* [Link Tables for Many-to-Many Relationships](12-qlik-link-tables) – Modeling complex relationships

**Previous:** [Star Schema](https://klarmetrics.com/10-qlik-star-schema/) | **Next:** [Link Tables](https://klarmetrics.com/12-qlik-link-tables/)

