Fact vs Dimension in Qlik Sense: Making the Right Design Decisions

KlarMetrics

October 6, 2025 · 8 min read

📚 Qlik Sense Course – Article 11 of 28

Previous Article: Star Schema in Qlik – Performance & Clarity
Next Article: Link Tables for Many-to-Many Relationships

What is the difference between Fact and Dimension? Facts contain transactional, measurable data (revenue, quantities) – they answer “What happened?”. Dimensions contain descriptive data (customer, product) – they answer “Who/What/Where/When?”.

What will you learn about Fact vs Dimension design decisions?

After this guide you will be able to:

  1. Decide whether data belongs in a fact or dimension table
  2. Define the right granularity for facts
  3. Design performance-optimal data models

Time investment: 15 min reading + 2 hrs hands-on
Prerequisite: Knowledge of Star Schema
Quick win: In 10 minutes you will know how to correctly classify any table

The decision: Fact or Dimension?

The problem: You have loaded a table with revenue data and you are unsure: Should you keep all fields in one table or split them into facts and dimensions?

The solution: Use a simple decision tree: Is the data transactional and does it change frequently? → Fact. Does it describe entities and is it relatively stable? → Dimension.

How do you apply the decision tree in practice?

Ask yourself these three questions:

1. Is the data transactional?

  • Yes → Probably a fact table
  • No → Continue to question 2

2. Does the data contain numeric values that get aggregated?

  • Yes (Sum, Avg, Count) → Fact table
  • No → Dimension table

3. How many rows does the table have?

  • Millions of rows → Typically a fact table
  • Thousands to hundreds of thousands → Typically a dimension table

How do you classify revenue data in an example?

// Original table: Sales.csv
// OrderID, CustomerID, CustomerName, Region, ProductID, ProductName,
// Category, Amount, Quantity, OrderDate

// What belongs where?

Analysis:

  • OrderID: Unique key → Fact (Primary Key)
  • CustomerID: Reference → Fact (Foreign Key to dimension)
  • CustomerName, Region: Descriptive → Dimension
  • ProductID: Reference → Fact (Foreign Key to dimension)
  • ProductName, Category: Descriptive → Dimension
  • Amount, Quantity: Measures for aggregation → Fact
  • OrderDate: Transaction timestamp → Fact

How do you choose the right structure for Fact and Dimension?

// Fact table: Only transactions + keys + measures
Facts_Sales:
LOAD
    OrderID,              // Primary Key
    CustomerID,           // Foreign Key
    ProductID,            // Foreign Key
    Date(OrderDate) as OrderDate,  // Foreign Key to Date dimension
    Amount,               // Measure
    Quantity              // Measure
FROM [Sales.csv]
(txt, utf8, embedded labels, delimiter is ',');

Explanation: The fact table contains only the skeleton of the transaction – keys and numbers. No descriptive text.

// Dimension: Customer attributes
Dim_Customer:
LOAD DISTINCT
    CustomerID,           // Primary Key
    CustomerName,
    Region,
    Country,
    City
FROM [Sales.csv]
(txt, utf8, embedded labels, delimiter is ',');

Explanation: The dimension contains all information ABOUT the customer. Once per customer, not repeated with every order.

// Dimension: Product attributes
Dim_Product:
LOAD DISTINCT
    ProductID,            // Primary Key
    ProductName,
    Category,
    SubCategory,
    Brand
FROM [Sales.csv]
(txt, utf8, embedded labels, delimiter is ',');

Explanation: Same here: All information ABOUT the product, once per product.

Performance tip: This split can save 40-60% of memory in typical models. Strings like “Samsung Galaxy” and “Electronics” are no longer repeated millions of times but stored only once in the dimension.
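If memory is still a concern after the split, the key fields themselves can be compacted. A sketch using Qlik's AUTONUMBER statement, which replaces string keys with sequential integers (note: the statement must come after the last LOAD that uses these fields, and is available in recent Qlik versions):

// Compact the string keys once all tables are loaded.
// Must be placed AFTER the last LOAD referencing these fields.
AUTONUMBER CustomerID, ProductID;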

How do you understand and choose the right granularity?

Granularity determines the “detail level” of your fact table. This is one of the most important design decisions!

Rule of thumb: Load data as detailed as possible (atomic level). You can always aggregate later – but you cannot reconstruct details from aggregated data!

What are the granularity levels in the Fact vs Dimension example?

Option 1: Order level

// One record = one order
Facts_Orders:
LOAD
    OrderID,              // Key
    CustomerID,
    OrderDate,
    OrderTotalAmount,     // Sum of all line items
    OrderItemCount        // Number of items
FROM Orders;

What you can analyze: Revenue per order, average order size
What you CANNOT do: Which products were purchased in this order?

Option 2: Order line level

// One record = one line item in an order
Facts_OrderLines:
LOAD
    OrderLineID,          // Unique key
    OrderID,              // Order reference
    ProductID,            // Which product?
    CustomerID,
    OrderDate,
    LineAmount,           // Amount for this line
    LineQuantity          // Quantity for this line
FROM OrderLines;

What you can analyze: Everything from Option 1 (through aggregation) PLUS product analyses, cross-selling, basket analyses
Trade-off: More rows = more memory (but maximum flexibility)

Option 3: Daily aggregate (pre-aggregated)

// One record = one product per day
Facts_DailySales:
LOAD
    ProductID,
    Date,
    Sum(Amount) as DailyAmount,
    Sum(Quantity) as DailyQuantity
FROM OrderLines
GROUP BY ProductID, Date;

What you can analyze: Daily revenue, trends
What you CANNOT do: Customer analyses, individual orders

Best practice: Start with the most detailed level (order line). If performance issues arise, you can later add an aggregated table in addition to – not instead of – the detailed one.
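Such an additional aggregate table can be sketched as follows, assuming the detailed Facts_OrderLines from Option 2 is already loaded (field names as in that example):

// Additional aggregate alongside the detail table (the detail stays in the model).
// Sharing both ProductID and OrderDate would create a synthetic key,
// so the date field is renamed and only ProductID links to the model.
Facts_DailySales:
LOAD
    ProductID,
    OrderDate as AggDate,                  // renamed to avoid a synthetic key
    Sum(LineAmount) as DailyAmount,
    Sum(LineQuantity) as DailyQuantity
RESIDENT Facts_OrderLines
GROUP BY ProductID, OrderDate;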

How do you avoid common mistakes in Fact vs Dimension design?

Mistake 1: Measures in dimensions

Symptom: Your totals don’t add up, numbers are counted multiple times.

Wrong:

Dim_Product:
LOAD
    ProductID,
    ProductName,
    Category,
    AveragePrice,         // ← WRONG! This is a measure
    TotalSold             // ← WRONG! This is a measure
FROM Products;

Correct:

Dim_Product:
LOAD
    ProductID,
    ProductName,
    Category,
    StandardPrice         // ← OK: A fixed attribute value, not an aggregate
FROM Products;

// In the frontend you then calculate:
// Average price: Avg(Price)
// Total sold: Sum(Quantity)

Why does this matter? If you store aggregated measures in a dimension, they are repeated across all associated fact rows and get counted multiple times, so your totals come out wrong.

Mistake 2: Too many fields in facts

Symptom: Synthetic keys ($Syn tables) in the data model, performance issues.

Wrong:

Facts_Sales:
LOAD
    OrderID,
    CustomerID,
    CustomerName,         // ← Redundant! Belongs in dimension
    Region,               // ← Redundant! Belongs in dimension
    ProductID,
    ProductName,          // ← Redundant! Belongs in dimension
    Amount
FROM Sales;

Correct:

Facts_Sales:
LOAD
    OrderID,
    CustomerID,           // Only the ID
    ProductID,            // Only the ID
    Amount
FROM Sales;

// Names and attributes come from the dimensions

Why does this matter? If CustomerName appears in both the fact table and the customer dimension, Qlik creates a synthetic key on the shared fields. Synthetic keys inflate memory and can produce unexpected associations.
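If a redundant attribute has already slipped into the fact load, you do not have to rewrite the whole script – a sketch using Qlik's DROP FIELD statement:

// Remove the redundant attribute from the fact table after loading;
// CustomerName then comes only from Dim_Customer.
DROP FIELD CustomerName FROM Facts_Sales;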

Mistake 3: Wrong granularity

Symptom: You cannot answer all business questions or you have exploding data volumes.

Problem: You load data at order level but need product details:

// Too coarse!
Facts_Orders:
LOAD
    OrderID,
    OrderTotalAmount      // Only total, no product info
FROM Orders;

// Question: "Which products were sold the most?"
// Answer: Impossible to answer!

Solution: Load at the most detailed level needed:

// Properly detailed
Facts_OrderLines:
LOAD
    OrderLineID,
    OrderID,
    ProductID,            // Now you can analyze by product
    LineAmount,
    LineQuantity
FROM OrderLines;

How do you combine different granularities in advanced patterns?

Sometimes you have data at different detail levels. Example: Budget is at region level, sales at customer level.

The challenge: How do you compare budget (per region) with sales (per customer)?

How do you use Generic Keys as the solution?

// Sales at customer level
Facts_Sales:
LOAD
    OrderID,
    CustomerID & '|Customer' as SalesKey,  // Generic key with type identifier
    ProductID,
    Date(OrderDate) as OrderDate,
    Amount,
    Quantity
FROM Sales;

// Budget at region level
CONCATENATE(Facts_Sales)
LOAD
    BudgetID,
    Region & '|Region' as SalesKey,        // Generic key with type identifier
    ProductID,
    Date(BudgetMonth) as OrderDate,
    BudgetAmount as Amount,
    Null() as Quantity,                    // No quantity for budget
    1 as IsBudget                          // Flag to differentiate
FROM Budget;

Explanation: By appending ‘|Customer’ or ‘|Region’, you make the keys unique. This allows both granularities to coexist.

// Extended customer dimension
Dim_Customer:
LOAD
    CustomerID & '|Customer' as SalesKey,  // Matching the fact
    CustomerID,
    CustomerName,
    Region,
    Country
FROM Customers;

// Add region level
CONCATENATE(Dim_Customer)
LOAD DISTINCT
    Region & '|Region' as SalesKey,        // Matching the budget fact
    Region,
    Country
RESIDENT Dim_Customer;

Explanation: The dimension now contains both customer entries (detailed) and region entries (aggregated). Both fact granularities can work with it.

Result: You can now compare sales and budget even though they have different granularities. Qlik aggregates correctly automatically!
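In the frontend, the IsBudget flag from the budget load can then separate the two measure streams. A sketch of possible chart expressions (set analysis; the exclusion form also handles sales rows where IsBudget is null):

// Actual revenue: exclude budget rows
Sum({<IsBudget -= {1}>} Amount)

// Budget: only budget rows
Sum({<IsBudget = {1}>} Amount)

// Variance: actual vs. budget
Sum({<IsBudget -= {1}>} Amount) - Sum({<IsBudget = {1}>} Amount)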

How do your design decisions affect performance?

Your design decisions have a massive impact on performance. The foundational star schema patterns from classic dimensional modeling apply directly to Qlik's fact/dimension models.

Design                       | Load Time (1M rows) | Memory      | Expression Speed
Star Schema (Facts + Dims)   | 15-25 sec           | ~150-200 MB | 0.5-2 sec
Denormalized (one table)     | 8-12 sec            | ~400-600 MB | 0.3-1 sec
Snowflake (over-normalized)  | 25-40 sec           | ~180-250 MB | 1-4 sec

Recommendation: Star Schema is the best compromise – acceptable load time, optimal memory, good performance. For more details see Star Schema in Qlik. For additional guidance, refer to the official Qlik data modeling best practices.

Which naming conventions improve clarity in Fact vs Dimension?

Clear names help you and your team immediately recognize what is what:

// Always prefix facts with "Facts_"
Facts_Sales:
Facts_Inventory:
Facts_Budget:

// Always prefix dimensions with "Dim_"
Dim_Customer:
Dim_Product:
Dim_Calendar:

Benefits:

  • In the data model viewer you immediately see which tables are facts and which are dimensions
  • Faster navigation when debugging
  • New team members understand the structure instantly
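Existing tables can be brought in line with this convention without touching their load statements – a sketch using Qlik's RENAME TABLE statement (the old table names here are illustrative):

// Retrofit the naming convention after loading
RENAME TABLE Sales TO Facts_Sales;
RENAME TABLE Customers TO Dim_Customer;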

Checklist: Is your design optimal?

Check your data model with these questions:

Facts:

  • [ ] Contains only keys, measures, and timestamps?
  • [ ] No redundant attributes (names, descriptions)?
  • [ ] Granularity as detailed as needed for analyses?
  • [ ] Measures are numeric and aggregatable?

Dimensions:

  • [ ] Contains only descriptive attributes?
  • [ ] No measures (Sum, Avg, Count)?
  • [ ] Primary key is unique (no duplicates)?
  • [ ] Hierarchies are denormalized (Region + Country + Continent in one table)?
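The key-uniqueness check from the list above can be automated in the script. A sketch that compares row count to distinct key count (table and field names as in the examples above; the variable and check table are illustrative):

// Sanity check: is CustomerID unique in Dim_Customer?
KeyCheck:
LOAD
    Count(CustomerID) as RowCount,
    Count(DISTINCT CustomerID) as KeyCount
RESIDENT Dim_Customer;

// If RowCount <> KeyCount, the dimension has duplicate keys.
LET vDuplicates = Peek('RowCount', 0, 'KeyCheck') - Peek('KeyCount', 0, 'KeyCheck');
TRACE Duplicate keys in Dim_Customer: $(vDuplicates);
DROP TABLE KeyCheck;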

Associations:

  • [ ] No synthetic keys ($Syn tables) in the data model viewer?
  • [ ] Each key field connects exactly one fact table to one dimension?

What are the next steps for Fact vs Dimension design decisions?

You now have the foundations for clean fact/dimension designs. Next steps:

1. Many-to-many relationships: When a customer has multiple addresses or a product has multiple categories – how do you model that? Link Tables for complex scenarios shows you the solution.

2. Historization: A customer was previously “Bronze” and is now “Gold” – how do you keep the history? Slowly Changing Dimensions explains how.

3. Temporal data: How do you model validity periods and point-in-time analyses? Temporal Data & IntervalMatch is the next step.

