Slowly Changing Dimensions Type 2 in Qlik Sense [Guide]

Q: How do dimensions change in Slowly Changing Dimensions Type 2?

The problem: Your customer "Acme Corp" was in the "North" region from 2020-2022. Starting July 2022, they moved to "South." When you analyze sales from 2021 – which region should be displayed? The current one (South, wrong!) or the one at the time (North, correct!)? Without SCD2 (default): Customer: CustomerID | Name | Region CUST001 | Acme Corp | South ← Current, overwrites old region Sales 2021 → Region "South" (WRONG! Was "North" at the time) The solution: SCD Type 2

Q: What is a Surrogate Key (Technical Key) in Qlik?

CustomerSurrogateKey: 1, 2, 3, 4, ... Explanation: Unique key per VERSION. Auto-increment. This key is referenced in fact tables!

Q: How do you implement the Business Key in SCD Type 2?

CustomerID: CUST001, CUST001, CUST002, ... Explanation: The "real" customer ID from the source system. Stays the same across all versions.

Q: What are attributes that can change in SCD Type 2?

CustomerName, Region, Segment, ... Explanation: The fields that can change and need to be historized.

Q: What are the Effective Dates in SCD Type 2?

EffectiveStartDate: 2022-07-01 EffectiveEndDate: NULL (= still current) or 2023-12-31 (= superseded) Explanation: When was this version valid? NULL for EndDate means: still currently valid.

Q: What is the IsCurrent Flag in SCD Type 2?

IsCurrent: 1 = current, 0 = historical Explanation: Quick access to the current version without date comparison. Important: Facts must reference the Surrogate Key, not the Business Key! This is the only way point-in-time works correctly.

Q: How do you create an SCD2 Initial Load in Qlik?

The first time, you load all customers as version 1: // Load source data Temp_Customers: LOAD CustomerID, CustomerName, Region, Segment, Date(ModifiedDate) as SourceModifiedDate FROM [DataCustomers.xlsx] (ooxml, embedded labels); Explanation: Standard load of source data. ModifiedDate is the date of the last change in the source system. // Build SCD2 structure Dim_Customer_SCD2: LOAD RowNo() as CustomerSurrogateKey, // 1, 2, 3, ... CustomerID,

Q: How do you load the old SCD2 and new source data in Qlik?

// 1. Load old SCD2 dimension Dim_Customer_SCD2_Old: LOAD * FROM [QVDDim_Customer_SCD2.qvd] (qvd); // 2. Load new source data Temp_Customers_New: LOAD CustomerID, CustomerName, Region, Segment, Date(ModifiedDate) as SourceModifiedDate, Hash128(CustomerName, Region, Segment) as AttributeHash // Same hash! FROM [DataCustomers.xlsx] (ooxml, embedded labels); Explanation: We load both the old SCD2 and the new source data. The hash is calculated again – same formula a

This is Article 13 of the Qlik Sense Data Modeling Course.

📚 Qlik Sense Course – Article 13 of 28

← Previous Article: Link Tables for Many-to-Many Relationships
→ Next Article: Temporal Data & IntervalMatch

What is SCD Type 2? A method for historizing dimension changes. When a customer switches regions, the old version is kept and a new one is created – keeping historical analyses accurate! The concept of Slowly Changing Dimensions on Wikipedia provides the theoretical foundation for this approach.

What will you learn about Slowly Changing Dimensions Type 2 in Qlik?

After this guide you will be able to:

Implement SCD Type 2 dimensions in Qlik
Perform point-in-time analyses (link sales to the region that was current at the time of the sale)
Track and analyze historical changes

Time investment: 30 min reading + 4 hrs hands-on
Prerequisite: Knowledge of Fact vs Dimension Design
Quick win: In 20 minutes you will understand the concept and be able to create your first SCD2 dimensions

How do dimensions change in Slowly Changing Dimensions Type 2?

The problem: Your customer “Acme Corp” was in the “North” region from 2020-2022. Starting July 2022, they moved to “South.” When you analyze sales from 2021 – which region should be displayed? The current one (South, wrong!) or the one at the time (North, correct!)?

Without SCD2 (default):

Customer:
CustomerID | Name       | Region
CUST001    | Acme Corp  | South      ← Current, overwrites old region

Sales 2021 → Region "South" (WRONG! Was "North" at the time)

The solution: SCD Type 2 Create a new version of the record, keep the old one. This way you can assign historical sales to the correct (original!) region.

With SCD2:

Customer_SCD2:
SurrogateKey | CustomerID | Name       | Region | StartDate  | EndDate    | IsCurrent
1            | CUST001    | Acme Corp  | North  | 2020-01-01 | 2022-06-30 | 0
2            | CUST001    | Acme Corp  | South  | 2022-07-01 | NULL       | 1

Sales 2021 → SurrogateKey 1 → Region "North" (CORRECT!)
Sales 2023 → SurrogateKey 2 → Region "South" (CORRECT!)

How do you understand the structure of SCD Type 2?

An SCD2 dimension has these mandatory fields:

What is a Surrogate Key (Technical Key) in Qlik?

CustomerSurrogateKey: 1, 2, 3, 4, ...

Explanation: Unique key per VERSION. Auto-increment. This key is referenced in fact tables!

How do you implement the Business Key in SCD Type 2?

CustomerID: CUST001, CUST001, CUST002, ...

Explanation: The “real” customer ID from the source system. Stays the same across all versions.

What are attributes that can change in SCD Type 2?

CustomerName, Region, Segment, ...

Explanation: The fields that can change and need to be historized.

What are the Effective Dates in SCD Type 2?

EffectiveStartDate: 2022-07-01
EffectiveEndDate: NULL (= still current) or 2023-12-31 (= superseded)

Explanation: When was this version valid? NULL for EndDate means: still currently valid.

What is the IsCurrent Flag in SCD Type 2?

IsCurrent: 1 = current, 0 = historical

Explanation: Quick access to the current version without date comparison.

Important: Facts must reference the Surrogate Key, not the Business Key! This is the only way point-in-time works correctly.

How do you create an SCD2 Initial Load in Qlik?

The first time, you load all customers as version 1:

// Load source data
Temp_Customers:
LOAD
    CustomerID,
    CustomerName,
    Region,
    Segment,
    Date(ModifiedDate) as SourceModifiedDate
FROM [DataCustomers.xlsx]
(ooxml, embedded labels);

Explanation: Standard load of source data. ModifiedDate is the date of the last change in the source system.

// Build SCD2 structure
Dim_Customer_SCD2:
LOAD
    RowNo() as CustomerSurrogateKey,              // 1, 2, 3, ...
    CustomerID,                                    // Business Key
    CustomerName,
    Region,
    Segment,
    Date(SourceModifiedDate) as EffectiveStartDate,
    Date(Null()) as EffectiveEndDate,             // NULL = current
    1 as IsCurrent,                               // All initially current
    1 as VersionNumber,                           // First version
    Hash128(CustomerName, Region, Segment) as AttributeHash  // For later use
RESIDENT Temp_Customers;

Explanation of fields:

RowNo(): Automatically creates 1, 2, 3, … as Surrogate Key
EffectiveEndDate = Null(): No previous version yet, so all are current
IsCurrent = 1: All are initially the current version
AttributeHash: Combines all attributes into a hash – used later for change detection!

DROP TABLE Temp_Customers;

// Save SCD2
STORE Dim_Customer_SCD2 INTO [QVDDim_Customer_SCD2.qvd] (qvd);

Explanation: The SCD2 dimension is saved as a QVD. On the next reload, it will be updated incrementally.

How do you update SCD2 with an Incremental Update in Qlik?

On the next reload, you detect changes and create new versions:

How do you load the old SCD2 and new source data in Qlik?

// 1. Load old SCD2 dimension
Dim_Customer_SCD2_Old:
LOAD * FROM [QVDDim_Customer_SCD2.qvd] (qvd);

// 2. Load new source data
Temp_Customers_New:
LOAD
    CustomerID,
    CustomerName,
    Region,
    Segment,
    Date(ModifiedDate) as SourceModifiedDate,
    Hash128(CustomerName, Region, Segment) as AttributeHash  // Same hash!
FROM [DataCustomers.xlsx]
(ooxml, embedded labels);

Explanation: We load both the old SCD2 and the new source data. The hash is calculated again – same formula as in the initial load!

How do you identify changes with hash comparison in Qlik?

// 3. Find changes through hash comparison
Temp_Changes:
LOAD
    n.CustomerID,
    n.CustomerName,
    n.Region,
    n.Segment,
    n.SourceModifiedDate,
    n.AttributeHash as NewHash,
    o.AttributeHash as OldHash,
    o.CustomerSurrogateKey as OldSurrogateKey,
    o.VersionNumber as OldVersionNumber
FROM Temp_Customers_New n
LEFT JOIN (Dim_Customer_SCD2_Old o)
ON n.CustomerID = o.CustomerID
WHERE o.IsCurrent = 1;                  // Only compare current versions

Explanation: We join new data (n) with old data (o) via CustomerID. But we only care about the current versions (IsCurrent = 1)!

// 4. Only truly changed records
Temp_Changed:
LOAD *
RESIDENT Temp_Changes
WHERE NewHash <> OldHash         // Different hash = change!
   OR IsNull(OldHash);           // Or new customer (no hash yet)

Explanation: The trick: If NewHash ≠ OldHash, at least one attribute has changed. If OldHash is NULL, it’s a new customer.

Performance tip: Hash comparison is 2.4x faster than field-by-field comparison (5 sec vs 12 sec for 50k records)!

How do you close old versions in Step 3?

// 5. Close old versions (set EndDate, IsCurrent = 0)
Dim_Customer_SCD2_Updated:
LOAD
    CustomerSurrogateKey,
    CustomerID,
    CustomerName,
    Region,
    Segment,
    EffectiveStartDate,
    Date(Today()-1) as EffectiveEndDate,    // Closed yesterday
    0 as IsCurrent,                         // No longer current
    VersionNumber,
    AttributeHash
RESIDENT Dim_Customer_SCD2_Old
WHERE EXISTS(CustomerID, CustomerID)        // Only for changed customers
  AND IsCurrent = 1;                        // That are still current

Explanation: For all changed customers, we set the old version to IsCurrent=0 and EffectiveEndDate=yesterday. This makes it historical but not deleted!

How do you add new versions in SCD Type 2?

// 6. Create new versions
LET vMaxSurrogateKey = Peek('CustomerSurrogateKey', -1, 'Dim_Customer_SCD2_Updated');

CONCATENATE(Dim_Customer_SCD2_Updated)
LOAD
    $(vMaxSurrogateKey) + RowNo() as CustomerSurrogateKey,  // New keys!
    CustomerID,
    CustomerName,
    Region,
    Segment,
    Date(Today()) as EffectiveStartDate,
    Date(Null()) as EffectiveEndDate,
    1 as IsCurrent,
    OldVersionNumber + 1 as VersionNumber,      // Version + 1
    NewHash as AttributeHash
RESIDENT Temp_Changed
WHERE NOT IsNull(OldHash);                      // Only changes, not new records

Explanation of details:

Peek(‘CustomerSurrogateKey’, -1, …): Gets the highest existing surrogate key
$(vMaxSurrogateKey) + RowNo(): New keys start after the highest existing one
OldVersionNumber + 1: Increment version (1 → 2 → 3 …)

Result: You now have both the old version (closed) and the new version (current) in the dimension!

How do you connect Facts with SCD2 for Point-in-Time Lookup?

The critical part: Facts must get the correct surrogate key for the correct point in time!

// Load facts
Temp_Sales:
LOAD
    OrderID,
    CustomerID,
    ProductID,
    Date(OrderDate) as OrderDate,
    Amount,
    Quantity
FROM [DataSales.xlsx]
(ooxml, embedded labels);

// Load SCD2 dimension
Dim_Customer_SCD2:
LOAD
    CustomerSurrogateKey,
    CustomerID,
    CustomerName,
    Region,
    EffectiveStartDate,
    EffectiveEndDate
FROM [QVDDim_Customer_SCD2.qvd] (qvd);

// Point-in-Time Join
Facts_Sales:
LOAD
    OrderID,
    CustomerID,
    ProductID,
    OrderDate,
    Amount,
    Quantity
RESIDENT Temp_Sales;

LEFT JOIN (Facts_Sales)
LOAD
    CustomerID,
    CustomerSurrogateKey,
    EffectiveStartDate,
    EffectiveEndDate
RESIDENT Dim_Customer_SCD2;

Explanation: The LEFT JOIN connects sales with ALL versions of the customer. Now we temporarily have too many rows!

// Final: Keep only the correct version
Facts_Sales_Final:
LOAD
    OrderID,
    CustomerSurrogateKey,      // Instead of CustomerID!
    ProductID,
    OrderDate,
    Amount,
    Quantity
RESIDENT Facts_Sales
WHERE OrderDate >= EffectiveStartDate
  AND (IsNull(EffectiveEndDate) OR OrderDate <= EffectiveEndDate);

Explanation of WHERE clause:

OrderDate >= EffectiveStartDate: Sale is after the version start
IsNull(EffectiveEndDate): Version is still current (no EndDate) OR
OrderDate <= EffectiveEndDate: Sale is before the version end

The result: Every sale now has the CustomerSurrogateKey of the version that was valid at the OrderDate. Point-in-time perfect!

To track date-based changes effectively in your SCD implementation, a well-designed master calendar for date tracking provides the date dimension you need for historical analysis.

How can you use IntervalMatch for Point-in-Time in Qlik?

A more elegant method for large dimensions is Qlik IntervalMatch documentation:

// Load facts
Facts_Sales:
LOAD
    OrderID,
    CustomerID,
    ProductID,
    Date(OrderDate) as OrderDate,
    Amount
FROM [DataSales.xlsx]
(ooxml, embedded labels);

// SCD2 Dimension
Dim_Customer_SCD2:
LOAD
    CustomerSurrogateKey,
    CustomerID,
    CustomerName,
    Region,
    EffectiveStartDate,
    EffectiveEndDate
FROM [QVDDim_Customer_SCD2.qvd] (qvd);

// IntervalMatch: OrderDate within validity period
Temp_Interval:
IntervalMatch(OrderDate, CustomerID)
LOAD DISTINCT
    EffectiveStartDate,
    If(IsNull(EffectiveEndDate), Date(Today()+1000), EffectiveEndDate) as EffectiveEndDate,
    CustomerID
RESIDENT Dim_Customer_SCD2;

Explanation of IntervalMatch:

Finds the matching time span for each OrderDate+CustomerID
If(IsNull(EffectiveEndDate), …) replaces NULL with a date far in the future
IntervalMatch automatically creates the association!

// Join for Surrogate Key
LEFT JOIN (Facts_Sales)
LOAD
    CustomerID,
    CustomerSurrogateKey,
    EffectiveStartDate,
    EffectiveEndDate
RESIDENT Dim_Customer_SCD2;

Explanation: Now we can retrieve the surrogate key. IntervalMatch has already found the correct version! For a deep dive into temporal data handling with IntervalMatch, see the next article in this series.

Advantage of IntervalMatch: More compact code, better performance for large dimensions. More details in Temporal Data & IntervalMatch.

How do you avoid common mistakes with SCD Type 2 in Qlik?

Mistake 1: Facts use Business Key instead of Surrogate Key

Symptom: Historical analyses always show current attributes.

Wrong:

Facts_Sales:
- OrderID
- CustomerID              // Business Key - always links to ALL versions!
- Amount

Correct:

Facts_Sales:
- OrderID
- CustomerSurrogateKey    // Surrogate Key - point-in-time correct!
- Amount

Why it matters: CustomerID connects to ALL versions. Qlik doesn’t know which one is correct. CustomerSurrogateKey is unique to one version!

Mistake 2: Multiple records with IsCurrent = 1

Symptom: A customer has multiple “current” versions.

Cause: During the update, the old version was not set to IsCurrent=0.

Diagnosis:

// Check: How many current records per customer?
CheckCurrent:
LOAD
    CustomerID,
    Count(*) as CurrentCount
RESIDENT Dim_Customer_SCD2
WHERE IsCurrent = 1
GROUP BY CustomerID
HAVING Count(*) > 1;

Solution: Make sure that when closing old versions, IsCurrent is explicitly set to 0 (see code above).

Mistake 3: Duplicate Surrogate Keys

Symptom: Surrogate key is not unique, joins don’t work.

Wrong:

// WRONG: RowNo() restarts at 1 for every LOAD!
CustomerSurrogateKey = RowNo()

Correct:

// CORRECT: Find highest existing key, then continue counting
LET vMaxSurrogateKey = Peek('CustomerSurrogateKey', -1, 'OldDimension');
CustomerSurrogateKey = $(vMaxSurrogateKey) + RowNo()

Why it matters: Surrogate keys must remain unique ACROSS ALL LOADS, not just within a single load!

What is the Best Practices Checklist for SCD Type 2 in Qlik?

Structure:

[ ] Surrogate Key (auto-increment, unique)
[ ] Business Key (CustomerID, stays the same across versions)
[ ] EffectiveStartDate & EffectiveEndDate (validity period)
[ ] IsCurrent Flag (1=current, 0=historical)
[ ] Optional: VersionNumber (for debugging)

Change Detection:

[ ] Use Hash128() for all tracked attributes
[ ] Hash comparison instead of field-by-field (2.4x faster)

Facts Integration:

[ ] Facts reference Surrogate Key, NOT Business Key
[ ] Point-in-time lookup with WHERE or IntervalMatch

Performance:

[ ] Separate current table for fast standard analyses
[ ] QVD architecture (Staging → Transform → Model)

For more SCD implementation patterns on Qlik Community, see the community discussion on best practices.

How do you implement SCD Type 2 for Sales Analysis with Region Change?

Scenario: Customer “Acme Corp” switches from region North to South on July 1, 2022.

SCD2 Dimension:
SurrogateKey | CustomerID | Name       | Region | StartDate  | EndDate    | IsCurrent
1            | CUST001    | Acme Corp  | North  | 2020-01-01 | 2022-06-30 | 0
2            | CUST001    | Acme Corp  | South  | 2022-07-01 | NULL       | 1

Facts:
OrderID | CustomerSurrogateKey | OrderDate  | Amount
ORD001  | 1                    | 2021-05-15 | $5,000   ← Version 1 (North)
ORD002  | 2                    | 2023-03-10 | $8,000   ← Version 2 (South)

Analysis: Sum(Amount) by Region

North: $5,000 (ORD001 with Surrogate Key 1)
South: $8,000 (ORD002 with Surrogate Key 2)

Perfect! Historical sales are assigned to the region that was valid at the time of the sale. This keeps trend analyses accurate!

What are the next steps for implementing SCD Type 2 in Qlik?

You can now implement SCD Type 2 dimensions! Next steps:

1. Deep dive into IntervalMatch: SCD2 often uses IntervalMatch for point-in-time. Temporal Data & IntervalMatch shows you all the details and more use cases.

2. Incremental Loading: SCD2 updates are part of a larger incremental loading strategy. Incremental Loading Patterns shows the big picture.

3. IterNo for iterative updates: More complex SCD2 updates can be optimized with LOAD WHILE. IterNo & WHILE Patterns explains how.

What related topics are covered in the SCD Type 2 course?

Fact vs Dimension – Design Decisions – Foundation for clean dimensions
Temporal Data & IntervalMatch – Point-in-time analyses
Incremental Loading Patterns – Efficient updates

Previous: Link Tables | Next: Temporal Data

What would help you most right now?

Thanks!