📚 Qlik Sense Course – Article 2 of 28
← Previous Article: Loading Data in Qlik – All Sources Explained
→ Next Article: JOINs vs KEEPs – When to Use Which
What Can You Learn About Qlik Load Strategies in 12 Minutes?
Once you’ve loaded raw data from external sources, the real work begins: transformation. This guide shows you how to work with data that’s already in Qlik’s memory and process it further.
- Use RESIDENT loads for step-by-step transformations to write clean, maintainable scripts.
- Use PRECEDING loads for compact, nested calculations, and understand their limitations.
- Choose the performance-optimal strategy for your data volume and complexity.
Time investment: 12 min reading + 1.5 hrs hands-on
Prerequisite: Data loading fundamentals understood
Quick win: Implement a robust, multi-stage RESIDENT load in 10 minutes
How Do I Start My First RESIDENT Load?
The problem: You’ve loaded raw sales data. The date formats are wrong, there are trailing spaces everywhere, and you need to derive new fields for analysis.
The solution: A multi-stage approach (staging) using RESIDENT. Think of it like a production line: each LOAD block is a station with a specific job.
- Station 1: Load the raw data from the file.
- Station 2: Create a cleaned version by reading from the raw table (RESIDENT).
- Station 3: Use the clean data as the source to calculate new fields (e.g. year, category).
- Clean up: Drop the intermediate tables you no longer need.
First, we load the raw data. This table is just the starting point.
RawSales:
LOAD *
FROM [lib://DataFiles/sales_raw.csv]
(txt, utf8, embedded labels, delimiter is ';');
Now the transformation begins. We create a new table CleanSales by reading from the already-loaded table RawSales. Here we fix formats and strip whitespace.
CleanSales:
LOAD
OrderID,
Trim(CustomerName) as CustomerName,
Date(Date#(OrderDate, 'DD.MM.YYYY')) as OrderDate,
Num#(Replace(Amount, ',', '.'), '#.##', '.', ',') as Amount // Num# interprets text as a number
RESIDENT RawSales;
In the final step, we read from our clean table CleanSales and add the final business logic and new analysis fields.
EnrichedSales:
LOAD
*,
Year(OrderDate) as OrderYear,
Month(OrderDate) as OrderMonth,
If(Amount > 1000, 'High', 'Standard') as OrderCategory
RESIDENT CleanSales;
To free up memory and keep the data model clean, we drop the intermediate tables we no longer need.
DROP TABLES RawSales, CleanSales;
Explanation:
- RESIDENT: The key keyword. It tells Qlik to load data not from an external file, but from a table already in memory.
- Staging approach: Each step is logically separate. This makes the script extremely readable, maintainable, and easy to debug.
- DROP TABLES: An essential statement to remove the "scaffolding" (your intermediate tables) once they've done their job.
Checkpoint: Does your data model now contain only the final table EnrichedSales with all the new fields? Perfect!
Which Load Strategy Is Right for Your Project?
Choosing the right in-memory load strategy is a trade-off between complexity, maintainability, and performance.
| Load Type | Complexity | Maintainability | Performance | Ideal for… |
|---|---|---|---|---|
| FROM File | Low | High | Fast | Directly importing raw data from an external source. |
| RESIDENT | Medium | Very high | Medium | Multi-stage ETL processes (Extract, Transform, Load). The gold standard for clean scripts. |
| PRECEDING | High | Low | Fast | Compact but hard-to-read nested calculations across a few levels. |
| INLINE | Low | High | Very fast | Small static tables, test data, or configuration parameters written directly in the script. |
What Are the Qlik Load Strategies: RESIDENT, PRECEDING, and INLINE?
What Are Multi-Step Transformations in Load Strategies?
RESIDENT is the preferred method for all non-trivial transformations: by splitting the process into logical stages, your code stays understandable even as complexity grows. For a deeper look, consult the RESIDENT load and INLINE load references in the official Qlik documentation.
- Stage 1 (Raw): Load the data "as is" from the source. No transformations.
- Stage 2 (Clean): Technical cleansing. Fix data types, standardize keys, handle nulls, and set quality flags.
- Stage 3 (Final/Enrich): Apply business logic. Calculate KPIs, perform segmentation, and create analysis fields.
This example shows a typical, robust ETL pipeline.
Raw_CustomerOrders:
LOAD * FROM [lib://DataFiles/customer_orders.csv]
(txt, utf8, embedded labels, delimiter is ';');
Clean_CustomerOrders:
LOAD
OrderID,
If(Len(CustomerID) = 0, 'UNKNOWN', CustomerID) as CustomerID,
Trim(Upper(CustomerName)) as CustomerName,
Date(Date#(OrderDate, 'DD.MM.YYYY')) as OrderDate,
If(IsNum(Amount), Num(Amount, '#.##'), 0) as Amount,
If(Amount > 0, 'Valid', 'Invalid') as QualityFlag
RESIDENT Raw_CustomerOrders
WHERE Not IsNull(OrderID);
Final_CustomerOrders:
LOAD
*,
Year(OrderDate) as OrderYear,
Month(OrderDate) as OrderMonth,
If(Amount > 1000, 'Premium', 'Standard') as CustomerSegment,
Amount * 0.19 as VATAmount
RESIDENT Clean_CustomerOrders
WHERE QualityFlag = 'Valid';
DROP TABLES Raw_CustomerOrders, Clean_CustomerOrders;
RESIDENT best practices:
- Clear staging levels: Name your tables with prefixes (Raw_, Clean_, Final_).
- Filter early: Exclude invalid data with a WHERE clause as early as possible to improve performance in downstream stages.
- Don't forget DROP: Only the final data model should remain at the end.
What Are PRECEDING Loads and How Do They Work?
How Does Bottom-Up Processing Work in Load Strategies?
A PRECEDING Load is a chain of LOAD statements without FROM or RESIDENT. Processing runs from the bottom up.
Factory analogy:
- Bottom LOAD (ground floor): Raw materials are loaded and a first calculation runs (e.g. total revenue).
- Middle LOAD (1st floor): The result from the ground floor is received and processed further (e.g. calculate a running total).
- Top LOAD (2nd floor): The final result is produced from the 1st floor output (e.g. calculate percentage share).
This example calculates a running total (RunningTotal) and its percentage share of total revenue. Two script-specific details matter here: chart functions such as Above() are not available in the load script, so the running total uses Peek(); and script aggregations require GROUP BY, so the grand total is computed up front and stored in a variable.
TempTotal:
LOAD Sum(Amount) as TotalSales
RESIDENT SourceOrders;

LET vTotalSales = Peek('TotalSales');
DROP TABLE TempTotal;

SalesAnalysis:
LOAD
OrderID, Amount, RunningTotal,
RunningTotal / $(vTotalSales) as PercentageOfTotal;
LOAD
OrderID, Amount,
RangeSum(Peek('RunningTotal'), Amount) as RunningTotal;
LOAD
OrderID, Amount
RESIDENT SourceOrders
ORDER BY OrderDate;
PRECEDING is powerful but dangerous: The code is compact, but extremely difficult to debug because you can’t see the intermediate results at each level. Use it sparingly and only when you know exactly what you’re doing.
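If a preceding chain misbehaves, one debugging tactic is to unroll its levels into separate RESIDENT stages, so every intermediate table can be inspected in the data model viewer. A sketch under the assumption of a SourceOrders table with OrderID, Amount, and OrderDate fields (the stage names are our own):

```
Step1_Ordered:
LOAD OrderID, Amount
RESIDENT SourceOrders
ORDER BY OrderDate;

Step2_Running:
LOAD OrderID, Amount,
RangeSum(Peek('RunningTotal'), Amount) as RunningTotal
RESIDENT Step1_Ordered;

// Inspect Step2_Running, then clean up once the logic is verified.
DROP TABLE Step1_Ordered;
```

Once each stage produces the expected values, the logic can be folded back into a compact preceding chain.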
What Are INLINE Loads and How Do I Use Them for Static Data and Testing?
With INLINE, you can write small tables directly into your script. It’s perfect for data that doesn’t exist in external files but is needed by the model.
A mapping table for assigning countries to regions.
RegionMapping:
MAPPING LOAD * INLINE [
Country, Region
Germany, DACH
Austria, DACH
USA, Americas
];
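A mapping table is consumed with ApplyMap() in a later load; the third argument is the fallback when no match is found. A minimal sketch (the orders source file is an assumption):

```
Orders:
LOAD
OrderID,
Country,
ApplyMap('RegionMapping', Country, 'Other') as Region
FROM [lib://DataFiles/orders.csv]
(txt, utf8, embedded labels, delimiter is ';');
```

Note that MAPPING tables are dropped automatically at the end of the script, so no DROP TABLE is needed for them.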
A configuration table to manage parameters in a single place.
Config:
LOAD * INLINE [
Key, Value
VAT_Rate, 0.19
Sales_Threshold, 1000
];
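The parameters can then be pulled into variables, for example with Lookup(), so the rest of the script references the variable instead of a hard-coded number. The variable names below are our own:

```
LET vVATRate = Lookup('Value', 'Key', 'VAT_Rate', 'Config');
LET vSalesThreshold = Lookup('Value', 'Key', 'Sales_Threshold', 'Config');

// Used later, e.g.: Amount * $(vVATRate) as VATAmount
```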
Test data to develop and validate your transformations quickly, without loading the full dataset.
TestOrders:
LOAD * INLINE [
OrderID, CustomerID, Amount
T001, C001, 1500.50
T002, C002, 750.25
];
How Do I Optimize Performance with Qlik Load Strategies?
How Does Memory Management Work with RESIDENT Loads?
Every RESIDENT table occupies additional RAM. With large datasets, it’s critical to drop tables as soon as you’re done with them.
Bad: All tables stay in memory until the end of the script. This can cause an "out of memory" error.
RawData: LOAD * FROM source.csv;
CleanData: LOAD * RESIDENT RawData;
EnrichedData: LOAD * RESIDENT CleanData;
Better: Each intermediate table is dropped immediately after use. Memory is freed up and available for the next step.
RawData: LOAD * FROM source.csv;
CleanData: LOAD * RESIDENT RawData;
DROP TABLE RawData;
EnrichedData: LOAD * RESIDENT CleanData;
DROP TABLE CleanData;
How Does a QVD-Based Staging Pipeline Work for Large Datasets?
For very large datasets (>10 million rows), a purely memory-based RESIDENT approach carries risk. The enterprise solution is to write the output of each stage to a fast QVD file. The next stage then reads from that QVD.
Advantages:
- Minimal memory footprint: Only one table is in memory at a time.
- Restartable: If the script fails at stage 3, you can restart from stage 3 without re-running the first two stages.
- Transparency: You can inspect the QVD intermediate results for troubleshooting.
RawData:
LOAD * FROM [lib://DataFiles/source.csv]
(txt, utf8, embedded labels, delimiter is ';');

STORE RawData INTO [lib://Staging/01_Raw.qvd] (qvd);
DROP TABLE RawData;

CleanData:
LOAD * // cleansing transformations go here
FROM [lib://Staging/01_Raw.qvd] (qvd);

STORE CleanData INTO [lib://Staging/02_Clean.qvd] (qvd);
DROP TABLE CleanData;

FinalData:
LOAD * // business logic goes here
FROM [lib://Staging/02_Clean.qvd] (qvd);
How Do I Fix Problems with RESIDENT, PRECEDING, or INLINE Loads?
What Does the "Table Not Found" Error Mean in a RESIDENT Load?
Problem: The most common error. It occurs when you try to load from a table that doesn’t (or no longer) exist.
Causes:
- A typo in the table name.
- You already dropped the table with DROP TABLE earlier in the script.
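A defensive pattern against the second cause is a small subroutine that drops a table only if it still exists, using TableNumber(), which returns NULL for unknown tables. A sketch; the subroutine name is our own:

```
SUB DropIfExists(pTable)
// TableNumber() returns NULL when the table is not in memory
IF Not IsNull(TableNumber('$(pTable)')) THEN
DROP TABLE [$(pTable)];
END IF
END SUB

CALL DropIfExists('RawSales'); // safe even if RawSales is already gone
```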
Use TRACE to check which tables are in memory at the point of the error.
FOR i = 0 TO NoOfTables() - 1
LET vTable = TableName($(i)); // functions are not expanded inside TRACE; assign to a variable first
TRACE Table $(i): $(vTable);
NEXT i
What Are the Next Steps in the Course After Load Strategies?
Next up: JOINs vs KEEPs – When to Use Which