LEARNING PATHS & COURSES

Qlik Sense JOINs vs KEEPs: When to Use Which Strategy

KlarMetrics

October 5, 2025 ยท 7 min read

This is Article 3 of the Qlik Sense Data Modeling Course.

๐Ÿ“š Qlik Sense Course – Article 3 of 28

โ† Previous Article: Load Strategies – RESIDENT vs PRECEDING vs INLINE
โ†’ Next Article: Mapping Tables vs JOINs – Performance Optimization

What can you learn about JOINs vs KEEPs in 10 minutes?

Table linking is at the heart of every data model. Qlik offers several techniques that go far beyond the classic SQL JOIN. This guide shows you which strategy is best for your goal.

  1. Choose the right linking strategy for your data model (JOIN, KEEP, or CONCATENATE).
  2. Understand performance differences: Why a KEEP can be up to 5x faster than a JOIN.
  3. Implement the WHERE EXISTS() pattern as a highly efficient filtering alternative.

Time investment: 10 min reading + 1 hr hands-on
Prerequisite: Load Strategies understood
Quick win: Implement a performant KEEP for filtering in 5 minutes


How do you filter tables quickly without merging them?

The problem: You have a large fact table (Orders) and a smaller dimension table (Customers). You want to load only orders from active customers into your data model, but keep both tables separate for flexible analysis.

The solution with KEEP: Instead of merging the tables into a single, large table with a JOIN, we use KEEP to filter one table by the other.

Analogy: Think of JOIN as copying two Excel sheets into a single, wide worksheet. KEEP, on the other hand, is like taking a guest list (Customers) and crossing off everyone in another list (Orders) who isn’t on the guest list. In the end, you still have two separate lists.

First, we load the dimension table and pre-filter it to include only the “active” customers relevant to us.

Customers:
LOAD
    CustomerID,
    CustomerName,
    Region,
    Status
FROM [lib://DataFiles/customers.csv]
(txt, utf8, embedded labels, delimiter is ';')
WHERE Status = 'Active';

Next, we load the fact table completely into memory.

Orders:
LOAD
    OrderID,
    CustomerID,
    OrderDate,
    Amount
FROM [lib://DataFiles/orders.csv]
(txt, utf8, embedded labels, delimiter is ';');

Now comes the crucial step: We instruct Qlik to keep only the rows in the already loaded Orders table whose CustomerID also exists in the Customers table. The tables themselves remain separate.

INNER KEEP (Customers)
LOAD * RESIDENT Orders;

Explanation:

  • Separate tables: The result is two lean tables. This is the foundation for the performant Star Schema in Qlik.
  • INNER KEEP: Retains only the intersection in the target table (here Orders).
  • Automatic association: Qlik recognizes the shared field CustomerID and automatically links the two tables in the data model.

Checkpoint: Do you see two separate tables in the data model viewer connected by a line at CustomerID? Perfect!


When should you use which linking strategy?

The right choice depends on one central question: What is the goal of your operation? For simple 1:1 lookups, ApplyMap as a faster alternative to JOINs can reduce script runtime significantly. For complex relationships involving multiple shared fields, the choice becomes more nuanced.

Link Type Result in Data Model Performance Ideal for…
JOIN A single, wider table Slow Denormalization: When you intentionally want to create a flat “report table” for simple analysis.
KEEP Two separate, filtered tables Fast Star Schema: When you want to filter fact tables based on dimensions and keep your data model clean.
CONCATENATE A single, longer table Medium Union: When you want to stack tables with the same structure vertically (e.g. Sales Q1 + Sales Q2).
Association Two separate, unfiltered tables Very fast Dimensional model: The default case in Qlik. Simply load tables and let Qlik create associations via matching field names.

When should you use JOINs instead of KEEPs for denormalization?

A JOIN combines columns from two tables into a single new table. This process is called denormalization because it dissolves the normalized structure of a database model. This can reduce complexity for end users but often comes at the cost of performance and flexibility.

In the following example, customer information is appended directly to each row of the Orders table. The result is a single, wide table.

Orders:
LOAD
    OrderID, CustomerID, OrderDate, Amount
FROM [lib://DataFiles/orders.csv];

LEFT JOIN (Orders)
LOAD
    CustomerID, CustomerName, Region, Segment
FROM [lib://DataFiles/customers.csv];

What JOIN types exist and when do you use them compared to KEEPs?

The full syntax and options for each join variant are described in the Qlik official JOIN documentation.

The different JOIN types are best visualized with Venn diagrams.

  • INNER JOIN: Keeps only rows that have a matching key in both tables (the intersection).
  • LEFT JOIN: Keeps all rows from the left (first) table and adds matching data from the right table. Where no match is found, NULL values are created.
  • RIGHT JOIN: Keeps all rows from the right (second) table. The opposite of LEFT JOIN.
  • OUTER JOIN: Keeps all rows from both tables and combines them.

When JOINs create synthetic keys due to multiple shared fields, consider using link tables for complex relationships to resolve the relationship cleanly.

Warning – performance killer: A JOIN can lead to a Cartesian product if the key in the second table is not unique. Example: If a customer appears twice in the customer table, every order for that customer gets duplicated. This leads to incorrect KPIs and exploding memory usage.


When should JOINs and KEEPs be used in a Star Schema?

The main purpose of KEEP is to maintain a clean and performant Star Schema. A Star Schema consists of a central fact table (e.g. Orders) surrounded by multiple dimension tables (Customers, Products, etc.). This model is highly optimized for Qlik’s associative engine.

KEEP filters a table based on the keys of another table without merging them.

First, we load both tables into memory.

Customers:
LOAD * FROM customers.csv;

Orders:
LOAD * FROM orders.csv;

INNER KEEP retains only the rows in the Orders table whose CustomerID also exists in Customers.

INNER KEEP (Customers)
LOAD * RESIDENT Orders;

RIGHT KEEP retains only the rows in the Customers table whose CustomerID also exists in Orders (i.e. only customers with orders).

RIGHT KEEP (Orders)
LOAD * RESIDENT Customers;

LEFT KEEP would keep all rows in the Orders table whose CustomerID exists in Customers, plus all rows from the original Orders table. Since Orders is the target table here, this command often has little practical use.


For all KEEP variants and their exact filter behaviour, refer to the Qlik official KEEP reference.

When should you use CONCATENATE for vertical table combining?

CONCATENATE is the vertical counterpart to JOIN. Instead of placing columns side by side, it stacks rows on top of each other. It is Qlik’s equivalent of UNION ALL in SQL.

If you load two tables that have exactly the same column names and the same number of columns, Qlik performs an automatic concatenation.

Sales_Q1:
LOAD 'Q1' as Quarter, * FROM [lib://DataFiles/sales_q1.csv];

Sales_Q2:
LOAD 'Q2' as Quarter, * FROM [lib://DataFiles/sales_q2.csv];

In the example above, the contents of Sales_Q2 are automatically appended to Sales_Q1. To control this explicitly or when columns don’t match exactly, use the CONCATENATE command. To prevent automatic concatenation, use NoConcatenate.

NoConcatenate
HistoricalOrders:
LOAD * FROM historical_orders.csv;

When should you use WHERE EXISTS() as a filtering alternative?

WHERE EXISTS() is an extremely efficient alternative to INNER KEEP. The main difference lies in the execution:

  • INNER KEEP: First loads both tables completely into memory, then performs the filtering.
  • WHERE EXISTS(): Filters the second table row by row as it is being loaded. Only rows that pass the filter are loaded into memory. This is often more memory-efficient and faster.

The order matters: The table containing the valid keys (the “whitelist”) must be loaded first.

First, we load the “whitelist” of active customers.

Customers:
LOAD CustomerID, CustomerName, Region
FROM [lib://DataFiles/customers.csv]
WHERE Status = 'Active';

Then we load the Orders table. The WHERE EXISTS(CustomerID) clause ensures that only rows are loaded whose CustomerID already exists in a field of the same name in a previously loaded table.

Orders:
LOAD
    OrderID, CustomerID, OrderDate, Amount
FROM [lib://DataFiles/orders.csv]
WHERE EXISTS(CustomerID);

This pattern is particularly powerful for hierarchical filtering across multiple levels (Category โ†’ Product โ†’ Sale).


How do I troubleshoot JOINs and KEEPs?

The Qlik Community discussion on JOIN vs KEEP covers many real-world scenarios and edge cases worth reviewing before debugging your own models.

What is the Cartesian product problem with JOINs?

Problem: After a JOIN, your table suddenly has millions more rows than expected and KPIs are far too high.

Cause: The join key is not unique in the second table. This causes every row from the first table to be multiplied by every matching row in the second table.

Solution: Ensure the key in the dimension table is unique. Use aggregations (GROUP BY) or a QUALIFY statement to clean up duplicates before performing the JOIN.

Why isn’t my KEEP working?

Problem: The KEEP command isn’t filtering any rows, even though it should.

Checklist:

  1. Same field names? The key field must have exactly the same name in both tables (CustomerID vs. Customer_ID won’t work).
  2. Same data types? A numeric key cannot be compared with a text key. Make sure the formats match.
  3. Correct order? The table referenced by KEEP (e.g. Customers in INNER KEEP (Customers)) must already be fully loaded.

What are the next steps in the course?

Up next: Mapping Tables vs JOINs – Performance Optimization

Related topics:

Previous: Load Strategies | Next: ApplyMap & Lookups

What would help you most right now?

Thanks!