
Star Schema in Qlik Sense: The Complete Guide to Performance & Clarity

KlarMetrics

October 5, 2025 · 17 min read

This is Article 10 of 28 in the Qlik Sense Data Modeling Course.

Previous Article: Resolving Synthetic Keys & Circular References
Next Article: Fact vs Dimension – Design Decisions

What can you learn about Star Schema in Qlik in 25 minutes?

After this guide you will be able to:

  1. Implement star schema architecture in Qlik for 3-8x better expression performance
  2. Make fact vs dimension decisions that reduce memory consumption by 40-60%
  3. Model clean business hierarchies without circular references or synthetic keys

Time investment: 25 min reading + 4 hrs hands-on
Prerequisite: Knowledge of Synthetic Keys & Circular References and basic table relationships
Quick win: In 15 minutes, build a working fact-dimension model with immediate performance improvement

The Star schema (Wikipedia) is a widely adopted pattern in dimensional modeling. For Qlik-specific guidance, see the Qlik data modeling best practices.

How do you get started with your first Star Schema in Qlik?

The problem: Your revenue data is in denormalized form (one large table) and expressions get slower as data volumes grow.

Solution in 4 steps:

  1. Identify transactional data (facts) vs. descriptive data (dimensions)
  2. Extract dimensions into separate tables with unique keys
  3. Keep only keys and measures in the fact table
  4. Leverage Qlik’s automatic associations between facts and dimensions
// Starting point: Denormalized revenue table
SalesFlat:
LOAD * FROM [lib://Data/sales_flat.csv];
// Problems: 10M rows x 25 columns = redundant data, slow aggregations

// Star Schema: Fact table (only transactions + keys)
Sales_Fact:
LOAD
    SalesID,
    CustomerID,    // FK to Customer Dimension
    ProductID,     // FK to Product Dimension
    DateID,        // FK to Date Dimension
    SalesAmount,   // Measure
    Quantity,      // Measure
    Discount       // Measure
FROM [lib://Data/sales_flat.csv];

// Customer Dimension (descriptive attributes)
Customer_Dim:
LOAD DISTINCT
    CustomerID,           // PK
    CustomerName,
    CustomerSegment,
    CustomerRegion,
    CustomerCountry
FROM [lib://Data/sales_flat.csv];

// Product Dimension (descriptive attributes)
Product_Dim:
LOAD DISTINCT
    ProductID,            // PK
    ProductName,
    ProductCategory,
    ProductSubCategory,
    ProductBrand,
    ProductUnitPrice
FROM [lib://Data/sales_flat.csv];

// Date Dimension (time hierarchies)
Date_Dim:
LOAD DISTINCT
    Date(SalesDate) as DateID,    // PK
    Year(SalesDate) as Year,
    Month(SalesDate) as Month,
    Week(SalesDate) as Week,
    WeekDay(SalesDate) as WeekDay
FROM [lib://Data/sales_flat.csv];

Explanation:

  • Fact table: Contains only transaction keys and numeric measures. High granularity, many rows.
  • Dimension tables: Contain descriptive attributes for facts. Low granularity, few rows.
  • Automatic associations: CustomerID in both tables → Qlik links them automatically without JOINs.
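
Because the association does the join work at selection time, front-end expressions need no JOIN syntax. As a quick illustration (chart expressions against the model above; field names follow the script):

```
// Chart dimension: CustomerRegion (comes from Customer_Dim)
// Chart measure – Qlik resolves the CustomerID association automatically:
Sum(SalesAmount)

// Set analysis also traverses the association, e.g. revenue for one segment:
Sum({<CustomerSegment={'Premium'}>} SalesAmount)
```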

Performance: With 10M rows, a typical expression takes ~45s against the denormalized table vs ~8s against the star schema (5.6x speedup).

Checkpoint: Do you see a central fact table with multiple dimension tables around it in the data model? → Perfect! Still one big table? → Troubleshooting

For systematic modeling, see Fact vs Dimension Design.

Which Star Schema pattern fits your project?

Choose the star schema variant based on data volume, business complexity, and performance requirements:

| Pattern | Fact Rows | Dimensions | Complexity | Performance | Ideal for |
| --- | --- | --- | --- | --- | --- |
| Simple Star | < 5M | 3-8 dims | Low | Good | Standard BI, dashboards, prototypes |
| Multiple Facts | 5M-50M | 5-15 dims | Medium | Very good | Enterprise BI, different granularities |
| Conformed Dims | > 50M | 10-30 dims | High | Optimal | Data warehouse, multi-subject-area, enterprise |

Legend:

  • Fact Rows: Number of transaction rows in the largest fact table
  • Dimensions: Number of separate dimension tables in the model
  • Complexity: Development effort and maintenance overhead
  • Performance: Expression speed and memory efficiency

How does Method 1: Simple Star Schema work in Qlik?

The problem

Flat table structures work for small data volumes but quickly become a performance problem as data grows. Redundant data consumes memory and aggregations must iterate over all columns instead of just relevant facts.

The solution

A central fact table (transactional data) surrounded by multiple dimension tables (master data). Each dimension has a primary key, facts have foreign keys to the dimensions.

// Step 1: Define fact table (granularity = one sales record)
Sales_Fact:
LOAD
    SalesID,              // Primary key of the transaction
    CustomerID,           // FK → Customer_Dim
    ProductID,            // FK → Product_Dim
    Date(SalesDate) as DateID,  // FK → Date_Dim
    EmployeeID,           // FK → Employee_Dim
    SalesAmount,          // Measure: What gets aggregated
    Quantity,             // Measure: What gets aggregated
    Discount,             // Measure: What gets aggregated
    CommissionRate        // Measure: What gets aggregated
FROM [lib://Data/sales_transactions.csv];

// Step 2: Customer Dimension (all customer attributes)
Customer_Dim:
LOAD DISTINCT
    CustomerID,           // Primary Key
    CustomerName,
    CustomerSegment,      // Here: Premium, Standard, Basic
    CustomerRegion,
    CustomerCountry,
    CustomerCreditLimit,
    CustomerSignupDate
FROM [lib://Data/customers.csv];  // Separate master data source

// Step 3: Product Dimension (all product attributes)
Product_Dim:
LOAD DISTINCT
    ProductID,            // Primary Key
    ProductName,
    ProductCategory,      // Here: Electronics, Clothing, Books
    ProductSubCategory,
    ProductBrand,
    ProductUnitPrice,
    ProductCost,
    ProductLaunchDate
FROM [lib://Data/products.csv];   // Separate master data source

// Step 4: Employee Dimension (all employee attributes)
Employee_Dim:
LOAD DISTINCT
    EmployeeID,           // Primary Key
    EmployeeName,
    EmployeeDepartment,   // Here: Sales, Marketing, Support
    EmployeeTitle,
    EmployeeHireDate,
    EmployeeManager
FROM [lib://Data/employees.csv];  // Separate master data source

// Step 5: Date Dimension (time hierarchies for better navigation)
Date_Dim:
LOAD DISTINCT             // one row per date, not one per transaction
    DateID,
    Year(DateID) as Year,
    Month(DateID) as Month,
    MonthName(DateID) as MonthName,
    Quarter(DateID) as Quarter,
    Week(DateID) as Week,
    WeekDay(DateID) as WeekDay,
    Day(DateID) as Day
RESIDENT Sales_Fact;
// Here: Generated from sales data, alternatively: Master Calendar

Explanation of strategy:

  • Fact granularity: One record per transaction/event. Contains only keys (for joins) and measures (for aggregation).
  • Dimension denormalization: All attributes of a business entity in one table (Customer*). Few rows, many columns.
  • Natural keys: Business keys (CustomerID, ProductID) instead of technical surrogate keys for better understandability.

Performance context:

  • 5M facts + 4 dimensions: Sum(SalesAmount) 3-5s vs denormalized 15-25s
  • Memory reduction: 60-80% less RAM through eliminated redundancy
  • Development speed: 2-3x faster expression development through clear structure

The 3 most common mistakes (and solutions)

Mistake 1: Measures in dimension tables

  • Symptom: Aggregations produce wrong totals, measures are double-counted
  • Cause: Numeric values (prices, amounts) were placed in dimensions instead of facts
  • Solution: Rule: What gets aggregated belongs in facts, what describes belongs in dimensions
  • Code: ProductUnitPrice in Product_Dim is OK, SalesAmount only in Sales_Fact

Mistake 2: Over-granular dimensions (snowflake instead of star)

  • Symptom: Many small dimension tables, complex joins, poor performance
  • Cause: Region → Country → Continent as separate tables instead of denormalized
  • Solution: Denormalize hierarchies into one dimension table
  • Code: Customer_Dim with CustomerCountry AND CustomerContinent, not a separate Country_Dim

Mistake 3: Business logic in facts instead of dimensions

  • Symptom: Calculated fields everywhere in the code, hard to maintain, inconsistent
  • Cause: Derived attributes (CustomerSegment, ProductCategory) are calculated in expressions
  • Solution: Move business logic to the script, store results in dimensions
  • Code: If(SalesAmount > 10000, 'Premium', 'Standard') as CustomerSegment in Customer_Dim
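
The rule in Mistake 3 can be sketched in the load script: aggregate once per customer, then store the derived segment in the dimension. The 10,000 threshold and segment names are illustrative, not fixed business rules:

```
// Derive CustomerSegment once in the script (illustrative thresholds)
SegmentMap:
MAPPING LOAD
    CustomerID,
    If(Sum(SalesAmount) > 10000, 'Premium', 'Standard') as Segment
RESIDENT Sales_Fact
GROUP BY CustomerID;

Customer_Dim:
LOAD
    CustomerID,
    CustomerName,
    ApplyMap('SegmentMap', CustomerID, 'Standard') as CustomerSegment
FROM [lib://Data/customers.csv];
```

Charts then simply use CustomerSegment as a dimension; no per-expression If() logic remains.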

Best Practices

  • Fact table granularity rule: One row = one business transaction. Never pre-aggregated (Qlik does that at runtime). Test: Can you reconstruct an invoice/order/event from every fact row? Then granularity is correct.
  • Dimension completeness: Every dimension must contain ALL possible attributes of the business entity. Don’t spread them across multiple tables. CustomerSegment, CustomerRegion, CustomerCredit all in Customer_Dim.
  • Surrogate vs natural keys: Use business keys (CustomerID from ERP) instead of technical ones (AutoIncrement). Better debuggability and business users can understand the data. Surrogate keys are only needed for Slowly Changing Dimensions.
  • Date dimension standardization: Always create a separate date dimension with complete hierarchies (Year, Quarter, Month, Week). Never parse dates in expressions. Prepare centralized date logic (fiscal year, holiday calendar) in the dimension.

Checkpoint: Why should hierarchies (Region → Country → Continent) be in one dimension table instead of separate tables?

Answer:

Separate hierarchy tables create a snowflake schema → more joins → worse performance. Qlik optimizes for star schemas with few large dimensions. Denormalized hierarchies in one table also enable better drill-down performance and simpler expression syntax.

For very complex multi-subject areas, see Multiple Facts Pattern.

→ Next step: When you have different granularities (order-level and order-line-level), you need Multiple Facts. Implement Multiple Facts Star Schema.

How does Method 2: Multiple Facts Star Schema work in Qlik?

The problem

Business processes have different granularities: Order header (1x per order) vs order lines (Nx per order) vs payments (Mx per order). A single fact table cannot optimally answer all business questions.

The solution

Multiple fact tables with different granularities that share common dimension tables. Each fact table is optimized for specific business questions.

// Fact 1: Order Header (granularity = one order)
OrderHeader_Fact:
LOAD
    OrderID,              // PK for this fact
    CustomerID,           // FK → Customer_Dim
    EmployeeID,           // FK → Employee_Dim
    Date(OrderDate) as DateID,  // FK → Date_Dim
    OrderTotalAmount,     // Measure: Total order value
    OrderShippingCost,    // Measure: Shipping costs
    OrderDiscount,        // Measure: Total discount
    OrderItemCount        // Measure: Number of line items
FROM [lib://Data/orders.csv];

// Fact 2: Order Lines (granularity = one order line item)
OrderLine_Fact:
LOAD
    OrderLineID,          // PK for this fact
    OrderID,              // FK → OrderHeader_Fact (drill-through)
    ProductID,            // FK → Product_Dim
    CustomerID,           // FK → Customer_Dim (redundant but performance-optimal)
    Date(OrderDate) as DateID,  // FK → Date_Dim (redundant)
    LineQuantity,         // Measure: Quantity for this line
    LineUnitPrice,        // Measure: Unit price
    LineAmount,           // Measure: Line total
    LineDiscountPercent   // Measure: Discount %
FROM [lib://Data/order_lines.csv];

// Fact 3: Payments (granularity = one payment)
Payment_Fact:
LOAD
    PaymentID,            // PK for this fact
    OrderID,              // FK → OrderHeader_Fact
    CustomerID,           // FK → Customer_Dim (redundant)
    Date(PaymentDate) as PaymentDateID,  // FK → Date_Dim
    PaymentAmount,        // Measure: Payment amount
    PaymentMethod,        // Dimension: Credit Card, Bank Transfer
    PaymentStatus         // Dimension: Completed, Pending, Failed
FROM [lib://Data/payments.csv];

// Shared Dimensions (used by all facts)
Customer_Dim:
LOAD DISTINCT
    CustomerID,           // Shared PK
    CustomerName,
    CustomerSegment,
    CustomerRegion
FROM [lib://Data/customers.csv];

Product_Dim:
LOAD DISTINCT
    ProductID,            // Shared PK
    ProductName,
    ProductCategory,
    ProductBrand
FROM [lib://Data/products.csv];

// Advanced: Date Dimension spanning the full range across all facts
// First determine min/max over the fact dates
TempRange:
LOAD Min(DateID) as MinDate, Max(DateID) as MaxDate
RESIDENT OrderHeader_Fact;

LET vMinDate = Num(Peek('MinDate', 0, 'TempRange'));
LET vMaxDate = Num(Peek('MaxDate', 0, 'TempRange'));
DROP TABLE TempRange;

Date_Dim:
LOAD
    DateID,
    Year(DateID) as Year,
    Month(DateID) as Month,
    Quarter(DateID) as Quarter,
    WeekDay(DateID) as WeekDay;
LOAD Date($(vMinDate) + IterNo() - 1) as DateID
AUTOGENERATE 1
WHILE $(vMinDate) + IterNo() - 1 <= $(vMaxDate);

Explanation of strategy:

  • Granularity optimization: Each fact at the optimal level for typical business questions.
  • Controlled redundancy: CustomerID in all facts for performance, even though derivable via OrderID.
  • Shared dimensions: One Customer_Dim for all facts → consistent navigation and filtering.

Performance context:

  • Order-level analysis (revenue per customer): OrderHeader_Fact 2-4s vs OrderLine_Fact 8-15s
  • Product-level analysis (units sold): OrderLine_Fact 3-6s vs OrderHeader_Fact impossible
  • Payment analysis (collection performance): Payment_Fact 1-3s vs calculated from orders 10-20s

The 3 most common mistakes (and solutions)

Mistake 1: Inconsistent dimensions between facts

  • Symptom: CustomerSegment in OrderHeader_Fact differs from Payment_Fact
  • Cause: Each fact table pulls customer data from different sources/timestamps
  • Solution: One central Customer_Dim table, all facts reference it
  • Code: Never load customer attributes directly into facts, always via Customer_Dim

Mistake 2: Over-normalization – too many granular facts

  • Symptom: 15+ fact tables for every mini-process, complex expressions
  • Cause: Every database table becomes a separate fact without business context
  • Solution: Consolidate facts by business processes, not technical tables
  • Code: OrderLines + OrderCharges + OrderTaxes → one OrderLine_Fact with calculated totals
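
One way to sketch this consolidation, assuming hypothetical order_charges.csv and order_taxes.csv sources keyed by OrderLineID:

```
// Fold per-line charges and taxes into the single OrderLine_Fact
ChargeMap:
MAPPING LOAD OrderLineID, ChargeAmount
FROM [lib://Data/order_charges.csv];      // assumed source

TaxMap:
MAPPING LOAD OrderLineID, TaxAmount
FROM [lib://Data/order_taxes.csv];        // assumed source

OrderLine_Fact:
LOAD
    OrderLineID,
    OrderID,
    ProductID,
    LineAmount,
    ApplyMap('ChargeMap', OrderLineID, 0) as LineChargeAmount,
    ApplyMap('TaxMap', OrderLineID, 0)    as LineTaxAmount
FROM [lib://Data/order_lines.csv];
```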

Mistake 3: Missing bridge tables for many-to-many

  • Symptom: Wrong aggregations for customers with multiple addresses or products with multiple categories
  • Cause: M:N relationships were not modeled as separate link tables
  • Solution: Bridge tables for M:N, allocation factors for correct distribution
  • Code: Customer_Address_Bridge with weight factors for primary/secondary address
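
A weighted bridge for the address example might look like this (table, fields, and source path are hypothetical):

```
// Bridge for customers with multiple addresses (M:N)
Customer_Address_Bridge:
LOAD
    CustomerID,             // FK → Customer_Dim
    AddressID,              // FK → Address_Dim
    AllocationWeight        // e.g. 0.7 primary / 0.3 secondary; must sum to 1 per customer
FROM [lib://Data/customer_addresses.csv];   // assumed source

// In expressions, weight the measure so totals are not inflated:
// Sum(SalesAmount * AllocationWeight)
```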

Best Practices

  • Fact consolidation rule: Maximum 5-8 fact tables per subject area. More confuses business users and complicates cross-fact analysis. Rule: One fact per major business process (Sales, Purchases, Production, Service).
  • Redundant foreign keys: CustomerID in all facts even when derivable via OrderID. Performance gain outweighs normalization. Exception: High-cardinality keys (OrderID in Payment) only when truly needed.
  • Date role-playing: One calendar definition but multiple roles (OrderDate, ShipDate, PaymentDate). Because Qlik associates strictly on field names, duplicate the calendar per role (renaming DateID to PaymentDateID etc.) and filter via role-specific fields: Sum({<Year={2024}>} OrderTotalAmount) vs Sum({<PaymentYear={2024}>} PaymentAmount).
  • Fact-to-fact relationships: Explicitly document how facts relate (Order → OrderLine → Payment). Bridge tables for M:N, drill-through keys for 1:N. Never create direct associations between facts without a bridge.
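
Since Qlik associates on field names, a role-playing calendar is usually implemented by duplicating the calendar per role. A minimal sketch, assuming Date_Dim is already loaded as above:

```
// Role-playing calendar: rename the key and attributes per date role
PaymentDate_Dim:
LOAD
    DateID   as PaymentDateID,   // associates to Payment_Fact.PaymentDateID
    Year     as PaymentYear,
    Month    as PaymentMonth,
    Quarter  as PaymentQuarter
RESIDENT Date_Dim;
```

Expressions then filter per role, e.g. Sum({<PaymentYear={2024}>} PaymentAmount).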

Checkpoint: Why should CustomerID be in both OrderHeader_Fact and OrderLine_Fact, even though it’s derivable via OrderID?

Answer:

Performance and flexibility. Direct CustomerID in OrderLine_Fact enables fast customer-to-product analysis without joining through OrderHeader. Qlik can also perform better indexing and memory optimization. The memory overhead is minimal compared to the performance gain for complex expressions.

For enterprise-scale with 50+ facts, see Conformed Dimensions Pattern.

→ Next step: For enterprise environments with many subject areas, you need Conformed Dimensions. Implement Conformed Dimensions.

What are Conformed Dimensions in Method 3?

The problem

Enterprise environments have multiple subject areas (Sales, Marketing, Finance, Operations) with their own fact tables. Without coordination, inconsistent dimension definitions emerge → business users see different “truths” depending on the data area.

The solution

Central, standardized dimension tables shared by all subject areas. Conformed dimensions ensure a unified business view and enable cross-subject-area analysis.

// Conformed Dimension 1: Master Customer (for all subject areas)
Customer_Conformed:
LOAD
    CustomerID,                    // Standardized Business Key
    CustomerName,
    CustomerSegment,               // Consistent: Premium, Gold, Silver, Bronze
    CustomerRegion,                // Consistent: North, South, East, West
    CustomerCountry,               // Consistent: ISO Country Codes
    CustomerTier,                  // Business Rule: Based on 12-month Revenue
    CustomerSignupDate,
    CustomerStatus,                // Active, Inactive, Suspended
    CustomerCreditRating,          // A, B, C, D (Enterprise-wide Standard)
    CustomerIndustryCode,          // NAICS Standard Classification
    // Calculated fields for enterprise consistency
    If(CustomerTier = 'Enterprise', 1, 0) as IsEnterpriseCustomer,
    If(Year(CustomerSignupDate) = Year(Today()), 1, 0) as IsNewCustomer
FROM [lib://MasterData/Customer_Master.qvd] (qvd);

// Subject Area 1: Sales Facts
Sales_Fact:
LOAD
    SalesID,
    CustomerID,                    // References Customer_Conformed
    ProductID,                     // References Product_Conformed
    Date(SalesDate) as DateID,     // References Date_Conformed
    SalesRepID,                    // References Employee_Conformed
    SalesAmount,
    SalesQuantity,
    SalesDiscount
FROM [lib://Sales/sales_transactions.qvd] (qvd);

// Subject Area 2: Marketing Facts
Campaign_Fact:
LOAD
    CampaignID,
    CustomerID,                    // Same Customer_Conformed as Sales
    Date(CampaignDate) as DateID,  // Same Date_Conformed as Sales
    CampaignType,
    CampaignCost,
    CampaignImpressions,
    CampaignClicks,
    CampaignConversions
FROM [lib://Marketing/campaigns.qvd] (qvd);

// Subject Area 3: Service Facts
Service_Fact:
LOAD
    ServiceTicketID,
    CustomerID,                    // Same Customer_Conformed
    ProductID,                     // Same Product_Conformed
    Date(ServiceDate) as DateID,   // Same Date_Conformed
    ServiceRepID,                  // References Employee_Conformed
    ServiceType,
    ServiceDuration,
    ServiceCost,
    ServiceSatisfactionScore
FROM [lib://Service/service_tickets.qvd] (qvd);

// Conformed Product Dimension (shared across all subject areas)
Product_Conformed:
LOAD
    ProductID,                     // Standardized across all systems
    ProductName,
    ProductCategory,               // L1: Electronics, Clothing, Books
    ProductSubCategory,            // L2: Smartphones, Laptops, Accessories
    ProductBrand,
    ProductLaunchDate,
    ProductStatus,                 // Active, Discontinued, Phase-Out
    ProductProfitMargin,           // Calculated by Finance team
    // Calculated business rules
    If(ProductProfitMargin > 0.4, 'High Margin',
       If(ProductProfitMargin > 0.2, 'Medium Margin', 'Low Margin')) as ProductProfitSegment
FROM [lib://MasterData/Product_Master.qvd] (qvd);

// Conformed Date Dimension (enterprise calendar)
Date_Conformed:
LOAD
    DateID,
    Year(DateID) as Year,
    Month(DateID) as Month,
    Quarter(DateID) as Quarter,
    Week(DateID) as Week,
    WeekDay(DateID) as WeekDay,
    // Enterprise-specific business calendar
    If(Month(DateID) >= 7, Year(DateID), Year(DateID)-1) as FiscalYear,  // Fiscal year starts July
    If(WeekDay(DateID) >= 1 and WeekDay(DateID) <= 5, 1, 0) as IsBusinessDay,
    If(Match(DateID, '2024-01-01', '2024-07-04', '2024-12-25'), 1, 0) as IsHoliday
FROM [lib://MasterData/Calendar_Master.qvd] (qvd);

// Conformed Employee Dimension
Employee_Conformed:
LOAD
    EmployeeID,
    EmployeeName,
    EmployeeDepartment,            // Sales, Marketing, Service, Finance
    EmployeeTitle,
    EmployeeRegion,                // Same regions as Customer_Conformed
    EmployeeHireDate,
    EmployeeManager,
    EmployeeStatus                 // Active, On Leave, Terminated
FROM [lib://MasterData/Employee_Master.qvd] (qvd);

Explanation of strategy:

  • Single source of truth: One Customer_Conformed for Sales, Marketing, Service → unified customer view.
  • Cross-subject analysis: Customer revenue (Sales) vs marketing spend (Campaigns) vs service cost (Support) → ROI analysis possible.
  • Business rule centralization: Fiscal year definition, customer tier logic centralized in dimensions → no code duplication.

Performance context:

  • Cross-subject expressions: Customer LTV (3 facts) 8-15s vs separate models 45-90s
  • Consistent aggregations: CustomerSegment definitions identical across all apps
  • Memory efficiency: Shared dimensions 40-60% less memory than duplicate dimensions

The 3 most common mistakes (and solutions)

Mistake 1: Dimension proliferation – too many “conformed” variants

  • Symptom: Customer_Sales, Customer_Marketing, Customer_Service as separate dimensions
  • Cause: Each team creates their own customer definition because the shared one is too complex
  • Solution: One true Customer_Conformed + subject-area-specific attributes in facts or bridge tables
  • Code: Customer_Marketing_Attributes as a separate table, not Customer_Marketing_Dim

Mistake 2: Governance vacuum – no central dimension ownership

  • Symptom: Conformed dimensions are inconsistently modified by different teams
  • Cause: No clear responsibility for master data management
  • Solution: Data governance board defines dimension owner per conformed dimension
  • Code: Customer_Conformed owned by Sales Ops, Product_Conformed owned by Product Management

Mistake 3: Over-conforming – too many attributes in conformed dimensions

  • Symptom: Customer_Conformed has 50+ columns, becomes unwieldy and slow
  • Cause: Each subject-area team adds “their” customer attributes
  • Solution: Core conformed (10-15 attributes) + subject-area extensions (separate tables)
  • Code: Customer_Core_Conformed (Name, Segment, Region) + Customer_Sales_Extension (Territory, Rep)
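
The core-plus-extension split from Mistake 3 can be sketched as two tables sharing the CustomerID key (the extension source path is an assumption):

```
// Core conformed dimension: universal attributes only
Customer_Core_Conformed:
LOAD
    CustomerID,
    CustomerName,
    CustomerSegment,
    CustomerRegion
FROM [lib://MasterData/Customer_Master.qvd] (qvd);

// Subject-area extension: only the Sales apps load this table;
// it associates automatically via CustomerID
Customer_Sales_Extension:
LOAD
    CustomerID,
    SalesTerritory,
    AssignedSalesRep
FROM [lib://Sales/customer_sales_attributes.qvd] (qvd);  // assumed source
```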

Best Practices

  • Dimension ownership matrix: Explicitly define who maintains which conformed dimension (Customer → Sales Ops, Product → Product Mgmt, Employee → HR). Change requests go through the dimension owner. Prevents chaos and inconsistencies.
  • Core vs extended attributes: Conformed dimensions contain only universally relevant attributes (Name, Status, Hierarchy). Subject-area-specific attributes in extension tables or directly in facts. Keeps dimensions focused and maintainable.
  • Version control for dimensions: Conformed dimensions are critical enterprise assets. Git-based version control, change approval process, rollback capability. Track every change with business justification.
  • Cross-subject validation: Automated tests that customer revenue (Sales) + customer cost (Service) + customer acquisition cost (Marketing) = consistent customer P&L. Validate business logic consistency across subject areas.
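
A lightweight version of this cross-subject validation can run at the end of the reload script; the comparison and tolerance handling here are only a sketch:

```
// Reconciliation totals per subject area
SalesTotal:
LOAD Sum(SalesAmount) as SalesSum RESIDENT Sales_Fact;
LET vSalesSum = Num(Peek('SalesSum', 0, 'SalesTotal'));
DROP TABLE SalesTotal;

ServiceTotal:
LOAD Sum(ServiceCost) as ServiceSum RESIDENT Service_Fact;
LET vServiceSum = Num(Peek('ServiceSum', 0, 'ServiceTotal'));
DROP TABLE ServiceTotal;

TRACE P&L inputs: Sales $(vSalesSum), Service cost $(vServiceSum);
// A full implementation would compare these against Finance figures
// and fail the reload when tolerances are exceeded.
```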

Checkpoint: Why should fiscal year definitions be in Date_Conformed instead of calculated separately in each app?

Answer:

Consistency and performance. If the Sales app calculates FiscalYear differently than the Finance app, you get different “YTD” numbers → business confusion. Central definition in the conformed dimension ensures enterprise-wide consistency. Performance bonus: Calculation happens once in ETL instead of on every expression.

For complete enterprise implementation, see Data Governance Frameworks.

→ Next step: Conformed dimensions require enterprise data governance. Data Governance Framework implements standards and quality assurance.

How do all Star Schema patterns compare in performance?

These benchmarks are based on typical enterprise data volumes (standard hardware: 32 GB RAM, SSD, optimized QVD loads).

| Pattern | Expressions (5M Facts) | Cross-Subject Analysis | Memory (Peak) | Development Speed | Maintenance | Enterprise-Ready |
| --- | --- | --- | --- | --- | --- | --- |
| Denormalized Flat | 15-25s | Not possible | 3.2 GB | Fast | High | No |
| Simple Star | 3-6s | Limited | 1.8 GB | Medium | Medium | For smaller orgs |
| Multiple Facts | 2-4s | Good | 2.1 GB | Medium | Medium-High | Yes |
| Conformed Dims | 2-3s | Excellent | 1.9 GB | Slow | Low | Optimal |

Legend:

  • Expressions: Average time for Sum(Amount) by Customer, Product for 5M fact rows
  • Cross-subject analysis: Ability to calculate customer LTV across Sales + Marketing + Service
  • Memory (Peak): Maximum RAM usage during load + first expression calculation
  • Development speed: Time from requirements to working app
  • Maintenance: Long-term effort for changes, new requirements, debugging

Interpretation: Conformed Dimensions have the highest initial complexity but best long-term performance and maintainability. Multiple Facts is the sweet spot for mid-size enterprise environments. Simple Star for teams that want to get started quickly.

How do you troubleshoot common Star Schema issues in Qlik?

How do aggregations affect totals in Star Schema?

Symptom: Sum(SalesAmount) in star schema differs from the original flat table

Most common causes:

  1. Measures in dimensions instead of facts → Solution: Numeric values only in fact tables, dimensions only for descriptive attributes
  2. Many-to-many without bridge table → Solution: Model M:N relationships via separate bridge tables
  3. Duplicate records in facts → Solution: DISTINCT or GROUP BY in fact load, check for true duplicates

Debug steps:

  1. Compare row count: LET vFactRows = NoOfRows('Sales_Fact'); TRACE Fact rows: $(vFactRows); vs the original table
  2. Test with simple Sum(SalesAmount) without dimensions → should be identical
  3. Check whether measures accidentally ended up in dimension tables

Code fix:

// Wrong: Amount in Dimension
Customer_Dim: LOAD CustomerID, CustomerName, CustomerRevenue FROM customers.csv;

// Correct: Only descriptive attributes in Dimensions
Customer_Dim: LOAD CustomerID, CustomerName, CustomerSegment FROM customers.csv;
// Revenue calculated from Facts: Sum(SalesAmount)
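
For the duplicate-record cause above, a sketch of both fixes (use one or the other, not both):

```
// Fix A: exact duplicate rows (LOAD DISTINCT removes them)
Sales_Fact:
LOAD DISTINCT
    SalesID, CustomerID, ProductID, SalesAmount
FROM [lib://Data/sales_flat.csv];

// Fix B: several legitimate source rows per key; aggregate to the intended grain
Sales_Fact:
LOAD
    SalesID, CustomerID, ProductID,
    Sum(SalesAmount) as SalesAmount
FROM [lib://Data/sales_flat.csv]
GROUP BY SalesID, CustomerID, ProductID;
```

Afterwards, verify that Sum(SalesAmount) matches the source system.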

How do you connect facts to dimensions in Star Schema?

Symptom: Fact and dimension tables stand isolated in the data model, no associations

Most common causes:

  1. Key field name mismatch → Solution: CustomerID in facts must be named exactly CustomerID in dimensions
  2. Data type mismatch → Solution: String vs number keys cannot be automatically linked
  3. Key values not present → Solution: Orphaned records in facts without corresponding dimension records

Debug steps:

  1. Check field names: LET vField = FieldName(1, 'Sales_Fact'); TRACE First fact field: $(vField);
  2. Test key overlap: Count(DISTINCT CustomerID) in both tables
  3. Check data types with sample values

Code fix:

// Ensure consistent data types and field names
Sales_Fact:
LOAD
    Text(CustomerID) as CustomerID,  // Force to text
    SalesAmount
FROM sales.csv;

Customer_Dim:
LOAD
    Text(CustomerID) as CustomerID,  // Same type as Facts
    CustomerName
FROM customers.csv;

Why is performance worse after Star Schema conversion?

Symptom: Expressions take longer than before with the denormalized table

Most common causes:

  1. Too many small dimensions → Solution: Consolidate related dimensions (snowflake → star)
  2. Non-optimized QVD loads → Solution: Check that all tables load as optimized QVDs
  3. Over-normalization → Solution: Denormalize hierarchies into dimensions

Debug steps:

  1. Measure memory usage before/after (app file size, or RAM metrics in the Qlik Management Console / monitoring apps)
  2. Profiling: Which specific expressions have gotten slower?
  3. Check QVD load times: “optimized” vs “non-optimized” in the log

Code fix:

// Consolidate related dimensions
Customer_Dim:
LOAD
    CustomerID,
    CustomerName,
    CustomerRegion,    // Denormalized instead of separate Region_Dim
    CustomerCountry,   // Denormalized instead of separate Country_Dim
    CustomerContinent  // Full hierarchy in one table
FROM customers.csv;

See also: QVD Performance Optimization for memory-optimized star schemas

What are the next steps for Star Schema in Qlik?

You can now build performant, clean star schemas that meet enterprise requirements. For production-ready implementation, dive deeper into these topics:

1. Advanced modeling: Star schemas are the foundation for more complex patterns. Fact vs Dimension Design Decisions shows you how to make optimal business entity assignments.

2. Performance maximization: Clean models need optimized load patterns. QVD Optimization for Star Schemas achieves maximum memory efficiency and load speed.

3. Enterprise governance: Conformed dimensions require data governance. Enterprise Data Governance Framework establishes standards, quality gates, and change management for scalable star schema architectures.

What are the next steps in the Star Schema course?

Phases 1 & 2 complete! You have mastered the fundamentals (articles 1-5) and advanced data processing (articles 6-10).

Upcoming phases:

  • Phase 3 (Articles 11-15): Data Modeling Concepts
  • Phase 4 (Articles 16-20): Advanced Development
  • Phase 5 (Articles 21-28): Enterprise & Operations

Related topics:

Previous: Data Model Problems | Next: Fact vs Dimension Design
