VISUALIZATION & DASHBOARD DESIGN

Qlik Sense Scatter Plots: How to Uncover Hidden Data Correlations

Author

Qlik Doktor

September 28, 2025 · 8 min read

Last updated: September 2025 | Reading time: 10 minutes | Level: Advanced

The problem: You suspect that marketing spend and sales performance are connected, but how strong is the relationship? And which hidden correlations is everyone else missing? Bar and line charts usually show one measure at a time, so you need a visualization built to show how two measures move together.

The solution: Scatter Plots reveal correlations that remain invisible in other charts. In 10 minutes you’ll learn how to find hidden relationships and make data-driven decisions before the competition even knows what to look for.

What Will You Learn in This Qlik Sense Scatter Plot Tutorial?

  • Correlation detection: Distinguish strong, weak, and false relationships
  • Multi-variable analysis: Analyze 3-4 dimensions simultaneously
  • Business insights: 5 practical use cases with immediate findings
  • Statistical significance: Separate real trends from random patterns
  • Advanced features: Clustering, forecasting, and outlier detection

How Scatter Plots Revolutionize Your Data Analysis

Understanding the Hidden Correlation Problem

Classic situation: You’re the Marketing Director and you see:

  • Campaign A: $50K budget, 200 leads
  • Campaign B: $80K budget, 180 leads
  • Campaign C: $30K budget, 250 leads

Without Scatter Plot: You see only individual values and guess

With Scatter Plot: You instantly see that leads do NOT rise linearly with budget (the $30K campaign generated the most leads, the $80K campaign the fewest), which points to a possible sweet spot around $30-40K
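The same point can be checked with simple arithmetic. The sketch below is plain Python run outside Qlik, and the variable names are illustrative. It computes leads per $1K of budget for the three campaigns listed above. If budget and leads scaled linearly, the ratio would be roughly the same for every campaign.

```python
# Campaign figures from the example above: (budget in $K, leads).
campaigns = {"A": (50, 200), "B": (80, 180), "C": (30, 250)}

# Leads generated per $1K of budget; a linear relationship would give
# roughly the same ratio for every campaign.
leads_per_k = {name: leads / budget for name, (budget, leads) in campaigns.items()}

# Print the campaigns from most to least efficient.
for name, ratio in sorted(leads_per_k.items(), key=lambda kv: -kv[1]):
    print(f"Campaign {name}: {ratio:.2f} leads per $1K")
```

Three campaigns are far too few to draw conclusions from, so treat this only as an illustration of what the chart makes visible at a glance.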

The Multi-Dimensional Secret of Qlik Sense Scatter Plots

Scatter Plots can display 4 dimensions simultaneously:

  • X-axis: Marketing Budget
  • Y-axis: Generated Leads
  • Bubble size: Lead Quality Score
  • Color: Campaign Type

When Are Scatter Plots the Right Choice in Qlik Sense?


perfect_for:
  - Marketing ROI vs. Lead Quality
  - Employee performance vs. experience
  - Product price vs. sales volume vs. customer satisfaction
  - Website traffic vs. conversion vs. bounce rate
  - Production output vs. quality vs. cost

not_suitable_for:
  - Fewer than 20 data points (→ Table)
  - Time series trends (→ Line Chart)
  - Category comparisons (→ Bar Chart)
  - Hierarchical data (→ Treemap)

How to Set Up Your First Scatter Plot in 10 Minutes

Before configuring, skim the official Qlik scatter plot documentation to confirm which features are available in your Qlik Cloud version.

Identifying the Data Relationship

Minimum requirements:

  • 2 numeric variables (X and Y axis)
  • At least 20-30 data points
  • Optional: 1-2 additional dimensions (size, color)

Example data structure:


Campaign | Budget | Leads | Quality_Score | Channel | Region
Campaign_A | 45000 | 220 | 8.2 | Social | North
Campaign_B | 32000 | 180 | 9.1 | Email | South
Campaign_C | 67000 | 290 | 7.8 | PPC | West
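Before loading data like this into Qlik, you can run a quick check against the minimum requirements. The following is a minimal Python sketch. The helper names (`parse_table`, `check_scatter_ready`) are hypothetical, and the 20-point threshold is the lower bound recommended in this tutorial.

```python
# Sample rows copied from the example table above (pipe-delimited).
RAW = """Campaign | Budget | Leads | Quality_Score | Channel | Region
Campaign_A | 45000 | 220 | 8.2 | Social | North
Campaign_B | 32000 | 180 | 9.1 | Email | South
Campaign_C | 67000 | 290 | 7.8 | PPC | West"""

MIN_POINTS = 20  # lower bound recommended in this tutorial


def parse_table(raw):
    """Split a pipe-delimited table into a list of dicts keyed by header."""
    lines = [line.strip() for line in raw.strip().splitlines()]
    header = [cell.strip() for cell in lines[0].split("|")]
    return [dict(zip(header, (c.strip() for c in line.split("|"))))
            for line in lines[1:]]


def is_numeric(value):
    """Return True if the string can be read as a number."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def check_scatter_ready(rows, x_field, y_field, min_points=MIN_POINTS):
    """Return a list of problems; an empty list means the data looks usable."""
    problems = []
    for field in (x_field, y_field):
        if not all(is_numeric(row[field]) for row in rows):
            problems.append(f"{field} is not numeric")
    if len(rows) < min_points:
        problems.append(f"only {len(rows)} data points, need at least {min_points}")
    return problems


rows = parse_table(RAW)
print(check_scatter_ready(rows, "Budget", "Leads"))
```

With only the three sample rows, the check reports too few data points, which is expected: the table above shows the structure, not a full dataset.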

How to Configure a Scatter Plot in Qlik Sense

  1. Add new object → «Scatter plot»
  2. Dimension (bubble identification):

Field: [Campaign]
Label: "Campaign"
  3. X-axis (first variable):

Expression: Sum([Budget])
Label: "Marketing Budget $"
Axis limit: Auto
  4. Y-axis (second variable):

Expression: Sum([Leads])
Label: "Generated Leads"
Axis limit: 0 to Auto

How to Add a Third Dimension (Bubble Size)

Bubble size (optional third dimension):


expression: "Avg([Quality_Score])"
label: "Lead Quality"
size_range: "3 to 15"

How to Add a Fourth Dimension (Color)

Color (optional fourth dimension):


expression: "[Channel]"
color_palette: "12-color distinct"
legend: "Right"

Business Scenarios from Marketing to Operations

Scenario 1: Marketing ROI Optimization

Problem: «Which campaigns deliver the best ROI with the highest lead quality?»

Setup:


dimension: "[CampaignID]"
x_axis:
  expression: "Sum([MarketingSpend])"
  title: "Marketing Budget $"

y_axis:
  expression: "Sum([Revenue])"
  title: "Generated Revenue $"

bubble_size:
  expression: "Avg([LeadScore])"
  title: "Avg Lead Quality"

color:
  expression: "[Channel]"
  values: ["Social", "Email", "PPC", "Content"]

trend_line: true

Business insight: You instantly see campaigns with high ROI AND high lead quality. The sweet spot between budget and results becomes visible.

Scenario 2: Sales Performance vs. Experience

Problem: «Does sales performance correlate with years of experience?»

Setup:


dimension: "[SalesPersonID]"
x_axis:
  expression: "[YearsExperience]"
  title: "Years of Experience"

y_axis:
  expression: "Sum([SalesRevenue])"
  title: "Annual Revenue $"

bubble_size:
  expression: "Avg([CustomerRetention])"
  title: "Customer Retention Rate"

color:
  expression: "[SalesTeam]"
  conditional_coloring: true

Advanced feature: Display the correlation coefficient:


// Add as a text object:
Correl(Aggr(Sum([SalesRevenue]), [SalesPersonID]),
       Aggr([YearsExperience], [SalesPersonID]))
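To see what this expression computes, here is a minimal Python sketch of the same two steps: first aggregate to one row per salesperson (the role of Aggr), then calculate the Pearson correlation across those rows. The transaction data is invented for illustration, and the code is a sketch of the statistic, not of Qlik's internals.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical transaction rows: (salesperson, years_experience, revenue).
transactions = [
    ("S1", 1, 40_000), ("S1", 1, 35_000),
    ("S2", 3, 60_000), ("S2", 3, 50_000),
    ("S3", 5, 90_000), ("S3", 5, 70_000),
    ("S4", 8, 120_000), ("S4", 8, 60_000),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Step 1 (the Aggr part): collapse transactions to one row per salesperson.
revenue = defaultdict(float)
experience = {}
for person, years, amount in transactions:
    revenue[person] += amount
    experience[person] = years

# Step 2 (the Correl part): correlate the per-person values.
people = sorted(revenue)
r = pearson([experience[p] for p in people], [revenue[p] for p in people])
print(f"r = {r:.3f}")
```

Aggregating first matters: correlating raw transactions would weight salespeople by how many deals they closed rather than treating each person as one data point.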

Scenario 3: Finding the Product Price Sweet Spot

Problem: «Is there an optimal price for maximum sales volume?»

Setup:


dimension: "[ProductID]"
x_axis:
  expression: "Avg([Price])"
  title: "Average Price $"

y_axis:
  expression: "Sum([UnitsSold])"
  title: "Units Sold"

bubble_size:
  expression: "Sum([Price] * [UnitsSold])"
  title: "Total Revenue"

color:
  expression: "[ProductCategory]"

reference_lines:
  x_line:
    value: "Avg(TOTAL [Price])"
    label: "Market Average Price"
  y_line:
    value: "Avg(TOTAL [UnitsSold])"
    label: "Average Sales"

Scenario 4: Website Performance Analysis

Problem: «How are traffic, bounce rate, and conversions connected?»

Setup:


dimension: "[PageURL]"
x_axis:
  expression: "Sum([PageViews])"
  title: "Page Views"

y_axis:
  expression: "Sum([Conversions]) / Sum([Sessions])"
  title: "Conversion Rate %"

bubble_size:
  expression: "Avg([TimeOnPage])"
  title: "Time on Page"

color:
  expression: "If([BounceRate] > 0.7, 'High', If([BounceRate] > 0.4, 'Medium', 'Low'))"
  title: "Bounce Rate Category"

Scenario 5: Production Efficiency vs. Quality

Problem: «Does higher production speed lead to worse quality?»

Setup:


dimension: "[ProductionLineID]"
x_axis:
  expression: "Avg([UnitsPerHour])"
  title: "Units per Hour"

y_axis:
  expression: "Avg([QualityScore])"
  title: "Quality Score (1-10)"

bubble_size:
  expression: "Sum([ProductionCosts])"
  title: "Production Costs"

color:
  expression: "[Shift]"
  values: ["Morning", "Afternoon", "Night"]

cluster_analysis: true

Advanced Features for Pro-Level Scatter Plots

The scatter plot properties reference documents every setting available in the properties panel, including clustering and regression options covered below.

Automatic Clustering


clustering:
  enabled: true
  number_of_clusters: 4
  method: "K-Means"
  color_by_cluster: true
  show_centroids: true
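What a K-Means setting like the one above does can be illustrated outside Qlik. The following is a minimal Python sketch of the textbook algorithm, not Qlik's implementation. The function name, the sample points, and the explicit starting centroids are assumptions chosen to keep the result reproducible.

```python
from math import dist


def k_means(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then move
    each centroid to the mean of its points. Starting centroids are passed
    in explicitly so the result is reproducible."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for every point.
        labels = [min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
                  for p in points]
        # Update step: mean of the points in each cluster.
        new_centroids = []
        for i, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                new_centroids.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster where it was
        if new_centroids == centroids:
            break  # converged: nothing moved
        centroids = new_centroids
    return labels, centroids


# Two visibly separate groups of (x, y) points.
points = [(1, 1), (1.5, 2), (1, 1.5), (8, 8), (9, 8.5), (8.5, 9)]
labels, centroids = k_means(points, centroids=[(0, 0), (10, 10)])
print(labels)
print(centroids)
```

Because the algorithm relies on Euclidean distance, axes with very different scales (dollars versus a 1-10 score, for example) should be normalized first, otherwise the larger scale dominates the grouping.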

Business value: Automatic grouping of similar data points (e.g., High/Medium/Low performers).

Regression Lines and R² in Qlik Sense Scatter Plots


trend_analysis:
  linear_regression:
    enabled: true
    show_r_squared: true
    confidence_interval: 95%

  polynomial_regression:
    degree: 2
    for_curved_data: true
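If you want to verify a linear trend line and its R² by hand, the calculation is short. This is a minimal Python sketch of ordinary least squares with invented sample data. The function name `linear_fit` is hypothetical, and the sketch covers only the linear case.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept, plus R²."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R² = 1 - (unexplained variation / total variation).
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


# Hypothetical data: budget in $K and generated leads.
budget = [10, 20, 30, 40, 50]
leads = [25, 45, 62, 85, 103]

slope, intercept, r_squared = linear_fit(budget, leads)
print(f"leads = {slope:.2f} * budget + {intercept:.2f}, R² = {r_squared:.4f}")
```

An R² close to 1 means the line explains most of the variation in Y. For a straight-line fit, R² equals the square of the correlation coefficient r.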

To go beyond visual patterns, use tables to drill into the individual data points behind your scatter plot clusters.

Outlier Detection in Qlik Sense Scatter Plots


outlier_detection:
  method: "Standard Deviation"
  threshold: 2.5
  highlight: true
  color: "#FF0000"
  automatic_labels: true
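The standard-deviation rule above can be sketched in a few lines. The Python below is an illustration with invented values and a hypothetical helper name (`find_outliers`). It flags every value that lies more than 2.5 population standard deviations from the mean.

```python
from statistics import mean, pstdev


def find_outliers(values, threshold=2.5):
    """Return the indexes of values more than `threshold` population
    standard deviations away from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing can be an outlier
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical lead counts: nine ordinary campaigns and one extreme one.
leads = [10, 10, 10, 10, 10, 10, 10, 10, 10, 100]
print(find_outliers(leads))
```

Keep in mind that an extreme value also inflates the standard deviation itself, so in very small samples no point can reach the threshold. That is one more reason to work with enough data points.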

Dynamic Reference Lines


reference_lines:
  performance_benchmarks:
    x_line: "Fractile(TOTAL [XValue], 0.75)"  # Top 25%
    y_line: "Fractile(TOTAL [YValue], 0.75)"  # Top 25%

  quadrant_labels:
    q1: "High X, High Y"
    q2: "Low X, High Y"
    q3: "Low X, Low Y"
    q4: "High X, Low Y"
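The quadrant logic behind these reference lines can be sketched as follows. This is a minimal Python illustration. The helper names are hypothetical, and the percentile uses linear interpolation between ranks, which may differ slightly from the rule your tool applies.

```python
def percentile(values, fraction):
    """Percentile with linear interpolation between closest ranks.
    Other tools may use a slightly different interpolation rule."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * fraction
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    return ordered[lower] * (1 - weight) + ordered[upper] * weight


def quadrant(x, y, x_line, y_line):
    """Map a point to the quadrant labels used above."""
    if x >= x_line:
        return "q1: High X, High Y" if y >= y_line else "q4: High X, Low Y"
    return "q2: Low X, High Y" if y >= y_line else "q3: Low X, Low Y"


# Hypothetical X and Y values for five data points.
xs = [10, 20, 30, 40, 50]
ys = [1, 2, 3, 4, 5]
x_line = percentile(xs, 0.75)
y_line = percentile(ys, 0.75)

for x, y in [(50, 5), (10, 5), (10, 1), (50, 1)]:
    print((x, y), quadrant(x, y, x_line, y_line))
```

With 75th-percentile lines, the top-right quadrant holds only points in the top quarter on both measures, so expect it to be sparsely populated.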

How to Read Scatter Plots: Correlation Interpretation

When your scatter plot reveals time-based patterns in your data, switch to a line chart to show the trend over time. It tells the same story more clearly when sequence matters more than correlation.

Recognizing Correlation Strength

Strong positive correlation (r > 0.7):

  • Points form a clear ascending line
  • Example: Marketing budget → Leads (well-planned campaigns)

Weak to moderate correlation (0.3 < r < 0.7):

  • Points show a trend, but with scatter
  • Example: Experience → Performance (other factors matter)

No correlation (r ≈ 0):

  • Points randomly distributed
  • Example: Employee age → Sales performance

Negative correlation (r < -0.3):

  • Points form a descending line
  • Example: Product price → Sales volume

Correlation vs. Causation


correlation_detected: "Budget correlates with Leads"
possible_causes:
  - "Higher budget leads to more leads"
  - "Successful teams get more budget"
  - "Seasonal effects influence both"
  - "Third variable (team quality) influences both"

caution: "Correlation ≠ Causation"

Performance Optimization for Large Datasets

Efficient chart expressions matter especially here: poorly written aggregations on the scatter plot axes can multiply render times on larger datasets.

Why Is My Scatter Plot Slow in Qlik Sense?

Performance indicators:

  • More than 2,500 bubbles
  • Load time > 5 seconds
  • Browser lag during interaction

Optimization Strategies for Qlik Sense Scatter Plots

When your expressions use set identifiers like {1} to compute market totals, check the set analysis for chart expressions guide to ensure your aggregation formulas evaluate correctly across selections.

Strategy 1: Smart Sampling


// Only representative data points (every 5th record; assumes [RecordID] is a sequential number):
{<[RecordID] = {"=Mod([RecordID], 5) = 0"}>}

Strategy 2: Increase Aggregation


-- Instead of individual transactions, use monthly aggregates:
Dimension: Month([Date])
X-axis: Sum([Revenue])
Y-axis: Sum([Costs])
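The effect of this strategy is easy to see in a small example. The Python sketch below uses invented transactions and collapses them into one scatter plot point per month. It keys on year and month together, an assumption made here so that the same month from different years stays separate.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactions: (date, revenue, costs).
transactions = [
    (date(2025, 1, 5), 1200.0, 800.0),
    (date(2025, 1, 20), 900.0, 650.0),
    (date(2025, 2, 3), 1500.0, 700.0),
    (date(2025, 2, 17), 400.0, 300.0),
    (date(2025, 3, 9), 2000.0, 1100.0),
]

# Sum revenue and costs per (year, month).
monthly = defaultdict(lambda: [0.0, 0.0])
for day, revenue, costs in transactions:
    key = (day.year, day.month)
    monthly[key][0] += revenue
    monthly[key][1] += costs

# One (month, x, y) point per month instead of one per transaction.
points = [(key, rev, cost) for key, (rev, cost) in sorted(monthly.items())]
for point in points:
    print(point)
```

Five transactions become three points here. On real data the reduction is usually far larger, which is what brings render times down.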

Strategy 3: Calculation Condition


calculation_condition:
  condition: "Count(DISTINCT [Dimension]) <= 1000"
  message: "Too many data points. Please apply a filter."

Design Best Practices for Effective Scatter Plots

Axis Optimization

X-axis guidelines:


zero_point: "Only when value range is meaningful"
scaling: "Linear (default) or Log (for exponential data)"
grid_lines: "Use sparingly"
title: "Clear with units"

Y-axis guidelines:


zero_point: "Usually start at 0"
auto_scaling: true
negative_values: "Mark clearly"

Bubble Size Guidelines


minimum_size: 3  # Ensure readability
maximum_size: 15  # Avoid overlap
scaling: "Proportional to bubble value"
null_values: "Display as minimum size"
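As an illustration of these guidelines, here is a minimal Python sketch that maps measure values to bubble sizes between 3 and 15. The linear min-max mapping and the function name `bubble_sizes` are assumptions. A charting tool may scale by area or use a different rule.

```python
def bubble_sizes(values, min_size=3, max_size=15):
    """Scale values linearly into [min_size, max_size].
    None (null) values get the minimum size, as suggested above."""
    known = [v for v in values if v is not None]
    if not known:
        return [min_size] * len(values)
    low, high = min(known), max(known)
    span = high - low
    sizes = []
    for v in values:
        if v is None or span == 0:
            # Nulls, or a measure with no variation, fall back to the minimum.
            sizes.append(min_size)
        else:
            sizes.append(min_size + (v - low) / span * (max_size - min_size))
    return sizes


# Hypothetical lead quality scores, including one missing value.
print(bubble_sizes([0, 5, 10, None]))
```

If every value is identical, all bubbles collapse to the minimum size, which is a useful signal that the size measure carries no information.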

Color Guidelines for Scatter Plots


categorical_data:
  palette: "Qualitative (distinct colors)"
  maximum: 8  # More becomes unreadable

continuous_data:
  palette: "Sequential (e.g., Blues)"
  legend: "With min/max values"

performance_data:
  palette: "Diverging (Red-Yellow-Green)"
  center: "Benchmark/Target"

Common Mistakes and Pro Solutions

Mistake 1: Too Few Data Points

Problem: 8 bubbles in a scatter plot

Solution: Minimum 20-30 points, otherwise use a table

Mistake 2: Unsuitable Variables

Problem: Categorical data on continuous axes

Solution:


correct:
  x: "Budget (numeric)"
  y: "Leads (numeric)"

incorrect:
  x: "Product category (categorical)"
  y: "Leads (numeric)"  # → Use a Bar Chart instead

Mistake 3: Meaningless Bubble Size

Problem: Bubble size is random/constant

Solution: Size must represent a meaningful third dimension

Mistake 4: Overinterpreting Correlations

Problem: r=0.3 interpreted as «strong relationship»

Solution:


interpretation_guide:
  r_0_3: "Weak relationship"
  r_0_5: "Moderate relationship"
  r_0_7: "Strong relationship"
  r_0_9: "Very strong relationship"
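One way to apply this guide consistently is to turn it into a small lookup. The Python sketch below treats the four values as lower bounds on |r| and labels anything below 0.3 as "negligible". Both choices are assumptions, since the guide itself only names four reference points.

```python
def describe_correlation(r):
    """Return (strength, direction) for a correlation coefficient r,
    using the guide's values 0.3 / 0.5 / 0.7 / 0.9 as lower bounds."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be between -1 and 1")
    strength = abs(r)
    if strength >= 0.9:
        label = "very strong"
    elif strength >= 0.7:
        label = "strong"
    elif strength >= 0.5:
        label = "moderate"
    elif strength >= 0.3:
        label = "weak"
    else:
        label = "negligible"  # assumption: the guide does not name this band
    direction = "positive" if r > 0 else "negative" if r < 0 else "none"
    return label, direction


for r in (0.3, 0.5, -0.75, 0.95):
    print(r, describe_correlation(r))
```

The sign only gives the direction. A coefficient of -0.75 is just as strong a relationship as +0.75.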

Quality Checklist for Scatter Plot Go-Live

Before release, check these 12 points:

  • At least 20-30 data points
  • X/Y axes are numeric and meaningfully scaled
  • Bubble size has business meaning
  • Colors are meaningful and colorblind-friendly
  • Axis titles include units
  • Correlation coefficient displayed (if relevant)
  • Outliers are identified and explained
  • Reference lines for benchmarks (if useful)
  • Legend is complete
  • Chart loads in <5 seconds
  • Title explains the discovered correlation
  • Causation vs. correlation clarified

QSBA Exam Preparation: Scatter Plots

Common Exam Questions About Qlik Sense Scatter Plots

Question 1: When is a Scatter Plot more suitable than a Line Chart?

Answer: When analyzing correlations between two numeric variables, not for time series trends.

Question 2: How do you interpret the correlation coefficient?

Answer: |r| > 0.7 = strong, 0.3-0.7 = moderate, < 0.3 = weak. A negative r indicates an inverse relationship.

Question 3: What is the difference between correlation and causation?

Answer: Correlation shows a statistical relationship; causation means one variable actually produces the change in the other, which correlation alone cannot prove.


Troubleshooting Qlik Sense Scatter Plots

Why Do All Bubbles Have the Same Size?

Solution: Check the bubble size expression:


// Wrong: Constant values
Avg([ConstantValue])

// Correct: Variable values
Sum([Revenue]) / Count([Customers])

How to Identify Correlations in Scatter Plots

Solution:

  1. Remove outliers and re-examine
  2. Consider time lag (lagged correlation)
  3. Check for non-linear relationships (polynomial regression)

How to Fix Overlapping Bubbles

Solution:

  1. Add jitter for minimal random displacement
  2. Enable transparency (alpha blending)
  3. Aggregate data at a higher level
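Jitter, the first option in the list above, can be prepared in the data before it reaches the chart. The following is a minimal Python sketch under that assumption. The function name `add_jitter` and the displacement amount are illustrative, and a fixed seed keeps the result repeatable between reloads.

```python
import random


def add_jitter(points, amount=0.5, seed=42):
    """Shift every (x, y) point by a small random offset in [-amount, amount].
    A fixed seed makes the displacement reproducible."""
    rng = random.Random(seed)
    return [(x + rng.uniform(-amount, amount), y + rng.uniform(-amount, amount))
            for x, y in points]


# Three data points that would be drawn exactly on top of each other.
points = [(1, 1), (1, 1), (1, 1)]
jittered = add_jitter(points, amount=0.1)
for point in jittered:
    print(point)
```

Keep the offset small relative to the axis range. Jitter changes the plotted position, so it should be used for display only and never feed into the correlation calculation.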

For an overview of all findings, pair your scatter plot with KPI objects for summary metrics — a single correlation coefficient displayed prominently tells executives more than a full chart grid.

Pro tip: The best Scatter Plot answers the question «Is there a relationship?» with statistical clarity. If you can’t explain the R², the chart isn’t ready yet.