Last updated: September 2025 | Reading time: 10 minutes | Level: Advanced
The problem: You suspect that marketing spend and sales performance are connected, but how strong is the relationship? And what other hidden correlations is everyone else missing? Bar and line charts typically show one measure at a time, so you need a visualization built for relationships between measures.
The solution: Scatter Plots reveal correlations that remain invisible in other charts. In 10 minutes you’ll learn how to find hidden relationships and make data-driven decisions before the competition even knows what to look for.
What Will You Learn in This Qlik Sense Scatter Plot Tutorial?
- ✅ Correlation detection: Distinguish strong, weak, and false relationships
- ✅ Multi-variable analysis: Analyze 3-4 dimensions simultaneously
- ✅ Business insights: 5 practical use cases with immediate findings
- ✅ Statistical significance: Separate real trends from random patterns
- ✅ Advanced features: Clustering, forecasting, and outlier detection
How Scatter Plots Revolutionize Your Data Analysis
Understanding the Hidden Correlation Problem
Classic situation: You’re the Marketing Director and you see:
- Campaign A: $50K budget, 200 leads
- Campaign B: $80K budget, 180 leads
- Campaign C: $30K budget, 250 leads
Without Scatter Plot: You see only individual values and guess
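With Scatter Plot: You see at a glance that more budget does NOT mean more leads: the cheapest campaign ($30K) generated the most. Three points are a hint rather than proof, but they suggest looking for the sweet spot at around $30-40K.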
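To make the three-campaign example concrete, here is a small Python sketch (outside Qlik, purely illustrative) that computes cost per lead and the Pearson correlation for the figures above. The helper name `pearson_r` is my own, and with only three points the coefficient should be read as a hint, not as evidence.

```python
import math

# Campaign figures from the example above: (budget in $, leads)
campaigns = {
    "Campaign A": (50_000, 200),
    "Campaign B": (80_000, 180),
    "Campaign C": (30_000, 250),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Cost per lead makes the efficiency gap between campaigns explicit.
cost_per_lead = {
    name: budget / lead_count for name, (budget, lead_count) in campaigns.items()
}

budgets = [budget for budget, _ in campaigns.values()]
leads = [lead_count for _, lead_count in campaigns.values()]
r = pearson_r(budgets, leads)

for name, cpl in cost_per_lead.items():
    print(f"{name}: ${cpl:,.2f} per lead")
print(f"Pearson r (budget vs. leads): {r:.2f}")
```

In this tiny sample the coefficient comes out negative: the campaigns with the larger budgets produced fewer leads.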
The Multi-Dimensional Secret of Qlik Sense Scatter Plots
Scatter Plots can encode four variables at once:
- X-axis: Marketing Budget
- Y-axis: Generated Leads
- Bubble size: Lead Quality Score
- Color: Campaign Type
When Are Scatter Plots the Right Choice in Qlik Sense?
perfect_for:
- Marketing ROI vs. Lead Quality
- Employee performance vs. experience
- Product price vs. sales volume vs. customer satisfaction
- Website traffic vs. conversion vs. bounce rate
- Production output vs. quality vs. cost
not_suitable_for:
- Fewer than 20 data points (→ Table)
- Time series trends (→ Line Chart)
- Category comparisons (→ Bar Chart)
- Hierarchical data (→ Treemap)
How to Set Up Your First Scatter Plot in 10 Minutes
Before configuring, skim the Qlik official scatter plot documentation to confirm which features are available in your Qlik Cloud version.
Identifying the Data Relationship
Minimum requirements:
- 2 numeric variables (X and Y axis)
- At least 20-30 data points
- Optional: 1-2 additional dimensions (size, color)
Example data structure:
Campaign | Budget | Leads | Quality_Score | Channel | Region
Campaign_A | 45000 | 220 | 8.2 | Social | North
Campaign_B | 32000 | 180 | 9.1 | Email | South
Campaign_C | 67000 | 290 | 7.8 | PPC | West
How to Configure a Scatter Plot in Qlik Sense
- Add new object → «Scatter plot»
- Dimension (bubble identification):
Field: [Campaign]
Label: "Campaign"
- X-axis (first variable):
Expression: Sum([Budget])
Label: "Marketing Budget $"
Axis limit: Auto
- Y-axis (second variable):
Expression: Sum([Leads])
Label: "Generated Leads"
Axis limit: 0 to Auto
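The choice of aggregation on the Y-axis matters when [Leads] already holds a number per row, as in the example data structure. The following Python sketch mimics, under that assumption, what grouping by the dimension and then summing or counting produces. The `aggregate` helper is hypothetical and only illustrates the difference; it is not how the Qlik engine is implemented.

```python
# Sample rows from the example data structure: one row per campaign.
rows = [
    {"Campaign": "Campaign_A", "Budget": 45000, "Leads": 220},
    {"Campaign": "Campaign_B", "Budget": 32000, "Leads": 180},
    {"Campaign": "Campaign_C", "Budget": 67000, "Leads": 290},
]


def aggregate(rows, dimension, field, how):
    """Group rows by `dimension` and aggregate `field` with 'sum' or 'count'."""
    result = {}
    for row in rows:
        key = row[dimension]
        if how == "sum":
            result[key] = result.get(key, 0) + row[field]
        elif how == "count":
            result[key] = result.get(key, 0) + 1
        else:
            raise ValueError(f"unsupported aggregation: {how}")
    return result


summed = aggregate(rows, "Campaign", "Leads", "sum")
counted = aggregate(rows, "Campaign", "Leads", "count")

print("Sum of leads per campaign:  ", summed)
print("Count of rows per campaign: ", counted)
```

Summing returns the lead totals (220, 180, 290), while counting returns 1 for every campaign because each campaign has a single row. A count only makes sense when the data holds one row per individual lead.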
How to Add a Third Dimension (Bubble Size)
Bubble size (optional third dimension):
expression: "Avg([Quality_Score])"
label: "Lead Quality"
size_range: "3 to 15"
How to Add a Fourth Dimension (Color)
Color (optional fourth dimension):
expression: "[Channel]"
color_palette: "12-color distinct"
legend: "Right"
Business Scenarios from Marketing to Operations
Scenario 1: Marketing ROI Optimization
Problem: «Which campaigns deliver the best ROI with the highest lead quality?»
Setup:
dimension: "[CampaignID]"
x_axis:
expression: "Sum([MarketingSpend])"
title: "Marketing Budget $"
y_axis:
expression: "Sum([Revenue])"
title: "Generated Revenue $"
bubble_size:
expression: "Avg([LeadScore])"
title: "Avg Lead Quality"
color:
expression: "[Channel]"
values: ["Social", "Email", "PPC", "Content"]
trend_line: true
Business insight: You instantly see campaigns with high ROI AND high lead quality. The sweet spot between budget and results becomes visible.
Scenario 2: Sales Performance vs. Experience
Problem: «Does sales performance correlate with years of experience?»
Setup:
dimension: "[SalesPersonID]"
x_axis:
expression: "[YearsExperience]"
title: "Years of Experience"
y_axis:
expression: "Sum([SalesRevenue])"
title: "Annual Revenue $"
bubble_size:
expression: "Avg([CustomerRetention])"
title: "Customer Retention Rate"
color:
expression: "[SalesTeam]"
conditional_coloring: true
Advanced feature: Display the correlation coefficient:
// Add as an expression in a text object:
Correl(Aggr(Sum([SalesRevenue]), [SalesPersonID]),
Aggr([YearsExperience], [SalesPersonID]))
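The expression above works in two steps: Aggr first builds one value pair per salesperson, then Correl computes the coefficient across those pairs. A minimal Python sketch of the same two steps, using made-up sales records (all names and numbers here are hypothetical):

```python
import math
from collections import defaultdict

# Hypothetical records: (salesperson_id, years_experience, sale_amount)
sales = [
    ("S1", 1, 40_000), ("S1", 1, 35_000),
    ("S2", 3, 60_000), ("S2", 3, 50_000),
    ("S3", 5, 90_000), ("S3", 5, 70_000),
    ("S4", 8, 95_000), ("S4", 8, 85_000),
]

# Step 1 (the Aggr part): one revenue total and one experience value per person.
revenue = defaultdict(float)
experience = {}
for person, years, amount in sales:
    revenue[person] += amount
    experience[person] = years

# Step 2 (the Correl part): Pearson r across the per-person pairs.
people = sorted(revenue)
xs = [experience[p] for p in people]
ys = [revenue[p] for p in people]

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)
cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
var_x = sum((x - mean_x) ** 2 for x in xs)
var_y = sum((y - mean_y) ** 2 for y in ys)
r = cov / math.sqrt(var_x * var_y)

print(f"Per-person pairs: {list(zip(xs, ys))}")
print(f"Pearson r: {r:.2f}")
```

Correlating the raw transaction rows instead of the per-person totals would weight people by their number of sales, which is why the aggregation step comes first.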
Scenario 3: Finding the Product Price Sweet Spot
Problem: «Is there an optimal price for maximum sales volume?»
Setup:
dimension: "[ProductID]"
x_axis:
expression: "Avg([Price])"
title: "Average Price $"
y_axis:
expression: "Sum([UnitsSold])"
title: "Units Sold"
bubble_size:
expression: "Sum([Price] * [UnitsSold])"
title: "Total Revenue"
color:
expression: "[ProductCategory]"
reference_lines:
x_line:
value: "Avg(TOTAL [Price])"
label: "Market Average Price"
y_line:
value: "Avg(Aggr(Sum([UnitsSold]), [ProductID]))"
label: "Average Sales"
Scenario 4: Website Performance Analysis
Problem: «How are traffic, bounce rate, and conversions connected?»
Setup:
dimension: "[PageURL]"
x_axis:
expression: "Sum([PageViews])"
title: "Page Views"
y_axis:
expression: "Sum([Conversions]) / Sum([Sessions])"
title: "Conversion Rate %"
bubble_size:
expression: "Avg([TimeOnPage])"
title: "Time on Page"
color:
expression: "If(Avg([BounceRate]) > 0.7, 'High', If(Avg([BounceRate]) > 0.4, 'Medium', 'Low'))"
title: "Bounce Rate Category"
Scenario 5: Production Efficiency vs. Quality
Problem: «Does higher production speed lead to worse quality?»
Setup:
dimension: "[ProductionLineID]"
x_axis:
expression: "Avg([UnitsPerHour])"
title: "Units per Hour"
y_axis:
expression: "Avg([QualityScore])"
title: "Quality Score (1-10)"
bubble_size:
expression: "Sum([ProductionCosts])"
title: "Production Costs"
color:
expression: "[Shift]"
values: ["Morning", "Afternoon", "Night"]
cluster_analysis: true
Advanced Features for Pro-Level Scatter Plots
The scatter plot properties reference documents every setting available in the properties panel, including clustering and regression options covered below.
Automatic Clustering
clustering:
enabled: true
number_of_clusters: 4
method: "K-Means"
color_by_cluster: true
show_centroids: true
Business value: Automatic grouping of similar data points (e.g., High/Medium/Low performers).
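For readers who want to see what K-Means does to the bubbles, here is a plain Python sketch of the algorithm on (x, y) pairs. It is not Qlik's implementation. The seeding rule (the first k points) and the sample points are my own assumptions, and in practice both axes should be brought to comparable scales before clustering.

```python
import math


def kmeans_2d(points, k, iterations=20):
    """Plain k-means on (x, y) tuples; the first k points seed the centroids."""
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append((
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                ))
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


# Hypothetical bubbles: a low-performing and a high-performing group.
points = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]
labels, centroids = kmeans_2d(points, k=2)
print("Cluster per point:", labels)
print("Centroids:", centroids)
```

The labels are what a color-by-cluster setting would use, and the centroids correspond to the cluster centers shown on the chart.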
Regression Lines and R² in Qlik Sense Scatter Plots
trend_analysis:
linear_regression:
enabled: true
show_r_squared: true
confidence_interval: 95%
polynomial_regression:
degree: 2
for_curved_data: true
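To explain an R² value rather than just display it, it helps to know how it is derived. The sketch below fits an ordinary least-squares line and computes R² as the share of variance in Y that the line explains. The function name and the sample values are illustrative assumptions.

```python
def linear_fit(xs, ys):
    """Least-squares line y = intercept + slope * x, plus R squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares: what the line fails to explain.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Total sum of squares: variation of y around its mean.
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical data points.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
slope, intercept, r_squared = linear_fit(xs, ys)
print(f"y = {intercept:.2f} + {slope:.2f} * x, R² = {r_squared:.2f}")
```

For this sample the line explains 60% of the variation in Y; the remaining 40% is scatter around the line.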
To go beyond visual patterns, use tables to drill into the individual data points behind your scatter plot clusters.
Outlier Detection in Qlik Sense Scatter Plots
outlier_detection:
method: "Standard Deviation"
threshold: 2.5
highlight: true
color: "#FF0000"
automatic_labels: true
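The standard-deviation method flags every value that lies more than a given number of standard deviations from the mean. A minimal Python sketch with the 2.5 threshold from the settings above follows. The function name and the sample values are assumptions, and it uses the sample standard deviation.

```python
import statistics


def find_outliers(values, threshold=2.5):
    """Return the indexes of values more than `threshold` SDs from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    if sd == 0:
        return []  # all values identical: nothing can be an outlier
    return [i for i, v in enumerate(values) if abs(v - mean) / sd > threshold]


# Hypothetical measure values: ten typical points and one extreme one.
values = [100, 102, 98, 101, 99, 103, 97, 100, 102, 98, 500]
outliers = find_outliers(values)
print("Outlier indexes:", outliers)
```

One caveat: an extreme value inflates the standard deviation it is measured against, so in very small datasets no point can reach a high threshold at all. This is another reason to work with at least 20-30 data points.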
Dynamic Reference Lines
reference_lines:
performance_benchmarks:
x_line: "Fractile(TOTAL [XValue], 0.75)" # Top 25%
y_line: "Fractile(TOTAL [YValue], 0.75)" # Top 25%
quadrant_labels:
q1: "High X, High Y"
q2: "Low X, High Y"
q3: "Low X, Low Y"
q4: "High X, Low Y"
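The quadrant logic can be sketched in a few lines of Python: compute the 75th-percentile thresholds, then label each bubble by the side of each line it falls on. The linear interpolation used here is one common percentile definition and is an assumption; Qlik's own calculation may interpolate differently. Treating points exactly on a line as "High" is also my choice.

```python
def percentile_inc(values, fraction):
    """Percentile with linear interpolation between the two closest ranks."""
    ordered = sorted(values)
    position = fraction * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    return ordered[lower] + weight * (ordered[upper] - ordered[lower])


def quadrant(x, y, x_line, y_line):
    """Label a point relative to the two reference lines."""
    if x >= x_line:
        return "High X, High Y" if y >= y_line else "High X, Low Y"
    return "Low X, High Y" if y >= y_line else "Low X, Low Y"


# Hypothetical bubbles as (x, y) pairs.
bubbles = [(10, 5), (20, 40), (30, 10), (40, 30), (50, 50)]
x_line = percentile_inc([x for x, _ in bubbles], 0.75)
y_line = percentile_inc([y for _, y in bubbles], 0.75)

for x, y in bubbles:
    print((x, y), "->", quadrant(x, y, x_line, y_line))
```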
How to Read Scatter Plots: Correlation Interpretation
When your scatter plot reveals time-based patterns in your data, switch to a line chart for trend visualization over periods — it shows the same story more clearly when sequence matters more than correlation.
Recognizing Correlation Strength
Strong positive correlation (r > 0.7):
- Points form a clear ascending line
- Example: Marketing budget → Leads (well-planned campaigns)
Moderate correlation (0.3 < r < 0.7):
- Points show a trend, but with scatter
- Example: Experience → Performance (other factors matter)
No correlation (r ≈ 0):
- Points randomly distributed
- Example: Employee age → Sales performance
Negative correlation (r < -0.3):
- Points form a descending line
- Example: Product price → Sales volume
Correlation vs. Causation
correlation_detected: "Budget correlates with Leads"
possible_causes:
- "Higher budget leads to more leads"
- "Successful teams get more budget"
- "Seasonal effects influence both"
- "Third variable (team quality) influences both"
caution: "Correlation ≠ Causation"
Performance Optimization for Large Datasets
Clean expression optimization for chart calculations matters especially here — poorly written aggregations in scatter plot axes can multiply render times on larger datasets.
Why Is My Scatter Plot Slow in Qlik Sense?
Performance indicators:
- More than 2,500 bubbles
- Load time > 5 seconds
- Browser lag during interaction
Optimization Strategies for Qlik Sense Scatter Plots
When your expressions use set identifiers like {1} to compute market totals, check the set analysis for chart expressions guide to ensure your aggregation formulas evaluate correctly across selections.
Strategy 1: Smart Sampling
// Systematic sample: keep every fifth record (assumes [RecordID] is a sequential number):
{<[RecordID] = {"=Mod([RecordID], 5) = 0"}>}
Strategy 2: Increase Aggregation
-- Instead of individual transactions, use monthly aggregates:
Dimension: MonthName([Date])
X-axis: Sum([Revenue])
Y-axis: Sum([Costs])
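The effect of aggregating to months can be illustrated in Python: many transaction rows collapse into one bubble per month. The records below are invented. The grouping key combines year and month so that, for example, January 2025 and January 2026 stay separate bubbles.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactions: (date, revenue, costs)
transactions = [
    (date(2025, 1, 5), 1200.0, 700.0),
    (date(2025, 1, 20), 800.0, 600.0),
    (date(2025, 2, 3), 1500.0, 900.0),
    (date(2026, 1, 9), 1100.0, 650.0),
]

# One [revenue, costs] pair per (year, month) becomes one bubble.
monthly = defaultdict(lambda: [0.0, 0.0])
for day, revenue, costs in transactions:
    key = (day.year, day.month)
    monthly[key][0] += revenue
    monthly[key][1] += costs

for (year, month), (revenue, costs) in sorted(monthly.items()):
    print(f"{year}-{month:02d}: revenue={revenue:.0f}, costs={costs:.0f}")
print(f"{len(transactions)} rows reduced to {len(monthly)} bubbles")
```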
Strategy 3: Calculation Condition
calculation_condition:
condition: "Count(DISTINCT [Dimension]) <= 1000"
message: "Too many data points. Please apply a filter."
Design Best Practices for Effective Scatter Plots
Axis Optimization
X-axis guidelines:
zero_point: "Only when value range is meaningful"
scaling: "Linear (default) or Log (for exponential data)"
grid_lines: "Use sparingly"
title: "Clear with units"
Y-axis guidelines:
zero_point: "Usually start at 0"
auto_scaling: true
negative_values: "Mark clearly"
Bubble Size Guidelines
minimum_size: 3 # Ensure readability
maximum_size: 15 # Avoid overlap
scaling: "Proportional to bubble value"
null_values: "Display as minimum size"
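These guidelines amount to a simple mapping from the measure value to a size between the minimum and the maximum. The following Python sketch shows one possible linear version. The function is hypothetical, and a charting tool may scale by bubble area rather than by diameter.

```python
def bubble_size(value, min_value, max_value, min_size=3.0, max_size=15.0):
    """Map a measure value linearly onto the range min_size..max_size."""
    if value is None:
        return min_size  # null values are drawn at minimum size
    if max_value == min_value:
        return (min_size + max_size) / 2  # no spread: use a neutral size
    share = (value - min_value) / (max_value - min_value)
    share = max(0.0, min(1.0, share))  # clamp values outside the range
    return min_size + share * (max_size - min_size)


# Quality scores from the example data structure (7.8 to 9.1).
scores = {"Campaign_A": 8.2, "Campaign_B": 9.1, "Campaign_C": 7.8}
low, high = min(scores.values()), max(scores.values())
for name, score in scores.items():
    print(f"{name}: size {bubble_size(score, low, high):.1f}")
```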
Color Guidelines for Scatter Plots
categorical_data:
palette: "Qualitative (distinct colors)"
maximum: 8 # More becomes unreadable
continuous_data:
palette: "Sequential (e.g., Blues)"
legend: "With min/max values"
performance_data:
palette: "Diverging (Red-Yellow-Green)"
center: "Benchmark/Target"
Common Mistakes and Pro Solutions
Mistake 1: Too Few Data Points
Problem: 8 bubbles in a scatter plot
Solution: Minimum 20-30 points, otherwise use a table
Mistake 2: Unsuitable Variables
Problem: Categorical data on continuous axes
Solution:
correct:
x: "Budget (numeric)"
y: "Leads (numeric)"
incorrect:
x: "Product category (categorical)"
y: "Leads (numeric)" # → Use a Bar Chart instead
Mistake 3: Meaningless Bubble Size
Problem: Bubble size is random/constant
Solution: Size must represent a meaningful third dimension
Mistake 4: Overinterpreting Correlations
Problem: r=0.3 interpreted as «strong relationship»
Solution:
interpretation_guide:
r_0_3: "Weak relationship"
r_0_5: "Moderate relationship"
r_0_7: "Strong relationship"
r_0_9: "Very strong relationship"
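One way to keep interpretations consistent across a team is to encode the guide as a small function. The cut-offs below treat each value in the guide as the lower bound of its label, which is my reading rather than a statistical standard. The labels below 0.3 and the handling of the sign are also assumptions.

```python
def describe_correlation(r):
    """Translate a correlation coefficient into a plain-language label."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    strength = abs(r)
    if strength >= 0.9:
        label = "very strong"
    elif strength >= 0.7:
        label = "strong"
    elif strength >= 0.5:
        label = "moderate"
    elif strength >= 0.3:
        label = "weak"
    else:
        return "negligible"
    direction = "positive" if r > 0 else "negative"
    return f"{label} {direction}"


for r in (0.3, 0.55, -0.75, 0.95, 0.1):
    print(f"r = {r:+.2f}: {describe_correlation(r)}")
```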
Quality Checklist for Scatter Plot Go-Live
Before release, check these 12 points:
- ⬜ At least 20-30 data points
- ⬜ X/Y axes are numeric and meaningfully scaled
- ⬜ Bubble size has business meaning
- ⬜ Colors are meaningful and colorblind-friendly
- ⬜ Axis titles include units
- ⬜ Correlation coefficient displayed (if relevant)
- ⬜ Outliers are identified and explained
- ⬜ Reference lines for benchmarks (if useful)
- ⬜ Legend is complete
- ⬜ Chart loads in <5 seconds
- ⬜ Title explains the discovered correlation
- ⬜ Causation vs. correlation clarified
QSBA Exam Preparation: Scatter Plots
Common Exam Questions About Qlik Sense Scatter Plots
Question 1: When is a Scatter Plot more suitable than a Line Chart?
Answer: When analyzing correlations between two numeric variables, not for time series trends.
Question 2: How do you interpret the correlation coefficient?
Answer: By absolute value: |r| > 0.7 = strong, 0.3-0.7 = moderate, < 0.3 = weak. A negative sign means an inverse relationship.
Question 3: What is the difference between correlation and causation?
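Answer: Correlation shows that two variables move together statistically; causation means that a change in one variable actually produces the change in the other. A correlation alone does not prove causation.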
Troubleshooting Qlik Sense Scatter Plots
Why Do All Bubbles Have the Same Size?
Solution: Check the bubble size expression:
// Wrong: Constant values
Avg([ConstantValue])
// Correct: Variable values
Sum([Revenue]) / Count(DISTINCT [Customers])
How to Find Correlations When None Are Visible
Solution:
- Remove outliers and re-examine
- Consider time lag (lagged correlation)
- Check for non-linear relationships (polynomial regression)
How to Fix Overlapping Bubbles
Solution:
- Add jitter for minimal random displacement
- Enable transparency (alpha blending)
- Aggregate data at a higher level
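Jitter means shifting each point by a small random offset so that bubbles with identical coordinates no longer sit exactly on top of each other. A minimal Python sketch follows, assuming the offsets are computed on the data before it is plotted. The function name, the seed, and the sample points are illustrative.

```python
import random


def jitter(points, amount, seed=42):
    """Shift every (x, y) point by a random offset of at most `amount`."""
    rng = random.Random(seed)  # fixed seed keeps the layout reproducible
    return [
        (x + rng.uniform(-amount, amount), y + rng.uniform(-amount, amount))
        for x, y in points
    ]


# Hypothetical bubbles: three of them share the same coordinates.
points = [(100, 20), (100, 20), (100, 20), (150, 35)]
jittered = jitter(points, amount=0.5)
for original, moved in zip(points, jittered):
    print(original, "->", tuple(round(v, 2) for v in moved))
```

Keep the offset small relative to the axis range and mention it in the chart subtitle, since jittered positions are no longer the exact values.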
For an overview of all findings, pair your scatter plot with KPI objects for summary metrics — a single correlation coefficient displayed prominently tells executives more than a full chart grid.
Pro tip: The best Scatter Plot answers the question «Is there a relationship?» with statistical clarity. If you can’t explain the R², the chart isn’t ready yet.