Unstoppable Fabric: Capacity Planning, Smart Alerting & Workload Tuning
Description
Learn how to make Microsoft Fabric unstoppable by combining capacity planning, smart alerting, and workload tuning. We’ll use the Fabric Capacity Metrics app and API to read CU usage and throttling, plan and schedule ETL/Spark/refresh workloads, configure alerts (incl. Data Activator), and apply Spark/Warehouse best practices so Fabric stays fast, predictable, and cost-efficient.
Key Takeaways
- Apply Spark/Warehouse best practices
- Let's use Fabric!
- Key Questions We’ll Answer
- Which tools to use in Fabric, when, and how, to measure and manage a workflow?
- What We Need to Remember?
- Current Utilization = Real Current Utilization + Accumulated Debt
- Root cause analysis
Slides
Unstoppable Fabric: Capacity Planning, Smart Alerting & Workload Tuning
Iurii (Yurri) Iurchenko
Sr Data Engineer, Azure Advocate
Intro
IN THIS SECTION
▸ Goals & Questions
▸ Not ideal scenario
▸ Better scenario
What We’ll Accomplish Today
Goal: Microsoft Fabric = Unstoppable, by:
● Using capacity planning and other tools
● Monitoring and configuring alerting
● Improving data processing
We’ll use:
● Fabric Capacity Metrics app (primary)
● Fabric Activator
● Spark/Warehouse best practices
● API to read CU usage and throttling
So Fabric stays fast, predictable, and cost-efficient.
Business Case: Real-World Capacity Challenge
Business goes Cloud
Let's buy F2 to save money
Let's use Fabric!
Let's Try Trial
What to do?
Let's buy F64 to flourish
Key Questions We’ll Answer
By the end of this session, you'll be able to answer:
How to understand the current workload in Fabric?
How to understand the future state of the workload?
Which tools to use in Fabric, when, and how, to measure and manage a workflow?
What tech strategies to implement to maximize the ROI from Fabric?
When Things Go Wrong
Started adding new data
Started building the Core
Bought Fabric w/ Trial
Struggling with performance issues. Goal: predictability
Sold it to Users and Directors
A Better Way
Capacity Metrics App; Monitoring & Analysis
SKU Estimator
Alerting
Code best practices
Started adding new data
Started building the Core
Bought Fabric w/ Trial
SKU Estimator
Started Scaling
Sold it to Users and Directors
Planning for Success
IN THIS SECTION
▸ SKU Estimator
▸ Fabric Pricing
Choosing the Right SKU: SKU Estimator
- Helps to clarify the right things in advance
- Pretty accurate
Link: https://www.microsoft.com/en-us/microsoft-fabric/capacity-estimator
Understanding Fabric Pricing
Link: https://azure.microsoft.com/en-us/pricing/details/microsoft-fabric/
Capacity Model
IN THIS SECTION
▸ Capacity
▸ Capacity states
Core Concept: What ‘Capacity’ Really Means
Goal: Max Utilization & Availability
Figure labels: workload, Capacity
1 Bursting
2 Smoothing
3 Throttling
Capacity States Explained
0 Suspended
1 Healthy
2 At Risk of Throttling
3 Interactive Rejection
4 Background Rejection
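The state list above can be sketched as a small lookup. This is a minimal illustration, not how Fabric computes states internally: the `classify` helper is hypothetical, and the 10-minute, 60-minute, and 24-hour overage thresholds (and their mapping to the slide's state names) are assumptions used only to make the example concrete.

```python
from enum import IntEnum


class CapacityState(IntEnum):
    """State numbers and names as listed on the slide."""
    SUSPENDED = 0
    HEALTHY = 1
    AT_RISK_OF_THROTTLING = 2
    INTERACTIVE_REJECTION = 3
    BACKGROUND_REJECTION = 4


def classify(overage_minutes: float, suspended: bool = False) -> CapacityState:
    """Map minutes of accumulated overage to a state.

    The thresholds below are ASSUMED for illustration; check the
    current throttling documentation before relying on them.
    """
    if suspended:
        return CapacityState.SUSPENDED
    if overage_minutes <= 10:
        return CapacityState.HEALTHY
    if overage_minutes <= 60:
        return CapacityState.AT_RISK_OF_THROTTLING
    if overage_minutes <= 24 * 60:
        return CapacityState.INTERACTIVE_REJECTION
    return CapacityState.BACKGROUND_REJECTION


if __name__ == "__main__":
    for minutes in (0, 30, 180, 2000):
        print(minutes, classify(minutes).name)
```

Using an `IntEnum` keeps the slide's numbering, so states can be compared by severity (for example, `state >= CapacityState.INTERACTIVE_REJECTION`).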
Capacity Metrics App
IN THIS SECTION
▸ Capacity Metrics App
▸ Dynamics of Capacity Consumption
▸ Throttling
Capacity Metrics App - UI
Configuration Features
Installation:
Configuration Later:
Example 1. Capacity – Incremental Degradation
Example 2. Stability, Spike, Throttling
Average utilization is ~85-90%
Spike due to high load
Accumulated debt
Utilization = Current Utilization + Smoothed Debt
What We Need to Remember?
- Current Utilization = Real Current Utilization + Accumulated Debt
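A toy model can make the formula concrete. The sketch below is an assumption-heavy simplification: it carries any usage above the capacity limit forward as debt and adds it to the next timepoint. It does not reproduce Fabric's actual smoothing windows; the `simulate` function, the limit of 100, and the sample numbers are all hypothetical.

```python
def simulate(real_usage, limit):
    """Return reported utilization (%) per timepoint.

    Toy carry-forward model (an assumption, not Fabric's algorithm):
    reported = real usage + debt left over from earlier timepoints.
    """
    debt = 0.0
    reported = []
    for real in real_usage:
        total = real + debt                # Real Current Utilization + Accumulated Debt
        reported.append(round(100 * total / limit, 1))
        debt = max(0.0, total - limit)     # whatever did not fit is carried forward
    return reported


if __name__ == "__main__":
    # Steady ~85-90% load with a single spike at the third timepoint.
    usage = [85, 90, 250, 85, 85, 85]
    print(simulate(usage, limit=100))
    # prints [85.0, 90.0, 250.0, 235.0, 220.0, 205.0]
```

With only about 15% headroom per timepoint, the debt from one spike shrinks slowly, which is why a capacity running at 85-90% on average can stay above 100% long after the spike itself has passed.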
Disaster – What Does It Look Like?
What to Do?
IN THIS SECTION
▸ Identification
▸ Root cause analysis
▸ Short- and long-term actions
Step 1. Identification
- Alert about Capacity Utilization
- Overall cause - Slide "Health"
- Compute.Utilization > 100% and Compute.Throttling > 100%
- Fabric Activator Alert
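The utilization and throttling condition above can be sketched as a simple check over a series of timepoints. This is a minimal sketch: the `first_sustained_breach` helper and the `min_consecutive` parameter are hypothetical additions (requiring several consecutive breaches is an assumption meant to reduce alert noise), and the samples are made-up `(utilization %, throttling %)` pairs rather than real Capacity Metrics data.

```python
def first_sustained_breach(samples, min_consecutive=3):
    """Return the index where a sustained breach starts, or None.

    A timepoint breaches when utilization > 100% and throttling > 100%.
    `min_consecutive` is an assumed tuning knob, not a Fabric setting.
    """
    run = 0
    for i, (utilization_pct, throttling_pct) in enumerate(samples):
        if utilization_pct > 100 and throttling_pct > 100:
            run += 1
            if run >= min_consecutive:
                return i - min_consecutive + 1
        else:
            run = 0  # a healthy timepoint resets the streak
    return None


if __name__ == "__main__":
    samples = [(90, 0), (120, 110), (130, 120), (95, 0),
               (140, 105), (150, 110), (160, 130)]
    print(first_sustained_breach(samples))  # prints 4
```

The two-timepoint breach at indices 1-2 is ignored; only the run starting at index 4 lasts long enough to trigger.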
Step 2. Root Cause Analysis
Step 3. Root Cause Analysis
Immediate Actions: What to Do Right Now
Options:
- Start & Stop Capacity
- Resize Capacity
- Turn Off Heavy Processes
Pros:
- No accumulated debt
- Faster relief
- No billing
- No turn-offs
Cons:
- Additional $
- Quota request
- Slow relief
- Additional $
Long-term Actions – Data Pipelines Optimization
- Heavy Pipeline: Capacity Upscale -> Run -> Downscale
- Loading Type: Full Reload -> Incremental
- High Concurrency Mode for Development
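The "Full Reload -> Incremental" idea can be sketched with a watermark: each run processes only rows modified since the previous run. This is a minimal, engine-agnostic sketch in plain Python; the `incremental_rows` helper and the `modified_at` column are hypothetical names, and a real pipeline would apply the same filter in Spark or SQL and persist the watermark between runs.

```python
from datetime import datetime


def incremental_rows(source_rows, last_watermark):
    """Return rows modified after the watermark, plus the new watermark.

    Assumes every row carries a `modified_at` timestamp (hypothetical column).
    """
    new_rows = [r for r in source_rows if r["modified_at"] > last_watermark]
    # Keep the old watermark when nothing changed.
    new_watermark = max((r["modified_at"] for r in new_rows), default=last_watermark)
    return new_rows, new_watermark


if __name__ == "__main__":
    rows = [
        {"id": 1, "modified_at": datetime(2024, 1, 1)},
        {"id": 2, "modified_at": datetime(2024, 1, 5)},
        {"id": 3, "modified_at": datetime(2024, 1, 9)},
    ]
    changed, watermark = incremental_rows(rows, datetime(2024, 1, 3))
    print([r["id"] for r in changed], watermark.date())
    # prints [2, 3] 2024-01-09
```

Processing two rows instead of three is trivial here, but on large tables the same pattern is what cuts the CU cost of each refresh.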
Long-term Actions – Capacity Separation
Before:
One F32 capacity for Users, Critical Processes, and the other processes
After:
F16 for Critical Processes, F8 for Users, F8 for the other processes
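The before/after split can be checked with simple arithmetic, since the number in an F-SKU name is its capacity unit (CU) count. The sketch below is illustrative only: the helper names are hypothetical and the layouts are copied from the slide.

```python
def capacity_units(sku: str) -> int:
    """F-SKU names encode their CU count: 'F16' -> 16."""
    return int(sku[1:])


def total_cu(layout: dict) -> int:
    """Sum of CUs across all capacities in a layout."""
    return sum(capacity_units(sku) for sku in layout.values())


def largest_pool(layout: dict) -> int:
    """Largest single capacity, i.e. the most any one workload group can draw on."""
    return max(capacity_units(sku) for sku in layout.values())


before = {"Users, Critical Processes, and the other processes": "F32"}
after = {"Critical Processes": "F16", "Users": "F8", "Other processes": "F8"}

if __name__ == "__main__":
    print(total_cu(before), total_cu(after))          # prints 32 32
    print(largest_pool(before), largest_pool(after))  # prints 32 16
```

The total stays at 32 CUs, so the split is about isolation rather than extra compute: a spike from Users can no longer throttle Critical Processes, at the price of each group having a smaller pool of its own.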
Additional Tools
IN THIS SECTION
▸ Tools
▸ Recap
Tool 1. Fabric Chargeback Reporting
Tool 2. Fabric Activator for Alerting
Tool 3. Surge Protection in Fabric
Tool 4. Fabric Cost Analysis
https://community.fabric.microsoft.com/t5/Fabric-platform-Community-Blog/Fabric-Cost-Analysis-Shine-a-light-on-your-platform-costs/ba-p/4907392
Tool 5. SemPy to Save Historical Data
Code examples: https://github.com/dataassets1/microsoft-fabric-capacity-tools/blob/main/sempy_get_data.py
Conclusion & Recap
- Plan Before You Buy: Use the SKU Estimator + Fabric Pricing page to right-size your capacity before purchase.
- Know Your Utilization Formula: Current Utilization = Real Current Utilization + Accumulated (Smoothed) Debt. Throttling is cumulative, so it creeps up.
- Monitor with the Capacity Metrics App: Track CU%, detect spikes, degradation, and throttling patterns before they become disasters.
- Respond Systematically: Alert → Identify → Root Cause → Act: start/stop capacity, resize SKU, or kill heavy processes.
- Optimize for the Long Term: Switch to incremental loads, schedule pipelines with upscale/downscale, and separate capacities by workload.
- Leverage Additional Tools: Fabric Activator (alerts), Chargeback (cost visibility), Surge Protection, Cost Analysis, SemPy (historical data).