Build Smarter Analytics Assistants with Fabric Data Agents and Copilot Studio
📎 Slide deck: Build Smarter Analytics Assistants with Fabric Data Agents - Piotr Prussak.pdf
Speaker: Piotr Prussak — Data & AI Architect (PL-300, DP-600, DP-700, AI-102, CSPO)
Key Takeaways
- Modeling is the #1 lever for agent accuracy — not the AI model
- Column descriptions are the single highest-ROI grounding mechanism
- F2 is enough to start — Copilot/AI included on all paid SKUs since April 2025
- Build a golden test set (20–50 questions) BEFORE deploying to production
- Re-evaluate every 3 months — this space moves faster than enterprise planning cycles
- Pattern: Copilot Studio = orchestrator, Fabric Data Agent = domain expert
Session Roadmap
- Honest Caveats — what you need to know before investing time
- AI Solutions Landscape — Data Agents, Copilot Studio, and where they fit
- Setup, Prerequisites & Costs — what it takes to get started
- Solution Walkthroughs — three data scenarios, increasing complexity
- Deep Dives — modeling, schema design, grounding, and testing
- Decision Guides — take-home frameworks
Honest Caveats
Caveat #1: This Is Preview
- Fabric Data Agents + Copilot Studio integration is currently in preview
- New features ship monthly (MCP endpoints, M365 Copilot integration, ontology support all landed in last 6 months)
- SLAs, performance guarantees, full docs not yet final
- Build to learn, not to bet the farm — yet
Caveat #2: Microsoft Follows Adoption
- Microsoft invests in features that get used — mothballs what doesn't stick
- Precedents: Cortana Intelligence Suite and Power BI Dataflows v1 were retired; Data Activator's trajectory remains unclear
- If you adopt early, you influence the roadmap. If you wait, the feature may not survive.
Caveat #3: Set a 3-Month Horizon
- GPT-4o → GPT-4.1 → GPT-5 GA in Copilot Studio — three model generations in under two months
- Schedule a formal re-evaluation every 3 months
Caveat #4: This Is One Piece of the Toolkit
- Azure AI Foundry Agents — custom agents, code-first control
- Semantic Kernel — orchestration framework for complex AI workflows
- Microsoft 365 Copilot — end-user surface
- Copilot Studio — low-code agent builder/orchestrator
- MCP (Model Context Protocol) — emerging interop standard
AI Solutions Landscape
What Is a Fabric Data Agent?
- AI-powered assistants for natural language conversations about enterprise data
- Understands schema across lakehouses, warehouses, semantic models, KQL databases, ontologies
- Enforces governance — RLS, CLS, user permissions flow through automatically
- Stores conversation history across sessions
- Not just NL-to-SQL — reasons across multiple sources, maintains context
What Is Copilot Studio?
- Low-code platform for building custom AI agents with multi-agent orchestration
- Connected Agents — link Fabric Data Agents as specialized "experts"
- Multi-channel: Teams, web, M365 Copilot, custom apps
- Currently on GPT-5 GA with versioning controls
- Pattern: User asks in Teams → Copilot routes → Data Agent queries → grounded answer returns
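The routing pattern above can be sketched in miniature. In the real product, Copilot Studio's LLM-based orchestration selects the connected agent; the keyword router and the two stand-in agents below are purely illustrative, not any actual API.

```python
# Conceptual sketch of the orchestrator pattern: a router picks the
# domain expert, and that agent answers from its own grounded data.
# In production, Copilot Studio performs this routing; the keyword
# match here is just a stand-in for its agent selection.

AGENTS = {
    "capacity": lambda q: f"[capacity agent] grounded answer to: {q}",
    "sales":    lambda q: f"[sales agent] grounded answer to: {q}",
}

def route(question: str) -> str:
    # Trivial heuristic standing in for LLM-based agent selection.
    if "CU" in question or "capacity" in question.lower():
        return AGENTS["capacity"](question)
    return AGENTS["sales"](question)

print(route("What was peak CU usage last week?"))
```

The shape is the point: the orchestrator owns channel and routing concerns, while each connected Data Agent stays a narrow domain expert.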
Where Do These Fit?
| Option | When to Use |
|---|---|
| Native Copilot in Power BI | User is in a report, needs contextual Q&A |
| Fabric Data Agent (standalone) | Domain expert on specific dataset, analysts in Fabric |
| Data Agent + Copilot Studio | Multi-agent, mixed knowledge, deploy to Teams/web |
| Azure AI Foundry / Semantic Kernel | Full code-first control, custom RAG, complex workflows |
Start with the simplest option. Escalate complexity only when needed.
Setup, Prerequisites & Costs
Prerequisites Checklist
- F2+ capacity (or P1+ with Fabric enabled) — Copilot/AI included on all paid SKUs since April 2025
- Tenant settings: Fabric Data Agent, Cross-geo AI processing, XMLA endpoints, Standalone Copilot — all enabled
- At least one data source with data (Warehouse, Lakehouse, Semantic Model, KQL DB, or Ontology)
- Copilot Studio: same tenant, same account, M365 Copilot license
Authentication Mode (Critical Decision)
- User Authentication — queries run as end user (RLS enforced per user) ← right choice for enterprise
- Agent Author Authentication — queries run as author (simpler, but shared access)
Costs
| Item | Cost |
|---|---|
| F2 capacity | ~$262/month (Copilot included) |
| Copilot Studio PAYG | $0.01/credit |
| Copilot Studio prepaid | $200/tenant/month (25,000 credits) |
| M365 Copilot (authoring) | $30/user/month |
💡 F2 Copilot inclusion was a game-changer — many still think F64 is required.
Solution Walkthroughs
Solution A: Technical / Operational Data (Start here)
- Scenario: Fabric Capacity Metrics / FUAM as data source
- Well-structured, narrow domain, numeric-heavy, low ambiguity
- Example queries: peak CU usage, workspace consumption, failed refreshes
- Why this works: Schema is self-descriptive, questions map to single-table aggregations
- MVP path — use your own capacity data, no setup needed, immediate relevance
Solution B: Business Data — Complex Schema (What goes wrong)
- Scenario: Wide World Importers (many-to-many, SCD patterns, multi-granularity facts, self-referencing hierarchies)
- Common failures:
- Ambiguous joins — agent picks wrong path through M:M
- Temporal confusion — doesn't know which date = "current"
- Granularity mismatch — aggregates at wrong level
- Hallucinated columns — invents column names that sound right
- Over-joining — joins 6 tables when answer was in one
- The lesson: Raw complex schemas are hostile to AI agents. The agent isn't broken — the schema was never designed for this consumer.
Solution C: Business Data — Simplified Schema (Same data, modeled right)
- Same questions as Solution B — now they work
- Changes made: star schema, descriptive column names, column descriptions populated, bridge tables hidden, SCD abstracted into "current" views
- The punchline: The agent didn't get smarter. The data got clearer.
Modeling is the prerequisite for production-quality agent responses.
Deep Dives
Data Agent Modeling
- Agent sees: table names, column names, data types, relationships, descriptions
- Agent does NOT see data values unless it queries them
- Best practices:
- Descriptive, unambiguous names — avoid abbreviations
- Define explicit foreign keys and cardinality
- Write rich column descriptions — single highest-ROI grounding mechanism
- Hide internal/technical columns
- Use meta-prompting: ask the agent to generate its own instructions from the schema
- Prefer calculated columns over measures for values agents need to filter/group on
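The meta-prompting idea can be made concrete. A minimal sketch, assuming you can export table and column metadata from your model: build a prompt that hands the schema (with its column descriptions) to the model and asks it to draft its own agent instructions. The schema dict and column names below are hypothetical examples, not from the session.

```python
# Sketch: meta-prompting. Assemble a prompt that asks the model to
# draft agent instructions from the schema it will be grounded on.
# The schema content here is illustrative only.

def build_meta_prompt(tables: dict[str, dict[str, str]]) -> str:
    """tables maps table name -> {column name: column description}."""
    lines = [
        "You are a data agent grounded on the schema below.",
        "Draft concise agent instructions covering:",
        "- which questions you can answer,",
        "- which columns to prefer for filters and grouping,",
        "- when to refuse a question as out of scope.",
        "",
        "Schema:",
    ]
    for table, columns in tables.items():
        lines.append(f"Table {table}:")
        for col, desc in columns.items():
            lines.append(f"  - {col}: {desc}")
    return "\n".join(lines)

prompt = build_meta_prompt({
    "FactSales": {
        "OrderDate": "Date the order was placed (use for time filters)",
        "NetAmount": "Revenue after discounts, in USD",
    }
})
print(prompt)
```

Note how the rich column descriptions do double duty: they ground the agent directly and they feed the generated instructions.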
Schema Design
- Denormalize strategically — flatten M:M into bridge-free views
- Resolve SCD ambiguity — create "current" views alongside history
- Eliminate field name collisions ("date" in 12 tables — which one?)
- Separate concerns — one semantic model per bounded domain
- Agent instructions capped at 15,000 characters — be concise
- Define what the agent SHOULD and SHOULD NOT answer
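The SCD point deserves a worked example. A minimal sketch with hypothetical column names (CustomerKey, Segment, IsCurrent): collapse a Type 2 history table into a "current" view so the agent never has to reason about validity ranges. In the warehouse this would be a SQL view filtered on the current-row flag; plain Python stands in here.

```python
# Sketch: abstract an SCD Type 2 history table into a "current" view.
# One unambiguous row per business key, with the SCD bookkeeping
# column dropped so the agent cannot pick a stale version.
# All names and values are illustrative.

history = [
    {"CustomerKey": 1, "Segment": "SMB",        "IsCurrent": False},
    {"CustomerKey": 1, "Segment": "Enterprise", "IsCurrent": True},
    {"CustomerKey": 2, "Segment": "SMB",        "IsCurrent": True},
]

current = [
    {k: v for k, v in row.items() if k != "IsCurrent"}
    for row in history
    if row["IsCurrent"]
]
print(current)
```

Point the agent at the "current" view for present-state questions and keep the history table for explicit time-travel questions, stated clearly in the agent instructions.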
Getting Grounded Responses
- Grounding = responses tied to actual data, not hallucinated
- Key mechanisms: agent instructions, agent descriptions, column descriptions, semantic model layer
- Goal: an agent that answers correctly or says it can't, not one that always produces an answer
Testing Patterns
- Build golden question set (20–50 questions with known correct answers) BEFORE production
- Run after every model update, schema change, or instruction edit
- Test for: correct answers, graceful refusal on out-of-scope, consistency across phrasings, concurrent load
- Copilot Studio supports side-by-side agent version comparison — use it
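The golden-set idea can be sketched as a small harness. Everything here is a hypothetical placeholder: `ask_agent` stands in for whatever client calls your deployed agent, and the questions, expected answers, and refusal check are examples, not a prescribed format.

```python
# Sketch of a golden-question harness, run after every model update,
# schema change, or instruction edit. `ask_agent` is a stub; replace
# it with a real call to your agent's endpoint.

GOLDEN_SET = [
    {"question": "What was total net revenue in 2024?",
     "expect": "4.2M", "in_scope": True},
    {"question": "Write me a poem about dashboards.",
     "expect": None, "in_scope": False},
]

def ask_agent(question: str) -> str:
    # Placeholder for the real agent client.
    raise NotImplementedError

def run_golden_set(ask=ask_agent):
    results = []
    for case in GOLDEN_SET:
        answer = ask(case["question"])
        if case["in_scope"]:
            # Correct-answer check: expected value appears in the response.
            passed = case["expect"] in answer
        else:
            # Out-of-scope questions should be refused, not answered.
            passed = "can't" in answer.lower() or "cannot" in answer.lower()
        results.append((case["question"], passed))
    return results
```

Extend the same loop to cover the other test dimensions from above: run each question under several phrasings for consistency, and run the set concurrently for load.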
Decision Guides
Semantic Model vs. Direct SQL/Lakehouse
| Use Semantic Model when... | Go Direct to SQL/Lakehouse when... |
|---|---|
| Business logic in DAX measures | Exploratory / ad-hoc (data science) |
| Need consistent calculations | Schema is simple + self-descriptive |
| RLS/CLS already defined | Data not yet modeled (raw ingestion) |
| Well-bounded domain | Performance requires engine pushdown |
Copilot Studio vs. Native Fabric Copilot
- Native: users already in reports, contextual Q&A, no custom orchestration
- Studio: multi-agent, custom topics/triggers, Teams/web deployment, mixed knowledge sources
Signs You're NOT Ready to Deploy
- ✗ No golden question test set
- ✗ No column descriptions in semantic model or schema
- ✗ No clear domain boundary ("it should answer everything")
- ✗ No executive sponsor who understands "this is preview"
- ✗ No plan for monitoring responses in production
- ✗ No defined escalation path for wrong answers
If more than two apply, invest in readiness before deployment.
Five Things to Do Monday Morning
- Verify Fabric tenant settings — enable Data Agents, Copilot, XMLA endpoints
- Build one data agent on capacity metrics — prove the platform works
- Audit one production semantic model — add column descriptions, check naming clarity
- Write 20 golden test questions for your most likely agent domain
- Schedule a 3-month re-evaluation checkpoint (next: June 2026)