AI in Action: Microsoft Fabric for Data Science
Description
In this fast-paced session, discover how Microsoft Fabric streamlines the entire AI workflow, from data exploration to model deployment. You’ll learn how to use it for data science, train models, and generate predictions, all within Fabric’s unified analytics platform. Perfect for data professionals and AI enthusiasts looking to get started with Fabric in just one hour.
Key Takeaways
- When Fabric is the better choice
- Why Fabric is effective for Data Science & AI
- Kickstart Your Data Science Journey
- Wrangling (Data Preparation)
- Exploration & Visualization
My Notes
Action Items
- [ ]
Resources & Links
Slides
Microsoft Fabric
AI in Action: Microsoft Fabric for Data Science
Exploring data science with Microsoft Fabric to build AI technologies and scalable intelligent systems
By Sadiq Ahmed
Quantum Technology Inc.
Linkedin.com/in/sadiqhahmed/
Session Outline
• Holistic Overview
• When Fabric is the better choice
• Why Fabric is effective for Data Science & AI
• Key Concepts
• AI & DS
• Kickstart Your Data Science Journey
• Data Ingestion
• Wrangling (Data Preparation)
• Exploration & Visualization
• Modeling
• Evaluation
• Deployment
• Things to know
• Take Away
• Thank You & Q&A
Holistic overview
This presentation is a holistic overview of Fabric, primarily for Data Science & AI.
When Fabric is the better choice
You want SaaS-first design, tight Microsoft ecosystem integration, and unified architecture.
You want one platform for data engineering, data science, ML, BI and Governance.
You want serverless compute with no cluster management.
Your team includes multiple roles like analysts and citizen data scientists.
You want integration with Power BI.
You want fast time to value with minimal setup & infrastructure management.
And More …
Why Fabric is effective for Data Science & AI
Its strength comes from combining OneLake, Spark, ML tooling, and AI
functions into a single SaaS experience.
Deep integration with Azure AI and OpenAI
End-to-end workflow in one place
Low-code/no-code and pro-code together
Governance and responsible AI built in
Key Concepts of Microsoft Fabric
➢ Fabric (Platform)
A unified SaaS platform for analytics, data science,
and AI, offering a single environment for all workloads.
➢ OneLake (Centralized Storage Layer)
A unified data lake storing all organizational data with
one logical view, reducing duplication across regions
and clouds.
➢ Azure Data Lake Storage (ADLS) Foundation
Built on ADLS, supports open formats (Delta, Parquet,
CSV, JSON), and ensures scalability and compatibility.
➢ Lakehouse (Structured Container in OneLake)
Organizes data as tables and files, blending data
lake flexibility with warehouse schema, optimized
for analytics and ML.
➢ Compute Engines (Integration Layer)
All compute engines (Data Engineering, Data
Science, Real-Time Analytics, Power BI) store data
in OneLake using delta-parquet format for
seamless interoperability.
➢ Shortcuts (External References)
References to external files or storage outside
OneLake enable access without copying and keep
data synchronized.
➢ Mirroring (Replication)
Mirroring in Fabric is near real-time replication of
external databases into OneLake so you can
query and analyze them directly.
➢ Workspace
A workspace in Microsoft Fabric is a secure,
collaborative container that organizes and
manages all the data and analytics items for a
specific project or team.
KEY TAKEAWAY:
CONCEPTS
✓ Fabric is the platform.
✓ OneLake is the unified storage layer.
✓ Lakehouse is the structured container in OneLake.
✓ ADLS + formats are the foundation.
✓ Compute engines integrate seamlessly.
✓ Shortcuts extend access to external data.
✓ Mirroring replicates external systems into OneLake.
Data Science & Artificial Intelligence
Relation: DS & AI
Data Science workflow = foundation (data prep, modeling,
evaluation).
AI workflow = extension
(using ML + advanced AI capabilities).
In Fabric, you typically start with the Data Science steps and then
expand into AI workflows if your project requires intelligent
automation, generative AI, or advanced ML.
Data Science + AI/ML project lifecycle
Frameworks / lifecycles / workflows / methodologies:
CRISP-DM
OSEMN
SEMMA
KDD Process
TDSP (Microsoft)
ASUM-DM (IBM)
AI-DSF
ASEMIC
ML Lifecycle (Google)
MLOps Lifecycle (Google/AWS/Azure)
NIST AI RMF
DataOps Lifecycle
Generally, a project follows these steps:
Business stage
➢ Business problem definition (Business understanding)
Technical stages
➢ Data collection
➢ Data preparation
➢ Data exploration (Exploratory Data Analysis)
➢ Modeling
➢ Model evaluation
➢ Model deployment + Monitoring & Inference
In General, Stages in Microsoft Fabric
Data Ingestion → bringing raw data from OneLake, databases, or external sources
Data Wrangler → cleaning, transforming, and preparing datasets for analysis
Data Exploration & Visualization → charts, summaries, and quick insights
Modeling → training ML models (classification, regression, clustering, etc.)
Evaluation → testing model accuracy and performance
Deployment → integrating models into Fabric pipelines for production use
Backbone of workflow
These six steps describe Data Science in Fabric, but they also serve as the backbone of AI workflows. AI simply layers additional capabilities on top of them.
Kickstart Your Journey
ROLES INVOLVED:
✓ Data Engineer
✓ Data Analyst
✓ Data Scientist
✓ Machine Learning Engineer
✓ AI Engineer
✓ MLOps Engineer
✓ Database Administrator
✓ Data Architect
✓ Solution Architect
✓ Data Steward
✓ Governance/Compliance Officer
✓ Fabric Administrator / Workspace Admin
✓ Business Analyst
✓ BI Developer
✓ Real-Time Analytics Engineer
✓ DevOps Engineer
✓ Product Owner
✓ …..
USE CASES: Examples
AI-Driven Customer Insights
Organizations use AI to segment customers, enabling personalized
marketing and enhanced customer experiences.
Predictive Analytics
Predictive analytics forecasts outcomes such as sales trends, helping businesses optimize
inventory and increase revenue.
Operational Efficiency Gains
AI-driven insights enhance operational workflows, leading to improved
efficiency and competitive advantage.
Stages in Microsoft Fabric
Data Ingestion → bringing raw data from OneLake, databases, or external sources
Data Wrangler → cleaning, transforming, and preparing datasets for analysis
Data Exploration & Visualization → charts, summaries, and quick insights
Modeling → training ML models (classification, regression, clustering, etc.)
Evaluation → testing model accuracy and performance
Deployment → integrating models into Fabric pipelines for production use
Demo
• Workspace
• Menu
Data Onboarding in Fabric
Data Ingestion
Ingestion brings data into OneLake (physically or virtually) so Fabric can analyze, transform, and use it.
Ways to Ingest Data
Pipelines
Dataflows
Notebooks
Real-time streams
Direct file uploads
Shortcuts (virtualization)
SQL loading
APIs
More..
Code-Free Ingestion
Dataflows Gen2 (Power Query)
Data Pipelines (Data Factory in Fabric)
Copy Activity (inside Pipelines)
Drag-and-drop file upload into the Lakehouse
Shortcuts (virtualize external storage like ADLS, AWS S3, Google Cloud Storage)
Eventstreams (real-time ingestion)
Code-Based Ingestion
Spark Notebooks (PySpark, Scala, Spark SQL)
Spark Jobs
Pandas / Python inside notebooks
Delta Lake APIs
COPY INTO (SQL Warehouse)
Bulk Insert (SQL Warehouse)
Streaming Ingestion
Eventstreams (real-time ingestion into OneLake or KQL)
Kusto / Eventhouse connectors
Event Hubs / Kafka via Eventstream
What happens after Ingestion?
- Fabric lands the raw data in OneLake as Delta tables,
- makes them queryable via SQL and Spark,
- allows you to build Semantic Models on top of them, and
- supports analytics, reporting, and machine learning applications built on the underlying data.
Key Takeaway:
DATA INGESTION
Process:
Collect raw data from sources
(databases, APIs, files, external systems).
Goal:
Bring all relevant data into Fabric for analysis.
Fabric Tools/Features:
OneLake → unified data lake for all organizational data
Lakehouse → structured container for analytics-ready data
Data Factory pipelines → ETL/ELT workflows to move and transform data
More…
Connectors → More than 200 native connectors
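As a minimal sketch of the ingestion step (collect raw data, then land it in a typed, analysis-ready shape), the snippet below parses a small CSV extract using only Python's standard library. The sample data, the column names, and the `ingest_csv` helper are hypothetical. In a Fabric notebook the same step would more typically read from the Lakehouse with Spark or pandas.

```python
import csv
import io

# Hypothetical raw extract. In Fabric this could be a file uploaded to a Lakehouse.
RAW_CSV = """order_id,customer,amount
1001,Contoso,250.00
1002,Fabrikam,99.50
1003,Contoso,410.25
"""


def ingest_csv(text):
    """Parse CSV text into a list of typed row dictionaries."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for record in reader:
        rows.append(
            {
                "order_id": int(record["order_id"]),
                "customer": record["customer"],
                "amount": float(record["amount"]),
            }
        )
    return rows


orders = ingest_csv(RAW_CSV)
print(len(orders), "rows ingested")
```

Casting types at ingestion time keeps later wrangling and modeling steps from having to guess whether a column is text or numeric.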
Demo
Data Preparation with Data Wrangler
Wrangling (Data Preparation)
IMPORTANCE OF CLEAN, WELL-PREPARED DATA
Data Cleaning
Cleaning data removes noise and inconsistencies, essential for accurate
machine learning results.
Data Transformation
Transforming data prepares it for modeling, boosting machine learning
performance.
Model Performance Improvement
Well-prepared data enhances model accuracy and trustworthiness.
What Data Wrangler Does in Microsoft Fabric
Exploratory data analysis: Displays data in a
grid-like interface with dynamic summary
statistics.
Data cleaning operations: Built-in functions
for handling missing values, duplicates, and
formatting issues.
Transformations: Apply operations like
filtering, grouping, or feature engineering with
just a few clicks.
Code generation: Produces pandas or
PySpark code that can be saved back into
notebooks for reproducibility.
Visualization support: Offers charts and
plots to quickly understand distributions and
relationships.
Why It Matters
Accelerates preprocessing:
Saves time compared to manual
coding.
Ensures reproducibility:
Generated code can be reused and
shared.
Bridges raw data to ML models:
Clean data is essential for accurate
predictions.
Scales with Fabric:
Works with both small datasets
(pandas) and big data (PySpark).
Bridge
Data Wrangler in Microsoft Fabric is the bridge
between raw data and machine learning
models, making the data science process
smoother, faster, and more reliable.
Key Takeaway:
WRANGLING (DATA PREPARATION)
Process:
Clean, transform, and structure the data.
Goal:
Make messy raw data usable for analysis and modeling.
Fabric Tools/Features:
Data Wrangler → interactive tool in notebooks for cleaning and
transformation
Spark (PySpark) → distributed processing for large-scale data prep
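The cleaning operations named above (formatting fixes, duplicate removal, missing-value handling) can be sketched in a few lines. This is plain Python rather than the pandas or PySpark code that Data Wrangler generates, and the sample rows, the `wrangle` function, and the choice of median imputation are all assumptions made for illustration.

```python
from statistics import median

# Made-up messy input: stray whitespace, inconsistent casing,
# one exact duplicate, and one missing amount.
raw_rows = [
    {"customer": " Contoso ", "region": "west", "amount": 250.0},
    {"customer": "Fabrikam", "region": "EAST", "amount": None},
    {"customer": " Contoso ", "region": "west", "amount": 250.0},
    {"customer": "Northwind", "region": "East", "amount": 100.0},
]


def wrangle(rows):
    """Standardize text, drop exact duplicates, and fill missing amounts."""
    # 1. Formatting: trim whitespace and normalize region casing.
    cleaned = [
        {
            "customer": r["customer"].strip(),
            "region": r["region"].strip().title(),
            "amount": r["amount"],
        }
        for r in rows
    ]

    # 2. Duplicates: keep the first occurrence of each identical row.
    seen = set()
    unique = []
    for r in cleaned:
        key = (r["customer"], r["region"], r["amount"])
        if key not in seen:
            seen.add(key)
            unique.append(r)

    # 3. Missing values: impute with the median of the known amounts.
    known = [r["amount"] for r in unique if r["amount"] is not None]
    fill_value = median(known)
    for r in unique:
        if r["amount"] is None:
            r["amount"] = fill_value
    return unique


clean_rows = wrangle(raw_rows)
for row in clean_rows:
    print(row)
```

Keeping each operation as a separate, ordered step mirrors the reproducibility point above: the same function can be rerun on a fresh extract and will apply the same cleaning rules.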
Demo
Notebook Magic: Explore, Visualize, Discover
Exploration & Visualization
Understanding Data Through Visual Representation
Visualize Data Relationships
• Explore connections within your data by using visualization tools.
Use Key Libraries
• Take advantage of libraries like matplotlib and seaborn in Fabric notebooks to plot correlations, outliers, and trends.
Enhance Analysis with Embedded Visuals
• Incorporate visuals directly into your workflow to make analysis clearer and highlight important insights.
Understanding Data Through Visual Representation
Code Execution Integration
Documentation and Explanation
Handling Missing Values
etc.
KEY TAKEAWAY:
EXPLORATION & VISUALIZATION
Process:
Analyze distributions, correlations, and anomalies.
Goal:
Understand the data before modeling.
Fabric Tools/Features:
Notebooks → Python/Spark for exploratory analysis
Visualization libraries → matplotlib, seaborn, Plotly
Power BI integration → interactive dashboards and reports
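Before plotting, it often helps to compute the numbers a chart would show. The sketch below calculates a Pearson correlation and flags outliers with the common 1.5 × IQR rule, using only the standard library. The data, the function names, and the choice of the IQR rule are illustrative assumptions. In a notebook these figures would usually feed a matplotlib or seaborn plot.

```python
from math import sqrt
from statistics import mean, quantiles


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = mean(xs), mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


def iqr_outliers(values):
    """Return values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]


# Made-up example data.
ad_spend = [10, 20, 30, 40, 50]
sales = [25, 45, 65, 85, 105]
daily_orders = [12, 14, 15, 13, 16, 14, 95]

r = pearson(ad_spend, sales)
outliers = iqr_outliers(daily_orders)
print(f"correlation={r:.3f} outliers={outliers}")
```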
Demo
Train Smarter: MLflow in Action
Modeling
Train, Track & Manage
Microsoft Fabric enables data scientists to train, track, and manage machine learning models with
notebooks and popular frameworks and libraries.
Training models inside Fabric
• Fabric notebooks use Spark compute, supporting PySpark and Python.
• Popular ML frameworks/libraries include:
a. Scikit-learn: For classification, regression, and clustering models.
b. PyTorch and TensorFlow: For deep learning in NLP and computer vision.
c. SynapseML: For scalable machine learning pipelines.
• All of these work with Python and pandas DataFrames.
Tracking experiments with MLflow
• Fabric supports MLflow Experiments, making it easy to log
• parameters,
• metrics,
• artifacts, and
• model versions.
• This streamlines run comparison, result reproduction, and team collaboration.
• MLflow acts as the main platform to track model performance over time.
Managing models in Fabric
• After training, models can be registered and versioned with MLflow’s model registry.
• Fabric keeps models, code, data, and experiment history unified in one environment.
Key Takeaway:
MODELING
Process:
Train machine learning models (classification, regression, clustering, etc.).
Goal:
Build predictive or descriptive models.
Fabric Tools/Features:
MLflow integration → experiment tracking and model management
Fabric notebooks → scikit-learn, TensorFlow, PyTorch, Spark MLlib
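To make "train a regression model" concrete, here is a minimal ordinary least squares fit written from scratch. It is a teaching sketch, not how a Fabric notebook would normally train a model: there you would more likely reach for scikit-learn or Spark MLlib. The training data and the `fit_line` helper are invented for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up training data: units sold as a function of ad spend.
ad_spend = [1.0, 2.0, 3.0, 4.0]
units_sold = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(ad_spend, units_sold)

# Use the fitted parameters to predict an unseen input.
forecast = slope * 5.0 + intercept
print(f"slope={slope:.2f} intercept={intercept:.2f} forecast={forecast:.2f}")
```

The fitted `slope` and `intercept` are the "model" here. They are the kind of artifact that would be logged and versioned alongside the parameters that produced them.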
Demo
Model Evaluation with MLflow Metrics
Evaluation
BEST PRACTICES: REPRODUCIBILITY, EXPERIMENT COMPARISON
Ensuring Reproducibility
MLflow captures code, data, and parameters to ensure experiments can be reproduced accurately and reliably.
Experiment Comparison
Comparing different experiments allows selecting the best performing models confidently for deployment.
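The idea behind experiment comparison can be shown with a small stand-in for a tracking store. The code below does not call MLflow. MLflow provides functions such as `mlflow.log_param` and `mlflow.log_metric` for recording this information, whereas the `Run` class, the run names, and every number here are invented purely to illustrate selecting the best run by a chosen metric.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    """Stand-in for one tracked experiment run (not an MLflow object)."""

    name: str
    params: dict
    metrics: dict = field(default_factory=dict)


# Illustrative runs with made-up parameters and metrics.
runs = [
    Run("baseline", {"max_depth": 3}, {"f1": 0.71, "rmse": 4.2}),
    Run("deeper_tree", {"max_depth": 8}, {"f1": 0.78, "rmse": 3.9}),
    Run("more_features", {"max_depth": 3, "n_features": 20}, {"f1": 0.75, "rmse": 3.6}),
]


def best_run(candidates, metric, higher_is_better=True):
    """Pick the run with the best value for the given metric."""
    chooser = max if higher_is_better else min
    return chooser(candidates, key=lambda run: run.metrics[metric])


best_by_f1 = best_run(runs, "f1")
best_by_rmse = best_run(runs, "rmse", higher_is_better=False)
print("best F1:", best_by_f1.name, "| best RMSE:", best_by_rmse.name)
```

Note that the two metrics disagree on the winner, which is why the metric used for selection should be fixed before runs are compared.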
Key Takeaway:
EVALUATION
Process:
Test models with metrics (accuracy, precision, recall,
RMSE, F1 score).
Goal:
Validate performance and select the best model.
Fabric Tools/Features:
MLflow metrics tracking → compare runs and
performance
Evaluation libraries → scikit-learn metrics, Spark MLlib
evaluators
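The metrics listed above have simple definitions that are worth seeing once in code. The sketch below computes accuracy, precision, recall, and F1 for a binary classifier, plus RMSE for a regression model, from made-up labels and predictions. In practice scikit-learn metrics or Spark MLlib evaluators would do this work. The function names are illustrative.

```python
from math import sqrt


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))


# Made-up labels and predictions.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

scores = classification_metrics(y_true, y_pred)
error = rmse([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
print(scores, f"rmse={error:.3f}")
```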
Demo
Predict at Scale: Deploy and Deliver
Deployment
DEPLOYING MODELS FOR PRODUCTION USE
Model Deployment Support
Fabric enables seamless deployment of machine learning models into production environments for enterprise use.
Real-Time Inference
Models deployed with Fabric can provide real-time inference for immediate data processing and decision making.
Batch Inference Capability
Batch inference is supported, allowing models to process large datasets efficiently at scheduled intervals.
GENERATING BATCH PREDICTIONS INTO DELTA TABLES
Batch Prediction Generation
Batch predictions enable processing large datasets at once for efficient analytics and reporting.
Delta Tables Storage
Storing predictions in Delta tables ensures reliable, scalable data management and easy access.
Analytics Integration
Delta tables facilitate seamless integration with downstream analytics and reporting workflows.
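The batch-scoring pattern described above (read rows in chunks, score each chunk, append the results to a predictions table) can be sketched without any Fabric dependencies. Here a Python list stands in for the Delta table and `predict` is a hypothetical fixed rule standing in for a registered model. Neither reflects a specific Fabric API.

```python
from itertools import islice


def chunked(rows, size):
    """Yield successive lists of at most `size` rows."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def predict(row):
    """Hypothetical model: a fixed linear rule used only for illustration."""
    return 2.0 * row["feature"] + 1.0


def score_in_batches(rows, batch_size):
    """Score rows batch by batch and collect results in a table-like list."""
    predictions_table = []  # stands in for a Delta table of predictions
    for batch_id, chunk in enumerate(chunked(rows, batch_size)):
        for row in chunk:
            predictions_table.append(
                {"id": row["id"], "prediction": predict(row), "batch_id": batch_id}
            )
    return predictions_table


input_rows = [{"id": i, "feature": float(i)} for i in range(5)]
predictions = score_in_batches(input_rows, batch_size=2)
for record in predictions:
    print(record)
```

Recording a `batch_id` with each prediction is a design choice that makes it easier to trace or rerun a single batch when downstream reports look wrong.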
Key Takeaway:
DEPLOYMENT
Process:
Integrate the model into production pipelines or apps.
Goal:
Deliver predictions and insights into real-world systems.
Fabric Tools/Features:
MLflow model registry → save, version, and manage models
Batch prediction pipelines → generate predictions at scale
Power BI dashboards → consume predictions in business
reports
Demo
Additional Points
Things to know
How Fabric and AI Foundry Fit Together?
PLATFORM | PRIMARY PURPOSE | STRENGTHS | WHEN YOU USE IT
Microsoft Fabric | Unified analytics - data engineering + ML | Data pipelines, Lakehouse, notebooks, ML training, governance | When preparing data, training models, and managing analytics
Microsoft AI Foundry | End-to-end AI application development | Prompt engineering, model catalog, evaluation, deployment, monitoring | When building, deploying, and operationalizing AI apps
Storage Account in Azure Portal?
• Fabric uses OneLake, not Azure Blob Storage
• Exposes paths that look like ADLS Gen2 URLs, but they are virtual
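To illustrate what an ADLS Gen2-style OneLake path looks like, the helper below assembles one from its parts. The URI shape (`abfss://<workspace>@onelake.dfs.fabric.microsoft.com/<item>.<itemtype>/<path>`) is an assumption based on the pattern Microsoft documents for OneLake, so confirm it against your own environment. The workspace and lakehouse names and the `onelake_uri` function are hypothetical.

```python
ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def onelake_uri(workspace, lakehouse, relative_path):
    """Build an abfss-style URI for a path inside a Lakehouse item.

    Assumes names contain no characters that need URL escaping.
    """
    relative_path = relative_path.lstrip("/")
    return f"abfss://{workspace}@{ONELAKE_HOST}/{lakehouse}.Lakehouse/{relative_path}"


# Hypothetical workspace and lakehouse names.
orders_uri = onelake_uri("SalesWS", "SalesLake", "Files/raw/orders.csv")
print(orders_uri)
```

The path addresses the workspace and item logically, which fits the point above: there is no separate storage account to look up in the Azure portal.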
OneLake & Lakehouse
➢ OneLake
✓ Fabric’s unified data lake
✓ Built on top of Azure storage
✓ Single logical view
➢ Lakehouse
✓ A structured data container that organizes information into tables and files
✓ Lives inside OneLake
✓ Optimized for analytics and machine learning
Usually!
➢ Ingest data → Lakehouse
➢ Transform data → Lakehouse (Delta tables)
➢ Expose tables → Semantic Model
➢ Build reports → Power BI
Medallion Architecture in Fabric
It is a multi-layered Data Architectural Pattern that organizes data into Bronze, Silver, and Gold layers to
progressively improve quality, structure, and business value.
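The Bronze, Silver, and Gold progression can be sketched with in-memory records. This is a conceptual illustration only: in Fabric each layer would typically be a set of Delta tables in a Lakehouse, and the sample rows, validation rule, and function names below are assumptions.

```python
# Bronze: raw records exactly as they arrived (all strings, possibly invalid).
bronze = [
    {"order_id": "1", "customer": "contoso", "amount": "100.0"},
    {"order_id": "2", "customer": "Fabrikam", "amount": "not_a_number"},
    {"order_id": "3", "customer": " Contoso ", "amount": "50.5"},
]


def to_silver(rows):
    """Silver: typed, standardized rows. Unparseable rows are dropped."""
    silver_rows = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # a real pipeline might quarantine the row instead
        silver_rows.append(
            {
                "order_id": int(row["order_id"]),
                "customer": row["customer"].strip().title(),
                "amount": amount,
            }
        )
    return silver_rows


def to_gold(rows):
    """Gold: business-ready aggregate (total amount per customer)."""
    totals = {}
    for row in rows:
        totals[row["customer"]] = totals.get(row["customer"], 0.0) + row["amount"]
    return totals


silver = to_silver(bronze)
gold = to_gold(silver)
print("silver rows:", len(silver), "| gold:", gold)
```

Each function only reads the layer before it, so raw data stays untouched in Bronze and any layer can be rebuilt from the one below.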
Medallion Architecture
• Medallion Architecture inside Lakehouses stored in OneLake.
Layer | Fabric Implementation | Typical Storage | Typical Tools/Engines | Purpose
Bronze | Lakehouse (raw zone) | Files, Delta tables | Data Factory Pipelines, Copy Activity, Dataflows Gen2 | Ingest raw data
Silver | Lakehouse (clean zone) | Delta tables | Spark Notebooks, Dataflows Gen2, Pipelines | Transform & standardize
Gold | Lakehouse (curated zone) | Delta tables or warehouse or semantic model | Spark, SQL, Power BI modeling | Business-ready analytics
Take Away
Take Away: Stages
Data Ingestion → bringing raw data from OneLake, databases, or external sources
Data Wrangling → cleaning, transforming, and preparing datasets for analysis
Exploration & Visualization → charts, summaries, and quick insights
Modeling → training ML models (classification, regression, clustering, etc.)
Evaluation → testing model accuracy and performance
Deployment → integrating models into Fabric pipelines for production use
Take Away: Stages - Tools/Features
Data Ingestion → OneLake, Lakehouse, Data Factory (pipelines), Shortcuts, Mirroring
Data Preparation / Wrangling → Data Wrangler, Spark (PySpark)
Exploration & Visualization → Notebooks, Visualization libraries (matplotlib, seaborn, Plotly), Power BI
Modeling → MLflow, Spark MLlib, scikit-learn, TensorFlow, PyTorch
Evaluation → MLflow metrics tracking, scikit-learn metrics, Spark MLlib evaluators
Deployment → MLflow model registry, Batch prediction pipelines, Power BI dashboards
Further study
To learn more, please visit the following:
Analytics end-to-end with Microsoft Fabric
https://learn.microsoft.com/en-us/azure/architecture/examplescenario/dataplate2e/data-platform-end-to-end
THANK YOU
Connect and continue the discussion through professional
networking.
www.linkedin.com/in/sadiqhahmed/
Sound off. The mic is all yours.
Influence the product roadmap.
Join the Fabric User Panel: Share your feedback directly with our Fabric product group and researchers.
https://aka.ms/JoinFabricUserPanel
Join the SQL User Panel: Influence our SQL roadmap and ensure it meets your real-life needs.
https://aka.ms/JoinSQLUserPanel