
Ingest data with Spark and Microsoft Fabric notebooks

Harini Mallawaarachchi

In this lab, you’ll create a Microsoft Fabric notebook and use PySpark to connect to an Azure Blob Storage path, then load the data into a lakehouse using write optimizations.
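The flow described above can be sketched roughly as follows. This is a minimal sketch, not the lab's actual code: it assumes a Fabric notebook where a `spark` session is already available, and that the source files are Parquet. The storage account, container, path, and the function names are hypothetical placeholders. The two configuration keys are the Fabric settings for V-Order and optimized write as commonly documented; they may differ by runtime version, so check them against your environment.

```python
def build_wasbs_path(account: str, container: str, relative_path: str) -> str:
    """Build a wasbs:// URL that points at an Azure Blob Storage path."""
    cleaned = relative_path.lstrip("/")
    return f"wasbs://{container}@{account}.blob.core.windows.net/{cleaned}"


def ingest_to_delta(spark, source_path: str, table_name: str) -> None:
    """Read Parquet files from source_path and save them as a Delta table.

    `spark` is the SparkSession that a Fabric notebook provides.
    """
    # Assumed Fabric write-optimization settings; names may vary by runtime.
    spark.conf.set("spark.sql.parquet.vorder.enabled", "true")
    spark.conf.set("spark.microsoft.delta.optimizeWrite.enabled", "true")

    # Assumption: the external data is stored as Parquet.
    df = spark.read.parquet(source_path)
    df.write.format("delta").mode("overwrite").saveAsTable(table_name)


# Hypothetical placeholder values, not taken from the lab.
source_path = build_wasbs_path(
    "examplestorageaccount", "examplecontainer", "data/sample"
)
print(source_path)
```

In a notebook cell, you would then call something like `ingest_to_delta(spark, source_path, "example_table")`, where the table name is again a placeholder.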


This lab will take approximately 30 minutes to complete.


In this lab, you’ll build the code across multiple notebook code cells. That may not reflect how you would do it in your own environment, but it can be useful for debugging.

Because you’re also working with a sample dataset, the optimization gains won’t match what you might see in production at scale. You can still see an improvement, though, and when every millisecond counts, optimization is key.
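One simple way to see that improvement for yourself is to time a write before and after enabling the optimizations. The helper below is a minimal sketch using only the Python standard library. The name `timed` and the labels are hypothetical, and the `sum(...)` call is just a stand-in for the DataFrame write you would actually measure.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label: str, results: dict):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


timings = {}

with timed("baseline_write", timings):
    # Stand-in workload; in a notebook this would be the table write.
    sum(range(100_000))

print(f"baseline_write took {timings['baseline_write']:.4f} seconds")
```

Because a Spark write is an action, it runs when called, so wrapping it in a block like this captures the full write time. Run each variant more than once before comparing, since a single measurement on a small dataset can be noisy.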



Note: You need a Microsoft school or work account to complete this exercise. If you don’t have one, you can sign up for a trial of Microsoft Office 365 E3 or higher.


Create a workspace

Create a Lakehouse

Create a Fabric notebook and load external data

Transform and load data to a Delta table

Optimize Delta table writes

Analyze Delta table data with SQL queries

Clean up resources

* Using Copilot




