L19-Use Spark in Azure Databricks

Harini Mallawaarachchi

Azure Databricks is a Microsoft Azure-based version of the popular Databricks platform. It is built on the open-source Apache Spark engine and offers a highly scalable solution for data engineering and analysis tasks that involve working with data in files. One of the benefits of Spark is its support for a wide range of programming languages, including Java, Scala, Python, and SQL. This makes Spark a flexible solution for data processing workloads such as data cleansing and manipulation, statistical analysis and machine learning, and data analytics and visualization.



Before you start

You'll need an Azure subscription in which you have administrative-level access.

Review the Exploratory data analysis on Azure Databricks article in the Azure Databricks documentation.



Create a cluster

Explore data using a notebook



