14 Feb 2022

Data Engineering using Databricks on AWS and Azure

Laser 14 Feb 2022 02:50 LEARNING » e-learning - Tutorial

Genre: eLearning | MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 kHz | Language: English | Size: 5.65 GB | Duration: 13h 50m

What you'll learn
Data Engineering leveraging Databricks features
Databricks CLI to manage files, Data Engineering jobs and clusters for Data Engineering Pipelines
Deploying Data Engineering applications developed using PySpark on job clusters
Deploying Data Engineering applications developed using PySpark using Notebooks on job clusters
Perform CRUD Operations leveraging Delta Lake using Spark SQL for Data Engineering Applications or Pipelines
Perform CRUD Operations leveraging Delta Lake using PySpark for Data Engineering Applications or Pipelines
Setting up development environment to develop Data Engineering applications using Databricks
Building Data Engineering Pipelines using Spark Structured Streaming on Databricks Clusters
Incremental File Processing using Spark Structured Streaming leveraging Databricks Auto Loader cloudFiles
Overview of Auto Loader cloudFiles File Discovery Modes - Directory Listing and File Notifications
Differences between Auto Loader cloudFiles File Discovery Modes - Directory Listing and File Notifications
Differences between traditional Spark Structured Streaming and leveraging Databricks Auto Loader cloudFiles for incremental file processing.
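The two file discovery modes named above are selected through a single Auto Loader option. The following is a minimal sketch, assuming the documented `cloudFiles.format`, `cloudFiles.schemaLocation`, and `cloudFiles.useNotifications` options. The helper name `auto_loader_options` and all paths are hypothetical, and the commented-out reader call only works on a Databricks cluster where `spark` is already defined.

```python
def auto_loader_options(file_format, schema_location, use_notifications=False):
    """Build the option map for a cloudFiles (Auto Loader) stream reader.

    use_notifications=False -> directory listing mode (the default).
    use_notifications=True  -> file notification mode.
    """
    return {
        "cloudFiles.format": file_format,
        "cloudFiles.schemaLocation": schema_location,
        # Reader options are passed as strings, so the flag becomes "true"/"false".
        "cloudFiles.useNotifications": str(use_notifications).lower(),
    }


# Hypothetical paths, used only for illustration.
listing_opts = auto_loader_options("json", "/tmp/demo/_schemas")
notification_opts = auto_loader_options(
    "json", "/tmp/demo/_schemas", use_notifications=True
)

# On a Databricks cluster (assumption: `spark` is the active SparkSession):
# df = (spark.readStream.format("cloudFiles")
#       .options(**listing_opts)
#       .load("/tmp/demo/landing"))
```

Keeping the options in a plain dictionary makes it easy to switch a pipeline between the two modes without touching the rest of the streaming code.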

Requirements
Programming experience using Python
Data Engineering experience using Spark
Ability to write and interpret SQL Queries
This course is ideal for experienced data engineers who want to add Databricks as a key skill on their profile
Description
As part of this course, you will learn Data Engineering using a cloud platform-agnostic technology called Databricks.
About Data Engineering
Data Engineering is nothing but processing the data depending upon our downstream needs. We need to build different pipelines such as Batch Pipelines, Streaming Pipelines, etc. as part of Data Engineering. All roles related to Data Processing are consolidated under Data Engineering. Conventionally, they are known as ETL Development, Data Warehouse Development, etc.
About Databricks
Databricks is one of the most popular cloud platform-agnostic data engineering tech stacks. Its team includes committers of the Apache Spark project. The Databricks Runtime provides Spark while leveraging the elasticity of the cloud. With Databricks, you pay for what you use. Over a period of time, they came up with the idea of the Lakehouse by providing all the features that are required for traditional BI as well as AI & ML. Here are some of the core features of Databricks.
Spark - Distributed Computing
Delta Lake - Perform CRUD Operations. It is primarily used to build capabilities such as inserting, updating, and deleting the data from files in Data Lake.
cloudFiles - Get the files in an incremental fashion in the most efficient way leveraging cloud features.
Databricks SQL - A Photon-based interface that is fine-tuned for running queries submitted for reporting and visualization by reporting tools. It is also used for Ad-hoc Analysis.
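To make the Delta Lake item above concrete, here is a minimal sketch of insert, update, and delete statements issued through Spark SQL. It assumes a session object that exposes `sql()`, as the `spark` session does on a Databricks cluster. The table name `demo_orders`, its columns, and the helper `run_crud` are hypothetical and not taken from the course material.

```python
# Hypothetical table and column names, used only for illustration.
TABLE = "demo_orders"

CRUD_STATEMENTS = [
    f"CREATE TABLE IF NOT EXISTS {TABLE} (order_id INT, status STRING) USING DELTA",
    f"INSERT INTO {TABLE} VALUES (1, 'NEW'), (2, 'NEW')",
    f"UPDATE {TABLE} SET status = 'SHIPPED' WHERE order_id = 1",
    f"DELETE FROM {TABLE} WHERE order_id = 2",
]


def run_crud(session, statements=CRUD_STATEMENTS):
    """Run each statement through session.sql and return how many were executed."""
    for statement in statements:
        session.sql(statement)
    return len(statements)


# On a Databricks cluster (assumption: `spark` is the active SparkSession):
# run_crud(spark)
```

The update and delete statements are the part that plain Parquet or CSV files in a data lake do not support directly; the Delta table format is what allows them here.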
Course Details
As part of this course, you will be learning Data Engineering using Databricks.
Getting Started with Databricks
Setup Local Development Environment to develop Data Engineering Applications using Databricks
Using Databricks CLI to manage files, jobs, clusters, etc. related to Data Engineering Applications
Spark Application Development Cycle to build Data Engineering Applications
Databricks Jobs and Clusters
Deploy and Run Data Engineering Jobs on Databricks Job Clusters as Python Application
Deploy and Run Data Engineering Jobs on Job Cluster using Notebooks
Deep Dive into Delta Lake using Dataframes
Deep Dive into Delta Lake using Spark SQL
Building Data Engineering Pipelines using Spark Structured Streaming on Databricks Clusters
Incremental File Processing using Spark Structured Streaming leveraging Databricks Auto Loader cloudFiles
Overview of Auto Loader cloudFiles File Discovery Modes - Directory Listing and File Notifications
Differences between Auto Loader cloudFiles File Discovery Modes - Directory Listing and File Notifications
Differences between traditional Spark Structured Streaming and leveraging Databricks Auto Loader cloudFiles for incremental file processing.
Overview of Databricks SQL for Data Analysis and reporting.
We will be adding a few more modules related to PySpark, Spark with Scala, Spark SQL, and Streaming Pipelines in the coming weeks.
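For the job cluster modules listed above, the following sketch shows what a job definition that runs a Python application on a new job cluster might look like. It assumes the field names of the Databricks Jobs API 2.1 (`tasks`, `task_key`, `new_cluster`, `spark_python_task`). The helper `job_payload`, the job name, the file path, and the placeholder runtime and node type values are all hypothetical and must be replaced with values valid for your workspace.

```python
import json


def job_payload(name, python_file, spark_version, node_type_id, num_workers=1):
    """Build a one-task job definition that runs a Python file on a new job cluster."""
    return {
        "name": name,
        "tasks": [
            {
                "task_key": "main",
                # A new_cluster block requests a cluster created just for this run.
                "new_cluster": {
                    "spark_version": spark_version,
                    "node_type_id": node_type_id,
                    "num_workers": num_workers,
                },
                "spark_python_task": {"python_file": python_file},
            }
        ],
    }


# Placeholder values (assumptions): substitute a real runtime version and node type.
payload = job_payload(
    name="demo-job",
    python_file="dbfs:/demo/app.py",
    spark_version="<runtime-version>",
    node_type_id="<node-type>",
)
print(json.dumps(payload, indent=2))
```

Running the same application from a notebook would, under the same assumption about the API, swap the `spark_python_task` block for a `notebook_task` block that points at a notebook path.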
Desired Audience
Here is the desired audience for this advanced course.
Experienced application developers with prior knowledge and experience of Spark who want to gain expertise in Data Engineering.
Experienced Data Engineers who want to gain enough skills to add Databricks to their profile.
Testers who want to improve their testing capabilities related to Data Engineering applications using Databricks.
Prerequisites
Logistics
Computer with decent configuration (At least 4 GB RAM, however 8 GB is highly desired)
Dual Core is required and Quad-Core is highly desired
Chrome Browser
High-Speed Internet
Valid AWS Account
Valid Databricks Account (free Databricks Account is not sufficient)
Experience as a Data Engineer, especially using Apache Spark
Knowledge about some of the cloud concepts such as storage, users, roles, etc.
Associated Costs
As part of the training, you will only get the material. You need to practice on your own or corporate cloud account and Databricks Account.
You need to take care of the associated AWS or Azure costs.
You need to take care of the associated Databricks costs.
Training Approach
Here are the details related to the training approach.
It is self-paced with reference material, code snippets, and videos provided as part of Udemy.
One needs to sign up for their own Databricks environment to practice all the core features of Databricks.
We would recommend completing 2 modules every week by spending 4 to 5 hours per week.
It is highly recommended to take care of all the tasks so that one can get real experience of Databricks.
Support will be provided through Udemy Q&A.
Who this course is for
Beginner or Intermediate Data Engineers who want to learn Databricks for Data Engineering
Intermediate Application Engineers who want to explore Data Engineering using Databricks
Data and Analytics Engineers who want to learn Data Engineering using Databricks
Testers who want to learn Databricks to test Data Engineering applications built using Databricks


