Overview
This 5-day course gives participants the skills to plan and implement big data workflows on HDInsight.
Prerequisites
In addition to their professional experience, participants who attend this course should have:
- Programming experience using R, and familiarity with common R packages
- Knowledge of common statistical methods and data analysis best practices
- Basic knowledge of the Microsoft Windows operating system and its core functionality
- Working knowledge of relational databases
Who Should Attend?
This course is recommended for data engineers, data architects, data scientists, and data developers who plan to implement big data engineering workflows on HDInsight.
Course Outline
- Big Data
- Hadoop
- MapReduce
- HDInsight
Lab: Querying Big Data
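The first module introduces the MapReduce programming model. As a rough illustration of the map, shuffle, and reduce phases, here is a minimal pure-Python word-count sketch; real HDInsight jobs run on Hadoop, so this is conceptual only.

```python
from collections import defaultdict

def map_phase(documents):
    """Map: emit a (word, 1) pair for every word in every document."""
    for doc in documents:
        for word in doc.lower().split():
            yield word, 1

def shuffle_phase(pairs):
    """Shuffle: group all values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: sum the counts for each word."""
    return {word: sum(counts) for word, counts in groups.items()}

docs = ["big data on HDInsight", "big data pipelines"]
counts = reduce_phase(shuffle_phase(map_phase(docs)))
```

On a cluster, the map and reduce functions are what the developer supplies; the shuffle is handled by the framework.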
- HDInsight cluster types
- Managing HDInsight Clusters
- Managing HDInsight Clusters with PowerShell
Lab: Managing HDInsight clusters with the Azure Portal
- Non-domain-joined clusters
- Configuring domain-joined HDInsight clusters
- Managing domain-joined HDInsight clusters
Lab: Authorizing Users to Access Resources
- HDInsight Storage
- Data loading tools
- Performance and reliability
Lab: Loading Data into HDInsight
- Analyze HDInsight logs
- YARN logs
- Heap dumps
- Operations Management Suite
Lab: Troubleshooting HDInsight
- Apache Hive storage
- Querying with Hive and Pig
- Operationalize HDInsight
Lab: Querying with Hive and Pig
- What is Spark?
- ETL with Spark
- Spark performance
Lab: Design Batch ETL solutions for big data with Spark
- Implement interactive queries
- Perform exploratory data analysis
Lab: Analyze data with Spark SQL
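The Spark SQL module centers on interactive, exploratory SQL over large datasets. The flavor of such a query can be tried locally with the standard-library sqlite3 module standing in for a cluster; the table and data below are invented for illustration.

```python
import sqlite3

# In the lab this query would run through Spark SQL on an HDInsight cluster;
# sqlite3 (stdlib) stands in here so the SQL itself can be tried locally.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (page TEXT, duration_sec INTEGER)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [("home", 12), ("home", 30), ("pricing", 45), ("docs", 60), ("docs", 20)],
)

# Exploratory aggregation: hit count and average time per page, busiest first.
rows = conn.execute(
    """
    SELECT page, COUNT(*) AS hits, AVG(duration_sec) AS avg_sec
    FROM visits
    GROUP BY page
    ORDER BY hits DESC, page
    """
).fetchall()
```

The same GROUP BY/ORDER BY pattern carries over directly to Spark SQL; only the engine underneath changes.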
- Implement interactive queries for big data with Interactive Hive
- Perform exploratory data analysis by using Hive
- Perform interactive processing by using Apache Phoenix
Lab: Analyze data with Hive and Phoenix
- Stream Analytics
- Process streaming data from Stream Analytics
- Managing Stream Analytics jobs
Lab: Implement Stream Analytics
- DStream
- Create Spark structured streaming applications
- Persistence and visualization
Lab: Spark streaming applications using DStream API
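The DStream API processes a stream as a sequence of micro-batches with running state maintained across them. The pure-Python sketch below imitates that idea (batch slicing plus a running total); the stream values and batch size are invented, and real applications would use Spark's DStream API on a cluster.

```python
def micro_batches(events, batch_size):
    """Slice an event stream into fixed-size micro-batches,
    the core idea behind Spark's DStream abstraction."""
    batch = []
    for event in events:
        batch.append(event)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch

# Keep a running total across batches, analogous to stateful
# stream processing where state persists between micro-batches.
stream = [3, 1, 4, 1, 5, 9, 2, 6]
totals = []
running = 0
for batch in micro_batches(stream, batch_size=3):
    running += sum(batch)
    totals.append(running)
```

Each yielded batch plays the role of one RDD in a DStream; the `running` variable plays the role of the state Spark would checkpoint for you.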
- Persist long-term data
- Stream data with Storm
- Create Storm topologies
- Configure Apache Storm
Lab: Developing big data real-time processing solutions with Apache Storm