20773 Analyzing Big Data with Microsoft R

Overview

The main purpose of this 3-day course is to give participants the ability to use Microsoft R Server to create and run an analysis on a large dataset, and show how to utilize it in Big Data environments, such as a Hadoop or Spark cluster, or a SQL Server database.

Prerequisites

In addition to their professional experience, participants who attend this course should have:

  • Programming experience using R, and familiarity with common R packages
  • Knowledge of common statistical methods and data analysis best practices
  • Basic knowledge of the Microsoft Windows operating system and its core functionality
  • Working knowledge of relational databases

Who Should Attend?

This course is recommended for people who wish to analyze large datasets within a big data environment, and for developers who need to integrate R analyses into their solutions.

Course Outline

  • What is Microsoft R Server
  • Using Microsoft R Client
  • The ScaleR functions

Lab: Exploring Microsoft R Server and Microsoft R Client

  • Understanding ScaleR data sources
  • Reading data into an XDF object
  • Summarizing data in an XDF object

Lab: Exploring Big Data

  • Visualizing in-memory data
  • Visualizing big data

Lab: Visualizing data

  • Transforming Big Data
  • Managing datasets

Lab: Processing big data

  • Using the RxLocalParallel compute context with rxExec
  • Using the RevoPemaR package

Lab: Using rxExec and RevoPemaR to parallelize operations

  • Clustering Big Data
  • Generating regression models and making predictions

Lab: Creating a linear regression model

  • Creating partitioning models based on decision trees
  • Testing partitioning models by making and comparing predictions

Lab: Creating and evaluating partitioning models

  • Using R in SQL Server
  • Using Hadoop MapReduce
  • Using Spark on Hadoop

Lab: Processing big data in SQL Server and Hadoop
