Duration

3 Days

Audience

Employees of federal, state, and local governments, and of businesses working with the government.

Course Overview

R is a functional programming environment for business analysts and data scientists. Many non-programmers find the language easy to work with because it naturally extends a skill set common to advanced Excel users. It’s the ideal tool when an analyst has a statistical, numerical, or probability-based problem grounded in real data and has pushed Excel past its limits.

Introduction to R Programming for Data Science & Analytics is a hands-on course that presents common scenarios encountered in analysis, along with practical solutions to them.

Learning Objectives

This course provides a practical grounding in the leading-edge data science technologies centered on R and related tools. Working in a hands-on learning environment led by our expert practitioner, you’ll explore R and its ecosystem and learn where it’s a better tool than Excel.

This course is approximately 50% hands-on, combining expert lecture, real-world demonstrations, and group discussions with machine-based practical labs and exercises. Our instructors and mentors are highly experienced practitioners who bring years of current on-the-job experience into every classroom. Guided by our expert team, attendees will learn about and explore:

  • Data Science essentials
  • R Programming Essentials
  • Variables and Types, Loops, R Scalars, Vectors, and Matrices
  • String and Text Manipulation, Lists & Functions
  • DataFrames and File I/O
  • Reading data from files and data prep
  • Visualization
  • Exploration With dplyr
  • Statistical Modeling With R
  • Data Exploration
  • Regressions
  • R and Big Data

Course Outline

  1. Session: Data Science Essentials
  • Data Science
  • Process of Doing Data Science
  2. Session: Introducing R
  • R Essentials
  • Variables and Types
  • Control Structures (Loops / Conditionals)
  • R Scalars, Vectors, and Matrices
    • Defining R Vectors
    • Matrices
  • String and Text Manipulation
    • Character Data Type
    • File I/O
  • Lists
  • Functions
    • Introducing Functions
    • Closures
    • lapply/sapply functions
  • DataFrames
  3. Session: Intermediate R
  • DataFrames and File I/O
  • Reading data from files
  • Data Preparation
  • Built-in Datasets
  • Visualization
    • Graphics Package
    • plot() / barplot() / hist() / boxplot() / scatter plot
    • Heat Map
    • ggplot2 package (qplot(), ggplot())
  • Exploration With dplyr
  4. Session: Analytics With R
  • Statistical Modeling With R
    • Statistical Functions
    • Dealing With NA
    • Distributions (Binomial, Poisson, Normal)
  • Data Exploration
  • Regressions
    • Linear Regressions
    • Logistic Regressions
  • Text Processing (tm package / Wordclouds)
  • R and Big Data