757-216-3656 | Monday–Friday 8:30 AM – 4:30 PM | info@itdojo.com

Course Duration

1 Day

Audience

Employees of federal, state, and local governments, and of businesses working with the government.

Prerequisites

At least one year of data analytics experience or direct experience building real-time applications or streaming analytics solutions.

Course Description

This course teaches students how to build real-time streaming data pipelines on AWS using services such as Amazon Kinesis Data Streams, Kinesis Data Firehose, Amazon MSK (Managed Streaming for Apache Kafka), and AWS Lambda. Students learn to ingest, process, and analyze continuous data streams for operational and analytical use cases.

Learning Objectives

  • Understand the features and benefits of a modern data architecture and how AWS streaming services fit into it
  • Design and implement a streaming data analytics solution
  • Identify and apply appropriate techniques, such as compression, sharding, and partitioning, to optimize data storage
  • Select and deploy appropriate options to ingest, transform, and store real-time and near real-time data
  • Choose the appropriate streams, clusters, topics, scaling approach, and network topology for a particular business use case
  • Understand how data storage and processing affect the analysis and visualization mechanisms needed to gain actionable business insights
  • Secure streaming data at rest and in transit
  • Monitor analytics workloads to identify and remediate problems
  • Apply cost management best practices

Course Outline

  • Module A: Overview of Data Analytics and the Data Pipeline
  • Module 1: Introduction to Amazon EMR
  • Module 2: Data Analytics Pipeline Using Amazon EMR: Ingestion and Storage
  • Module 3: High-Performance Batch Data Analytics Using Apache Spark on Amazon EMR
  • Module 4: Processing and Analyzing Batch Data with Amazon EMR and Apache Hive
  • Module 5: Serverless Data Processing
  • Module 6: Security and Monitoring of Amazon EMR Clusters
  • Module 7: Designing Batch Data Analytics Solutions
  • Module B: Developing Modern Data Architectures on AWS

Get More Information

We do not work with the general public. We work only with government agencies, the military, government contractors, and corporate clients.