Duration:
2 Days
Audience:
Employees of federal, state, and local governments, and businesses working with the government.
This course is perfect for:
- Data Analyst professionals
- Business Analyst professionals
- Business Intelligence professionals
- Cloud Data Engineers who will be partnering with Data Analysts to build scalable data solutions on Google Cloud Platform
Course Overview:
Want to know how to query and process petabytes of data in seconds? Curious about data analysis that scales automatically as your data grows? Welcome to the Data Insights course!
This two-day, instructor-led course teaches participants how to derive insights through data analysis and visualization using Google Cloud Platform. The course features interactive scenarios and hands-on labs where participants explore, mine, load, visualize, and extract insights from diverse Google BigQuery datasets. The course covers data loading, querying, schema modeling, performance optimization, query pricing, and data visualization.
Course Outline:
Before and Now: Scalable Data Analysis in the Cloud
- Highlight Analytics Challenges Faced by Data Analysts
- Compare Big Data On-Premises vs. in the Cloud
- Learn from Real-World Use Cases of Companies Transformed Through Analytics on the Cloud
- Navigate Google Cloud Platform Project Basics
Module 2: Big Data Tools Overview
Sharpen the Tools in your Data Analyst toolkit
- Walk Through Data Analyst Tasks and Challenges, and Introduce Google Cloud Platform Data Tools
- Demo: Analyze 10 Billion Records with Google BigQuery
- Explore 9 Fundamental Google BigQuery Features
- Compare GCP Tools for Analysts, Data Scientists, and Data Engineers
Module 3: Exploring your Data
Get Familiar with Google BigQuery and Learn SQL Best Practices
- Compare Common Data Exploration Techniques
- Learn How to Code High Quality Standard SQL
- Explore Google BigQuery Public Datasets
- Visualization Preview: Google Data Studio
Module 4: Google BigQuery Pricing
Calculate Google BigQuery Storage and Query Costs
- Walkthrough of a BigQuery Job
- Calculate BigQuery Pricing: Storage, Querying, and Streaming Costs
- Optimize Queries for Cost
Module 5: Cleaning and Transforming your Data
Wrangle your Raw Data into a Cleaner and Richer Dataset
- Examine the 5 Principles of Dataset Integrity
- Characterize Dataset Shape and Skew
- Clean and Transform Data using SQL
- Clean and Transform Data using a new UI: Introducing Cloud Dataprep
Module 6: Storing and Exporting Data
Create New Tables and Export Results
- Compare Permanent vs. Temporary Tables
- Save and Export Query Results
- Performance Preview: Query Cache
Module 7: Ingesting New Datasets into Google BigQuery
Bring your Data into the Cloud
- Query from External Data Sources
- Avoid Data Ingestion Pitfalls
- Ingest New Data into Permanent Tables
- Discuss Streaming Inserts
Module 8: Data Visualization
Effectively Explore and Explain Data through Visualization
- Overview of Data Visualization Principles
- Exploratory vs. Explanatory Analysis Approaches
- Demo: Google Data Studio UI
- Connect Google Data Studio to Google BigQuery
Module 9: Joining and Merging Datasets
Combine and Enrich Datasets with More Data
- Merge Historical Data Tables with UNION
- Introduce Table Wildcards for Easy Merges
- Review Data Schemas: Linking Data Across Multiple Tables
- Walk Through JOIN Examples and Pitfalls
Module 10: Google BigQuery Tables Deep Dive
What Sets Cloud Architecture Apart?
- Compare Data Warehouse Storage Methods
- Deep-Dive into Column-Oriented Storage
- Examine Logical Views, Date-Partitioned Tables, and Best Practices
- Query the Past with Time Travelling Snapshots
Module 11: Schema Design and Nested Data Structures
Model Datasets for Scale in Google BigQuery
- Compare Google BigQuery vs. Traditional RDBMS Data Architecture
- Normalization vs. Denormalization: Performance Trade-Offs
- Schema Review: The Good, The Bad, and The Ugly
- Arrays and Nested Data in Google BigQuery
Module 12: Advanced Visualization with Google Data Studio
Create Pixel-Perfect Dashboards
- Create CASE Statements and Calculated Fields
- Avoid Performance Pitfalls with Cache Considerations
- Share Dashboards and Discuss Data Access Considerations
Module 13: Advanced Functions and Clauses
Dive Deeper into Advanced Query Writing with Google BigQuery
- Review SQL CASE Statements
- Introduce Analytical Window Functions
- Safeguard Data with One-Way Field Encryption
- Discuss Effective Subquery and CTE Design
- Compare SQL and JavaScript UDFs
Module 14: Optimizing for Performance
Troubleshoot and Solve Query Performance Problems
- Avoid Google BigQuery Performance Pitfalls
- Prevent Hotspots in Data
- Diagnose Performance Issues with the Query Explanation Map
Module 15: Advanced Insights
Think, Analyze, and Share Insights Like a Data Scientist
- Distill Complex Queries
- Brainstorm Data-Driven Hypotheses
- Think like a Data Scientist
- Introduce Cloud Datalab
Module 16: Data Access
Keep Data Security Top-of-Mind in the Cloud
- Compare IAM and BigQuery Dataset Roles
- Avoid Access Pitfalls
- Review Members, Roles, Organizations, Account Administration, and Service Accounts
Labs
- Lab: Getting Started with Google Cloud Platform
- Lab: Exploring Datasets with Google BigQuery
- Lab: Troubleshoot Common SQL Errors
- Lab: Calculate Google BigQuery Pricing
- Lab: Explore and Shape Data with Cloud Dataprep
- Lab: Creating New Permanent Tables
- Lab: Ingesting and Querying New Datasets
- Lab: Exploring a Dataset in Google Data Studio
- Lab: Join and Union Data from Multiple Tables
- Lab: Querying Nested and Repeated Data
- Lab: Visualizing Insights with Google Data Studio
- Lab: Deriving Insights with Advanced SQL Functions
- Lab: Optimizing and Troubleshooting Query Performance
- Lab: Reading a Google Cloud Datalab Notebook