Building Streaming Data Analytics Solutions on AWS

Description

Overview

In this course, you will learn to build streaming data analytics solutions using AWS services, including Amazon Kinesis and Amazon Managed Streaming for Apache Kafka (Amazon MSK). Amazon Kinesis is a massively scalable and durable real-time data streaming service. Amazon MSK offers a secure, fully managed, and highly available Apache Kafka service. You will learn how Amazon Kinesis and Amazon MSK integrate with AWS services such as AWS Glue and AWS Lambda. The course addresses the streaming data ingestion, stream storage, and stream processing components of the data analytics pipeline. You will also learn to apply security, performance, and cost management best practices to the operation of Kinesis and Amazon MSK.

Course Objectives

After completing this course, students will be able to:

  • Understand the features and benefits of a modern data architecture and how AWS streaming services fit into it
  • Design and implement a streaming data analytics solution
  • Identify and apply appropriate techniques, such as compression, sharding, and partitioning, to optimize data storage
  • Select and deploy appropriate options to ingest, transform, and store real-time and near real-time data
  • Choose the appropriate streams, clusters, topics, scaling approach, and network topology for a particular business use case
  • Understand how data storage and processing affect the analysis and visualization mechanisms needed to gain actionable business insights
  • Secure streaming data at rest and in transit
  • Monitor analytics workloads to identify and remediate problems
  • Apply cost management best practices

Who Should Attend

This course is intended for:

  • Data engineers and architects
  • Developers who want to build and manage real-time applications and streaming data analytics solutions

Course Outline

This course includes presentations, practice labs, discussions, and class exercises.

Module A: Overview of Data Analytics and the Data Pipeline

  • Data analytics use cases
  • Using the data pipeline for analytics

Module 1: Using Streaming Services in the Data Analytics Pipeline

  • The importance of streaming data analytics
  • The streaming data analytics pipeline
  • Streaming concepts

Module 2: Introduction to AWS Streaming Services

  • Streaming data services in AWS
  • Amazon Kinesis in analytics solutions
  • Demonstration: Explore Amazon Kinesis Data Streams
  • Practice Lab: Setting up a streaming delivery pipeline with Amazon Kinesis
  • Using Amazon Kinesis Data Analytics
  • Introduction to Amazon MSK
  • Overview of Spark Streaming

Module 3: Using Amazon Kinesis for Real-time Data Analytics

  • Exploring Amazon Kinesis using a clickstream workload
  • Creating Kinesis data and delivery streams
  • Demonstration: Understanding producers and consumers
  • Building stream producers
  • Building stream consumers
  • Building and deploying Flink applications in Kinesis Data Analytics
  • Demonstration: Explore Zeppelin notebooks for Kinesis Data Analytics
  • Practice Lab: Streaming analytics with Amazon Kinesis Data Analytics and Apache Flink

Module 4: Securing, Monitoring, and Optimizing Amazon Kinesis

  • Optimize Amazon Kinesis to gain actionable business insights
  • Security and monitoring best practices

Module 5: Using Amazon MSK in Streaming Data Analytics Solutions

  • Use cases for Amazon MSK
  • Creating MSK clusters
  • Demonstration: Provisioning an MSK Cluster
  • Ingesting data into Amazon MSK
  • Practice Lab: Introduction to access control with Amazon MSK
  • Transforming and processing in Amazon MSK

Module 6: Securing, Monitoring, and Optimizing Amazon MSK

  • Optimizing Amazon MSK
  • Demonstration: Scaling up Amazon MSK storage
  • Practice Lab: Amazon MSK streaming pipeline and application deployment
  • Security and monitoring
  • Demonstration: Monitoring an MSK cluster

Module 7: Designing Streaming Data Analytics Solutions

  • Use case review
  • Class Exercise: Designing a streaming data analytics workflow

Module B: Developing Modern Data Architectures on AWS

  • Modern data architectures

Prerequisites

We recommend that attendees of this course have:
