About the course
This intensive, two-day hands-on workshop provides a comprehensive guide to utilizing AWS CloudWatch for complete observability across your applications and infrastructure. We move beyond basic monitoring to focus on best practices for operational excellence.
You will gain mastery over the three pillars of observability: Metrics, Logs, and Events. The course covers setting up custom application metrics, using CloudWatch Logs Insights for troubleshooting, creating actionable Alarms, and automating responses with EventBridge. By the end, you'll be able to build comprehensive, effective dashboards that ensure high availability and rapid issue resolution.
Instructor-led online and in-house face-to-face options are available, either as part of a wider customised training programme or as a standalone workshop, on-site at your offices or at one of many flexible meeting spaces in the UK and around the world.
-
- Implement Comprehensive Monitoring: Configure and visualize standard metrics from core AWS services (EC2, Lambda, DynamoDB).
- Create Custom Metrics: Define, publish, and aggregate custom application metrics to track business-specific key performance indicators (KPIs).
- Centralize Logging: Configure the CloudWatch Agent to collect, stream, and structure application and system logs from various sources.
- Troubleshoot with Logs: Use CloudWatch Logs Insights and advanced querying techniques to rapidly search, filter, and analyze log data.
- Set Up Proactive Alerts: Create effective CloudWatch Alarms based on metric thresholds and composite conditions, integrated with SNS for notifications.
- Automate Responses: Use Amazon EventBridge to detect changes in your AWS environment and trigger automated actions (e.g., scaling, patching, healing).
- Build Observability Dashboards: Design and implement comprehensive CloudWatch Dashboards for operational visibility and team-wide reporting.
-
This course is ideal for Developers, DevOps Engineers, and Operations Staff who are responsible for the performance, reliability, and troubleshooting of applications running on Amazon Web Services.
-
Attendees should have attended our AWS Cloud Fundamentals training or have an equivalent working knowledge of AWS services (e.g., EC2, S3, Lambda) and familiarity with the AWS Management Console.
-
This CloudWatch course is available for private or custom delivery for your team, either as an in-house face-to-face workshop at your location of choice or as online instructor-led training via MS Teams (or your own preferred platform).
Get in touch to find out how we can deliver tailored training which focuses on your project requirements and learning goals.
-
CloudWatch Fundamentals and Metrics
The Three Pillars of Observability: Metrics, Logs, and Events.
Overview of the CloudWatch dashboard and navigation.
Standard metrics for core AWS services (EC2, Lambda, S3).
Understanding metric resolution, statistics, and dimensions.
Hands-on Lab: Reviewing and visualizing AWS service metrics.
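As a rough illustration of how period, statistics, and dimensions fit together, the sketch below builds a request for the Boto3 `get_metric_statistics` call. The instance ID, the three-hour window, and the helper names are assumptions made for this example, not part of the course material.

```python
from datetime import datetime, timedelta, timezone


def build_statistics_request(instance_id, hours=3, period=300, end=None):
    """Build keyword arguments for CloudWatch GetMetricStatistics."""
    end = end or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        # A dimension narrows the metric to one resource.
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        # Period is the width, in seconds, of each aggregated data point.
        "Period": period,
        # Each statistic is computed per period.
        "Statistics": ["Average", "Maximum"],
    }


def fetch_datapoints(request, client=None):
    """Call CloudWatch; needs boto3 and AWS credentials when no client is given."""
    if client is None:
        import boto3  # imported lazily so the builder works without the SDK
        client = boto3.client("cloudwatch")
    response = client.get_metric_statistics(**request)
    return sorted(response["Datapoints"], key=lambda point: point["Timestamp"])


# Hypothetical instance ID, used only to show the request shape.
request = build_statistics_request(
    "i-0123456789abcdef0",
    end=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
)
```

Keeping the request builder separate from the network call makes it easy to inspect the parameters before sending anything to AWS.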
Publishing Custom Metrics
When and why to use custom metrics (tracking business KPIs).
Methods for publishing custom metrics (AWS CLI, SDK/Boto3).
Understanding storage resolution and namespaces.
Hands-on Lab: Publishing custom application metrics from a Python script to CloudWatch.
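A minimal sketch of what publishing a custom metric from Python might look like, using the Boto3 `put_metric_data` call. The namespace `WorkshopDemo/Shop`, the metric name `OrdersPlaced`, and the helper functions are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timezone


def build_metric_datum(name, value, unit="Count", dimensions=None, high_resolution=False):
    """Build one entry for the MetricData list accepted by PutMetricData."""
    datum = {
        "MetricName": name,
        "Value": float(value),
        "Unit": unit,
        "Timestamp": datetime.now(timezone.utc),
        # StorageResolution: 1 = high resolution (1 second), 60 = standard.
        "StorageResolution": 1 if high_resolution else 60,
    }
    if dimensions:
        datum["Dimensions"] = [{"Name": k, "Value": v} for k, v in dimensions.items()]
    return datum


def publish(namespace, data, client=None):
    """Send data points to CloudWatch; needs boto3 and AWS credentials."""
    if client is None:
        import boto3  # imported lazily so the builder works without the SDK
        client = boto3.client("cloudwatch")
    return client.put_metric_data(Namespace=namespace, MetricData=data)


# Hypothetical business KPI: number of orders placed.
datum = build_metric_datum("OrdersPlaced", 3, dimensions={"Environment": "test"})
# publish("WorkshopDemo/Shop", [datum])  # uncomment once credentials are configured
```

The namespace groups related metrics together, while `StorageResolution` controls whether the data point is stored at standard or high resolution.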
CloudWatch Logs and Agents
Core concepts: Log Groups, Log Streams, and Retention Policies.
Configuring the CloudWatch Agent to collect application and system logs (Linux/Windows).
Structured logging vs. unstructured logging.
Hands-on Lab: Installing the CloudWatch Agent on an EC2 instance and verifying log ingestion.
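To make the structured versus unstructured distinction concrete, here is a small sketch that emits one JSON object per log line using only the Python standard library. The logger name, the `request_id` and `latency_ms` fields, and the in-memory stream (standing in for a log file the agent would collect) are assumptions for this example; a timestamp field is left out for brevity.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Values passed via `extra=` become attributes on the record.
        for key in ("request_id", "latency_ms"):
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        return json.dumps(payload)


stream = io.StringIO()  # stands in for the file the agent would tail
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("workshop.demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the example output out of the root logger

logger.info("checkout complete", extra={"request_id": "abc-123", "latency_ms": 87})
line = stream.getvalue().strip()
```

Emitting named fields such as `latency_ms` means later queries can filter and aggregate on them directly rather than parsing free text.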
Log Analysis and Troubleshooting
Searching and filtering log data via the console.
Introduction to CloudWatch Logs Insights query language.
Analyzing request latency, error rates, and user patterns in logs.
Hands-on Lab: Using Logs Insights to run advanced queries and troubleshoot a simulated application error.
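A hedged sketch of running a Logs Insights query from Python with the Boto3 `start_query` and `get_query_results` calls. The query text, the one-hour window, the polling limits, and the helper names are illustrative assumptions; the log group name is whatever your lab environment uses.

```python
import time

# Illustrative query: the 20 most recent messages containing "ERROR".
QUERY = """
fields @timestamp, @message
| filter @message like /ERROR/
| sort @timestamp desc
| limit 20
"""


def build_start_query_args(log_group, query, minutes=60, now=None):
    """Build keyword arguments for the CloudWatch Logs StartQuery call."""
    end = int(now if now is not None else time.time())
    return {
        "logGroupName": log_group,
        "startTime": end - minutes * 60,  # epoch seconds
        "endTime": end,
        "queryString": query.strip(),
    }


def run_query(args, client=None, poll_seconds=1.0, max_polls=30):
    """Start the query and poll until it finishes or the poll budget runs out."""
    if client is None:
        import boto3  # imported lazily so the builder works without the SDK
        client = boto3.client("logs")
    query_id = client.start_query(**args)["queryId"]
    response = {"status": "Unknown", "results": []}
    for _ in range(max_polls):
        response = client.get_query_results(queryId=query_id)
        if response["status"] in ("Complete", "Failed", "Cancelled", "Timeout"):
            break
        time.sleep(poll_seconds)
    return response
```

Queries run asynchronously, which is why the sketch polls for results instead of expecting them from the first call.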
CloudWatch Alarms and Notifications
Creating metric alarms and setting appropriate thresholds.
Understanding different alarm states (OK, ALARM, INSUFFICIENT_DATA).
Integrating alarms with Amazon SNS (Simple Notification Service) for alerts (email, Slack/Teams).
Creating Composite Alarms for advanced state logic.
Hands-on Lab: Configuring high-utilization alarms and setting up SNS notifications.
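As one possible shape for a high-utilization alarm, the sketch below builds the parameters for the Boto3 `put_metric_alarm` call. The 80% threshold, the "2 of 3 periods" rule, the alarm name, and the instance ID and topic ARN are assumptions for illustration only.

```python
def build_cpu_alarm(instance_id, topic_arn, threshold=80.0):
    """Build keyword arguments for CloudWatch PutMetricAlarm."""
    return {
        "AlarmName": f"high-cpu-{instance_id}",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,            # seconds per evaluated data point
        "EvaluationPeriods": 3,   # look at the last 3 periods...
        "DatapointsToAlarm": 2,   # ...and alarm if 2 of them breach
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        # With "missing", gaps in data are not counted as breaching or healthy.
        "TreatMissingData": "missing",
        # SNS topic notified when the alarm enters the ALARM state.
        "AlarmActions": [topic_arn],
    }


def create_alarm(params, client=None):
    """Create or update the alarm; needs boto3 and AWS credentials."""
    if client is None:
        import boto3  # imported lazily so the builder works without the SDK
        client = boto3.client("cloudwatch")
    return client.put_metric_alarm(**params)


# Hypothetical identifiers, shown only to illustrate the parameter shape.
alarm = build_cpu_alarm(
    "i-0123456789abcdef0",
    "arn:aws:sns:eu-west-2:123456789012:ops-alerts",
)
```

Requiring two breaching data points out of three is one way to avoid notifications for a single short spike.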
Dashboards and Visualization
Best practices for dashboard design (for ops teams vs. business users).
Adding metrics, logs, and text widgets to a dashboard.
Using Metric Math to perform calculations on existing metrics (e.g., calculating error rates).
Hands-on Lab: Building a comprehensive operational dashboard combining custom metrics, log queries, and alarms.
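The following sketch shows how a dashboard with a Metric Math error-rate widget might be defined as JSON and passed to the Boto3 `put_dashboard` call. The function name `checkout-handler`, the region `eu-west-2`, the dashboard name, and the widget layout are assumptions made for this example.

```python
import json


def build_dashboard_body(function_name, region="eu-west-2"):
    """Return a dashboard body (JSON string) with a text and a metric widget."""
    error_rate_widget = {
        "type": "metric",
        "x": 0, "y": 1, "width": 12, "height": 6,
        "properties": {
            "title": "Lambda error rate (%)",
            "region": region,
            "view": "timeSeries",
            "period": 300,
            "stat": "Sum",
            "metrics": [
                # Metric Math expression that refers to the ids defined below.
                [{"expression": "100 * errors / invocations",
                  "label": "Error rate (%)", "id": "rate"}],
                # Source metrics are hidden; only the computed rate is drawn.
                ["AWS/Lambda", "Errors", "FunctionName", function_name,
                 {"id": "errors", "visible": False}],
                ["AWS/Lambda", "Invocations", "FunctionName", function_name,
                 {"id": "invocations", "visible": False}],
            ],
        },
    }
    heading_widget = {
        "type": "text",
        "x": 0, "y": 0, "width": 12, "height": 1,
        "properties": {"markdown": "# Checkout service"},
    }
    return json.dumps({"widgets": [heading_widget, error_rate_widget]})


def publish_dashboard(name, body, client=None):
    """Create or replace the dashboard; needs boto3 and AWS credentials."""
    if client is None:
        import boto3  # imported lazily so the builder works without the SDK
        client = boto3.client("cloudwatch")
    return client.put_dashboard(DashboardName=name, DashboardBody=body)


body = build_dashboard_body("checkout-handler")  # hypothetical function name
```

Defining dashboards as data in this way lets them be reviewed and version-controlled alongside application code.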
Amazon EventBridge (formerly CloudWatch Events)
The concept of Event-Driven Architecture on AWS.
Creating Rules based on AWS service state changes (e.g., EC2 status change, Auto Scaling events).
Defining targets for automated actions (e.g., Lambda, SQS, Systems Manager Run Command).
Hands-on Lab: Setting up an EventBridge rule to automatically notify an administrator when an EC2 instance enters a specific state.
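A rule of this kind could be sketched as follows, using the Boto3 EventBridge `put_rule` and `put_targets` calls with an SNS topic as the target. The rule name, target ID, chosen state (`stopped`), and topic ARN are hypothetical; the sketch also assumes the topic's access policy already allows EventBridge to publish to it.

```python
import json


def build_state_change_pattern(states):
    """Event pattern matching EC2 instances entering any of the given states."""
    return {
        "source": ["aws.ec2"],
        "detail-type": ["EC2 Instance State-change Notification"],
        "detail": {"state": list(states)},
    }


def create_notification_rule(rule_name, pattern, topic_arn, client=None):
    """Create the rule and attach an SNS target; needs boto3 and credentials."""
    if client is None:
        import boto3  # imported lazily so the builder works without the SDK
        client = boto3.client("events")
    client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(pattern),
        State="ENABLED",
    )
    # Assumes the SNS topic policy permits events.amazonaws.com to publish.
    return client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "notify-admin", "Arn": topic_arn}],
    )


pattern = build_state_change_pattern(["stopped"])
# create_notification_rule(
#     "ec2-stopped-notify",                                # hypothetical rule name
#     pattern,
#     "arn:aws:sns:eu-west-2:123456789012:ops-alerts",     # hypothetical topic ARN
# )
```

Listing several states in the pattern, for example `["stopped", "terminated"]`, matches an event when the instance enters any one of them.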
Monitoring AI/ML Workloads and Best Practices
Using CloudWatch to monitor Lambda and API Gateway for Bedrock integration.
Advanced cost optimization tips for CloudWatch Logs.
Reviewing monitoring best practices and next steps.
-
Core AWS CloudWatch Documentation
These are the primary sources for understanding the CloudWatch service, its components, and its capabilities:
AWS CloudWatch Product Page and Pricing:
The central hub for feature announcements and general service overview.
CloudWatch User Guide:
The authoritative documentation covering Metrics, Logs, Alarms, and Dashboards in depth.
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/WhatIsCloudWatch.html
CloudWatch Logs Insights Query Syntax:
The detailed reference guide for the specialized query language used in Log Analysis (Module 4).
https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CWL_QuerySyntax.html
Amazon EventBridge User Guide:
Documentation for the event-driven service used for automation and integration (Module 7).
https://docs.aws.amazon.com/eventbridge/latest/userguide/what-is-eventbridge.html
Tools and SDKs for Hands-on Labs
Participants will need these tools to configure their environment and publish custom data:
AWS CLI (Command Line Interface):
Essential for configuring credentials, managing resources, and publishing custom metrics via command line.
Boto3 SDK (Python) CloudWatch Client:
Documentation for the Python SDK used to publish custom metrics programmatically (Module 2).
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/cloudwatch.html
CloudWatch Agent Documentation:
Detailed instructions for installing and configuring the agent used to collect system and application logs from EC2 instances (Module 3).
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Install-CloudWatch-Agent.html
Development Environment
VS Code (Visual Studio Code):
You're welcome to use your own preferred IDE during the course. Your instructor will most likely be using VS Code for writing Python scripts and managing local files in the hands-on metric labs.
Python Extension for VS Code (Microsoft):
Provides debugging, linting, and IntelliSense support when writing custom metric publishers with the Boto3 SDK.
https://marketplace.visualstudio.com/items?itemName=ms-python.python