Introduction to CloudWatch
What is CloudWatch?
Amazon CloudWatch is a monitoring service for your AWS resources and for the applications that you run on AWS.
What can CloudWatch do?
CloudWatch can monitor things like:
- Compute
  - Auto Scaling Groups
  - Elastic Load Balancers
  - Route 53 Health Checks
- Storage & Content Delivery
  - EBS Volumes
  - Storage Gateways
  - CloudFront
- Database & Analytics
  - DynamoDB
  - ElastiCache Nodes
  - RDS Instances
  - Elastic MapReduce Job Flows
  - Redshift
- Other
  - SNS Topics
  - SQS Queues
  - OpsWorks
  - CloudWatch Logs
  - Estimated Charges on your AWS Bill
CloudWatch and EC2
Host Level Metrics Consist of (by default):
- CPU
  - usage
- Network
  - usage
- Disk
  - overall disk throughput
- Status Check

If it isn't any of these, it's a custom metric.
CANNOT SEE:
- available storage space
- RAM utilization
EXAM TIP:
RAM utilization is a custom metric! By default, EC2 monitoring uses 5-minute intervals. If you enable detailed monitoring, the interval drops to 1 minute.
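Since RAM utilization has to be published as a custom metric, here is a minimal sketch of what such a data point could look like. It only builds a dictionary shaped like a PutMetricData request and does not call AWS. The namespace `Custom/System`, the metric name `MemoryUtilization`, and the instance ID are hypothetical values chosen for illustration.

```python
from datetime import datetime, timezone


def memory_utilization_percent(total_bytes: int, available_bytes: int) -> float:
    """Return used memory as a percentage of total memory."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    used_bytes = total_bytes - available_bytes
    return round(100.0 * used_bytes / total_bytes, 2)


def build_memory_metric(instance_id: str, total_bytes: int, available_bytes: int) -> dict:
    """Build keyword arguments shaped like a PutMetricData request."""
    return {
        "Namespace": "Custom/System",  # hypothetical namespace
        "MetricData": [
            {
                "MetricName": "MemoryUtilization",  # hypothetical metric name
                "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
                "Timestamp": datetime.now(timezone.utc),
                "Value": memory_utilization_percent(total_bytes, available_bytes),
                "Unit": "Percent",
            }
        ],
    }


# Hypothetical instance with 8 GiB of RAM, 2 GiB of which is still available.
request = build_memory_metric(
    "i-0123456789abcdef0",
    total_bytes=8 * 1024**3,
    available_bytes=2 * 1024**3,
)
print(request["MetricData"][0]["Value"])
```

Assuming boto3 is installed and credentials are configured, the dictionary could be sent with `boto3.client("cloudwatch").put_metric_data(**request)`.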
How Long are CloudWatch Metrics Stored?
You can retrieve data using the GetMetricStatistics API or by using third-party tools offered by AWS partners.
You can store your log data in CloudWatch Logs for as long as you want. By default, CloudWatch Logs stores your log data indefinitely. You can change the retention for each Log Group at any time.
You can retrieve data from any terminated EC2 or ELB instance after its termination.
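As a rough illustration of retrieving data with GetMetricStatistics, the sketch below builds the request parameters for one hour of EC2 CPU data and works out how many data points that time window can contain. It does not call AWS. The instance ID and the helper names are hypothetical, and the 5-minute period assumes standard monitoring.

```python
import math
from datetime import datetime, timedelta, timezone


def build_statistics_request(instance_id: str, end_time: datetime,
                             hours: int = 1, period_seconds: int = 300) -> dict:
    """Build keyword arguments shaped like a GetMetricStatistics request."""
    return {
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": end_time - timedelta(hours=hours),
        "EndTime": end_time,
        "Period": period_seconds,  # seconds per returned data point
        "Statistics": ["Average", "Maximum"],
    }


def expected_datapoints(request: dict) -> int:
    """Upper bound on data points: time window divided by the period."""
    window = (request["EndTime"] - request["StartTime"]).total_seconds()
    return math.ceil(window / request["Period"])


end = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
request = build_statistics_request("i-0123456789abcdef0", end)  # hypothetical ID
print(expected_datapoints(request))
```

Assuming boto3 is installed and configured, the same dictionary could be passed as `boto3.client("cloudwatch").get_metric_statistics(**request)`.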
Metric Granularity
Granularity depends on the AWS service. Many services publish their default metrics at 1-minute intervals, but it can be 3 or 5 minutes depending on the service.
EXAM TIP:
For standard-resolution custom metrics, the minimum granularity that you can have is 1 minute. (High-resolution custom metrics, listed under Extras, go down to 1 second.)
CloudWatch Alarms
You can create an alarm to monitor any Amazon CloudWatch metric in your account. This can include EC2 CPU Utilization, Elastic Load Balancer Latency, or even the charges on your AWS bill. You can set the thresholds at which the alarms trigger and the actions to take when an alarm state is reached. This will be covered in a subsequent lecture.
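To make the threshold-plus-action idea concrete, here is a minimal sketch that builds parameters shaped like a PutMetricAlarm request for EC2 CPU utilization. It does not call AWS. The alarm name pattern, the 80 percent threshold, the instance ID, and the SNS topic ARN are hypothetical values picked for illustration.

```python
def build_cpu_alarm(instance_id: str, topic_arn: str,
                    threshold_percent: float = 80.0) -> dict:
    """Build keyword arguments shaped like a PutMetricAlarm request."""
    if not 0.0 < threshold_percent <= 100.0:
        raise ValueError("threshold_percent must be in (0, 100]")
    return {
        "AlarmName": f"high-cpu-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,               # seconds per data point
        "EvaluationPeriods": 2,      # look at the 2 most recent data points
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arn],  # action to take in the ALARM state
    }


alarm = build_cpu_alarm(
    "i-0123456789abcdef0",                             # hypothetical instance
    "arn:aws:sns:us-east-1:123456789012:ops-alerts",   # hypothetical topic
)
print(alarm["AlarmName"], alarm["Threshold"])
```

Assuming boto3 is installed and configured, the dictionary could be passed as `boto3.client("cloudwatch").put_metric_alarm(**alarm)`.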
CloudWatch Exam Tips
CloudWatch - a monitoring service for your AWS resources, as well as the applications that you run on AWS
Host Level Metrics Consist of:
- CPU
- Network
- Disk
- Status Check
If it asks about a metric outside of this (RAM utilization, storage space on a virtual hard disk): custom metric
Custom Metrics - minimum granularity is 1 minute at standard resolution (high-resolution custom metrics go down to 1 second)
Terminated Instances - you can retrieve data from any terminated EC2 or ELB instance after its termination. CloudWatch Logs are stored indefinitely by default.
Metric Granularity
- 1 minute for detailed monitoring
- anything finer than 5 minutes (for example, 2 minute monitoring) requires detailed monitoring
- 5 minutes for standard monitoring
CloudWatch can also be used on premises. It is not restricted to just AWS resources. You just need to download and install the SSM agent and the CloudWatch agent.
Amazon CloudWatch Monitoring Scripts
You can use the Amazon CloudWatch Monitoring Scripts for Amazon Elastic Compute Cloud (Amazon EC2) Linux-based instances to produce and consume Amazon CloudWatch custom metrics.
Extras
- High Resolution custom metrics
  - You can publish Custom Metrics down to 1-second resolution
  - Alert sooner with High Resolution alarms, with periods as short as 10 seconds
  - https://aws.amazon.com/about-aws/whats-new/2017/07/amazon-cloudwatch-introduces-high-resolution-custom-metrics-and-alarms/
  - If you set an alarm on a high-resolution metric, you can specify a high-resolution alarm with a period of 10 seconds or 30 seconds, or you can set a regular alarm with a period of any multiple of 60 seconds.
- CloudWatch alarms have three settings you specify
  - Period
    - Length of time to evaluate the metric or expression to create each individual data point for an alarm
    - Expressed in seconds
    - If you choose one minute as the period, there is one data point every minute
  - Evaluation Periods
    - The number of most recent periods, or data points, to evaluate when determining alarm state
  - Datapoints to Alarm
    - Number of data points within the evaluation periods that must be breaching to cause the alarm to go to the ALARM state
    - The breaching data points do not have to be consecutive; they just must all fall within the most recent data points counted by Evaluation Periods
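The alarm settings above can be sketched in a few lines. This is a simplified model, not CloudWatch's actual implementation: it ignores missing data and the INSUFFICIENT_DATA state, and it assumes a "greater than threshold" comparison. The function names are hypothetical.

```python
from typing import Sequence


def is_valid_alarm_period(period_seconds: int, high_resolution: bool) -> bool:
    """10 or 30 seconds are allowed only on high-resolution metrics;
    any positive multiple of 60 seconds is allowed on every metric."""
    if period_seconds <= 0:
        return False
    if high_resolution and period_seconds in (10, 30):
        return True
    return period_seconds % 60 == 0


def alarm_state(datapoints: Sequence[float], threshold: float,
                evaluation_periods: int, datapoints_to_alarm: int) -> str:
    """Return "ALARM" when enough of the most recent data points breach."""
    if not 1 <= datapoints_to_alarm <= evaluation_periods:
        raise ValueError("datapoints_to_alarm must be between 1 and evaluation_periods")
    # Only the most recent `evaluation_periods` data points are considered.
    window = list(datapoints)[-evaluation_periods:]
    # Breaching points are counted; they do not need to be consecutive.
    breaching = sum(1 for value in window if value > threshold)
    return "ALARM" if breaching >= datapoints_to_alarm else "OK"


cpu = [40.0, 85.0, 50.0, 90.0, 60.0]  # one data point per period, oldest first
print(alarm_state(cpu, threshold=80.0, evaluation_periods=5, datapoints_to_alarm=2))
print(alarm_state(cpu, threshold=80.0, evaluation_periods=3, datapoints_to_alarm=2))
print(is_valid_alarm_period(10, high_resolution=False))
```

In the first call the two breaching points (85 and 90) are not adjacent, yet both fall inside the five evaluated periods, so the alarm fires. Shrinking the window to three periods leaves only one breaching point, so the state stays OK.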