系列文章
- Grafana 系列文章
AWS Cloudwatch 数据源
对于 AWS Cloudwatch, 主要在于 3 种不同的认证方式:
- AWS SDK Default
- IAM Role
- AK&SK
- Credentials file
现在推荐的是使用 IAM Role 的认证方式,避免了密钥泄露的风险。
(资料图片仅供参考)
但是特别要注意的是,要读取 CloudWatch 指标和 EC2 标签 (tags)、实例、区域和告警,你必须通过 IAM 授予 Grafana 权限。你可以将这些权限附加到你在 AWS 认证中配置的 IAM role 或 IAM 用户。
IAM policy 示例如下:
Metrics-only:
{ "Version": "2012-10-17", "Statement": [ { "Sid": "AllowReadingMetricsFromCloudWatch", "Effect": "Allow", "Action": [ "cloudwatch:DescribeAlarmsForMetric", "cloudwatch:DescribeAlarmHistory", "cloudwatch:DescribeAlarms", "cloudwatch:ListMetrics", "cloudwatch:GetMetricData", "cloudwatch:GetInsightRuleReport" ], "Resource": "*" }, { "Sid": "AllowReadingTagsInstancesRegionsFromEC2", "Effect": "Allow", "Action": ["ec2:DescribeTags", "ec2:DescribeInstances", "ec2:DescribeRegions"], "Resource": "*" }, { "Sid": "AllowReadingResourcesForTags", "Effect": "Allow", "Action": "tag:GetResources", "Resource": "*" } ]}
Logs-only:
{ "Version": "2012-10-17", "Statement": [ { "Sid": "AllowReadingLogsFromCloudWatch", "Effect": "Allow", "Action": [ "logs:DescribeLogGroups", "logs:GetLogGroupFields", "logs:StartQuery", "logs:StopQuery", "logs:GetQueryResults", "logs:GetLogEvents" ], "Resource": "*" }, { "Sid": "AllowReadingTagsInstancesRegionsFromEC2", "Effect": "Allow", "Action": ["ec2:DescribeTags", "ec2:DescribeInstances", "ec2:DescribeRegions"], "Resource": "*" }, { "Sid": "AllowReadingResourcesForTags", "Effect": "Allow", "Action": "tag:GetResources", "Resource": "*" } ]}
Metrics and Logs:
{ "Version": "2012-10-17", "Statement": [ { "Sid": "AllowReadingMetricsFromCloudWatch", "Effect": "Allow", "Action": [ "cloudwatch:DescribeAlarmsForMetric", "cloudwatch:DescribeAlarmHistory", "cloudwatch:DescribeAlarms", "cloudwatch:ListMetrics", "cloudwatch:GetMetricData", "cloudwatch:GetInsightRuleReport" ], "Resource": "*" }, { "Sid": "AllowReadingLogsFromCloudWatch", "Effect": "Allow", "Action": [ "logs:DescribeLogGroups", "logs:GetLogGroupFields", "logs:StartQuery", "logs:StopQuery", "logs:GetQueryResults", "logs:GetLogEvents" ], "Resource": "*" }, { "Sid": "AllowReadingTagsInstancesRegionsFromEC2", "Effect": "Allow", "Action": ["ec2:DescribeTags", "ec2:DescribeInstances", "ec2:DescribeRegions"], "Resource": "*" }, { "Sid": "AllowReadingResourcesForTags", "Effect": "Allow", "Action": "tag:GetResources", "Resource": "*" } ]}
跨账号可观测性:
{ "Version": "2012-10-17", "Statement": [ { "Action": ["oam:ListSinks", "oam:ListAttachedLinks"], "Effect": "Allow", "Resource": "*" } ]}
AWS Cloudwatch 数据源配置示例
几种认证方式的 AWS CLoudwatch 配置示例如下:
AWS SDK(default):
apiVersion: 1datasources: - name: CloudWatch type: cloudwatch jsonData: authType: default defaultRegion: eu-west-2
使用 Credentials 配置文件:
apiVersion: 1datasources: - name: CloudWatch type: cloudwatch jsonData: authType: credentials defaultRegion: eu-west-2 customMetricsNamespaces: "CWAgent,CustomNameSpace" profile: secondary
使用 AK&SK:
apiVersion: 1datasources: - name: CloudWatch type: cloudwatch jsonData: authType: keys defaultRegion: eu-west-2 secureJsonData: accessKey: "" secretKey: ""
使用 AWS SDK Default 和 IAM Role 的 ARM 来 Assume:
apiVersion: 1datasources: - name: CloudWatch type: cloudwatch jsonData: authType: default assumeRoleArn: arn:aws:iam::123456789012:root defaultRegion: eu-west-2
Cloudwatch 自带仪表板
Cloudwatch 自带的几个仪表板都不太好用,建议使用 monitoringartist/grafana-aws-cloudwatch-dashboards 替代。
创建告警的查询
告警需要返回 numeric 数据的查询,而 CloudWatch Logs 支持这种查询。例如,你可以通过使用 stats
命令来启用告警。
这也是一个有效的查询,用于对包括文本 "Exception" 的消息发出告警:
filter @message like /Exception/ | stats count(*) as exceptionCount by bin(1h) | sort exceptionCount desc
跨账户的可观察性
CloudWatch 插件使您能够跨区域账户监控应用程序并排除故障。利用跨账户的可观察性,你可以无缝地搜索、可视化和分析指标和日志,而不必担心账户的界限。
要使用这个功能,请在 AWS 控制台的 Cloudwatch 设置下,配置一个 monitoring 和 source 账户,然后按照上文所述添加必要的 IAM 权限。
关键词: