Prometheus deployment
Prometheus deployment: detailed walkthrough
Download the binary release tarballs
prometheus-2.47.0.linux-amd64.tar.gz #Prometheus server
mysqld_exporter-0.15.0.linux-amd64.tar.gz #mysqld_exporter target
node_exporter-1.6.1.linux-amd64.tar.gz #node_exporter target
alertmanager-0.26.0.linux-amd64.tar.gz #Alertmanager
prometheus-webhook-dingtalk.tar.gz #webhook alert forwarder
redis_exporter-v1.54.0.linux-amd64.tar.gz #redis_exporter target
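Deploying the Prometheus server
Download a Prometheus release from the official site: https://prometheus.io/download/
Version 2.47.0 is used as the example below.
wget https://github.com/prometheus/prometheus/releases/download/v2.47.0/prometheus-2.47.0.linux-amd64.tar.gz #download the release tarball
tar zxf prometheus-2.47.0.linux-amd64.tar.gz -C /usr/local/ #extract
cd /usr/local/ && mv prometheus-2.47.0.linux-amd64/ prometheus #rename to a shorter, version-free path
vim /usr/lib/systemd/system/prometheus.service #create a systemd unit with the following content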
[Unit]
Description=https://prometheus.io
[Service]
Restart=on-failure
ExecStart=/usr/local/prometheus/prometheus --storage.tsdb.path=/usr/local/prometheus/data --config.file=/usr/local/prometheus/prometheus.yml
[Install]
WantedBy=multi-user.target
cd /usr/local/prometheus/ && cp prometheus.yml prometheus.yml.bak #back up the config file
Adding exporters
vim prometheus.yml #append scrape jobs for the node exporter and the MySQL exporter at the bottom, under scrape_configs
  - job_name: "node1"
    static_configs:
      - targets: ["192.168.48.139:9100"]
  - job_name: "mysql1"
    static_configs:
      - targets: ["192.168.48.139:9104"]
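./promtool check config prometheus.yml #run this after every config change to validate the syntax
Restart Prometheus after every config change for it to take effect. Every exporter you add follows this same template: start the exporter, note the port it listens on, and let Prometheus scrape the metrics it collects.
systemctl daemon-reload && systemctl start prometheus.service #load the new unit and start the service; default port 9090
Open http://<server-ip>:9090 to reach the Prometheus web UI. If it shows "Warning: Error fetching server time:", the browser's clock and the server's clock are out of sync and need to be synchronized.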
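Besides opening the web UI, the server can be probed from the command line. The sketch below assumes Prometheus listens on localhost:9090 and that curl is installed; PROM_URL and prom_endpoint_ok are names invented here for illustration. /-/healthy and /-/ready are Prometheus's management endpoints.

```shell
# PROM_URL is a hypothetical variable; override it if the server is elsewhere.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Return 0 when the given management endpoint answers with HTTP 2xx.
prom_endpoint_ok() {
  curl -fsS --max-time 3 "$1$2" >/dev/null 2>&1
}

for path in /-/healthy /-/ready; do
  if prom_endpoint_ok "$PROM_URL" "$path"; then
    echo "$path: ok"
  else
    echo "$path: no answer from $PROM_URL"
  fi
done
```

A "no answer" line usually means the service is not running yet or a firewall blocks port 9090.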
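On the monitored hosts
Adding a node_exporter target
wget https://github.com/prometheus/node_exporter/releases/download/v1.6.1/node_exporter-1.6.1.linux-amd64.tar.gz #download node_exporter
tar xfz node_exporter-1.6.1.linux-amd64.tar.gz -C /usr/local/ && cd /usr/local/ #extract
mv node_exporter-1.6.1.linux-amd64/ node_exporter && cd node_exporter #rename and enter the directory
nohup ./node_exporter & #start in the background, then confirm that port 9100 is listening; if startup fails, read the errors in nohup.out
Check the new target in the Prometheus web UI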
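A process started with nohup does not come back after a reboot. One alternative, mirroring the prometheus.service unit used for the server, is a systemd unit for node_exporter. This is a sketch that assumes the binary lives at /usr/local/node_exporter/node_exporter as above; UNIT_DIR is a hypothetical variable so the file can be generated anywhere and copied into place afterwards.

```shell
# UNIT_DIR is a hypothetical variable; the unit is written to the current directory by default.
UNIT_DIR="${UNIT_DIR:-.}"

# Quoted heredoc: the unit text is written verbatim, with no shell expansion.
cat > "$UNIT_DIR/node_exporter.service" <<'EOF'
[Unit]
Description=node_exporter
After=network.target

[Service]
Restart=on-failure
ExecStart=/usr/local/node_exporter/node_exporter

[Install]
WantedBy=multi-user.target
EOF

echo "wrote $UNIT_DIR/node_exporter.service"
# Then, as root on the monitored host:
#   cp node_exporter.service /usr/lib/systemd/system/
#   systemctl daemon-reload && systemctl enable --now node_exporter.service
```

The same pattern should work for the other exporters by changing the description and the ExecStart line.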
Setting up a mysqld_exporter target
wget https://github.com/prometheus/mysqld_exporter/releases/download/v0.15.0/mysqld_exporter-0.15.0.linux-amd64.tar.gz #download
tar xfz mysqld_exporter-0.15.0.linux-amd64.tar.gz -C /usr/local/prometheus && cd /usr/local/prometheus && mv mysqld_exporter-0.15.0.linux-amd64 mysqld_exporter && cd mysqld_exporter #extract, rename, and enter the directory
nohup ./mysqld_exporter --web.listen-address=0.0.0.0:9104 --config.my-cnf=/etc/my.cnf --collect.auto_increment.columns --collect.binlog_size --collect.info_schema.innodb_metrics --collect.info_schema.processlist --collect.info_schema.tables --collect.info_schema.tablestats --collect.slave_status --collect.global_status --collect.global_variables &
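Note: the file passed to --config.my-cnf must contain a [client] section with the user name and password of an account that the exporter uses to read data from MySQL.
The exporter listens on port 9104 by default.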
cat /etc/my.cnf
[client]
user=root
password='1'
port = 3306
socket = /tmp/mysql.sock
default-character-set = utf8
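The listing above uses root. A less privileged alternative is a dedicated monitoring account; the sketch below generates the SQL for one, following the grants the mysqld_exporter README suggests as far as I recall them. The account name exporter, the password placeholder change-me, and the OUT_DIR variable are assumptions made for illustration.

```shell
# OUT_DIR is a hypothetical variable; the SQL file is written to the current directory by default.
OUT_DIR="${OUT_DIR:-.}"

# Quoted heredoc keeps the SQL verbatim.
cat > "$OUT_DIR/exporter_user.sql" <<'EOF'
-- Dedicated account for mysqld_exporter; replace the placeholder password.
CREATE USER 'exporter'@'localhost' IDENTIFIED BY 'change-me' WITH MAX_USER_CONNECTIONS 3;
GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'exporter'@'localhost';
EOF

echo "wrote $OUT_DIR/exporter_user.sql"
# Apply it as a MySQL administrator, then put user=exporter and the new
# password into the [client] section read by --config.my-cnf:
#   mysql -uroot -p < exporter_user.sql
```

MAX_USER_CONNECTIONS caps how many connections the exporter can hold open, which keeps heavy scrapes from crowding out application traffic.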
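#Collector flags used in the start command above
--collect.auto_increment.columns: auto-increment column information
--collect.binlog_size: binary log size
--collect.info_schema.innodb_metrics: InnoDB storage engine metrics
--collect.info_schema.processlist: currently running threads (process list)
--collect.info_schema.tables: table information from information_schema
--collect.info_schema.tablestats: table statistics from information_schema
--collect.slave_status: replication (master/slave) status
--collect.global_status: global MySQL status
--collect.global_variables: global MySQL variables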
Grafana dashboards for this exporter: ID 18382 for MySQL, ID 7362 for master/slave replication.
Setting up a redis_exporter target
https://github.com/oliver006/redis_exporter/releases #redis_exporter release page
wget https://github.com/oliver006/redis_exporter/releases/download/v1.54.0/redis_exporter-v1.54.0.linux-amd64.tar.gz #download
tar xfz redis_exporter-v1.54.0.linux-amd64.tar.gz -C /usr/local/prometheus && cd /usr/local/prometheus && mv redis_exporter-v1.54.0.linux-amd64 redis_exporter && cd redis_exporter #extract, rename, and enter the directory
nohup ./redis_exporter --redis.addr 192.168.48.139:6379 & #start
The exporter exposes the Redis metrics and listens on port 9121 by default.
Grafana dashboard for Redis: ID 11835.
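Before adding scrape jobs for the exporters, it can help to confirm that each one actually serves its /metrics page. The loop below is a minimal sketch using the host and default ports from this walkthrough; HOST and metrics_up are names invented here, and curl is assumed to be installed.

```shell
# HOST is a hypothetical variable holding the monitored host's address.
HOST="${HOST:-192.168.48.139}"

# Return 0 if http://host:port/metrics answers with HTTP 2xx, non-zero otherwise.
metrics_up() {
  curl -fsS --max-time 3 "http://$1:$2/metrics" >/dev/null 2>&1
}

for entry in node_exporter:9100 mysqld_exporter:9104 redis_exporter:9121; do
  name="${entry%%:*}"   # text before the colon
  port="${entry##*:}"   # text after the colon
  if metrics_up "$HOST" "$port"; then
    echo "$name: up on port $port"
  else
    echo "$name: no answer on port $port"
  fi
done
```

An exporter that reports "no answer" is either not running or unreachable from the machine running the check, and Prometheus would show the matching target as DOWN.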
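On the monitoring server
Setting up Alertmanager
Download Alertmanager on the Prometheus server
wget https://github.com/prometheus/alertmanager/releases/download/v0.26.0/alertmanager-0.26.0.linux-amd64.tar.gz #download the release from the official site
tar xfz /root/tar合集/alertmanager-0.26.0.linux-amd64.tar.gz -C /usr/local/ && cd /usr/local/ && mv alertmanager-0.26.0.linux-amd64 alertmanager #extract and rename
cp alertmanager/alertmanager.yml alertmanager/alertmanager.yml.bak #back up the config file
cd alertmanager && ./amtool check-config alertmanager.yml #the bundled tool that validates the config file syntax
Pointing Prometheus at Alertmanager
vim /usr/local/prometheus/prometheus.yml #find the alerting section in the main config file and edit it
Set the IP and port of the host running Alertmanager; rule_files defines which directory the alerting rules are loaded from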
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - 192.168.48.147:9093
rule_files:
  - "/usr/local/prometheus/rules/*.yml"
mkdir -p /usr/local/prometheus/rules/ #create the directory for alerting rules
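The rules directory starts out empty, so nothing fires until a rule file exists. As an illustration, the sketch below writes a single rule that fires when any scrape target has been unreachable for one minute. The file name instance_down.yml, the group name, the one-minute window, and the RULES_DIR and PROMTOOL variables are assumptions; on the server RULES_DIR would be /usr/local/prometheus/rules.

```shell
# RULES_DIR and PROMTOOL are hypothetical variables with safe defaults.
RULES_DIR="${RULES_DIR:-./rules}"
PROMTOOL="${PROMTOOL:-/usr/local/prometheus/promtool}"
mkdir -p "$RULES_DIR"

# Quoted heredoc so that {{ $labels.instance }} is not expanded by the shell.
cat > "$RULES_DIR/instance_down.yml" <<'EOF'
groups:
  - name: instance-health
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
          description: "{{ $labels.instance }} of job {{ $labels.job }} has been unreachable for more than 1 minute."
EOF

# Validate the rule file when promtool is available at the expected path.
if [ -x "$PROMTOOL" ]; then
  "$PROMTOOL" check rules "$RULES_DIR/instance_down.yml"
else
  echo "promtool not found at $PROMTOOL, skipping validation"
fi
```

The severity label matters later: the inhibit rules in alertmanager.yml match on severity values critical and warning.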
Email alerting configuration
vim alertmanager.yml
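global:
  resolve_timeout: 5m #time after which an alert that is no longer updated is marked resolved; default 5m
  smtp_from: '793653518@qq.com' #sender address
  smtp_smarthost: 'smtp.qq.com:25' #SMTP server and port
  smtp_auth_username: '793653518@qq.com' #SMTP account name
  smtp_auth_password: 'fpeqhpuyhjhobcbb' #mailbox authorization code
  smtp_require_tls: false
route:
  group_by: ['alertname'] #labels used to group alerts
  group_wait: 10s #how long to wait before sending the first notification for a new group
  group_interval: 10s #how long to wait before notifying about new alerts added to a group
  repeat_interval: 1m #how often to resend a notification that is still firing; for email do not set this too low, or the SMTP server may reject the flood of mail
  receiver: 'email' #name of the receiver that gets the alerts; must match a name under receivers
receivers:
  - name: 'email' #receiver name referenced by route.receiver
    email_configs: #email settings
      - to: 'lanchi0831@foxmail.com' #recipient address
        send_resolved: true #also notify when the alert is resolved
#An inhibition rule mutes alerts matching one set of matchers (target) while an alert matching another set of matchers (source) is firing. Both alerts must have identical values for the labels listed in equal.
inhibit_rules: #inhibition rules
  - source_match: #source labels
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']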
Then:
./amtool check-config alertmanager.yml #validate the config file
nohup ./alertmanager --config.file=./alertmanager.yml & #start in the background; listens on ports 9093 (HTTP) and 9094 (cluster)
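To test the notification path without waiting for a real outage, an alert can be posted straight to Alertmanager's v2 HTTP API. The sketch below assumes Alertmanager is reachable on localhost:9093 and that curl is installed; AM_URL, test_alert_payload, and the label values are made up for this example.

```shell
# AM_URL is a hypothetical variable; override it if Alertmanager is elsewhere.
AM_URL="${AM_URL:-http://localhost:9093}"

# Build the JSON body for POST /api/v2/alerts: an array of alert objects.
test_alert_payload() {
  printf '[{"labels":{"alertname":"%s","severity":"warning","instance":"manual-test"},"annotations":{"summary":"manual test alert"}}]' "$1"
}

payload="$(test_alert_payload TestAlert)"
echo "$payload"

# A failure here only means Alertmanager could not be reached from this host.
if curl -fsS --max-time 3 -H 'Content-Type: application/json' \
     -d "$payload" "$AM_URL/api/v2/alerts" >/dev/null 2>&1; then
  echo "test alert accepted by $AM_URL"
else
  echo "could not reach Alertmanager at $AM_URL"
fi
```

If the alert is accepted, the configured receiver should get a notification once group_wait has elapsed.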
DingTalk alerting
Download the webhook service from its project page
https://github.com/timonwong/prometheus-webhook-dingtalk
wget https://github.com/timonwong/prometheus-webhook-dingtalk/releases/download/v2.1.0/prometheus-webhook-dingtalk-2.1.0.linux-amd64.tar.gz #download
tar xfz prometheus-webhook-dingtalk-2.1.0.linux-amd64.tar.gz -C /usr/local/alertmanager && cd /usr/local/alertmanager/prometheus-webhook-dingtalk-2.1.0.linux-amd64 #extract into the alertmanager directory and enter the extracted directory
cp config.example.yml config.yml #create the config file from the bundled example
Find the webhook robot in the DingTalk group and copy its URL and signing secret
vim config.yml #edit the config file
templates:
  - /data/ding.tmpl
targets:
  webhook1:
    url: https://oapi.dingtalk.com/robot/send?access_token=863a37411584a310ba7ef80bb0af2a3e766df34f52f84a306103ec586f63b2da
    secret: SEC566fc700806d1248b9da08877fb1a51fa4a24648ed59ec54595030497916fc40
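url is the robot's webhook address and secret is its signing secret. templates lists the message template files to load; the default template can be replaced with your own.
nohup ./prometheus-webhook-dingtalk --config.file=config.yml & #start the webhook service
cd /usr/local/alertmanager && vim ./alertmanager.yml #DingTalk alerting configuration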
global:
  resolve_timeout: 5m
  smtp_from: '793653518@qq.com'
  smtp_smarthost: 'smtp.qq.com:25'
  smtp_auth_username: '793653518@qq.com'
  smtp_auth_password: 'fpeqhpuyhjhobcbb'
  smtp_require_tls: false
route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 1m
  repeat_interval: 1m
  receiver: 'dingding'
receivers:
  - name: 'dingding'
    webhook_configs:
      - send_resolved: true
        url: http://localhost:8060/dingtalk/webhook1/send #use the address that the webhook service printed at startup
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']
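./amtool check-config alertmanager.yml #validate the config file
nohup ./alertmanager --config.file=./alertmanager.yml & #restart with the new config (stop the previous instance first); listens on ports 9093 and 9094
Email plus DingTalk: to use several alert channels, adjust the receivers: block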
global:
  resolve_timeout: 5m
  smtp_from: '793653518@qq.com'
  smtp_smarthost: 'smtp.qq.com:25'
  smtp_auth_username: '793653518@qq.com'
  smtp_auth_password: 'fpeqhpuyhjhobcbb'
  smtp_require_tls: false
templates:
  - '/usr/local/alertmanager/*.tmpl'
route:
  receiver: default
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1m
receivers:
  - name: 'default'
    email_configs:
      - to: '{{ template "email.to" . }}'
        html: '{{ template "email.to.html" . }}'
        send_resolved: true
    webhook_configs:
      - send_resolved: true
        url: "http://localhost:8060/dingtalk/webhook1/send"
  - name: 'email'
    email_configs:
      - to: '{{ template "email.to" . }}'
        html: '{{ template "email.to.html" . }}'
        send_resolved: true
  - name: 'dingding'
    webhook_configs:
      - send_resolved: true
        url: "http://localhost:8060/dingtalk/webhook1/send"
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal:
      - alertname
      - dev
      - instance
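Configured this way, one alert notifies several channels at once, because the default receiver carries both email_configs and webhook_configs.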