跳转至

对接企业微信

企业微信对接 Alertmanager

Alertmanager 将告警信息推送到微信群,主要涉及到如下几方面的配置:

  1. 企业微信后台的配置,包括新建告警部门和应用;
  2. Alertmanager 的主配置文件配置和告警模板配置;
  3. Prometheus 主配置文件的配置以及告警规则的配置;

配置

官方文档:https://prometheus.io/docs/alerting/latest/configuration/#wechat_config

获取企业微信信息

  • 企业 ID
  • 然后在通讯录中,添加一个子部门,用于接收告警信息,获取部门 ID
  • 告警 AgentId 和 Secret 获取是需要在企业微信后台,【应用管理】中,自建应用才能够获得的

alertmanager 配置

alertmanager:
  config:
    global:
      resolve_timeout: 1m
      wechat_api_url: "https://qyapi.weixin.qq.com/cgi-bin/"
      wechat_api_corp_id: "wwa362d8ed5ea3fd22"
      wechat_api_secret: "t9WzuT9OVefm55nzwoVSuZtvaQPd0U0Y8wFrFpfOaIk"
    templates:
      - "/etc/alertmanager/config/*.tmpl"
    route:
      group_by: ["namespace"]
      # group_by: ['env','instance','type','group','job','alertname']
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 12h
      receiver: "wechat"
      # routes:
      #   - receiver: "null"
      #     matchers:
      #       - alertname =~ "InfoInhibitor|Watchdog"
    receivers:
      - name: "wechat"
        wechat_configs:
          - send_resolved: true
            message: '{{ template "wechat.default.message" . }}'
            to_party: "2"
            agent_id: "1000002"
            api_secret: "t9WzuT9OVefm55nzwoVSuZtvaQPd0U0Y8wFrFpfOaIk"
    inhibit_rules:
      - source_matchers:
          - "severity = critical"
        target_matchers:
          - "severity =~ warning|info"
        equal:
          - "namespace"
          - "alertname"
      - source_matchers:
          - "severity = warning"
        target_matchers:
          - "severity = info"
        equal:
          - "namespace"
          - "alertname"
      - source_matchers:
          - "alertname = InfoInhibitor"
        target_matchers:
          - "severity = info"
        equal:
          - "namespace"

告警模板

alertmanager:
  templateFiles:
    wechat.tmpl: |-
      {{ define "wechat.default.message" }}
      {{- if gt (len .Alerts.Firing) 0 -}}
      {{- range $index, $alert := .Alerts -}}
      {{- if eq $index 0 }}
      ========= 监控报警 =========
      告警状态:{{   .Status }}
      告警级别:{{ .Labels.severity }}
      告警类型:{{ $alert.Labels.alertname }}
      故障主机: {{ $alert.Labels.instance }}
      告警主题: {{ $alert.Annotations.summary }}
      告警详情: {{ $alert.Annotations.message }}{{ $alert.Annotations.description}};
      触发阀值:{{ .Annotations.value }}
      故障时间: {{ ($alert.StartsAt.Add 28800e9).Format "2006-01-02 15:04:05" }}
      ========= = end =  =========
      {{- end }}
      {{- end }}
      {{- end }}
      {{- if gt (len .Alerts.Resolved) 0 -}}
      {{- range $index, $alert := .Alerts -}}
      {{- if eq $index 0 }}
      ========= 异常恢复 =========
      告警类型:{{ .Labels.alertname }}
      告警状态:{{   .Status }}
      告警主题: {{ $alert.Annotations.summary }}
      告警详情: {{ $alert.Annotations.message }}{{ $alert.Annotations.description}};
      故障时间: {{ ($alert.StartsAt.Add 28800e9).Format "2006-01-02 15:04:05" }}
      恢复时间: {{ ($alert.EndsAt.Add 28800e9).Format "2006-01-02 15:04:05" }}
      {{- if gt (len $alert.Labels.instance) 0 }}
      实例信息: {{ $alert.Labels.instance }}
      {{- end }}
      ========= = end =  =========
      {{- end }}
      {{- end }}
      {{- end }}
      {{- end }}

参考文档