Monitoring a URL with blackbox-exporter



  • Prerequisites

    • A Kubernetes cluster
    • A Prometheus pod

    Deploying blackbox-exporter

    blackbox-exporter lets you monitor your external services and generate metrics for them.
    It can probe:

    • HTTP requests
    • TCP requests
    • DNS requests
    • ICMP

    Here we will focus on monitoring this NodeBB server over HTTP.
    I used this deployment (deployment.yaml):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: blackbox-exporter
      name: blackbox-exporter
      namespace: prometheus
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: blackbox-exporter
      template:
        metadata:
          labels:
            app: blackbox-exporter
        spec:
          containers:
          - image: prom/blackbox-exporter:v0.12.0
            imagePullPolicy: IfNotPresent
            livenessProbe:
              failureThreshold: 3
              httpGet:
                path: /
                port: 9115
                scheme: HTTP
              initialDelaySeconds: 60
              periodSeconds: 10
              successThreshold: 1
              timeoutSeconds: 10
            name: blackbox-exporter
            ports:
            - containerPort: 9115
              protocol: TCP
            readinessProbe:
              failureThreshold: 3
              httpGet:
                path: /metrics
                port: 9115
                scheme: HTTP
              periodSeconds: 10
              successThreshold: 1
              timeoutSeconds: 10
            resources:
              limits:
                cpu: 500m
                memory: 128Mi
              requests:
                cpu: 500m
                memory: 128Mi
            volumeMounts:
            - mountPath: /etc/blackbox_exporter
              name: blackbox
          restartPolicy: Always
          volumes:
          - name: blackbox
            configMap:
              name: blackbox-config
    

    I manage the blackbox-exporter configuration in a ConfigMap (configmap.yaml):

    apiVersion: v1
    data:
      config.yml: |
        modules:
          nodebb:
            prober: http
            timeout: 5s
            http:
              method: GET
              fail_if_not_ssl: true
              fail_if_not_matches_regexp:
              - "healthy"
    kind: ConfigMap
    metadata:
      name: blackbox-config
      namespace: prometheus
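
    To make the two checks of the nodebb module concrete, here is a rough Python imitation of its pass/fail logic. It is a sketch, not blackbox-exporter's implementation: the function module_passes, the final URL https://binbash.fr/sping and the response body "healthy" are assumptions made for the example.

```python
import re

def module_passes(final_url, body, patterns=("healthy",), fail_if_not_ssl=True):
    """Simplified imitation of the nodebb module checks (hypothetical helper)."""
    # fail_if_not_ssl: the final response must have been served over TLS.
    if fail_if_not_ssl and not final_url.startswith("https://"):
        return False
    # fail_if_not_matches_regexp: every pattern must be found in the body.
    return all(re.search(pattern, body) for pattern in patterns)

print(module_passes("https://binbash.fr/sping", "healthy"))  # True
print(module_passes("http://binbash.fr/sping", "healthy"))   # False: not SSL
print(module_passes("https://binbash.fr/sping", "down"))     # False: regexp not matched
```

    The patterns are searched anywhere in the body rather than anchored, so a longer body that contains "healthy" would also pass in this sketch.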
    

    I deploy everything to my Kubernetes cluster:

    $ kubectl apply -f configmap.yaml,deployment.yaml
    $ kubectl -n prometheus expose deployment blackbox-exporter --port=9115 --target-port=9115
    

    I can now retrieve the HTTP metrics for binbash.fr:

    $ curl "http://$(kubectl -n prometheus get svc blackbox-exporter -o jsonpath='{.spec.clusterIP}'):9115/probe?target=binbash.fr/sping&module=nodebb"
    # HELP probe_dns_lookup_time_seconds Returns the time taken for probe dns lookup in seconds
    # TYPE probe_dns_lookup_time_seconds gauge
    probe_dns_lookup_time_seconds 0.071831404
    # HELP probe_duration_seconds Returns how long the probe took to complete in seconds
    # TYPE probe_duration_seconds gauge
    probe_duration_seconds 0.292811509
    # HELP probe_failed_due_to_regex Indicates if probe failed due to regex
    # TYPE probe_failed_due_to_regex gauge
    probe_failed_due_to_regex 0
    # HELP probe_http_content_length Length of http content response
    # TYPE probe_http_content_length gauge
    probe_http_content_length 7
    # HELP probe_http_duration_seconds Duration of http request by phase, summed over all redirects
    # TYPE probe_http_duration_seconds gauge
    probe_http_duration_seconds{phase="connect"} 0.057645064
    probe_http_duration_seconds{phase="processing"} 0.076625814
    probe_http_duration_seconds{phase="resolve"} 0.084556566
    probe_http_duration_seconds{phase="tls"} 0.10175307
    probe_http_duration_seconds{phase="transfer"} 0.000189276
    # HELP probe_http_redirects The number of redirects
    # TYPE probe_http_redirects gauge
    probe_http_redirects 1
    # HELP probe_http_ssl Indicates if SSL was used for the final redirect
    # TYPE probe_http_ssl gauge
    probe_http_ssl 1
    # HELP probe_http_status_code Response HTTP status code
    # TYPE probe_http_status_code gauge
    probe_http_status_code 200
    # HELP probe_http_version Returns the version of HTTP of the probe response
    # TYPE probe_http_version gauge
    probe_http_version 1.1
    # HELP probe_ip_protocol Specifies whether probe ip protocol is IP4 or IP6
    # TYPE probe_ip_protocol gauge
    probe_ip_protocol 4
    # HELP probe_ssl_earliest_cert_expiry Returns earliest SSL cert expiry in unixtime
    # TYPE probe_ssl_earliest_cert_expiry gauge
    probe_ssl_earliest_cert_expiry 1.530091576e+09
    # HELP probe_success Displays whether or not the probe was a success
    # TYPE probe_success gauge
    probe_success 1
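
    If you want to check this output from a script rather than by eye, a minimal Python sketch could look like the following. The sample lines are copied from the output above; the helper parse_metrics is a hypothetical name, and the parser only handles the simple "name value" lines shown here, not the full exposition format.

```python
# A few lines copied from the /probe output above.
SAMPLE = """\
# HELP probe_duration_seconds Returns how long the probe took to complete in seconds
# TYPE probe_duration_seconds gauge
probe_duration_seconds 0.292811509
probe_http_duration_seconds{phase="connect"} 0.057645064
probe_http_duration_seconds{phase="tls"} 0.10175307
probe_http_status_code 200
probe_ssl_earliest_cert_expiry 1.530091576e+09
probe_success 1
"""

def parse_metrics(text):
    """Return {metric name (with labels): float value}, skipping comment lines."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last whitespace-separated field (no timestamps here).
        name, value = line.rsplit(None, 1)
        metrics[name] = float(value)
    return metrics

metrics = parse_metrics(SAMPLE)
probe_ok = metrics["probe_success"] == 1 and metrics["probe_http_status_code"] == 200
print("probe ok:", probe_ok)
```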
    

    Prometheus configuration

    There are several ways to do this. You can either use a static configuration, like the example in the blackbox-exporter project README, or a dynamic configuration that relies on Kubernetes services. I will detail the latter.

    In my Prometheus configuration, I added:

    - job_name: 'blackbox'
    
      metrics_path: /probe
    
      kubernetes_sd_configs:
      - role: service
    
      relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
        action: keep
        regex: true
      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        target_label: __address__
        regex: (.+)(?::\d+);(\d+)
        replacement: $1:$2
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_service_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        action: replace
        target_label: kubernetes_name
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_target]
        action: replace
        target_label: __param_target
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_target]
        action: replace
        target_label: url
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_module]
        action: replace
        target_label: __param_module
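
    The __address__ rewrite rule in this job is easy to misread, so here is a rough Python imitation of a "replace" relabel rule. It follows the behaviour described in the Prometheus documentation (source label values joined with ";", regex anchored at both ends, target left unchanged when the regex does not match). The function relabel_replace and the address binbash-nodebb.prometheus.svc:9115 are assumptions for illustration, not Prometheus code.

```python
import re

def relabel_replace(source_values, regex, replacement, current_target):
    """Imitate a Prometheus 'replace' relabel action (hypothetical helper)."""
    joined = ";".join(source_values)       # default separator is ";"
    match = re.fullmatch(regex, joined)    # Prometheus anchors the regex
    if match is None:
        return current_target              # no match: the label is left as-is
    # Convert Prometheus-style $1 references to Python's \g<1> form.
    template = re.sub(r"\$(\d+)", r"\\g<\1>", replacement)
    return match.expand(template)

REGEX = r"(.+)(?::\d+);(\d+)"
address = "binbash-nodebb.prometheus.svc:9115"  # assumed discovered address

# With a prometheus.io/port annotation, the port is replaced.
print(relabel_replace([address, "8080"], REGEX, "$1:$2", address))
# Without the annotation the second value is empty, so nothing changes.
print(relabel_replace([address, ""], REGEX, "$1:$2", address))
```

    Under these assumptions, a service without a prometheus.io/port annotation keeps its discovered address, because the regex requires at least one digit after the ";".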
    

    So we use the annotations of the Kubernetes service to build the Prometheus target. All that remains is to declare a Kubernetes service so that Prometheus scrapes my blackbox-exporter (binbash-nodebb.yaml):

    apiVersion: v1
    kind: Service
    metadata:
      annotations:
        prometheus.io/module: nodebb
        prometheus.io/probe: "true"
        prometheus.io/target: binbash.fr/sping
      name: binbash-nodebb
      namespace: prometheus
    spec:
      ports:
      - port: 9115
        protocol: TCP
        targetPort: 9115
      selector:
        app: blackbox-exporter
      sessionAffinity: None
      type: ClusterIP
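
    To see how these annotations end up in the scrape request, the following Python sketch rebuilds the URL that Prometheus would call, assuming the relabel rules map prometheus.io/target and prometheus.io/module to the target and module query parameters. The function probe_url and the address binbash-nodebb.prometheus.svc:9115 are hypothetical; the order of the query parameters is not significant.

```python
from urllib.parse import urlencode

# Annotations taken from the service manifest above.
annotations = {
    "prometheus.io/module": "nodebb",
    "prometheus.io/probe": "true",
    "prometheus.io/target": "binbash.fr/sping",
}

def probe_url(address, annotations, metrics_path="/probe"):
    """Build the scrape URL from the service annotations (hypothetical helper)."""
    params = {
        "target": annotations["prometheus.io/target"],  # -> __param_target
        "module": annotations["prometheus.io/module"],  # -> __param_module
    }
    return f"http://{address}{metrics_path}?{urlencode(params)}"

print(probe_url("binbash-nodebb.prometheus.svc:9115", annotations))
# http://binbash-nodebb.prometheus.svc:9115/probe?target=binbash.fr%2Fsping&module=nodebb
```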
    

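    The annotations contain:

    • prometheus.io/probe: set to "true" so that the blackbox job keeps this service
    • prometheus.io/module: the blackbox-exporter module to use (here nodebb)
    • prometheus.io/target: the URL to probe (here binbash.fr/sping)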

    I deploy the service:

    $ kubectl apply -f binbash-nodebb.yaml
    

    A new entry then appears in my Prometheus targets, under the blackbox job's service discovery.

    [Image: 0_1522927165248_2018-04-05-123842_1217x305_scrot.png]

    Grafana

    Here is a very simple example of a Grafana dashboard:

    [Image: 0_1522927176354_2018-04-05-123640_869x247_scrot.png]

    Alerting

    You can now alert when the probe fails or when response times are too high.

    Example rules for Prometheus:

    groups:
    - name: blackbox.rules
      rules:
      - alert: UrlProbeFailed
        expr: probe_success != 1
        for: 5m
        labels:
          scope: blackbox
          severity: critical
        annotations:
          description: '{{ $labels.url }} probe failed'
          summary: URL Probe Failed
      - alert: UrlProbeSlow
        expr: probe_http_duration_seconds > 1
        for: 5m
        labels:
          scope: blackbox
          severity: critical
        annotations:
          description: '{{ $labels.url }} probe slow ({{ $value }}s)'
          summary: URL Probe Slow
      - alert: SSLCertExpiringSoon
        expr: probe_ssl_earliest_cert_expiry - time() < 86400 * 5
        for: 5m
        labels:
          scope: blackbox
          severity: critical
        annotations:
          description: 'Renew SSL cert of {{ $labels.url }} !'
          summary: SSL Cert Expiring Soon
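
    The SSLCertExpiringSoon expression is plain arithmetic on Unix timestamps, which the following Python sketch reproduces. The expiry value is the probe_ssl_earliest_cert_expiry sample from the output earlier in this post; the function cert_expiring_soon is a hypothetical name, and the "for: 5m" delay is not modelled.

```python
from datetime import datetime, timezone

FIVE_DAYS = 86400 * 5  # same threshold as the alert rule, in seconds

def cert_expiring_soon(expiry_unixtime, now_unixtime, threshold_seconds=FIVE_DAYS):
    """Mirror of: probe_ssl_earliest_cert_expiry - time() < 86400 * 5."""
    return expiry_unixtime - now_unixtime < threshold_seconds

expiry = 1.530091576e+09  # sample value from the /probe output

# Human-readable form of the expiry timestamp (UTC).
print(datetime.fromtimestamp(expiry, tz=timezone.utc).isoformat())

print(cert_expiring_soon(expiry, expiry - 4 * 86400))  # True: 4 days left
print(cert_expiring_soon(expiry, expiry - 6 * 86400))  # False: 6 days left
```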