Prerequisites

  • A Kubernetes cluster
  • A Prometheus pod

Deploying blackbox-exporter

blackbox-exporter lets you monitor your external services and generate metrics for them.

It can probe:

  • HTTP requests
  • TCP requests
  • DNS requests
  • ICMP
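
For reference, here is a rough sketch of what modules for the other probers could look like in the blackbox-exporter configuration. The module names (tcp_connect, dns_example, icmp_ping) and the queried name example.com are placeholders of mine, not something used in this setup. Keep in mind that the ICMP prober usually needs extra privileges (raw sockets) in the container.

```yaml
modules:
  # Hypothetical module: succeeds if a TCP connection can be opened
  tcp_connect:
    prober: tcp
    timeout: 5s
  # Hypothetical module: resolves a name through the probed DNS server
  dns_example:
    prober: dns
    timeout: 5s
    dns:
      query_name: "example.com"   # placeholder name to resolve
      query_type: "A"
  # Hypothetical module: sends an ICMP echo request to the target
  icmp_ping:
    prober: icmp
    timeout: 5s
```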

Here we will focus on monitoring this NodeBB server over HTTP. I used the following deployment (deployment.yaml):

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: blackbox-exporter
  name: blackbox-exporter
  namespace: prometheus
spec:
  replicas: 1
  selector:
    matchLabels:
      app: blackbox-exporter
  template:
    metadata:
      labels:
        app: blackbox-exporter
    spec:
      containers:
      - image: prom/blackbox-exporter:v0.12.0
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /
            port: 9115
            scheme: HTTP
          initialDelaySeconds: 60
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 10
        name: blackbox-exporter
        ports:
        - containerPort: 9115
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /metrics
            port: 9115
            scheme: HTTP
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 10
        resources:
          limits:
            cpu: 500m
            memory: 128Mi
          requests:
            cpu: 500m
            memory: 128Mi
        volumeMounts:
        - mountPath: /etc/blackbox_exporter
          name: blackbox
      restartPolicy: Always
      volumes:
      - name: blackbox
        configMap:
          name: blackbox-config

I manage the blackbox-exporter configuration in a ConfigMap (configmap.yaml):

apiVersion: v1
data:
  config.yml: |
    modules:
      nodebb:
        prober: http
        timeout: 5s
        http:
          method: GET
          fail_if_not_ssl: true
          fail_if_not_matches_regexp:
          - "healthy"
kind: ConfigMap
metadata:
  name: blackbox-config
  namespace: prometheus
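
A note if you run a more recent image than the v0.12.0 used here: as far as I know, later releases of blackbox-exporter renamed the body-matching option to fail_if_body_not_matches_regexp. The sketch below shows the same module under that assumption. Check the release notes of your version before relying on it.

```yaml
modules:
  nodebb:
    prober: http
    timeout: 5s
    http:
      method: GET
      # Fail the probe if the final request was not served over TLS
      fail_if_not_ssl: true
      # Assumed newer name of fail_if_not_matches_regexp
      fail_if_body_not_matches_regexp:
      - "healthy"
```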

I deploy everything to my Kubernetes cluster:

$ kubectl apply -f configmap.yaml,deployment.yaml
$ kubectl -n prometheus expose deployment blackbox-exporter --port=9115 --target-port=9115
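
If you prefer to keep everything declarative, the kubectl expose command above should be roughly equivalent to applying a manifest like the one below. This is a sketch: the selector is taken from the deployment, and the service type is assumed to be the default ClusterIP.

```yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app: blackbox-exporter
  name: blackbox-exporter
  namespace: prometheus
spec:
  type: ClusterIP
  # Same label as the pods created by deployment.yaml
  selector:
    app: blackbox-exporter
  ports:
  - port: 9115
    protocol: TCP
    targetPort: 9115
```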

I can now retrieve the HTTP metrics for binbash.fr:

$ curl "http://$(kubectl -n prometheus get svc blackbox-exporter -o jsonpath='{.spec.clusterIP}'):9115/probe?target=binbash.fr/sping&module=nodebb"
# HELP probe_dns_lookup_time_seconds Returns the time taken for probe dns lookup in seconds
# TYPE probe_dns_lookup_time_seconds gauge
probe_dns_lookup_time_seconds 0.071831404
# HELP probe_duration_seconds Returns how long the probe took to complete in seconds
# TYPE probe_duration_seconds gauge
probe_duration_seconds 0.292811509
# HELP probe_failed_due_to_regex Indicates if probe failed due to regex
# TYPE probe_failed_due_to_regex gauge
probe_failed_due_to_regex 0
# HELP probe_http_content_length Length of http content response
# TYPE probe_http_content_length gauge
probe_http_content_length 7
# HELP probe_http_duration_seconds Duration of http request by phase, summed over all redirects
# TYPE probe_http_duration_seconds gauge
probe_http_duration_seconds{phase="connect"} 0.057645064
probe_http_duration_seconds{phase="processing"} 0.076625814
probe_http_duration_seconds{phase="resolve"} 0.084556566
probe_http_duration_seconds{phase="tls"} 0.10175307
probe_http_duration_seconds{phase="transfer"} 0.000189276
# HELP probe_http_redirects The number of redirects
# TYPE probe_http_redirects gauge
probe_http_redirects 1
# HELP probe_http_ssl Indicates if SSL was used for the final redirect
# TYPE probe_http_ssl gauge
probe_http_ssl 1
# HELP probe_http_status_code Response HTTP status code
# TYPE probe_http_status_code gauge
probe_http_status_code 200
# HELP probe_http_version Returns the version of HTTP of the probe response
# TYPE probe_http_version gauge
probe_http_version 1.1
# HELP probe_ip_protocol Specifies whether probe ip protocol is IP4 or IP6
# TYPE probe_ip_protocol gauge
probe_ip_protocol 4
# HELP probe_ssl_earliest_cert_expiry Returns earliest SSL cert expiry in unixtime
# TYPE probe_ssl_earliest_cert_expiry gauge
probe_ssl_earliest_cert_expiry 1.530091576e+09
# HELP probe_success Displays whether or not the probe was a success
# TYPE probe_success gauge
probe_success 1

Prometheus configuration

There are several ways to do this. You can either use a static configuration, like the example given in the blackbox-exporter project README, or a dynamic configuration that relies on Kubernetes services. I will detail the latter.
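
For comparison, a static configuration adapted from the README pattern could look like the sketch below. The job name blackbox-static is a placeholder, and the address blackbox-exporter.prometheus.svc:9115 assumes the service exposed earlier and the default cluster DNS. The README copies the target into the instance label; I use url here only to stay consistent with the alert rules later in this post.

```yaml
scrape_configs:
- job_name: 'blackbox-static'   # hypothetical job name
  metrics_path: /probe
  params:
    module: [nodebb]            # module defined in the ConfigMap
  static_configs:
  - targets:
    - binbash.fr/sping          # URL to probe, not the exporter itself
  relabel_configs:
  # Pass the listed target as the ?target= query parameter
  - source_labels: [__address__]
    target_label: __param_target
  # Keep the probed URL as a label on the resulting series
  - source_labels: [__param_target]
    target_label: url
  # Send the actual scrape to blackbox-exporter (assumed service DNS name)
  - target_label: __address__
    replacement: blackbox-exporter.prometheus.svc:9115
```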

In my Prometheus configuration, I added:

- job_name: 'blackbox'

  metrics_path: /probe

  kubernetes_sd_configs:
  - role: service

  relabel_configs:
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
    action: keep
    regex: true
  - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
    action: replace
    target_label: __address__
    regex: (.+)(?::\d+);(\d+)
    replacement: $1:$2
  - action: labelmap
    regex: __meta_kubernetes_service_label_(.+)
  - source_labels: [__meta_kubernetes_service_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_service_name]
    action: replace
    target_label: kubernetes_name
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_target]
    action: replace
    target_label: __param_target
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_target]
    action: replace
    target_label: url
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_module]
    action: replace
    target_label: __param_module

So we use the annotations of the Kubernetes service to build the Prometheus target. All that remains is to declare a Kubernetes service so that Prometheus scrapes my blackbox-exporter (binbash-nodebb.yaml):

apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/module: nodebb
    prometheus.io/probe: "true"
    prometheus.io/target: binbash.fr/sping
  name: binbash-nodebb
  namespace: prometheus
spec:
  ports:
  - port: 9115
    protocol: TCP
    targetPort: 9115
  selector:
    app: blackbox-exporter
  sessionAffinity: None
  type: ClusterIP

The annotations are:

  • prometheus.io/module: the blackbox-exporter module configured above.
  • prometheus.io/probe: tells Prometheus that it should handle this service.
  • prometheus.io/target: the URL of my site.

I deploy the service:

$ kubectl apply -f binbash-nodebb.yaml
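
With this approach, probing another URL should only require one more annotated service pointing at the same blackbox-exporter pods. A minimal sketch, where the service name example-nodebb and the target forum.example.org/sping are made-up placeholders:

```yaml
apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/module: nodebb
    prometheus.io/probe: "true"
    prometheus.io/target: forum.example.org/sping   # hypothetical URL
  name: example-nodebb                              # hypothetical name
  namespace: prometheus
spec:
  ports:
  - port: 9115
    protocol: TCP
    targetPort: 9115
  # Still routes to the blackbox-exporter pods, only the annotations differ
  selector:
    app: blackbox-exporter
  type: ClusterIP
```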

A new entry then shows up among my Prometheus targets, under my blackbox service discovery.

prometheus.png

Grafana

Here is a very simple example of a Grafana dashboard:

grafana.png

Alerting

It is now possible to alert when the probe fails or when response times are poor.

Example rules for Prometheus:

groups:
- name: blackbox.rules
  rules:
  - alert: UrlProbeFailed
    expr: probe_success != 1
    for: 5m
    labels:
      scope: blackbox
      severity: critical
    annotations:
      description: '{{ $labels.url }} probe failed'
      summary: URL Probe Failed
  - alert: UrlProbeSlow
    expr: probe_http_duration_seconds > 1
    for: 5m
    labels:
      scope: blackbox
      severity: critical
    annotations:
      description: '{{ $labels.url }} probe slow ({{ $value }}s)'
      summary: URL Probe Slow
  - alert: SSLCertExpiringSoon
    expr: probe_ssl_earliest_cert_expiry - time() < 86400 * 5
    for: 5m
    labels:
      scope: blackbox
      severity: critical
    annotations:
      description: 'Renew SSL cert of {{ $labels.url }} !'
      summary: SSL Cert Expiring Soon
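
Note that probe_http_duration_seconds exposes one series per phase (connect, processing, resolve, tls, transfer), so the UrlProbeSlow rule above compares each phase to the threshold separately. If you would rather alert on the total probe time, a possible variant uses probe_duration_seconds instead. The group name, alert name and warning severity below are my own choices, not part of the original setup.

```yaml
groups:
- name: blackbox-total.rules     # hypothetical group name
  rules:
  - alert: UrlProbeSlowTotal     # hypothetical alert name
    # Total time taken by the probe, all phases included
    expr: probe_duration_seconds > 1
    for: 5m
    labels:
      scope: blackbox
      severity: warning
    annotations:
      description: '{{ $labels.url }} probe took {{ $value }}s in total'
      summary: URL Probe Slow (total duration)
```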