Aller au contenu

Installation et configuration

Prérequis

Composant Version minimum Rôle
Podman ou Kubernetes Podman 4.x / K8s 1.28+ Runtime de conteneurs
Helm 3.14+ Déploiement sur Kubernetes
MinIO RELEASE.2024+ Stockage objet (si auto-hébergé)
Certificats TLS Let's Encrypt ou CA interne Chiffrement des flux

Option 1 : Déploiement Podman (environnement standalone)

Réseau et stockage

# Create the dedicated container network for the observability stack
podman network create observability

# Create one persistent volume per component
for volume in minio-data prometheus-data loki-data grafana-data; do
  podman volume create "$volume"
done

MinIO (stockage objet)

# Object store backing Mimir blocks, Loki chunks and Tempo traces.
# NOTE(review): credentials are inlined for the demo only — in production
# they must be injected from Vault, never passed in clear on the command line.
podman run -d \
  --name minio \
  --network observability \
  -p 9000:9000 -p 9001:9001 \
  -v minio-data:/data:Z \
  -e MINIO_ROOT_USER=lgtm-admin \
  -e MINIO_ROOT_PASSWORD=ChangeMeInVault! \
  quay.io/minio/minio:latest \
  server /data --console-address ":9001"

# Create the buckets. --ignore-existing makes the step idempotent so the
# sequence can be re-run without failing on already-created buckets.
podman exec minio mc alias set local http://localhost:9000 lgtm-admin ChangeMeInVault!
for bucket in mimir-blocks loki-chunks tempo-traces; do
  podman exec minio mc mb --ignore-existing "local/$bucket"
done

Secrets en production

Les credentials MinIO doivent être injectés depuis Vault, jamais en clair dans les commandes. L'exemple ci-dessus est simplifié pour la démonstration.

Prometheus

# prometheus.yml — standalone Prometheus scraping itself, node-exporter and
# MinIO, and forwarding every sample to Mimir for long-term storage.
global:
  scrape_interval: 15s        # how often targets are scraped
  evaluation_interval: 15s    # how often recording/alerting rules are evaluated

scrape_configs:
  # Prometheus scrapes its own metrics endpoint
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]

  # Host-level metrics (CPU, memory, disk, network)
  - job_name: "node-exporter"
    static_configs:
      - targets: ["node-exporter:9100"]

  # MinIO cluster metrics (non-default metrics path)
  - job_name: "minio"
    metrics_path: /minio/v2/metrics/cluster
    static_configs:
      - targets: ["minio:9000"]

# Forward all samples to Mimir.
# NOTE(review): no "mimir" container is started in the Podman steps on this
# page — confirm Mimir is deployed and reachable at mimir:9009.
remote_write:
  - url: http://mimir:9009/api/v1/push
# Standalone Prometheus: 7-day local retention, long-term storage delegated
# to Mimir via remote_write (configured in prometheus.yml).
# --web.enable-lifecycle allows config reloads via POST /-/reload.
podman run -d \
  --name prometheus \
  --network observability \
  -v prometheus-data:/prometheus:Z \
  -v ./prometheus.yml:/etc/prometheus/prometheus.yml:ro,Z \
  -p 9090:9090 \
  quay.io/prometheus/prometheus:latest \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.retention.time=7d \
  --web.enable-lifecycle

Loki

# loki-config.yaml — single-binary Loki, TSDB index shipped to MinIO (S3 API).
auth_enabled: false            # no multi-tenancy in this standalone setup

server:
  http_listen_port: 3100

common:
  ring:
    kvstore:
      store: inmemory          # single instance: no external KV store needed
  replication_factor: 1
  path_prefix: /loki

schema_config:
  configs:
    - from: 2024-01-01
      store: tsdb              # TSDB index, the recommended store with v13
      object_store: s3
      schema: v13
      index:
        prefix: index_
        period: 24h

storage_config:
  tsdb_shipper:
    active_index_directory: /loki/index
    cache_location: /loki/cache
  aws:
    endpoint: http://minio:9000
    bucketnames: loki-chunks
    access_key_id: lgtm-admin            # demo only — inject from Vault in production
    secret_access_key: ChangeMeInVault!
    s3forcepathstyle: true               # MinIO requires path-style addressing
    insecure: true                       # plain HTTP inside the podman network

compactor:
  working_directory: /loki/compactor
  compaction_interval: 10m
  retention_enabled: true
  # Required whenever retention_enabled is true (Loki 3.x fails to start
  # without it): the object store where delete requests are persisted.
  delete_request_store: s3
  retention_delete_delay: 2h

limits_config:
  retention_period: 90d
  max_query_length: 721h       # 30 days + 1 hour (the Loki default)
  max_query_parallelism: 32
# Single-binary Loki reading the configuration mounted read-only above
podman run -d \
  --name loki \
  --network observability \
  -v loki-data:/loki:Z \
  -v ./loki-config.yaml:/etc/loki/local-config.yaml:ro,Z \
  -p 3100:3100 \
  grafana/loki:latest \
  -config.file=/etc/loki/local-config.yaml

Alloy (collecteur unifie)

// alloy-config.river
// Unified collector: ships journald logs to Loki and relays OTLP traces to Tempo.

// Collect logs from the systemd journal
// (assumes container logs land in journald — TODO confirm the log driver)
loki.source.journal "journal" {
  forward_to = [loki.write.default.receiver]
}

// Push collected logs to Loki
loki.write "default" {
  endpoint {
    url = "http://loki:3100/loki/api/v1/push"
  }
}

// OTLP ingestion for traces: gRPC on 4317, HTTP on 4318
otelcol.receiver.otlp "default" {
  grpc {
    endpoint = "0.0.0.0:4317"
  }
  http {
    endpoint = "0.0.0.0:4318"
  }
  output {
    traces = [otelcol.exporter.otlp.tempo.input]
  }
}

// Forward traces to Tempo.
// NOTE(review): no "tempo" container is started in the Podman steps on this
// page — confirm Tempo is deployed before relying on traces.
otelcol.exporter.otlp "tempo" {
  client {
    endpoint = "tempo:4317"
    tls {
      insecure = true   // plain gRPC inside the podman network
    }
  }
}
# Alloy collector: OTLP ingestion on 4317 (gRPC) / 4318 (HTTP), UI on 12345
podman run -d \
  --name alloy \
  --network observability \
  -v ./alloy-config.river:/etc/alloy/config.river:ro,Z \
  -p 4317:4317 -p 4318:4318 -p 12345:12345 \
  grafana/alloy:latest \
  run /etc/alloy/config.river

Grafana

# Grafana UI on port 3000, state persisted in the grafana-data volume.
# NOTE(review): GF_AUTH_GENERIC_OAUTH_ENABLED is set without any client id,
# secret or auth URLs — OAuth cannot work until those are provided; confirm intent.
podman run -d \
  --name grafana \
  --network observability \
  -v grafana-data:/var/lib/grafana:Z \
  -p 3000:3000 \
  -e GF_SECURITY_ADMIN_USER=admin \
  -e GF_SECURITY_ADMIN_PASSWORD=ChangeMeInVault! \
  -e GF_AUTH_GENERIC_OAUTH_ENABLED=true \
  grafana/grafana:latest

Option 2 : Déploiement Kubernetes (Helm charts)

Namespace et prérequis

# Dedicated namespace for the observability stack
kubectl create namespace observability

# Register the chart repositories, then refresh the local chart index
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update

Prometheus (kube-prometheus-stack)

# kube-prometheus-stack: Prometheus operator + Alertmanager + exporters.
# Grafana is disabled here because it is deployed separately below.
# The remoteWrite value is single-quoted: an unquoted "[0]" is a shell glob
# pattern (errors under zsh or bash 'failglob').
helm install prometheus prometheus-community/kube-prometheus-stack \
  --namespace observability \
  --set prometheus.prometheusSpec.retention=7d \
  --set 'prometheus.prometheusSpec.remoteWrite[0].url=http://mimir.observability:9009/api/v1/push' \
  --set alertmanager.enabled=true \
  --set grafana.enabled=false

Loki

# Loki in SimpleScalable mode, backed by the MinIO S3 endpoint.
# Values containing "[0]" are single-quoted (shell glob pattern otherwise),
# and so is the secret: an unquoted '!' triggers history expansion in
# interactive bash. Credentials shown for the demo only — use Vault.
helm install loki grafana/loki \
  --namespace observability \
  --set deploymentMode=SimpleScalable \
  --set loki.storage.type=s3 \
  --set loki.storage.s3.endpoint=http://minio.observability:9000 \
  --set loki.storage.s3.bucketnames=loki-chunks \
  --set loki.storage.s3.accessKeyId=lgtm-admin \
  --set 'loki.storage.s3.secretAccessKey=ChangeMeInVault!' \
  --set loki.storage.s3.s3ForcePathStyle=true \
  --set 'loki.schemaConfig.configs[0].from=2024-01-01' \
  --set 'loki.schemaConfig.configs[0].store=tsdb' \
  --set 'loki.schemaConfig.configs[0].object_store=s3' \
  --set 'loki.schemaConfig.configs[0].schema=v13'

Grafana avec provisioning

# grafana-values.yaml — Grafana Helm values: replicated pods, persistence,
# pre-provisioned datasources and a file-based dashboard provider.
# NOTE(review): replicas: 2 with volume persistence generally requires an
# external shared database for real HA — confirm before production use.
replicas: 2

persistence:
  enabled: true
  size: 10Gi

# Datasources provisioned at startup (no manual UI configuration needed)
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: Prometheus
        type: prometheus
        url: http://prometheus-kube-prometheus-prometheus.observability:9090
        isDefault: true
        jsonData:
          timeInterval: 15s    # matches the Prometheus scrape_interval

      - name: Mimir
        type: prometheus       # Mimir exposes a Prometheus-compatible API
        url: http://mimir.observability:9009/prometheus
        jsonData:
          timeInterval: 15s

      - name: Loki
        type: loki
        url: http://loki-gateway.observability:80

      - name: Tempo
        type: tempo
        url: http://tempo.observability:3100
        jsonData:
          # Cross-navigation from traces to logs/metrics and the service map.
          # NOTE(review): datasourceUid must match the actual UIDs of the
          # Loki/Prometheus datasources — none are set explicitly above;
          # confirm or add explicit 'uid:' fields to those datasources.
          tracesToLogsV2:
            datasourceUid: loki
            filterByTraceID: true
          tracesToMetrics:
            datasourceUid: prometheus
          serviceMap:
            datasourceUid: prometheus

# Load dashboards from files mounted under /var/lib/grafana/dashboards
dashboardProviders:
  dashboardproviders.yaml:
    apiVersion: 1
    providers:
      - name: default
        orgId: 1
        folder: "Infrastructure"
        type: file
        disableDeletion: false
        editable: true
        options:
          path: /var/lib/grafana/dashboards/default

# Dashboards are sourced from the 'grafana-dashboards' ConfigMap
dashboardsConfigMaps:
  default: grafana-dashboards
# Deploy Grafana with the provisioning values defined above
helm install grafana grafana/grafana \
  -f grafana-values.yaml \
  --namespace observability

Alloy (DaemonSet)

# Alloy deployed as a DaemonSet; /var/log is mounted for node-level log collection
helm install alloy grafana/alloy \
  --namespace observability \
  --set alloy.mounts.varlog=true \
  --set alloy.configMap.create=true

Vérification post-installation

# Check that all pods are running
kubectl get pods -n observability

# curl -fsS: fail (non-zero exit) on HTTP errors instead of silently printing
# an error body, stay quiet on progress, but still show real errors.
# NOTE(review): the host names below (prometheus, loki, grafana) resolve only
# from inside the container network / cluster — use 'kubectl port-forward' or
# 'podman exec' when testing from a workstation.

# Prometheus is scraping its targets
curl -fsS http://prometheus:9090/api/v1/targets | jq '.data.activeTargets | length'

# Loki is ready to ingest logs
curl -fsS http://loki:3100/ready

# Grafana answers on its health endpoint
curl -fsS http://grafana:3000/api/health

# Run a sample LogQL query
curl -fsS 'http://loki:3100/loki/api/v1/query?query={job="varlogs"}&limit=5'

Service discovery Kubernetes

Prometheus découvre automatiquement les pods portant l'annotation prometheus.io/scrape: "true". Ajouter cette annotation aux déploiements pour activer le scraping automatique.

# Pod annotations enabling automatic Prometheus scraping.
# NOTE(review): kube-prometheus-stack discovers targets via ServiceMonitor /
# PodMonitor CRDs by default — confirm annotation-based discovery is actually
# configured in the Prometheus scrape config before relying on these.
metadata:
  annotations:
    prometheus.io/scrape: "true"     # opt this pod in to scraping
    prometheus.io/port: "8080"       # port serving the metrics endpoint
    prometheus.io/path: "/metrics"   # metrics path on that port