PromptShield

Prometheus Metrics

PromptShield is its own Prometheus exporter, built directly into the proxy using the official prometheus/client_golang library. There is no separate exporter binary to install and no flag to enable it: the /metrics endpoint is always on and is served on the same port as the proxy.

curl http://localhost:8080/metrics
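
To inspect the endpoint from a script rather than curl, the response can be parsed as the standard Prometheus text exposition format. The following is a minimal sketch, not part of PromptShield: the helper names `parse_samples` and `fetch_metrics` are hypothetical, and the URL assumes the proxy is listening on localhost:8080 as in the curl command above.

```python
import re
import urllib.request

# One sample line of the text exposition format looks like:
#   metric_name{label="value",other="value"} 42
# The label block is optional; a trailing timestamp, if present, is ignored.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text, prefix="promptshield_"):
    """Return (name, labels, value) tuples for metrics whose name starts with prefix."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank lines and HELP/TYPE comments
            continue
        match = SAMPLE_RE.match(line)
        if not match or not match.group(1).startswith(prefix):
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def fetch_metrics(url="http://localhost:8080/metrics"):
    """Fetch the raw metrics text (assumes the proxy address used above)."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    for name, labels, value in parse_samples(fetch_metrics()):
        print(name, labels, value)
```

Filtering on the `promptshield_` prefix drops the Go runtime and process metrics that client_golang typically exposes alongside application metrics.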

Metrics

| Metric | Type | Labels | Description |
| --- | --- | --- | --- |
| promptshield_requests_total | Counter | action, provider, model | Every request, labeled by outcome |
| promptshield_request_duration_seconds | Histogram | action, provider, model | End-to-end latency including upstream LLM |
| promptshield_tokens_total | Counter | token_type, provider, model | Token counts; token_type is prompt, completion, or total |
| promptshield_entities_detected_total | Counter | entity_type, provider | PII entities detected (requires detection engine) |
| promptshield_injections_detected_total | Counter | provider, model | Prompt injection attempts detected |
| promptshield_response_scans_total | Counter | provider, model | LLM responses scanned for PII |

action label values: allow, mask, block, rate_limited, error

The duration histogram uses buckets: 50ms, 100ms, 250ms, 500ms, 1s, 2s, 5s, 10s, 30s, 60s.
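
These bucket boundaries determine how precise a latency quantile can be, because Prometheus estimates quantiles by interpolating linearly inside the bucket that contains the target rank. The sketch below illustrates that arithmetic under stated assumptions: `estimate_quantile` is a hypothetical helper that mirrors the documented behavior of `histogram_quantile` for classic histograms, it is not Prometheus's own code, and any counts passed to it here are invented for illustration.

```python
import math

# Upper bounds in seconds, taken from the bucket list above,
# plus the implicit +Inf bucket that every Prometheus histogram has.
BUCKETS = [0.05, 0.1, 0.25, 0.5, 1, 2, 5, 10, 30, 60, math.inf]


def estimate_quantile(q, cumulative_counts, bounds=BUCKETS):
    """Estimate the q-quantile (0 < q <= 1) from cumulative bucket counts."""
    total = cumulative_counts[-1]
    if total == 0:
        return math.nan
    rank = q * total
    # Find the first bucket whose cumulative count reaches the target rank.
    for i, count in enumerate(cumulative_counts):
        if count >= rank:
            break
    if math.isinf(bounds[i]):
        # Rank falls in the +Inf bucket: report the highest finite bound.
        return bounds[i - 1]
    lower = bounds[i - 1] if i > 0 else 0.0
    below = cumulative_counts[i - 1] if i > 0 else 0
    in_bucket = count - below
    if in_bucket == 0:
        return lower
    # Linear interpolation between the bucket's lower and upper bound.
    return lower + (bounds[i] - lower) * (rank - below) / in_bucket
```

For example, with made-up cumulative counts of `[10, 30, 60, 80, 90, 96, 99, 100, 100, 100, 100]`, the 95th request falls in the 1s to 2s bucket and the estimate is about 1.83 s. The true p95 could be anywhere in that bucket, so an estimate can only be as precise as the bucket it lands in.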

Prometheus config

Point Prometheus at the proxy's /metrics endpoint:

scrape_configs:
  - job_name: promptshield
    static_configs:
      - targets: ["localhost:8080"]
    scrape_interval: 15s

The full observability stack (Prometheus + Grafana pre-configured) is in infra/observability/. See Grafana for the one-command quickstart.

Useful PromQL

# Request rate by action (rps)
sum by (action) (rate(promptshield_requests_total[5m]))

# p95 end-to-end latency
histogram_quantile(0.95, sum by (le) (rate(promptshield_request_duration_seconds_bucket[5m])))

# Block rate as a percentage (both sides are summed so their label sets match)
sum(rate(promptshield_requests_total{action="block"}[5m]))
/ sum(rate(promptshield_requests_total[5m])) * 100

# Token burn rate per model (tokens/min)
sum by (model) (rate(promptshield_tokens_total{token_type="total"}[5m])) * 60

# Error rate as a percentage (both sides are summed so their label sets match)
sum(rate(promptshield_requests_total{action="error"}[5m]))
/ sum(rate(promptshield_requests_total[5m])) * 100
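
Queries like these can also be run from a script through Prometheus's HTTP API (`/api/v1/query`). The sketch below uses the p95 latency query as its example. It assumes a Prometheus server on its default address, localhost:9090, which is not stated on this page, and `build_query_url`, `extract_values`, and `run_query` are hypothetical helper names.

```python
import json
import urllib.parse
import urllib.request

P95_QUERY = (
    "histogram_quantile(0.95, sum by (le) "
    "(rate(promptshield_request_duration_seconds_bucket[5m])))"
)


def build_query_url(expr, base="http://localhost:9090"):
    """Build an instant-query URL; base is an assumed Prometheus address."""
    return base + "/api/v1/query?" + urllib.parse.urlencode({"query": expr})


def extract_values(payload):
    """Return [(labels, value)] from an instant-query response with a vector result."""
    if payload.get("status") != "success":
        raise RuntimeError(payload.get("error", "query failed"))
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError("expected a vector result, got " + data["resultType"])
    # Each sample is {"metric": {...labels}, "value": [timestamp, "value-as-string"]}.
    return [(item["metric"], float(item["value"][1])) for item in data["result"]]


def run_query(expr, base="http://localhost:9090"):
    with urllib.request.urlopen(build_query_url(expr, base), timeout=5) as resp:
        return extract_values(json.load(resp))


if __name__ == "__main__":
    for labels, value in run_query(P95_QUERY):
        print(labels, value)
```

Prometheus returns sample values as strings, so `extract_values` converts them to floats before returning them.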
