PromptShield

Policy

One YAML file controls what gets blocked, masked, or allowed per entity type. Stored in git, reloads without a restart.

Edit config/policy.yaml:

pii:
  EMAIL_ADDRESS: mask # replace with [EMAIL_ADDRESS] before the LLM sees it
  PHONE_NUMBER: mask
  IP_ADDRESS: mask
  CREDIT_CARD: block # reject the request, LLM never called
  US_SSN: block
  IBAN_CODE: block
  CRYPTO: block
  MEDICAL_LICENSE: block
  PERSON: allow # pass through unchanged
  LOCATION: allow
  DATE_TIME: allow
  URL: allow

secrets:
  action: block # block any request containing a detected secret

injection:
  action: block # block | allow (detection not yet active, see note below)

on_detector_error: fail_closed

Policy enforcement requires the detection engine. Set PROMPTSHIELD_ENGINE_URL to connect it. Without the engine, the proxy runs in gateway mode: the policy file loads but nothing is enforced.

Actions

Action   Behavior
block    HTTP 403 returned. LLM never called. Zero tokens consumed.
mask     Entity replaced with [ENTITY_TYPE]. Sanitized prompt forwarded.
allow    Passes through unchanged.
warn     Logs the event and writes an audit record, but lets the request through. Coming soon.

warn is useful for rolling out policy changes gradually: run in warn mode, review what would have been blocked in the audit log, tune the config, then switch to block.
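
The mask action described above can be pictured as a span replacement over the prompt. A minimal sketch, assuming the detection engine reports entities as (start, end, type) offsets — the function and the hard-coded spans are illustrative, not PromptShield's implementation:

```python
def mask_entities(prompt: str, entities: list[tuple[int, int, str]]) -> str:
    """Replace each detected (start, end, type) span with [ENTITY_TYPE]."""
    # Apply replacements right-to-left so earlier offsets stay valid.
    for start, end, entity_type in sorted(entities, reverse=True):
        prompt = prompt[:start] + f"[{entity_type}]" + prompt[end:]
    return prompt

prompt = "Contact alice@example.com or call 555-0123."
entities = [(8, 25, "EMAIL_ADDRESS"), (34, 42, "PHONE_NUMBER")]
print(mask_entities(prompt, entities))
# Contact [EMAIL_ADDRESS] or call [PHONE_NUMBER].
```

The sanitized prompt is what gets forwarded to the LLM; the original never leaves the proxy.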

Secrets

The secrets block controls what happens when a credential appears in a prompt. See Secrets Detection for the full list of detected types.

secrets:
  action: block # block | allow | warn (coming soon)

mask does not apply to secrets. A partially redacted API key may still be usable, so the options are block or pass through.

Injection

Injection detection is not yet active. The injection config block is accepted and parsed, but the detection engine currently returns false for all injection checks. It is the next feature planned. Configure the policy now and it will take effect when detection lands.

When injection detection ships, mask will not apply. Replacing "ignore previous instructions" with a placeholder still produces a broken prompt. Any mask set on injection will escalate to block.
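
That escalation rule can be sketched as a small normalization step. The function name and structure are hypothetical; only the rule itself (mask on injection becomes block) comes from the text above:

```python
def effective_action(category: str, configured: str) -> str:
    # mask cannot be honored for injection: a masked injection attempt
    # is still a broken prompt, so the action escalates to block.
    if category == "injection" and configured == "mask":
        return "block"
    return configured

print(effective_action("injection", "mask"))  # block (escalated)
print(effective_action("pii", "mask"))        # mask (honored as-is)
```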

Response scanning

To scan LLM responses before they reach your app:

response:
  scan: true
  EMAIL_ADDRESS: mask
  CREDIT_CARD: block

Engine errors

on_detector_error controls what happens when the detection engine is unreachable:

  • fail_closed: block any request that cannot be scanned. Recommended for production.
  • fail_open: pass unscanned requests through. Use during development when uptime matters more than enforcement.
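
The two failure modes can be sketched as follows. This is an illustrative stand-in, assuming the engine call raises on connection failure; `scan`, `handle_request`, and the return values are made up for the example:

```python
def handle_request(prompt, scan, on_detector_error="fail_closed"):
    try:
        verdict = scan(prompt)  # call out to the detection engine
    except ConnectionError:
        if on_detector_error == "fail_closed":
            return "blocked"            # request cannot be scanned: reject
        return "forwarded_unscanned"    # fail_open: pass through anyway
    return "blocked" if verdict == "block" else "forwarded"

def engine_down(prompt):
    raise ConnectionError("engine unreachable")

print(handle_request("hi", engine_down, "fail_closed"))  # blocked
print(handle_request("hi", engine_down, "fail_open"))    # forwarded_unscanned
```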

Keeping policy in version control

policy.yaml is a plain file. Check it into git, review changes in pull requests, and deploy it alongside your application code. The proxy watches the file and reloads on change without a restart.
