Guardrail Auditor

Audit pipelines. Preserve evidence. Iterate safely.

Start New Audit
Safety audit platform
No provider costPipeline-backed runsStructured evidence

Guardrail operations

Practical safety auditing for prompt, endpoint, and RAG systems.

Guardrail Auditor is a no-cost demo environment for exercising high-risk prompt flows, preserving execution evidence, and reviewing results in a repeatable audit pipeline.

Execution modes

3

Prompt-only, endpoint, and retrieval-grounded flows

Evidence depth

10+

Structured fields captured per result, including metadata and spans

Demo posture

$0

No paid model integrations required to understand the workflow

Latest demo run

Enterprise Assistant

72 / 100
PipelineExecutor → Scorer → Aggregator
Provider modeSimulated / Generic HTTP
ExportsJSON, CSV, optional PDF
Execution costDemo-safe
Evidence is persisted per test case, not just summarized at the run level.

Target execution

Run prompts against deterministic demo targets or generic HTTP endpoints without adding provider spend.

Evidence capture

Persist raw request and response data, normalized output, latency, provider identity, and execution status.

Heuristic scoring

Evaluate refusal strength, leakage risk, role bypass, and unsupported RAG claims with explicit verdict logic.

How it works

A repeatable audit path, not just a scorecard

1

Define a target and configuration snapshot

2

Select categories and assemble the active suite

3

Execute each test case through the pipeline

4

Score outputs and persist structured evidence

5

Export the run for review, triage, and iteration

Implementation focus

Built to demonstrate depth without cloud spend

The current demo preserves run versions, target snapshots, execution metadata, evidence spans, and remediation suggestions. That gives the product a credible backbone before any paid provider adapters are introduced.

The next step is straightforward: swap the simulated executor for real provider adapters with spend controls, while keeping the same scoring and reporting layers.

runAudit() ├─ executeTestCase() ├─ scoreExecution() ├─ persist TestResult evidence └─ aggregateRun()