Run evidence collectors¶
Collectors are how Evidentia learns what your real systems actually do. Each one
makes an outbound, authenticated call to an external system — AWS, GitHub, Okta, a
SQL database, a SaaS GRC platform — and returns a list of SecurityFinding records
mapped to control families. This guide covers the full collector surface: the
evidentia collect command group on the CLI, and the credentialed Collect
screen in the web console.
This is the operator reference for the whole collector matrix. If you have never run a collector before, start with the gentler end-to-end walkthrough in Getting started → Your first evidence collection, which wires the GitHub collector start-to-finish; this guide assumes you already understand that shape and focuses on the breadth of providers, the credential model, and the network/SSRF guards.
What this surface is for¶
- Pull live posture from a source system and turn it into control-mapped findings you can fold into a gap analysis (--format oscal-ar) or convert to OCSF for a SIEM.
- Cover many providers: cloud (AWS), code (GitHub), identity (Okta), data stores (PostgreSQL, MySQL, SQLite, MS SQL, Oracle, Databricks, Snowflake), and vendor-risk platforms (Vanta, Drata, BitSight, SecurityScorecard).
- Ingest or convert OCSF: pull third-party OCSF findings in (collect ocsf) or push Evidentia findings out to an OCSF bundle (collect convert).
Every credentialed collector is read-only by design. The SQL adapters even run
a write-privilege probe on first connect and emit an EVIDENTIA-WRITE-PRIV-DETECTED
finding (mapped to NIST AC-6) if they find more access than they should have.
The credential model (read this first)¶
This is the highest-risk surface in Evidentia, so the credential handling is deliberate and worth understanding before you run anything:
- Secrets never pass through a CLI value flag or a request body. Tokens and passwords are sourced from server-side environment variables (or the provider SDK's own credential chain). For example, the GitHub collector reads $GITHUB_TOKEN; Okta reads $OKTA_API_TOKEN; the SQL adapters read a password from an env var you name with --password-env. The collectors refuse a secret passed as a flag.
- In the web console, the forms only carry non-secret parameters: region, repo, host, options. The browser never sees a key value, and the Status tab reports only configured: true/false plus the env-var name a token came from, never the value.
- Networked collectors block private IPs by default (--block-private-ips). This SSRF guard refuses any host that resolves to a private / loopback / link-local / multicast / reserved range, including the cloud instance-metadata endpoint 169.254.169.254, before a socket opens. You opt out per call with --allow-private-ips, and only for a trusted internal endpoint.
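The private-IP guard in the last bullet can be pictured with a short standard-library sketch. This is an illustration of the general technique, not Evidentia's implementation; the function names is_blocked_address and host_is_blocked are hypothetical.

```python
import ipaddress
import socket


def is_blocked_address(ip_text: str) -> bool:
    """Return True when an address falls in a range the guard refuses."""
    # Drop an IPv6 zone suffix such as "%eth0" before parsing.
    ip = ipaddress.ip_address(ip_text.split("%", 1)[0])
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_multicast
        or ip.is_reserved
    )


def host_is_blocked(host: str) -> bool:
    """Resolve a host and refuse it if any resolved address is blocked."""
    # getaddrinfo returns (family, type, proto, canonname, sockaddr) tuples;
    # sockaddr[0] is the textual IP address for both IPv4 and IPv6.
    addresses = {info[4][0] for info in socket.getaddrinfo(host, None)}
    return any(is_blocked_address(address) for address in addresses)
```

The sketch checks every resolved address, not just the first, because one hostname can resolve to several records. A production guard would also need to connect to the exact address it vetted, so that a second DNS lookup cannot return a different answer.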
Prerequisites¶
- Evidentia installed (pip install evidentia; verify with evidentia version).
- The provider extra for whatever you are collecting, for example pip install "evidentia-collectors[aws]", or pip install "evidentia-core[ocsf]" for the OCSF verbs. A missing extra surfaces as a clear install hint rather than a stack trace.
- The relevant credential exported in your shell (CLI) or set in the server's environment (console). See each provider's --help for the exact env-var name.
Step 1 — See what you can collect¶
List the whole collector matrix:
You will see the credentialed providers (aws, github, okta, sql,
databricks, snowflake, vanta, drata, bitsight, securityscorecard) plus
the two OCSF verbs (ocsf to ingest, convert to emit). Each subcommand has its
own --help with the exact flags and the env var its secret is read from:
Step 2 — Configure the credential¶
Each collector documents its credential source in --help. The token/password is
always read from the environment, never a flag. For GitHub:
Bash / Linux / macOS
PowerShell (Windows)
For the SQL adapters, the password lives in the env var named by --password-env
(default differs per adapter, e.g. EVIDENTIA_POSTGRES_PASSWORD). For the
SaaS/vendor-risk collectors, set the provider's token env var (OKTA_API_TOKEN,
VANTA_API_TOKEN, DRATA_API_TOKEN, BITSIGHT_API_TOKEN,
SECURITYSCORECARD_API_TOKEN, SNOWFLAKE_PASSWORD, etc.).
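The environment-only rule is easy to mirror in your own wrapper scripts. The sketch below shows one way to read a secret from a named env var and fail fast when it is missing. The helper name read_secret is hypothetical, and the error wording only echoes the message quoted under "Got stuck?"; it is not taken from Evidentia's code.

```python
import os


def read_secret(env_name: str) -> str:
    """Fetch a secret from the environment, refusing unset or blank values."""
    value = os.environ.get(env_name, "")
    if not value.strip():
        # Name the variable in the error, but never echo a secret value.
        raise RuntimeError(f"Env var '{env_name}' is not set or is empty")
    return value
```

Calling read_secret("GITHUB_TOKEN") before launching a run gives an early, explicit failure instead of a provider-side authentication error.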
Step 3 — Run a collector¶
Run one provider and write the findings JSON to a file with --output (or omit it
to print to stdout). A plain evidentia collect ... line works the same in any
shell:
A few representative providers:
The SQL adapter takes --adapter and a password-free --connection-uri, and reads
the password from the env var named by --password-env (or from the adapter's default env var when the flag is omitted):
evidentia collect sql --adapter postgres --connection-uri postgres://reader@db.example.com/app --output db-findings.json
SQLite is the exception — no auth, no password env var; pass the database file path as the connection URI:
evidentia collect sql --adapter sqlite --connection-uri /var/lib/app/data.db --output sqlite-findings.json
Step 4 — Fold findings into a gap analysis¶
The payoff: pass a findings file to evidentia gap analyze with
--format oscal-ar and each finding is embedded in the OSCAL Assessment Results
back-matter with a SHA-256 digest for chain-of-custody:
evidentia gap analyze --inventory my-controls.yaml --frameworks nist-800-53-rev5-moderate --findings findings.json --format oscal-ar --output assessment-results.json
See Run a gap analysis for the full workflow.
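If you want to record your own digest of a findings file alongside the one embedded in the assessment results, a plain SHA-256 over the file bytes is enough. This is a minimal sketch; whether Evidentia hashes the raw file bytes or a canonicalized form is not stated in this guide, so treat a match between the two digests as an assumption to verify. The function name sha256_of_file is hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in chunks keeps memory use flat even for large findings exports.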
Step 5 — Ingest or emit OCSF¶
Pull third-party OCSF findings (Prowler, AWS Security Hub) in with
collect ocsf. The --input accepts a local file path or an https:// URL;
URL mode keeps the --block-private-ips SSRF guard on by default:
Push Evidentia findings out to an OCSF Compliance Finding bundle with
collect convert:
OCSF has its own dedicated guide — see Ingest OCSF for the round-trip and SIEM-ingest path, and Emit OCSF detection findings for the export side.
Running collectors in the web console¶
Everything above also works from the browser. Start the server with
evidentia serve and open the Collect screen from the sidebar (under
Connect, route /collect). The screen has four tabs — Collectors, OCSF
ingest, Convert, and Status — each returning a list of findings rendered
as severity-tagged cards.

The auth gate (why your Run buttons may be disabled)¶
Collectors make credentialed, network-egressing calls, so the console refuses to
run them on an unauthenticated deployment. The page reads the backend's
auth_configured flag and, when it is false, shows a red "Collectors disabled"
banner and disables every credentialed Run button. To enable them, configure
API authentication by pointing EVIDENTIA_API_AUTH_TOKEN_FILE at a token file and
restarting evidentia serve. This mirrors the always-visible security-posture
banner — it is a §4(c) safeguard so that anyone who can reach the local API
cannot silently drive credentialed external calls.
Two surfaces stay enabled even without auth, because they are local-only (no network, no credentials): the Convert tab, and the OCSF ingest tab's inline content mode. The OCSF ingest URL mode is networked, so it is auth-gated like the credentialed collectors.
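The gating rule reduces to a single boolean decision, sketched below. The surface names are hypothetical labels for illustration and are not identifiers from the console's code.

```python
# Hypothetical labels for the two local-only surfaces, which need
# neither network access nor credentials.
LOCAL_ONLY_SURFACES = {"convert", "ocsf-inline"}


def run_enabled(surface: str, auth_configured: bool) -> bool:
    """A surface can run if auth is configured or the surface is local-only."""
    return auth_configured or surface in LOCAL_ONLY_SURFACES
```

Under this rule, "ocsf-url" and every credentialed provider stay disabled until auth_configured is true, while "convert" and "ocsf-inline" are always available.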
Collectors tab¶
- Choose a collector. Click a provider pill (AWS, GitHub, Okta, PostgreSQL, Databricks, Snowflake, Vanta, Drata, BitSight, SecurityScorecard). Each card says where its credentials come from (e.g. "Token via server $GITHUB_TOKEN").
- Fill the non-secret parameters. The form carries only fields like region, repo, org URL, connection URI (without a password), or account/user, never a secret. Required fields are marked; the Run button stays disabled until they are filled (and until auth is configured).
- Confirm, then run. Because a run hits a live external API, the button is a two-step: clicking Run collector reveals a red Confirm — run collector button (and a Cancel). Confirming issues the run; the page shows Running… and then a finding count with per-finding cards (severity, source system, title, description, resource id). Errors — including the provider's own message — are surfaced inline as escaped text.
OCSF ingest tab¶
Choose Inline content (paste OCSF JSON — a single finding object or an array;
parsed locally, no network) or URL (fetch from an https:// endpoint). URL
mode shows a Block private IPs (SSRF guard) checkbox that is checked by
default; unchecking it pops a warning that the URL may resolve to a private /
loopback / link-local / metadata address — only opt out for a trusted internal
endpoint. Click Ingest OCSF to run.
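Inline content may be either a single object or an array, so any script that prepares a paste should handle both shapes. A minimal sketch of that normalization follows; the function name parse_findings_payload is hypothetical, and it does not validate the OCSF schema itself.

```python
import json


def parse_findings_payload(text: str) -> list[dict]:
    """Parse pasted JSON and always return a list of finding objects."""
    data = json.loads(text)
    if isinstance(data, dict):
        # A single finding object becomes a one-element list.
        return [data]
    if isinstance(data, list) and all(isinstance(item, dict) for item in data):
        return data
    raise ValueError("expected a JSON object or an array of objects")
```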
Convert tab¶
Paste a findings document (a single SecurityFinding or an array), set the output
format (currently ocsf), and click Convert. This round-trips through the OCSF
mapping layer entirely locally — no network, no credentials — and prints the
converted JSON.
Status tab¶
Reports which collectors are installed and which credentials are configured on the server — as booleans plus the env-var name a token was sourced from. It never returns a token value. Use it to confirm a provider is ready before you try to run it.
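The "booleans plus env-var name" contract can be illustrated in a few lines. This is a sketch of the reporting idea only: the function name credential_status and the returned dictionary shape are assumptions, not the Status endpoint's actual schema.

```python
import os


def credential_status(env_names: dict[str, str]) -> dict[str, dict]:
    """Report whether each provider's env var is set, without its value."""
    return {
        provider: {
            # Only a boolean and the variable NAME are exposed.
            "configured": bool(os.environ.get(env_name, "").strip()),
            "env_var": env_name,
        }
        for provider, env_name in env_names.items()
    }
```

For example, credential_status({"github": "GITHUB_TOKEN", "okta": "OKTA_API_TOKEN"}) yields one entry per provider, and the secret itself never enters the result.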
Under the hood the console calls POST /api/collectors/<provider>/collect (and
/api/collectors/ocsf/collect, /api/collectors/convert,
GET /api/collectors/status) — the same engine the CLI drives, so the two paths
produce identical findings.
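For scripting against the same endpoint, the request can be assembled with the standard library. Only the path comes from this guide. The base URL, the Bearer authorization scheme, and the JSON body of non-secret parameters are all assumptions made for illustration, so check the API reference before relying on them. The function name build_collect_request is hypothetical.

```python
import json
import urllib.request


def build_collect_request(
    base_url: str, provider: str, params: dict, api_token: str
) -> urllib.request.Request:
    """Build (but do not send) a POST to the provider's collect endpoint."""
    url = f"{base_url.rstrip('/')}/api/collectors/{provider}/collect"
    # ASSUMPTION: the body is JSON and holds only non-secret parameters.
    body = json.dumps(params).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # ASSUMPTION: the API token is sent as a Bearer token.
            "Authorization": f"Bearer {api_token}",
        },
    )
```

Passing the returned object to urllib.request.urlopen would issue the call. Note that the provider credential is still absent from the request; it stays in the server's environment.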
What's next¶
- Your first run, end to end: Getting started → Your first evidence collection.
- Turn findings into a gap report: Run a gap analysis.
- Round-trip OCSF: Ingest OCSF · Emit OCSF detection findings.
- Manage vendor risk from collector output: Manage third-party risk.
- Run everything offline: Air-gapped install.
- The full flag matrix: the CLI reference.
Got stuck?¶
- Console Run buttons are greyed out / "Collectors disabled" banner: the deployment is unauthenticated. Set EVIDENTIA_API_AUTH_TOKEN_FILE and restart evidentia serve. Local-only Convert and OCSF inline ingest stay enabled regardless.
- "... collector not installed. Run pip install ...": install the provider extra named in the message (e.g. evidentia-collectors[aws], evidentia-core[ocsf]).
- "Env var '…' is not set or is empty": the server has no credential for that provider. Export the named env var (CLI) or set it in the server environment (console) and re-run; confirm with the Status tab.
- An OCSF URL is refused / SSRF error: the host resolved to a private / loopback / link-local / metadata address and the default guard blocked it. Use a public URL, or pass --allow-private-ips (CLI) / uncheck Block private IPs (console) only for a trusted internal endpoint.
- A finding shows compliance_status: unknown (a transient 5xx on one sub-check). The run still completes; re-run to resolve the indeterminate item.
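To decide whether a re-run is needed, you can scan a findings file for indeterminate items. The sketch below assumes the findings JSON is a top-level array of objects with a compliance_status key; the function name unknown_findings is hypothetical.

```python
def unknown_findings(findings: list[dict]) -> list[dict]:
    """Return the findings whose compliance_status is "unknown"."""
    # .get() tolerates records that lack the key entirely.
    return [item for item in findings if item.get("compliance_status") == "unknown"]
```

Loading the file with json.load and passing the result to unknown_findings gives the list of items to re-check; an empty list means nothing was left indeterminate.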