Your first evidence collection¶
Gap analysis tells you what controls a framework expects. Collectors tell you
what your real systems actually do — they pull live configuration from a
source system (GitHub, AWS, Okta, a SQL database, etc.) and emit
SecurityFinding records mapped to control families. This guide wires one
collector end-to-end: configure it, run it, inspect the findings, and fold them
into a gap analysis as tamper-evident OSCAL back-matter.
We use the GitHub collector because it needs only a personal access token and a repository you can read — no cloud account required.
Prerequisites¶
- Evidentia installed (
pip install evidentia; verify withevidentia version). - A GitHub repository you can read.
- A GitHub personal access token (classic or fine-grained) with
reporead scope. A token is required for private repos and lifts the API rate limit on public ones. (Public repos work without a token, subject to the lower unauthenticated rate limit.)
Step 1 — Configure the token¶
The GitHub collector reads its token from the GITHUB_TOKEN environment
variable by default (you can override with --token, but the env var keeps the
secret out of your shell history):
Bash / Linux / macOS
PowerShell (Windows)
Evidentia's collectors deliberately prefer environment variables / file paths for secrets over CLI value flags, so a token never lands in your shell history or a process listing.
Step 2 — Run the collector¶
evidentia collect github takes a --repo in owner/repo form and writes the
findings JSON to --output (or stdout if omitted):
Bash / Linux / macOS
PowerShell (Windows)
The collector inspects branch protection, CODEOWNERS presence, repository
visibility, and (where the token's scope allows) the security posture surface,
then writes a JSON list of SecurityFinding objects. It prints a per-severity
and per-source summary to the console and writes the full list to
findings.json:
Wrote 3 findings to findings.json
GitHub findings
(octocat/Hello-World) — 3 total
┌──────────┬───────┐
│ Severity │ Count │
├──────────┼───────┤
│ medium │ 2 │
│ low │ 1 │
└──────────┴───────┘
By source
┌────────┬───────┐
│ Source │ Count │
├────────┼───────┤
│ github │ 3 │
└────────┴───────┘
(Counts are illustrative — they depend on the repository's configuration and the token's scope. Without a token you will see fewer findings, because checks such as branch protection require authentication.)
Step 3 — Inspect the findings¶
Each finding carries a compliance_status (pass / fail / warning /
not_applicable / unknown), a severity, and one or more control_mappings
that tie the observation to NIST SP 800-53 Rev 5 control IDs (with an OLIR
relationship + a per-mapping justification). Open findings.json in your editor,
or pretty-print the first record:
Bash / Linux / macOS
PowerShell (Windows)
A branch-protection finding, for example, maps to access-enforcement controls and records whether the default branch requires reviews before merge.
Step 4 — Fold the findings into a gap analysis¶
The payoff: pass the findings to evidentia gap analyze with --format oscal-ar
and each finding is embedded in the OSCAL Assessment Results back-matter with a
SHA-256 digest, cross-referenced from observations that share a control ID:
Bash / Linux / macOS
evidentia gap analyze \
--inventory my-controls.yaml \
--frameworks nist-800-53-rev5-moderate \
--findings findings.json \
--format oscal-ar \
--output assessment-results.json
PowerShell (Windows)
evidentia gap analyze `
--inventory my-controls.yaml `
--frameworks nist-800-53-rev5-moderate `
--findings findings.json `
--format oscal-ar `
--output assessment-results.json
When the findings are embedded you will see a confirmation line on the console:
Loaded 3 finding(s) from findings.json for AR evidence embedding.
Report exported: assessment-results.json (oscal-ar)
(Don't have an inventory yet? evidentia init --preset nist-moderate-starter
scaffolds evidentia.yaml + my-controls.yaml you can edit — see the
Quickstart.)
The findings are now tamper-evident evidence attached to your assessment: an
auditor can recompute the SHA-256 of each back-matter resource and confirm it
matches the embedded digest. --findings is honored only with
--format oscal-ar; passing it with another format prints a note and ignores
the file.
Step 5 — (Optional) convert findings to OCSF¶
If your downstream tooling speaks OCSF, convert the same findings file to an
OCSF Compliance Finding bundle (requires the [ocsf] extra,
pip install "evidentia-core[ocsf]"):
See Guides → Ingest OCSF for the round-trip and the SIEM-ingest path.
What's next¶
- Run the full gap workflow: Guides → Run a gap analysis.
- Try a cloud collector:
evidentia collect awspulls AWS Config + Security Hub findings; see the CLI reference for the full collector matrix (Okta, SQL, Snowflake, Databricks, Vanta, Drata, BitSight, SecurityScorecard). - Self-assess this repo's open-source posture: Guides → OSPS self-assessment uses the GitHub collector's OSPS Baseline helpers.
Troubleshooting¶
GitHubCollectorError: Could not read repo ...— the token cannot see the repository (wrong scope, private repo without access, or a typo in--repo owner/repo). Confirm the token works:
Bash / Linux / macOS
PowerShell (Windows)
(In PowerShell, curl is an alias for Invoke-WebRequest; use curl.exe for
the real cURL, or Invoke-RestMethod -Uri https://api.github.com/repos/<owner>/<repo> -Headers @{ Authorization = "Bearer $env:GITHUB_TOKEN" }.)
- Rate-limit (HTTP 403/429) — unauthenticated requests are throttled. Set
GITHUB_TOKEN (Step 1) to use the authenticated 5000-req/hour limit.
- A finding shows compliance_status: unknown — a transient API/5xx error
on one sub-check. The run still completes; the indeterminate item is flagged
rather than dropped. Re-run to resolve.