Security & Audit¶
Cerebro MCP includes a detection-first security layer that classifies tool risk, detects suspicious invocations, and maintains an append-only audit trail. The security layer is observation-only — it logs and measures but never blocks tool execution.
Why This Exists¶
LLM supply-chain attacks are a real and growing threat. Research such as "Your Agent Is Mine" (arXiv:2604.08407) demonstrates that intermediary routers between clients and LLM providers can inject malicious tool calls that reach servers unverified. Cerebro's security layer makes such attacks visible and auditable.
Tool Risk Classification¶
Every tool registered with Cerebro MCP is assigned a risk class:
| Risk Class | Description | Example Tools |
|---|---|---|
| read_only | No side effects | execute_query, describe_table, search_models, generate_charts, quick_chart |
| server_state_write | Persists state to disk or memory | save_query, generate_report, start_research_project, storyteller_record_* |
| workspace_write | Writes to the workspace filesystem | scaffold_dashboard_tab |
| subprocess | Spawns external processes | scaffold_dashboard_tab |
| app_only | Hidden from the model; only callable by the frontend | get_mini_app_rows, get_mini_app_state |
Unknown tools
Dynamically registered tools (e.g., custom query tools from YAML) that are not in the static registry default to read_only and are flagged as unknown_tool in the audit log.
Risk Priority¶
When a tool has multiple risk classes, the primary class is determined by severity: subprocess > workspace_write > app_only > server_state_write > read_only.
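The severity ordering can be sketched as a simple lookup. This is a minimal illustration, not the server's code: the names `RISK_PRIORITY` and `primary_risk_class` are hypothetical, and the fallback to read_only for an empty input mirrors the unknown-tool default described above.

```python
# Severity order from the text above, most severe first.
RISK_PRIORITY = [
    "subprocess",
    "workspace_write",
    "app_only",
    "server_state_write",
    "read_only",
]


def primary_risk_class(risk_classes):
    """Return the most severe class present in risk_classes.

    Hypothetical helper: falls back to "read_only" when nothing matches,
    mirroring the documented default for unknown tools.
    """
    for risk in RISK_PRIORITY:
        if risk in risk_classes:
            return risk
    return "read_only"


# scaffold_dashboard_tab is listed under both workspace_write and subprocess.
print(primary_risk_class({"workspace_write", "subprocess"}))
```

For a tool carrying both workspace_write and subprocess, such as scaffold_dashboard_tab, the primary class resolves to subprocess.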
Suspicious-Call Detection¶
The security layer flags calls as suspicious in three cases:
| Flag | Trigger |
|---|---|
| app_only_tool_called | An app_only tool is invoked; these should only come from the frontend SDK |
| workspace_write_via_sse | A filesystem-writing or process-spawning tool is called over the SSE (remote) transport |
| unknown_tool | The tool name is not in the static risk registry |
Suspicious does not mean blocked
In log_only mode, suspicious calls are logged and counted but not blocked. This allows operators to establish baselines and tune thresholds before enabling enforcement in a future phase.
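The three triggers can be expressed as a small pure function. The sketch below assumes a registry shaped as a dict from tool name to a set of risk classes; that shape, `SAMPLE_REGISTRY`, and `detect_flags` are illustrative assumptions rather than Cerebro's actual internals.

```python
# Hypothetical registry shape: tool name -> set of risk classes.
# Entries are taken from the risk class table; the real registry may differ.
SAMPLE_REGISTRY = {
    "execute_query": {"read_only"},
    "save_query": {"server_state_write"},
    "scaffold_dashboard_tab": {"workspace_write", "subprocess"},
    "get_mini_app_rows": {"app_only"},
}


def detect_flags(tool_name, transport, registry=SAMPLE_REGISTRY):
    """Return the list of suspicious flags for one tool call (may be empty)."""
    classes = registry.get(tool_name)
    if classes is None:
        # Not in the static registry: flagged, but never blocked.
        return ["unknown_tool"]

    flags = []
    if "app_only" in classes:
        flags.append("app_only_tool_called")
    if transport == "sse" and classes & {"workspace_write", "subprocess"}:
        flags.append("workspace_write_via_sse")
    return flags


print(detect_flags("scaffold_dashboard_tab", "sse"))
print(detect_flags("scaffold_dashboard_tab", "stdio"))
```

The function only reports flags. In log_only mode, the caller records them and lets the call proceed.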
Audit Log¶
Every tool call — successful or failed, suspicious or not — produces an append-only JSONL audit event.
Storage¶
- Directory: configured via MCP_SECURITY_LOG_DIR (default .cerebro/security_audit/)
- File naming: security_audit_YYYY-MM-DD.jsonl (UTC date, daily rotation)
- Thread-safe: writes are serialized via a lock for concurrent SSE sessions
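A daily-rotated, lock-protected JSONL writer along these lines might look as follows. This is a sketch under stated assumptions: the function names, the module-level lock, and the choice to create the directory on first write are illustrative, not the server's confirmed behavior.

```python
import json
import threading
from datetime import datetime, timezone
from pathlib import Path

# One process-wide lock so concurrent sessions never interleave lines.
_write_lock = threading.Lock()


def audit_file_path(log_dir, now=None):
    """Build the daily file name from the UTC date."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return Path(log_dir) / f"security_audit_{now:%Y-%m-%d}.jsonl"


def append_audit_event(log_dir, event, now=None):
    """Append one event as a single JSON line and return the file path."""
    path = audit_file_path(log_dir, now)
    line = json.dumps(event)
    with _write_lock:
        path.parent.mkdir(parents=True, exist_ok=True)
        # Mode "a" only ever appends; existing lines are never rewritten.
        with path.open("a", encoding="utf-8") as fh:
            fh.write(line + "\n")
    return path
```

Because the file name is derived from the UTC date at write time, rotation needs no background job: the first event after midnight UTC simply opens a new file.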
Event Schema¶
{
  "timestamp": "2026-04-10T14:32:01.123456+00:00",
  "transport": "stdio",
  "auth_present": false,
  "tool_name": "execute_query",
  "risk_class": "read_only",
  "visibility": "public",
  "redacted_arg_summary": "{\"sql\":\"SELECT count()...\",\"database\":\"dbt\"}",
  "arg_hash": "a1b2c3d4e5f6...",
  "result_hash": "f6e5d4c3b2a1...",
  "duration_ms": 142,
  "success": true,
  "suspicious_flags": []
}
| Field | Description |
|---|---|
| transport | stdio (local) or sse (remote) |
| auth_present | Whether MCP_AUTH_TOKEN is configured in the environment |
| risk_class | Primary risk class of the tool |
| visibility | public (model-visible) or app_only (frontend-only) |
| redacted_arg_summary | First 200 characters of redacted JSON arguments |
| arg_hash / result_hash | SHA-256 of canonical JSON of redacted payloads |
| suspicious_flags | List of flag strings; empty when the call is not suspicious |
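Because each event is one JSON object per line, a baseline of suspicious activity can be computed with a few lines of Python. The helper below is a hypothetical offline analysis sketch that relies only on the tool_name and suspicious_flags fields shown in the schema.

```python
import json
from collections import Counter


def count_flags(lines):
    """Count (tool_name, flag) pairs across an iterable of JSONL lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        event = json.loads(line)
        for flag in event.get("suspicious_flags", []):
            counts[(event.get("tool_name"), flag)] += 1
    return counts


# Illustrative events; real files contain the full schema shown above.
sample = [
    '{"tool_name": "get_mini_app_rows", "suspicious_flags": ["app_only_tool_called"]}',
    '{"tool_name": "execute_query", "suspicious_flags": []}',
    '{"tool_name": "get_mini_app_rows", "suspicious_flags": ["app_only_tool_called"]}',
]
for (tool, flag), n in count_flags(sample).items():
    print(tool, flag, n)
```

An open file object can be passed directly as `lines`, since iterating a text file yields one line at a time.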
Redaction¶
Arguments and results are redacted before hashing using the same redaction engine as the reasoning trace system. Keys matching password, token, api_key, secret, authorization, private_key, and related patterns are replaced with ***REDACTED***.
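The redact-then-hash order matters: secrets never influence the digest, so two calls that differ only in a credential produce the same hash. The sketch below illustrates the idea under assumptions that the text does not confirm. It uses case-insensitive substring matching on key names, and it takes "canonical JSON" to mean sorted keys with compact separators. The real redaction engine and canonical form may differ.

```python
import hashlib
import json

# Key fragments listed in the text; the real engine covers "related patterns" too.
SENSITIVE_PARTS = ("password", "token", "api_key", "secret", "authorization", "private_key")
PLACEHOLDER = "***REDACTED***"


def redact(value):
    """Recursively replace values whose key looks sensitive (assumed substring match)."""
    if isinstance(value, dict):
        return {
            key: PLACEHOLDER
            if any(part in str(key).lower() for part in SENSITIVE_PARTS)
            else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value


def payload_hash(value):
    """SHA-256 over an assumed canonical form: sorted keys, compact separators."""
    canonical = json.dumps(redact(value), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def arg_summary(value, limit=200):
    """First `limit` characters of the redacted arguments as compact JSON."""
    return json.dumps(redact(value), separators=(",", ":"))[:limit]


print(arg_summary({"sql": "SELECT 1", "api_key": "abc123"}))
```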
Artifact Provenance¶
When remote artifacts (dbt manifest, catalog, semantic registry, docs index) are loaded or reloaded, the server emits a structured artifact_reload log event with the artifact label, source (local/remote), content hash (SHA-256), ETag, and Last-Modified header. These events are queryable in Loki.
Report Endpoint Auth Audit¶
The /reports/{id} endpoint logs every access attempt when MCP_AUTH_TOKEN is configured:
- Auth method: bearer, query_token, or none
- Success/denial status
- Report ID
Denied attempts increment cerebro_report_token_auth_total{status="denied"}.
Configuration¶
| Variable | Default | Description |
|---|---|---|
| MCP_SECURITY_POLICY_MODE | log_only | Security policy mode. Future: warn, enforce |
| MCP_SECURITY_LOG_DIR | .cerebro/security_audit | Daily JSONL audit file directory |
| MCP_EXPECTED_MANIFEST_SHA256 | (empty) | Optional SHA-256 pin for the dbt manifest; empty = disabled |
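To illustrate what a SHA-256 pin involves, the sketch below hashes the manifest bytes and compares the digest with MCP_EXPECTED_MANIFEST_SHA256. The function name and the three status strings are hypothetical. The text does not say how the server reacts to a mismatch, so this sketch only reports the outcome.

```python
import hashlib
import os


def check_manifest_pin(manifest_bytes, expected=None):
    """Return (actual_sha256, status); status is disabled, match, or mismatch."""
    if expected is None:
        expected = os.environ.get("MCP_EXPECTED_MANIFEST_SHA256", "")
    expected = expected.strip().lower()
    actual = hashlib.sha256(manifest_bytes).hexdigest()
    if not expected:
        # An empty pin means the check is disabled.
        return actual, "disabled"
    return actual, ("match" if actual == expected else "mismatch")


data = b'{"nodes": {}}'  # stand-in for the real manifest content
digest, status = check_manifest_pin(data, expected="")
print(status)
```

A pin value can be produced by hashing a known-good manifest once and storing the resulting hex digest in the environment.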
Prometheus Metrics¶
Security counters are exposed at the /metrics endpoint and visualized in the Grafana dashboard:
| Metric | Labels | Description |
|---|---|---|
| cerebro_security_high_risk_tool_calls_total | tool_name, risk_class, transport | Non-read_only tool invocations |
| cerebro_security_suspicious_calls_total | tool_name, flag_type | Suspicious call flags |
| cerebro_security_app_only_calls_total | tool_name, transport | App-only tool invocations |
| cerebro_report_token_auth_total | status | Report endpoint auth events |
Architecture¶
Tool Call
           |
           v
+------------------------+
| _wrapped_call_tool()   |  (tools/reasoning.py)
|  - timing              |
|  - tracing             |
|  - observability       |
+----------+-------------+
           |
           v
+------------------------+
| assess_tool_call()     |  (security.py)
|  - risk lookup         |
|  - flag detection      |
|  - hash computation    |
|  - JSONL audit write   |
|  - Prometheus counters |
+------------------------+
           |
(never blocks: try/except)
The security assessment runs after the tool has executed. It is wrapped in try/except Exception to guarantee it never interferes with tool execution. If the security layer itself fails, a debug-level log is emitted and the tool result is returned unchanged.
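The "assess after execution, never interfere" pattern can be sketched as a wrapper. This is a simplified synchronous illustration with hypothetical names (`call_tool_with_assessment`, `assess`). The real _wrapped_call_tool() also handles tracing and observability, and its signature is not shown in this document.

```python
import logging
import time

logger = logging.getLogger("security_sketch")


def call_tool_with_assessment(tool_fn, arguments, assess):
    """Run the tool, then run the assessment without letting it affect the outcome."""
    start = time.perf_counter()
    result = None
    success = False
    try:
        result = tool_fn(**arguments)
        success = True
        return result
    finally:
        # Runs after the tool has finished, whether it returned or raised.
        duration_ms = int((time.perf_counter() - start) * 1000)
        try:
            assess(arguments, result, duration_ms, success)
        except Exception:
            # A failing security layer is logged at debug level and ignored.
            logger.debug("security assessment failed", exc_info=True)


def broken_assess(arguments, result, duration_ms, success):
    raise RuntimeError("audit disk full")


# The tool result survives even though the assessment raises.
print(call_tool_with_assessment(lambda x: x + 1, {"x": 1}, broken_assess))
```

Placing the assessment in a `finally` block means failed tool calls are audited too, while the tool's own exception still propagates to the caller unchanged.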
Future: Enforcement Phase¶
The MCP_SECURITY_POLICY_MODE setting is designed to accept two additional modes in a future phase:
- warn: Log suspicious calls and emit warnings in tool responses, but do not block
- enforce: Block calls that match configurable rules (e.g., workspace writes over SSE without explicit authorization)
The JSONL audit trail and Prometheus counters provide the observability foundation needed to establish baselines and tune enforcement thresholds before enabling them.