The attack described involves an agent, interacting with GitHub via MCP, being tricked by a malicious GitHub issue in a public repository. This issue contains a prompt injection that coerces the agent to first access a private repository and then exfiltrate its data into a public pull request. Our SCITT-based approach can defend against this by validating each tool call against context-aware policies before execution.
Here's how it would work:
- Initial Benign Interaction:
  - User asks the agent: "Have a look at the open issues in `<user>/public-repo`."
  - The agent intends to call `github_mcp.get_issues(repo='<user>/public-repo')`.
  - Our LLM Proxy intercepts this. A BOM (`BOM_GetPublicIssues`) is created, sent to the SCITT Transparency Service (TS), validated against its Registration Policy, a receipt is issued, a Transparent Statement (`URN_GetPublicIssues`) is formed, and a JWT (`JWT_ForGetPublicIssues`) is given to the agent/MCP client to make the call. This call is likely permitted.
  - The agent fetches issues, including the malicious one containing the prompt injection.
- Agent Tries to Access Private Repo (Post-Injection):
  - The injected agent now intends to call `github_mcp.get_file_content(repo='<user>/private-repo', path='sensitive_data.txt')`.
  - The LLM Proxy intercepts this. A new BOM (`BOM_AccessPrivateFile`) is created. This BOM would crucially include:
    - `sourceOfInstruction`: "github_issue_content:<public_repo_issue_id_or_url>" (indicating the instruction came from potentially untrusted content).
    - `provenanceChain`: [`URN_GetPublicIssues`] (linking this action back to the previous validated step).
  - This BOM is submitted to the SCITT TS. The SCITT Registration Policy now plays a critical role. It would contain rules to detect this toxic flow:
    - It would check `sourceOfInstruction`. If it's from an untrusted source (like public issue content) and the target is a sensitive resource (like a private repo), it could deny the request.
    - It could analyze the `provenanceChain`. If the chain started with access to a public resource (`URN_GetPublicIssues` for `public-repo`) and now attempts to access a private resource based on content from that public interaction, it's a high-risk escalation.
  - The SCITT TS, based on this policy, would refuse to issue a receipt for `BOM_AccessPrivateFile`. The tool call is never executed. No JWT is issued. The attempt to access the private repository is blocked.
- Agent Tries to Exfiltrate Data (Would be Blocked Earlier):
  - Since the previous step was blocked, this wouldn't be reached. However, if it were, the agent would intend to call `github_mcp.create_pull_request(repo='<user>/public-repo', body='<leaked_data>')`.
  - A BOM (`BOM_CreatePublicPR`) would be created, again with `sourceOfInstruction` and `provenanceChain` (now including `URN_AccessPrivateFile`, even if hypothetical).
  - The SCITT Registration Policy would again deny this, based on rules preventing data potentially sourced from private contexts (indicated by `provenanceChain`) from being written to public contexts, especially when initiated by untrusted instructions.
The core defense lies in the LLM Proxy intercepting every tool call, creating a detailed BOM for it (including provenance and source of instruction), and having a robust, context-aware SCITT Registration Policy that can identify and block these patterns of escalation and exfiltration based on untrusted inputs.
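The register-before-execute gate described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's implementation: `TransparencyService`, `LLMProxy`, `RegistrationDenied`, `demo_policy`, the in-memory statement store, and the digest-based `urn:example:scitt:` identifier are all hypothetical stand-ins, and receipt and JWT issuance are omitted.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Callable, Optional


class RegistrationDenied(Exception):
    """Raised when the Transparency Service refuses to issue a receipt."""


@dataclass
class TransparencyService:
    # Hypothetical policy interface: return a denial reason, or None to accept.
    policy: Callable[[dict], Optional[str]]
    statements: dict = field(default_factory=dict)

    def register(self, bom: dict) -> str:
        reason = self.policy(bom)
        if reason is not None:
            raise RegistrationDenied(reason)
        # Illustrative identifier only (assumption): a digest of the canonical BOM.
        digest = hashlib.sha256(json.dumps(bom, sort_keys=True).encode()).hexdigest()
        urn = f"urn:example:scitt:{digest[:16]}"
        self.statements[urn] = bom
        return urn


@dataclass
class LLMProxy:
    ts: TransparencyService
    tools: dict  # function name -> callable standing in for the MCP tool

    def call_tool(self, function_name, parameters, source, provenance, target_repo_type):
        bom = {
            "metadata": {"sourceOfInstruction": source, "provenanceChain": list(provenance)},
            "spec": {
                "toolName": "github_mcp",
                "functionName": function_name,
                "parameters": parameters,
                "sensitivityContext": {"targetRepoType": target_repo_type},
            },
        }
        urn = self.ts.register(bom)  # raises before the tool is ever invoked
        return urn, self.tools[function_name](**parameters)


def demo_policy(bom: dict) -> Optional[str]:
    """Toy rule: untrusted issue content may not drive access to a private repo."""
    untrusted = bom["metadata"]["sourceOfInstruction"].startswith("github_issue_content:")
    private = bom["spec"]["sensitivityContext"]["targetRepoType"] == "private"
    return "untrusted instruction targets a private repository" if untrusted and private else None


executed = []  # records which tools actually ran


def get_issues(owner, repo):
    executed.append("get_issues")
    return ["issue 42"]


def get_file_content(owner, repo, path):
    executed.append("get_file_content")
    return "secret"


proxy = LLMProxy(
    TransparencyService(demo_policy),
    {"get_issues": get_issues, "get_file_content": get_file_content},
)

# Step 1: benign call, registered and executed.
urn, issues = proxy.call_tool(
    "get_issues", {"owner": "user-alpha", "repo": "public-repo"},
    "user_direct_prompt", [], "public",
)

# Step 2: injected call, denied at registration so the tool never runs.
try:
    proxy.call_tool(
        "get_file_content",
        {"owner": "user-alpha", "repo": "private-repo", "path": "secrets.txt"},
        "github_issue_content:public-repo/issues/42", [urn], "private",
    )
    blocked = False
except RegistrationDenied:
    blocked = True
```

The design point the sketch tries to capture is ordering: the tool callable is only reached after `register` returns, so a denied BOM leaves no side effects.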
sequenceDiagram
    participant LP as LLM Proxy
    participant TI as Token Issuer
    participant ToolA as Tool A
    participant SCITT_TS as SCITT Transparency Service
    participant ResR as Resource R
    LP->>TI: Request JWT_ProxyAuthored (subj: URN_ProxyValidatedCall, aud: ToolA_DID)
    TI-->>LP: JWT_ProxyAuthored
    LP->>ToolA: Invoke (auth: JWT_ProxyAuthored)
    ToolA->>TI: Request JWT_ToolScoped (using JWT_ProxyAuthored for auth, claims include original_ref: URN_ProxyValidatedCall, aud: ResR_DID)
    TI-->>ToolA: JWT_ToolScoped
    ToolA->>ResR: Access Resource (auth: JWT_ToolScoped)
    ResR->>ResR: Validate JWT_ToolScoped, extract URN_ProxyValidatedCall
    ResR->>SCITT_TS: GET TransparentStatement (urn=URN_ProxyValidatedCall)
    SCITT_TS-->>ResR: TransparentStatement (contains original_BOM_element, TS_receipt)
    ResR->>SCITT_TS: GET RegistrationPolicy.yaml (from TS's DID/URI)
    SCITT_TS-->>ResR: TS_RegistrationPolicy.yaml
    ResR->>ResR: Evaluate OwnPolicy(original_BOM_element, TS_RegistrationPolicy.yaml, JWT_ToolScoped_claims)
    alt Access Granted by OwnPolicy
        ResR-->>ToolA: Resource Data / Success
    else Access Denied by OwnPolicy
        ResR-->>ToolA: Access Denied Error
    end
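The token hand-off in the diagram, where the proxy-authored token's subject is carried forward as `original_ref` in the tool-scoped token, might look roughly like the following. This is a sketch under simplifying assumptions: it uses a single shared HS256 key and hand-rolled encoding from the standard library so it stays self-contained, whereas a real deployment would more plausibly use asymmetric keys and a vetted JWT library. The DIDs, key, URN value, and function names are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time

ISSUER_KEY = b"demo-only-secret"  # assumption: shared secret, for illustration only
URN = "urn:example:scitt:proxy-validated-call"  # stands in for URN_ProxyValidatedCall


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def mint(claims: dict, key: bytes) -> str:
    """Build a compact HS256 JWT from the given claims."""
    header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64(json.dumps(claims).encode())
    sig = hmac.new(key, f"{header}.{body}".encode(), hashlib.sha256).digest()
    return f"{header}.{body}.{_b64(sig)}"


def verify(token: str, key: bytes, audience: str) -> dict:
    """Check signature, audience, and expiry; return the claims."""
    header, body, sig = token.split(".")
    expected = hmac.new(key, f"{header}.{body}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _unb64(sig)):
        raise PermissionError("bad signature")
    claims = json.loads(_unb64(body))
    if claims.get("aud") != audience:
        raise PermissionError("wrong audience")
    if claims.get("exp", 0) < time.time():
        raise PermissionError("token expired")
    return claims


def issue_proxy_token(urn: str, tool_did: str) -> str:
    # Token Issuer -> LLM Proxy: subject is the validated call's URN.
    return mint({"sub": urn, "aud": tool_did, "exp": time.time() + 60}, ISSUER_KEY)


def issue_tool_scoped_token(proxy_token: str, tool_did: str, resource_did: str) -> str:
    # Token Issuer -> Tool A: authenticate with the proxy token, carry the URN forward.
    proxy_claims = verify(proxy_token, ISSUER_KEY, audience=tool_did)
    claims = {
        "sub": tool_did,
        "aud": resource_did,
        "original_ref": proxy_claims["sub"],
        "exp": time.time() + 60,
    }
    return mint(claims, ISSUER_KEY)


def resource_extract_urn(tool_token: str, resource_did: str) -> str:
    # Resource R: validate the tool-scoped token and recover the original URN.
    return verify(tool_token, ISSUER_KEY, audience=resource_did)["original_ref"]


proxy_jwt = issue_proxy_token(URN, "did:example:tool-a")
tool_jwt = issue_tool_scoped_token(proxy_jwt, "did:example:tool-a", "did:example:resource-r")
recovered = resource_extract_urn(tool_jwt, "did:example:resource-r")
```

Because each token is bound to one audience, presenting `proxy_jwt` directly to the resource fails the audience check. The resource only ever sees the URN through `original_ref`, which it can then look up at the Transparency Service.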
Example BOMs (YAML Pseudocode):
- `BOM_GetPublicIssues.yaml` (Initial benign call's payload):

      apiVersion: scitt.example.com/v1alpha1
      kind: ToolCallProposal
      metadata:
        statementID: "call_gpi_001"
        requestorContext: { userID: "user-alpha", sessionID: "session-123" }
        sourceOfInstruction: "user_direct_prompt" # From the user directly
      spec:
        toolName: "github_mcp"
        functionName: "get_issues"
        parameters: { owner: "user-alpha", repo: "public-repo" }
        intent: "Fetch open issues from public-repo for user review."
        sensitivityContext: { sourceRepoType: "public", targetRepoType: "public" }

- `BOM_AccessPrivateFile.yaml` (Malicious call attempt's payload):

      apiVersion: scitt.example.com/v1alpha1
      kind: ToolCallProposal
      metadata:
        statementID: "call_apf_002"
        requestorContext: { userID: "user-alpha", sessionID: "session-123" }
        # Crucial fields for policy decision:
        sourceOfInstruction: "github_issue_content:public-repo/issues/42"
        provenanceChain: ["URN_GetPublicIssues"] # URN of the BOM_GetPublicIssues Transparent Statement
      spec:
        toolName: "github_mcp"
        functionName: "get_file_content"
        parameters: { owner: "user-alpha", repo: "private-repo", path: "secrets.txt" }
        intent: "Access file private-repo/secrets.txt based on issue instructions." # Agent's stated intent
        sensitivityContext: { sourceRepoType: "public", targetRepoType: "private" } # Context shift
Example SCITT Registration Policy (YAML Pseudocode snippets): This policy is applied by the SCITT Transparency Service.
# scitt_ts_registration_policy_github.yaml
name: "GitHub MCP Secure Tool Call Policy"
on: statement_submission
jobs:
  validate_github_mcp_call:
    steps:
      - name: "Basic Schema and Permission Checks"
        # ... (ensure user has underlying GitHub permissions for action) ...
      - name: "Prevent Cross-Context Escalation from Untrusted Sources"
        run: |
          # If instruction source is external content (e.g., issue body)
          if payload.metadata.sourceOfInstruction.startsWith("github_issue_content:"):
            # And if the previous interaction (from provenance) was with a public repo
            # (This implies the policy engine can resolve URNs in provenanceChain to get their context)
            previous_op_context = resolve_urn_context(payload.metadata.provenanceChain[0] if payload.metadata.provenanceChain else None)
            if previous_op_context and previous_op_context.targetRepoType == "public":
              # And current operation targets a private repo
              if payload.spec.sensitivityContext.targetRepoType == "private":
                deny_statement("Attempt to access private repository based on instruction from public repository content is forbidden.")
      - name: "Prevent Data Exfiltration to Public Repositories"
        if: payload.spec.functionName == "create_pull_request" or payload.spec.functionName == "create_issue_comment"
        run: |
          if payload.spec.sensitivityContext.targetRepoType == "public":
            # Check if provenanceChain contains any access to private data
            for urn_in_chain in payload.metadata.provenanceChain:
              op_context = resolve_urn_context(urn_in_chain)
              if op_context and op_context.targetRepoType == "private": # or op_context.data_accessed_was_sensitive
                deny_statement("Attempt to write potentially private data to a public repository is forbidden.")
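For readers who want to exercise the two rules above, here is a runnable Python rendering. It is a sketch, not the policy engine itself: `REGISTERED_CONTEXTS` is a hypothetical lookup table standing in for whatever `resolve_urn_context` does in practice, `evaluate` returns a denial reason instead of calling `deny_statement`, and the pull-request payload is invented to illustrate the hypothetical third step.

```python
from typing import Optional

# Hypothetical stand-in for resolve_urn_context: maps the URN of an already
# registered statement to the sensitivity context recorded for it.
REGISTERED_CONTEXTS = {
    "URN_GetPublicIssues": {"targetRepoType": "public"},
    "URN_AccessPrivateFile": {"targetRepoType": "private"},
}

WRITE_FUNCTIONS = {"create_pull_request", "create_issue_comment"}


def resolve_urn_context(urn: str) -> Optional[dict]:
    return REGISTERED_CONTEXTS.get(urn)


def evaluate(payload: dict) -> Optional[str]:
    """Return a denial reason, or None if the statement may be registered."""
    meta, spec = payload["metadata"], payload["spec"]
    chain = meta.get("provenanceChain", [])
    target = spec["sensitivityContext"]["targetRepoType"]

    # Rule 1: instruction from public issue content must not reach a private repo.
    if meta["sourceOfInstruction"].startswith("github_issue_content:") and chain:
        previous = resolve_urn_context(chain[0])
        if previous and previous["targetRepoType"] == "public" and target == "private":
            return "private repository access driven by public repository content"

    # Rule 2: writes to public repos must not follow access to private data.
    if spec["functionName"] in WRITE_FUNCTIONS and target == "public":
        for urn in chain:
            context = resolve_urn_context(urn)
            if context and context["targetRepoType"] == "private":
                return "potentially private data written to a public repository"

    return None


get_public_issues = {
    "metadata": {"sourceOfInstruction": "user_direct_prompt"},
    "spec": {"functionName": "get_issues",
             "sensitivityContext": {"targetRepoType": "public"}},
}
access_private_file = {
    "metadata": {"sourceOfInstruction": "github_issue_content:public-repo/issues/42",
                 "provenanceChain": ["URN_GetPublicIssues"]},
    "spec": {"functionName": "get_file_content",
             "sensitivityContext": {"targetRepoType": "private"}},
}
create_public_pr = {
    "metadata": {"sourceOfInstruction": "github_issue_content:public-repo/issues/42",
                 "provenanceChain": ["URN_GetPublicIssues", "URN_AccessPrivateFile"]},
    "spec": {"functionName": "create_pull_request",
             "sensitivityContext": {"targetRepoType": "public"}},
}

decisions = [evaluate(p) for p in (get_public_issues, access_private_file, create_public_pr)]
```

With these inputs, the benign call is accepted and the other two are denied, by the first and second rule respectively.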
Example Resource (GitHub MCP Server) JWT Validating Policy (Conceptual - Not YAML, but logic): This is applied by the actual tool/resource (GitHub MCP server in this case) when it receives a JWT.
function validateAndExecute(jwt, intendedApiCall):
    // 1. Validate JWT core properties (signature, expiry, audience == self_DID)
    validateToken(jwt, expectedAudience = "did:web:githubmcp.example.com")

    // 2. Get URN from JWT subject
    urn = jwt.payload.subject

    // 3. Fetch Transparent Statement from SCITT TS
    transparentStatement = scittClient.fetchTransparentStatement(urn)
    if not transparentStatement:
        throw AuthorizationError("URN from JWT not found or invalid.")

    // 4. Compare BOM in Transparent Statement with the intended API call
    bomPayload = transparentStatement.payload // This is the SCITT-validated BOM
    if bomPayload.spec.toolName != "github_mcp" or \
       bomPayload.spec.functionName != intendedApiCall.functionName or \
       bomPayload.spec.parameters.owner != intendedApiCall.parameters.owner or \
       bomPayload.spec.parameters.repo != intendedApiCall.parameters.repo or \
       (bomPayload.spec.parameters.path and bomPayload.spec.parameters.path != intendedApiCall.parameters.path) or \
       (bomPayload.spec.parameters.body and bomPayload.spec.parameters.body != intendedApiCall.parameters.body): // etc. for all relevant params
        throw AuthorizationError("JWT validated for a different operation than requested.")

    // 5. If all checks pass, proceed with executing the intendedApiCall
    executeGitHubApi(intendedApiCall)
This ensures that the JWT presented to the GitHub MCP server was indeed issued for the exact operation it is being asked to perform, as recorded in the SCITT-validated BOM. The primary defense against the toxic flow, however, happens at the SCITT Registration Policy level, preventing the malicious operation from even getting a validated URN and JWT.
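One way to make the "exact operation" comparison in step 4 concrete is to compare the whole parameter mapping and skip the per-field checks. This is a suggested variation, not the check as written above: the field-by-field version only compares `path` and `body` when the BOM contains them, so a request that adds a parameter the BOM never mentioned would pass. The function name `bom_matches_call` and the dictionary shapes are assumptions for illustration.

```python
def bom_matches_call(bom_spec: dict, call: dict) -> bool:
    """True only if the call is exactly the operation recorded in the BOM."""
    return (
        bom_spec.get("toolName") == "github_mcp"
        and bom_spec.get("functionName") == call.get("functionName")
        # Whole-mapping equality: missing, extra, or changed parameters all fail.
        and bom_spec.get("parameters") == call.get("parameters")
    )


registered_spec = {
    "toolName": "github_mcp",
    "functionName": "get_issues",
    "parameters": {"owner": "user-alpha", "repo": "public-repo"},
}

same_call = {
    "functionName": "get_issues",
    "parameters": {"owner": "user-alpha", "repo": "public-repo"},
}
other_repo = {
    "functionName": "get_issues",
    "parameters": {"owner": "user-alpha", "repo": "private-repo"},
}
extra_param = {
    "functionName": "get_issues",
    "parameters": {"owner": "user-alpha", "repo": "public-repo", "path": "secrets.txt"},
}

results = [bom_matches_call(registered_spec, c) for c in (same_call, other_repo, extra_param)]
```

Only the first call matches the registered BOM; the changed repository and the extra parameter are both rejected.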