GenAI Mutating Workload Identity Example

References
- https://invariantlabs.ai/blog/mcp-github-vulnerability
- https://mailarchive.ietf.org/arch/msg/scitt/BjCAySWyODuhDWwn4kMtCoY5eDA/
  - https://github.com/johnandersen777/litellm/commit/3b6b7427b15c0cadd23a8b5da639e22a2fba5043
    - See file scitt_validated_tool_use.py

The attack described involves an agent, interacting with GitHub via MCP, being tricked by a malicious GitHub issue in a public repository. This issue contains a prompt injection that coerces the agent to first access a private repository and then exfiltrate its data into a public pull request. Our SCITT-based approach can defend against this by validating each tool call against context-aware policies before execution.

Here's how it would work:

Initial Benign Interaction:
- User asks the agent: "Have a look at the open issues in <user>/public-repo."
- The agent intends to call github_mcp.get_issues(repo='<user>/public-repo').
- Our LLM Proxy intercepts this call and creates a BOM (BOM_GetPublicIssues), which is sent to the SCITT Transparency Service (TS) and validated against its Registration Policy. The TS issues a receipt, a Transparent Statement (URN_GetPublicIssues) is formed, and a JWT (JWT_ForGetPublicIssues) is given to the agent/MCP client to make the call. Because the instruction comes directly from the user and targets a public repository, this call is likely permitted.
- The agent fetches issues, including the malicious one containing the prompt injection.

Agent Tries to Access Private Repo (Post-Injection):
- The agent, now acting on the injected instructions, intends to call github_mcp.get_file_content(repo='<user>/private-repo', path='secrets.txt').
- The LLM Proxy intercepts this. A new BOM (BOM_AccessPrivateFile) is created. This BOM would crucially include:
  - sourceOfInstruction: "github_issue_content:<public_repo_issue_id_or_url>" (indicating the instruction came from potentially untrusted content).
  - provenanceChain: [URN_GetPublicIssues] (linking this action back to the previous validated step).
- This BOM is submitted to the SCITT TS. The SCITT Registration Policy now plays a critical role. It would contain rules to detect this toxic flow:
  - It would check sourceOfInstruction. If it's from an untrusted source (like public issue content) and the target is a sensitive resource (like a private repo), it could deny the request.
  - It could analyze the provenanceChain. If the chain started with access to a public resource (URN_GetPublicIssues for public-repo) and now attempts to access a private resource based on content from that public interaction, it's a high-risk escalation.
- The SCITT TS, based on this policy, would refuse to issue a receipt for BOM_AccessPrivateFile. Without a receipt, no Transparent Statement is formed and no JWT is issued, so the tool call is never executed and the attempt to access the private repository is blocked.

Agent Tries to Exfiltrate Data (Would be Blocked Earlier):
- Since the previous step was blocked, this wouldn't be reached. However, if it were, the agent would intend to call github_mcp.create_pull_request(repo='<user>/public-repo', body='<leaked_data>').
- A BOM (BOM_CreatePublicPR) would be created, again with sourceOfInstruction and provenanceChain (now including URN_AccessPrivateFile, even if hypothetical).
- The SCITT Registration Policy would again deny this, based on rules preventing data potentially sourced from private contexts (indicated by provenanceChain) from being written to public contexts, especially when initiated by untrusted instructions.

The core defense lies in the LLM Proxy intercepting every tool call, creating a detailed BOM for it (including provenance and source of instruction), and having a robust, context-aware SCITT Registration Policy that can identify and block these patterns of escalation and exfiltration based on untrusted inputs.
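
The interception loop described above can be sketched in Python. This is a minimal illustration, not the implementation in scitt_validated_tool_use.py: the `LLMProxy` class, the `register` callable standing in for the SCITT TS, the `mark_untrusted` taint rule, and the `urn:example:` identifiers are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


class ToolCallDenied(Exception):
    """Raised when the transparency service refuses to register a BOM."""


@dataclass
class LLMProxy:
    # Hypothetical stand-in for the SCITT TS: returns a URN when the
    # Registration Policy accepts the BOM, or None when it denies it.
    register: Callable[[dict], Optional[str]]
    provenance_chain: list = field(default_factory=list)
    # Assumed taint marker, set once untrusted content enters the context.
    untrusted_source: Optional[str] = None

    def mark_untrusted(self, source: str) -> None:
        self.untrusted_source = source

    def call_tool(self, function_name: str, parameters: dict,
                  target_repo_type: str, execute: Callable[..., Any]) -> Any:
        bom = {
            "metadata": {
                "sourceOfInstruction": self.untrusted_source or "user_direct_prompt",
                "provenanceChain": list(self.provenance_chain),
            },
            "spec": {
                "toolName": "github_mcp",
                "functionName": function_name,
                "parameters": parameters,
                "sensitivityContext": {"targetRepoType": target_repo_type},
            },
        }
        urn = self.register(bom)
        if urn is None:
            # No receipt means no JWT, so the tool is never invoked.
            raise ToolCallDenied(f"{function_name} was not registered")
        self.provenance_chain.append(urn)
        return execute(**parameters)


def toy_register(bom: dict) -> Optional[str]:
    """Toy policy: untrusted instructions may not target private repos."""
    source = bom["metadata"]["sourceOfInstruction"]
    target = bom["spec"]["sensitivityContext"]["targetRepoType"]
    if source.startswith("github_issue_content:") and target == "private":
        return None
    return "urn:example:" + bom["spec"]["functionName"]


executed = []
proxy = LLMProxy(register=toy_register)

# Step 1: benign call requested directly by the user.
issues = proxy.call_tool(
    "get_issues", {"owner": "user-alpha", "repo": "public-repo"}, "public",
    lambda owner, repo: executed.append("get_issues") or ["issue 42"],
)

# Step 2: issue content is now in the context, so later calls are tainted.
proxy.mark_untrusted("github_issue_content:public-repo/issues/42")
try:
    proxy.call_tool(
        "get_file_content",
        {"owner": "user-alpha", "repo": "private-repo", "path": "secrets.txt"},
        "private",
        lambda owner, repo, path: executed.append("get_file_content"),
    )
    blocked = False
except ToolCallDenied:
    blocked = True
```

How the proxy decides that an instruction came from untrusted content is not specified in this document. The sketch simply marks the session as tainted once issue content has been fetched, which is a conservative choice.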

sequenceDiagram
    participant LP as LLM Proxy
    participant TI as Token Issuer
    participant ToolA as Tool A
    participant SCITT_TS as SCITT Transparency Service
    participant ResR as Resource R

    LP->>TI: Request JWT_ProxyAuthored (subj: URN_ProxyValidatedCall, aud: ToolA_DID)
    TI-->>LP: JWT_ProxyAuthored
    LP->>ToolA: Invoke (auth: JWT_ProxyAuthored)

    ToolA->>TI: Request JWT_ToolScoped (using JWT_ProxyAuthored for auth, claims include original_ref: URN_ProxyValidatedCall, aud: ResR_DID)
    TI-->>ToolA: JWT_ToolScoped
    ToolA->>ResR: Access Resource (auth: JWT_ToolScoped)

    ResR->>ResR: Validate JWT_ToolScoped, extract URN_ProxyValidatedCall
    ResR->>SCITT_TS: GET TransparentStatement (urn=URN_ProxyValidatedCall)
    SCITT_TS-->>ResR: TransparentStatement (contains original_BOM_element, TS_receipt)
    ResR->>SCITT_TS: GET RegistrationPolicy.yaml (from TS's DID/URI)
    SCITT_TS-->>ResR: TS_RegistrationPolicy.yaml
    ResR->>ResR: Evaluate OwnPolicy(original_BOM_element, TS_RegistrationPolicy.yaml, JWT_ToolScoped_claims)
    alt Access Granted by OwnPolicy
        ResR-->>ToolA: Resource Data / Success
    else Access Denied by OwnPolicy
        ResR-->>ToolA: Access Denied Error
    end
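
The token hand-off in the diagram can be illustrated with a small Python sketch that uses only the standard library. It rests on several assumptions that are not part of the design above: a single shared HMAC key (`TI_KEY`) stands in for whatever keys the Token Issuer would really use (the DIDs in the diagram suggest asymmetric keys), HS256 is chosen only for simplicity, and the `sub` value of the tool-scoped token is made up. The `original_ref` claim name comes from the diagram.

```python
import base64
import hashlib
import hmac
import json
import time

TI_KEY = b"demo-shared-secret"  # assumption: demo key, not for real use


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_jwt(claims: dict, key: bytes) -> str:
    """Token Issuer: sign the claims as a compact HS256 JWT."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = (_b64url(json.dumps(header).encode()) + "."
                     + _b64url(json.dumps(claims).encode()))
    signature = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def verify_jwt(token: str, key: bytes, audience: str) -> dict:
    """Check signature, audience and expiry, then return the claims."""
    signing_input, _, signature_part = token.rpartition(".")
    expected = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(signature_part)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(signing_input.split(".")[1]))
    if claims.get("aud") != audience:
        raise ValueError("wrong audience")
    if claims.get("exp", 0) < time.time():
        raise ValueError("token expired")
    return claims


def exchange_for_tool_scoped(proxy_token: str) -> str:
    """Tool A trades JWT_ProxyAuthored for JWT_ToolScoped aimed at Resource R."""
    proxy_claims = verify_jwt(proxy_token, TI_KEY, audience="ToolA_DID")
    return issue_jwt({
        "sub": "ToolA_DID",                    # assumption
        "aud": "ResR_DID",
        "original_ref": proxy_claims["sub"],   # carries the URN forward
        "exp": int(time.time()) + 60,
    }, TI_KEY)


jwt_proxy_authored = issue_jwt(
    {"sub": "URN_ProxyValidatedCall", "aud": "ToolA_DID",
     "exp": int(time.time()) + 60},
    TI_KEY,
)
jwt_tool_scoped = exchange_for_tool_scoped(jwt_proxy_authored)

# Resource R validates the token and extracts the URN to look up in the TS.
resource_claims = verify_jwt(jwt_tool_scoped, TI_KEY, audience="ResR_DID")
urn_to_resolve = resource_claims["original_ref"]
```

Because each token is bound to one audience, the proxy-authored token cannot be replayed directly against Resource R; only the exchanged token, which still references the original URN, is accepted there.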

Example BOMs (YAML Pseudocode):

BOM_GetPublicIssues.yaml (Initial benign call's payload):

apiVersion: scitt.example.com/v1alpha1
kind: ToolCallProposal
metadata:
  statementID: "call_gpi_001"
  requestorContext: { userID: "user-alpha", sessionID: "session-123" }
  sourceOfInstruction: "user_direct_prompt" # From the user directly
spec:
  toolName: "github_mcp"
  functionName: "get_issues"
  parameters: { owner: "user-alpha", repo: "public-repo" }
  intent: "Fetch open issues from public-repo for user review."
  sensitivityContext: { sourceRepoType: "public", targetRepoType: "public" }

BOM_AccessPrivateFile.yaml (Malicious call attempt's payload):

apiVersion: scitt.example.com/v1alpha1
kind: ToolCallProposal
metadata:
  statementID: "call_apf_002"
  requestorContext: { userID: "user-alpha", sessionID: "session-123" }
  # Crucial fields for policy decision:
  sourceOfInstruction: "github_issue_content:public-repo/issues/42"
  provenanceChain: ["URN_GetPublicIssues"] # URN of the BOM_GetPublicIssues Transparent Statement
spec:
  toolName: "github_mcp"
  functionName: "get_file_content"
  parameters: { owner: "user-alpha", repo: "private-repo", path: "secrets.txt" }
  intent: "Access file private-repo/secrets.txt based on issue instructions." # Agent's stated intent
  sensitivityContext: { sourceRepoType: "public", targetRepoType: "private" } # Context shift
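
For readers who prefer typed structures, the two BOMs above could be modeled roughly as follows. The `ToolCallProposal` dataclass, its helper properties, and `to_payload` are hypothetical names introduced for this sketch; `requestorContext` and `intent` are omitted for brevity, and an empty `provenanceChain` is emitted where the YAML simply leaves the field out.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCallProposal:
    statement_id: str
    source_of_instruction: str
    function_name: str
    parameters: dict
    source_repo_type: str
    target_repo_type: str
    provenance_chain: list = field(default_factory=list)
    tool_name: str = "github_mcp"

    @property
    def instructed_by_untrusted_content(self) -> bool:
        # Anything other than a direct user prompt is treated as untrusted here.
        return self.source_of_instruction != "user_direct_prompt"

    @property
    def crosses_into_private(self) -> bool:
        # The "context shift" flagged in BOM_AccessPrivateFile.
        return self.source_repo_type == "public" and self.target_repo_type == "private"

    def to_payload(self) -> dict:
        """Produce a dict laid out like the YAML documents above."""
        return {
            "apiVersion": "scitt.example.com/v1alpha1",
            "kind": "ToolCallProposal",
            "metadata": {
                "statementID": self.statement_id,
                "sourceOfInstruction": self.source_of_instruction,
                "provenanceChain": list(self.provenance_chain),
            },
            "spec": {
                "toolName": self.tool_name,
                "functionName": self.function_name,
                "parameters": dict(self.parameters),
                "sensitivityContext": {
                    "sourceRepoType": self.source_repo_type,
                    "targetRepoType": self.target_repo_type,
                },
            },
        }


bom_get_public_issues = ToolCallProposal(
    statement_id="call_gpi_001",
    source_of_instruction="user_direct_prompt",
    function_name="get_issues",
    parameters={"owner": "user-alpha", "repo": "public-repo"},
    source_repo_type="public",
    target_repo_type="public",
)

bom_access_private_file = ToolCallProposal(
    statement_id="call_apf_002",
    source_of_instruction="github_issue_content:public-repo/issues/42",
    function_name="get_file_content",
    parameters={"owner": "user-alpha", "repo": "private-repo", "path": "secrets.txt"},
    source_repo_type="public",
    target_repo_type="private",
    provenance_chain=["URN_GetPublicIssues"],
)
```

The two properties only surface the signals a policy would inspect; the decision to allow or deny still belongs to the Registration Policy.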

Example SCITT Registration Policy (YAML Pseudocode snippets): This policy is applied by the SCITT Transparency Service.

# scitt_ts_registration_policy_github.yaml
name: "GitHub MCP Secure Tool Call Policy"
on: statement_submission
jobs:
  validate_github_mcp_call:
    steps:
      - name: "Basic Schema and Permission Checks"
        # ... (ensure user has underlying GitHub permissions for action) ...

      - name: "Prevent Cross-Context Escalation from Untrusted Sources"
        run: |
          # If instruction source is external content (e.g., issue body)
          if payload.metadata.sourceOfInstruction.startswith("github_issue_content:"):
            # And if the previous interaction (from provenance) was with a public repo
            # (This implies the policy engine can resolve URNs in provenanceChain to get their context)
            previous_op_context = resolve_urn_context(payload.metadata.provenanceChain[0] if payload.metadata.provenanceChain else None)
            if previous_op_context and previous_op_context.targetRepoType == "public":
              # And current operation targets a private repo
              if payload.spec.sensitivityContext.targetRepoType == "private":
                deny_statement("Attempt to access private repository based on instruction from public repository content is forbidden.")

      - name: "Prevent Data Exfiltration to Public Repositories"
        if: payload.spec.functionName == "create_pull_request" or payload.spec.functionName == "create_issue_comment"
        run: |
          if payload.spec.sensitivityContext.targetRepoType == "public":
            # Check if provenanceChain contains any access to private data
            for urn_in_chain in payload.metadata.provenanceChain:
              op_context = resolve_urn_context(urn_in_chain)
              if op_context and op_context.targetRepoType == "private": # or op_context.data_accessed_was_sensitive
                deny_statement("Attempt to write potentially private data to a public repository is forbidden.")
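
The two rules above can also be written as a plain Python function, which makes them easy to unit test. This is a sketch under assumptions: payloads are dicts shaped like the example BOMs, `resolve_urn_context` is passed in as a callable (here backed by an in-memory dict named `KNOWN_CONTEXTS`), and the function returns a denial reason instead of calling `deny_statement`.

```python
from typing import Callable, Optional


def evaluate(payload: dict,
             resolve_urn_context: Callable[[str], Optional[dict]]) -> Optional[str]:
    """Return a denial reason, or None if the statement may be registered."""
    metadata = payload["metadata"]
    spec = payload["spec"]
    chain = metadata.get("provenanceChain", [])
    target = spec["sensitivityContext"]["targetRepoType"]

    # Rule 1: no escalation from public issue content to a private repo.
    if metadata["sourceOfInstruction"].startswith("github_issue_content:"):
        previous = resolve_urn_context(chain[0]) if chain else None
        if previous and previous.get("targetRepoType") == "public" and target == "private":
            return ("Attempt to access private repository based on instruction "
                    "from public repository content is forbidden.")

    # Rule 2: no writes to public repos after private data was accessed.
    if spec["functionName"] in ("create_pull_request", "create_issue_comment"):
        if target == "public":
            for urn in chain:
                context = resolve_urn_context(urn)
                if context and context.get("targetRepoType") == "private":
                    return ("Attempt to write potentially private data to a "
                            "public repository is forbidden.")
    return None


# Assumed lookup table standing in for URN resolution inside the TS.
KNOWN_CONTEXTS = {
    "URN_GetPublicIssues": {"targetRepoType": "public"},
    "URN_AccessPrivateFile": {"targetRepoType": "private"},
}


def make_payload(function_name, source, chain, target):
    return {
        "metadata": {"sourceOfInstruction": source, "provenanceChain": chain},
        "spec": {"functionName": function_name,
                 "sensitivityContext": {"targetRepoType": target}},
    }


benign = evaluate(
    make_payload("get_issues", "user_direct_prompt", [], "public"),
    KNOWN_CONTEXTS.get,
)
escalation = evaluate(
    make_payload("get_file_content", "github_issue_content:public-repo/issues/42",
                 ["URN_GetPublicIssues"], "private"),
    KNOWN_CONTEXTS.get,
)
exfiltration = evaluate(
    make_payload("create_pull_request", "github_issue_content:public-repo/issues/42",
                 ["URN_GetPublicIssues", "URN_AccessPrivateFile"], "public"),
    KNOWN_CONTEXTS.get,
)
```

Like the YAML, rule 1 only inspects the first entry of the provenance chain; a stricter variant might walk the whole chain.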

Example Resource (GitHub MCP Server) JWT Validating Policy (Conceptual - Not YAML, but logic): This is applied by the actual tool/resource (GitHub MCP server in this case) when it receives a JWT.

function validateAndExecute(jwt, intendedApiCall):
  // 1. Validate JWT core properties (signature, expiry, audience == self_DID)
  validateToken(jwt, expectedAudience = "did:web:githubmcp.example.com")

  // 2. Get URN from JWT subject
  urn = jwt.payload.subject

  // 3. Fetch Transparent Statement from SCITT TS
  transparentStatement = scittClient.fetchTransparentStatement(urn)
  if not transparentStatement:
    throw AuthorizationError("URN from JWT not found or invalid.")

  // 4. Compare BOM in Transparent Statement with the intended API call
  bomPayload = transparentStatement.payload // This is the SCITT-validated BOM
  // Compare the full parameter set (owner, repo, path, body, etc.) so that a
  // parameter absent from the BOM cannot be added to the call unnoticed.
  if bomPayload.spec.toolName != "github_mcp" or \
     bomPayload.spec.functionName != intendedApiCall.functionName or \
     bomPayload.spec.parameters != intendedApiCall.parameters:
    throw AuthorizationError("JWT validated for a different operation than requested.")

  // 5. If all checks pass, proceed with executing the intendedApiCall
  executeGitHubApi(intendedApiCall)

This ensures that the JWT presented to the GitHub MCP server was indeed issued for the exact operation it is being asked to perform, as recorded in the SCITT-validated BOM. The primary defense against the toxic flow, however, happens at the SCITT Registration Policy level, preventing the malicious operation from even getting a validated URN and JWT.
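
A runnable Python version of the comparison step might look like the following. It assumes the JWT has already been verified and decoded into `jwt_claims`, that `fetch_statement` is some client call returning the Transparent Statement as a dict (or None), and that parameters must match exactly; all three names and the dict shapes are illustrative, not an existing API.

```python
from typing import Callable, Optional


class AuthorizationError(Exception):
    pass


def authorize_call(jwt_claims: dict, intended_call: dict,
                   fetch_statement: Callable[[str], Optional[dict]]) -> dict:
    """Return the approved spec if the requested call matches the BOM."""
    urn = jwt_claims.get("sub")
    statement = fetch_statement(urn) if urn else None
    if statement is None:
        raise AuthorizationError("URN from JWT not found or invalid.")

    spec = statement["payload"]["spec"]
    approved = (spec["toolName"], spec["functionName"], spec["parameters"])
    requested = ("github_mcp", intended_call["functionName"],
                 intended_call["parameters"])
    if approved != requested:
        raise AuthorizationError(
            "JWT validated for a different operation than requested.")
    return spec


# Assumed in-memory stand-in for the SCITT TS lookup.
STATEMENTS = {
    "URN_GetPublicIssues": {
        "payload": {"spec": {
            "toolName": "github_mcp",
            "functionName": "get_issues",
            "parameters": {"owner": "user-alpha", "repo": "public-repo"},
        }},
    },
}

claims = {"sub": "URN_GetPublicIssues"}

approved_spec = authorize_call(
    claims,
    {"functionName": "get_issues",
     "parameters": {"owner": "user-alpha", "repo": "public-repo"}},
    STATEMENTS.get,
)

try:
    # Same token, different repository: must be rejected.
    authorize_call(
        claims,
        {"functionName": "get_issues",
         "parameters": {"owner": "user-alpha", "repo": "private-repo"}},
        STATEMENTS.get,
    )
    mismatch_rejected = False
except AuthorizationError:
    mismatch_rejected = True
```

Exact equality on the parameter dict is the strictest option. A deployment that needs optional parameters would have to define which ones may differ, which this sketch deliberately does not attempt.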

