MCP Integration

StackBilt exposes its 6-mode architecture workflow as an MCP-compliant remote server. Connect MCP-compatible agents (Claude Code, Claude Desktop, custom agents) to run architecture flows programmatically. The server provides 22 native tools for flow and artifact management, and proxies up to 47 additional Compass governance tools — tier-gated based on your plan.

Note: Compass governance tool proxying is implemented but pending activation. The 22 native tools are fully operational. See Compass Governance for integration status.

Production endpoint: https://stackbilt.dev/mcp

Protocol versions: 2024-11-05 (SSE transport) · 2025-03-26 (Streamable HTTP transport)

Authentication

MCP endpoints require authentication (except GET /mcp/info). Three methods, checked in order:

| Method | Header | Notes |
| --- | --- | --- |
| Static token | Authorization: Bearer <STACKBILT_MCP_TOKEN> | Admin-level, legacy MCP_TOKEN fallback |
| Access key | Authorization: Bearer ska_... or X-Access-Key: ska_... | Requires ai:invoke scope |
| Compass JWT | Authorization: Bearer eyJ... | RS256, verified via JWKS |
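
The three methods in the table differ only in which header carries the credential. A minimal sketch of a header builder follows; the `{ kind, value }` credential shape and the function name are hypothetical conventions for this example, not part of the StackBilt API.

```javascript
// Build request headers for one of the three credential types in the table.
// The { kind, value } shape is a hypothetical convention for this sketch.
function buildAuthHeaders(credential) {
  const { kind, value } = credential;
  switch (kind) {
    case "static-token": // STACKBILT_MCP_TOKEN (admin-level)
    case "jwt": // Compass JWT (eyJ...)
      return { Authorization: `Bearer ${value}` };
    case "access-key":
      if (!value.startsWith("ska_")) {
        throw new Error("Access keys are expected to start with ska_");
      }
      // Both header forms are documented for access keys; X-Access-Key is used here.
      return { "X-Access-Key": value };
    default:
      throw new Error(`Unknown credential kind: ${kind}`);
  }
}
```

For example, `buildAuthHeaders({ kind: "access-key", value: "ska_..." })` yields an object that can be spread into the `headers` of any request to the MCP endpoint.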

Exchange an access key for a JWT that works at both StackBilt and Compass:

curl -X POST https://stackbilt.dev/api/auth/token \
  -H "X-Access-Key: ska_..." \
  -H "Content-Type: application/json" \
  -d '{"expires_in": 3600}'
# Returns: { "access_token": "eyJ...", "token_type": "Bearer", "expires_in": 3600 }

One key, both services.
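
The same exchange can be done from Node.js. The sketch below mirrors the curl call and the documented response fields (`access_token`, `token_type`, `expires_in`). The function name, the injectable `fetchImpl` parameter, and the 30-second refresh margin are assumptions made for illustration.

```javascript
// Exchange an ska_ access key for a short-lived JWT (Node 18+ for global fetch).
// fetchImpl is injectable so the function can be exercised without network access.
async function exchangeAccessKey(accessKey, expiresIn = 3600, fetchImpl = globalThis.fetch) {
  const res = await fetchImpl("https://stackbilt.dev/api/auth/token", {
    method: "POST",
    headers: { "X-Access-Key": accessKey, "Content-Type": "application/json" },
    body: JSON.stringify({ expires_in: expiresIn }),
  });
  if (!res.ok) {
    throw new Error(`Token exchange failed with HTTP ${res.status}`);
  }
  const { access_token, token_type, expires_in } = await res.json();
  return {
    // Ready-to-use Authorization header value, e.g. "Bearer eyJ..."
    authorization: `${token_type} ${access_token}`,
    // Refresh slightly before expiry; the 30 s margin is an arbitrary choice.
    refreshAt: Date.now() + Math.max(0, expires_in - 30) * 1000,
  };
}
```

Caching the returned value until `refreshAt` avoids one exchange per request.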

Transport Options

| Transport | Endpoint | Method | Use Case |
| --- | --- | --- | --- |
| Streamable HTTP | /mcp | POST | Modern clients, single request/response |
| SSE Stream | /mcp | GET | Server-pushed events, session-based |
| Legacy SSE | /mcp/sse | GET | Older 2024-11-05 clients |
| Legacy Messages | /mcp/messages | POST | POST endpoint for legacy SSE |
| Server Info | /mcp/info | GET | Capabilities discovery (no auth) |

Session Management

For Streamable HTTP, sessions use the Mcp-Session-Id header:

  1. First initialize request returns a session ID
  2. Include Mcp-Session-Id in subsequent requests
  3. DELETE /mcp with session ID to terminate
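
The three steps above can be sketched as a small helper. It assumes the server answers each POST with a plain JSON body (a Streamable HTTP server may also answer with an event stream, which this sketch does not handle) and it omits the `notifications/initialized` message a complete client sends after initializing. `createSession` and its method names are hypothetical.

```javascript
// Minimal Streamable HTTP session helper. fetchImpl is injectable for testing.
function createSession(endpoint, authHeaders, fetchImpl = globalThis.fetch) {
  let sessionId = null;
  let nextId = 1;

  async function rpc(method, params) {
    const headers = {
      "Content-Type": "application/json",
      Accept: "application/json, text/event-stream",
      ...authHeaders,
    };
    // Step 2: include the session ID on every request after initialize.
    if (sessionId) headers["Mcp-Session-Id"] = sessionId;
    const res = await fetchImpl(endpoint, {
      method: "POST",
      headers,
      body: JSON.stringify({ jsonrpc: "2.0", id: nextId++, method, params }),
    });
    // Step 1: the initialize response carries the session ID.
    const returned = res.headers.get("Mcp-Session-Id");
    if (returned) sessionId = returned;
    return res.json(); // assumes a JSON (not SSE) response body
  }

  return {
    initialize: () =>
      rpc("initialize", {
        protocolVersion: "2025-03-26",
        capabilities: {},
        clientInfo: { name: "my-agent", version: "1.0.0" },
      }),
    listTools: () => rpc("tools/list", {}),
    // Step 3: DELETE with the session ID terminates the session.
    close: async () => {
      if (!sessionId) return;
      await fetchImpl(endpoint, {
        method: "DELETE",
        headers: { ...authHeaders, "Mcp-Session-Id": sessionId },
      });
      sessionId = null;
    },
    getSessionId: () => sessionId,
  };
}
```

In practice the official SDK transports manage this header for you; the sketch only makes the header flow visible.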

Client Configuration

Claude Code / Claude Desktop (Streamable HTTP)

{
  "mcpServers": {
    "stackbilt": {
      "url": "https://stackbilt.dev/mcp",
      "transport": { "type": "streamable-http" },
      "headers": {
        "Authorization": "Bearer <YOUR_MCP_TOKEN>"
      }
    }
  }
}

Legacy SSE Fallback

{
  "mcpServers": {
    "stackbilt": {
      "type": "sse",
      "url": "https://stackbilt.dev/mcp",
      "headers": {
        "Authorization": "Bearer <YOUR_MCP_TOKEN>"
      }
    }
  }
}

Custom MCP Client (Node.js)

import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { SSEClientTransport } from "@modelcontextprotocol/sdk/client/sse.js";

const transport = new SSEClientTransport(
  new URL("https://stackbilt.dev/mcp"),
  {
    requestInit: {
      headers: { "Authorization": "Bearer <YOUR_MCP_TOKEN>" }
    }
  }
);

const client = new Client({ name: "my-agent", version: "1.0.0" });
await client.connect(transport);

const tools = await client.listTools();
// callTool takes a single params object: the tool name plus its arguments.
const result = await client.callTool({
  name: "runFullFlowAsync",
  arguments: {
    input: "Build a task management app with team collaboration"
  }
});

Native Tools (22)

Flow Execution

| Tool | Description |
| --- | --- |
| runFullFlow | Execute full 6-mode pipeline (PRODUCT → SPRINT). May time out on long flows. |
| runFullFlowAsync | Fire-and-forget full flow. Returns immediately with flowId. Recommended. |
| startFullFlow | Create flow without executing. Advance modes manually with advanceFlowAsync. |
| advanceFlowAsync | Run next pending mode asynchronously. Returns immediately. |
| runMode | Execute a single mode (e.g., just ARCHITECT). |
| cancelFlow | Cancel a running flow. Sets status to FAILED. |

Flow Monitoring

| Tool | Description |
| --- | --- |
| getFlowSummary | Lightweight progress check (<2KB). Token usage, mode statuses, quality. Use for polling. |
| getFlowStatus | Full flow state (100KB+). Use only when you need everything. |
| getFlowLogs | Execution timeline with timestamps per mode. |
| getRunQuality | Concise quality checks: grounding, state safety, artifact completeness. |

Artifact Retrieval

| Tool | Description |
| --- | --- |
| getArtifact | Structured JSON for one mode (2-5KB). Preferred over getFlowPackage. |
| getFlowPackage | Complete artifact package (100-300KB). All modes, JSON or Markdown. |
| exportModeArtifact | Export metadata and artifact ref handle for one mode. |
| getArtifactContent | Chunked prose content with cursor pagination. |
| getFlowCodegenDraft | project.json draft for codegen engine. |
| getFlowScaffold | Deployable Workers project (ZIP or JSON). Includes scaffoldHints + nextSteps. |

Recovery & Governance

| Tool | Description |
| --- | --- |
| resumeFlow | Resume a failed flow from where it stopped. |
| recoverFlow | Rerun failed mode + recompute downstream. |
| amendArtifact | Fix specific sections by ID (70-80% fewer tokens than recoverFlow). |
| invalidateCache | Clear cached mode artifacts so next run regenerates. |
| getGovernanceStatus | Governance validation results, blessed patterns, persisted ADR IDs. |
| submitFeedback | Submit bug reports, feature requests, and flow quality feedback. Params: message (string, required), type (enum: bug/feature/general/flow-quality), rating (number 1-5, optional), flowId (string, optional), mode (string, optional). |
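
Since submitFeedback has the most detailed parameter list, a client-side check before calling it can save a round trip. The sketch below validates against the documented parameters only. The helper name is hypothetical, and treating `type` as optional is an assumption, because the table does not say whether it is required.

```javascript
// Allowed values for the documented `type` enum.
const FEEDBACK_TYPES = ["bug", "feature", "general", "flow-quality"];

// Validate and assemble arguments for the submitFeedback tool.
function buildFeedbackArgs({ message, type, rating, flowId, mode }) {
  if (typeof message !== "string" || message.length === 0) {
    throw new Error("message is required and must be a non-empty string");
  }
  // Assumption: type may be omitted; if present it must be one of the enum values.
  if (type !== undefined && !FEEDBACK_TYPES.includes(type)) {
    throw new Error(`type must be one of: ${FEEDBACK_TYPES.join(", ")}`);
  }
  if (rating !== undefined && !(typeof rating === "number" && rating >= 1 && rating <= 5)) {
    throw new Error("rating must be a number from 1 to 5");
  }
  // Drop optional fields that were not supplied.
  const candidate = { message, type, rating, flowId, mode };
  return Object.fromEntries(
    Object.entries(candidate).filter(([, value]) => value !== undefined)
  );
}
```

The result can be passed as the `arguments` of a `submitFeedback` tool call.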

Compass Governance Tools (up to 47)

The StackBilt MCP server proxies Compass governance tools through the same endpoint. When an agent calls tools/list, the response merges the 22 native tools above with whichever Compass tools the user’s plan unlocks. Tool calls to Compass names are forwarded via a service binding — no separate Compass connection required.

Tier Gating

| Tier | Compass Tools | Access Level |
| --- | --- | --- |
| Free | 24 tools | Read-only governance: browse ledger entries, patterns, protocols, projects, experiments, and temporal analyses. Submit requests and get advisory briefings. |
| Pro / Enterprise | All 47 tools (24 free + 23 pro-only) | Full governance: create and mutate ledger entries, patterns, protocols, and projects. Run LLM-powered governance, strategy, red-teaming, architecture validation, temporal analysis, change classification, experiment lifecycle, artifact quality evaluation, and codebase compliance scanning. |

Free-Tier Tools (24)

| Category | Tools |
| --- | --- |
| Context | set_context, get_context, get_history, clear_session |
| Ledger (read-only) | list_ledger_entries, get_ledger_entry, get_ledger_audit_stats |
| Patterns (read-only) | list_patterns, get_patterns_for_architecture |
| Requests | list_requests, submit_request, detect_resolved |
| Protocols (read-only) | list_protocols |
| Projects (read-only) | list_projects |
| Decision Learning (read-only) | find_precedents, get_decision_review |
| Temporal (read-only) | list_temporal_analyses, get_temporal_analysis |
| Change Control (read-only) | list_change_classifications |
| Experiments (read-only) | list_experiments, get_experiment |
| Notary (read-only) | get_project_snapshot |
| Flow Context (read-only) | get_flow_context |
| Advisory | brief |

Pro-Only Tools (23)

| Category | Tools |
| --- | --- |
| Advisory (LLM-powered) | governance, strategy, drafter, red_team |
| Ledger (mutations) | create_ledger_entry, update_ledger_entry |
| Patterns (mutations) | create_pattern |
| Requests (mutations) | resolve_request |
| Protocols (mutations) | create_protocol |
| Projects (mutations) | create_project |
| Architect Integration | validate_architecture, persist_architecture_adr, set_integration_mode |
| Temporal Analysis (LLM) | temporal_analysis |
| Change Control (LLM) | classify_change |
| Experiments (mutations) | propose_experiment, update_experiment, review_experiment |
| Decision Learning (mutations) | track_outcome |
| Quality (LLM) | evaluate_artifact_quality |
| Artifact Assessment (LLM) | assess_artifact |
| Batch Persistence | batch_persist_records |
| Compliance | scan_codebase_compliance |

How Proxying Works

The proxy uses the Compass service binding (CSA_MCP_URL) to forward tool calls. Authentication is resolved automatically:

Access-key users: the server exchanges the ska_ key for a Compass JWT via the Token Broker, caches it, and attaches it to proxied requests.

Admin/static-token users: the server uses the CSA_MCP_TOKEN environment variable directly (exchanging for a JWT if needed).

JWT users: the existing Compass JWT is used as-is.

Tool calls that target a name not in the 22 native tools are routed to Compass if the user’s tier permits. If the tool is not allowed for the tier, the server returns a standard -32602 (invalid params) error.
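
The routing rule in the last paragraph can be illustrated with a small decision function. This is a sketch of the described behavior, not the server's actual code: the tool sets hold only a few names from the tables above, and the tier strings and return shape are hypothetical.

```javascript
// Small subsets of the documented tool lists, for illustration only.
const NATIVE_TOOLS = new Set(["runFullFlowAsync", "getFlowSummary", "getArtifact"]);
const FREE_COMPASS_TOOLS = new Set(["list_patterns", "get_context", "brief"]);
const PRO_COMPASS_TOOLS = new Set(["create_pattern", "red_team", "validate_architecture"]);

// Decide where a tool call goes. tier is "free", "pro", or "enterprise".
function routeToolCall(name, tier) {
  if (NATIVE_TOOLS.has(name)) {
    return { target: "native" };
  }
  const allowedOnTier =
    FREE_COMPASS_TOOLS.has(name) ||
    (tier !== "free" && PRO_COMPASS_TOOLS.has(name));
  if (allowedOnTier) {
    return { target: "compass" };
  }
  // Unknown tools and tools outside the tier both surface as invalid params.
  return { error: { code: -32602, message: `Tool not available: ${name}` } };
}
```

A client therefore cannot tell from the error code alone whether a tool is misspelled or simply locked behind a higher tier; checking the tools/list response first disambiguates the two.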

The lightweight tools enable a minimal-overhead pattern:

runFullFlowAsync(input)          →  ~200 bytes

getFlowSummary(flowId) × N      →  <2KB per poll (every 10s)

getArtifact(flowId, "PRODUCT")   →  2-5KB
getArtifact(flowId, "ARCHITECT") →  2-5KB
getArtifact(flowId, "SPRINT")    →  2-5KB

getFlowScaffold(flowId, "json")  →  file manifest + nextSteps

Write files to disk, deploy

Total: ~10 tool calls, ~40KB downloaded. Previous workflow: 18+ calls, 300KB+.
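
The pattern above can be sketched as one function. `callTool` is a hypothetical wrapper `(name, args) => Promise<object>` around your MCP client that returns the parsed tool result. The `status` field with a `COMPLETED` value and the `mode` argument name for getArtifact are assumptions; check the actual getFlowSummary response before relying on them.

```javascript
// Start a flow, poll the lightweight summary, then fetch per-mode artifacts.
async function runAndCollect(callTool, input, options = {}) {
  const {
    pollMs = 10000, // poll every 10 s, as in the pattern above
    maxPolls = 60, // give up after about 10 minutes
    modes = ["PRODUCT", "ARCHITECT", "SPRINT"],
  } = options;
  const sleep =
    options.sleep ?? ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));

  const { flowId } = await callTool("runFullFlowAsync", { input });

  let status;
  for (let poll = 0; poll < maxPolls; poll++) {
    // Assumption: the summary exposes a top-level flow status.
    ({ status } = await callTool("getFlowSummary", { flowId }));
    if (status === "COMPLETED" || status === "FAILED") break;
    await sleep(pollMs);
  }
  if (status !== "COMPLETED") {
    throw new Error(`Flow ${flowId} did not complete (last status: ${status})`);
  }

  const artifacts = {};
  for (const mode of modes) {
    // Assumption: getArtifact accepts the mode under the key "mode".
    artifacts[mode] = await callTool("getArtifact", { flowId, mode });
  }
  return { flowId, artifacts };
}
```

Bounding the loop with `maxPolls` keeps a stuck flow from polling forever; on failure, resumeFlow or recoverFlow is the documented next step.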

When to Use Each Tool

| Need | Use | Avoid |
| --- | --- | --- |
| Start a flow | runFullFlowAsync | runFullFlow (may time out) |
| Poll progress | getFlowSummary (<2KB) | getFlowStatus (100KB+) |
| Read one mode | getArtifact (2-5KB) | getFlowPackage (100-300KB) |
| Fix one section | amendArtifact | recoverFlow (reruns entire mode) |
| Get deployable code | getFlowScaffold | Manual file generation |

Step-by-Step: Advance Modes Manually

For fine-grained control, advance one mode at a time:

const { flowId } = await stackbilt.startFullFlow({ input: idea });

for (let i = 0; i < 6; i++) {
  await stackbilt.advanceFlowAsync({ flowId });

  // Poll until mode completes
  while (true) {
    const summary = await stackbilt.getFlowSummary({ flowId });
    const running = summary.modeStatuses?.some((m) => m.status === 'RUNNING');
    if (!running) break;
    await new Promise((r) => setTimeout(r, 1500));
  }
}

Governance Integration

Pass a governance config to validate architecture against blessed patterns:

governance: {
  mode: 'ENFORCED',          // PASSIVE | ADVISORY | ENFORCED
  projectId: 'my-proj',      // scope to project patterns
  autoPersist: true,          // record ADRs in governance ledger
  persistTags: ['api', 'v2'],
  qualityThreshold: 80,       // 0-100
  transport: 'auto',          // external_http | service_binding | auto
  transportCanaryPercent: 5   // canary rollout percentage
}

| Mode | Behavior |
| --- | --- |
| PASSIVE | Log only; never blocks |
| ADVISORY | Warn on issues, flow continues |
| ENFORCED | Block on FAIL, require remediation |

Plan-tier caps: free plans are capped at PASSIVE, pro at ADVISORY, enterprise gets full ENFORCED.
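
To make the caps concrete, the sketch below computes the mode a plan would actually run at. It assumes a request above the cap is downgraded to the cap rather than rejected; the source only says plans are "capped", so confirm the real behavior. The plan strings and function name are hypothetical.

```javascript
// Modes ordered from least to most strict.
const MODE_ORDER = ["PASSIVE", "ADVISORY", "ENFORCED"];
// Documented per-plan caps.
const PLAN_CAP = { free: "PASSIVE", pro: "ADVISORY", enterprise: "ENFORCED" };

// Return the requested mode, lowered to the plan's cap when necessary.
function effectiveGovernanceMode(requested, plan) {
  const cap = PLAN_CAP[plan];
  if (cap === undefined) throw new Error(`Unknown plan: ${plan}`);
  if (!MODE_ORDER.includes(requested)) throw new Error(`Unknown mode: ${requested}`);
  const index = Math.min(MODE_ORDER.indexOf(requested), MODE_ORDER.indexOf(cap));
  return MODE_ORDER[index];
}
```

Under this assumption, a pro plan requesting ENFORCED runs at ADVISORY, while a request below the cap is left unchanged.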

Advanced Governance Sub-configs

Three optional sub-configs extend the base governance object:

domainLock — Locks domain entities after PRODUCT mode completes, preventing drift in downstream modes.

governance: {
  // ...base config...
  domainLock: {
    enabled: true,                        // Enable/disable domain locking
    strictness: 'strict',                 // 'strict' | 'advisory' | 'off'
    noNewEntities: true,                  // Prevent creation of new domain entities
    allowVendors: ['stripe', 'sendgrid'], // Vendor allowlist
    forbidVendors: ['twilio'],            // Vendor blocklist
    requireTerms: ['Order', 'Customer'],  // Domain terms that must appear
    forbidTerms: ['User', 'Account'],     // Domain terms that must not appear
  }
}
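
As an illustration of what requireTerms and forbidTerms express, the sketch below checks a piece of artifact text against both lists. How StackBilt actually matches terms is not documented here; whole-word, case-insensitive matching is an assumption, and the function names are hypothetical.

```javascript
// Escape characters that have special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Whole-word, case-insensitive match (an assumed matching rule).
function containsTerm(text, term) {
  return new RegExp(`\\b${escapeRegExp(term)}\\b`, "i").test(text);
}

// Report required terms that are absent and forbidden terms that are present.
function checkDomainTerms(text, domainLock) {
  const missing = (domainLock.requireTerms ?? []).filter(
    (term) => !containsTerm(text, term)
  );
  const forbidden = (domainLock.forbidTerms ?? []).filter((term) =>
    containsTerm(text, term)
  );
  return { ok: missing.length === 0 && forbidden.length === 0, missing, forbidden };
}
```

Running a check like this locally on draft input can flag vocabulary drift before a flow is started.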

qualityByMode — Per-mode quality thresholds. Overrides the top-level qualityThreshold for specific execution modes, letting you enforce tighter standards on critical modes.

governance: {
  // ...base config...
  qualityByMode: {
    ARCHITECT: 90,
    TDD: 85,
    CODE: 80,
  }
}

qualityWeighting — Hybrid local/CSA weighting for quality evaluation. Controls the balance between local static analysis and Compass governance scoring when computing the final quality score.

governance: {
  // ...base config...
  qualityWeighting: {
    local: 0.4,   // Weight given to local analysis (0.0–1.0)
    csa: 0.6,     // Weight given to Compass governance scoring (0.0–1.0)
  }
}

All three sub-configs are independent and can be combined freely within a single governance object.
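
A short worked sketch shows how qualityByMode and qualityWeighting might interact with the base threshold. The fallback rule follows the description of qualityByMode above. The weighted average is an assumption about how the two scores are blended, since the exact formula is not documented here, and both function names are hypothetical.

```javascript
// Per-mode threshold if configured, otherwise the top-level qualityThreshold.
function thresholdFor(mode, governance) {
  return governance.qualityByMode?.[mode] ?? governance.qualityThreshold;
}

// Assumed blend: weighted average of the local and Compass (CSA) scores.
// Dividing by the weight sum keeps the result on the 0-100 scale even if
// the weights do not add up to exactly 1.
function combinedScore(localScore, csaScore, weighting) {
  const { local, csa } = weighting;
  return (localScore * local + csaScore * csa) / (local + csa);
}

const governance = {
  qualityThreshold: 80,
  qualityByMode: { ARCHITECT: 90, TDD: 85, CODE: 80 },
  qualityWeighting: { local: 0.4, csa: 0.6 },
};

// A local score of 70 and a Compass score of 90 blend to roughly 82,
// which clears the default threshold (80) but not the ARCHITECT one (90).
const blended = combinedScore(70, 90, governance.qualityWeighting);
const passesProduct = blended >= thresholdFor("PRODUCT", governance);
const passesArchitect = blended >= thresholdFor("ARCHITECT", governance);
```

The example shows why a per-mode override matters: the same blended score passes one mode and fails another.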

Error Handling

All errors follow JSON-RPC 2.0 format:

| Code | Meaning |
| --- | --- |
| -32700 | Parse error (invalid JSON) |
| -32600 | Invalid request |
| -32601 | Method not found |
| -32602 | Invalid params (unknown tool, or tool not permitted for the plan tier) |
| -32000 | Tool execution failed |
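
A client can turn these codes into a simple handling decision. The sketch below maps a JSON-RPC 2.0 response to a result or a described error. Treating only -32000 as retryable is an assumption made for illustration, and the function name and return shape are hypothetical.

```javascript
// Meanings taken from the error code table.
const RPC_ERROR_MEANINGS = {
  [-32700]: "Parse error (invalid JSON)",
  [-32600]: "Invalid request",
  [-32601]: "Method not found",
  [-32602]: "Invalid params",
  [-32000]: "Tool execution failed",
};

// Interpret a parsed JSON-RPC 2.0 response object.
function interpretRpcResponse(response) {
  if (response.error === undefined) {
    return { ok: true, result: response.result };
  }
  const { code, message } = response.error;
  return {
    ok: false,
    code,
    meaning: RPC_ERROR_MEANINGS[code] ?? "Unknown error code",
    message,
    // Assumption: only execution failures are worth retrying; the other
    // codes describe a request that would fail again if sent unchanged.
    retryable: code === -32000,
  };
}
```

For a failed flow, the documented recovery tools (resumeFlow, recoverFlow) are usually a better next step than blindly repeating the same call.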

Best Practices

  1. Use runFullFlowAsync to avoid client-side timeouts
  2. Poll with getFlowSummary every 5-10 seconds (read-only, no write contention)
  3. Retrieve per-mode with getArtifact instead of downloading the full package
  4. Check usage fields in getFlowSummary for real-time token cost visibility
  5. Reuse Mcp-Session-Id across requests to maintain context
  6. Cache completed flows — package data doesn’t change after completion
  7. Full flows typically complete in 2-5 minutes depending on input complexity