MCP — Model Context Protocol

The Problem

Before MCP, every connection was custom-built

Each AI needed its own integration for each tool. 3 AIs × 4 tools = 12 bespoke connectors to build, test, and maintain. MCP collapses this to a single shared protocol.

Before MCP

N × M = 12 custom integrations

VS

With MCP

N + M = 7 MCP implementations (3 clients + 4 servers)
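The arithmetic behind this comparison can be checked with a short sketch. The helper name `integrationCounts` is made up for illustration: it counts one connector per AI-tool pair for the bespoke case, and one MCP implementation per participant for the shared-protocol case.

```typescript
// Hypothetical helper: compares connector counts with and without a shared protocol.
function integrationCounts(aiApps: number, tools: number) {
  return {
    bespoke: aiApps * tools, // one custom connector per (AI, tool) pair
    withMcp: aiApps + tools, // one MCP client per AI plus one MCP server per tool
  };
}

const counts = integrationCounts(3, 4);
console.log(counts); // { bespoke: 12, withMcp: 7 }
```

The gap widens quickly: at 10 AIs and 20 tools the same function gives 200 bespoke connectors against 30 MCP implementations.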

Architecture

Three roles, one clear chain

MCP defines exactly who does what. The Host orchestrates, the Client connects, the Server acts.

Host

The AI app you use

Claude Desktop, Cursor, etc. It controls which servers run and what permissions they get. The trust boundary starts here.

MCP Client

The bridge (inside the Host)

Manages the 1-to-1 connection with each server. Handles JSON-RPC messaging, capability negotiation, and transport. Usually invisible to you.

MCP Server

What you build

A lightweight process exposing tools, resources, and prompts. Can be a local subprocess or a remote HTTP service. You own this part.

The Protocol

JSON-RPC 2.0: three message types, nothing more

Every single MCP message is one of three shapes. Understand these three and you understand the entire wire protocol.

Type 01

Request

Ask for something. Expects a response back. Always has an id.

Type 02

Response

The answer to a request. Carries the same id to match them up.

Type 03

Notification

One-way message. No id, no response expected. Fire and forget.

Type 03b

Error Response

A response that signals failure. Has a standardized code + message.
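The shapes above can be told apart by looking at only three things: whether a `method` is present, whether an `id` is present, and whether the message carries `result` or `error`. The following is a minimal sketch of that decision. The names `WireMessage`, `MessageKind`, and `classify` are illustrative and are not SDK exports.

```typescript
// Illustrative types; a real implementation would also validate params and payload shapes.
type WireMessage = {
  jsonrpc: "2.0";
  id?: number | string;
  method?: string;
  params?: unknown;
  result?: unknown;
  error?: { code: number; message: string; data?: unknown };
};

type MessageKind = "request" | "response" | "error-response" | "notification" | "invalid";

function classify(msg: WireMessage): MessageKind {
  const hasId = msg.id !== undefined;
  if (msg.method !== undefined) {
    // A method with an id expects an answer; without an id it is fire-and-forget.
    return hasId ? "request" : "notification";
  }
  if (!hasId) return "invalid"; // a response has to echo a request id
  const hasResult = "result" in msg;
  const hasError = "error" in msg;
  if (hasResult === hasError) return "invalid"; // exactly one of result / error is required
  return hasError ? "error-response" : "response";
}
```

Because the check never inspects the method name, the same function works for every MCP method.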

request.json

{
  "jsonrpc": "2.0",        // always "2.0" — mandatory
  "id": 42,               // your ticket number
  "method": "tools/call", // what you want done
  "params": {
    "name": "get_weather",
    "arguments": {
      "city": "Paris"
    }
  }
}

How the ID system works

Think of the id like a restaurant order ticket. Multiple requests can fly in parallel — when the server finishes one, it staples its answer to the same ticket number so you know which order just arrived.

→ Client id:42 tools/call city="Paris"

→ Client id:43 tools/call city="London" (in parallel!)

← Server id:43 result: "15°C" (faster)

← Server id:42 result: "22°C" (a bit later)
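The ticket analogy maps onto a small lookup table on the client side. The sketch below assumes a hypothetical `PendingRequests` class (the official SDK does this bookkeeping internally, and its internals may look different): each outgoing request registers a resolver under its id, and each incoming response is routed by id, whatever order it arrives in.

```typescript
type RpcResponse = {
  jsonrpc: "2.0";
  id: number;
  result?: unknown;
  error?: { code: number; message: string };
};

// Hypothetical helper, not an SDK API.
class PendingRequests {
  private nextId = 1;
  private waiting = new Map<number, (response: RpcResponse) => void>();

  // Allocates a fresh id and a promise that settles when the matching response arrives.
  track(): { id: number; response: Promise<RpcResponse> } {
    const id = this.nextId++;
    const response = new Promise<RpcResponse>((resolve) => {
      this.waiting.set(id, resolve);
    });
    return { id, response };
  }

  // Routes an incoming response by id; returns false when nobody is waiting on that id.
  settle(response: RpcResponse): boolean {
    const resolve = this.waiting.get(response.id);
    if (!resolve) return false;
    this.waiting.delete(response.id);
    resolve(response);
    return true;
  }
}

const pending = new PendingRequests();
const paris = pending.track();
const london = pending.track();

// London's answer arrives first; Paris still resolves correctly afterwards.
const londonRouted = pending.settle({ jsonrpc: "2.0", id: london.id, result: "15°C" });
const parisRouted = pending.settle({ jsonrpc: "2.0", id: paris.id, result: "22°C" });
const strayRouted = pending.settle({ jsonrpc: "2.0", id: 99, result: null });
```

A production client would also add timeouts and reject the promise on an `error` payload; both are left out here to keep the correlation logic visible.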

response.json

{
  "jsonrpc": "2.0",
  "id": 42,        // SAME id as the request
  "result": {
    "content": [{
      "type": "text",
      "text": "Paris: 22°C, sunny skies"
    }]
  }
  // Important: result OR error — never both
}

Rules for a valid Response

Must echo the exact same id from the corresponding request
Must contain result on success, or error on failure — never both
The result shape depends on the method called (tools/call returns content[], tools/list returns tools[], etc.)
Can be sent in any order — responses don't need to arrive in the same order as requests
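The first two rules are mechanical enough to check in code. This is a sketch only: `responseProblems` and its types are hypothetical names, and the check deliberately ignores the method-specific shape of `result`.

```typescript
type RpcRequest = { jsonrpc: "2.0"; id: number | string; method: string; params?: unknown };
type RpcReply = {
  jsonrpc: "2.0";
  id: number | string;
  result?: unknown;
  error?: { code: number; message: string };
};

// Returns a list of rule violations; an empty list means the reply is structurally valid.
function responseProblems(request: RpcRequest, reply: RpcReply): string[] {
  const problems: string[] = [];
  if (reply.id !== request.id) {
    problems.push(`id mismatch: expected ${request.id}, got ${reply.id}`);
  }
  const hasResult = reply.result !== undefined;
  const hasError = reply.error !== undefined;
  if (hasResult && hasError) problems.push("both result and error are present");
  if (!hasResult && !hasError) problems.push("neither result nor error is present");
  return problems;
}

const sampleRequest: RpcRequest = { jsonrpc: "2.0", id: 42, method: "tools/call" };
const sampleReply: RpcReply = { jsonrpc: "2.0", id: 42, result: { content: [] } };
console.log(responseProblems(sampleRequest, sampleReply)); // []
```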

notification.json

{
  "jsonrpc": "2.0",
  // No "id" field — that's what makes it a notification
  "method": "notifications/progress",
  "params": {
    "progressToken": "task-1",
    "progress": 65,
    "total": 100,
    "message": "Scanning files..."
  }
}

Common notifications

notifications/progress — server updates client on a long-running task
notifications/initialized — client signals it's ready after handshake
notifications/tools/list_changed — server's tools just changed, please re-list
notifications/cancelled — cancel an in-progress request
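Because notifications never get a reply, handling them reduces to a lookup by method name. The sketch below uses a hypothetical `NotificationRouter` class to show the idea; the SDK has its own handler registration, which is not reproduced here.

```typescript
type RpcNotification = { jsonrpc: "2.0"; method: string; params?: unknown };
type NotificationHandler = (params: unknown) => void;

// Hypothetical helper: maps notification methods to handlers.
class NotificationRouter {
  private handlers = new Map<string, NotificationHandler>();

  on(method: string, handler: NotificationHandler): void {
    this.handlers.set(method, handler);
  }

  // Returns true if a handler ran. Nothing is sent back either way.
  dispatch(notification: RpcNotification): boolean {
    const handler = this.handlers.get(notification.method);
    if (!handler) return false;
    handler(notification.params);
    return true;
  }
}

const router = new NotificationRouter();
const refreshReasons: string[] = [];
router.on("notifications/tools/list_changed", () => {
  refreshReasons.push("tools changed, re-run tools/list");
});

const handled = router.dispatch({ jsonrpc: "2.0", method: "notifications/tools/list_changed" });
const ignored = router.dispatch({ jsonrpc: "2.0", method: "notifications/unknown" });
```

Unknown notifications are dropped silently in this sketch, since there is no id to attach an error to.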

error-response.json

{
  "jsonrpc": "2.0",
  "id": 42,
  "error": {
    "code": -32601,  // standardized error code
    "message": "Method not found",
    "data": { "method": "tools/unknown" }
  }
}

Standard error codes

-32700 Parse error — invalid JSON received

-32600 Invalid Request object

-32601 Method not found

-32602 Invalid params

-32603 Internal error

-32000 to -32099 Implementation-defined server errors (your own)
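A server usually keeps these codes in one table and builds error responses from it. The following sketch assumes hypothetical helper names (`JSON_RPC_ERRORS`, `standardError`, `isImplementationDefined`); the codes and messages themselves come from the JSON-RPC 2.0 specification.

```typescript
const JSON_RPC_ERRORS = {
  parseError: { code: -32700, message: "Parse error" },
  invalidRequest: { code: -32600, message: "Invalid Request" },
  methodNotFound: { code: -32601, message: "Method not found" },
  invalidParams: { code: -32602, message: "Invalid params" },
  internalError: { code: -32603, message: "Internal error" },
} as const;

type StandardErrorKind = keyof typeof JSON_RPC_ERRORS;

// Builds an error response that echoes the id of the failed request.
function standardError(id: number | string, kind: StandardErrorKind, data?: unknown) {
  const { code, message } = JSON_RPC_ERRORS[kind];
  const error = data === undefined ? { code, message } : { code, message, data };
  return { jsonrpc: "2.0" as const, id, error };
}

// JSON-RPC reserves -32000 through -32099 for implementation-defined server errors.
function isImplementationDefined(code: number): boolean {
  return code >= -32099 && code <= -32000;
}

const notFound = standardError(42, "methodNotFound", { method: "tools/unknown" });
console.log(JSON.stringify(notFound));
```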

Session Lifecycle

Every session, four phases, every time

MCP sessions are deterministic. They always start and end the same way. Each phase below lists the exact messages exchanged.

1

Initialize — Handshake

Client introduces itself. Server lists its capabilities.

The very first exchange. Neither side can do any real work until this completes successfully. The client sends its supported protocol version; the server responds with what it can do.

Client → Server

{
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2025-06-18",
    "clientInfo": { "name": "Claude Desktop", "version": "1.0" },
    "capabilities": { "sampling": {} }
  }
}

Server → Client

{
  "id": 1,
  "result": {
    "protocolVersion": "2025-06-18",
    "serverInfo": { "name": "weather-mcp", "version": "1.0.0" },
    "capabilities": { "tools": {}, "resources": {} }
  }
}

1b

Initialized — Ready signal

Client confirms. No response expected. Work can now begin.

A simple notification (no id) that tells the server "I received your capabilities, I'm ready." This is mandatory before any other request is sent.

{ "method": "notifications/initialized" } // No id — no response needed

2

Discovery — What can you do?

Claude asks for the list of tools. Server responds with full schemas.

Claude asks for the server's tools (and optionally resources and prompts). The server responds with JSON Schemas — structured descriptions that tell Claude exactly what arguments each tool accepts. This is how Claude knows what to call and how to call it.

Request

{ "id": 2, "method": "tools/list" }

Response: tool schema

{
  "id": 2,
  "result": {
    "tools": [{
      "name": "get_weather",
      "description": "Get current weather",
      "inputSchema": {
        "type": "object",
        "properties": { "city": { "type": "string" } },
        "required": ["city"]
      }
    }]
  }
}

3

Execution — Do the work

Claude calls your tools. Your server executes the real logic.

Based on the user's message and the tool schemas it discovered, Claude decides to call a tool. It sends a tools/call request with structured arguments. Your server runs the real logic — hits an API, queries a database, whatever — and returns the result.

Call

{ "id": 7, "method": "tools/call", "params": { "name": "get_weather", "arguments": { "city": "Paris" } } }

Result (id:7, same as the call)

{ "id": 7, "result": { "content": [{ "type": "text", "text": "Paris: 22°C, sunny" }] } }
// Claude reads this and incorporates it into its response to the user

4

Terminate — Clean exit

Connection closes when the session ends. No special message needed.

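For stdio: the Host shuts the child process down when Claude Desktop closes or the conversation ends, typically by closing the server's stdin and terminating the process if it does not exit on its own. For HTTP: the client closes its connections; it can also send an HTTP DELETE carrying the Mcp-Session-Id header to end the session explicitly, or the session simply expires. MCP defines no shutdown message: the transport closing is the signal.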
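The ordering of these phases can be enforced with a very small state machine. This is a sketch under simplifying assumptions: `SessionGate` is a hypothetical class, it only looks at method names, and it ignores details such as `ping`, which the protocol permits during the handshake.

```typescript
type SessionPhase = "awaiting-initialize" | "awaiting-initialized" | "operating" | "closed";

// Hypothetical guard a server could consult before handling an incoming message.
class SessionGate {
  private phase: SessionPhase = "awaiting-initialize";

  get current(): SessionPhase {
    return this.phase;
  }

  // Returns true if the method is acceptable in the current phase, advancing the phase if needed.
  accept(method: string): boolean {
    if (this.phase === "awaiting-initialize") {
      if (method !== "initialize") return false;
      this.phase = "awaiting-initialized";
      return true;
    }
    if (this.phase === "awaiting-initialized") {
      if (method !== "notifications/initialized") return false;
      this.phase = "operating";
      return true;
    }
    if (this.phase === "operating") {
      return method !== "initialize"; // a second handshake is rejected in this sketch
    }
    return false; // closed: nothing is accepted
  }

  // Called when the transport closes (process exit, dropped connection, expired session).
  close(): void {
    this.phase = "closed";
  }
}

const gate = new SessionGate();
const sequence = ["initialize", "notifications/initialized", "tools/list", "tools/call"];
console.log(sequence.map((method) => gate.accept(method))); // [ true, true, true, true ]
```

Rejected messages would normally be answered with an error response when they carry an id, and dropped when they do not.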

The 5 Primitives

Everything MCP can express

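MCP gives you five building blocks: three that a server exposes (tools, resources, prompts) and two that the client offers back to the server (sampling, roots). Each has a specific role and a specific controller: the AI, the app, or the user.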

Primitive 01

Tools — Functions the AI can call

Model-controlled

A Tool is a function the LLM decides to call based on conversation context. The AI reads your tool's description and JSON Schema, decides when it's relevant, and calls it autonomously. Think of it as giving Claude new verbs — search, send, query, create.

› Any action with side effects: writing, sending, creating, deleting
› Fetching real-time data: search results, live prices, current status
› Running computations that need external services
› Anything that changes state in the real world

Methods: tools/list, tools/call

tool-example.ts

// Register a tool — Claude reads this schema
// and knows exactly when and how to call it
server.registerTool('send_email', {
  description: 'Send an email to a recipient',
  inputSchema: z.object({
    to: z.string().email()
          .describe('Recipient email'),
    subject: z.string(),
    body: z.string()
  })
}, async ({ to, subject, body }) => {
  // your real logic here
  await sendEmail(to, subject, body);
  return {
    content: [{ type: 'text',
      text: `Email sent to ${to}` }]
  };
});

Primitive 02

Resources — Data for context

Application-controlled

Resources are read-only data sources exposed by the server. Unlike tools, the Host application (not the AI) decides when to include them in the conversation context. They're identified by URIs — like files, database records, or documents.

› File contents, configs, documentation
› Database records that change rarely
› Reference data: schemas, specs, knowledge bases
› Anything that should be in context but doesn't need an action

Methods: resources/list, resources/read, resources/subscribe

resource-exchange.json

// Client asks: what resources do you have?
{ "method": "resources/list" }

// Server responds with URI-identified items
{
  "resources": [
    {
      "uri": "file:///project/README.md",
      "name": "README",
      "mimeType": "text/markdown"
    },
    {
      "uri": "postgres://db/users/schema",
      "name": "Users Table Schema"
    }
  ]
}

Primitive 03

Prompts — Reusable templates

User-controlled

Prompts are pre-built conversation templates that the user picks (slash commands in Claude Desktop). They let you package domain expertise — a "code review" prompt, a "database query" prompt — that can embed resources and context automatically.

› Domain-specific conversation starters with embedded context
› Workflows that need specific instructions every time
› Slash commands in Claude Desktop
› Templates that combine multiple resources + instructions

Methods: prompts/list, prompts/get

prompt-example.json

// Client fetches a prompt template with arguments
{ "method": "prompts/get",
  "params": {
    "name": "code_review",
    "arguments": {
      "language": "TypeScript",
      "focus": "security"
    }
  }
}

// Server returns ready-to-use messages
{ "messages": [{
    "role": "user",
    "content": { "type": "text",
      "text": "Review this TS code for security..."
    }
}]}

Primitive 04

Sampling — Server asks the AI

Human-in-the-loop

The unusual one. Sampling lets your server ask the Host's LLM to generate text. This reverses the usual direction: the server sends the request and the client answers it. It enables agentic loops where your tool calls Claude mid-execution to reason about data before continuing. The Host stays in control: every sampling request must be approved before it runs.

› Agentic loops: tool execution that needs LLM reasoning mid-step
› Summarizing large fetched data before returning it
› Classification or extraction tasks within a pipeline
› Multi-step reasoning without leaving the server

Method: sampling/createMessage

sampling-flow.json

// Server → Client (reversed direction!)
// Your server is asking Claude to think
{
  "method": "sampling/createMessage",
  "params": {
    "messages": [{
      "role": "user",
      "content": { "type": "text",
        "text": "Summarize these 500 log entries: ..."
      }
    }],
    "maxTokens": 200
  }
}
// Host approves → Claude runs → returns summary
// Your server continues with the result

Primitive 05

Roots — Filesystem scope

Host-provided

Roots tell a server what filesystem paths the client considers in scope. The Host provides a list of directory URIs — this is how Cursor tells a server "you're working inside this project folder". Servers should respect these boundaries.

› Filesystem servers that need to know their working scope
› Code analysis tools operating on a specific project
› Document servers scoped to a particular directory
› Any server where "what am I allowed to look at?" matters

Methods: roots/list, notifications/roots/list_changed

roots-example.json

// Server asks: what's my working scope?
{ "method": "roots/list" }

// Host responds with allowed directories
{
  "roots": [{
    "uri": "file:///Users/me/my-project",
    "name": "my-project"
  }]
}

// When user opens a new folder, Host notifies:
// { "method": "notifications/roots/list_changed" }
// Server re-calls roots/list to get the new scope
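Respecting roots comes down to a containment check on the server side before any file is touched. The sketch below is one possible way to do it, using hypothetical helpers `normalizeFileUri` and `isWithinRoots`. It assumes plain `file:///` URIs and does not handle percent-encoding, symlinks, or Windows drive letters.

```typescript
type Root = { uri: string; name?: string };

// Collapses "." and ".." segments so "file:///a/b/../c" compares as "file:///a/c".
// Returns null for non-file URIs or paths that climb above the filesystem root.
function normalizeFileUri(uri: string): string | null {
  const prefix = "file:///";
  if (!uri.startsWith(prefix)) return null;
  const segments: string[] = [];
  for (const part of uri.slice(prefix.length).split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") {
      if (segments.length === 0) return null;
      segments.pop();
      continue;
    }
    segments.push(part);
  }
  return prefix + segments.join("/");
}

// True when the target is one of the roots or sits somewhere beneath one.
function isWithinRoots(targetUri: string, roots: Root[]): boolean {
  const target = normalizeFileUri(targetUri);
  if (target === null) return false;
  return roots.some((root) => {
    const base = normalizeFileUri(root.uri);
    if (base === null) return false;
    // Compare against "base/" so "my-project-secrets" does not match "my-project".
    const baseWithSlash = base.endsWith("/") ? base : base + "/";
    return target === base || target.startsWith(baseWithSlash);
  });
}

const roots: Root[] = [{ uri: "file:///Users/me/my-project", name: "my-project" }];
console.log(isWithinRoots("file:///Users/me/my-project/src/index.ts", roots)); // true
console.log(isWithinRoots("file:///Users/me/my-project/../other/file", roots)); // false
```

After a notifications/roots/list_changed message, the server would call roots/list again and run later checks against the new list.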

Transports

How messages physically travel

JSON-RPC is the format. Transports are the wire. Your choice depends on one thing: local tool or remote service?

Claude Desktop / Cursor

spawns your server as a subprocess
stdin ← JSON messages + newline
stdout → JSON messages + newline
stderr → debug logs (ignored)

Your Node.js / Python Server Process

Critical: Never write to stdout except through the MCP SDK. A stray console.log inserts text mid-stream and corrupts every subsequent message. Use console.error for logs — it goes to stderr, which the protocol ignores.
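The reason a stray log line is so damaging becomes clear once the framing is written out. The following sketch shows newline-delimited JSON framing with a hypothetical `LineDecoder` class and `encodeMessage` function. The SDK's stdio transport does the equivalent work for you, and its implementation may differ.

```typescript
// Serializes one message per line. JSON.stringify escapes embedded newlines,
// so the only raw "\n" in the output is the delimiter at the end.
function encodeMessage(message: unknown): string {
  return JSON.stringify(message) + "\n";
}

// Hypothetical decoder: accumulates chunks and emits every complete line as parsed JSON.
class LineDecoder {
  private buffer = "";

  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    this.buffer = lines.pop() ?? ""; // the last piece is an unfinished line (or empty)
    const messages: unknown[] = [];
    for (const line of lines) {
      if (line.trim() === "") continue;
      messages.push(JSON.parse(line)); // throws on non-JSON text such as a console.log line
    }
    return messages;
  }
}

const decoder = new LineDecoder();
const wire = encodeMessage({ jsonrpc: "2.0", id: 1, method: "ping" });

// Chunks can split a message anywhere; nothing is emitted until the newline arrives.
const afterFirstHalf = decoder.push(wire.slice(0, 10));
const afterSecondHalf = decoder.push(wire.slice(10));
console.log(afterFirstHalf.length, afterSecondHalf.length); // 0 1
```

Feeding the decoder a line such as "Server started on port 3000" makes `JSON.parse` throw, which is the failure a stray stdout write causes on the receiving side.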

When to use stdio

You're building a tool that runs on the user's own machine
Target: Claude Desktop, Cursor, VS Code Copilot, Zed
Zero network setup — the host spawns your process directly
Config lives in claude_desktop_config.json or .mcp.json
Session ends when the host process is killed

claude_desktop_config.json

{
  "mcpServers": {
    "my-server": {
      "command": "node",
      "args": ["/path/to/dist/index.js"],
      "env": { "API_KEY": "your-key" }
    }
  }
}

Any MCP Client (Claude, Cursor, ChatGPT…)

HTTP POST /mcp — send a request
Content-Type: application/json
← HTTP 200 JSON or SSE stream
Mcp-Session-Id: abc123

Your Server (Railway / Vercel / Cloudflare Workers)

When to use Streamable HTTP

You need a remote server that multiple users or clients connect to
One endpoint /mcp handles everything: POST for requests, optional GET for SSE streaming
Session IDs in headers let one server handle many concurrent users
Always validate the Origin header to prevent DNS rebinding attacks
Don't use the old HTTP+SSE transport: it was deprecated in spec 2025-03-26, when Streamable HTTP replaced it
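The Origin check mentioned in this list can be as simple as an exact-match allowlist. This is a minimal sketch with a hypothetical `isAllowedOrigin` function. Whether to accept requests that carry no Origin header at all (common for non-browser clients) is a policy decision, exposed here as the assumed `allowMissing` flag.

```typescript
// Exact string comparison avoids suffix tricks such as "https://app.example.com.evil.test".
function isAllowedOrigin(
  originHeader: string | undefined,
  allowedOrigins: readonly string[],
  allowMissing = false,
): boolean {
  if (originHeader === undefined) return allowMissing;
  return allowedOrigins.includes(originHeader);
}

// Example allowlist; the hostnames are placeholders.
const allowedOrigins = ["https://app.example.com", "http://localhost:3000"];

console.log(isAllowedOrigin("https://app.example.com", allowedOrigins)); // true
console.log(isAllowedOrigin("https://app.example.com.evil.test", allowedOrigins)); // false
```

A request that fails the check would typically be rejected with HTTP 403 before any JSON-RPC parsing happens.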

server.ts

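import { StreamableHTTPServerTransport } from
  '@modelcontextprotocol/sdk/server/streamableHttp.js';
// SDK v1.x API; req and res are the Express request and response
const transport = new StreamableHTTPServerTransport({
  sessionIdGenerator: undefined }); // stateless mode
await server.connect(transport);
await transport.handleRequest(req, res, req.body);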

Build It

A working server in under 30 lines

The official SDK handles all the protocol complexity. You define tools, connect a transport, and you're done.

src/index.ts

// npm install @modelcontextprotocol/sdk zod
// Use v1.x — v2 is pre-alpha until Q3 2026

import { McpServer } from
  '@modelcontextprotocol/sdk/server/mcp.js';
import { StdioServerTransport } from
  '@modelcontextprotocol/sdk/server/stdio.js';
import { z } from 'zod';

// Step 1 — create the server
const server = new McpServer({
  name: 'weather-mcp', version: '1.0.0'
});

// Step 2 — register tools
// Use registerTool() — server.tool() is deprecated
server.registerTool('get_weather', {
  description: 'Get current weather for a city',
  inputSchema: z.object({
    city: z.string().describe('City name')
  })
}, async ({ city }) => {
  // NEVER console.log here; use console.error
  // fetchWeather is your own async function that returns a string
  const data = await fetchWeather(city);
  return {
    content: [{ type: 'text', text: data }]
  };
});

// Step 3 — connect transport and start
const transport = new StdioServerTransport();
await server.connect(transport);

Test before connecting to Claude

Use MCP Inspector — a browser UI that lets you call your tools manually and watch every JSON-RPC message in real time.

npx @modelcontextprotocol/inspector node dist/index.js

Production Rules

Six rules that separate working from broken

These aren't preferences. Each one has a technical reason. Violate them and you get silent failures, crashes, or security holes.

01

Never console.log to stdout

stdout is the JSON-RPC channel. Every byte you write there goes directly into the protocol stream. A stray log line silently corrupts all subsequent messages — the session hangs with no error. Always use console.error, which goes to stderr and is ignored by the protocol.

Critical
02

Use registerTool() — not server.tool()

The old server.tool() API is deprecated per the official Anthropic skill reference. registerTool() is the supported path: better type safety, automatic schema handling, and compatible with future SDK versions.

Critical
03

Use Streamable HTTP — not HTTP+SSE

HTTP+SSE transport was deprecated in spec 2025-03-26, when Streamable HTTP replaced it. Hosts may drop support for it over time, so use the SDK's Streamable HTTP server transport for all new remote deployments.

Important
04

Pin SDK to v1.x for production

v2 of the TypeScript SDK is in pre-alpha with a target of Q3 2026 stable. v1.x receives bug fixes and security patches and is the only production-ready version. Pin your version explicitly in package.json.

Important
05

Validate all inputs with Zod

LLMs are non-deterministic. Claude may pass a string where you expect a number, omit optional fields, or send unexpected keys — especially after model updates. Zod validates inputs before your handler runs and generates the JSON Schema that tells Claude exactly what to send.

Best Practice
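To make rule 05 concrete, here is what input validation does, written out by hand against the get_weather schema from the discovery phase. It is a sketch for illustration only and not a substitute for Zod: `checkArguments` and its types are hypothetical, it supports only flat string, number, and boolean properties, and it rejects unexpected keys, which is stricter than JSON Schema's default.

```typescript
type PropertySchema = { type: "string" | "number" | "boolean" };
type InputSchema = {
  type: "object";
  properties: Record<string, PropertySchema>;
  required?: string[];
};

// Returns a list of problems; an empty list means the arguments match the schema.
function checkArguments(schema: InputSchema, args: Record<string, unknown>): string[] {
  const problems: string[] = [];
  for (const key of schema.required ?? []) {
    if (!(key in args)) problems.push(`missing required argument "${key}"`);
  }
  for (const [key, value] of Object.entries(args)) {
    const expected = schema.properties[key];
    if (!expected) {
      problems.push(`unexpected argument "${key}"`);
      continue;
    }
    if (typeof value !== expected.type) {
      problems.push(`"${key}" should be ${expected.type}, got ${typeof value}`);
    }
  }
  return problems;
}

const weatherSchema: InputSchema = {
  type: "object",
  properties: { city: { type: "string" } },
  required: ["city"],
};

console.log(checkArguments(weatherSchema, { city: "Paris" })); // []
console.log(checkArguments(weatherSchema, { city: 42 })); // [ '"city" should be string, got number' ]
```

When the list is non-empty, a server would normally answer with an Invalid params error (-32602) instead of running the handler.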
06

Keep tools under 10 per server

Tool descriptions consume context tokens. Too many tools force Claude to scan a long list on every request, increasing latency and reducing routing accuracy. One focused domain per server. If you have 15+ tools, split into two servers.

Best Practice
