Build a Custom MCP Server: The Guide That Goes Past Hello World

Every MCP tutorial ends at the exact moment things get interesting. You get a working stdio server that echoes back a string — and then you’re on your own the second you try to deploy it, add authentication, or point Claude Code at your company’s real APIs. This guide doesn’t stop there.

By the end, you’ll have a custom MCP server, fully deployed, authenticated, and running over Streamable HTTP, that Claude Code, Cursor, or Goose can use to query your actual internal APIs. We’ll cover every gap the other tutorials leave open: transport selection, OAuth 2.1, input validation, horizontal scaling caveats, and structured audit logging. Everything you need to make it production-ready, not just demo-ready.

Let’s build something real.

What MCP Actually Is (and Why the USB-C Analogy Finally Makes It Click)

Think about what USB-C solved. Before it, every device had its own connector: proprietary charging ports, Mini USB, Micro USB, DisplayPort. Every combination required a different cable, a different driver, a different mental model. USB-C collapsed all of that into one standard — plug in any device to any host, it just works.

Model Context Protocol (MCP) is USB-C for AI agent integrations.

Before MCP, if you wanted Claude Code, Cursor, and GitHub Copilot to all query your internal API, you’d write three separate custom tool integrations — one per agent, each with a different interface, each requiring maintenance every time you updated the API. MCP collapses that into a single server. Build it once, connect it to any compliant agent.

The numbers validate the momentum. MCP SDK downloads reached 97 million monthly by December 2025, and server downloads grew from ~100,000 in November 2024 to over 8 million by April 2025 — an 8,000% surge in five months (Nevermined.ai / MCP Manager Blog). This isn’t a niche experiment: 28% of Fortune 500 companies had implemented MCP servers in their AI stacks by Q1 2025, with fintech leading at 45% (Guptadeepak.com). If you’re still stuffing API docs into system prompts and hoping your agent figures it out, you’re already behind.

The Three MCP Primitives — Tools, Resources, and Prompts — and When to Use Each

MCP gives you three building blocks. Knowing which to reach for saves you from awkward over-engineering.

Tools are executable actions with side effects. An agent calls a tool, your server runs code, something happens — a database query executes, an API call fires, a record gets written. If it does something, it’s a Tool.

Resources are read-only data sources. Think of them as URL-addressable endpoints the agent can fetch: a list of open tickets, the current user’s profile, a schema snapshot. No mutations, just reads. Resources are ideal for giving agents ambient context without burning tokens on repeated tool calls.

Prompts are reusable, parameterized prompt templates. Instead of hoping each agent frames your API correctly from scratch, you bake well-engineered instructions into the protocol once and expose them through Prompts.
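To make the second and third primitives concrete, here is a hedged sketch of the shapes their handlers return. The names (`readOpenTickets`, `triageIncidentPrompt`) and the ticket data are illustrative; with the TypeScript SDK you would wire these up via `server.resource(...)` and `server.prompt(...)`, which is omitted here:

```typescript
// Illustrative sketch — handler shapes only; SDK registration omitted.

// A Resource handler: read-only, URI-addressed contents, no side effects.
async function readOpenTickets(uri: URL) {
  const tickets = [{ id: "T-101", status: "open" }]; // stand-in for a real lookup
  return {
    contents: [
      { uri: uri.href, mimeType: "application/json", text: JSON.stringify(tickets) },
    ],
  };
}

// A Prompt handler: a parameterized, pre-engineered message the agent reuses.
function triageIncidentPrompt(args: { incidentId: string }) {
  return {
    messages: [
      {
        role: "user" as const,
        content: {
          type: "text" as const,
          text: `Triage incident ${args.incidentId}: summarize impact, then propose next steps.`,
        },
      },
    ],
  };
}
```

The key contrast with Tools: both of these are pure reads and template fills — nothing in the system changes when they run.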

For the rest of this guide, we’ll focus on Tools — that’s where 90% of the practical value lives for most engineering teams.

Project Setup: TypeScript SDK, Zod, and ES Module Configuration from Scratch

Start with a clean directory. You’ll need Node.js 20+ and TypeScript 5.4+.

```bash
mkdir my-mcp-server && cd my-mcp-server
npm init -y
npm install @modelcontextprotocol/sdk zod express
npm install -D typescript @types/node @types/express tsx
```

Your `tsconfig.json` must target ES modules — the MCP TypeScript SDK is ESM-only:

```json
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "NodeNext",
    "moduleResolution": "NodeNext",
    "outDir": "./dist",
    "strict": true,
    "esModuleInterop": true
  },
  "include": ["src/**/*"]
}
```

Add these to `package.json`:

```json
{
  "type": "module",
  "scripts": {
    "build": "tsc",
    "start": "node dist/server.js",
    "dev": "tsx watch src/server.ts"
  }
}
```

Create `src/server.ts` — this is where everything lives from here on.

Building Your First Tool — Wrapping a Real API Endpoint with Typed Input Validation

Skip the toy examples. We’re wrapping a real internal REST API — a fictional but realistic `/v1/incidents` endpoint that returns active incidents filtered by severity.

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";

const server = new McpServer({
  name: "ops-api-server",
  version: "1.0.0",
});

const GetIncidentsSchema = z.object({
  severity: z.enum(["low", "medium", "high", "critical"]).optional(),
  limit: z.number().int().min(1).max(100).default(20),
});

server.tool(
  "get_incidents",
  "Fetch active incidents from the ops API, optionally filtered by severity",
  GetIncidentsSchema.shape, // the SDK expects the raw Zod shape, not the ZodObject
  async ({ severity, limit }) => {
    const params = new URLSearchParams({ limit: String(limit) });
    if (severity) params.set("severity", severity);

    const res = await fetch(
      `${process.env.OPS_API_BASE_URL}/v1/incidents?${params}`,
      {
        headers: {
          Authorization: `Bearer ${process.env.OPS_API_TOKEN}`,
          "Content-Type": "application/json",
        },
      }
    );

    if (!res.ok) {
      throw new Error(`API returned ${res.status}: ${await res.text()}`);
    }

    const data = await res.json();
    return {
      content: [{ type: "text", text: JSON.stringify(data, null, 2) }],
    };
  }
);
```

Three things to notice. First, the Zod schema is the source of truth — the SDK automatically generates the JSON Schema your agent sees from it, so you’re never maintaining two schemas in sync. Second, credentials come from environment variables only, never from hardcoded strings. Third, the handler returns a structured `content` array — that’s the MCP response contract every client expects.

Choosing Your Transport: Streamable HTTP for Production, stdio for Local Dev

This is where most tutorials quietly leave you stranded.

stdio is the original MCP transport. Your server reads JSON from stdin and writes to stdout. Zero network configuration, trivial to set up, and the right choice when an agent like Claude Desktop spawns your server as a child process on the same machine.

Streamable HTTP is the 2025/2026 standard for remote servers. A single HTTP endpoint handles both request-response and server-sent event streaming. One URL, full duplex, deployable anywhere. Remote MCP servers grew nearly 4x since May 2025 (Zuplo State of MCP Report) — and they’re all running Streamable HTTP.

The older SSE-only transport (separate `/sse` and `/messages` endpoints) is deprecated. Don’t build new servers on it.

For production, wire up Streamable HTTP with Express:

```typescript
import express from "express";
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";

const app = express();
app.use(express.json());

app.post("/mcp", async (req, res) => {
  // Stateless mode: a fresh transport per request, no session id issued,
  // so any node behind a load balancer can serve any request.
  // (Under heavy concurrency you may prefer a fresh McpServer per request too.)
  const transport = new StreamableHTTPServerTransport({
    sessionIdGenerator: undefined,
  });
  res.on("close", () => transport.close());
  await server.connect(transport);
  await transport.handleRequest(req, res, req.body);
});

app.get("/health", (_req, res) => res.json({ status: "ok" }));

app.listen(3000, "127.0.0.1"); // Never 0.0.0.0 in dev
```

Security note: Binding to `0.0.0.0` in development exposes your server to every network interface on the machine — including shared networks. This is the “NeighborJack” class of vulnerability. Always bind to `127.0.0.1` locally.

One critical caveat for production scaling: Streamable HTTP sessions are stateful by default. Each `StreamableHTTPServerTransport` instance holds session state in memory. If you deploy behind a load balancer with multiple instances and a request hits the wrong node, it will fail. Your options: use sticky sessions at the load balancer, externalize session state to Redis, or — the cleanest solution — design your tools to be fully stateless so any node can handle any request.

Authentication That Won’t Get You Fired — From API Keys to OAuth 2.1 Scopes

Here’s the uncomfortable reality: 24% of MCP servers operate with no authentication at all, and 53% still rely on static API keys despite OAuth 2.1 being the 2025 spec standard (Zuplo State of MCP Report; Data Science Collective). In an enterprise environment, a static key is a permanent backdoor.

Step 1: Bearer token validation (minimum viable)

Validate an `Authorization: Bearer <token>` header on every incoming MCP request before it reaches your tools:

```typescript
app.post("/mcp", async (req, res) => {
  const authHeader = req.headers["authorization"];
  if (!authHeader?.startsWith("Bearer ")) {
    res.status(401).json({ error: "Missing or invalid authorization header" });
    return;
  }

  const token = authHeader.slice(7); // strip the "Bearer " prefix
  const identity = await validateToken(token); // your verification logic
  if (!identity) {
    res.status(403).json({ error: "Invalid token" });
    return;
  }

  // Attach identity to request context for per-tool scope checks
  (req as any).identity = identity;

  // … proceed to transport handling
});
```
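The `validateToken` call above is yours to implement. In production it would verify a JWT against your identity provider's JWKS or hit an OAuth introspection endpoint; as a minimal illustrative sketch, assuming opaque tokens looked up in a server-side store (the store contents here are invented):

```typescript
interface Identity {
  sub: string;      // who the caller is
  scopes: string[]; // what they're allowed to do
}

// Illustrative in-memory store — in production, replace with JWT
// verification against your IdP's JWKS or a token introspection call.
const tokenStore = new Map<string, Identity>([
  ["dev-token-abc", { sub: "svc-claude-code", scopes: ["incidents:read"] }],
]);

async function validateToken(token: string): Promise<Identity | null> {
  return tokenStore.get(token) ?? null;
}
```

The important property is the return contract: a verified identity with scopes, or `null` — never a thrown exception that leaks verification details to the caller.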

Step 2: Per-tool scope enforcement

Different tools warrant different permission levels. A `get_incidents` tool shouldn’t require the same privileges as `create_deployment`:

```typescript
import { McpError, ErrorCode } from "@modelcontextprotocol/sdk/types.js";

server.tool("create_deployment", "…", CreateDeploymentSchema.shape, async (args, extra) => {
  // `extra.authInfo` is populated when a bearer token is verified upstream
  // (e.g. by the SDK's auth middleware); adapt to however you propagate identity.
  const identity = extra.authInfo;
  if (!identity?.scopes.includes("deployments:write")) {
    throw new McpError(
      ErrorCode.InvalidRequest,
      "Insufficient permissions: requires deployments:write scope"
    );
  }
  // … proceed
});
```

Step 3: Short-lived tokens and rotation

Static API keys don’t expire. Move to short-lived tokens (15–60 minute TTLs) issued by your identity provider. For machine-to-machine flows, use OAuth 2.1 client credentials. Store token refresh logic in a centralized middleware layer — not scattered across individual tool handlers.
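A common shape for that middleware layer is a provider that caches the short-lived token and refreshes it shortly before expiry. A sketch: the injected `fetchToken` function stands in for your IdP's client-credentials call and is an assumption, not a real API.

```typescript
type TokenResponse = { accessToken: string; expiresInSec: number };

// Caches a short-lived token and refreshes it 60s before expiry.
// `fetchToken` stands in for your IdP's OAuth 2.1 client-credentials request.
class CachedTokenProvider {
  private token: string | null = null;
  private expiresAt = 0; // epoch ms

  constructor(
    private fetchToken: () => Promise<TokenResponse>,
    private refreshSkewMs = 60_000,
  ) {}

  async get(): Promise<string> {
    if (this.token && Date.now() < this.expiresAt - this.refreshSkewMs) {
      return this.token; // cached token still comfortably valid
    }
    const { accessToken, expiresInSec } = await this.fetchToken();
    this.token = accessToken;
    this.expiresAt = Date.now() + expiresInSec * 1000;
    return accessToken;
  }
}
```

Tool handlers then call `provider.get()` instead of reading a static environment variable, and rotation happens in exactly one place.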

50% of MCP server builders cite security complexity as their top challenge (Zuplo State of MCP Report). The practical shortcut: front your MCP server with an API gateway (Kong, Zuplo, AWS API Gateway) that handles OAuth token validation, rate limiting, and key rotation at the infrastructure level. Your server code stays clean; the gateway handles the security perimeter.

Error Handling and Input Validation — Making Your Server Agent-Proof

Agents are optimistic. They’ll call your tools with malformed inputs, out-of-range values, and creative schema interpretations. If your server crashes or returns an unstructured exception, the agent may spiral — retrying indefinitely, hallucinating recovery steps, or failing silently.

Validate at the boundary. Zod handles this automatically when you pass schema to `server.tool()` — inputs that fail schema validation never reach your handler. That’s your first line of defense.

Return structured MCP errors, not generic thrown exceptions:

```typescript
import { McpError, ErrorCode } from "@modelcontextprotocol/sdk/types.js";

if (!isValidIncidentId(args.id)) {
  throw new McpError(
    ErrorCode.InvalidParams,
    `Invalid incident ID format: ${args.id}. Expected UUID v4.`
  );
}
```

The error codes you’ll reach for most: `InvalidParams` (bad input), `InternalError` (upstream failure), `InvalidRequest` (permissions/scope check failed), and `MethodNotFound` (the SDK handles this one automatically).

Return something useful on upstream failures rather than crashing:

```typescript
try {
  const data = await callUpstreamAPI(args);
  return { content: [{ type: "text", text: JSON.stringify(data) }] };
} catch (err) {
  return {
    content: [{
      type: "text",
      text: "Unable to fetch incidents right now. The ops API returned an error. Try again in a few moments or check the status page.",
    }],
    isError: true,
  };
}
```

Setting `isError: true` tells the agent this is an error state — it can factor that into its reasoning rather than treating the response as valid data and building on top of a failure.

Deploying Your MCP Server to Production and Connecting Claude Code, Cursor, and Goose

Deployment

Deploy your built server anywhere that runs Node.js: Railway, Render, Fly.io, or your internal Kubernetes cluster. Set these environment variables at minimum:

```
OPS_API_BASE_URL=https://internal-ops.example.com
OPS_API_TOKEN=<your-token>
NODE_ENV=production
PORT=3000
```

Your `/health` endpoint is not optional in production — use it as your load balancer health check and Kubernetes readiness probe. Without it, rolling deploys will route traffic to nodes that aren’t ready.
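On Kubernetes, that wiring looks roughly like the following container spec fragment. The container name, port, and timing values are illustrative; tune them to your deploy cadence.

```yaml
# Illustrative readiness/liveness probes against the /health endpoint
containers:
  - name: ops-mcp-server
    ports:
      - containerPort: 3000
    readinessProbe:        # gate traffic until the server is ready
      httpGet:
        path: /health
        port: 3000
      initialDelaySeconds: 5
      periodSeconds: 10
    livenessProbe:         # restart the container if it stops responding
      httpGet:
        path: /health
        port: 3000
      periodSeconds: 30
```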

Connecting Claude Code

In a `.mcp.json` file at your project root (Claude Code’s project-scoped MCP config; use `claude mcp add` for user-scoped setup):

```json
{
  "mcpServers": {
    "ops-api": {
      "type": "http",
      "url": "https://your-mcp-server.example.com/mcp",
      "headers": {
        "Authorization": "Bearer ${MCP_OPS_TOKEN}"
      }
    }
  }
}
```

Connecting Cursor

In `.cursor/mcp.json` at your project root:

```json
{
  "mcpServers": {
    "ops-api": {
      "url": "https://your-mcp-server.example.com/mcp",
      "headers": {
        "Authorization": "Bearer ${MCP_OPS_TOKEN}"
      }
    }
  }
}
```

Connecting Goose

In `~/.config/goose/config.yaml`:

```yaml
extensions:
  ops-api:
    type: remote
    url: https://your-mcp-server.example.com/mcp
    headers:
      Authorization: "Bearer ${MCP_OPS_TOKEN}"
    enabled: true
```

Restart the client after updating the config. Your registered tools should appear immediately in the agent’s tool list.

Debugging with MCP Inspector and What to Log for Audit Compliance

Using MCP Inspector

When a tool schema mismatch causes silent failures, `@modelcontextprotocol/inspector` is the fastest way to diagnose it:

```bash
npx @modelcontextprotocol/inspector http://localhost:3000/mcp
```

The Inspector opens a browser UI where you can browse all registered tools and their generated schemas, send test requests with custom inputs, inspect raw MCP protocol messages, and confirm the transport handshake completes correctly. Make this part of your pre-deployment checklist. If the Inspector can’t invoke your tool cleanly, no agent will.

Structured audit logging

For any MCP server touching production systems, compliance teams need logs they can query and alert on. Log in structured JSON at minimum:

```typescript
console.log(JSON.stringify({
  timestamp: new Date().toISOString(),
  event: "tool_call",
  tool: "get_incidents",
  identity: identity?.sub,  // WHO called it
  scopes: identity?.scopes, // WHAT permissions they had
  input_summary: {          // WHAT they asked for (no PII)
    severity: args.severity,
    limit: args.limit,
  },
  outcome: "success",       // or "error"
  duration_ms: Date.now() - startTime,
  upstream_status: 200,
}));
```

Never log raw LLM-supplied input values — they could contain PII from the conversation context. Log a sanitized summary instead. Ship these structured logs to your SIEM or log aggregator; JSON format makes them parseable by compliance automation tools without custom parsing rules.
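An allowlist makes the “sanitized summary” rule mechanical rather than a per-call judgment. A sketch, assuming a set of field names you have decided are safe to log (the names here are illustrative):

```typescript
// Fields considered safe for audit logs — everything else is redacted.
// The allowlist contents are an assumption; tune it per tool.
const AUDIT_SAFE_FIELDS = new Set(["severity", "limit", "status"]);

function sanitizeForAudit(
  input: Record<string, unknown>,
): Record<string, unknown> {
  const summary: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(input)) {
    summary[key] = AUDIT_SAFE_FIELDS.has(key) ? value : "[redacted]";
  }
  return summary;
}
```

Run every tool’s `args` through this before it reaches the log line, and new fields are redacted by default until someone consciously allowlists them.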

Build It Once, Connect It Everywhere

The gap between “MCP tutorial” and “MCP in production” is exactly what most guides refuse to cross. The implementation isn’t the hard part — deploying behind real authentication, defending against agent-supplied bad inputs, and satisfying your compliance team are where the actual work lives.

At Block, engineers using MCP-powered tooling reported 50–75% time savings on common developer tasks (Pento.ai — A Year of MCP: 2025 Review). That’s not magic. It’s the result of well-built, well-secured MCP server integrations that agents can rely on — not fragile stdio demos that fall apart the moment you leave localhost.

Start with one tool wrapping one internal API endpoint. Get it deployed. Connect it to Claude Code. From there, adding new tools takes under an hour once the scaffold is in place.

The agents are ready. Give them something real to work with.

Explore the [official MCP TypeScript SDK](https://github.com/modelcontextprotocol/typescript-sdk) and [MCP Inspector](https://github.com/modelcontextprotocol/inspector) to continue building.
