MCP vs Claude Code Skills: When You Need a Protocol and When a Folder Is Enough
If you've been following the PromptForward blog, you've already seen me argue two things:
- In MCP Overload: Why Your LLM Agent Doesn't Need 20 Tools, I showed how bolting on tools like Lego bricks quietly kills your agents with cost, latency, and confusion.
- In Claude Code: Skills Beats MCP (One Tiny File at a Time), I made the case that small, file-based Skills are a much saner default for day-to-day work in Claude Code.
This post is the missing piece:
When should you actually use MCP, and when should you stick to Claude Code Skills and never spin up a server in the first place?
Let's make this practical, not philosophical.
1. One-paragraph recap: MCP vs Skills
MCP in one paragraph
The Model Context Protocol (MCP) is an open standard for connecting AI apps (Claude, ChatGPT, your own agents) to external systems. An MCP server exposes tools, resources, and prompts; an MCP client (inside your app or IDE) lets the model call those tools to hit databases, APIs, internal services, and more.
Think of it as "USB-C for AI integrations": implement MCP once, reuse the same server across many agents and UIs.
Claude Code Skills in one paragraph
Claude Code Skills are much simpler: each Skill is just a directory with a SKILL.md file and optional supporting files (scripts, templates, extra markdown). Claude reads SKILL.md when the Skill is relevant, and only pulls in extra files if it needs them.
They're built for developer workflows: you drop Skills into .claude/skills/ or ~/.claude/skills/, version them with git, and let Claude invoke them automatically.
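For illustration, a Skill is nothing more than a directory like this (the names here are invented, not from any official example):

```
.claude/skills/csv-cleanup/
├── SKILL.md        # short description + step-by-step instructions
├── reference.md    # column conventions, loaded only when needed
└── scripts/
    └── clean.py    # helper invoked by the instructions
```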
The short version:
- MCP: network protocol + servers + shared integrations
- Skills: folders + markdown + scripts in your repo or home directory
2. What the previous posts already argued
In MCP Overload, I walked through the failure mode most teams hit:
- Every "this might be handy later" idea becomes a new MCP tool.
- Tool definitions (names, schemas, descriptions) sit in the system prompt every turn.
- With 10-20 tools, you've burned a huge chunk of each context window on plumbing, not on actual task data.
Result: the agent gets slower, dumber, and more fragile, even though the code technically "works."
In Claude Code: Skills Beats MCP, I took the opposite direction:
- Encode your workflows in SKILL.md and defer heavy details (runbooks, long examples, form instructions) into extra files like reference.md and forms.md.
- Let Claude load depth on demand instead of stuffing everything into the system prompt.
- Use bash and small scripts instead of wrapping every tiny action behind a remote tool.
That post's core thesis was:
"For local coding and ops work, Skills are cheaper, easier, and closer to how senior engineers actually operate."
This post is about when MCP really is worth its complexity and when a Skill directory is all you need.
3. Architecture: protocol vs folder
MCP: shared integration layer
MCP gives you a host-client-server architecture:
- The host (IDE, agent runtime, app) manages context and tool calls.
- The client speaks MCP.
- Servers expose capabilities:
- Tools -> model-controlled actions (call APIs, query DBs, trigger workflows)
- Resources -> data for context (files, database rows, logs, screenshots, etc.)
- Prompts -> reusable prompt templates or flows
This is designed for enterprise-style scenarios: multiple apps, multiple agents, one shared way of talking to internal systems.
Pros:
- One MCP server can serve many clients and teams.
- Great place to centralize auth, logging, and policy.
- Works well for "connect everything to everything" environments.
Cons:
- You now run and operate servers.
- Tool catalogs tend to grow until they hurt.
- All that tool metadata soaks up your context budget.
Skills: local capability layer
Skills flip the model:
- A Skill is just a directory: SKILL.md with a short description and instructions, plus optionally reference.md, forms.md, examples.md, scripts, etc.
- Claude decides when to use a Skill based on the description.
- It only reads extra files when necessary.
Pros:
- No servers, no infra. It's just files in git.
- You can encode very rich workflows without bloating the base prompt.
- Plays nicely with bash, CLI tools, and small scripts, which many practitioners now prefer over massive tool catalogs.
Cons:
- Scope is local: repo, machine, or single agent deployment.
- Governance and access control are whatever your OS and git give you.
- Not ideal when you want a single canonical integration used across many systems.
4. Context and token behavior (where things really break)
Most of the pain in MCP Overload came from context behavior, not from the protocol itself.
MCP: front-loading everything
MCP tools need to be described to the model:
- Name
- Arguments (JSON schema)
- Descriptions that explain when to use them
The usual pattern is: load a whole catalog of tools at the start of a conversation. That metadata takes real space in the model's context window on every turn, right next to your actual inputs (stack traces, logs, user data).
If you're not ruthless about pruning:
- You lose context budget to tools you rarely call.
- The model has more ways to be wrong ("which of these five similar tools do I pick?").
- Latency and cost climb as you scale teams and use cases.
The MCP spec doesn't force you to do this badly, but in practice, tool catalogs creep.
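A rough back-of-the-envelope sketch makes the cost concrete. The tool entry below is hypothetical (the name, description, and schema are invented, though the shape mirrors a typical MCP tool listing), and the ~4-characters-per-token heuristic is only approximate:

```python
import json

# Hypothetical catalog entry -- name and schema invented for illustration.
TOOL = {
    "name": "get_logs",
    "description": "Fetch recent logs for a service. Use when debugging incidents.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "service": {"type": "string", "description": "Service identifier"},
            "since": {"type": "string", "description": "ISO-8601 timestamp"},
        },
        "required": ["service", "since"],
    },
}

def estimate_catalog_tokens(tools, chars_per_token=4):
    """Very rough estimate: serialized catalog length / ~4 chars per token."""
    return len(json.dumps(tools)) // chars_per_token

# One entry is cheap; a 20-tool catalog pays this cost on *every* turn,
# before any logs, stack traces, or user data enter the context.
per_turn_overhead = estimate_catalog_tokens([TOOL] * 20)
```

Swap in your real catalog and the number is usually sobering: the overhead scales linearly with tool count, whether or not a tool is ever called.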
Skills: progressive disclosure by design
Skills solve the same problem a different way:
- Only the short Skill descriptions are exposed for discovery.
- When Claude chooses a Skill, it reads SKILL.md.
- If it needs more detail (say, form-filling rules or long examples), it follows links into forms.md, reference.md, etc. -- only at that moment.
This is "progressive disclosure" baked into the design:
- You can attach large docs, detailed runbooks, or multi-step flows to a Skill.
- They stay out of the prompt until the model proves it needs them.
That's why for coding and operational workflows, Skills often feel quicker, cheaper, and more predictable than an overgrown MCP tool catalog.
5. So when should you use which?
Here's the practical rule set.
Use Skills by default when
You're working in:
- Claude Code, CLIs, or dev tooling.
- Tasks that are naturally code-adjacent:
- Refactors, migrations, and reviews.
- Incident handling and log spelunking.
- Release and rollout procedures.
- Data cleaning scripts (CSV/JSON/logs).
And you can hit what you need via:
- Local tools (git, kubectl, psql, etc.).
- Simple scripts and CLIs.
- Auth you already have on your dev machine.
In other words: if the job looks like "how a senior engineer would do it in a shell plus editor", start with Skills.
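To give a flavor of that: a Skill's "data cleaning" step is often just a tiny script like the one below -- a hypothetical example (the data and rules are invented), plain Python, no server involved:

```python
import csv
import io

# Hypothetical cleanup: drop rows with a missing email, trim stray whitespace.
RAW = """name,email
 Alice ,alice@example.com
Bob,
 Carol ,carol@example.com
"""

def clean_rows(text):
    """Parse CSV text; return stripped rows that have a non-empty email."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {key: value.strip() for key, value in row.items()}
        for row in reader
        if row["email"].strip()
    ]

cleaned = clean_rows(RAW)  # Bob's row is dropped; whitespace is trimmed
```

A Skill that references a script like this costs the model almost nothing until it's actually needed.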
Use MCP when
You have cross-cutting integrations and governance requirements:
- Many agents and apps all need to talk to the same:
- CRM, support system, billing, ERP.
- Data warehouse, feature store, logging backend.
- You need centralized auth, rate limits, and auditing for tool calls.
- You're building an AI platform for the rest of the company, not just one agent in one IDE.
This is where MCP's complexity actually pays for itself.
Use both when
You want nice local workflows and shared infra:
- Skills encode how your team works:
- "Investigate a production incident."
- "Roll back a bad release."
- "Clean and re-ingest this CSV."
- Inside those Skills, your scripts talk to:
- MCP servers (for shared systems like billing or auth).
- Local tools (for repo-specific actions).
Skills become the UX layer for engineers; MCP becomes the integration layer behind the scenes.
6. One concrete example: debugging an incident
Take a real task: "Help me debug this production incident."
You want the agent to:
- Pull recent logs.
- Check deploy history.
- Look at metrics.
- Suggest next steps, maybe even roll back.
Pure MCP design
You build an MCP server that exposes:
- get_logs(service, since)
- get_deploys(service)
- get_metrics(service, window)
- trigger_rollback(service, version)
You wire this into your host. The model chooses which tools to call in which order.
This works, but:
- The catalog grows as you add more services and variants.
- Tool descriptions and schemas eat context.
- "Why did it call this tool?" becomes a constant debugging question.
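The growth dynamic is easy to see in a toy registry -- deliberately *not* the real MCP SDK, just a stand-in showing how every new tool adds metadata the model has to carry:

```python
import inspect

# Toy tool registry -- an illustration of catalog growth, NOT the MCP SDK.
REGISTRY = {}

def tool(fn):
    """Register a function plus the metadata a model would need to see."""
    REGISTRY[fn.__name__] = {
        "description": inspect.getdoc(fn) or "",
        "params": list(inspect.signature(fn).parameters),
    }
    return fn

@tool
def get_logs(service, since):
    """Fetch recent logs for a service."""

@tool
def get_deploys(service):
    """List recent deploys for a service."""

@tool
def get_metrics(service, window):
    """Fetch metrics for a service over a time window."""

@tool
def trigger_rollback(service, version):
    """Roll a service back to a previous version."""

# Four tools already mean four names, four descriptions, and seven parameters
# of metadata riding along -- and real catalogs rarely stop at four.
```

Add a variant per service or per environment and the registry balloons, which is exactly the "MCP Overload" failure mode.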
Skills-first, MCP-backed design
Instead:
- You create a prod-incidents Skill:
- SKILL.md includes your team's runbook for incidents.
- reference.md explains dashboards, alert patterns, and on-call conventions.
- Scripts: scripts/fetch-logs.sh, scripts/fetch-metrics.sh, scripts/rollback.sh
- Those scripts call:
- MCP tools where shared infra is needed.
- Local CLIs or APIs where that's simpler.
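The SKILL.md for such a Skill might sketch out like this -- the contents are invented for illustration, not a canonical template:

```markdown
---
name: prod-incidents
description: Investigate production incidents using the team runbook.
---

## Runbook
1. Run scripts/fetch-logs.sh <service> to pull recent logs.
2. Check deploy history; see reference.md for dashboard links.
3. Compare metrics via scripts/fetch-metrics.sh against the last known-good window.
4. If a recent deploy correlates, propose scripts/rollback.sh -- never run it unprompted.
```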
Now, when you say "Investigate this incident":
- Claude picks the prod-incidents Skill.
- Follows your runbook from SKILL.md.
- Calls scripts that happen to hit MCP, but the workflow logic lives in the Skill.
You get:
- Clear behavior encoded in plain text.
- Minimal context overhead until you actually debug something.
- The option to move from "all local" to "part MCP" without rewriting how humans think about the task.
7. Opinionated design rules
Here are the rules I follow (and that informed both posts):
- Default to Skills. If you can solve it with SKILL.md plus some scripts, do that first.
- Keep MCP catalogs brutally small. Three to five well-designed, high-leverage tools? Great. Twenty half-baked tools? You're back in "MCP Overload" land.
- Push documentation out of the base prompt. In MCP, keep descriptions and schemas tight; move examples or runbooks into separate resources. In Skills, keep SKILL.md concise and link to reference.md or forms.md.
- Favor scripting over "API everything." A lot of heavy Claude Code users converge on the same pattern: a small number of tools, plus scripts and Skills doing most of the real work.
- Introduce MCP only when reuse and governance actually demand it. If it's just for one team, in one repo, in one IDE, Skills are enough.
8. Where PromptForward fits
There's a quieter but more important axis than "MCP vs Skills":
Does any of this still behave on real inputs after you change it?
Every change you make can break something:
- Add a new MCP tool or server -> the model starts picking it in places you didn't plan.
- Change a SKILL.md description -> the model suddenly stops using a workflow that used to work.
- Update a script -> an edge case now throws or returns slightly different output.
This is where PromptForward comes in:
- Treat prompt changes, MCP tool changes, and Skill changes like code.
- Run them against saved datasets: logs, tickets, emails, stack traces, internal docs -- whatever your world looks like.
- Catch regressions before they hit production or your users.
The earlier posts already hinted at this: tool choice is architecture, but reliability comes from testing.
9. TL;DR for busy engineers
If you skimmed everything:
- Claude Code Skills are how you encode local workflows: folders plus markdown plus scripts, lazy-loaded into context. Default to these.
- MCP is how you build a shared integration layer for many agents and apps with proper auth, logging, and policy. Use it when you actually need that.
- The sweet spot is often Skills as UX, MCP as backend.
- Whichever you pick, you still need to run your agents on real datasets and check for regressions. That's the hole PromptForward is designed to fill.
Don't start with "let's wire up every tool in the company."
Start with: "What do we want the agent to reliably do? And how do we prove it keeps working?"