Like everyone else, I’ve been looking at SKILL.md files, and I tried converting some of my MCP tooling into that format. While it’s an interesting approach, I’ve found that it doesn’t work quite as well for me as MCP does, which is… intriguing.
## Concrete Example
Besides a bunch of actual work projects, I have been using MCP to manage this site’s content for a while now, and one of the things I need to do over time is convert a lot of legacy posts from Textile into Markdown. To facilitate this, I built an MCP server (using umcp) that provides a set of utilities to help with the conversion:
| Tool | Short description |
|---|---|
| audit_file | Audit a Markdown page’s reference links for missing/normalized internal targets. |
| audit_markdown | Audit raw Markdown text for internal link correctness (not file-based). |
| bulk_list_dir | Batch list directory entries for multiple relative paths. |
| check_links | Check multiple internal wiki link targets and report existence. |
| extract_links | Extract and classify inline/reference links from a page (internal/external/assets). |
| find_missing_internal | Scan the wiki for missing internal link targets (global audit). |
| fix_whitespace | Normalize whitespace in a wiki file (trim trailing spaces, collapse blank lines). |
| get_capabilities | Return a capabilities snapshot and wiki statistics (counts/features). |
| get_lisbon_time | Return current Europe/Lisbon local time formatted for frontmatter. |
| get_pages | List all index.* page paths under the space/ directory. |
| get_shorthands | Inspect dynamic shorthand mappings (active vs ambiguous). |
| help | Return detailed schema & usage examples for a named tool. |
| html_to_markdown | Convert HTML snippets to Markdown using markitdown (requires deps). |
| list_legacy_index_txt | List legacy index.txt files that don’t have a corresponding index.md. |
| optimize_image | Optimize one or more image files using platform-specific tools (ImageOptim/Curtail). |
| refresh_shorthands | Rebuild the shorthand mapping from current wiki pages. |
| resolve_internal | Suggest canonical page paths for outdated or shorthand internal targets. |
| restart_server | Exit the MCP server cleanly so it can be restarted (use after code changes). |
| search_internal_usage | Find pages referencing a particular internal target. |
| textile_to_markdown | Convert Textile snippets to Markdown via HTML → markitdown (requires deps). |
| validate_yaml | Validate YAML files using PyYAML and a whitelist of value types. |
Yeah, I might have gone a bit overboard with the number of tools, but it turns out reliably converting thousands of ancient pages has a lot of corner cases…
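To give a sense of how lightweight each of these is, here is a minimal sketch of what one might look like as an MCP tool. My server is built on umcp, but the overall shape is close enough to the official Python SDK’s FastMCP that I’ll use that here; the tool body is an illustrative stand-in, not my actual implementation:

```python
# Minimal sketch of an MCP tool using the official Python SDK's FastMCP.
# (My server uses umcp; this is just the general shape, and the body
# below is illustrative rather than my real logic.)
from pathlib import Path

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("wiki-tools")

@mcp.tool()
def check_links(targets: list[str]) -> dict[str, bool]:
    """Check multiple internal wiki link targets and report existence."""
    space = Path("space")  # wiki pages live under space/<page>/index.md
    return {t: (space / t / "index.md").exists() for t in targets}

if __name__ == "__main__":
    mcp.run()  # stdio transport by default
```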
## MCP Workflow Chaining
Now, I am not a huge fan of MCP in general, but what I’ve found is that it lets me implicitly chain tool invocations, whereas the SKILL.md approach seems to require very explicit step-by-step instructions, and it’s almost impossible to “chain” skills together in any meaningful way.
For example, one of the most useful things I can do when converting a page is to audit its internal links, extract all the references, resolve any ambiguous or missing targets, and then update the page with normalized links. In my MCP server, I can return the next steps for a workflow as part of the prompts associated with each tool, like so:
```python
# excerpt from prompt_explain_tool (simplified)
if "audit_file" in name:
    workflow = [
        "call tool_audit_file(path=...) to get missing/present link sets",
        "resolve ambiguous/missing via tool_resolve_internal / tool_find_missing_internal",
        "(optional) update links using planning prompt",
        "re-run tool_audit_file to confirm clean state",
    ]
elif "extract_links" in name:
    workflow = [
        "(optional) run tool_audit_file(detail=true) first",
        "invoke tool_extract_links to enumerate link references",
        "plan link normalizations (prompt_update_markdown_links)",
        "apply edits & re-audit",
    ]

# Related tools are chosen from candidates that co-occur in workflows
related_candidates = [
    "tool_audit_file",
    "tool_extract_links",
    "tool_resolve_internal",
    "tool_find_missing_internal",
    "tool_get_pages",
    "tool_refresh_shorthands",
    "tool_bulk_list_dir",
]
related_tools = [
    t for t in related_candidates
    if t in " ".join(workflow) or t.startswith(f"tool_{verb}")
]

# Recommend a next tool based on the workflow focus
if "audit_file" in name:
    recommended_next = ["tool_resolve_internal"]
elif "extract_links" in name:
    recommended_next = (
        ["tool_update_markdown_links"]
        if "tool_update_markdown_links" in registry
        else ["tool_audit_file"]
    )
else:
    recommended_next = related_tools[:1]

# this is what gets returned to the model
scaffold = {
    "tool": name,
    "summary": meta.get("description", ""),
    "workflow": workflow,
    "related_tools": related_tools,
    "recommended_next": recommended_next,
}
```
People who’ve used promptflow or similar frameworks will recognize this pattern of “next steps” prompting, and it works quite well in practice. The model can see the context of what it’s doing, what comes next, and how the tools relate to each other, so it can chain invocations naturally.
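To make that concrete, here is roughly what the scaffold looks like for audit_file (illustrative and trimmed, but assembled from the snippets above):

```python
# Illustrative scaffold returned for "audit_file" (trimmed):
{
    "tool": "audit_file",
    "summary": "Audit a Markdown page's reference links for missing/normalized internal targets.",
    "workflow": [
        "call tool_audit_file(path=...) to get missing/present link sets",
        "resolve ambiguous/missing via tool_resolve_internal / tool_find_missing_internal",
        "(optional) update links using planning prompt",
        "re-run tool_audit_file to confirm clean state",
    ],
    "related_tools": ["tool_audit_file", "tool_resolve_internal", "tool_find_missing_internal"],
    "recommended_next": ["tool_resolve_internal"],
}
```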
## Where SKILL.md Falls Short
With SKILL.md, however, no matter how many “next steps”, “related skills”, or “CRITICAL: MUST DO IN THIS ORDER” admonitions I add, even frontier models like Claude Opus 4.5 or gpt-5 struggle to chain the steps together. Each skill invocation tends to feel isolated, and I frequently have to intervene to connect the dots.
My convert-legacy skill is huge (around twice the size of this post so far) and contains dozens of explicit steps and quality requirements, but models still miss crucial transitions or misinterpret the intended flow, so it feels like I’m constantly fighting the format rather than using it.
But, more to the point, simpler skills don’t fare any better. A skill to audit a file’s links and suggest fixes should be straightforward, but I’ve seen models completely ignore the “suggest fixes” part or fail to produce actionable output unless I break it down into even more granular skills, which doesn’t reliably work either and further defeats the purpose of having a higher-level abstraction in the first place.
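For reference, a SKILL.md is just a Markdown file with YAML frontmatter (name and description) followed by instructions. A condensed, hypothetical version of that audit skill might look like this (the names and steps are illustrative, not my actual skill):

```markdown
---
name: audit-wiki-links
description: Audit a page's internal links and suggest fixes for missing or outdated targets.
---

1. Run the link audit on the target file and collect missing/ambiguous targets.
2. For each problem target, look up candidate canonical pages and draft a fix.
3. Present the suggested edits before applying them.
4. Re-run the audit and confirm a clean state.
```

Each of those numbered steps maps cleanly to a tool call, but nothing in the format nudges the model from one step to the next the way the scaffold above does.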
By contrast, with my MCP server, even when using smaller models like haiku or gpt-5-mini, the implicit workflows are more reliable because the tools narrow the context and present the model with its next steps.
This example is one I can share publicly, but over the past few months I’ve seen this pattern time and again across many implementations and use cases.
We sometimes “fix” the problem by upgrading to a higher-tier model, but that incurs cost and latency (Claude Opus 4.5, for example, is neither cheap nor fast). So at least for now, even considering all its foibles, I prefer the more deterministic behavior of MCP for complex multi-step tasks, and the fact that I can do it effectively using smaller, cheaper models is just icing on the cake.