The Hidden Cost of Markdown-Only AI Skills (And Why Scripts Are Worth 10x More)
Most OpenClaw skills on ClawHub are markdown files. A SKILL.md with instructions, maybe a reference doc. No scripts, no automation, no executable code.
They look like products. They're organized neatly. Some have hundreds of downloads. But here's the thing nobody talks about: every time you use a markdown-only skill, your AI agent has to think through the entire process from scratch. That thinking costs tokens. Tokens cost money.
We measured the real cost difference. The numbers aren't close.
TL;DR: A markdown-only skill costs $1.50-5.00 per use in AI tokens because the LLM must read, interpret, and execute every step. An automated skill with bash scripts costs $0.01-0.10 per use because the script does the work and the LLM just reads the output. Over 100 uses, that's $150-500 vs $1-10. The script pays for itself on the first run.
The Two Types of Skills
Let's be specific about what we're comparing.
❌ Markdown-Only Skill
- SKILL.md with instructions
- No scripts/ folder
- No executable code
- LLM does ALL the work
- Every run = full token cost
✅ Automated Skill
- SKILL.md with strategy
- 3-4 bash scripts
- Reference frameworks
- Scripts do the heavy lifting
- LLM reads output only
Here's what we measured across our own skills. The "markdown-only" column represents typical free skills on ClawHub. The "automated" column is what we built:
| Skill | Markdown-Only | Automated |
|---|---|---|
| copywriter | 89 lines, 0 scripts | — |
| crm | 76 lines, 0 scripts | — |
| delegation | 80 lines, 0 scripts | — |
| outreach | 84 lines, 0 scripts | — |
| Skill Audit Pro | — | 394 lines, 1 script (184 lines) |
| Cold Outreach Pro | — | 885 lines, 3 scripts (341 lines) |
| Market Research Pro | — | 890 lines, 4 scripts (413 lines) |
| SEO Audit Pro | — | 948 lines, 4 scripts (656 lines) |
| X Monitor Pro | — | 719 lines, 4 scripts (494 lines) |
| AI Visibility Pro | — | 1,051 lines, 3 scripts (564 lines) |
Average markdown-only skill: 82 lines, 0 scripts.
Average automated skill: 815 lines, 3 scripts, 442 lines of executable code.
The Token Math (Where It Gets Expensive)
Let's trace what happens when you ask your AI agent to do a security audit using each approach.
Markdown-only: "Follow these instructions"
The LLM has to reason about every step. It reads the markdown instructions, decides what to do, generates commands, runs them, reads the output, reasons about what it means, then generates a report. That's 5,000-12,000 tokens of work — every single time.
Automated: "Run this script"
The script does the work. The LLM just runs one command and reads the result. That's 1,200 tokens vs 12,000 tokens. A 10x reduction.
The Real Cost Over Time
| Usage | Markdown-Only (Sonnet) | Automated (Sonnet) | Savings |
|---|---|---|---|
| 1 run | $0.83 | $0.02 | $0.81 |
| 10 runs | $8.30 | $0.20 | $8.10 |
| 50 runs | $41.50 | $1.00 | $40.50 |
| 100 runs | $83.00 | $2.00 | $81.00 |
| Daily for a year | $302.95 | $7.30 | $295.65 |
At the Sonnet rates above, the markdown-only approach passes the price of a $19 automated skill within about 25 runs. The savings start on run #1 and compound from there.
The Opus multiplier: If you're running these on Claude Opus ($75/M output tokens), multiply the markdown-only costs by five. That's $415 for 100 runs. The automated version? Still $10. Same output, roughly 40x cheaper.
But It's Worse Than Just Tokens
Token cost is the obvious problem. Here are the ones nobody mentions:
1. Inconsistency
A markdown-only skill produces different output every time. The LLM interprets the instructions differently based on context, mood of the model, and how the prompt lands. Run it Monday, get 6 checks. Run it Friday, get 8 checks. Run it with a different model, get something completely different.
A script runs the same 8 checks every time. Same input → same output. That's not a feature — it's a requirement for anything you'd actually rely on.
2. Speed
The markdown approach takes 30-90 seconds because the LLM is thinking. The script takes 10 seconds because bash doesn't need to think — it just executes.
When you're auditing 10 skills in a batch (like we did), that's the difference between 15 minutes and 2 minutes.
3. Context window pollution
Every token the LLM uses to "think through" a markdown skill is a token that can't be used for your actual work. If you're in the middle of a complex project and ask for a security audit, the markdown approach dumps 12,000 tokens of reasoning into your context window. That pushes out earlier conversation, makes the model forget what you were working on, and degrades quality for everything that follows.
The script approach uses 1,200 tokens. Your context stays clean.
4. Failure modes
When a markdown skill fails, you get vague output. The LLM might skip a check, misinterpret an instruction, or hallucinate a result. You won't know unless you manually verify every line.
When a script fails, you get an error code. An error like `grep: patterns.txt: No such file or directory` tells you exactly what went wrong (the filename here is illustrative). Debug time: seconds instead of minutes.
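That fail-fast behavior takes only a few lines of bash. A minimal sketch (the `name:` field check, exit codes, and filenames here are illustrative assumptions, not from any shipped skill):

```shell
#!/usr/bin/env bash
set -euo pipefail

# check_skill_md <file>: sketch of script-side failure reporting.
# Fail fast with a precise message instead of letting a vague step slide.
check_skill_md() {
  local target="$1"
  if [ ! -f "$target" ]; then
    echo "FAIL: $target: no such file" >&2
    return 2
  fi
  if ! grep -q '^name:' "$target"; then
    echo "FAIL: $target: missing 'name:' field" >&2
    return 1
  fi
  echo "OK: $target"
}

# Demo: a valid file passes; a missing one reports exactly what broke.
demo=$(mktemp)
printf 'name: example-skill\n' > "$demo"
check_skill_md "$demo"
check_skill_md "/nonexistent/SKILL.md" || echo "exit code: $?"
rm -f "$demo"
```

Each failure mode gets its own exit code, so a caller (human or LLM) knows the difference between "file missing" and "field missing" without re-reading anything.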
The Six Categories We Measured
We built automated versions of six common skill categories and measured the difference:
| Category | Markdown Cost/Run | Script Cost/Run | Multiplier |
|---|---|---|---|
| Security Audit | $1.20 | $0.02 | 60x cheaper |
| Cold Outreach (ICP + sequence) | $2.50 | $0.08 | 31x cheaper |
| Market Research (TAM + SWOT) | $3.00 | $0.10 | 30x cheaper |
| SEO Audit (meta + speed) | $1.80 | $0.03 | 60x cheaper |
| X Monitoring (search + filter) | $0.90 | $0.01 | 90x cheaper |
| GEO Audit (12 signals) | $1.50 | $0.02 | 75x cheaper |
The pattern is consistent: scripts are 30-90x cheaper per run. The more structured and repeatable the task, the bigger the gap.
Real Example: Security Scanning
Let's look at the most dramatic comparison. Our automated scanner runs the same fixed battery of checks on every invocation and prints a summary in about 10 seconds.
Now here's what happens with a markdown-only approach. The LLM reads instructions like "check for credential access patterns" and has to:
- Decide what grep patterns to use (inventing them from scratch)
- Run each grep command individually
- Read each output
- Decide what's a false positive vs real issue
- Format it into something readable
- Summarize the findings
That's 6 reasoning steps, each generating hundreds of tokens. The LLM might miss patterns it didn't think of. It might check 5 categories instead of 8. And it'll cost $1.20 in tokens for a result that's less reliable than the $0.02 script.
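The script side of that comparison can be sketched in a few dozen lines of bash. A minimal illustration (the check labels, grep patterns, and demo fixture below are assumptions for the sketch, not the shipped scanner):

```shell
#!/usr/bin/env bash
set -euo pipefail

# scan_skill <dir>: run the SAME fixed checks every time and emit a
# machine-readable summary the LLM can read in a few hundred tokens.
scan_skill() {
  local skill_dir="$1" findings=0
  # Each entry: "label|extended-regex pattern" (examples, not exhaustive)
  local checks=(
    "credential access|(api_key|secret|token)[[:space:]]*="
    "piped remote shell|curl.*[|][[:space:]]*(ba)?sh"
    "env dumping|printenv"
  )
  local check label pattern
  for check in "${checks[@]}"; do
    label="${check%%|*}"
    pattern="${check#*|}"
    if grep -rEiq "$pattern" "$skill_dir" 2>/dev/null; then
      echo "FLAG: $label"
      findings=$((findings + 1))
    fi
  done
  echo "SCAN COMPLETE: ${#checks[@]} checks, $findings flag(s)"
}

# Demo on a throwaway fixture so the sketch is runnable as-is.
demo=$(mktemp -d)
printf 'api_key = "sk-123"\n' > "$demo/helper.sh"
scan_skill "$demo"
rm -rf "$demo"
```

Because the checks live in a fixed list, every run covers all of them; adding a ninth check is one more line in the array, not another round of LLM reasoning.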
Why Do Markdown-Only Skills Exist?
Three reasons:
- They're easy to write. Anyone can write 80 lines of instructions. Writing a 184-line bash script that handles edge cases takes actual engineering.
- They look like products. A well-organized SKILL.md with headers and bullet points looks professional. Nobody checks if there's a scripts/ folder.
- The cost is hidden. You don't see "this run cost $1.20 in tokens" anywhere. It's buried in your monthly API bill. The skill was free — the ongoing cost is invisible.
We're not saying markdown skills are useless. For strategy, creative writing, brainstorming, and one-time tasks, markdown instructions are fine. The LLM's reasoning IS the value. But for repeatable, structured tasks — audits, scans, calculations, monitoring — a script that does the work will always beat instructions that make the LLM do the work.
How to Tell the Difference Before Installing
Before you install any skill, check these three things:
- Does it have a scripts/ folder? No scripts = the LLM does all the work = token cost every run.
- Count the lines. Under 100 lines total? It's probably instructions, not automation. Our automated skills average 815 lines.
- Look for .sh, .py, or .js files. Executable code means work happens outside the LLM. That's what you want.
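All three checks are themselves scriptable. A hedged sketch (it assumes the common layout of a skill directory with SKILL.md at the root and an optional scripts/ folder):

```shell
#!/usr/bin/env bash
set -euo pipefail

# triage_skill <dir>: the three pre-install checks as one deterministic pass.
triage_skill() {
  local skill="$1"
  if [ -d "$skill/scripts" ]; then
    echo "scripts/ folder: yes"
  else
    echo "scripts/ folder: NO (the LLM will do all the work)"
  fi
  local total
  total=$(find "$skill" -type f \( -name '*.md' -o -name '*.sh' \
            -o -name '*.py' -o -name '*.js' \) -exec cat {} + | wc -l | tr -d ' ')
  echo "total lines: $total"
  local execs
  execs=$(find "$skill" -type f \( -name '*.sh' -o -name '*.py' \
            -o -name '*.js' \) | wc -l | tr -d ' ')
  echo "executable files: $execs"
}

# Demo on a throwaway markdown-only "skill".
demo=$(mktemp -d)
printf '# instructions\n- step 1\n- step 2\n' > "$demo/SKILL.md"
triage_skill "$demo"
rm -rf "$demo"
```

Run it against a skill directory before installing; "executable files: 0" is the red flag this whole article is about.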
Quick rule of thumb: if the entire skill fits in a single SKILL.md with no scripts, you could write it yourself in 5 minutes. The value of a skill is in what it automates, not what it instructs.
FAQ
Don't markdown skills work fine if I use a cheap model?
Cheaper per token, yes. But cheap models produce worse results from markdown instructions — they miss steps, hallucinate patterns, and generate inconsistent output. The irony: you need an expensive model to make a markdown skill work well, but an automated script runs identically on any model because the script does the work.
What about skills that use the LLM for creative tasks?
For creative work (writing, brainstorming, strategy), the LLM's reasoning IS the product. Markdown skills are fine there. This analysis applies to structured, repeatable tasks where the output should be consistent.
How much does a typical user spend on token waste from markdown skills?
If you use 5 markdown-only skills daily, that's roughly $4-12/day in unnecessary token usage — $120-360/month. Replacing those with automated versions drops it to $0.15-0.50/day. Over a year, that's $1,400-4,300 in savings.
Can I convert my markdown skills to automated ones?
Yes. Identify the repeatable parts (anything with a checklist, scoring system, or structured output) and write a bash script for those. Keep the strategy and creative guidance in SKILL.md. Let the script handle the execution.
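As a concrete (and entirely illustrative) conversion, here is one markdown instruction, "flag meta descriptions over 160 characters", turned into a deterministic check; the HTML field format and the 160-character limit are assumptions for the sketch:

```shell
#!/usr/bin/env bash
set -euo pipefail

# meta_check <html-file>: one former markdown instruction as a script check.
meta_check() {
  local page="$1" desc len
  # Extract the first meta description, if any (simple pattern, not a parser).
  desc=$(grep -oE '<meta name="description" content="[^"]*"' "$page" \
           | head -n1 | sed -E 's/.*content="([^"]*)".*/\1/' || true)
  len=${#desc}
  if [ "$len" -eq 0 ]; then
    echo "FAIL: no meta description"
  elif [ "$len" -gt 160 ]; then
    echo "WARN: meta description is $len chars (limit 160)"
  else
    echo "PASS: meta description is $len chars"
  fi
}

# Demo fixture.
demo=$(mktemp)
printf '<meta name="description" content="A short example description">\n' > "$demo"
meta_check "$demo"
rm -f "$demo"
```

Every checklist item you convert this way stops costing reasoning tokens and starts returning the same answer every run.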
Are all paid skills automated?
No. Many paid skills on ClawMart are just longer markdown files. Always check for a scripts/ folder before buying. If it's $19 for 80 lines of markdown with no scripts, you're paying for something you could write in 5 minutes.
Get 19 automation scripts that actually work
6 skills. 2,652 lines of executable code. Security auditing, outreach, market research, SEO, X monitoring, and AI visibility — all automated with bash scripts that run in seconds, not minutes.
Get the Bundle ($59)
See Individual Skills