Last week, Anthropic engineer Thariq published a thread about how the Claude Code team uses skills internally. It got 2.7 million views. The reason is simple: most people building agent skills are doing it wrong, and Thariq laid out exactly why.

We read the thread, cross-referenced it with our experience reviewing hundreds of skill submissions on AgentSkillExchange, and distilled the most actionable lessons into this guide. If you write skills, or plan to, this is the post to bookmark.

The Core Problem: Skills That Exist but Don’t Work

Here’s the uncomfortable truth about most SKILL.md files: they’re mediocre. They restate things the model already knows. They’re either too rigid or too vague. They dump everything into one massive file and hope for the best.

Thariq put it bluntly: “Don’t state the obvious.” Claude already knows how to write Python. It already knows what REST APIs are. If your skill is just restating common programming knowledge, you’ve wasted tokens and context window space without adding value.

The skills that actually work focus on one thing: information that pushes the model out of its normal thinking patterns. That’s it. Everything else is noise.

Lesson 1: Build a Gotchas Section (It’s the Highest-Signal Content)

If you take one thing from this post, make it this: every skill needs a Gotchas section, and it should be the part you spend the most time writing.

A gotchas section captures the failure modes that an LLM will hit repeatedly. These are the things that aren’t in the training data, or the things where the “obvious” approach is wrong for your specific context.

What makes a good gotchas section:

  • Real failure points: not hypothetical ones. If Claude hasn’t actually failed at it, it probably doesn’t need to be there.
  • Specific corrections: don’t just say “be careful with X.” Say “when you encounter X, do Y instead of Z because of [reason].”
  • Iterative improvement: your gotchas section should grow over time as you discover new failure patterns.

Example: A bad gotchas section

## Gotchas
- Be careful with API rate limits
- Make sure to handle errors
- Don't forget to validate input

This is useless. Claude already knows all of this. You’ve burned tokens saying nothing.

Example: A good gotchas section

## Gotchas
- The Stripe API returns null for invoice.subscription on one-time payments,
  but the TypeScript types say it's always a string. Always null-check it.
- When creating customers, the metadata field silently truncates values
  longer than 500 characters. Validate length before sending.
- Webhook signatures use the RAW request body. If you parse JSON first
  and re-stringify, the signature check will fail every time.
  Use express.raw() middleware, not express.json().

See the difference? Every entry is specific, corrective, and born from an actual failure. This is the kind of content that turns a mediocre skill into a great one.
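The third gotcha is worth internalizing, because it bites in every language, not just Express. A minimal Python sketch of the underlying problem (the secret and payload here are made up; real providers each have their own signature header format):

```python
import hashlib
import hmac
import json

SECRET = b"whsec_test_example"  # hypothetical webhook secret

def sign(body: bytes) -> str:
    """HMAC-SHA256 over the exact bytes received on the wire."""
    return hmac.new(SECRET, body, hashlib.sha256).hexdigest()

# The provider signs the raw request body it sent.
raw_body = b'{"id":"evt_1","amount":500}'
expected = sign(raw_body)

# Verifying against the raw bytes works.
assert hmac.compare_digest(sign(raw_body), expected)

# Parsing to JSON and re-serializing changes the bytes
# (whitespace, key order, escaping), so the check fails.
reparsed = json.dumps(json.loads(raw_body)).encode()
assert reparsed != raw_body
assert sign(reparsed) != expected
```

The signature is over bytes, not over the parsed object, which is why middleware ordering matters so much.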

Lesson 2: The Description Field Is for the Model, Not for Humans

This might be the most misunderstood part of skill creation. The description field in your SKILL.md frontmatter isn’t a summary. It’s not marketing copy. It’s a trigger description that tells the model when to activate the skill.

Claude reads every installed skill’s description before deciding which one (if any) to load. If your description doesn’t match the kinds of requests users actually make, your skill will never activate, even if it’s perfectly written inside.

How to write an effective description

Think about the exact phrases and scenarios that should trigger your skill. Include them explicitly:

# Bad description
description: A skill for working with Kubernetes deployments.

# Good description
description: >
  Use when deploying, scaling, or debugging Kubernetes workloads.
  Triggers on: "deploy to k8s", "scale pods", "why is my pod crashing",
  "kubectl apply failing", "CrashLoopBackOff", "OOMKilled",
  "ImagePullBackOff". NOT for: cluster setup, Helm chart creation.

On AgentSkillExchange, skills with well-optimized descriptions see significantly higher activation rates. It’s the difference between a skill that gets used and one that gets installed but ignored.

Lesson 3: Use the File System: Progressive Disclosure Matters

A 2,000-line SKILL.md is a problem. Not because the content is bad, but because loading all of it into context at activation time wastes the model’s working memory on information that might not be relevant to the current task.

The fix is progressive disclosure: split your skill across multiple files and let the model pull in what it needs.

The recommended structure

my-skill/
├── SKILL.md              # Core instructions (keep under ~500 lines)
├── config.json           # User-specific settings
├── references/
│   ├── api-patterns.md   # Detailed API documentation
│   ├── error-codes.md    # Complete error reference
│   └── migration-guide.md
├── scripts/
│   ├── deploy.sh         # Reusable deployment script
│   └── validate.py       # Validation helper
└── examples/
    ├── basic-setup.md
    └── advanced-config.md

Your SKILL.md stays lean: it has the gotchas, the core workflow, and pointers to the reference files. When Claude needs the detailed API patterns, it reads references/api-patterns.md. When it doesn’t, those tokens stay available for actual work.
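As a sketch, the pointer section of a lean SKILL.md might look like this (file names follow the example tree above; adapt to your own layout):

```markdown
## Core workflow
1. Read config.json for environment settings.
2. For endpoint details, read references/api-patterns.md.
3. For deploys, run scripts/deploy.sh rather than hand-rolling commands.

## Where to look
- references/error-codes.md   -- full error reference; load only when debugging
- examples/advanced-config.md -- load only for non-default setups
```

The reference files carry the bulk; SKILL.md just tells the model they exist and when to open them.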

Lesson 4: Don’t Railroad the Model

There’s a temptation to write skills as rigid step-by-step scripts. This feels safe. It’s also a trap.

Over-prescriptive skills break when the model encounters situations you didn’t anticipate. Instead of scripting every action, provide:

  • Goals and constraints: what should the outcome look like?
  • Decision frameworks: when should it choose approach A vs. approach B?
  • Escape hatches: what should it do when the normal path doesn’t apply?

Give Claude the judgment calls. Your skill should provide the context for good judgment, not replace it with a script.
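A quick before-and-after sketch of the same instruction (the commands here are illustrative placeholders, not from any real skill):

```markdown
<!-- Railroaded: breaks the moment step 2 doesn't apply -->
1. Run `npm test`
2. If tests pass, run `npm run deploy`

<!-- Framework: goal, decision rule, escape hatch -->
Goal: ship the change with green tests and no downtime.
- Prefer `npm run deploy` when the full suite passes.
- If only tests unrelated to the change fail, flag them but proceed.
- If anything about the build looks unusual, stop and ask the user
  instead of guessing.
```

The second version survives the cases the first one never imagined.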

Lesson 5: Think Through Setup and Configuration

Good skills are portable. They work on different machines, for different users, without requiring code changes. The key is externalizing configuration.

// config.json
{
  "api_base_url": "https://api.mycompany.com",
  "default_environment": "staging",
  "team_slack_channel": "#deployments"
}

Your SKILL.md should reference config.json for any value that varies between users. For sensitive data, use environment variables and document which ones are required.
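A helper script shipped with the skill might load that file like this (a minimal sketch; the STRIPE_API_KEY variable name is illustrative, not a requirement of any skill format):

```python
import json
import os
from pathlib import Path

def load_config(path: str = "config.json") -> dict:
    """Load user-specific settings; secrets come from the environment."""
    config = json.loads(Path(path).read_text())
    # Sensitive values never live in config.json. The skill's docs
    # should list which environment variables are required.
    api_key = os.environ.get("STRIPE_API_KEY")  # illustrative name
    if api_key is None:
        raise RuntimeError("STRIPE_API_KEY must be set in the environment")
    config["api_key"] = api_key
    return config
```

Everything that varies per user sits in the file; everything secret sits in the environment.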

Lesson 6: Store Scripts and Generate Code

Skills can include executable scripts in a scripts/ directory. Instead of writing lengthy SKILL.md instructions explaining how to perform a complex operation, write a script that does it and tell Claude to run the script. This gives you the reliability of tested, version-controlled scripts with the flexibility of natural language instructions around them.
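As a sketch of the pattern, a scripts/validate.py helper could enforce the metadata-length gotcha from earlier so Claude runs a tested check instead of re-deriving the rule each time (the 500-character limit mirrors the example gotcha above; everything else is illustrative):

```python
MAX_METADATA_LEN = 500  # mirrors the truncation limit from the gotcha above

def validate_metadata(metadata: dict) -> list[str]:
    """Return a list of problems; an empty list means safe to send."""
    problems = []
    for key, value in metadata.items():
        length = len(str(value))
        if length > MAX_METADATA_LEN:
            problems.append(
                f"metadata[{key!r}] is {length} chars; values over "
                f"{MAX_METADATA_LEN} are silently truncated"
            )
    return problems

if __name__ == "__main__":
    # e.g. invoked by the skill as: python scripts/validate.py
    for problem in validate_metadata({"note": "x" * 600}):
        print(problem)
```

The SKILL.md instruction then shrinks to one line: run validate.py before sending, and fix anything it reports.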

The Marketplace Reality: Curation Over Quantity

Thariq’s thread included a warning: “It’s easy to create bad or redundant skills — curation before release is important.”

The most common problems with submitted skills:

  1. Restating common knowledge: a “Python best practices” skill that just says “use type hints and write tests”
  2. Overlapping scope: three different skills for “code review” that all do roughly the same thing
  3. Missing gotchas: the skill works for the author’s exact setup and breaks everywhere else
  4. Overloaded scope: one skill trying to do everything

Before publishing a skill, ask yourself: Does this contain information Claude can’t already figure out on its own? If the answer is no, your skill needs more work.

Quick Reference: The Skill Quality Checklist

  • โ˜ Description field includes trigger phrases and exclusions
  • โ˜ Gotchas section exists with specific, real failure points
  • โ˜ No “stating the obvious” โ€” every line adds non-obvious value
  • โ˜ File structure uses progressive disclosure
  • โ˜ SKILL.md is under 500 lines; details are in reference files
  • โ˜ Configuration is externalized to config.json, not hardcoded
  • โ˜ Required environment variables are documented
  • โ˜ Instructions provide goals and constraints, not rigid scripts
  • โ˜ Scope is clear and doesn’t overlap with existing skills
  • โ˜ Tested with actual tasks, not just read through

What’s Next

The agent skills ecosystem is growing fast. Anthropic’s official documentation, the AgentSkills.io standard, and the Skilljar course are all worth exploring.

Over the next month, we’ll be publishing daily guides covering specific skill categories, tutorials for building your own, and spotlights on the most popular skills in the marketplace. Follow the blog to stay current.

And if you’ve built something good โ€” submit it to AgentSkillExchange. The bar is high, but that’s what makes the marketplace worth using.