Building Cross-Framework Skills: One Skill, Multiple Agents (Codex + Claude Code + Cursor)
If you want one reusable asset instead of three separate prompt packs, build a cross-framework skill. A good cross-framework skill keeps the workflow stable while swapping the adapter layer for Claude Code, Codex, or Cursor. That means shared instructions, framework-specific wrappers, small config files, and a test matrix that catches drift before users do.
This article is for skill authors, internal platform teams, and developer-tool builders who want portable agent skills without flattening every framework into the lowest common denominator. We will cover the file structure, detection logic, prompting patterns, test strategy, and the mistakes that make “portable” skills brittle.
Quick read
Key takeaways
- Keep the workflow shared, but move framework quirks into thin adapters.
- Separate commands and guardrails so each agent gets the right wrapper.
- Run the same test matrix everywhere, not just the happy path.
- Call out intentional differences so the model does not improvise.
Table of Contents
- Why cross-framework skills matter now
- What should stay portable, and what should not
- A file structure that survives framework drift
- How to detect the active agent cleanly
- Example: one skill, three adapters
- How to test Codex, Claude Code, and Cursor versions
- Common mistakes
- FAQ
Why cross-framework skills matter now
ASE already lists 2,069 published skills across 17 live categories and 10 frameworks, according to the live marketplace stats endpoint. That scale changes the job. Teams do not just need a skill that works once. They need a skill they can reuse across the agents already sitting inside their toolchain.
That is why cross-framework skills are becoming more valuable than single-agent prompt dumps. A platform team might use Claude Code for deeper repo work, Codex for rapid coding workflows, and Cursor for editor-native iteration. If the same deployment review, API validation, or content workflow has to be re-authored three times, maintenance cost rises fast. One workflow with three adapters is usually cheaper than three unrelated skills.
You can see the pattern in ASE’s catalog. Skills such as Firecrawl Web Data API for AI Agents and Review visual regression diffs and publish snapshot baselines in CI with reg-suit are useful partly because the underlying task is stable even when the agent shell changes.
What should stay portable, and what should not
The mistake is trying to make every line universal. That usually produces a vague skill that works nowhere particularly well. Instead, split the skill into two layers.
| Layer | Keep Shared | Split Per Framework |
|---|---|---|
| Workflow | Task goal, success criteria, output format, review checklist | Invocation wording when framework conventions differ |
| Tools | Conceptual tool choice, like “inspect repo, run tests, summarize risk” | Exact tool names, permissions model, session semantics |
| Files | References, examples, checklists, schemas | Framework-specific command snippets and hooks |
| Gotchas | Task-specific failure modes | Agent-specific behavior differences |
That is the core design principle: portable intent, local execution. The model should always know what outcome to deliver, but it should never have to hallucinate whether a given framework supports a hook, background session, or browser step in the same way.
A file structure that survives framework drift
Use progressive disclosure. Keep the top-level skill short, then branch into adapters only when needed. Anthropic recommends this pattern for skills in general, and it matters even more when one skill spans multiple agents. If you missed our earlier deep dive, read Progressive Disclosure: Why the Best Skills Split Content Across Multiple Files.
cross-framework-skill/
├── SKILL.md
├── config.json
├── references/
│   ├── workflow.md
│   ├── output-schema.md
│   └── test-cases.md
├── adapters/
│   ├── claude-code.md
│   ├── codex.md
│   └── cursor.md
├── scripts/
│   ├── verify.sh
│   └── collect-artifacts.py
└── examples/
    ├── sample-input.md
    └── sample-output.md
SKILL.md should explain the shared mission, when the skill should activate, what “done” looks like, and where to load framework-specific instructions. The adapters/ directory carries the framework glue. That keeps the core file readable and stops one framework’s quirks from polluting another’s instructions.
A practical rule is to keep the shared file under roughly 300 to 500 lines. If it keeps growing, you are probably stuffing adapter logic into the wrong place.
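That rule is easy to automate. Here is a minimal lint sketch, assuming the directory layout above; the file names, adapter list, and 500-line budget come from this article's running example, not from any official specification:

```python
from pathlib import Path

# Hypothetical lint for the layout described above. The names checked here
# (SKILL.md, adapters/*.md) and the line budget follow this article's
# example structure, not a required convention.
MAX_SHARED_LINES = 500
REQUIRED_ADAPTERS = ["claude-code.md", "codex.md", "cursor.md"]

def lint_skill(root: str) -> list[str]:
    """Return a list of problems found in a cross-framework skill directory."""
    root_path = Path(root)
    problems = []

    skill_md = root_path / "SKILL.md"
    if not skill_md.exists():
        problems.append("missing SKILL.md")
    else:
        line_count = len(skill_md.read_text().splitlines())
        if line_count > MAX_SHARED_LINES:
            problems.append(
                f"SKILL.md has {line_count} lines; adapter logic may be leaking in"
            )

    for adapter in REQUIRED_ADAPTERS:
        if not (root_path / "adapters" / adapter).exists():
            problems.append(f"missing adapters/{adapter}")

    return problems
```

Running a check like this in CI is cheap, and it catches the most common drift: a shared file that quietly absorbs framework-specific instructions.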
How to detect the active agent cleanly
You need explicit detection, not guesswork. In some environments you can check a known variable, runtime property, or directory convention. In others, you may need the user to declare the framework in config.json. Either is better than telling the model to infer it from vibes.
{
"default_framework": "auto",
"supported_frameworks": ["claude-code", "codex", "cursor"],
"adapter_paths": {
"claude-code": "adapters/claude-code.md",
"codex": "adapters/codex.md",
"cursor": "adapters/cursor.md"
},
"fallback_behavior": "load-shared-workflow-only"
}
Then document the branching rule inside the skill:
When this skill activates:
1. Identify the active framework from config or runtime markers.
2. Load references/workflow.md.
3. Load the matching adapter file.
4. If no supported framework is detected, continue with the shared workflow,
avoid framework-specific commands, and say what is unavailable.
This avoids a common failure mode: a model borrowing the wrong mental model from another agent. That is how you get a Codex-oriented command sequence inside a Cursor-oriented task, or Claude Code assumptions inside a more editor-bound environment.
Example: one skill, three adapters
Imagine you are building a reusable “web app verification” skill. The workflow is stable: inspect the app, launch a local server, run browser checks, collect artifacts, and summarize failures. The adapters differ.
# SKILL.md (shared)
description: >
Use when testing a local web app, reproducing UI bugs, collecting screenshots,
or validating a user flow before merge. Works across Claude Code, Codex,
and Cursor via framework-specific adapters.
## Core workflow
- Start from the requested user flow.
- Verify the app is reachable before testing.
- Capture evidence for every failure.
- End with a concise pass/fail summary plus next fixes.
## Load-on-demand files
- references/workflow.md
- references/output-schema.md
- adapters/<active-framework>.md
Your Claude Code adapter might lean on background processes and stronger tool orchestration. Your Codex adapter may prefer a tighter CLI-first path. Your Cursor adapter may need to assume more editor-centric iteration and fewer long-running orchestration steps. The workflow stays constant. The wrapper changes.
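To make that concrete, here is what one adapter file could look like. This excerpt is hypothetical; the headings and advice are illustrative, not an official adapter format:

```markdown
# adapters/cursor.md (hypothetical excerpt)

## Execution notes
- Prefer short, editor-driven iterations over long background sessions.
- Start the dev server where the user can see it; do not assume a managed
  background process survives between steps.
- Save evidence as files so results stay reviewable inside the editor.

## Known differences
- No long-running orchestration: re-verify the app is reachable before
  each check rather than once at the start.
```

Note that the adapter never restates the workflow. It only records how this environment executes it.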
For concrete examples in the ASE catalog, look at Verify local web apps with Playwright scripts and managed dev servers and Co-author structured docs with staged context gathering and reader testing. They solve different problems, but both make the hidden workflow explicit instead of relying on one vague paragraph.
How to test Codex, Claude Code, and Cursor versions
If you only test the shared prompt, you have not tested the skill. You have tested a brochure. Real validation means running the same task set across each adapter and comparing the outputs against a shared rubric.
- Create 5 to 10 fixed tasks. Include one easy case, one ambiguous case, one failure case, one large-context case, and one tool-error case.
- Define the same output schema for all frameworks, including status, evidence, recommended next step, and missing information.
- Track execution differences. Did one framework skip evidence capture, fail to ask a blocking question, or overrun scope?
- Add framework-specific gotchas only after you see recurring failures at least twice.
- Re-run the matrix every time you change adapters, scripts, or activation descriptions.
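The matrix above can be sketched as a small runner. run_task() stands in for however you invoke each agent, and the rubric fields mirror the shared output schema described in this article; both are assumptions of this sketch, not a fixed API:

```python
# Hypothetical test-matrix runner for the steps above. The rubric fields
# (status, evidence, recommended_next_step, missing_information) follow
# this article's shared output schema.
REQUIRED_FIELDS = [
    "status", "evidence", "recommended_next_step", "missing_information",
]

def score_run(output: dict) -> list[str]:
    """Check one adapter's output against the shared rubric."""
    failures = [f"missing field: {f}" for f in REQUIRED_FIELDS if f not in output]
    if output.get("status") == "fail" and not output.get("evidence"):
        failures.append("failure reported without evidence")
    return failures

def run_matrix(tasks, frameworks, run_task):
    """Run every fixed task against every adapter; collect rubric failures."""
    report = {}
    for framework in frameworks:
        for task in tasks:
            failures = score_run(run_task(framework, task))
            if failures:
                report.setdefault(framework, {})[task] = failures
    return report
```

Because every adapter is scored against the same rubric, the report shows framework-specific drift directly: one agent skipping evidence capture shows up as a named failure, not a vague impression.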
That last step matters. Framework drift is real. A working adapter in March can become noisy in May after a tool name changes, a permissions flow shifts, or an editor integration starts truncating context differently. This is also why a cross-framework skill needs version notes and not just one frozen SKILL.md.
Pro tip: keep one machine-readable review checklist in references/output-schema.md. That gives every framework the same target even when the path to completion differs.
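One possible shape for that checklist, sketched in JSON; the field names follow this article's running example and are not a required schema:

```json
{
  "schema_version": "1",
  "required_fields": {
    "status": ["pass", "fail", "blocked"],
    "evidence": "list of artifact paths; required whenever status is fail",
    "recommended_next_step": "one concrete action, not a vague suggestion",
    "missing_information": "questions the agent could not resolve"
  },
  "review_checklist": [
    "every failure has at least one piece of evidence",
    "output matches the shared schema exactly",
    "scope stayed within the requested user flow"
  ]
}
```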
Common mistakes
- Hiding differences. “Works everywhere” is not a strategy. Name the parts that differ.
- Hardcoding one framework’s vocabulary. If the shared file sounds like Claude Code everywhere, the Codex and Cursor adapters start behind.
- No fallback mode. Unsupported environments should degrade gracefully, not improvise unsupported steps.
- No activation boundaries. A portable skill still needs a sharp description, or it will trigger on the wrong tasks.
- Skipping real examples. Show one input and one output. It reduces ambiguity more than another paragraph of abstract guidance.
If you are just getting started with skill authoring, pair this with Build Your First Agent Skill in 15 Minutes and our comparison of OpenClaw vs. Claude Code vs. Codex. They give the context you need before you try to make one skill travel well across multiple agents.
Frequently Asked Questions
Can one SKILL.md really work across Codex, Claude Code, and Cursor?
Yes, if the shared file defines the workflow and the adapter files hold the framework-specific execution details. One monolithic file can work for simple tasks, but multi-agent skills are more reliable when the adapter layer is separated.
What is the best use case for cross-framework skills?
Repeatable workflows with stable success criteria, such as code review, browser verification, API validation, release checklists, and content QA. They are a worse fit for highly framework-native features that depend on one agent’s unique runtime behavior.
How do I know whether to split a skill into separate single-framework versions?
Split when more than about 30 to 40 percent of the instructions differ, when the tools are fundamentally different, or when one framework needs a different output shape. At that point, shared branding may be creating more confusion than reuse.
Conclusion
The best cross-framework skills do not pretend all agents are the same. They treat portability as an engineering problem: shared goals, isolated adapters, explicit detection, repeatable tests, and evidence-backed gotchas. That gives you one maintainable workflow instead of three drifting copies.
If you are publishing to ASE, start with one workflow that already works well in a single framework. Then extract the shared logic, add a thin adapter layer, and test it across the others. That path is slower than copying and pasting prompts, but it produces portable agent skills people can actually trust.
Browse the marketplace, inspect how existing skill pages describe tool fit, and use the official references from Anthropic’s skills docs, OpenAI documentation, and Cursor to keep your adapters grounded in current behavior.