Turning Static AI Agent Skills into Dynamic, Testable, Iterable Workflows

This article explains how to extend a static AI Agent Skill defined in SKILL.md with a dynamic Workflows layer and how to add trajectory evaluation to verify that execution follows the intended path. It provides concrete JavaScript examples, a directory structure, anti-patterns, and a step-by-step validation checklist for making Skills runnable, testable, and iteratively improvable.

Frontend AI Walk

Problem

A static SKILL.md describes when and how to run a Skill, but at execution time the Agent still has to infer each step. This leads to skipped steps, goal drift, and limited parallelism.

Solution Overview

Introduce a dynamic execution layer with JavaScript Workflows stored under workflows/. The three‑layer model becomes:

L0 Standard layer (01): SKILL.md + references/
L1 Orchestration layer (05): SKILL.md + sub‑Skill contracts
L2 Dynamic layer (07): SKILL.md + workflows/ + trajectory Eval

Static knowledge stays in SKILL.md, orchestration logic lives in Workflows, and a trajectory Eval validates the execution process.

Skill Directory Extension

After running

plugins/frontend-team-toolkit/skill-engineering/bin/new-skill.sh wechat-review-pipeline

the following layout is created:

plugins/frontend-team-toolkit/skills/wechat-review-pipeline/
├── SKILL.md                # static knowledge, triggers, contract
├── .skill-meta.json       # enables workflows + trajectory_evals path
├── references/
│   └── output-contract.md
├── evals/
│   ├── evals.json               # output Eval definitions
│   └── trajectory-evals.json    # process Eval definitions
├── workflows/                  # dynamic orchestration scripts
│   ├── README.md
│   ├── parallel-review.js
│   ├── conditional-route.js
│   └── weekly-regression.js
└── scripts/validate-output.sh

The scaffolding script copies the workflows/ template and the trajectory‑Eval JSON automatically.
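The layout above can be checked mechanically before any workflow runs. The following is a minimal sketch of such a check, not the behavior of new-skill.sh or validate-skill.py: `findMissingEntries` is a hypothetical helper, the required entries are copied from the tree above, and the `exists` parameter is an assumption added so the check can be exercised without touching disk.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Entries taken from the directory tree above.
const REQUIRED_ENTRIES = [
  "SKILL.md",
  ".skill-meta.json",
  "references/output-contract.md",
  "evals/evals.json",
  "evals/trajectory-evals.json",
  "workflows/README.md",
  "scripts/validate-output.sh",
];

// Returns the required entries that are missing under skillDir.
// `exists` is injectable (hypothetical parameter) so tests can fake the file system.
function findMissingEntries(skillDir, exists = fs.existsSync) {
  return REQUIRED_ENTRIES.filter(
    (entry) => !exists(path.join(skillDir, entry))
  );
}
```

An empty array means every expected entry is present; anything else names what the scaffold still lacks.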

Declaring Workflows in SKILL.md

## Dynamic Workflows

This Skill includes the following dynamic orchestration scripts:

### workflows/parallel-review.js
- **Purpose**: Parallel review + image generation
- **Trigger**: User says “parallel review” or “review and generate image”
- **Input**: Path to article file
- **Output**: Aggregated review results

### workflows/conditional-route.js
- **Purpose**: Route based on article status
- **Trigger**: User says “review” and provides status
- **Input**: Article path + status (draft/final/published)
- **Output**: Review result for the selected flow

### workflows/weekly-regression.js
- **Purpose**: Periodic regression of the Skill
- **Trigger**: User says “periodic regression” or `/loop weekly`
- **Input**: Optional Skill name
- **Output**: Regression report

Execution:
- Claude automatically selects the appropriate workflow
- Users can explicitly specify: “use parallel-review workflow for review”

Workflows Script Patterns

Basic Structure

// workflows/example.js
async function mainWorkflow(args) {
  // Phase 1: prepare input
  const input = validateInput(args);

  // Phase 2: execute orchestration logic
  const result = await orchestrate(input);

  // Phase 3: return formatted output
  return formatOutput(result);
}

module.exports = mainWorkflow(args);

Key functions:

runAgent() – spawns a sub-agent

Promise.all() – runs agents in parallel

await – serial waiting

args – user-provided parameters
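The basic structure calls validateInput, orchestrate, and formatOutput without defining them. One way they might look is sketched below. All three bodies are hypothetical, the `articlePath` field is an assumption about what args contains, and `runAgent` is passed in as a parameter (also an assumption) so that the three phases can be run outside the Agent runtime.

```javascript
// Hypothetical helper bodies for the three phases; none of these come from the runtime.
function validateInput(args) {
  if (!args || typeof args.articlePath !== "string" || args.articlePath.trim() === "") {
    throw new TypeError("args.articlePath must be a non-empty string");
  }
  return { articlePath: args.articlePath.trim() };
}

// `runAgent` is injected here (an assumption) so the flow can run without the Agent runtime.
async function orchestrate(input, runAgent) {
  return runAgent({
    name: "article-reviewer",
    prompt: `Read ${input.articlePath}, output a five-dimensional score report`,
    tools: ["Read"],
  });
}

function formatOutput(result) {
  return { status: "ok", result };
}

async function mainWorkflow(args, runAgent) {
  const input = validateInput(args);                 // Phase 1: prepare input
  const result = await orchestrate(input, runAgent); // Phase 2: orchestration
  return formatOutput(result);                       // Phase 3: formatted output
}
```

Failing fast in Phase 1 keeps a malformed request from spawning any sub-agent at all.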

Serial Orchestration Example

// workflows/serial-review.js
async function serialReview(articlePath) {
  // Phase 1: review article
  const reviewResult = await runAgent({
    name: "article-reviewer",
    prompt: `Read ${articlePath}, output a five‑dimensional score report`,
    tools: ["Read"],
    model: "sonnet",
    worktree: false // shared session
  });

  // Phase 2: check image compliance (starts only after Phase 1 resolves)
  const imageResult = await runAgent({
    name: "image-reviewer",
    prompt: `Read ${articlePath}, output an image compliance checklist`,
    tools: ["Read"],
    model: "haiku",
    worktree: true // isolated execution
  });

  // Phase 3: aggregate results
  return {
    review: reviewResult,
    images: imageResult,
    summary: `Review score: ${reviewResult.score}, Image compliance: ${imageResult.status}`
  };
}

module.exports = serialReview(args.articlePath);

Key points: await guarantees order, so Phase 2 starts only after Phase 1 resolves; worktree: true isolates the second agent.
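The ordering guarantee can be observed locally with a recording stand-in for runAgent. This is a sketch under stated assumptions: `createRecordingRunAgent` is a hypothetical test double, the real runAgent return shape is not specified in this article, and the serial function below takes `runAgent` as a parameter instead of relying on a runtime global.

```javascript
// Hypothetical test double: records when each agent starts and finishes.
function createRecordingRunAgent(delaysMs = {}) {
  const trace = [];
  async function runAgent(config) {
    trace.push({ agent: config.name, event: "start" });
    // Simulate work; the delay per agent is configurable and defaults to 0 ms.
    await new Promise((resolve) => setTimeout(resolve, delaysMs[config.name] ?? 0));
    trace.push({ agent: config.name, event: "end" });
    return { agent: config.name };
  }
  return { runAgent, trace };
}

// Same shape as the serial example above, with runAgent injected (an assumption).
async function serialReview(articlePath, runAgent) {
  const review = await runAgent({ name: "article-reviewer", prompt: `Read ${articlePath}` });
  const images = await runAgent({ name: "image-reviewer", prompt: `Read ${articlePath}` });
  return { review, images };
}
```

Even when the first agent is made slower than the second, the recorded trace shows article-reviewer finishing before image-reviewer starts, because each call is awaited before the next one is issued.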

Parallel Orchestration Example

// workflows/parallel-review.js
async function parallelReview(articlePath) {
  const [reviewResult, imageResult] = await Promise.all([
    runAgent({
      name: "article-reviewer",
      prompt: `Read ${articlePath}, output a five‑dimensional score report`,
      tools: ["Read"],
      model: "sonnet",
      worktree: true // isolated to avoid conflicts
    }),
    runAgent({
      name: "image-reviewer",
      prompt: `Read ${articlePath}, output an image compliance checklist`,
      tools: ["Read"],
      model: "haiku",
      worktree: true
    })
  ]);

  // Barrier: wait for both agents then synthesize
  return synthesizeResults([reviewResult, imageResult]);
}

function synthesizeResults(results) {
  return {
    reviewScore: results[0].score,
    imageStatus: results[1].status,
    combinedStatus: results[0].score >= 9.0 && results[1].status === "pass" ? "PASS" : "FAIL"
  };
}

module.exports = parallelReview(args.articlePath);

Key points: Promise.all implements parallelism; worktree:true prevents resource contention; final synthesis acts as a barrier.
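One property of Promise.all worth keeping in mind is that it rejects as soon as any input promise rejects, so a single failing agent discards the other agent's result. Where partial results matter, Promise.allSettled is an alternative. The variant below is not from the original article; it is a sketch that assumes the same `score` and `status` result fields as the example above and takes `runAgent` as a parameter so it can run without the Agent runtime.

```javascript
// Variant of the parallel review that tolerates one failing agent.
// `runAgent` is passed in (an assumption) so the sketch runs outside the Agent runtime.
async function parallelReviewSettled(articlePath, runAgent) {
  const [review, image] = await Promise.allSettled([
    runAgent({ name: "article-reviewer", prompt: `Read ${articlePath}` }),
    runAgent({ name: "image-reviewer", prompt: `Read ${articlePath}` }),
  ]);

  // Collect a readable message for every rejected branch.
  const errors = [review, image]
    .filter((r) => r.status === "rejected")
    .map((r) => String(r.reason && r.reason.message ? r.reason.message : r.reason));

  const reviewScore = review.status === "fulfilled" ? review.value.score : null;
  const imageStatus = image.status === "fulfilled" ? image.value.status : null;

  return {
    reviewScore,
    imageStatus,
    errors,
    // Same threshold as synthesizeResults above; any failed branch forces FAIL.
    combinedStatus:
      errors.length === 0 && reviewScore >= 9.0 && imageStatus === "pass" ? "PASS" : "FAIL",
  };
}
```

The barrier behavior is unchanged: synthesis still happens only after both branches have settled.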

Conditional Routing Example

// workflows/conditional-route.js
async function conditionalRoute(articlePath, articleStatus) {
  let agentConfig;
  if (articleStatus === "draft") {
    agentConfig = {name: "article-reviewer", model: "sonnet"};
  } else if (articleStatus === "final") {
    agentConfig = {name: "compliance-checker", model: "sonnet"};
  } else if (articleStatus === "published") {
    agentConfig = {name: "update-optimizer", model: "haiku"};
  } else {
    // default route
    agentConfig = {name: "article-reviewer", model: "sonnet"};
  }

  const result = await runAgent({
    ...agentConfig,
    prompt: `Read ${articlePath}, perform the appropriate review`,
    tools: ["Read"],
    worktree: false
  });
  return result;
}

module.exports = conditionalRoute(args.articlePath, args.status);

Key points: plain if/else implements routing; classification happens first, then the selected agent runs.
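The same routing can be expressed as a lookup table, which keeps the classification step a pure function that is easy to test in isolation. This is a sketch of an equivalent formulation, not the article's implementation; `ROUTES`, `DEFAULT_ROUTE`, and `selectRoute` are hypothetical names, while the agent names and models are taken from the if/else chain above.

```javascript
// Route table mirroring the if/else chain above.
const ROUTES = new Map([
  ["draft", { name: "article-reviewer", model: "sonnet" }],
  ["final", { name: "compliance-checker", model: "sonnet" }],
  ["published", { name: "update-optimizer", model: "haiku" }],
]);

// Unknown or missing statuses fall back to the draft route, as in the original else branch.
const DEFAULT_ROUTE = ROUTES.get("draft");

// Pure classification step: returns the agent config for a given status.
function selectRoute(articleStatus) {
  return ROUTES.get(articleStatus) ?? DEFAULT_ROUTE;
}
```

The workflow body then reduces to spreading `selectRoute(status)` into the runAgent config, and adding a new status becomes a one-line change to the table.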

Loop‑Until‑Done Example

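// workflows/loop-regression.js
// Promise-based delay; defined here because sleep() is not a JavaScript built-in.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function loopRegression(skill, maxIterations = 10) {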
  let iteration = 0;
  let allPassed = false;
  while (iteration < maxIterations && !allPassed) {
    const results = await runAgent({
      name: "eval-runner",
      prompt: `Run full Eval for ${skill}`,
      tools: ["Read", "Bash"],
      model: "sonnet"
    });
    allPassed = results.every(r => r.pass === true);
    if (!allPassed) {
      console.log(`Iteration ${iteration}: ${results.filter(r => !r.pass).length} failures`);
    }
    iteration++;
    if (!allPassed && iteration < maxIterations) {
      await sleep(60000); // wait 1 minute
    }
  }
  return {iterations: iteration, finalStatus: allPassed ? "ALL_PASS" : "STILL_HAS_FAILURES"};
}

module.exports = loopRegression(args.skill || "wechat-article-review");

Key points: while loop with a maximum‑iteration guard; after each run it checks pass flags; optional sleep between attempts.
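The loop logic can be tested without waiting a real minute if the Eval runner and the delay are injected. The sketch below assumes, as the example above does, that a run returns an array of objects with a `pass` flag. `loopUntilPass`, `runEval`, and `delayMs` are hypothetical names. One deliberate difference from the example is labeled in the code: an empty result array is treated as a failure, because `[].every(...)` is true and would otherwise report ALL_PASS for a run that evaluated nothing.

```javascript
// Loop-until-done with injected dependencies (runEval and sleep are assumptions
// added for testability; the example above relies on runtime globals instead).
async function loopUntilPass({ runEval, sleep, maxIterations = 10, delayMs = 60000 }) {
  let iteration = 0;
  let allPassed = false;
  while (iteration < maxIterations && !allPassed) {
    const results = await runEval();
    // Deviation from the example above: an empty result set does not count as a pass.
    allPassed = results.length > 0 && results.every((r) => r.pass === true);
    iteration++;
    if (!allPassed && iteration < maxIterations) {
      await sleep(delayMs); // skipped after the final attempt
    }
  }
  return { iterations: iteration, finalStatus: allPassed ? "ALL_PASS" : "STILL_HAS_FAILURES" };
}
```

In a test, `sleep` can be a function that only records its argument, so the maximum-iteration guard and the delay schedule are both observable in milliseconds.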

Trajectory Eval – Verifying Workflows Execution

Trajectory Eval checks that every expected runAgent call occurs, that the calls respect the intended order, and that all parallel branches are spawned.

Serial Order Verification Contract

{
  "id": "workflow-serial-001",
  "name": "serial-workflow-order-check",
  "type": "regression",
  "prompt": "Use workflows/serial-review.js to review articles/demo.md",
  "expected": [
    "must runAgent article-reviewer first",
    "must runAgent image-reviewer second",
    "must output aggregated result",
    "agents must be serial (await)"
  ],
  "must_not": [
    "must not skip article-reviewer",
    "must not run agents in parallel"
  ],
  "grader": "trajectory",
  "risk": "high",
  "source": "workflow_contract"
}
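How a grader might turn the serial contract into a mechanical check is sketched below. The trace format is an assumption, since the article does not specify it: an array of `{ agent, event }` records in time order, where `event` is "start" or "end". `checkSerialOrder` is a hypothetical function name.

```javascript
// Assumed trace shape: [{ agent: "name", event: "start" | "end" }, ...] in time order.
function checkSerialOrder(trace, expectedAgents) {
  const failures = [];

  // The agents must start in exactly the expected order, with none skipped.
  const started = trace.filter((e) => e.event === "start").map((e) => e.agent);
  if (JSON.stringify(started) !== JSON.stringify(expectedAgents)) {
    failures.push(
      `expected start order ${expectedAgents.join(" -> ")}, got ${started.join(" -> ")}`
    );
  }

  // Serial means at most one agent is running at any point in the trace.
  let running = 0;
  for (const e of trace) {
    if (e.event === "start") running++;
    else if (e.event === "end") running--;
    if (running > 1) {
      failures.push(`${e.agent} started while another agent was still running`);
    }
  }
  return { pass: failures.length === 0, failures };
}
```

The first check covers "must not skip article-reviewer"; the second covers "must not run agents in parallel".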

Parallel Completeness Contract

{
  "id": "workflow-parallel-001",
  "name": "parallel-workflow-both-agents",
  "type": "regression",
  "prompt": "Use workflows/parallel-review.js to review articles/demo.md",
  "expected": [
    "must spawn article-reviewer agent",
    "must spawn image-reviewer agent",
    "must use Promise.all for parallelism",
    "must synthesize both results"
  ],
  "must_not": [
    "must not spawn only one agent",
    "must not execute serially"
  ],
  "grader": "trajectory",
  "risk": "high",
  "source": "workflow_contract"
}
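A matching check for the parallel contract is sketched below, under the same kind of assumption about the trace: an array of `{ agent, event }` records in time order, with "start" recorded at the moment runAgent is called. With Promise.all, every runAgent call is issued before any result is awaited, so every expected agent should start before the first "end" appears. `checkParallelSpawn` is a hypothetical name.

```javascript
// Assumed trace shape: [{ agent: "name", event: "start" | "end" }, ...] in time order.
function checkParallelSpawn(trace, expectedAgents) {
  const failures = [];

  // Index of the first finished agent; every expected start must come before it.
  const firstEnd = trace.findIndex((e) => e.event === "end");
  const cutoff = firstEnd === -1 ? trace.length : firstEnd;

  for (const agent of expectedAgents) {
    const startIdx = trace.findIndex((e) => e.agent === agent && e.event === "start");
    if (startIdx === -1) {
      failures.push(`${agent} was never spawned`);
    } else if (startIdx > cutoff) {
      failures.push(`${agent} started after another agent finished (serial execution)`);
    }
  }
  return { pass: failures.length === 0, failures };
}
```

This observes the effect of parallel spawning in the trace; it cannot prove that the script literally used Promise.all, which would need inspection of the source.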

Conditional Routing Contract

{
  "id": "workflow-conditional-001",
  "name": "conditional-workflow-route-check",
  "type": "regression",
  "prompt": "Use workflows/conditional-route.js on articles/demo.md with status=draft",
  "expected": [
    "must route to article-reviewer"
  ],
  "must_not": [
    "must not route to compliance-checker or update-optimizer",
    "must not skip classification"
  ],
  "grader": "trajectory",
  "risk": "medium",
  "source": "workflow_contract"
}
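For the routing contract, a grader only needs to know which agents were spawned. The sketch below assumes a trace of `{ agent, event }` records, which the article does not specify, and uses a hypothetical function name, `checkRoute`. It can confirm that the expected agent ran and that no other agent did; whether classification was skipped is not visible from spawn events alone and would need another signal.

```javascript
// Assumed trace shape: [{ agent: "name", event: "start" | "end" }, ...].
// Only "start" events matter for routing.
function checkRoute(trace, expectedAgent) {
  const spawned = [
    ...new Set(trace.filter((e) => e.event === "start").map((e) => e.agent)),
  ];
  const failures = [];
  if (!spawned.includes(expectedAgent)) {
    failures.push(`${expectedAgent} was not spawned`);
  }
  for (const agent of spawned) {
    if (agent !== expectedAgent) {
      failures.push(`unexpected agent ${agent} was spawned`);
    }
  }
  return { pass: failures.length === 0, failures };
}
```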

Data sources for trajectory Eval:

Agent trace – Claude Code conversation log recording each runAgent call

Workflows output – result structure returned by the script

Skill usage observation – self‑reported execution record

Anti‑Patterns and Correct Practices

Workflows replace SKILL.md – leads to missing trigger knowledge and contracts. Correct: keep SKILL.md for triggers, contracts, and static description.

SKILL.md contains execution code – code embedded in a knowledge document is interpreted by the Agent rather than executed, so its behavior is non-deterministic. Correct: SKILL.md describes steps; Workflows implement them.

No trajectory Eval – cannot verify that Workflows really ran. Correct: pair output Eval with trajectory Eval.

Workflows not saved to the Skill – makes them unreusable. Correct: store scripts under workflows/.

Practical Integration Checklist

Create a Skill package with new-skill.sh – scaffolds workflows/ and trajectory Eval JSON.

Write SKILL.md with static knowledge, trigger conditions, contracts, and a “Dynamic Workflows” section that lists the scripts.

Implement the Workflows scripts (serial, parallel, conditional, loop) under workflows/.

Add a trajectory Eval entry for each script in evals/trajectory-evals.json.

Validate the package structure:

python3 plugins/frontend-team-toolkit/skill-engineering/bin/validate-skill.py plugins/frontend-team-toolkit/skills/wechat-review-pipeline

Run local CI simulation (merges output and trajectory Eval):

python3 plugins/frontend-team-toolkit/skill-engineering/scripts/run_evals.py \
  --mode pr --skill wechat-review-pipeline \
  --skill-base-path plugins/frontend-team-toolkit/skills

Execute the workflow with Claude Code runtime and inspect the trace:

claude -p "Use parallel-review workflow to review articles/demo.md"
claude --show-trace

When SKILL_EXECUTION_MODE=local (default), the trajectory Eval generates a simulated agent_trace. For real verification set SKILL_EXECUTION_MODE=claude_code and use claude --show-trace.
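The checklist step that adds a trajectory Eval entry for each script can be backed by a small coverage check. The sketch below rests on two assumptions that the article does not confirm: that evals/trajectory-evals.json parses to an array of entries shaped like the contracts shown earlier, and that each entry's prompt names the workflow file it exercises. `findUncoveredWorkflows` is a hypothetical helper, not part of validate-skill.py or run_evals.py.

```javascript
// Returns the workflow scripts that no trajectory Eval prompt mentions.
// workflowFiles: e.g. ["workflows/parallel-review.js", ...]
// trajectoryEvals: parsed entries, assumed to carry `grader` and `prompt` fields.
function findUncoveredWorkflows(workflowFiles, trajectoryEvals) {
  return workflowFiles.filter(
    (file) =>
      !trajectoryEvals.some(
        (entry) =>
          entry.grader === "trajectory" &&
          typeof entry.prompt === "string" &&
          entry.prompt.includes(file)
      )
  );
}
```

A non-empty result lists the scripts that would ship without any process-level verification.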

Division of Responsibilities

SKILL.md – defines triggers, input/output contracts, and static workflow description.

Workflows – implements runtime orchestration, runAgent calls, parallel/serial decisions, conditional routing, and loop termination.

trajectory Eval – validates the execution process (order, completeness, routing, loop count). Output Eval validates the final result.
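The split between output Eval and trajectory Eval suggests a combined verdict that passes only when both kinds pass. A minimal sketch follows, assuming each Eval run yields an array of `{ id, pass }` objects; that shape and the name `combineEvalResults` are hypothetical. Requiring at least one result of each kind reflects the pairing recommended above, so a Skill with no trajectory Eval cannot pass by omission.

```javascript
// Combines the two Eval kinds into one verdict.
// Assumed input shape for both arguments: [{ id: string, pass: boolean }, ...].
function combineEvalResults(outputResults, trajectoryResults) {
  const failedIds = (results) => results.filter((r) => !r.pass).map((r) => r.id);
  const outputFailures = failedIds(outputResults);
  const trajectoryFailures = failedIds(trajectoryResults);
  return {
    pass:
      outputResults.length > 0 &&
      trajectoryResults.length > 0 &&
      outputFailures.length === 0 &&
      trajectoryFailures.length === 0,
    outputFailures,
    trajectoryFailures,
  };
}
```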

FAQ

Must Workflows be saved to the Skill? Recommended for reuse and distribution.

How do SKILL.md and Workflows divide work? SKILL.md provides knowledge, triggers, and contracts; Workflows provide the executable logic.

How many Eval files does a Workflow need? At least one trajectory Eval per Workflow, plus the usual output Eval.

How to verify parallel Workflows? Trajectory Eval checks that all agents are spawned and that they were not executed serially.

What is the relationship between Workflows and CI? CI runs Workflows; Workflows can also trigger CI jobs.

Can Workflows call other Workflows? Yes, via nested runAgent calls.

Key Takeaways

Static knowledge lives in SKILL.md; dynamic orchestration lives in workflows/; process verification uses trajectory Eval.

Use a dual‑Eval strategy: output Eval for results, trajectory Eval for execution correctness.

Adding workflows/ on top of the base blueprint creates the L2 dynamic layer, turning static Skills into runnable programs.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
