


Reverse Engineering Agentic Workflows from Copilot Debug Logs


Fri, 24 Oct 2025

Here’s a secret weapon for building your own agentic workflows: GitHub Copilot Chat’s debug logs.

You know how everyone’s out there wrestling with hallucinating AI agents? Trying to figure out how to structure those prompts, which tools to call when, how to handle errors without pulling your hair out, what context to pass between steps…

The answer?

It’s sitting right there in your Copilot Chat debug view. Already solved. Already tested. Already proven to work for your specific use cases.

All you have to do is… analyze the logs.

The Reverse Engineering Approach

Here’s the brilliantly simple workflow:

  1. Set up a typical scenario in your codebase
  2. Use GitHub Copilot Chat to work through it (debug an issue, add a feature, refactor code)
  3. Export the .chatreplay.json debug logs from the Debug View
  4. Analyze the captured actions – which tools were called, in what order, with what context
  5. Recreate the workflow in LangGraph (or your framework of choice)
  6. Replay and refine the flow against scenarios that match your examples

You’re not guessing at agent architecture. You’re extracting proven patterns from a production system that’s already handling your code successfully.

Why This Actually Works

Copilot Chat is Already an Agent

People forget this, but GitHub Copilot Chat is a full-fledged agentic system. It’s already:

  - Interpreting your intent from a natural language request
  - Deciding which tools to call, and in what order
  - Reading files, searching the codebase, and running commands
  - Carrying context from one step to the next
  - Checking results and course-correcting when something fails

And it’s doing this for your specific codebase in your specific IDE with your specific problems.

(Yeah, I know—mind blown, right? It’s not just autocomplete; it’s basically a tiny AI workflow engine.)

The Debug Logs Are a Blueprint

When you open the Debug View in Copilot Chat, you’re seeing:

  - Every request sent to the model, including the full tool catalog it was offered
  - Every tool call, with its arguments and the response that came back
  - The order those calls happened in, and what each one fed into the next

This isn’t documentation. This is the actual execution trace of a working agentic system.

Your Use Cases, Your Standards

Here’s the kicker: when you solve problems with Copilot Chat, those solutions are already tuned to your standards.

The code it suggests? It’s based on your existing codebase patterns.

The tools it uses? They’re the ones relevant to your tech stack.

The workflow it follows? It’s optimized for the types of problems you actually encounter.

So when you reverse engineer that workflow, you’re not building a generic agent.

You’re building an agent that works exactly like the successful solutions you’ve already validated.

The Process in Detail

Let me break down how this actually works in practice.

Step 1: Set Up Your Scenario

Pick a real, representative task. Something you do regularly:

  - Debugging a failing test
  - Adding a feature to an existing module
  - Refactoring old code to match your current patterns

Make it specific. Make it realistic. This is your training example.

Step 2: Work Through It With Copilot

Open GitHub Copilot Chat and solve the problem. But here’s the important part: turn on the Debug View.

Let Copilot do its thing. Watch it:

  - Read the relevant files
  - Search for related patterns
  - Reason about what it found
  - Apply changes and check the results

Don’t interrupt the process. Let it complete the full workflow.

Step 3: Export the Debug Logs

Once the task is complete (and only in the VS Code client), you can export a .chatreplay.json file. Inside that file is a logs array that interleaves requests, tool calls, and model responses. A single tool invocation looks more like this:

{
  "id": "toolu_02GSD097655DDF",
  "kind": "toolCall",
  "tool": "read_file",
  "args": "{\"filePath\": \"d:/blog/content/posts/rick-roll.md\", \"startLine\": 1, \"endLine\": 30}",
  "response": [
    "File: `d:/blog/content/posts/rick-roll.md`. Lines 1 to 30 ..."
  ]
}

You’ll also see request entries that repeat the full tool catalog and metadata for the underlying model call. There is no friendly decision field waiting for you—you’re exporting raw telemetry that you must interpret yourself.
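
Here’s a minimal sketch of pulling those tool calls out of an export in Python. The kind, tool, and args keys match the entry shown above; the file path and variable names are just examples, and you should verify the rest of the schema against your own export:

import json

# Load the exported replay (path is an example; use your own export)
with open("session.chatreplay.json") as f:
    replay = json.load(f)

# Keep only the tool invocations, in execution order
tool_calls = [entry for entry in replay["logs"] if entry.get("kind") == "toolCall"]

for call in tool_calls:
    # args is a JSON string inside the JSON, so it needs a second decode
    args = json.loads(call["args"])
    print(call["tool"], args)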

Step 4: Analyze the Patterns

Now walk the logs array in order and look for the patterns. You’re reconstructing the flow manually—matching each toolCall to the prompt that triggered it and parsing the stringified args payload so you can see the real parameters Copilot passed.

What tools were used? File reading? Semantic search? Running tests? Grepping for patterns?

In what order? Did it search first, then read? Or read first, then search for related code?

What triggered each decision? What in the output of one tool led to calling the next?

How was context managed? What information from step 1 was still relevant in step 5?

Where did it branch? Were there conditional paths based on what was found?
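
One way to make the ordering patterns visible is to count tool-to-tool transitions across your exported sessions. A rough sketch, reusing the tool_calls list from the snippet above:

from collections import Counter

# Tool names in the order they were invoked
sequence = [call["tool"] for call in tool_calls]

# Count which tool tends to follow which
transitions = Counter(zip(sequence, sequence[1:]))

for (prev, nxt), count in transitions.most_common(5):
    print(f"{prev} -> {nxt}: {count}x")

High-frequency pairs are your candidate edges when you rebuild the graph.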

Step 5: Recreate in LangGraph

Now you’ve got everything you need to build your own agent. Copy the sequence of tool calls, but remember that each toolCall["args"] value in the export is a JSON string—you’ll want to pipe it through json.loads (or your parser of choice) before you can feed the parameters into your own tooling.

from typing import TypedDict

from langgraph.graph import StateGraph, END

# Define your agent state
class AgentState(TypedDict):
    task: str
    file_contents: dict
    search_results: list
    changes_needed: list
    validation_passed: bool

# Define your nodes (based on Copilot's tool calls)
def read_relevant_files(state):
    # Your implementation
    return state

def search_for_patterns(state):
    # Your implementation
    return state

def analyze_changes_needed(state):
    # Your implementation
    return state

def apply_changes(state):
    # Your implementation
    return state

def validate_changes(state):
    # Your implementation
    return state

# Build the graph (based on the flow you observed in the logs)
workflow = StateGraph(AgentState)

workflow.add_node("read_files", read_relevant_files)
workflow.add_node("search", search_for_patterns)
workflow.add_node("analyze", analyze_changes_needed)
workflow.add_node("apply", apply_changes)
workflow.add_node("validate", validate_changes)

# Add edges (based on the flow you observed in the logs)
workflow.set_entry_point("read_files")
workflow.add_edge("read_files", "search")
workflow.add_edge("search", "analyze")
workflow.add_conditional_edges(
    "analyze",
    lambda state: "apply" if state["changes_needed"] else END
)
workflow.add_edge("apply", "validate")
workflow.add_edge("validate", END)

agent = workflow.compile()
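
Invoking the compiled graph looks something like this. The initial state keys mirror AgentState above; the task string is just an example:

result = agent.invoke({
    "task": "Refactor the logging module to use structured logging",
    "file_contents": {},
    "search_results": [],
    "changes_needed": [],
    "validation_passed": False,
})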

You’ve just recreated Copilot’s workflow for your specific use case.

(And yeah, I know—LangGraph might feel like overkill at first. But trust me, once you see it work, you’ll be hooked.)

Why This Approach Works Reliably

Here’s why this approach tends to land much closer to success than starting cold:

You’re Not Inventing, You’re Copying

You’re not guessing at what tools to use or when. You’re literally copying a workflow that already succeeded.

If Copilot solved your problem by reading file A, searching for pattern B, then modifying file C - that’s a proven path. Replicate it.

(It’s like having a recipe from a chef who actually knows how to cook, instead of winging it with whatever’s in your fridge.)

You Control the Scope

You’re not trying to build an agent that solves everything. You’re building an agent that solves this specific type of problem the way you already solved it successfully.

Start with one scenario. Master it. Then add more scenarios, each reverse engineered from Copilot’s successful solutions.

You Have Reference Implementations

When your agent doesn’t work quite right, you have the debug logs to compare against.

“Copilot called semantic_search first, but I’m calling read_file. That’s why my context is different.”

It’s like having the answer key while you’re taking the test.

(And let’s be honest—who doesn’t love having the answer key?)
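
That comparison is easy to automate. A small sketch, assuming you’ve already extracted the tool-name sequences from the replay log and from your own agent’s trace:

def first_divergence(reference: list[str], observed: list[str]) -> int | None:
    """Report where your agent's tool sequence first departs from Copilot's."""
    for i, (ref, obs) in enumerate(zip(reference, observed)):
        if ref != obs:
            print(f"Step {i}: Copilot called {ref!r}, your agent called {obs!r}")
            return i
    if len(reference) != len(observed):
        shorter = min(len(reference), len(observed))
        print(f"One sequence ends early at step {shorter}")
        return shorter
    return None

# e.g. first_divergence(["semantic_search", "read_file"], ["read_file", "semantic_search"])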

Scaling the Approach

Once you’ve got one workflow working, the pattern becomes clear:

  1. Identify common tasks you solve with Copilot
  2. Capture debug logs for each type
  3. Extract the patterns - many will share similar structures
  4. Build a library of workflows for different scenarios
  5. Compose them together for complex tasks

Before long, you’ve got a custom agentic system that handles your specific development tasks, built entirely by reverse engineering proven solutions.
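
In code, that library can be as simple as a registry mapping task types to compiled graphs, with a thin dispatcher on top. A sketch with hypothetical names:

# Hypothetical registry: task type -> graph compiled from that task's logs
WORKFLOWS: dict = {}

def register(task_type: str, compiled_graph) -> None:
    WORKFLOWS[task_type] = compiled_graph

def run_task(task_type: str, initial_state: dict):
    """Dispatch a task to the workflow reverse engineered for its type."""
    return WORKFLOWS[task_type].invoke(initial_state)

# e.g. register("refactor", agent) with the graph compiled earlier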

The Meta-Learning Opportunity

But here’s where it gets really interesting.

After you’ve reverse engineered 10, 20, 50 different scenarios, you start to see the meta-patterns:

  - Gather context before acting, not after
  - Search broad first, then read narrow
  - Validate after every change, not just at the end
  - Branch on what the tools actually returned, not on what you assumed going in

These aren’t implementation details. These are design principles for agentic workflows.

You’re not just copying individual solutions. You’re learning how to structure agent decisions by observing a production system.

Why This Beats Starting From Scratch

Compare this to the alternative:

Starting from scratch:

  - Guess at which tools your agent needs
  - Guess at the order to call them in
  - Guess at how to handle context and errors
  - Debug failures with nothing to compare against

Reverse engineering from Copilot:

  - Copy a tool sequence that already succeeded on your code
  - Copy the context handling that made it work
  - Keep the debug logs as an answer key when something breaks

One of these paths is way shorter than the other.

Limitations and Caveats

A few things to keep in mind before you lean on this too hard:

  - The export is raw telemetry, not a curated trace; you reconstruct the decision flow yourself
  - Exporting .chatreplay.json only works in the VS Code client
  - The args payloads are stringified JSON and need a second parse
  - A workflow extracted from one scenario won’t automatically cover edge cases it never saw

The Practical Reality

Will your reverse engineered agent work perfectly for every edge case? No.

Will it work reliably for the scenarios you extracted it from? More often than not—especially once you account for the limitations above.

And that’s the point.

You’re not trying to build AGI. You’re trying to automate your specific workflows in your specific codebase to your specific standards.

GitHub Copilot Chat already solved that problem. The debug logs are sitting right there, showing you exactly how.

All you have to do is look.

Getting Started Today

Want to try this? Here’s your action plan:

  1. Pick one repetitive task you do weekly
  2. Solve it with Copilot Chat with debug view enabled
  3. Export the logs and study the tool calls
  4. Map out the decision flow on paper
  5. Implement the simplest version in LangGraph
  6. Test it on the same scenario - it should work
  7. Try it on a similar scenario - adjust as needed
  8. Repeat for more task types

Six months from now, you’ll have a custom agentic development assistant built entirely from reverse engineered patterns that you know work because you’ve already watched them succeed.

You don’t need to invent agentic workflows. You just need to study the ones already working in your IDE.

The debug logs are the blueprint. LangGraph is the construction tool. Your custom agent is the result.

Welcome to agentic development on easy mode.

(And hey, if it doesn’t work perfectly? That’s okay. At least you’re starting from a place of proven success, not blind guessing. Progress, right?)

-Rob