2026

Building an autonomous ML researcher with Claude Code dynamic workflows

As an experiment, I re-implemented the autonomous ML research-and-engineering workflow encoded in Hugging Face's ml-intern as a Claude Code dynamic workflow that delegates execution to the Hugging Face skills (hf-skills) instead of ml-intern's custom tools [1]. I did it in three steps: extract a technology-neutral specification of the workflow, compile that specification into a single generic workflow script, then run the script against a concrete task. The result is one workflow that accepts any ML research task as an argument, rather than having Claude Code write a new workflow script for each task.

Agentic editing of terminal screencasts

asciinema is naturally suited to agentic screencast editing. A .cast recording is plain text (JSON Lines), one event per line of the form [interval, code, data], where interval is seconds since the previous event. Editing reduces to arithmetic on those intervals (and optionally to substitution on the payloads, e.g. for redaction), so a small tool can expose trimming, speeding, and cutting as cheap operations that a language model can reason about and combine.
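To make the "arithmetic on intervals" point concrete, here is a minimal Python sketch. It is not the cast-edit tool from the post; the function names (`parse_cast`, `speed_up`, `cap_idle`, `redact`, `dump_cast`) are hypothetical, and it assumes the first line of the recording is a JSON header object and every following non-empty line is an `[interval, code, data]` event.

```python
import json


def parse_cast(text):
    """Split a recording into (header, events).

    Assumption: line 1 is a JSON header object and every other
    non-empty line is an [interval, code, data] event.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = json.loads(lines[0])
    events = [json.loads(ln) for ln in lines[1:]]
    return header, events


def duration(events):
    """Total playback time: the sum of all intervals."""
    return sum(interval for interval, _code, _data in events)


def speed_up(events, factor):
    """Play everything `factor` times faster by dividing each interval."""
    return [[interval / factor, code, data] for interval, code, data in events]


def cap_idle(events, max_gap):
    """Shorten long pauses by clamping each interval to `max_gap` seconds."""
    return [[min(interval, max_gap), code, data] for interval, code, data in events]


def redact(events, secret, mask="***"):
    """Substitute `secret` in output ("o") payloads; other events are untouched."""
    out = []
    for interval, code, data in events:
        if code == "o" and isinstance(data, str):
            data = data.replace(secret, mask)
        out.append([interval, code, data])
    return out


def dump_cast(header, events):
    """Serialize back to one JSON value per line."""
    rows = [json.dumps(header)] + [json.dumps(e) for e in events]
    return "\n".join(rows) + "\n"
```

Because each operation takes a list of events and returns a new one, they compose freely: for example, `speed_up(cap_idle(events, 1.0), 2.0)` first clamps long pauses to one second and then halves every remaining interval. That composability is what lets a model chain edits from short instructions.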

As a demonstration, I recorded an ~85-minute Claude Code session running an ML fine-tuning task with the ml-research plugin and turned it into a 40-second GIF of the highlights without leaving Claude Code. The edit was driven by short natural-language instructions and one custom skill (cast-edit) that wraps a small Python tool for working with the format.