Fundamentals

Application setup

In a freeact application you need:

  • A code execution container. This is an ipybox Docker container that runs a resource server for serving skill sources and a Jupyter Kernel Gateway for stateful code execution in IPython kernels. ipybox containers are managed by the CodeExecutionContainer context manager. The container in the following example is based on the prebuilt ghcr.io/gradion-ai/ipybox:basic Docker image.

  • A code provider for downloading skill sources from the resource server and for registering MCP servers. During registration, the resource server automatically generates Python client functions from MCP tool metadata; these generated sources can then be served to clients. A code provider is created with the CodeProvider context manager. It communicates with the container's resource server at resource_port.

  • A code executor for executing code actions generated by a code action model. A code executor is created with the CodeExecutor context manager. On entering the context, it connects to the container's kernel gateway at executor_port and creates a new IPython kernel. Code executions made with the same CodeExecutor instance are stateful, i.e. executed code can access definitions and results from previous executions.

  • A code action model for generating code actions in response to user queries and code execution results or errors. Code action models are created with the LiteCodeActModel class, which supports any model supported by LiteLLM, including locally deployed models. Skill sources loaded with the code provider are passed to the model's constructor.

  • A code action agent for coordinating the interaction between a code action model, a code executor, and the user or another agent. freeact agents are created with the CodeActAgent class. Depending on the complexity of a user query, the agent generates and executes a sequence of one or more code actions until the model provides a final response to the user.[^1]

This is demonstrated in the following example. You need an Anthropic and a Gemini API key to run it.

examples/fundamentals_1.py
import asyncio
import os

from examples.utils import stream_conversation
from freeact import (
    CodeActAgent,
    CodeExecutionContainer,
    CodeExecutor,
    CodeProvider,
    LiteCodeActModel,
)


async def main():
    async with CodeExecutionContainer(
        tag="ghcr.io/gradion-ai/ipybox:basic",
        env={"GEMINI_API_KEY": os.environ["GEMINI_API_KEY"]},  # (1)!
    ) as container:
        async with CodeProvider(
            workspace=container.workspace,
            port=container.resource_port,
        ) as provider:
            skill_sources = await provider.get_sources(
                module_names=["freeact_skills.search.google.stream.api"],  # (2)!
            )

        model = LiteCodeActModel(
            model_name="anthropic/claude-3-7-sonnet-20250219",
            reasoning_effort="low",
            skill_sources=skill_sources,
            api_key=os.environ["ANTHROPIC_API_KEY"],  # (3)!
        )

        async with CodeExecutor(
            workspace=container.workspace,
            port=container.executor_port,
        ) as executor:
            agent = CodeActAgent(model=model, executor=executor)
            await stream_conversation(agent)  # (4)!


if __name__ == "__main__":
    asyncio.run(main())
  1. A GEMINI_API_KEY is needed for generative Google search with Gemini in the code execution container.

  2. A module that provides generative Google search with Gemini. It is pre-installed in the code execution container.

  3. Needed for the code action model. Added here for clarity, but it can be omitted if ANTHROPIC_API_KEY is set as an environment variable.

  4. Runs a minimalistic text-based interface for interacting with the agent.

Info

Using a code provider is optional; it is only needed for providing skill sources to a code action model. At the moment, it is the application's responsibility to provide skill sources to a code action model. We will soon enhance freeact agents to retrieve skill sources autonomously, depending on the user query and current state.
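As a sketch of this optional case, here is a variant of the first example with the CodeProvider block removed. It is untested and assumes that skill_sources is an optional constructor argument of LiteCodeActModel and that env may be omitted from CodeExecutionContainer when no skill needs an API key; the model then has to rely on its built-in knowledge and the packages preinstalled in the container.

import asyncio
import os

from examples.utils import stream_conversation
from freeact import CodeActAgent, CodeExecutionContainer, CodeExecutor, LiteCodeActModel


async def main():
    async with CodeExecutionContainer(tag="ghcr.io/gradion-ai/ipybox:basic") as container:
        # No CodeProvider and no skill_sources (assumed optional): the model
        # relies on its built-in knowledge of preinstalled packages.
        model = LiteCodeActModel(
            model_name="anthropic/claude-3-7-sonnet-20250219",
            api_key=os.environ["ANTHROPIC_API_KEY"],
        )
        async with CodeExecutor(
            workspace=container.workspace,
            port=container.executor_port,
        ) as executor:
            agent = CodeActAgent(model=model, executor=executor)
            await stream_conversation(agent)


if __name__ == "__main__":
    asyncio.run(main())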

MCP Integration

For examples of how to use MCP servers in code actions, see the Quickstart and MCP integration sections.
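To give a rough idea of the registration flow described earlier, here is a hypothetical sketch that reuses the container from the first example. The register_mcp_servers method name, its configuration format, and the mcp_tool_names parameter are assumptions for illustration only; the linked sections document the actual API.

async with CodeProvider(
    workspace=container.workspace,
    port=container.resource_port,
) as provider:
    # Hypothetical: register an MCP server with the resource server, which
    # then generates Python client functions from the server's tool metadata.
    tool_names = await provider.register_mcp_servers(
        {"mcpServers": {"fetch": {"command": "uvx", "args": ["mcp-server-fetch"]}}}
    )
    # Hypothetical: load the generated client functions as skill sources.
    skill_sources = await provider.get_sources(mcp_tool_names=tool_names)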

Higher-level API

Using the lower-level API, as in the previous section, comes with some boilerplate for constructing CodeExecutionContainer, CodeProvider and CodeExecutor instances. You can avoid this with the higher-level execution_environment context manager, which creates a CodeExecutionEnvironment instance providing more convenient context managers for the code provider and code executor. The following code is equivalent to the previous example:

examples/fundamentals_2.py
import asyncio
import os

from examples.utils import stream_conversation
from freeact import CodeActAgent, LiteCodeActModel, execution_environment


async def main():
    async with execution_environment(
        ipybox_tag="ghcr.io/gradion-ai/ipybox:basic",
        ipybox_env={"GEMINI_API_KEY": os.environ["GEMINI_API_KEY"]},
    ) as env:
        async with env.code_provider() as provider:
            skill_sources = await provider.get_sources(
                module_names=["freeact_skills.search.google.stream.api"],
            )
        async with env.code_executor() as executor:
            model = LiteCodeActModel(
                model_name="anthropic/claude-3-7-sonnet-20250219",
                reasoning_effort="low",
                skill_sources=skill_sources,
                api_key=os.environ["ANTHROPIC_API_KEY"],
            )
            agent = CodeActAgent(model=model, executor=executor)
            await stream_conversation(agent)  # (1)!


if __name__ == "__main__":
    asyncio.run(main())
  1. Runs a minimalistic text-based interface for interacting with the agent.

Agent protocol

freeact provides a wide spectrum of options for interacting with agents, from retrieving just the final response to streaming model and code execution outputs as they are generated at every step.

Final response

To obtain only the final response, await response() on the CodeActAgentTurn returned by the CodeActAgent.run method. This waits until the task-specific sequence of model interactions and code executions is complete.

examples/utils.py::final_response
async def final_response(agent: CodeActAgent, user_message: str) -> str:
    turn: CodeActAgentTurn = agent.run(user_message)
    resp: CodeActAgentResponse = await turn.response()  # (1)!
    return resp.text
  1. Waits until the sequence of model interactions and code executions is complete.

Progressive streaming

For streaming model and code execution outputs as they are generated at each step, consume stream() on CodeActAgentTurn, CodeActModelTurn and CodeExecution objects. After a stream has been fully consumed at a given step, the corresponding aggregated CodeActAgentResponse, CodeActModelResponse or CodeExecutionResult object is available immediately, without further waiting, as demonstrated in the following example:

examples/utils.py::stream_conversation
async def stream_conversation(agent: CodeActAgent, **kwargs):
    usage = CodeActModelUsage()

    while True:
        user_message = await ainput("User message: ")

        if user_message.lower() == "q":
            break

        agent_turn = agent.run(user_message, **kwargs)
        await stream_turn(agent_turn)

        agent_response = await agent_turn.response()
        usage.update(agent_response.usage)  # (4)!

        print("Accumulated usage:")
        print(json.dumps(asdict(usage), indent=2))
        print()


async def stream_turn(agent_turn: CodeActAgentTurn):
    produced_images: Dict[Path, Image.Image] = {}

    async for activity in agent_turn.stream():
        match activity:
            case CodeActModelTurn() as turn:
                print("Model response:")
                async for s in turn.stream():
                    print(s, end="", flush=True)
                print()

                response = await turn.response()  # (1)!
                if response.code:  # (2)!
                    print("\n```python")
                    print(response.code)
                    print("```\n")

            case CodeExecution() as execution:
                print("Execution result:")
                async for s in execution.stream():
                    print(s, end="", flush=True)
                result = await execution.result()  # (3)!
                produced_images.update(result.images)
                print()

    if produced_images:
        print("\n\nProduced images:")
    for path in produced_images.keys():
        print(str(path))
  1. Returns immediately as turn.stream() is already consumed.

  2. The code action produced by the model. If None, the response is a final response.

  3. Returns immediately as execution.stream() is already consumed.

  4. Accumulates token usage and costs across multiple agent runs in a conversation. See also usage statistics.

Usage statistics

CodeActModelResponse.usage contains token usage and costs for a single model interaction; CodeActAgentResponse.usage contains the accumulated token usage and costs for a single agent run. To accumulate token usage and costs across agent runs in a conversation, create a CodeActModelUsage object at the application level and update it with the usage objects from agent responses.
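Condensed from the stream_conversation example above, the pattern looks like this (a sketch, to be run inside an async function with an agent constructed as in the earlier examples and with the same json/asdict imports as examples/utils.py; the query strings are placeholders):

usage = CodeActModelUsage()

for query in ["First question", "Follow-up question"]:
    turn = agent.run(query)
    response = await turn.response()
    usage.update(response.usage)  # accumulate token usage and costs across runs

print(json.dumps(asdict(usage), indent=2))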


[^1]: For trivial queries, the model may also decide to provide a final response directly, without generating a code action.