# Fundamentals

## Application setup

In a `freeact` application you need:
- A code execution container. This is an `ipybox` Docker container running a resource server for providing skill sources and a Jupyter Kernel Gateway for stateful code execution in IPython kernels. `ipybox` containers are managed by the `CodeExecutionContainer` context manager. The container in the following example is based on the prebuilt `ghcr.io/gradion-ai/ipybox:basic` Docker image.
- A code provider for downloading skill sources from the resource server and registering MCP servers. During registration, the resource server automatically generates Python client functions from MCP tool metadata so that the generated sources can be served to clients. A code provider is created with the `CodeProvider` context manager. It communicates with the container's resource server at `resource_port`.
- A code executor for executing code actions generated by a code action model. A code executor is created with the `CodeExecutor` context manager. On entering the context, it connects to the container's kernel gateway at `executor_port` and creates a new IPython kernel. Code executions made with the same `CodeExecutor` instance are stateful, i.e. executed code can access definitions and results from previous executions (a statefulness sketch follows the example below).
- A code action model for generating code actions in response to user queries and code execution results or errors. Code action models are created with the `LiteCodeActModel` class, which supports any model that is also supported by LiteLLM, including locally deployed models (a local-model sketch follows the example below). Skill sources loaded with the code provider are passed to the model's constructor.
- A code action agent for coordinating the interaction between a code action model, a code executor and the user or another agent. `freeact` agents are created with the `CodeActAgent` class. Depending on the complexity of a user query, the agent generates and executes a sequence of one or more code actions until the model provides a final response to the user.[^1]
This is demonstrated in the following example. You need an Anthropic and a Gemini API key to run it.
```python
import asyncio
import os

from examples.utils import stream_conversation
from freeact import (
    CodeActAgent,
    CodeExecutionContainer,
    CodeExecutor,
    CodeProvider,
    LiteCodeActModel,
)


async def main():
    async with CodeExecutionContainer(
        tag="ghcr.io/gradion-ai/ipybox:basic",
        env={"GEMINI_API_KEY": os.environ["GEMINI_API_KEY"]},  # (1)!
    ) as container:
        async with CodeProvider(
            workspace=container.workspace,
            port=container.resource_port,
        ) as provider:
            skill_sources = await provider.get_sources(
                module_names=["freeact_skills.search.google.stream.api"],  # (2)!
            )

        model = LiteCodeActModel(
            model_name="anthropic/claude-3-7-sonnet-20250219",
            reasoning_effort="low",
            skill_sources=skill_sources,
            api_key=os.environ["ANTHROPIC_API_KEY"],  # (3)!
        )

        async with CodeExecutor(
            workspace=container.workspace,
            port=container.executor_port,
        ) as executor:
            agent = CodeActAgent(model=model, executor=executor)
            await stream_conversation(agent)  # (4)!


if __name__ == "__main__":
    asyncio.run(main())
```
1. A `GEMINI_API_KEY` is needed for generative Google search with Gemini in the code execution container.
2. A module that provides generative Google search with Gemini. It is pre-installed in the code execution container.
3. Needed for the code action model. Added here for clarity but can be omitted if `ANTHROPIC_API_KEY` is set as an environment variable.
4. Runs a minimalistic text-based interface for interacting with the agent.
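
Because a `CodeExecutor` keeps a single IPython kernel alive for the lifetime of its context, state carries over between code actions, even across agent runs. Here is a minimal sketch of this, reusing the `agent` from the example above; the queries are illustrative:

```python
# Minimal sketch: two runs of the same agent share one IPython kernel,
# so the second code action can use state created by the first.
# The queries are illustrative.
turn = agent.run("Compute the first 10 prime numbers and store them in a list named primes")
resp = await turn.response()
print(resp.text)

# The model can generate code that references `primes` from the previous run.
turn = agent.run("Now return the sum of primes")
resp = await turn.response()
print(resp.text)
```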
!!! info

    Using a code provider is optional if you don't want to provide skill sources to a code action model. At the moment, it is the responsibility of the application to provide skill sources to a code action model. We will soon enhance `freeact` agents to retrieve skill sources autonomously, depending on the user query and current state.
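
As noted in the component list above, `LiteCodeActModel` supports any model that LiteLLM supports, including locally deployed models. The following is a minimal sketch, assuming an OpenAI-compatible server running locally and assuming `LiteCodeActModel` forwards LiteLLM completion kwargs such as `api_base`; both are assumptions here, not confirmed API:

```python
# Sketch: pointing LiteCodeActModel at a locally deployed model.
# Assumes an OpenAI-compatible server at http://localhost:8000/v1 and
# that LiteCodeActModel forwards LiteLLM kwargs such as api_base.
model = LiteCodeActModel(
    model_name="openai/my-local-model",  # hypothetical model name
    skill_sources=skill_sources,
    api_base="http://localhost:8000/v1",
    api_key="unused",  # many local servers ignore the key
)
```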
## MCP Integration

For examples of how to use MCP servers in code actions, see the Quickstart and MCP integration sections.
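
The code provider is also where MCP servers are registered, as described in the application setup section. The following is a rough sketch of that flow; the `register_mcp_servers` method and `mcp_tool_names` parameter are assumptions here, so consult the sections above for the authoritative API:

```python
# Hypothetical sketch of MCP server registration through a CodeProvider.
# register_mcp_servers / mcp_tool_names are assumed names, not confirmed API.
async with CodeProvider(
    workspace=container.workspace,
    port=container.resource_port,
) as provider:
    # During registration, the resource server generates Python client
    # functions from the MCP server's tool metadata.
    tool_names = await provider.register_mcp_servers(
        {"fetch": {"command": "uvx", "args": ["mcp-server-fetch"]}}  # example server
    )
    # Serve the generated client functions as skill sources.
    skill_sources = await provider.get_sources(mcp_tool_names=tool_names)
```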
## Higher-level API

Using the lower-level API, as in the previous section, comes with some boilerplate for constructing `CodeExecutionContainer`, `CodeProvider` and `CodeExecutor` instances. You can avoid this with the higher-level `execution_environment` context manager, which creates a `CodeExecutionEnvironment` instance that provides more convenient context managers for the code provider and code executor. The following code is equivalent to the previous example:
```python
import asyncio
import os

from examples.utils import stream_conversation
from freeact import CodeActAgent, LiteCodeActModel, execution_environment


async def main():
    async with execution_environment(
        ipybox_tag="ghcr.io/gradion-ai/ipybox:basic",
        ipybox_env={"GEMINI_API_KEY": os.environ["GEMINI_API_KEY"]},
    ) as env:
        async with env.code_provider() as provider:
            skill_sources = await provider.get_sources(
                module_names=["freeact_skills.search.google.stream.api"],
            )
        async with env.code_executor() as executor:
            model = LiteCodeActModel(
                model_name="anthropic/claude-3-7-sonnet-20250219",
                reasoning_effort="low",
                skill_sources=skill_sources,
                api_key=os.environ["ANTHROPIC_API_KEY"],
            )
            agent = CodeActAgent(model=model, executor=executor)
            await stream_conversation(agent)  # (1)!


if __name__ == "__main__":
    asyncio.run(main())
```
1. Runs a minimalistic text-based interface for interacting with the agent.
## Agent protocol

`freeact` provides a wide spectrum of options for interacting with agents, from getting just the final response to streaming model and code execution outputs as they are generated at each step.
### Final response

To obtain only the final response, await `response()` on the `CodeActAgentTurn` returned by the `CodeActAgent.run` method. This waits until the task-specific sequence of model interactions and code executions is complete.
```python
async def final_response(agent: CodeActAgent, user_message: str) -> str:
    turn: CodeActAgentTurn = agent.run(user_message)
    resp: CodeActAgentResponse = await turn.response()  # (1)!
    return resp.text
```
1. Waits until the sequence of model interactions and code executions is complete.
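
For example, with an `agent` constructed as in the application setup section, a caller might use this helper as follows; the query text is illustrative:

```python
# Illustrative call of the helper defined above.
answer = await final_response(agent, "What is the capital of France?")
print(answer)
```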
### Progressive streaming

To stream model and code execution outputs as they are generated at each step, consume `stream()` on `CodeActAgentTurn`, `CodeActModelTurn` and `CodeExecution` objects. After a stream has been fully consumed at a given step, the corresponding aggregated `CodeActAgentResponse`, `CodeActModelResponse` or `CodeExecutionResult` object is available immediately without waiting, as demonstrated in the following example:
```python
# Imports added for self-containment; ainput is assumed to come from
# aioconsole, and the freeact names are those used in this document.
import json
from dataclasses import asdict
from pathlib import Path
from typing import Dict

from aioconsole import ainput
from PIL import Image

from freeact import (
    CodeActAgent,
    CodeActAgentTurn,
    CodeActModelTurn,
    CodeActModelUsage,
    CodeExecution,
)


async def stream_conversation(agent: CodeActAgent, **kwargs):
    usage = CodeActModelUsage()

    while True:
        user_message = await ainput("User message: ")
        if user_message.lower() == "q":
            break

        agent_turn = agent.run(user_message, **kwargs)
        await stream_turn(agent_turn)

        agent_response = await agent_turn.response()
        usage.update(agent_response.usage)  # (4)!

        print("Accumulated usage:")
        print(json.dumps(asdict(usage), indent=2))
        print()


async def stream_turn(agent_turn: CodeActAgentTurn):
    produced_images: Dict[Path, Image.Image] = {}

    async for activity in agent_turn.stream():
        match activity:
            case CodeActModelTurn() as turn:
                print("Model response:")
                async for s in turn.stream():
                    print(s, end="", flush=True)
                print()

                response = await turn.response()  # (1)!
                if response.code:  # (2)!
                    print("\n```python")
                    print(response.code)
                    print("```\n")

            case CodeExecution() as execution:
                print("Execution result:")
                async for s in execution.stream():
                    print(s, end="", flush=True)
                result = await execution.result()  # (3)!
                produced_images.update(result.images)
                print()

    if produced_images:
        print("\n\nProduced images:")
        for path in produced_images.keys():
            print(str(path))
```
1. Returns immediately as `turn.stream()` is already consumed.
2. The code action produced by the model. If `None`, the response is a final response.
3. Returns immediately as `execution.stream()` is already consumed.
4. Accumulates token usage and costs across multiple agent runs in a conversation. See also the usage statistics section below.
## Usage statistics

`CodeActModelResponse.usage` contains token usage and costs for a single model interaction; `CodeActAgentResponse.usage` contains accumulated token usage and costs for a single agent run. To accumulate token usage and costs across agent runs in a conversation, create a `CodeActModelUsage` object at application level and update it with the `usage` objects from agent responses, as sketched below.
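
The following is a minimal sketch of such application-level accumulation, following the same pattern as `stream_conversation` above; the queries are illustrative:

```python
import json
from dataclasses import asdict

from freeact import CodeActAgent, CodeActModelUsage


async def accumulated_usage(agent: CodeActAgent) -> CodeActModelUsage:
    usage = CodeActModelUsage()
    for query in ["First question", "A follow-up question"]:  # illustrative
        turn = agent.run(query)
        response = await turn.response()
        usage.update(response.usage)  # accumulate usage across agent runs
    print(json.dumps(asdict(usage), indent=2))
    return usage
```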
[^1]: For trivial queries, the model may also decide to provide a final response directly, without generating a code action.