# Model integration

`freeact` provides both a low-level and a high-level API for integrating new models:

- The low-level API defines the `CodeActModel` interface and related abstractions.
- The high-level API provides a `LiteLLM` class based on the LiteLLM Python SDK.
## Low-level API

The low-level API is not further described here. For implementation examples, see the `freeact.model.litellm.model` or `freeact.model.gemini.live` modules.
## High-level API

The high-level API supports models from any provider that is compatible with the LiteLLM Python SDK. To use a model, you need to provide prompt templates that guide it to generate code actions. You can either reuse existing templates or create your own.

The following subsections demonstrate this with Qwen 2.5 Coder 32B Instruct as an example, showing how to use it both via the Fireworks API and locally with ollama.
### Prompt templates

Start with model-specific prompt templates that guide Qwen 2.5 Coder Instruct models to generate code actions. For example:
````python
SYSTEM_TEMPLATE = """You are a Python coding expert and ReAct agent that acts by writing executable code.

At each step I execute the code that you wrote in an IPython notebook and send you the execution result.
Then continue with the next step by reasoning and writing executable code until you have a final answer.
The final answer must be in plain text or markdown (exclude code and exclude latex).

You can use any Python package from pypi.org and install it with !pip install ...
Additionally, you can also use modules defined in the following <python-modules> tags:

<python-modules>
{python_modules}
</python-modules>

Important: import these <python-modules> before using them.

Write code in the following format:

```python
...
```
"""

EXECUTION_OUTPUT_TEMPLATE = """Here are the execution results of the code you generated:

<execution-results>
{execution_feedback}
</execution-results>

Proceed with the next step or respond with a final answer to the user question if you have sufficient information.
"""

EXECUTION_ERROR_TEMPLATE = """The code you generated produced an error during execution:

<execution-error>
{execution_feedback}
</execution-error>

Try to fix the error and continue answering the user question.
"""
````
!!! tip

    While tested with Qwen 2.5 Coder Instruct, these prompt templates can also serve as a starting point for other models.
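To reuse these templates without a dedicated subclass, a direct `LiteLLM` instantiation might look like the following sketch (the constructor arguments mirror those used by the `QwenCoder` subclass in the next subsection; the model name and API key are placeholders for your provider):

```python
import os

from freeact.model.litellm.model import LiteLLM

# Sketch: wire the templates above directly into LiteLLM.
# Model name and API key are provider-specific placeholders.
model = LiteLLM(
    model_name="fireworks_ai/accounts/fireworks/models/qwen2p5-coder-32b-instruct",
    system_instruction=SYSTEM_TEMPLATE.format(python_modules=""),  # no skill modules here
    execution_output_template=EXECUTION_OUTPUT_TEMPLATE,
    execution_error_template=EXECUTION_ERROR_TEMPLATE,
    api_key=os.getenv("FIREWORKS_API_KEY"),
)
```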
### Model definition

Although we could instantiate `LiteLLM` directly with these prompt templates, `freeact` provides a `QwenCoder` subclass for convenience:
```python
import os

from freeact.model.litellm.model import LiteLLM
from freeact.model.qwen.prompt import EXECUTION_ERROR_TEMPLATE, EXECUTION_OUTPUT_TEMPLATE, SYSTEM_TEMPLATE


class QwenCoder(LiteLLM):
    """Code action model class for Qwen 2.5 Coder.

    Args:
        model_name: The LiteLLM-specific name of the model.
        skill_sources: Skill modules source code to be included into `system_template`.
        system_template: Prompt template for the system message that guides the model to generate code actions.
            Must define a `{python_modules}` placeholder for the `skill_sources`.
        execution_output_template: A template for formatting successful code execution output.
            Must define an `{execution_feedback}` placeholder.
        execution_error_template: A template for formatting code execution errors.
            Must define an `{execution_feedback}` placeholder.
        api_key: Provider-specific API key. If not provided, reads from the `QWEN_API_KEY` environment variable.
        **kwargs: Default completion kwargs used for
            [`request`][freeact.model.base.CodeActModel.request] and
            [`feedback`][freeact.model.base.CodeActModel.feedback] calls.
            These are overridden by `request`- and `feedback`-specific kwargs.
    """

    def __init__(
        self,
        model_name: str,
        skill_sources: str | None = None,
        system_template: str = SYSTEM_TEMPLATE,
        execution_output_template: str = EXECUTION_OUTPUT_TEMPLATE,
        execution_error_template: str = EXECUTION_ERROR_TEMPLATE,
        api_key: str | None = None,
        **kwargs,
    ):
        # Qwen 2.5 Coder models often hallucinate results prior
        # to code execution, which is prevented by stopping at the
        # beginning of an ```output ...``` block. Also, Qwen Coder
        # models on Fireworks AI sometimes leak <|im_start|> tokens
        # after generating code blocks.
        default_kwargs = {
            "stop": ["```output", "<|im_start|>"],
        }
        super().__init__(
            model_name=model_name,
            execution_output_template=execution_output_template,
            execution_error_template=execution_error_template,
            system_instruction=system_template.format(python_modules=skill_sources or ""),
            api_key=api_key or os.getenv("QWEN_API_KEY"),
            **(default_kwargs | kwargs),
        )
```
### Model usage

Here's a Python example that uses `QwenCoder` as the code action model in a `freeact` agent. The model is accessed via the Fireworks API:
```python
import asyncio
import os

from rich.console import Console

from freeact import CodeActAgent, QwenCoder, execution_environment
from freeact.cli.utils import stream_conversation


async def main():
    async with execution_environment(
        ipybox_tag="ghcr.io/gradion-ai/ipybox:basic",
    ) as env:
        skill_sources = await env.executor.get_module_sources(
            module_names=["freeact_skills.search.google.stream.api"],
        )
        model = QwenCoder(
            model_name="fireworks_ai/accounts/fireworks/models/qwen2p5-coder-32b-instruct",
            api_key=os.environ.get("FIREWORKS_API_KEY"),  # (1)!
            skill_sources=skill_sources,
        )
        agent = CodeActAgent(model=model, executor=env.executor)
        await stream_conversation(agent, console=Console())  # (2)!


if __name__ == "__main__":
    asyncio.run(main())
```
1. Your Fireworks API key
2. Interact with the agent via a CLI
Run it with (assuming you saved the example as `qwen_example.py`; the filename is illustrative):
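```shell
# FIREWORKS_API_KEY must be set in your environment
python qwen_example.py
```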
Alternatively, use the `freeact` CLI directly:
```shell
python -m freeact.cli \
  --model-name=fireworks_ai/accounts/fireworks/models/qwen2p5-coder-32b-instruct \
  --ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
  --skill-modules=freeact_skills.search.google.stream.api \
  --api-key=$FIREWORKS_API_KEY
```
To use the same model deployed locally with ollama, modify `--model-name`, remove `--api-key`, and set `--base-url` to match your local deployment.
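For example, the following sketch assumes a local ollama server at its default address `http://localhost:11434` and a pulled `qwen2.5-coder:32b` model (both the port and the model tag are assumptions about your setup):

```shell
# Model tag and base URL are assumptions; adjust to your local deployment.
python -m freeact.cli \
  --model-name=ollama/qwen2.5-coder:32b \
  --ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
  --skill-modules=freeact_skills.search.google.stream.api \
  --base-url=http://localhost:11434
```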