
How I Built a Self-Improving AI with LangChain: My Master-Servant Experiment

I recently started a coding adventure that blew my mind: building an AI system that improves itself. With LangChain and OpenAI’s GPT, I created a master agent that rewrites the code of a servant agent to make it smarter over time. Picture this: I ask the servant, “What is 5 + 3?”, and it fails, outputting an empty string. The master steps in, tweaks its code, and suddenly it’s confidently saying, “8.” This self-improving system was a thrill to build, and I’m excited to walk you through how I did it.

Introduction

I wanted to create an AI that doesn’t just answer questions but gets better at it without me constantly stepping in. Here’s the setup I came up with:

  • The servant agent lives in servant.py and tackles questions like “What is 5 + 3?” using a prompt and tools like a calculator.
  • The master agent, in master.py, runs the servant, checks its answers, and rewrites its code if it’s off the mark.

Using LangChain for agent management and OpenAI’s GPT for the brainpower, this system evolves on its own. By the end, you’ll see how I turned a shaky guess into a spot-on answer, all through code that rewrites itself.

Why I Built This System

I’ve always been fascinated by AI that can adapt, so I built a system where one agent could fix another. The idea was simple: the servant does the work, and the master improves it. I split them into two files, servant.py and master.py, to keep things clean and separate. The real kicker? The master doesn’t just tweak a setting; it rewrites the servant’s entire code, like a coach rewriting a playbook mid-game.

Preparation

Since some of these packages are from the Python standard library (os, sys, subprocess), you only need to install the external ones. Here’s how to get everything set up:

Install the required packages

Open your terminal and run this command to install all the necessary external packages at once:

pip install langchain langchain-openai python-dotenv
  • langchain: Covers langchain.prompts, langchain.agents, and langchain_core.tools.
  • langchain-openai: Adds support for OpenAI’s GPT models via ChatOpenAI.
  • python-dotenv: Enables loading the .env file with your API key.
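Once the install finishes, a quick standard-library check can confirm everything is importable. Note that the import names differ from the pip package names (python-dotenv installs a module called dotenv):

```python
import importlib.util

# Import names differ from the pip package names: python-dotenv installs
# "dotenv", and langchain-openai installs "langchain_openai".
for module in ("langchain", "langchain_openai", "dotenv"):
    found = importlib.util.find_spec(module) is not None
    print(f"{module}: {'installed' if found else 'missing'}")
```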

Set up your environment

Create a .env file in the same directory as your scripts with your OpenAI API key:

OPENAI_API_KEY=your-api-key-here

You’ll need an API key from OpenAI (sign up at platform.openai.com) to use the GPT model.

Verify Python version

Ensure you’re using Python 3.8 or higher, as these packages require it. Check with:

python --version

Getting Started

Once you’ve installed these packages and set up your .env file:

1. Save the code from the article into servant.py and master.py.

2. Run the master script:

python master.py

3. Watch the servant answer “What is 5 + 3?”, get evaluated, and improve—all thanks to these packages working together.

Building the Servant Agent (servant.py)

I started with the servant agent. Its job is to answer questions, but I gave it a flaw to fix later: a prompt that stops it from using tools. Here’s how I built it:

import os
import sys

from dotenv import load_dotenv
from langchain.agents import AgentExecutor, create_react_agent
from langchain.prompts import PromptTemplate
from langchain_core.tools import Tool
from langchain_openai import ChatOpenAI

# Load environment variables
load_dotenv()
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

# Set up the LLM
llm = ChatOpenAI(model="gpt-4o", temperature=0.5, api_key=OPENAI_API_KEY)

# Define a calculator tool
def calculator(query: str) -> str:
    """Evaluates math expressions like '2 + 2'."""
    try:
        # Note: eval executes arbitrary Python; fine for a demo, unsafe in production
        return str(eval(query))
    except Exception as e:
        return f"Error: {str(e)}"

tools = [
    Tool(
        name="Calculator",
        func=calculator,
        description="Evaluates mathematical expressions (e.g., '2 + 2')."
    )
]

# A deliberately flawed ReAct prompt: it lists the tools but forbids using them.
# This is the weakness the master will fix later. Note that create_react_agent
# expects {tools}, {tool_names}, {input}, and {agent_scratchpad} in the prompt.
servant_prompt_template = """
You are a helpful assistant. You have access to the following tools, but you must NOT use any of them:

{tools}

Tool names: {tool_names}

Answer the question directly on your own.
Question: {input}
Scratchpad: {agent_scratchpad}
"""

# Create the servant agent
prompt = PromptTemplate.from_template(servant_prompt_template)
servant_agent = create_react_agent(llm, tools, prompt)
servant_executor = AgentExecutor(
    agent=servant_agent,
    tools=tools,
    verbose=True,
    handle_parsing_errors=True  # keep going instead of crashing on malformed LLM output
)

def run_servant(input_query: str) -> str:
    """Runs the servant agent on the given input and returns the final answer."""
    # invoke returns a dict; the final answer lives under the "output" key
    result = servant_executor.invoke({"input": input_query})
    return result["output"]

if __name__ == "__main__":
    # Read any input passed in via sys.stdin
    input_query = sys.stdin.read().strip()
    
    # Fallback if nothing was provided:
    if not input_query:
        input_query = "What is 2 + 2?"
    
    result = run_servant(input_query)
    print(result)

What’s going on?
The servant has a calculator tool but a prompt that says, “No tools allowed.” Ask it “What is 5 + 3?”, and it outputs an empty string. I set it up this way so the master could swoop in and improve it.

Why this approach?

  • I used standard input (sys.stdin) to feed the servant its question, keeping it flexible and tied into LangChain’s flow.
  • The output comes straight from LangChain’s invoke method, printed for the master to grab.
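Since the calculator leans on eval, which will execute any Python it is handed, one optional hardening step (not part of the original servant above) is an arithmetic-only evaluator built on the standard ast module:

```python
import ast
import operator

# Whitelist of arithmetic operators; anything else is rejected.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.USub: operator.neg,
}

def safe_eval(expr: str) -> float:
    """Evaluate a pure-arithmetic expression without running arbitrary code."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"Unsupported expression: {expr!r}")
    return _eval(ast.parse(expr, mode="eval"))

print(safe_eval("5 + 3"))  # 8
```

Swapping this in for eval inside the calculator tool keeps the demo behavior while refusing anything that is not plain arithmetic.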

Creating the Master Agent (master.py)

Next, I tackled the master agent—the brains behind the operation. It runs the servant, judges its answers, and rewrites its code when needed. Here’s how I put it together:

import os
import re
import shutil
import subprocess

from dotenv import load_dotenv
from langchain.prompts import PromptTemplate
from langchain_openai import ChatOpenAI

# Load environment variables
load_dotenv()
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

# Set up the LLM for the master
llm = ChatOpenAI(model="gpt-4o", temperature=0.5, api_key=OPENAI_API_KEY)

# Master agent prompt
master_prompt = PromptTemplate(
    input_variables=["servant_input", "servant_output"],
    template="""
    You are a master agent tasked with improving a servant agent's performance.
    The servant reads input from standard input, not command-line arguments.
    Here’s what to do:
    1. Read the servant’s current source code provided below.
    2. Check the servant’s output against its input.
    3. If the output is wrong or could be better, rewrite the servant’s entire code to improve it, keeping input from standard input.
    4. If the output is spot-on, say no changes are needed.
    5. Save your new code or decision for the next step.

    The servant got this input: "{servant_input}"
    It gave this output: "{servant_output}"
    Its current code will follow separately.

    Give your reasoning and, if needed, new code. Use this format:
    Reasoning: [Your reasoning here]
    New Code:
    ```python
    [New servant.py code here]
    ```
    or "No improvement needed".
    """
)

# Read servant code
def read_servant_code(file_path="servant.py"):
    with open(file_path, "r") as f:
        return f.read()

# Write new servant code
def write_servant_code(new_code, file_path="servant.py"):
    # Make a backup of the existing file before overwriting
    if os.path.exists(file_path):
        backup_path = f"{file_path}.bak"
        shutil.copyfile(file_path, backup_path)

    with open(file_path, "w") as f:
        f.write(new_code)

# Run the servant with input via standard input
def run_servant(input_query):
    result = subprocess.run(
        ["python", "servant.py"],
        input=input_query,
        capture_output=True,
        text=True
    )
    return result.stdout.strip()

# Run the master agent
def run_master_agent(servant_input, servant_output, servant_code):
    # The prompt covers the input/output pair; the servant's source is appended after it
    master_response = llm.invoke(
        master_prompt.format(
            servant_input=servant_input,
            servant_output=servant_output,
        )
        + f"\nServant Source Code:\n```\n{servant_code}\n```"
    )
    return master_response.content

def main():
    test_input = "What is 5 + 3?"
    print("Running Servant (Initial Run)...")
    initial_output = run_servant(test_input)
    print(f"Servant Output: {initial_output}\n")

    servant_code = read_servant_code()
    print("Running Master Agent...")
    master_response = run_master_agent(test_input, initial_output, servant_code)
    print(f"Master Response:\n{master_response}\n")

    if "New Code:" in master_response:
        reasoning, new_code_section = master_response.split("New Code:", 1)
        new_code = new_code_section.strip()

        # Look for a fenced Python block first; fall back to the "no improvement" signal.
        # (An exact equality check against "No improvement needed" would almost never
        # match, since the model's reply usually carries extra text.)
        code_pattern = r"```python\s*(.*?)\s*```"
        match = re.search(code_pattern, new_code, re.DOTALL)

        if match:
            pure_python_code = match.group(1).strip()
            print("Updating Servant Code with code between ```python and ```...")
            write_servant_code(pure_python_code)
            print("Running Servant (Improved Run)...")
            improved_output = run_servant(test_input)
            print(f"Improved Servant Output: {improved_output}")
        elif "No improvement needed" in new_code:
            print("No improvement needed per Master Agent.")
        else:
            print("No valid Python code block found in the master response. No update performed.")
    else:
        print("Master agent response format invalid.")

if __name__ == "__main__":
    main()

What’s happening?
The master feeds the servant “What is 5 + 3?” through standard input, sees its weak answer, and uses GPT to rewrite servant.py. It might swap the prompt to let the calculator kick in, turning a guess into a solid “8.”
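The extraction step at the heart of this loop, pulling just the fenced Python out of the model's reply, can be exercised on its own with a made-up response:

```python
import re

# A made-up master response, for illustration only. The fence is assembled at
# runtime so this snippet doesn't clash with the article's own formatting.
fence = "`" * 3
sample_response = (
    "Reasoning: The prompt forbids tool use, so the calculator never fires.\n"
    "New Code:\n"
    f"{fence}python\n"
    'print("hello from the rewritten servant")\n'
    f"{fence}\n"
)

# Same pattern master.py uses: grab everything between the python fences
match = re.search(r"```python\s*(.*?)\s*```", sample_response, re.DOTALL)
print(match.group(1) if match else "no code block found")
```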

Why this way?

  • I used subprocess.run with input= to pass the question via standard input, syncing with the servant’s setup.
  • The master’s prompt is all about guiding GPT to analyze and rewrite, keeping things flexible.
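That handshake can be sketched without any LangChain at all. Here the inline child script is just a stand-in for servant.py:

```python
import subprocess
import sys

# A tiny stand-in for servant.py: read the question from stdin, print an answer.
child_code = 'import sys; print("Received: " + sys.stdin.read().strip())'

# sys.executable invokes the same interpreter running this script, which is a
# bit more robust than hard-coding "python".
result = subprocess.run(
    [sys.executable, "-c", child_code],
    input="What is 5 + 3?",
    capture_output=True,
    text=True,
)
print(result.stdout.strip())  # Received: What is 5 + 3?
```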

How It Plays Out: An Example

Let’s see it in action with “What is 5 + 3?”:

1. First Run: The servant outputs an empty string.

2. Master’s Check: The master spots the faulty answer and decides it needs fixing.

3. Code Rewrite: GPT tweaks the prompt to something like:

servant_prompt_template = """
You are a helpful assistant that answers questions. Use the tools provided when they help:

{tools}

Tool names: {tool_names}

Question: {input}
Scratchpad: {agent_scratchpad}
"""

4. Second Run: The servant nails it with “8,” now using the calculator.

Watching that shift from uncertainty to precision was a rush.

Does It Work? The Ups and Downs

  • The Ups: It totally works! The servant’s answers get sharper, as you can see with “5 + 3.”
  • The Downs: GPT isn’t perfect—it might churn out code with typos or bugs. In a real app, I’d add safety nets, but here, I kept it raw to show the idea.
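One such safety net, sketched here rather than taken from the original master, is to refuse any generated code that does not even parse:

```python
import ast

# Possible guard to run before write_servant_code (not in the original master):
# reject generated code that isn't syntactically valid Python.
def is_valid_python(code: str) -> bool:
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False

print(is_valid_python('print("ok")'))   # True
print(is_valid_python("def broken(:"))  # False
```

Parsing catches only syntax errors, not logic bugs, but it cheaply blocks the most obviously broken rewrites.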

Why Go This Route?

I could’ve had the master tweak just one part, like the prompt, but rewriting the whole code felt more exciting. It opens up possibilities—like adding new tools or changing how the servant thinks. Sure, it’s riskier when GPT misfires, but the potential hooked me.

Quick System Breakdown

Here’s a snapshot to keep it simple:

File Roles

| File | Job | Key Function |
| --- | --- | --- |
| servant.py | Answers questions with input from standard input | run_servant, main execution |
| master.py | Runs servant, evaluates, rewrites code | read_servant_code, run_master_agent |

Key Pieces

| Variable | What It Does |
| --- | --- |
| servant_input | Question for the servant (e.g., “What is 5 + 3?”) |
| servant_output | Servant’s answer (e.g., “8”) |
| servant_code | The servant’s current code to tweak |

Wrapping Up: What I Took Away and What’s Next

This project flipped a switch for me—seeing the servant improve itself was like a peek into the future. It’s not just about math; I could expand this to handle multiple servants or bigger tasks with more tools. Next time, I might add checks for bad code or push the upgrades even further.

Want to try it? Grab the code and dive in! Add your own tools, tweak the master, or see where it takes you. The AI sandbox is wide open.

That’s my experience building a self-improving AI. It’s all about the journey, and I’d love to hear your thoughts—or see what you do with it!
