Advances in large language models (LLMs) have changed what is possible through a simple chat interface. Since the recent generation of LLMs was first introduced, the daily pace of innovation has made it hard to appreciate how far things have come. The first iteration was talking to LLMs to get answers for the tasks you needed to complete (e.g., coding, writing, Q&A). The next iteration will be instructing LLMs to retrieve the required information and take action to complete an overall objective.
Auto-GPT and OpenAI Plugins are just a couple of examples leading in this direction. Every month, several builders are launching products in an attempt to claim various verticals, promising the integration of autonomous agents into your daily life.
I wanted to run my own experiment to see how well agents perform when given a task, and whether multiple agents can improve the overall outcome of a given objective. Some code examples are influenced by Auto-GPT and BabyAGI but have been written in Go (sorry Python community), both to learn and to have better control over what I am trying to accomplish.
Here is everything we will look at:
GPT 3.5 vs GPT 4
One agent vs Three (or more?) agents
Objective outcomes
Improvements
One Agent, One Objective
First, let’s create a baseline and have one agent complete a given objective.
The objective: Create a landing page for a fintech startup that helps users analyze their financial data.
Prompt setup:
GPT 3.5
We can quickly summarize and move past gpt-3.5, since it has a hard time following system prompt instructions. The outputs are inconsistent and generally not usable. We will come back to gpt-3.5 in a later test, when we introduce more agents, to see if there are improvements. It takes some playing around with the temperature to get an actual output for the “data” attribute. For now, here is a working output:
{
"task": {
"command": {
"args": {
"fileDirectory": "/landing",
"fileName": "index.html",
"data": "<html><head><title>Financial Analysis</title></head><body><h1>Welcome to our Fintech Startup</h1><p>Here you can analyze your financial data and get valuable insights.</p></body></html>"
},
"name": "WriteFile"
},
"description": "Create a landing page for a fintech startup.",
"id": "12345"
},
"thoughts": {
"reasoning": "I used the 'WriteFile' command to create an HTML file with a simple landing page content in the specified directory.",
"speak": "I created a landing page for the fintech startup using the specified file directory and name.",
"text": "Created landing page."
}
}
Observations:
When it works, gpt-3.5 provides a basic usable output.
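Because the replies are inconsistent, it helps to decode them defensively. The following is a minimal sketch, not the code used in this experiment: the field names come from the JSON output above, while the Go type names and the `ParseResponse` helper are assumptions made for illustration. Trimming to the outermost braces is only a heuristic for replies that wrap the JSON in extra text.

```go
package main

import (
	"encoding/json"
	"errors"
	"strings"
)

// Args mirrors the "args" object of the WriteFile command shown above.
type Args struct {
	FileDirectory string `json:"fileDirectory"`
	FileName      string `json:"fileName"`
	Data          string `json:"data"`
}

// Command is the action the agent wants to run.
type Command struct {
	Name string `json:"name"`
	Args Args   `json:"args"`
}

// Task describes the unit of work. ID is kept raw because the model is free
// to emit either a JSON string or a JSON number for it.
type Task struct {
	ID          json.RawMessage `json:"id"`
	Description string          `json:"description"`
	Command     Command         `json:"command"`
}

// Thoughts carries the agent's free-text explanation.
type Thoughts struct {
	Reasoning string `json:"reasoning"`
	Speak     string `json:"speak"`
	Text      string `json:"text"`
}

// Response is the full reply envelope.
type Response struct {
	Task     Task     `json:"task"`
	Thoughts Thoughts `json:"thoughts"`
}

// ParseResponse (hypothetical helper) cuts the reply down to the span between
// the first "{" and the last "}", decodes it, and rejects replies that do not
// name a command.
func ParseResponse(reply string) (Response, error) {
	var r Response
	start := strings.Index(reply, "{")
	end := strings.LastIndex(reply, "}")
	if start < 0 || end < start {
		return r, errors.New("no JSON object found in reply")
	}
	if err := json.Unmarshal([]byte(reply[start:end+1]), &r); err != nil {
		return r, err
	}
	if r.Task.Command.Name == "" {
		return r, errors.New("reply has no command name")
	}
	return r, nil
}
```

A caller would typically retry the request when `ParseResponse` returns an error instead of passing a half-formed command on to the file writer.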
GPT 4
Even though gpt-4 is much better at following prompt instructions on a more consistent basis (still not perfect), it takes the objective more literally if it is vaguely defined. In the output with gpt-4 for the objective above, the agent is trying to create a website with functionality and not just a marketing website for a fictional fintech startup.
{
"task": {
"command": {
"args": {
"fileDirectory": "./",
"fileName": "index.html",
"data": "<!DOCTYPE html><html><head><meta charset='utf-8'><meta name='viewport' content='width=device-width, initial-scale=1.0'><title>Fintech Startup</title><link rel='stylesheet' href='styles.css'></head><body><header><h1>Analyze Your Financial Data</h1></header><main><section><h2>Upload your data</h2><input type='file' id='fileInput'><button onclick='analyze()'>Analyze!</button></section><section><h2>Your analysis:</h2><p id='analysisResults'>Results will be displayed here...</p></section></main><footer><p>© 2022 Fintech Startup. All rights reserved.</p></footer><script src='script.js'></script></body></html>"
},
"name": "WriteFile"
},
"description": "Write the index.html file",
"id": "1"
},
"thoughts": {
"reasoning": "Creating the landing page markup (HTML) to display the contents and provide user interaction.",
"speak": "I have written the HTML code for our fintech startup's landing page.",
"text": "I have constructed a proper landing page structure for a fintech startup using index.html file."
}
}
Observations:
A slightly more detailed output than gpt-3.5, but still basic given the simple objective. A more descriptive prompt will likely lead to better output with gpt-4.
Both outputs created a working website in HTML, which is a great start. However, let’s see how we can get even better quality out of this experiment.
The Task Manager
We can probably develop logic to interact with a single agent and allow it to iterate and progress towards the objective by incorporating human feedback for improved output. However, the primary goal of this experiment is achieving end-to-end automation using one or more LLM agents. With context and prompt limitations, it is difficult to see a single agent producing a result that completes a sizable objective. Objectives can only be so descriptive, and if we want to mimic the real world (or agile development practices for software) then we can look at an objective as an epic that requires a breakdown of stories and tasks with ongoing analysis of each.
An agent specifically instructed to break down an objective into tasks can create more granular steps for our engineering agent, helping it work towards the objective and produce a more complete output.
The objective: Create a landing page for a fintech startup that helps users analyze their financial data. The landing page should only have marketing content for the startup.
Task manager agent prompt:
We first call the task manager agent to get a list of tasks for the given objective and then have an engineering agent complete each task with context of the last completed task.
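The flow described here can be sketched roughly as follows. This is an illustration under assumptions, not the code from the experiment: `Agent` stands in for one LLM call (system prompt already baked in), and `RunObjective` and the input layout passed to the engineer are made-up names and formats. Only the "tasks", "id" and "description" fields are taken from the task manager output shown in this post.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Agent is a hypothetical stand-in for a single LLM call: input text in,
// raw reply out.
type Agent func(input string) (string, error)

// PlannedTask is one entry of the task manager's "tasks" array.
type PlannedTask struct {
	ID          string `json:"id"`
	Description string `json:"description"`
}

// plan is the task manager's reply, reduced to the part this loop needs.
type plan struct {
	Tasks []PlannedTask `json:"tasks"`
}

// RunObjective asks the task manager for a task list, then hands each task to
// the engineer together with the output of the last completed task.
func RunObjective(manager, engineer Agent, objective string) ([]string, error) {
	raw, err := manager(objective)
	if err != nil {
		return nil, fmt.Errorf("task manager: %w", err)
	}
	var p plan
	if err := json.Unmarshal([]byte(raw), &p); err != nil {
		return nil, fmt.Errorf("parse task list: %w", err)
	}

	outputs := make([]string, 0, len(p.Tasks))
	last := "" // empty for the first task
	for _, t := range p.Tasks {
		input := fmt.Sprintf("Objective: %s\nTask: %s\nLast completed task output: %s",
			objective, t.Description, last)
		out, err := engineer(input)
		if err != nil {
			// Return what was completed so far along with the failure.
			return outputs, fmt.Errorf("task %s: %w", t.ID, err)
		}
		outputs = append(outputs, out)
		last = out
	}
	return outputs, nil
}
```

Keeping the agents behind a plain function type makes it easy to swap in a fake for testing, or a different model per agent, without touching the loop.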
Updated prompt for the engineering agent:
GPT 4
Even with gpt-4’s ability to consistently follow instructions from a system prompt, unexpected outputs still occur occasionally. This is especially frustrating when you are trying to force a formatted response and then parse it. Short of fine-tuning an open-source LLM on formatted outputs, gpt-4 is still the best option available for experimenting with agents quickly. Here is the task manager’s output:
{
"tasks": [
{
"description": "Gather marketing content and requirements from the startup team",
"id": "1"
},
{
"description": "Design a high-level wireframe for the landing page",
"id": "2"
},
{
"description": "Determine the technology stack and necessary libraries for front-end development",
"id": "3"
},
{
"description": "Implement responsive design for mobile and desktop devices",
"id": "4"
},
{
"description": "Develop the landing page with designed wireframe and collected marketing content",
"id": "5"
},
{
"description": "Optimize landing page for fast loading and SEO",
"id": "6"
},
{
"description": "Test landing page for cross-browser compatibility, accessibility, performance, and ensure it meets the requirements",
"id": "7"
},
{
"description": "Deploy the landing page to a production environment",
"id": "8"
},
{
"description": "Continuously review and analyze landing page engagement and optimize as needed",
"id": "9"
}
],
"thoughts": {
"reasoning": "I've identified the least number of engineering-related tasks to create a landing page for the fintech startup. Each task is ordered by priority and grouped by their relevance.",
"speak": "I've prioritized and grouped the necessary engineering tasks to build the fintech startup's landing page. Please review the task list to ensure it aligns with your expectations.",
"text": "The prioritized list should help us stay focused on building the site efficiently and ensure we're meeting our objectives."
}
}
The previous task alone is probably not enough information for the engineering agent to complete the next task. The context limitations don’t allow injecting the output of all tasks back into the system prompt to infer from. A better setup, to get around these limitations, would be to create embeddings and pull all relevant information (from previous tasks) before completing a task.
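The retrieval step of such a setup can be sketched as below. This is a minimal illustration, assuming the embedding vectors have already been produced by some embedding model (no API call is shown); `Memory`, `cosine` and `TopK` are hypothetical names. It ranks stored task outputs by cosine similarity to the current task and keeps the best k.

```go
package main

import (
	"math"
	"sort"
)

// Memory is one stored task output together with its embedding vector.
type Memory struct {
	TaskID string
	Text   string
	Vector []float64
}

// cosine returns the cosine similarity of a and b, or 0 when the vectors are
// empty, differ in length, or either has zero magnitude.
func cosine(a, b []float64) float64 {
	if len(a) == 0 || len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// TopK returns up to k memories most similar to the query vector, best first.
// The input slice is copied, so the caller's store keeps its order.
func TopK(query []float64, store []Memory, k int) []Memory {
	ranked := make([]Memory, len(store))
	copy(ranked, store)
	sort.SliceStable(ranked, func(i, j int) bool {
		return cosine(query, ranked[i].Vector) > cosine(query, ranked[j].Vector)
	})
	if k < 0 {
		k = 0
	}
	if k > len(ranked) {
		k = len(ranked)
	}
	return ranked[:k]
}
```

The selected texts would then be injected into the engineer's prompt in place of only the last task's output, which keeps the prompt size bounded by k rather than by the number of completed tasks.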
Note: not all of these tasks are meant to be completed by an engineering agent, but for the purposes of this test we will examine the output of a single agent before we introduce more agents.
## Call engineer agent to complete task: ##
{
"task": {
"command": {
"args": {
"fileName": "requirements.md",
"data": "- Marketing content: logos, banners, infographics, texts\n- Clear product description\n- Team bios and photos\n- Testimonials from users\n- Features and benefits list\n- Call-to-action with signup form\n- Security and trust badges\n- Pricing options and special offers\n- Footer with contact information, social media links, and legal policies",
"fileDirectory": "marketing"
},
"name": "WriteFile"
},
"description": "Gather marketing content and requirements from the startup team",
"id": 0
},
"thoughts": {
"reasoning": "To achieve the goal of the task, we need to collect all necessary marketing content and requirements from the startup team. By doing so, we can ensure that the landing page will have all the essential elements to make it informative and engaging.",
"speak": "Creating a requirements file with marketing content and requirements from the startup team.",
"text": "This file will serve as a reference for the information we need as we proceed with the creation of the landing page."
}
}
## Starting next task ##
## Call engineer agent to complete task: ##
{
"task": {
"command": {
"args": {
"fileDirectory": "design",
"fileName": "wireframe.md",
"data": "1. Header with logo and navigation\n2. Hero section with banners, product description, and call-to-action\n3. Features and benefits section\n4. Testimonials section\n5. Team bios and photos\n6. Security and trust badges\n7. Pricing options and special offers\n8. Footer with contact information, social media links, and legal policies"
},
"name": "WriteFile"
},
"description": "Design a high-level wireframe for the landing page",
"id": 0
},
"thoughts": {
"reasoning": "To design a high-level wireframe, I'm going to create a simple outline of the structure for the landing page, which will later serve as a basis for more detailed designs. I'll use the gathered information from the previous task to guide the structure definition.",
"speak": "I've created a high-level wireframe for the landing page. The structure now reflects all the key components required.",
"text": "Now the wireframe is set, we can focus on the next steps of building the landing page."
}
}
## Starting next task ##
## Call engineer agent to complete task: ##
{
"task": {
"command": {
"args": {
"data": "1. React for the front-end framework\n2. Material-UI for the UI components\n3. Axios for HTTP requests\n4. React-Router for navigation\n5. Styled-components for CSS styling",
"fileDirectory": "tech-stack",
"fileName": "frontend-stack.md"
},
"name": "WriteFile"
},
"description": "Determine the technology stack and necessary libraries for front-end development",
"id": 0
},
"thoughts": {
"reasoning": "To complete the task, I identified a front-end stack that is well-suited for creating a modern and performant landing page. React was chosen for its ability to create interactive UIs, Axios for making AJAX requests, Material-UI for pre-built UI components, React-Router for client-side navigation, and Styled-components for enhanced CSS customization.",
"speak": "I have determined the front-end technology stack to use for the landing page and written it to the file 'frontend-stack.md' in the 'tech-stack' directory.",
"text": "I've identified a suitable technology stack for the front-end development and saved it in the relevant file."
}
}
## Starting next task ##
## Call engineer agent to complete task: ##
{
"task": {
"id": 0,
"description": "Implement responsive design for mobile and desktop devices",
"command": {
"name": "WriteFile",
"args": {
"fileDirectory": "responsive-design",
"fileName": "responsive-styles.md",
"data": "@media (max-width: 768px) {\n /* Mobile styles */\n}\n\n@media (min-width: 769px) {\n /* Desktop styles */\n}"
}
}
},
"thoughts": {
"reasoning": "To implement responsive design, we need to write media queries targeting specific breakpoints for mobile and desktop devices. This will allow us to apply different styles for each device type.",
"speak": "I will write responsive design styles using media queries targeting mobile and desktop devices.",
"text": "Creating a file with media queries for mobile and desktop styles"
}
}
## Starting next task ##
## Call engineer agent to complete task: ##
{
"task": {
"command": {
"args": {
"fileDirectory": "landing-page",
"fileName": "index.html",
"data": "<!DOCTYPE html>\n<html lang='en'>\n<head>\n <meta charset='utf-8'>\n <meta name='viewport' content='width=device-width, initial-scale=1.0'>\n <title>Fintech Startup</title>\n <link rel='stylesheet' href='responsive-design/responsive-styles.md'>\n</head>\n<body>\n <header>\n <h1>Welcome to Fintech Startup</h1>\n </header>\n <main>\n <section>\n <h2>Analyze Your Financial Data</h2>\n <p>Our platform provides a seamless financial data analysis experience. Leverage our tools to gain insights into your spending habits and make informed financial decisions.</p>\n </section>\n </main>\n <footer>\n <p>© 2022 Fintech Startup. All rights reserved.</p>\n </footer>\n</body>\n</html>"
},
"name": "WriteFile"
},
"description": "Develop the landing page with designed wireframe and collected marketing content",
"id": 0
},
"thoughts": {
"reasoning": "I have selected WriteFile command to create the landing page using the provided HTML structure and content. I have also made sure to include the previously developed responsive styles.",
"speak": "I have created the landing page using the given wireframe and marketing content.",
"text": "I have written the landing page content to an index.html file, and also included the responsive styles that were previously implemented."
}
}
## Starting next task ##
## Call engineer agent to complete task: ##
{
"task": {
"command": {
"args": {
"fileDirectory": "landing-page",
"fileName": "index.html",
"data": "<!DOCTYPE html><html lang='en'><head><meta charset='utf-8'><meta name='viewport' content='width=device-width, initial-scale=1.0'><meta name='description' content='Fintech Startup - Analyze Your Financial Data'><title> Fintech Startup | Financial Data Analysis </title><link rel='stylesheet' href='responsive-design/responsive-styles.min.css'> </head><body><header><h1>Welcome to Fintech Startup</h1></header><main><section><h2>Analyze Your Financial Data</h2><p>Our platform provides a seamless financial data analysis experience. Leverage our tools to gain insights into your spending habits and make informed financial decisions.</p></section></main><footer><p>© 2022 Fintech Startup. All rights reserved.</p></footer></body></html>"
},
"name": "WriteFile"
},
"description": "Optimize landing page for fast loading and SEO",
"id": 0
},
"thoughts": {
"reasoning": "I added a meta description tag, updated the title and href tag values, and compressed the file content to remove unnecessary spaces to optimize the provided HTML code.",
"speak": "I have optimized the landing page by including a meta description tag for SEO, updating the title and href tag values, and compressing the file content.",
"text": "Added a meta description tag to improve SEO. Updated title and href tags. Compressed HTML to reduce file size and improve load times."
}
}
...
Since there are more granular tasks to work from, the LLM agent can produce more detailed outputs. It can work to incorporate the output from a previous task into the next task instead of trying to produce a single complete output from a given prompt. This is still not perfect since any single task might need review and iterations to complete.
Observations:
The “id” field doesn’t map properly: the task manager assigns ids such as “1”, but the engineer agent returns 0 for every task. I tried different prompt variations without success. I am sure there is some prompt hack that can get the agent to reuse the same “id”.
Some tasks are not complete or go in the wrong direction. Without feedback and iterations with the agent we are stuck with these results.
Overall better outputs with the breakdown of tasks even if the agent doesn’t have all the information. To improve this, in a future experiment we can create an agent for each business function and have the task manager delegate to the right agent.
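On the “id” observation: one workaround that does not depend on prompting is to treat the id the orchestrator assigned as the source of truth and only check what the model echoed back. The sketch below is an assumption-laden illustration, not part of the original experiment. `FlexID` and `CheckID` are hypothetical names; `FlexID` accepts either a JSON string or a JSON number, since both appear in the outputs above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// FlexID decodes an id that may arrive as a JSON string ("1") or a JSON
// number (0) and stores it as text.
type FlexID string

// UnmarshalJSON tries a string first, then falls back to a number.
func (f *FlexID) UnmarshalJSON(b []byte) error {
	var s string
	if err := json.Unmarshal(b, &s); err == nil {
		*f = FlexID(s)
		return nil
	}
	var n json.Number
	if err := json.Unmarshal(b, &n); err != nil {
		return fmt.Errorf("id must be a string or a number, got %s", b)
	}
	*f = FlexID(n.String())
	return nil
}

// taskEnvelope reads only the id out of an agent reply.
type taskEnvelope struct {
	Task struct {
		ID FlexID `json:"id"`
	} `json:"task"`
}

// CheckID decodes the id the model returned and reports whether it matches
// the id the orchestrator assigned. Callers keep using `assigned` either way.
func CheckID(reply, assigned string) (echoed string, ok bool, err error) {
	var env taskEnvelope
	if err := json.Unmarshal([]byte(reply), &env); err != nil {
		return "", false, err
	}
	echoed = string(env.Task.ID)
	return echoed, echoed == assigned, nil
}
```

With this in place a mismatch becomes something to log rather than something that breaks task tracking.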
The “Lead” Engineer
Let’s add a few more constraints before a task can be considered complete. In this test we introduce a lead engineer agent to review each task and decide whether it is complete before the engineer agent can move on to the next task. The engineer can ask the lead questions, and the lead can leave comments on any given task.
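The review gate can be sketched as a bounded loop. This is a rough illustration under assumptions rather than the code behind the results: `Agent`, `CompleteWithReview`, `ErrNoSignOff` and the round limit are hypothetical, and the engineer's question command is left out for brevity. The "leadComments" and "leadSignOff" fields match the lead engineer replies shown further down.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Agent is a hypothetical stand-in for a single LLM call.
type Agent func(input string) (string, error)

// review holds the two fields the lead engineer adds to a task.
type review struct {
	Task struct {
		LeadComments string `json:"leadComments"`
		LeadSignOff  bool   `json:"leadSignOff"`
	} `json:"task"`
}

// ErrNoSignOff is returned when the round limit is reached without approval.
var ErrNoSignOff = errors.New("lead engineer did not sign off")

// CompleteWithReview lets the engineer attempt the task, asks the lead to
// review the attempt, and feeds the lead's comments into the next attempt.
// It stops at the first sign-off or after maxRounds attempts.
func CompleteWithReview(engineer, lead Agent, task string, maxRounds int) (string, error) {
	comments := ""
	for round := 0; round < maxRounds; round++ {
		out, err := engineer(task + "\nLead comments: " + comments)
		if err != nil {
			return "", fmt.Errorf("engineer: %w", err)
		}
		raw, err := lead(out)
		if err != nil {
			return "", fmt.Errorf("lead engineer: %w", err)
		}
		var r review
		if err := json.Unmarshal([]byte(raw), &r); err != nil {
			return "", fmt.Errorf("parse review: %w", err)
		}
		if r.Task.LeadSignOff {
			return out, nil
		}
		comments = r.Task.LeadComments
	}
	return "", ErrNoSignOff
}
```

The round limit matters in practice: without it, two agents that never converge would keep spending tokens indefinitely.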
Adding another command for the engineer:
The lead engineer prompt:
GPT 3.5
Aside from frequent hallucinations and formatting issues, there is still something to learn when adding this additional agent to test gpt-3.5. LLMs that are subpar might benefit from “self-refinement” techniques. In the numerous tests I ran, more often than not, when the initial engineer agent hallucinated or didn’t complete the task, the lead engineer agent was able to review and provide guidance on what was missing. Unfortunately, this doesn’t always lead to better outputs with gpt-3.5 but can be a technique to use if you are working with LLMs that hallucinate often or haven’t followed all the instructions. Here is one example:
{
"tasks": [
{
"description": "Create wireframes for the landing page",
"id": "WF001"
},
{
"description": "Design UI for the landing page",
"id": "UI002"
},
...
],
"thoughts": {
"reasoning": "In order to create a successful landing page for the fintech startup, it is important to start with wireframes and design the UI. Once the design is approved, a prototype can be developed and integrated with analytics and data analysis tools. Adding website copy and call-to-action buttons and forms is crucial for engagement and lead generation. Optimization for SEO and testing on different devices and browsers is also important for a seamless user experience.",
"speak": "Here are the tasks needed to complete the landing page for the fintech startup",
"text": "Tasks for fintech startup landing page."
}
}
## Call engineer agent to complete task: ##
{
"task": {
"command": {
"args": {
"fileDirectory": "./",
"fileName": "wireframes.txt",
"data": "Here are the wireframes for the landing page: \n\n[insert wireframes]"
},
"name": "WriteFile"
},
"description": "Create wireframes for the landing page",
"id": 1
},
"thoughts": {
"reasoning": "Before starting development on the landing page, wireframes need to be created to provide a visual layout. Writing these wireframes to a file will allow for easy access and sharing with the rest of the team.",
"speak": "Wireframes have been created and written to a file.",
"text": "Wireframes created and written to a file."
}
}
## Call lead engineer agent to review task or ask a question: ##
{
"task": {
"command": {
"args": {
"data": "Here are the wireframes for the landing page: \n\n[insert wireframes]",
"fileDirectory": "./",
"fileName": "wireframes.txt"
},
"name": "WriteFile"
},
"description": "Create wireframes for the landing page",
"id": 1,
"leadComments": "This command is correct for writing the wireframes to a text file. However, before executing this command, make sure to actually create the wireframes and replace the '[insert wireframes]' part in the data argument with the actual wireframe images or diagrams.",
"leadSignOff": false
}
}
...
Another example:
...
## Call engineer agent to complete task: ##
{
"task": {
"command": {
"args": {
"fileDirectory": "./",
"fileName": "stakeholder_requirements.txt",
"data": "List of stakeholder requirements and preferences:"
},
"name": "WriteFile"
},
"description": "Gather requirements and preferences from stakeholders",
"id": 1
},
"thoughts": {
"reasoning": "We need to gather requirements and preferences from our stakeholder to create a landing page that meets their needs. It's important to document this information for reference when creating the landing page.",
"speak": "I will document requirements and preferences from the stakeholders by writing them to a file.",
"text": "Documenting stakeholder requirements and preferences."
}
}
## Call lead engineer agent to review task or ask a question: ##
{
"task": {
"id": 1,
"description": "Gather requirements and preferences from stakeholders",
"command": {
"name": "WriteFile",
"args": {
"data": "List of stakeholder requirements and preferences:",
"fileDirectory": "./",
"fileName": "stakeholder_requirements.txt"
}
},
"leadComments": "The command looks good, but it would be helpful to gather requirements and preferences from the stakeholders before saving them to a file. Have you completed the task of gathering the requirements and preferences from the stakeholders?",
"leadSignOff": false
}
}
## Call engineer agent to complete task: ##
{
"task": {
"command": {
"args": {
"question": "What are the key features and messaging that the stakeholders want to see on the landing page?"
},
"name": "AskQuestion"
},
"description": "Gather requirements and preferences from stakeholders",
"id": 1
},
"thoughts": {
"reasoning": "We need to gather requirements and preferences from our stakeholders to create a landing page that meets their needs. Asking them directly about their key features and messaging preferences will give us a better idea of how to proceed.",
"speak": "I will ask the stakeholders about their key features and messaging preferences to better understand what they want to see on the landing page.",
"text": "Asking stakeholders about key features and messaging preferences."
}
}
...
Observations:
In both examples, the lead engineer catches the issues with the command and provides direction to properly complete the task.
Unfortunately, the review didn’t help the engineer agent complete the task; the engineer still needed additional information.
GPT 4
Since gpt-4 is fairly consistent at following instructions, the engineer rarely needs to ask a question and the lead engineer rarely needs to leave comments on a task. This doesn’t help us much since we want cases of deeper analysis and “back and forth” review before a task is considered complete. We can examine the output to see if there were occurrences where the lead engineer missed an opportunity to comment and improve an output:
{
"tasks": [
{
"description": "Define project scope and functional requirements",
"id": "task_1"
},
{
"description": "Brainstorm and plan design concept for the landing page",
"id": "task_2"
},
{
"description": "Create wireframes and mockups for the landing page",
"id": "task_3"
},
{
"description": "Develop responsive HTML, CSS, and JavaScript templates",
"id": "task_4"
},
{
"description": "Integrate marketing content and multimedia elements",
"id": "task_5"
},
...
],
"thoughts": {
"reasoning": "This plan considers the necessary steps for achieving the objective, starting with project scoping, planning, and functional requirements. Then, it moves on to the design and development phase, followed by optimization and testing. Finally, it concludes with deployment.",
"speak": "We'll start by defining the project scope and requirements, then move on to the design and development phase. We'll optimize the page for search engines and test for compatibility before deploying to a web hosting environment.",
"text": "This approach ensures a smooth progression from planning to deployment, while keeping our focus on engineering tasks and delivering a high-quality landing page for the fintech startup."
}
}
## Call engineer agent to complete task: ##
{
"task": {
"command": {
"args": {
"data": "Project Scope:\n1. Create a visually appealing landing page for the fintech startup\n2. Display marketing content that communicates the value proposition of the startup\n3. Be mobile responsive and accessible across various devices\n\nFunctional Requirements:\n1. Navigation menu\n2. Header - High-impact visual (image or video) with the startup's value proposition\n3. Features - Clean layout of core features and benefits\n4. How it works - Step-by-step process\n5. About us - Information about the startup and team\n6. Testimonials - User reviews and feedback\n7. Call-to-action - Sign up or request demo button\n8. Footer - Links to social media, terms, privacy policy, and contact information",
"fileDirectory": "docs",
"fileName": "project_scope_and_functional_requirements.txt"
},
"name": "WriteFile"
},
"description": "Define project scope and functional requirements",
"id": 0
},
"thoughts": {
"reasoning": "I have defined the project scope and functional requirements for a landing page for the fintech startup. I made sure to address relevant sections for the landing page and listed the related functional requirements.",
"speak": "I have defined the project scope and functional requirements, which will be saved in a file called 'project_scope_and_functional_requirements.txt' in the 'docs' directory.",
"text": "Project scope and functional requirements defined and ready to be saved"
}
}
## Call lead engineer agent to review task or ask a question: ##
{
"task": {
"command": {
"args": {
"data": "Project Scope:\n1. Create a visually appealing landing page for the fintech startup\n2. Display marketing content that communicates the value proposition of the startup\n3. Be mobile responsive and accessible across various devices\n\nFunctional Requirements:\n1. Navigation menu\n2. Header - High-impact visual (image or video) with the startup's value proposition\n3. Features - Clean layout of core features and benefits\n4. How it works - Step-by-step process\n5. About us - Information about the startup and team\n6. Testimonials - User reviews and feedback\n7. Call-to-action - Sign up or request demo button\n8. Footer - Links to social media, terms, privacy policy, and contact information",
"fileDirectory": "docs",
"fileName": "project_scope_and_functional_requirements.txt"
},
"name": "WriteFile"
},
"description": "Define project scope and functional requirements",
"id": 0,
"leadComments": "",
"leadSignOff": true
}
}
## Task completed! ##
## Starting next task ##
## Call engineer agent to complete task: ##
{
"task": {
"command": {
"args": {
"fileDirectory": "docs",
"fileName": "design_concept_plan.txt",
"data": "Design Concept Plan:\n1. Color scheme - Use the brand's primary colors along with white and gray for a clean and modern look\n2. Typography - Use a combination of serif and sans-serif fonts for better readability and visual hierarchy\n3. Navigation menu - Incorporate a sticky menu so that users can easily access different sections of the page\n4. Header - Use a background image or video that fits the fintech theme, overlay a clear value proposition statement and a prominent call-to-action button\n5. Features, How it works, and About us - Use clear and concise text, well-organized layout, and relevant icons or images\n6. Testimonials - Use a carousel or grid layout for displaying user reviews, include user photos and short quotes\n7. Call-to-action section - Re-emphasize the value proposition with attractive visuals and another prominent call-to-action\n8. Footer - Neatly present the social media links, terms, privacy policy, and contact information, using easy-to-read typography and subtle colors"
},
"name": "WriteFile"
},
"description": "Brainstorm and plan design concept for the landing page",
"id": 0
},
"thoughts": {
"reasoning": "I am brainstorming a design concept plan for the landing page that covers visual elements and layout, considering the requirements and objectives from the previous task. This plan will serve as a guide for the implementation of the landing page. It includes the color scheme, typography, navigation menu, header, content sections, and footer.",
"speak": "I have come up with a design concept plan that covers the visual elements and layout of the landing page. It is focused on achieving a clean and modern look while effectively communicating the startup's value proposition.",
"text": "Brainstorming and planning design concept for the landing page considering the project scope and functional requirements."
}
}
## Call lead engineer agent to review task or ask a question: ##
{
"task": {
"id": 0,
"description": "Brainstorm and plan design concept for the landing page",
"command": {
"name": "WriteFile",
"args": {
"data": "Design Concept Plan:\n1. Color scheme - Use the brand's primary colors along with white and gray for a clean and modern look\n2. Typography - Use a combination of serif and sans-serif fonts for better readability and visual hierarchy\n3. Navigation menu - Incorporate a sticky menu so that users can easily access different sections of the page\n4. Header - Use a background image or video that fits the fintech theme, overlay a clear value proposition statement and a prominent call-to-action button\n5. Features, How it works, and About us - Use clear and concise text, well-organized layout, and relevant icons or images\n6. Testimonials - Use a carousel or grid layout for displaying user reviews, include user photos and short quotes\n7. Call-to-action section - Re-emphasize the value proposition with attractive visuals and another prominent call-to-action\n8. Footer - Neatly present the social media links, terms, privacy policy, and contact information, using easy-to-read typography and subtle colors",
"fileDirectory": "docs",
"fileName": "design_concept_plan.txt"
}
},
"leadComments": "",
"leadSignOff": true
}
}
## Task completed! ##
## Starting next task ##
## Call engineer agent to complete task: ##
{
"task": {
"command": {
"args": {
"data": "Landing Page Wireframe and Mockup:\n\n1. Header:\n - Background image or video\n - Value proposition statement\n - Call-to-action button\n2. Navigation menu:\n - Sticky design\n - Links to Features, How it works, About us, Testimonials, and Call-to-action section\n3. Features:\n - 3 columns layout\n - Icon + short title + description\n4. How it works:\n - Step-by-step infographic or illustration\n - Numbered steps + short descriptions\n5. About us:\n - Company background + team members\n - Photos and short bios\n6. Testimonials:\n - Carousel or grid layout\n - User photo, name, and short quote\n7. Call-to-action section:\n - Re-emphasize value proposition\n - Visuals\n - Prominent call-to-action button\n8. Footer:\n - Social media links\n - Terms, Privacy policy, and Contact information",
"fileDirectory": "docs",
"fileName": "landing_page_wireframe_mockup.txt"
},
"name": "WriteFile"
},
"description": "Create wireframes and mockups for the landing page",
"id": 0
},
"thoughts": {
"reasoning": "Based on the design concept plan in the previous task, I've outlined the wireframe and mockup elements for the fintech startup's landing page. I will save this information in a text file.",
"speak": "I have created wireframes and mockups for the landing page, incorporating key elements from the design concept plan. These include the header, navigation menu, features, how it works, about us, testimonials, call-to-action section, and footer.",
"text": "Create wireframes and mockups based on the design concept plan and save it in a text file."
}
}
## Call lead engineer agent to review task or ask a question: ##
{
"task": {
"id": 0,
"description": "Create wireframes and mockups for the landing page",
"command": {
"name": "WriteFile",
"args": {
"data": "Landing Page Wireframe and Mockup:\n\n1. Header:\n - Background image or video\n - Value proposition statement\n - Call-to-action button\n2. Navigation menu:\n - Sticky design\n - Links to Features, How it works, About us, Testimonials, and Call-to-action section\n3. Features:\n - 3 columns layout\n - Icon + short title + description\n4. How it works:\n - Step-by-step infographic or illustration\n - Numbered steps + short descriptions\n5. About us:\n - Company background + team members\n - Photos and short bios\n6. Testimonials:\n - Carousel or grid layout\n - User photo, name, and short quote\n7. Call-to-action section:\n - Re-emphasize value proposition\n - Visuals\n - Prominent call-to-action button\n8. Footer:\n - Social media links\n - Terms, Privacy policy, and Contact information",
"fileDirectory": "docs",
"fileName": "landing_page_wireframe_mockup.txt"
}
},
"leadComments": "",
"leadSignOff": true
}
}
## Task completed! ##
## Starting next task ##
## Call engineer agent to complete task: ##
{
"task": {
"command": {
"args": {
"fileDirectory": "src",
"fileName": "index.html",
"data": "<!DOCTYPE html>\n<html lang='en'>\n<head>\n <meta charset='UTF-8'>\n <meta name='viewport' content='width=device-width, initial-scale=1.0'>\n <meta name='description' content='A fintech startup that helps user's analyze their financial data'>\n <title>Fintech Startup</title>\n</head>\n<body>\n<header></header>\n<nav></nav>\n<main>\n <section class='features-section'></section>\n <section class='how-it-works-section'></section>\n <section class='about-us-section'></section>\n <section class='testimonials-section'></section>\n <section class='call-to-action-section'></section>\n</main>\n<footer></footer>\n<script src='main.js'></script>\n</body>\n</html>"
},
"name": "WriteFile"
},
"description": "Develop responsive HTML, CSS, and JavaScript templates",
"id": 0
},
"thoughts": {
"reasoning": "I will first create the index.html file for the landing page based on the wireframe and mockup. The HTML file will contain a header, navigation menu, main content area (with sections for features, how it works, about us, testimonials, and CTA), and a footer.",
"speak": "Creating the HTML file for the landing page.",
"text": "Generate HTML template"
}
}
## Call lead engineer agent to review task or ask a question: ##
{
"task": {
"command": {
"args": {
"data": "<!DOCTYPE html>\n<html lang='en'>\n<head>\n <meta charset='UTF-8'>\n <meta name='viewport' content='width=device-width, initial-scale=1.0'>\n <meta name='description' content='A fintech startup that helps user's analyze their financial data'>\n <title>Fintech Startup</title>\n</head>\n<body>\n<header></header>\n<nav></nav>\n<main>\n <section class='features-section'></section>\n <section class='how-it-works-section'></section>\n <section class='about-us-section'></section>\n <section class='testimonials-section'></section>\n <section class='call-to-action-section'></section>\n</main>\n<footer></footer>\n<script src='main.js'></script>\n</body>\n</html>",
"fileDirectory": "src",
"fileName": "index.html"
},
"name": "WriteFile"
},
"description": "Develop responsive HTML, CSS, and JavaScript templates",
"id": 0,
"leadComments": "",
"leadSignOff": true
}
}
## Task completed! ##
## Starting next task ##
## Call engineer agent to complete task: ##
{
"task": {
"command": {
"args": {
"fileDirectory": "src",
"fileName": "index.html",
"data": "<!DOCTYPE html>\n<html lang='en'>\n<head>\n <meta charset='UTF-8'>\n <meta name='viewport' content='width=device-width, initial-scale=1.0'>\n <meta name='description' content='A fintech startup that helps user's analyze their financial data'>\n <title>Fintech Startup</title>\n <link rel='stylesheet' href='styles.css'>\n</head>\n<body>\n<header>\n <h1>Welcome to Fintech!</h1>\n</header>\n<nav>\n <ul>\n <li><a href='#features'>Features</a></li>\n <li><a href='#how-it-works'>How it works</a></li>\n <li><a href='#about-us'>About Us</a></li>\n <li><a href='#testimonials'>Testimonials</a></li>\n </ul>\n</nav>\n<main>\n <section class='features-section' id='features'>\n <h2>Features</h2>\n <p>Our platform provides the following features to help you analyze your financial data:</p>\n <ul>\n <li>Automated financial tracking</li>\n <li>Detailed spending reports</li>\n <li>Customizable budgeting tools</li>\n <li>Investment planning and tracking</li>\n </ul>\n </section>\n <section class='how-it-works-section' id='how-it-works'>\n <h2>How it works</h2>\n <ol>\n <li>Sign up for a free account</li>\n <li>Connect your financial accounts</li>\n <li>Analyze your spending habits and set goals</li>\n <li>Stay on track with personalized insights and recommendations</li>\n </ol>\n </section>\n <section class='about-us-section' id='about-us'>\n <h2>About Us</h2>\n <p>We are a team of financial experts and technologists passionate about helping you make smarter financial decisions. Our mission is to provide you with the tools and insights you need to optimize your finances and grow your wealth.</p>\n </section>\n <section class='testimonials-section' id='testimonials'>\n <h2>Testimonials</h2>\n <blockquote>\n <p>Fintech has changed the way I manage my finances for the better. 
The platform provides me with valuable insights that help me stay on track and meet my financial goals.</p>\n <footer>- Happy customer</footer>\n </blockquote>\n </section>\n <section class='call-to-action-section'>\n <h2>Ready to start optimizing your finances?</h2>\n <a href='#'>Sign up now</a>\n </section>\n</main>\n<footer>\n <p>© 2023 Fintech Startup. All rights reserved.</p>\n</footer>\n<script src='main.js'></script>\n</body>\n</html>"
},
"name": "WriteFile"
},
"description": "Integrate marketing content and multimedia elements",
"id": 0
},
"thoughts": {
"reasoning": "I have integrated marketing content for the fintech startup by adding headings, navigation, and descriptive text to the different sections of the landing page. This should clearly communicate the value proposition, features, and user experience to site visitors.",
"speak": "I have updated the HTML file with marketing content and integrated it into the appropriate sections. Users should now see a well-structured and informative landing page.",
"text": "I updated the HTML file with marketing content to make the landing page informative and well-organized for users, which covers the task objective."
}
}
## Call lead engineer agent to review task or ask a question: ##
{
"task": {
"command": {
"args": {
"data": "<!DOCTYPE html>\n<html lang='en'>\n<head>\n <meta charset='UTF-8'>\n <meta name='viewport' content='width=device-width, initial-scale=1.0'>\n <meta name='description' content='A fintech startup that helps user's analyze their financial data'>\n <title>Fintech Startup</title>\n <link rel='stylesheet' href='styles.css'>\n</head>\n<body>\n<header>\n <h1>Welcome to Fintech!</h1>\n</header>\n<nav>\n <ul>\n <li><a href='#features'>Features</a></li>\n <li><a href='#how-it-works'>How it works</a></li>\n <li><a href='#about-us'>About Us</a></li>\n <li><a href='#testimonials'>Testimonials</a></li>\n </ul>\n</nav>\n<main>\n <section class='features-section' id='features'>\n <h2>Features</h2>\n <p>Our platform provides the following features to help you analyze your financial data:</p>\n <ul>\n <li>Automated financial tracking</li>\n <li>Detailed spending reports</li>\n <li>Customizable budgeting tools</li>\n <li>Investment planning and tracking</li>\n </ul>\n </section>\n <section class='how-it-works-section' id='how-it-works'>\n <h2>How it works</h2>\n <ol>\n <li>Sign up for a free account</li>\n <li>Connect your financial accounts</li>\n <li>Analyze your spending habits and set goals</li>\n <li>Stay on track with personalized insights and recommendations</li>\n </ol>\n </section>\n <section class='about-us-section' id='about-us'>\n <h2>About Us</h2>\n <p>We are a team of financial experts and technologists passionate about helping you make smarter financial decisions. Our mission is to provide you with the tools and insights you need to optimize your finances and grow your wealth.</p>\n </section>\n <section class='testimonials-section' id='testimonials'>\n <h2>Testimonials</h2>\n <blockquote>\n <p>Fintech has changed the way I manage my finances for the better. 
The platform provides me with valuable insights that help me stay on track and meet my financial goals.</p>\n <footer>- Happy customer</footer>\n </blockquote>\n </section>\n <section class='call-to-action-section'>\n <h2>Ready to start optimizing your finances?</h2>\n <a href='#'>Sign up now</a>\n </section>\n</main>\n<footer>\n <p>© 2023 Fintech Startup. All rights reserved.</p>\n</footer>\n<script src='main.js'></script>\n</body>\n</html>",
"fileDirectory": "src",
"fileName": "index.html"
},
"name": "WriteFile"
},
"description": "Integrate marketing content and multimedia elements",
"id": 0,
"leadSignOff": true
}
}
## Task completed! ##
...
Observations:
The engineer agent completes each task without issues, but some tasks (the wireframes and the marketing copy, for example) would likely turn out better with input from a properly defined product management agent.
The lead engineer signs off on all tasks without any comment. From my testing with gpt-4, this was the usual result. A more aggressive prompt could force the lead engineer to comment on each task and improve outputs (even if unnecessary).
The outputs are still pretty minimal. Additional style guides and tools could steer the agent toward more creative and complete code.
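One way to act on the second observation without rewriting the prompt is to check the lead engineer's output in code and send the task back when the sign-off arrives with no comment. The following is a minimal sketch, not the code used in this experiment: the Review struct only mirrors the leadComments and leadSignOff fields visible in the logs above, and NeedsAnotherReview is a hypothetical policy function.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Review decodes only the fields needed for the check. The field names
// follow the lead engineer output in the logs; the struct itself is an
// assumption about how that output could be modeled.
type Review struct {
	Task struct {
		ID           int    `json:"id"`
		Description  string `json:"description"`
		LeadComments string `json:"leadComments"`
		LeadSignOff  bool   `json:"leadSignOff"`
	} `json:"task"`
}

// NeedsAnotherReview is a hypothetical policy: a sign-off only counts
// when it comes with a non-empty comment. Anything else goes back to
// the lead engineer for another pass.
func NeedsAnotherReview(raw string) (bool, error) {
	var r Review
	if err := json.Unmarshal([]byte(raw), &r); err != nil {
		return false, fmt.Errorf("decode review: %w", err)
	}
	hasComment := strings.TrimSpace(r.Task.LeadComments) != ""
	return !r.Task.LeadSignOff || !hasComment, nil
}

func main() {
	// A silent sign-off, like the ones gpt-4 produced in the logs.
	silent := `{"task":{"id":0,"description":"Generate HTML template","leadComments":"","leadSignOff":true}}`
	again, err := NeedsAnotherReview(silent)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("send back to lead:", again) // prints "send back to lead: true"
}
```

A check like this keeps the rule outside the prompt, so the lead engineer prompt can stay short while the loop still refuses silent approvals. A retry limit would be needed in practice to avoid looping forever.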
Other Improvements
This experiment involved a single LLM instructed to act as different types of agents, so every agent draws on the same knowledge base (the model’s training data). One potential improvement is to give each agent its own knowledge base and context tied to its identity, for example through embeddings, so it can better complete its objective. Alternatively, using multiple LLMs, each fine-tuned for a very specific task, could also lead to better results.
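To make the idea of a per-agent knowledge base concrete, here is a minimal sketch in which each agent holds its own snippets and picks the one closest to a query by cosine similarity. The Agent and Snippet types and the BestContext method are hypothetical, and the vectors are hand-written placeholders; in a real setup they would come from an embedding model.

```go
package main

import (
	"fmt"
	"math"
)

// Snippet is one piece of agent-specific knowledge with its embedding.
type Snippet struct {
	Text   string
	Vector []float64 // placeholder values, not real embeddings
}

// Agent owns a private knowledge base tied to its role.
type Agent struct {
	Role      string
	Knowledge []Snippet
}

// cosine returns the cosine similarity of two vectors, or 0 when the
// lengths differ or either vector has zero length.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// BestContext returns the snippet most similar to the query vector.
// The boolean is false when the agent has no knowledge at all.
func (a Agent) BestContext(query []float64) (Snippet, bool) {
	best, bestScore, found := Snippet{}, math.Inf(-1), false
	for _, s := range a.Knowledge {
		if score := cosine(query, s.Vector); score > bestScore {
			best, bestScore, found = s, score, true
		}
	}
	return best, found
}

func main() {
	engineer := Agent{Role: "engineer", Knowledge: []Snippet{
		{Text: "Use semantic HTML5 sections", Vector: []float64{0.9, 0.1, 0.0}},
		{Text: "Keep CSS in styles.css", Vector: []float64{0.1, 0.9, 0.0}},
	}}
	query := []float64{0.8, 0.2, 0.1} // stands in for an embedded task description
	if s, ok := engineer.BestContext(query); ok {
		fmt.Println(engineer.Role, "context:", s.Text)
	}
}
```

The selected snippet would then be added to that agent's prompt only, so the engineer and the lead engineer no longer work from identical context.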
A recent paper describes a technique, Tree of Thoughts (ToT), for getting better outputs from LLMs. The currently popular modes of interacting with LLMs are input/output (for simple question and answer) and Chain of Thought (CoT) (for more complex questions). With CoT, the LLM maintains coherence by considering the previous context (a sequence of steps) while completing an objective, similar to the task breakdown and previous task context in the examples above. ToT instead lets the LLM explore multiple paths through a tree, where each node is a partial solution and each path is a sequence of such nodes.
“A specific instantiation of ToT involves answering four questions: 1. How to decompose the intermediate process into thought steps; 2. How to generate potential thoughts from each state; 3. How to heuristically evaluate states; 4. What search algorithm to use.”
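The four questions map onto a small search loop. The sketch below is a toy illustration under my own assumptions, not the paper's implementation: a thought step is picking one number, generate and evaluate are plain functions standing in for LLM calls, and the search is breadth-first with a beam. The goal is to pick three numbers that sum to a target.

```go
package main

import (
	"fmt"
	"sort"
)

// State is a partial solution: the thoughts chosen so far (question 1:
// each thought step is the choice of one number).
type State struct {
	Steps []int
	Sum   int
}

// generate proposes the next thoughts from a state (question 2).
// In an agent this would be an LLM call; here it tries every choice.
func generate(s State, choices []int) []State {
	next := make([]State, 0, len(choices))
	for _, c := range choices {
		steps := append(append([]int{}, s.Steps...), c)
		next = append(next, State{Steps: steps, Sum: s.Sum + c})
	}
	return next
}

// evaluate scores a state heuristically (question 3): the closer the
// running sum is to the target, the higher the score.
func evaluate(s State, target int) int {
	d := target - s.Sum
	if d < 0 {
		d = -d
	}
	return -d
}

// Search is a breadth-first search that keeps only the best `beam`
// states at each depth (question 4).
func Search(choices []int, target, depth, beam int) State {
	if beam < 1 {
		beam = 1
	}
	frontier := []State{{}}
	for i := 0; i < depth; i++ {
		var candidates []State
		for _, s := range frontier {
			candidates = append(candidates, generate(s, choices)...)
		}
		if len(candidates) == 0 {
			break // nothing to expand, keep the previous frontier
		}
		sort.SliceStable(candidates, func(a, b int) bool {
			return evaluate(candidates[a], target) > evaluate(candidates[b], target)
		})
		if len(candidates) > beam {
			candidates = candidates[:beam]
		}
		frontier = candidates
	}
	return frontier[0]
}

func main() {
	choices := []int{2, 3, 5}
	greedy := Search(choices, 10, 3, 1) // a single path, like a chain
	tree := Search(choices, 10, 3, 2)   // two paths kept alive per level
	fmt.Println(greedy.Steps, greedy.Sum) // prints "[5 5 2] 12"
	fmt.Println(tree.Steps, tree.Sum)     // prints "[5 3 2] 10"
}
```

With a beam of one the search commits to the best-looking step each time and overshoots the target. Keeping a second path alive is enough to find the exact answer, which is the intuition behind exploring several partial solutions instead of one chain.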
I am sure there will be more research and discoveries in the coming months.
Conclusion
This was a brief experiment with multiple agents working together towards a shared objective. It also shows how LLM outputs can potentially be improved with “self-refinement” techniques. The results are varied, but I hope this offers insight into what is possible and inspires new ideas for solving today’s challenges when building agents.
I am looking to interview startups and engineers building with LLMs. Reach out: samir@sudoapps.com.