Notes from the AutoGPT Frontier

Today is a bank holiday, so I had some time to peruse my AI and ML backlog and ended up playing with SuperAGI, which is a cute (but hilariously misnamed) tool that tries to provide an AutoGPT-like harness and give GPT access to web search, scraping tools, etc.

I liked the looks of it and the fact that it uses Celery as a task pool (plus the overall codebase looks reasonable, although it seems to suffer from an excess of OOP), so I got it running inside a sandbox and asked it to summarize the top ten Hacker News posts.

Here’s what it came up with, verbatim from the next-to-last step (before it wrote the summary to a file):

Tool ThinkingTool returned: Problem: Summarize the top ten items on the homepage of https://news.ycombinator.com.

Variables:
- Website: https://news.ycombinator.com
- Top ten items: articles or posts displayed on the homepage of the website
- Summary: a brief description or overview of each of the top ten items

Approach:
1. Visit the website https://news.ycombinator.com
2. Identify the top ten items displayed on the homepage
3. Read each item and extract the main idea or topic
4. Write a brief summary for each item, highlighting the key points
5. Organize the summaries in a clear and concise manner

Decision:
Since the task requires summarizing the top ten items on the homepage, I will focus on the articles or posts that are displayed on the main page. I will not include any ads or other content that may appear on the website. I will also aim to provide a balanced summary of each item, highlighting the main idea without being too lengthy or too brief.

Summary:
1. "The Future of Remote Work" - an article discussing the impact of remote work on businesses and employees, with insights from industry experts.
2. "Ask HN: What's your favorite online course?" - a forum post where users share their favorite online courses and resources for learning.
3. "The Rise of the Electric Scooter" - an article exploring the popularity of electric scooters as a mode of transportation in cities around the world.
4. "Show HN: A tool for creating beautiful color palettes" - a showcase of a web-based tool that helps users create color palettes for design projects.
5. "Why We Sleep and Why We Often Can't" - an article discussing the importance of sleep for physical and mental health, and the reasons why many people struggle to get enough sleep.
6. "Ask HN: What are your favorite productivity tools?" - a forum post where users share their favorite tools and strategies for staying productive.
7. "The Dark Side of Open Source" - an article exploring the potential downsides of open source software, including security risks and community conflicts.
8. "Show HN: A tool for generating realistic faces using AI" - a showcase of a web-based tool that uses artificial intelligence to generate realistic human faces.
9. "The Power of Good Design" - an article discussing the importance of good design in products and services, and the impact it can have on user experience and business success.
10. "Ask HN: What are some good books on entrepreneurship?" - a forum post where users share their recommendations for books on entrepreneurship and business.

The thing is, NONE OF THOSE ARTICLES ACTUALLY EXISTED. The HN home page looked nothing like it.

Somehow gpt-3.5-turbo decided to hallucinate them instead of using the web scraper, which just goes to show you how stupefyingly fallible these things are right now.

Glass-half-full kind of people will love the self-prompting aspect of this kind of approach and wax lyrical about how AI is constructing “world models” to accomplish goals, but the truth is that we need not only more reliable models but also a way to identify plausible hallucinations, and gpt-4 isn’t that much better.
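
To make "identify plausible hallucinations" a bit more concrete, here is a minimal sketch of the crudest possible grounding check: given the raw HTML the scraper was supposed to fetch, confirm that every title the model claims to have summarized actually appears in it. This is my own illustration, not anything SuperAGI does; the function names are hypothetical, and the regex-based tag stripping is a deliberate shortcut rather than real HTML parsing.

```python
import html
import re


def strip_tags(page_html: str) -> str:
    """Crude tag removal; good enough for a sketch, not for production."""
    return re.sub(r"<[^>]+>", " ", page_html)


def normalize(text: str) -> str:
    """Unescape HTML entities, collapse whitespace, and lowercase."""
    text = html.unescape(text)
    return re.sub(r"\s+", " ", text).strip().lower()


def ungrounded_titles(claimed_titles, page_html):
    """Return the claimed titles that do not appear in the page text.

    Hypothetical helper: a non-empty result means the model is describing
    items that were never on the page it was asked to summarize.
    """
    page_text = normalize(strip_tags(page_html))
    return [t for t in claimed_titles if normalize(t) not in page_text]


if __name__ == "__main__":
    # Stand-in for a scraped page; a real run would use the fetched HTML.
    sample_html = '<a href="x">Show HN: My  Tool</a><a>Real &amp; True</a>'
    claimed = ["Show HN: My Tool", "The Future of Remote Work", "Real & True"]
    print(ungrounded_titles(claimed, sample_html))
```

A check like this only catches invented titles, not invented summaries of real ones, and it does nothing if the agent never fetched the page in the first place. Still, run against the real HN home page, it would presumably have flagged all ten items above.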
