In this episode of In-Ear Insights, the Trust Insights podcast, Katie and Chris discuss the evolution of autonomous AI agents and the promise of Level 5 systems. You’ll discover the core differences between simple AI tools and complex agent swarms with persistent memory. You’ll learn how to apply the 5P framework to prevent your AI projects from spiraling out of control. You’ll explore the real-world benefits and risks of offloading high-level executive decision-making to software. You’ll see how these emerging technologies change the way you manage workflows and team communication.
00:00 – Introduction
02:15 – Defining Level 5 Agentic AI
05:30 – The 5P Framework and AI Guardrails
08:45 – Managing Virtual CEO Agents
12:20 – Risks of Autonomous Systems
15:10 – Why Persistent Memory Matters
18:40 – Call to action
Watch this episode to decide if your business is ready for autonomous AI agents.
- Need help with your company’s data and analytics? Let us know!
- Join our free Slack group for marketers interested in analytics!
Machine-Generated Transcript
What follows is an AI-generated transcript. The transcript may contain errors and is not a substitute for listening to the episode.
Christopher S. Penn: In this week’s In-Ear Insights, let’s talk about the fifth level of agentic AI enablement. Agentic AI is level three, which includes tools like Claude Code and Claude Cowork. Level four is tools like Hermes Agent, which we’ve covered on past episodes of the podcast and the live stream, which you can check out at TrustInsights.ai/YouTube. This week, let’s talk about level five systems.
Level five systems are systems that have persistent memory, meaning that they can act fully autonomously. The most recent generation of AI models like Claude Opus 4.8 and the brand-new MiniMax M3 allow for this to happen. So, Katie, one of the most popular development systems for this is a system called Paperclip, which allows you to effectively give a virtual CEO a mission. It goes off and starts hiring other people (virtual people) inside its little environment to build its own little agency to go and do something, which is sort of where level five is. We’ve talked about this in the past. Based on what I’ve roughly just described, I imagine you probably have some questions.
Katie Robbert: Questions, reactions. First question, why? I think the thing, my struggle with a lot of this is, and I know it’s very cliché, but just because you can doesn’t mean you should. I know we’re talking about level five of AI, but is that really necessary? That’s sort of where I’m stuck. I understand that there’s probably some really good use cases for that specific level. My challenge is that people who don’t need that level of technology, who are maybe level one or level two, maybe don’t even need that, are ditching everything they’ve ever known to go try to reach level five. That’s just a little bit about where my head’s at to sort of set the stage for my questions. I guess my first question is, why this software? Why not something else? What is so special about Paperclip that I can’t do with anything else that I’m already paying for?
Christopher S. Penn: That’s a good question. Paperclip is one of those tools that is hot within the AI dev community because it’s the first of its kind. Just like OpenClaw was the first of a kind. It doesn’t mean it’s the best and it isn’t even the only one anymore. A lot of other providers are starting to catch up. For example, we talked about Hermes Agent on the live stream a couple of weeks ago and in that time people have looked at it and said, “This is really cool, this is better than OpenClaw.”
The concept of these autonomous employees has leveled up to the point where Anthropic has come out and said, “Here is Claude Managed Agents,” because clearly you all want this, but maybe you’d like some governance around it. So here is our version of OpenClaw. Every provider—Google announced at I/O Gemini Spark, which is their version of OpenClaw. So they are starting to get to that level four. Paperclip is the first of its kind in the level five setup, if you will. It’s janky. We’re going to see it on the live stream this week. It is not particularly user-friendly, it’s super buggy. But it is the first of its kind that allows you to do this kind of high-level abstraction. So to play off of where you were starting with this, let’s put it in the 5P framework.
Katie Robbert: For those who don’t know, the 5P framework by Trust Insights is Purpose, People, Process, Platform, Performance. Purpose: What is the question you’re answering? What is the problem you’re solving? People: Who’s involved both internally and externally to your organization? Process: How are you doing this? This is your standard operating procedures. Platform: What tools are you using? And Performance: Did you achieve the output that you outlined? If you want to learn more, go to TrustInsights.ai/5P-framework. So, Chris, I think the big question I have is what is the purpose of it? When I hear you say it’s hot in the dev community, I’ll be honest, having worked in dev for a long time, that means nothing to me. You guys are all about the shiny object, the shortcuts, the “let me do this faster so that I can hustle and create more.” I don’t buy into it. So I need a better reason to pay attention to Paperclip other than devs are excited about it.
Christopher S. Penn: The number one use case right now that I use it for is R&D. If I give the virtual CEO of my nascent little AI agents here—which I’ve named Katie, with an AI in the middle—a remit, like, “You are the CEO of an events management company and I want you to go out and get everything set up to run a workshop in the Boston area.” You have to go and hire a team to find a venue, to get pricing, catering, logistics, transportation, all the stuff that you normally hire a third-party human agency to do. I would say, “Why don’t you, the machine, go and set yourself up with what you need to do?”
The CEO agent starts and says, “Well, I’m the CEO, I’m not going to do all this work by myself. I can hire up to however many agents your environment permits. I’m going to go and hire a salesperson, a marketing person, an operations person, a finance person,” and so on. It goes off and does these things, and ideally, what comes back is a fully baked plan, or depending on the system and your goals and objectives, an actual full operation. One of the questionable use cases I’ve seen in production is a CEO of a very speculative investment firm that looks at Polymarket and says, “Okay CEO, you have to turn $25 into $100. Go and figure out how to bet on Polymarket and hire all the statisticians you need to build the predictive algorithms to decide what you should bet on.” It would go off and do it, and if it has access to the Polymarket API which you would give it, it will go and actually start placing those bets. That’s a more tangible outcome.
Katie Robbert: What strikes me is that these CEOs are being programmed by people who aren’t CEOs and aren’t willing to pay for a CEO. That’s my hang-up. If you’re not a CEO, it’s not a mythical role that people can’t do; there is something to be said about having the actual experience to manage a company the way a good, solid CEO does. Not a sketchy “I just gave myself the title and now I can make all kinds of risky, bad decisions.” That sounds incredibly dangerous. I saw one of the conversations in our free Slack community this morning that you were having with one of our community members, Chris. If you want to join the free Slack community, go to TrustInsights.ai/analyticsformarketers; it’s totally free to join. You can see Chris and me argue all day, like we were about to this morning, where he said AI is replacing the CEO. I was like, “Hi, still here.”
The way that you’re describing a CEO being designed is so sketchy, so risky, and so dangerous. We’ve been saying this about AI since the jump: there are so many dangerous parts to the way people are using AI in a very uneducated and impulsive way. If you are just, “Well, I don’t have a CEO and I’ve never been a CEO, but let me go ahead and stand up a virtual CEO,” that to me is a problem.
Christopher S. Penn: In the context of that discussion, I was discussing token usage and how one company burned $500 million of Anthropic Claude credits. Our community member Todd commented that putting a company out of business is the CEO’s job. I said, “Well, AI has replaced the CEO to make the company go bankrupt here.” We didn’t clarify it was a good CEO; it was just a CEO. There are plenty of folks who are really not good.
Katie Robbert: But I think that’s the distinction I want to draw. We’re talking about how you can build your own virtual agency and your virtual CEO, and the CEO can hire people. That doesn’t mean that it’s any good. I want to be 100% clear about that because I don’t want to give the wrong information to people that this is a good replacement for a human.
Christopher S. Penn: Correct. Unless you have done all the research and done the planning and you have built your system using the 5P framework by Trust Insights where at least there are more guardrails. That’s really what the 5P framework does best: it puts guardrails around AI to say, “This is the purpose, this is how you perform, this is your definition of done in process, and these are the things you shouldn’t do because we know it’s going to go off the rails if you do X, Y, and Z.” If you don’t have that, then I 100% agree with you, Katie; that system is going to go off the rails faster than Keith Richards driving at Formula One.
Katie Robbert: Yeah, don’t try to do the analogy thing. The other question I have: you talked about how on this week’s live stream (which, as Chris mentioned, you can find at TrustInsights.ai/YouTube; we do it every week at 1:00 PM Eastern on Thursdays) we’re going to be talking about how you used Paperclip to set up this virtual agency. I’m going to have a million questions at that point. Without getting too deep into the live stream, did you just say, “Hey CEO, go find a way to do this workshop,” or did you actually give it some structure to say, “This is a budget and these are the guardrails”? How autonomous can one of these systems really be? How much do you have to instruct it versus how much can it just figure out on its own?
Christopher S. Penn: If you let it figure things out on its own, it tends to create the highest probability things based on the underlying model. When I do it, I follow the 5P framework. I use our own tools like our Prompt-to-Skill. The CEO needs three things. One, the CEO needs a character card—who is this person and how do they think? Second, a project plan. AI only knows what you tell it a lot of the time. A model has an expansive view of a topic, but it only understands high probability. If I don’t tell it that it is a core requirement that this workshop be within a quarter-mile of public transit, it doesn’t know that. I need a project plan. Third, a work plan—the steps you should take to bring the project plan to life. It’s like software development: a requirements document, a technical specification, and a work plan. Those three documents give AI the guardrails it needs to say, “I’m going to go follow this plan.”
Could a machine do that itself? Yes, but very often with agentic software, you’re using small, fast models. Small and fast models are also cheap, which is why we use them because agents can do hundreds of millions of token calls. But they’re not particularly bright, which means they can get caught in loops and chase their own tail. If you don’t spend the time upfront to do that planning, you’re in for a bad time.
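As a rough illustration of the three documents described above (character card, project plan, work plan), here is a minimal Python sketch of a pre-flight check that refuses to start if any of them is missing or empty. The class names, fields, and the `missing_documents` helper are all hypothetical and are not part of Paperclip or any other tool mentioned in the episode.

```python
from dataclasses import dataclass, field


@dataclass
class CharacterCard:
    """Who the agent is and how it thinks (hypothetical structure)."""
    name: str
    role: str
    decision_style: str


@dataclass
class ProjectPlan:
    """What must be true of the finished project."""
    goal: str
    requirements: list[str] = field(default_factory=list)


@dataclass
class WorkPlan:
    """Ordered steps that bring the project plan to life."""
    steps: list[str] = field(default_factory=list)


def missing_documents(card, project, work):
    """Return the names of guardrail documents that are absent or empty."""
    missing = []
    if card is None:
        missing.append("character card")
    if project is None or not project.requirements:
        missing.append("project plan")
    if work is None or not work.steps:
        missing.append("work plan")
    return missing


card = CharacterCard(
    name="Workshop CEO",
    role="CEO of an events management company",
    decision_style="cautious, budget-aware",
)
project = ProjectPlan(
    goal="Run a workshop in the Boston area",
    requirements=["Venue within a quarter-mile of public transit"],
)
work = WorkPlan(steps=[])  # deliberately empty to show the check firing

gaps = missing_documents(card, project, work)
print(gaps)
```

The point of the sketch is only that a requirement such as the public transit distance has to be written down somewhere the agent can read it; the check does nothing clever beyond confirming the paperwork exists before any tokens are spent.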
Katie Robbert: Here’s where I’m confused. In every other context, we’ve said you have to build that plan, do the project plan, and do all that upfront work; that’s not new. Why is this suddenly level five of AI when we’ve been doing this all along? You could do this with Claude Cowork or, technically, if you already had the project plan, followed the 5Ps, and had the requirements, you could give it to something more basic like a Google Gemini and say, “Is this feasible? Help me fill in the pieces and do the research.” Why is this appropriate for level five versus level two?
Christopher S. Penn: Two things. Number one, systems like Paperclip and even Hermes Agent now have persistent memory. They remember from chat to chat, session to session. More importantly, the sub-agents themselves remember what each other are doing, which is not the case in tools like Claude Code. When Claude Code spins up sub-agents, one sub-agent does not know what the others are working on. They can repeat work, clobber each other, and cause issues.
The second thing—and Claude Code now supports this as they move up the hierarchy—you give it a definition of done and it goes off and does it. In version 2.1.14, we have a “goal” command that says, “Work on this project until you have completed all of the work plan.” It goes into auto mode until it meets the definition of done. In a system like Paperclip, it’s even more abstracted; you give it a PRD, a spec, and a work plan, and then you walk away. It doesn’t ask you questions anymore; it just does it based on your plans. A lot of people are doing what you described, which is very naive: prompts like, “Oh, go build me this thing.” You end up with a final product that may not be what you want.
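To make the "work until the definition of done is met" idea concrete, here is a minimal Python sketch of such a loop. It is not how Claude Code or Paperclip implement it; `run_until_done`, `fake_step`, and the `max_steps` cap are assumptions made for illustration, with the cap standing in for the kind of guardrail that stops a confused agent from looping forever.

```python
def run_until_done(work_plan, do_step, max_steps=20):
    """Work through the plan until every step is complete or the budget runs out.

    work_plan: list of step names; "done" here means every step has succeeded.
    do_step:   hypothetical callable that attempts one step, returns True on success.
    max_steps: hard cap on attempts so a stuck agent cannot loop forever.
    """
    remaining = list(work_plan)
    attempts = 0
    while remaining and attempts < max_steps:
        step = remaining[0]
        attempts += 1
        if do_step(step):
            remaining.pop(0)  # step finished, move on to the next one
    return {"done": not remaining, "attempts": attempts, "remaining": remaining}


# Stand-in for a real agent call: "book catering" fails once before succeeding.
failures_left = {"book catering": 1}


def fake_step(step):
    if failures_left.get(step, 0) > 0:
        failures_left[step] -= 1
        return False
    return True


result = run_until_done(["find venue", "get pricing", "book catering"], fake_step)
print(result)
```

Without a work plan there is nothing for the loop to check against, which is the gap a bare "go build me this thing" prompt leaves open.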
Katie Robbert: This goes back to the 5Ps. If you have a clear purpose, people, process, platform, and a clear performance—the definition of done—you wouldn’t need to do that. You could just say, “I want a website that does X.” If you don’t give it all the rest of the information, it could go on forever and burn out all your tokens. Congratulations on the “goal” command, but you still need the rest of the information. I’m not trying to be combative; I’m genuinely confused as to why this is something people need to pay attention to. I’m biased because I’ve always been a “do the work upfront before you touch the software” kind of person.
Christopher S. Penn: It’s a thing because people don’t do that planning upfront, and they are trying to build AI systems that will do it for them and infer their intent. The 5P framework allows you to be flexible across the entire spectrum of AI enablement, from level one, working in ChatGPT, all the way to level five, where I hand off a completed project plan and have the machine do it. Each stage builds on the previous one. You have to know good prompting to build a Gem, know standard operating procedures to build an agent, know what “agentic” means to build an agent cluster, and know all of that to build a Paperclip agent swarm.
They all have to run by the 5Ps. What software developers and tech folks are doing is trying to skip the 5Ps and have the tools do it for them. That is good for Trust Insights because it means that when we build with these tools, they will perform head and shoulders above everything else in the market.
Katie Robbert: I just take away “tech for tech’s sake.” I am the skeptic and the laggard. I get that people want to move faster and make more money (hustle culture), but I don’t buy into it. I don’t want to knock cool, useful tools that improve quality of life and job quality, like Claude Cowork, which has overhauled the way I approach work. What I’m skeptical about is the use cases, the way people use these things in an unwieldy, risky way. Our account manager, Kelsey, thinks I’m a stick in the mud, but there’s a good reason for that. I’m risk-averse. I’m having a hard time wrapping my head around randomly standing up a CEO to create a virtual agency. I’m just negative today, man. I’m sorry, I’m not buying it.
Christopher S. Penn: That’s okay. These tools, especially Paperclip and anything in the level five category, are very immature. They are not ready for production and not well-hardened to security risks. The concept is there: an agent system with persistent memory that remembers what you’ve done, what you’ve told it, and can take higher-level directives. The plumbing of the inter-process communication is a technical thing everything will benefit from. Will this particular software be the leader like OpenClaw was? We don’t know. But it’s not bad to see where it is now so we can see what will likely filter down to the mainstream in three to six months.
Katie Robbert: The way you described it, I can buy into it more, especially the interconnected piece. If you have agents that are siloed, you’re just describing an everyday organization—why recreate that with technology? The problem we at Trust Insights solve is basic things like communication between teams and not replicating processes. That concept is fantastic. Where we started, standing up a random CEO to execute a half-baked plan, I’m not into. Getting more communication across disciplines and breaking down silos? I’m here for it all day.
Christopher S. Penn: Another thing persistent memory systems will know: let’s say you’re working on a slide deck and you have to change the template. Once you get the agent to do it, wouldn’t it be nice the next time you start up if it asked, “By the way, do you need to change templates? I’m going to ask you this now because I remember how much trouble it was.” That’s what the persistent memory offers.
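The slide deck example can be sketched in a few lines of Python: a lesson written to disk in one session is read back in the next, so the system can raise the template question up front. This is a minimal illustration assuming a plain JSON file as the memory store; the `LessonMemory` class and its methods are hypothetical and do not describe how Paperclip or Hermes Agent store memory.

```python
import json
import tempfile
from pathlib import Path


class LessonMemory:
    """Tiny file-backed memory: lessons saved in one session are available in the next."""

    def __init__(self, path):
        self.path = Path(path)
        # Load whatever earlier sessions wrote, or start empty.
        if self.path.exists():
            self.lessons = json.loads(self.path.read_text(encoding="utf-8"))
        else:
            self.lessons = {}

    def remember(self, task_type, lesson):
        """Record a lesson for a kind of task and persist it immediately."""
        self.lessons.setdefault(task_type, []).append(lesson)
        self.path.write_text(json.dumps(self.lessons, indent=2), encoding="utf-8")

    def reminders_for(self, task_type):
        """Return the lessons previously recorded for this kind of task."""
        return list(self.lessons.get(task_type, []))


memory_file = Path(tempfile.mkdtemp()) / "agent_memory.json"

# Session 1: the template change was painful, so the agent records it.
session_one = LessonMemory(memory_file)
session_one.remember("slide deck", "Ask about the template before building slides.")

# Session 2: a fresh object reads the same file and can ask the question up front.
session_two = LessonMemory(memory_file)
reminders = session_two.reminders_for("slide deck")
print(reminders)
```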
Katie Robbert: That would be nice. If I can get a reminder from the system I’m using to build it, that’s worth all the money in the world. Changing a PowerPoint template after you’ve built the PowerPoint? God help you. It is not fun.
Christopher S. Penn: Exactly. That’s where a system like Paperclip and the infrastructure it’s building will filter down to the mainstream so your everyday AI tools become smarter and more context-aware. The product needs a lot of polish, but the ideas are sound.
Katie Robbert: I’m glad we talked about what Paperclip is and the concept behind it, so when we get to the live stream on Thursday, we can dive right into how to do the thing and build the thing.
Christopher S. Penn: That’s going to do it for today’s episode. If you have thoughts, pop by our free Slack at TrustInsights.ai/analyticsformarketers, where you and over 4,700 other marketers and business folks are asking and answering questions. Wherever you watch or listen, you can find us at TrustInsights.ai/TIPodcast. Thanks for tuning in, talk to you on the next one.
Need help with your marketing AI and analytics?
Get unique data, analysis, and perspectives on analytics, insights, machine learning, marketing, and AI in the weekly Trust Insights newsletter, INBOX INSIGHTS. Subscribe now for free; new issues every Wednesday!
Want to learn more about data, analytics, and insights? Subscribe to In-Ear Insights, the Trust Insights podcast, with new episodes every Wednesday.
Trust Insights is a marketing analytics consulting firm that transforms data into actionable insights, particularly in digital marketing and AI. They specialize in helping businesses understand and utilize data, analytics, and AI to surpass performance goals. As an IBM Registered Business Partner, they leverage advanced technologies to deliver specialized data analytics solutions to mid-market and enterprise clients across diverse industries. Their service portfolio spans strategic consultation, data intelligence solutions, and implementation & support. Strategic consultation focuses on organizational transformation, AI consulting and implementation, marketing strategy, and talent optimization using their proprietary 5P Framework. Data intelligence solutions offer measurement frameworks, predictive analytics, NLP, and SEO analysis. Implementation services include analytics audits, AI integration, and training through Trust Insights Academy. Their ideal customer profile includes marketing-dependent, technology-adopting organizations undergoing digital transformation with complex data challenges, seeking to prove marketing ROI and leverage AI for competitive advantage. Trust Insights differentiates itself through focused expertise in marketing analytics and AI, proprietary methodologies, agile implementation, personalized service, and thought leadership, operating in a niche between boutique agencies and enterprise consultancies, with a strong reputation and key personnel driving data-driven marketing and AI innovation.