So What? Marketing Analytics and Insights Live
airs every Thursday at 1 pm EST.
You can watch on YouTube Live. Be sure to subscribe and follow so you never miss an episode!
In this episode, we explore the world of autonomous AI agents. This shift allows you to transform a standard large language model into a virtual employee capable of managing complex, long-running tasks. By mastering the Hermes Agent, you’ll discover how a self-learning system writes its own skills to automate repetitive workflows. This unique capability means your digital assistant becomes more efficient the more you use it. Setting up these powerful tools in a secure environment ensures you scale output without compromising primary work systems.
Watch the video here:
Can’t see anything? Watch it on YouTube here.
In this episode you’ll learn:
- What fully autonomous AI agents are and why they matter
- What prerequisites you need for autonomous AI agents
- How to create guardrails that generate success with autonomous AI agents
Transcript:
What follows is an AI-generated transcript. The transcript may contain errors and is not a substitute for listening to the episode.
Katie Robbert – 00:36
Well, hey everyone. Happy Thursday. Welcome to So What, the Marketing Analytics and Insights live show. I’m Katie, joined by Chris and John. Howdy, fellas.
John Wall – 00:44
Hello.
Katie Robbert – 00:50
This week we are talking about how to get started with Hermes—not the luxury brand, but Hermes Agent and autonomous AI agents. This is going to be a tough one for me because for all of the years I’ve been on this planet, I’ve always known H-E-R-M-E-S to be pronounced “Air-mez,” which is the luxury brand. But now this is “Her-meez,” which just sounds wrong—but anyway, here we are.
Part of what we want to know is what the heck is it? Chris, you’ve been talking about the Hermes large language model and the Hermes Agent for about a month, if I recall correctly, and I never stopped to ask you what it is. So, what is it?
Christopher Penn – 01:41
We’ve talked about it—and if you haven’t heard it already, it’s in the Trust Insights newsletter, on the Trust Insights podcast, and previous episodes of the Trust Insights live stream, all of which you can get at TrustInsights.ai.
The five levels of AI: level one is where you’re the ChatGPT copy-paste monkey. Level two is Gems and GPTs—standard operating procedures. Level three is agents, systems like Claude Code or Claude Cowork, where you go from being an individual contributor to being a manager.
Level four is where autonomous agents sit. These are like employees—almost like a virtual person of sorts. This is a whole family, the most famous of which is Peter Steinberger’s Open Claw, which came out in December and just shook the world. OpenAI bought that entire thing as quickly as they could, but it spawned a huge number of copies. There is Nemo Claw from NVIDIA, DeepFlow from ByteDance, and there’s Hermes Agent from Nous Research.
They’re all pretty much the same thing. They are a more autonomous wrapper around a language model of some kind. As with all these, you could swap in different models. You could use Google Gemini or Claude Opus if you wanted to and you like paying thousands of dollars in monthly bills.
This thing is a harness; it’s basically a collection of apps. The reason why I like Hermes Agent over the other ones right now—and why Anthropic, for example, this week released some features in their own agents that copy some of the Hermes Agent—is that Hermes Agent is self-learning. As you do more with it, it writes its own skills and plugins. It figures out, “Oh, you keep asking me to do this thing, so I’m just going to turn it into a skill because it’s stupid for you to keep asking me the same thing over and over again.” That’s what this thing is—the easy way to think about it is as a virtual employee.
Katie Robbert – 03:54
In a nutshell, if I understood that correctly, something like the Hermes Agent is a front end to the large language models, which you can pick as your favorite back end. It’s like a Google Form that sits on top of an Excel spreadsheet, or the query screen that sits on top of a database—something that the average person can use to access the bigger software happening in the background.
Christopher Penn – 04:32
And the front end has a lot of rules built into it that make it work better. Let me give you an example that is not Hermes Agent, just so we can make this more concrete. If I go into our Claude desktop application and select “third-party inference” from the developer menu, I get a preferences window. This allows me to change the brain of Claude Cowork. I can change it to Google Vertex, AWS Bedrock, or my own gateway.
I have it set for the Chinese company MiniMax. This allows me to use Claude Cowork with someone else’s models. For really big companies—for example, we have a very large client that has their own AI hub internally—they have their own systems and servers and want to be able to use Claude Cowork, but they do not want to use Anthropic’s servers. They have their own server and their teams would put in “company.com/anthropic,” and suddenly Claude Cowork works with that model as the engine instead of Anthropic’s models. Exactly what you’re saying, Katie—this is the front end and you’re just pulling out the engine and putting in a different one.
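The "swap the engine" pattern Chris demonstrates can be sketched in code: an agent front end talks to any backend that speaks the same chat-completions wire format, so changing models is just changing one configuration entry. The endpoint URLs and model names below are hypothetical placeholders, not the actual endpoints discussed in the episode.

```python
# Sketch: swapping an agent's "engine" by changing only its backend config.
# All URLs and model names here are illustrative stand-ins.

def backend_config(provider: str) -> dict:
    """Return the request settings for a given backend provider."""
    backends = {
        "anthropic": {"base_url": "https://api.anthropic.example", "model": "claude-opus"},
        "minimax":   {"base_url": "https://api.minimax.example/v1", "model": "minimax-m2"},
        "internal":  {"base_url": "https://company.example/gateway", "model": "internal-default"},
    }
    return backends[provider]

# The agent harness itself doesn't change; only the engine settings do.
cfg = backend_config("minimax")
print(cfg["base_url"])
```

This is why a large client can point the same front end at `company.com/anthropic`: the harness only cares about the base URL it is handed.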
Katie Robbert – 05:51
Got it. It’s not overly technical, but I’m struggling to keep up because it feels like it’s on that cusp of being technical enough that I’m sitting here going, “Wait, what just happened?” Before we move forward, John, any questions?
John Wall – 06:12
Would you consider this workflow management? It’s organizing, categorizing, and then plugging into the engines. What would be another way you would define this as far as what the user needs?
Christopher Penn – 06:27
Here’s how I think about this. These are all the models I have access to on my MiniMax subscription within Claude Cowork. I use this during the daytime so I’m not chewing up Trust Insights’ very valuable Anthropic tokens so that Katie, Kelsey, and you can work in Cowork and use the smartest models.
I will use this also—you can see here at the very bottom—for a Framingham municipal budget audit. I want to audit the mayor’s budget. That is not a good use of Anthropic’s valuable tokens or Trust Insights’ work resources. So I put in the cheap model and now I can do fun stuff without damaging the company’s ability to have the best resources for the paying work.
Katie Robbert – 07:16
It all clicked when you showed me that dropdown of which models you have access to. When you were just showing the settings like “Gateway” and other things, it was still a little vague in my brain. But then you showed the front-end UI. You’re using Claude Desktop as the example, but we’re talking about the Hermes Agent, which is where you swap out Claude for the Hermes Agent and then the model you’re using—MiniMax versus Opus.
Christopher Penn – 07:50
Correct. Claude Cowork is what we call a level three system. You can delegate a lot to it. Katie, you’ve certainly spoken a lot about how you have literally 100x-ed your output in the last three months. Hermes Agent is the next evolution of that for extremely long-running tasks where you don’t need to babysit it. Maybe it’s going out to research or build something. As long as I provide a really solid plan and good technology, it will pretty much figure it out.
Here are the caveats before we get into the nuts and bolts. All of these autonomous agents can potentially be hijacked by prompt injection. If they browse a questionable site and misinterpret the instructions—especially if you use a very smart model like Claude Opus—they can potentially break out of their own environment.
If you were to run Hermes Agent on your work computer and you’ve got valuable stuff in there, it might say, “You know what, I might make use of that.” And you say, “No, you’re not allowed to go in that directory,” and it says, “Yeah, I am.” Our caution is that at the very least, you want to run it in a container of some kind. The best practice is: don’t run it on a production work system. Put it on a box that you don’t care about, that you can literally pull the cord out of the wall if it starts to misbehave.
Do not put it on your work or personal computer. If you can’t afford a box at home, you can buy a VPS from a company like Hetzner, Linode, or Akamai for about eight dollars a month, but you want to keep it in its own environment.
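The "quarantine it" advice above can be made concrete with a container launch that deliberately gives the agent nothing from the host. This is a minimal sketch assuming Docker is available; the image name is a hypothetical placeholder, and real deployments would tune the flags further.

```python
# Sketch: launching an agent in a locked-down Docker container so it cannot
# touch the host filesystem. The image name "hermes-agent:latest" is made up.
import subprocess

def docker_cmd(image: str = "hermes-agent:latest") -> list:
    return [
        "docker", "run", "--rm",
        "--network", "bridge",   # no host networking
        "--memory", "4g",        # cap resources
        "--read-only",           # immutable root filesystem
        "--tmpfs", "/tmp",       # scratch space only
        # Note: deliberately no volume mounts from the host.
        image,
    ]

cmd = docker_cmd()
# subprocess.run(cmd)  # uncomment on a box you don't mind rebuilding
print(" ".join(cmd))
```

The key design choice is the absence of any `-v` mount: even if the agent decides it wants a directory, there is nothing of yours inside the container to find.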
Katie Robbert – 09:54
Can you purchase a virtual machine? It sounds like what we used to do in software—we would do our quality assurance on virtual machines so we could have replicas of different environments and combinations of operating systems and browsers without having hundreds of physical machines. That sounds like a good option for someone who doesn’t have the space or money for another physical machine. But to your point, you want to have this quarantined and self-contained so it’s not doing things you don’t want it to do. AI is still unpredictable and still hallucinates, so you want to give it very strong guardrails.
Christopher Penn – 10:50
Yes. The other thing to know about these level four systems is that they are very compute-intensive. They will make hundreds of API calls an hour—potentially more—so if they’re not hooked up to something cost-effective, you are going to get a massive bill. In the early days of Open Claw, people tied it to their Anthropic subscriptions. One person on Reddit said they ran “claudebot” overnight and ended up with a $10,000 Anthropic bill.
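The surprise-bill scenario above is exactly what a hard spending cap around the agent's API calls prevents. A minimal sketch, with made-up prices and limits:

```python
# Sketch: a hard budget cap checked on every API call an agent makes,
# to avoid the overnight $10,000-bill scenario. Figures are illustrative.

class BudgetExceeded(RuntimeError):
    pass

class BudgetGuard:
    def __init__(self, max_usd: float):
        self.max_usd = max_usd
        self.spent = 0.0

    def record(self, cost_usd: float) -> None:
        """Add one call's estimated cost; halt the run if the cap is hit."""
        self.spent += cost_usd
        if self.spent > self.max_usd:
            raise BudgetExceeded(
                f"spent ${self.spent:.2f}, cap was ${self.max_usd:.2f}"
            )

guard = BudgetGuard(max_usd=25.00)
# In the agent loop, before or after each API call:
# guard.record(estimated_call_cost)
```

Whether you enforce this in your own wrapper or rely on a provider's subscription ceiling, the point is the same: the stop condition must be in place before the agent runs overnight, not after.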
Katie Robbert – 11:34
There’s a reason I always tell you guys I don’t like surprises. That is a really good example of planning ahead so you don’t get a $10,000 surprise.
Christopher Penn – 11:49
Exactly. These agents can run on pretty much any computer that can serve up a basic Linux environment. If you’ve got an old MacBook from 2016 laying around, you can format it, install Linux, and bring it back to life. You could probably even run it on a 2010 or 2011 MacBook. You don’t need brand new hardware because you’re probably not doing the AI part on the machine; you’re just serving up the software and connecting AI to it.
In terms of models, the major ones like OpenAI and Gemini are very expensive. I recommend MiniMax for people starting out because they have a subscription specifically for agents that is very generous and cost-effective—the $100 a year plan is probably the best.
Some folks are doing this with local machinery. One of our friends bought the NVIDIA DGX Spark, which is a $5,000 computer, and runs Google's Gemma 2 on it. He basically bought his own miniature data center, put it on his desk, and tied an agent to that. Now that box only costs the price of electricity to run.
Katie Robbert – 13:49
We’ve covered local models in previous episodes of the live stream, which you can find in the So What playlist on the Trust Insights YouTube channel. I believe that next week, Chris and John are going to be covering what’s new with local models and how to set them up. So if you’re curious about that, we’ll be going deeper in a future episode.
Christopher Penn – 14:30
Yes, that’ll be next week. We’ll probably be talking all sorts of crazy nerd stuff. So, you need a place to run this and a compute plan. Again, my recommendation is MiniMax because of its generosity. However, if you have your own compute, bring it to the party. If you absolutely must be US-based, Google’s Gemini 1.5 Flash is a decent model, but you can still rack up a hefty bill.
This is where it gets ugly because this tool is not made for the average person. It’s made for someone comfortable in a terminal. The first two steps are basically: copy this string, go into your Linux environment, and paste it. That’s a bit of an oversimplification.
John Wall – 15:43
Isn’t that how senior citizens get their bank accounts cleaned out? “Just paste these two things in your console.”
Christopher Penn – 15:53
The assumption is that if you’re using something like this, you know what a Linux box is.
Katie Robbert – 16:06
They’re definitely catering to a certain audience, which they should. Those are the people, like you Chris, looking to push the boundaries of what can be done. I’m a savvy user, but I’m just trying to keep my head above water with the amount of work I have on my plate. I’m not looking to push the boundaries. John, I’m fairly certain we can’t find a good use case for you to be spending your time doing this given the nature of your role—maybe outside of work—but we’ll see what it can do.
Christopher Penn – 17:21
Exactly. I’ve logged into the little computer I bought three years ago that has been sitting above my desk collecting dust. I installed all the stuff, and now if we run “hermes setup,” it asks how you want to connect. You have options like LM Studio for local models, Anthropic, OpenAI, or Open Router. I’m currently using the Global Direct API from MiniMax.
John Wall – 18:29
Is MiniMax running local on that machine too?
Christopher Penn – 18:32
No, I’m using the Singapore data center because I don’t have enough hardware. I need to throw another Trust Insights workshop to afford a machine that could run MiniMax, because you need about $10,000 worth of hardware for it.
Katie Robbert – 18:51
That's been vetoed. I can think of a lot better things to do with $10,000.
Christopher Penn – 19:03
So, you would get the base API from their documentation and choose your model—I currently pay for the 2.7 model. It also asks if you want to set up messaging. Once it’s set up, you don’t have to use the terminal to control it anymore. You could set it up to talk to Slack, Discord, Telegram, or even email and text your system.
Katie Robbert – 19:47
So it’s another layer of user interface for whatever you’re comfortable with. I appreciate the flexibility for those of us who are already in a lot of different systems. Adding one more can be the tipping point where you just never do it.
Christopher Penn – 20:23
If I were putting this into production for Trust Insights, I would 100% set up Slack. Then you or anyone in the company could kick off a project just by talking to the Hermes Agent in a Slack channel. You could say, “Go find me five prospects that are mid-market healthcare companies in Nebraska,” and you wouldn’t need to log into the internals.
Christopher Penn – 21:48
The configurations are in place. We’ll make it a system service, and now it says, “Congratulations, Hermes is set up.”
Katie Robbert – 22:07
And you’re like, “Great, now what?” That’s like every time I use terminal—my little guy with the confetti says “congratulations.”
Christopher Penn – 22:25
Let’s launch Hermes Chat and see what happens. There we go. “Welcome to Hermes Agent.”
Katie Robbert – 22:49
Why is it the medical symbol? I noticed the 70s colors and then the caduceus—as if it’s a healthcare tool. I’m a little nervous about that.
Christopher Penn – 23:10
It’s waiting for instructions. This is the most important part of the episode: this is still AI and a language model. You have to stop—collaborate and listen.
Katie Robbert – 23:40
No, instead we’re going to do the 5P Framework.
Christopher Penn – 23:46
Before you give an ambiguous instruction that the system will do badly and in an unpredictable way, imagine this is a virtual employee. What would you tell a brand-new sales associate on their first day? John, would you just say, “Go find some prospects”?
John Wall – 24:26
No, the normal process is to learn everything about us—what our product is and what we do. From a 5P perspective, you'd say, "You need to understand everything being bought and sold by Trust Insights and the types of people who buy that." You and your agent should go out and scour the web for AI training and business strategy. But as far as process, people, and performance—there's no existing dependable resource, so I would kind of stall out there.
Katie Robbert – 25:36
If we had someone starting tomorrow and we had to get them up to speed, you’d start with the foundational knowledge of who Trust Insights is. We have all of that in our sales playbook. The Purpose of reading the playbook is to understand the Trust Insights customer, how to sell to them, the team involved, and our ICPs (Ideal Customer Profiles).
For the Process, number one is reading the playbook, and number two is going through the CRM to understand the sales cycle. Then start with a small proof of concept identifying a good prospect. The Platform is the CRM or an Excel spreadsheet. Performance is: were you able to identify a handful of prospects based on what you learned?
We’ve been telling people that the 5P Framework is the best way to prompt agentic AI. We have data that demonstrates its efficacy. If I told you, John, to “go to the store and get some milk,” you would fill in the blanks. But you can’t leave that vagueness open for agentic AI because it will go to the store, pick up the milk, and just stand in the aisle holding it.
Christopher Penn – 28:50
It needs that level of detail. Each level of AI enablement builds on the previous one. Level one is prompting. Level two is turning prompts into SOPs. Level three is writing project management plans from SOPs. At level four, you have to write a job description that encompasses the SOPs, project plans, and good prompting. If you just wing it, it’s going to be a flaming disaster.
John Wall – 29:58
The problem has never been “not enough humans.” We’ve been able to use automation and AI to cover more ground, so bodies aren’t the issue.
Christopher Penn – 30:25
Using the 5P Framework, let’s design a sales prospecting agent. The Purpose is to research mid-market B2B companies in the New England area interested in Trust Insights. For People, we include the company information, ICPs, and the team. The Process includes our sales messaging, product information, and sales methodology.
We’ve documented all of this. Agents do not work if you don’t do this background work. AI is just another piece of software; to make it move faster, you have to do the prep work up front.
The built-in search in Hermes is not great, so I use a private web search MCP from my GitHub that helps AI search better. For Performance, the outcome we expect is 50 prospects including company name, stakeholder name, title, and email address. The output goes into a CSV file in the “Sales Outreach” folder. Do not stop until you hit 50 valid contacts.
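The 5P brief Chris dictates above can be assembled mechanically, which is one way to guarantee no section gets skipped. A minimal sketch; the content strings are abbreviated placeholders, not the actual prompt used in the episode, and the folder and file names are hypothetical.

```python
# Sketch: assembling an agent brief from the five P's so nothing is left
# vague. Section contents below are shortened, illustrative stand-ins.

def five_p_brief(purpose, people, process, platform, performance):
    sections = {
        "Purpose": purpose,
        "People": people,
        "Process": process,
        "Platform": platform,
        "Performance": performance,
    }
    return "\n\n".join(f"## {name}\n{body}" for name, body in sections.items())

brief = five_p_brief(
    purpose="Research mid-market B2B companies in New England that fit Trust Insights.",
    people="Company background, ideal customer profiles, team roles.",
    process="Sales messaging, product info, methodology; cross-validate every contact.",
    platform="Web search skill; write results to sales_outreach/prospects.csv.",
    performance="50 valid prospects: company name, stakeholder, title, email. Do not stop early.",
)
print(brief)
```

Treating the brief as five required fields rather than one free-form prompt is the practical difference between "go get some milk" and a job description the agent can actually execute.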
Katie Robbert – 34:48
The reason we’re covering Hermes and autonomous agents is that you can give an instruction like “don’t stop until you hit 50” with these off-brand models without running up the same costs as you would on Claude Cowork. You can’t tell Claude Cowork that unless you’re okay with overage fees or locking your team out of a shared account until the usage resets.
Christopher Penn – 35:54
We can see in the chat that it’s reading the sales playbook, identified the search skill, and started writing the headers for the CSV. It has clear instructions and is going to crank. MiniMax will show you the capacity—we’ve used 348 out of 4,500 requests in this five-hour period. We get 45,000 requests a week. This is the capacity you need to run an autonomous agent.
Christopher Penn – 37:20
This thing just does its thing for a long time. If you set up the messaging gateway, it will ping you when it’s done or if it “falls and can’t get up.” For example, I’ve had it identifying events in our industry or within a two-hour drive of my house. It produced a spreadsheet with 8,000 lines of events.
This file took three and a half days to run. I did zero of the work. Because we prompted it with strict guardrails—like cross-validating contact info with LinkedIn—the results are valid. That’s the value: these systems are even more hands-off than Claude Cowork.
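The "ping me when it's done" behavior described above is typically just a webhook call at the end of a run. A minimal sketch using only the standard library; the webhook URL is a hypothetical placeholder, and real gateways (Slack, Discord, Telegram) each have their own payload shapes.

```python
# Sketch: notifying a chat channel when a long-running agent task finishes.
# The webhook URL is a placeholder; the JSON shape here follows the common
# "content" field convention, which individual services may vary.
import json
from urllib import request

def completion_payload(task: str, status: str) -> bytes:
    return json.dumps(
        {"content": f"Agent task '{task}' finished: {status}"}
    ).encode("utf-8")

def notify(webhook_url: str, task: str, status: str) -> None:
    req = request.Request(
        webhook_url,
        data=completion_payload(task, status),
        headers={"Content-Type": "application/json"},
    )
    request.urlopen(req)  # fires the webhook

# notify("https://example.com/webhooks/...", "event-research", "done")
```

The same hook can fire on failure, which covers the "falls and can't get up" case without you babysitting a three-and-a-half-day run.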
Christopher Penn – 40:10
Every day I get a “newspaper” in my Discord inbox for my hometown. The agent built Python code that checks City Hall and local listings for things like “Frog & Toad Spring Tea” or hikes at Callahan State Park. This stuff doesn’t sell regular newspapers, but I want to know about it. The agent runs every day at 7:00 AM. The prompt for this is 13 pages long, following the 5Ps exactly. I’ll put that in the Analytics for Marketers Slack group as an example.
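A daily 7:00 AM run like the one described above is, on a Linux box, usually a one-line cron entry. This sketch generates one; the command path and task name are hypothetical, and the real setup may use the agent's own scheduler instead of cron.

```python
# Sketch: building a crontab line for a daily 7:00 AM agent run.
# Cron fields are: minute hour day-of-month month day-of-week command.
# The command shown is a made-up placeholder.

def cron_line(minute: int, hour: int, command: str) -> str:
    return f"{minute} {hour} * * * {command}"

entry = cron_line(0, 7, "/usr/local/bin/hermes run daily-newspaper")
print(entry)
```

Adding the printed line to the crontab on the quarantined box is all the "runs every day at 7:00 AM" part requires; the 13-page prompt does the rest.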
Christopher Penn – 41:50
The last unique thing about Hermes is that it is self-learning. It built several skills in the “skills” folder itself. I can also add skills manually, like coding superpowers or the Trust Insights “Job to AI” skill.
Katie Robbert – 43:05
A lot of these systems are now compatible. You don't have to adapt the language as much as you used to when moving between ChatGPT and other models. You can just drop those skills in without having to worry about where they were created.
Christopher Penn – 43:54
These agents do interesting things depending on their internal ethics. A few weeks ago, I asked the agent to check out a disgusting website using a playbook. It built its own “God mode” hacking toolkit to try and break into the site. I had to tell it to slow down! It did a great job building a red-teaming toolkit, but we’d need a lot of bail money if I let it enact that.
Christopher Penn – 45:45
We’ll let the prospecting agent run for a few days. It already found 10 companies. I’ll hand the results to John and the team to see if the prospecting is any good.
Katie Robbert – 46:33
I’m curious to see if it connects the dots between “this prospect will buy what we sell” versus “this prospect will buy this specific thing.” That goes back to the specificity of the “get some milk” instruction.
Christopher Penn – 47:10
My hope is that this gets the gears turning. If it can do prospecting, it can draft pitches. If I add a HubSpot or Gmail connector, it could put drafts directly in the system for human approval. You could walk into the office and have 30 drafts ready to go.
Christopher Penn – 48:05
I have another version running where I’m giving the agent access to the Robinhood API with $25. I told it to turn that into $100.
John Wall – 48:32
Make sure you have guardrails so you don’t end up with $20 million in petroleum stock!
Katie Robbert – 49:03
My mind has changed on the use cases for this.
John Wall – 49:18
The big value prop is doing crazier projects without causing havoc in the Claude Cowork we depend on day-to-day.
Katie Robbert – 50:21
Don’t get in my way, I’m getting stuff done!
Christopher Penn – 50:51
Thanks for tuning in. Check out the Trust Insights podcast and newsletter. Join our free Analytics for Marketers Slack group. See you next time.
Need help with your marketing AI and analytics?
You might also enjoy:
Get unique data, analysis, and perspectives on analytics, insights, machine learning, marketing, and AI in the weekly Trust Insights newsletter, INBOX INSIGHTS. Subscribe now for free; new issues every Wednesday!
Want to learn more about data, analytics, and insights? Subscribe to In-Ear Insights, the Trust Insights podcast, with new episodes every Wednesday.
Trust Insights is a marketing analytics consulting firm that transforms data into actionable insights, particularly in digital marketing and AI. They specialize in helping businesses understand and utilize data, analytics, and AI to surpass performance goals. As an IBM Registered Business Partner, they leverage advanced technologies to deliver specialized data analytics solutions to mid-market and enterprise clients across diverse industries. Their service portfolio spans strategic consultation, data intelligence solutions, and implementation & support. Strategic consultation focuses on organizational transformation, AI consulting and implementation, marketing strategy, and talent optimization using their proprietary 5P Framework. Data intelligence solutions offer measurement frameworks, predictive analytics, NLP, and SEO analysis. Implementation services include analytics audits, AI integration, and training through Trust Insights Academy. Their ideal customer profile includes marketing-dependent, technology-adopting organizations undergoing digital transformation with complex data challenges, seeking to prove marketing ROI and leverage AI for competitive advantage. Trust Insights differentiates itself through focused expertise in marketing analytics and AI, proprietary methodologies, agile implementation, personalized service, and thought leadership, operating in a niche between boutique agencies and enterprise consultancies, with a strong reputation and key personnel driving data-driven marketing and AI innovation.