So What? How To Get Started With Generative AI Lead Scoring

So What? Marketing Analytics and Insights Live

airs every Thursday at 1 pm EST.

You can watch on YouTube Live. Be sure to subscribe and follow so you never miss an episode!

In this episode of So What? The Trust Insights weekly livestream, you’ll learn about generative AI lead scoring and how to get started. You will discover why accurate lead scoring is critical to sales success. You’ll also learn practical steps for implementing generative AI lead scoring for your business. Don’t miss this opportunity to improve your sales process with the power of AI.

Watch the video here:

So What? How To Get Started With Generative AI Lead Scoring

Can’t see anything? Watch it on YouTube here.

In this episode you’ll learn:

  • What AI lead scoring is
  • When generative AI lead scoring is better than traditional lead scoring
  • How to get started with AI lead scoring

Transcript:

What follows is an AI-generated transcript. The transcript may contain errors and is not a substitute for listening to the episode.

Katie Robbert – 00:00
So, well, hey everyone. Happy Thursday. Welcome to So What? The Marketing Analytics and Insights live show. I’m Katie, joined by Chris and John.

Christopher Penn – 00:38
Hello.

Katie Robbert – 00:40
All right. Nice. Nicely done. This week, we are talking about how to get started with generative AI lead scoring. Using generative AI for sales processes is a super hot topic. Everyone’s trying to figure out what to do, how to use it, and when to use it. But before we get into that, John, can you tell us a little bit about lead scoring in general? Forget AI for a second. What is lead scoring? How do people use it? Why do people ignore it? Just tell us everything. Because you are our resident sales guru.

John Wall – 01:17
I have been wrapped up in this many a time and seen it go wildly awry and cause trouble and fighting. Basically, the idea is you’re prioritizing leads. When you get down to it, that’s all it is. You’re sorting out somehow. You’ve got all these names coming in, and you need to figure out which ones are worth talking to, and which ones are crazy and are just going to waste our time. Can we identify past customers so that we know they already know us? All these kinds of questions. Lead scoring is the Swiss army knife that fixes a bunch of this stuff. So, having a system set up, just before we jumped on.

John Wall – 01:55
I kind of jumped on my soapbox already. A big part of it is people think that demographic info is the best source of lead scoring. Oh, you’re a Fortune 500? Okay, add 2,000 points to your lead score. You’re in Florida? Oh, that’s it, add another 500 points. But what we’ve seen proven out over time is that the big thing is whether you’re able to track behavioral traits. So it doesn’t matter what company this person is at. Have they downloaded three white papers in the past week? Have they attended webinars? Are they on the pricing page? That kind of stuff is great to trigger the alarms that this is somebody who needs to get talked to right away or is getting ready to buy something.

John Wall – 02:39
At the other end of it, though, one thing to keep in mind is that all this stuff is really only of use if you’re at a point where you can’t handle the incoming volume. If you’re able to get the inbound leads, and somebody is able to put eyes on them and figure out what’s going on, you don’t have to spend $50,000 on building one of these systems.

Christopher Penn – 03:07
Chris is going to show us how to get this done in four hours now. Maybe that will change everything, and you will want to build these as soon as possible.

Christopher Penn – 03:09
But.

John Wall – 03:09
But yeah, this is the kind of thing where if you’ve got 10 sales guys in a bullpen, and your goal is to avoid the fights over the leads, because guys like that will fist fight over a lead and you don’t want to be involved with that. So having a lead scoring system and being able to do the round robin is another thing. That’s a really popular thing in large orgs where it hits a lead score, and then who does it actually go to? You actually can target and set up whether you want to make it fair or you want to make sure that the winners only get the good stuff or the people you hate.

Christopher Penn – 03:38
The Glengarry leads.

John Wall – 03:39
Yeah, yeah, totally. The people you hate get the bad leads, you know, for a couple of months to prove that they’re good or whatever. All those kinds of, you know, that’s where you get the sadistic sales masters pulling the levers and controlling where stuff goes. But yeah, that’s my kind of, okay, stump speech on lead scoring and why you need to do it.

Katie Robbert – 03:57
All right, and a quick review. Contacts, prospects, leads. Can you just give us a quick overview of the difference? Because I think a lot of companies say, “Oh, well, they entered our system. It’s a lead.”

John Wall – 04:13
Yeah, right. Well, there are, at least now, some standards. A lot of CRM systems have leads as the lower tier. They haven’t engaged with sales yet. That’s kind of the dividing line. Then contacts are records that have had some kind of direct interaction with sales. Depending on the CRM system, contacts can often be rolled up to organizations, so you can pull an org and see all the contacts. Some systems do the same thing with leads, where the leads get bundled with the companies. Usually that’s good, because you want to see who are the ones we’re doing business with, and who are the prospects that we want to jump on board with.

John Wall – 05:00
But other systems and other sales structures won’t add the leads to an organization because they feel that’s actually polluting the data. Because if you’re selling something where there’s, you know, 700 locations for the company, you don’t want the stockroom person from Peoria, Illinois to be in the company record for the home office of Pepsi or whatever. So there’s dividing lines there. But in general, you usually have suspects or prospects, so they’re there and you’re trying to figure out who they are. You’ve got leads that are being worked somehow, whether it’s by automation or by being passed to a salesperson. Then contacts are actually part of records of companies that you know are real. At the top of the mountain are customer records.

John Wall – 05:51
You know, they’re usually contacts but tagged as a customer. Those are people you know have money. So that’s the best of the best.

Katie Robbert – 05:58
I feel like I could ask you questions all day long, so perhaps that’s for another episode. Today, we need to get into the generative AI lead scoring. So Chris, now that we have an overview of lead scoring and leads, where should we start?

Christopher Penn – 06:13
Well, John hit on something really important, which is any AI system is only as good as the data going into it. If your CRM or your marketing automation system is a hot mess, AI ain’t going to help you. Right? This is, as Katie often says, new technology does not solve all problems. Data governance and data cleanliness in your CRM is the oldest of the old problem.

Katie Robbert – 06:38
Well, a quick plug for that: on June 5, I am doing a webinar for the Marketing AI Institute’s AI for B2B Marketer Summit on a very simple process for using AI to score the quality of your data set, with recommendations on how to fix it. So that’s just a quick plug. Go see them and sign up for the webinar. I don’t think you’ll regret it.

Christopher Penn – 07:07
So you’ve got to have good data to start with. If you don’t, none of this is going to work. So that would be step number one. Step number two is to figure out what data you have to work with, because again, if you have no idea what’s in your database, you’re going to have a bad time. If you want to use generative AI in a programmatic way, you first have to look at your data and see what is even available. For today’s examples, we’re going to use synthetic data because, for obvious reasons, we can’t put personally identifying information, real people, on a public livestream. There’d be all sorts of things wrong with that. So this is all synthetic CRM data.

Christopher Penn – 07:49
This is, in fact, in the Trust Insights GitHub repository if you want to use it for your own work. If you go to github.com/trustinsights, there’s a synthetic CRM generator that generates a file just like this, if you are fluent in the R programming language. It will make you as many fake CRM records as you like, which is a great thing to do to practice. So we have your standard stuff in here: first name, last name, gender, title, company name, annual revenue, number of employees, date of last contact, company location, email address, deal size, industry, number of opens, days since the last contact, the deal status, and then your sales notes.
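
For readers who want to practice without the R generator, here is a rough Python sketch of the same idea. This is not the Trust Insights generator itself; the field names follow the episode, and all values are invented practice data.

```python
import random
from datetime import date, timedelta

# Hypothetical sketch of a synthetic CRM record generator, mirroring the
# fields discussed in the episode. All names and values are made up.

FIRST_NAMES = ["Alex", "Jordan", "Sam", "Casey"]
LAST_NAMES = ["Rivera", "Chen", "Okafor", "Larsen"]
INDUSTRIES = ["Software", "Retail", "Healthcare", "Manufacturing"]
STATUSES = ["suspect", "prospect", "opportunity", "closed-lost"]

def make_record(rng: random.Random) -> dict:
    """Build one fake CRM record for practice data."""
    last_contact = date.today() - timedelta(days=rng.randint(0, 365))
    return {
        "first_name": rng.choice(FIRST_NAMES),
        "last_name": rng.choice(LAST_NAMES),
        "title": rng.choice(["CEO", "VP Marketing", "Intern", "CMO"]),
        "company_name": f"Company {rng.randint(1, 999)}",
        "annual_revenue": rng.randint(50_000, 500_000_000),
        "employees": rng.randint(1, 10_000),
        "date_of_last_contact": last_contact.isoformat(),
        "days_since_last_contact": (date.today() - last_contact).days,
        "email_opens": rng.randint(0, 40),
        "deal_size": rng.randint(0, 1_000_000),
        "industry": rng.choice(INDUSTRIES),
        "deal_status": rng.choice(STATUSES),
        "sales_notes": "Downloaded two white papers; asked about pricing.",
    }

records = [make_record(random.Random(i)) for i in range(100)]
```

Seeding each record’s generator makes the fake data reproducible, which helps when you are comparing prompt variations later.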

Christopher Penn – 08:27
The sales notes field is where all the good stuff is, and it’s where everybody has a big data gap because, and I am 100% guilty of this, I don’t log all my activities. I don’t know.

Katie Robbert – 08:42
We know we are aware, Chris.

John Wall – 08:46
In fairness, all great sales orgs realize that you don’t even let the reps do anything. The goal is to have systems and virtual assistants and other things, because you just cannot trust the salesperson to take any records whatsoever. Never mind, you know, the basic call records.

Christopher Penn – 09:04
Exactly. But that’s where automations, like having your sales call system automatically put the transcripts into your CRM records, are a huge help, because that’s a much better way to process that data. So this is what we have to work with. Now, for some of these fields, you have to sit here and think: what should we use? What should we not use? What is useful information? Things like people’s names? Not useful. Right? That’s typically not a predictive factor for whether or not a lead is going to be any good. Same with someone’s gender. That’s just not going to be a good predictive factor. So knowing that, and doing some inventory, matters. Someone’s job title? Yeah, that’s more useful. If they’re the intern, then you’re probably not going to get a lot of use out of that lead.

Christopher Penn – 09:50
Maybe the company name may or may not be predictive, depending on how good your marketing data is. Some CRM systems have augmentation where they will bring in characteristics about that company. Annual revenue? That’s useful. Right? If the company’s annual revenue is like $5, then unless you’re selling chewing gum, they can’t afford you. The number of employees is also a good proxy for that. The date of last contact? Yeah, super important, going back to what John was saying about behavioral stuff. If we have not been in contact with this lead in months, it’s going to be a harder boulder to push uphill than someone who dropped by yesterday. Company location is useful, especially for sales organizations that have territories. If a lead is outside your territory, you need to know that. Email address? Not helpful.

Christopher Penn – 10:43
The domain name might be, but the email address is not helpful. Deal size is where your people would presumably put in what they estimate the deal to be worth. This is a highly variable field that is a behavioral and governance problem, because there’s somebody who’s like, “Every deal’s a million dollars.” On the other side, somebody puts $1 for every deal, like, this deal’s only worth a dollar. Then Katie’s like, “Why is our pipeline worth $25?”

John Wall – 11:10
That’s called sandbagging. Yes. That’s a well-established sales tactic.

Katie Robbert – 11:16
But I also think, you know, some of the things that John mentioned are things I would care about in terms of the score. Have we worked with them before? Have we even engaged with them before? Do we have a closed-lost? Those kinds of things, to me, would definitely make a difference in terms of the score.

Christopher Penn – 11:42
Exactly. You have things like number of email opens, which is useful, how many days since they last opened an email, and the deal status: are they a prospect, are they a suspect, are they an opportunity? Because your sales CRM is going to dump all that stuff out. Then, of course, the notes field. Now, this is a toy example. In your actual CRM, if you’re using Salesforce or HubSpot, this spreadsheet is going to be hundreds of columns long. What’s the last page they were on, UTM tracking codes, page history, all that stuff is going to be in there. Again, because we don’t want to show live data, that’s not in here. But we’re going to build with this toy example with the understanding that you will then go and apply this to real production systems. So go start making some AI, right?

Katie Robbert – 12:30
No, no.

Christopher Penn – 12:33
The first place to start is thinking through how you know whether this data is any good or not. The place to do that is with your ideal customer profile. If you have your ideal customer profile, you can then say: how closely do the people in our database resemble the people we know we can sell to?

Katie Robbert – 12:58
I would also argue that we skipped a very important step, and that is at least a basic user story for what we’re doing. You know, you could do user stories for every level of the lead score, but for an overall project like this, so, you know, as the head of business development, I want to prioritize the high quality leads so that I can close those and improve my close rate or bring in revenue or, you know, get the CEO off my back or whatever the thing is. But making sure that you at the very least start with a high level user story so that as you’re working through refining the lead score, you’re like, “Oh, this is why I’m doing this.” Because, Chris, to your point, there’s a lot of data here, and some of it’s not really going to be useful.

Katie Robbert – 13:52
You know, so if it’s, “I want to know who specifically is in my database,” then you do want first name, last name, versus “I want to figure out which leads are most likely to close.” You don’t need first name, last name. So I think that I just wanted to sort of remind people not to skip the requirements gathering.

Christopher Penn – 14:14
Exactly, don’t skip the requirements gathering. The worst possible thing you could do is just dump this whole spreadsheet into a generative AI tool and say, “Hey, score my leads.” That would be an unmitigated disaster. What you would want to do, though, is think through how you would ask these questions with things like user stories and the 5P framework. How are we going to do this? Who? Obviously the P of people is here, literally these people. What’s the process for the lead scoring? Katie, you’ll have something in your mind.

Katie Robbert – 14:54
No, my understanding, and John, please correct me if I’m wrong, is that basically you have to have your set of criteria. So all the things we’ve been talking about on this episode so far, you know, number of opens, number of things downloaded, location, company size, those are each assigned a range of numbers, from like 0 to 10, or whatever it is, for each individual data point. So if your company is like 50 to 100 people, you’re going to score maybe a one or a two, versus if your company is 5,000-plus people, you’re going to score a 10. So you need to define that criteria first, and then you can start scoring things. I would assume that’s going to be a big, huge part of the process.

Katie Robbert – 15:48
Is defining the criteria and assigning numeric values.
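
Katie’s criteria-and-points approach is traditional, rule-based lead scoring. As a rough Python sketch (the bands and point values here are illustrative assumptions, not recommendations):

```python
def score_company_size(employees: int) -> int:
    """Map employee count to points. Bands are illustrative only."""
    if employees >= 5000:
        return 10
    if employees >= 100:
        return 5
    if employees >= 50:
        return 2
    return 1

def score_lead(lead: dict) -> int:
    """Sum simple per-criterion points, traditional-lead-scoring style."""
    score = 0
    score += score_company_size(lead.get("employees", 0))
    score += min(lead.get("email_opens", 0), 10)     # behavioral signal, capped
    score += 5 * lead.get("whitepaper_downloads", 0)
    if lead.get("visited_pricing_page"):
        score += 20                                   # strong buying signal
    if lead.get("do_not_pursue"):
        score -= 20                                   # negative scoring, per John
    return score
```

Note the negative points: as John says later, scoring can push leads below zero so your marketing and sales programs can block them entirely.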

John Wall – 15:53
Yeah, yeah, you’ve got the big thing. This is just another chicken-and-egg kind of problem, because you will create a lead scoring system and then live with it for a little while. What you’re going to find is, okay, these indicators here actually don’t matter, and these do, and we missed them. So it gets iterative. You can take that all the way back to the user story. Usually the user story is: as chief revenue officer, I want to create a more effective lead handling process. That’s all it is. Then as time goes on, it becomes: as CRO, I want to know how much email is a factor in getting to closed deals, and have the lead scoring system reflect that.

John Wall – 16:34
So yeah, there’s a number of different ways you can do that. You just hit the company size one. That’s a great one. That’s always the first one, because if a Fortune 50 company shows up, that needs to get a score that immediately goes right to one of your best people so they can figure out what’s going on. If it’s one person who’s proven that they will waste two hours of your customer service rep’s time on the phone and never buy anything, that’s another thing. People kind of think of lead scoring as a positive thing, but it can be negative, too. You can give somebody negative 20 points if they’re on that side of things.

John Wall – 17:10
Then all your marketing and sales programs can say, “Oh, by the way, anybody below zero, they don’t get anything.” You know, they get blocked at the door. They get a separate website that shows Sesame Street. You know, you can do all kinds of stuff to keep them away from what you’ve got going on.

Katie Robbert – 17:25
I would add into the scoring criteria, anyone who says certain phrases on a phone call will get negative points. We can discuss what those phrases are offline or you can find me in analytics for marketers, and I’ll let you know what they are.

John Wall – 17:38
You’re killing me. Yeah, because I have the same thing. There was somebody on a call a couple of weeks ago. I’m like, “This person said that? I will never do business with this person ever. I don’t care who the hell they are. No, you don’t talk like that. It’s 2025, for God’s sake.”

Katie Robbert – 17:52
But, I think the point there is that a lead score is personal to a company. You should not be accepting a generic, out-of-the-box lead score. Because what we at Trust Insights value as this is a good customer versus a bad customer is going to be very different from, you know, the company down the street. So if we’re trying to use the same lead scoring system, we’re going to get very different results, and our sales team is going to be very unhappy.

John Wall – 18:20
Yeah. So there’s a dividing line there too with ABM, account-based marketing. Right? You’ve got the right idea that everybody’s different. Even within a company, there are people you don’t want to deal with, and there are people for whom you definitely want to answer the phone on the first ring every time. So there’s that whole spectrum. But then there’s the account-based level of: okay, these certain companies we know we will never do business with, so they get excluded. So now you get into: does the account have its own score, or do you add up all the individual scores for the account? But yeah, everything you said there is on the mark, as far as there are people you want to do business with and people you don’t.

Katie Robbert – 19:01
Poor Chris just wants to get to the AI, and John and I are ranting about sales processes.

Christopher Penn – 19:05
Oh no, this is totally fine. For folks who are watching. This is every day at the office.

Katie Robbert – 19:11
Well, I mean, it’s either John and I ranting about sales, or Chris ranting about AI. So it’s really just a matter of what day you find us on.

Christopher Penn – 19:19
Exactly. It’s every day at the office. So our next step is to figure out how do we prompt this thing, how do we make this work? Because there’s three levels of generative AI, right? There is done by you, done with you, and done for you. When we talk about automation, we’re talking at the levels of done with you or even done for you. But you can’t get to those upper levels without doing the done by you. Done by you starts by seeing if this is even a task that your AI systems can handle. So the first part would be, you need an ideal customer profile. You need to say, “Hey, here’s who we think is useful.” If you don’t have one, call John Wall. He will help you set up a service to get that rolling.

Christopher Penn – 20:06
So I’m going to use Google’s Gemini, but you can use the system of your choice. It does not matter. We’re going to start with an ideal customer profile, and then with some of this data. So let me go and get my ideal customer profile from Google Drive, the Trust Insights ICP. There it is. Let’s bring this in. We’re going to start with a prompt that goes something like this: I want you to compare this target company. We’re going to take just the first two rows of data from our spreadsheet and put it in here. That is not at all what we want to do. Thanks, Excel. You came in as an image instead of text. Here’s our target company information.

Christopher Penn – 21:02
Now, I’m using Google’s Gemini 2.5 Pro at the moment. One thing you have to think very hard about is what model you are going to use for this at scale. The smarter a model is, the more it’s going to cost you. If you look at the pricing for Google’s Gemini here, you can see it’s $2.50 per million tokens at its maximum for input, $15 at its maximum for output, $10 normally for output. That doesn’t sound like a lot. But just for this one analysis, it’s already at 3,700 tokens, and it continues to go. So, yeah, it is now at 5,000 tokens. You can imagine if your CRM has hundreds or thousands of records, and you’re using the smartest model available, you are going to get a bill. A very large bill.

Christopher Penn – 22:13
It may not necessarily be the best fit, because this analysis is great for a person but not good for a CRM. You don’t want to put a book into every single CRM record. You want a number, the lead score. So we can at least say: okay, does this work at all? The answer is yes, we can at least get a set of numbers out of it. The next step is to say: if we can get a set of numbers out of it, what would happen if we asked for just the numbers that would go in here? Here’s the output: I want this in a format that a computer can interact with. Because generative AI is like the engine.

Christopher Penn – 23:01
When we’re talking about things like lead scoring, you kind of need the rest of the car. You need a way to talk to your CRM to get data from it. You need a way to clean up the data and get it ready for generative AI. You need to process it with generative AI, then clean the processed material and hand it back to your CRM. No generative AI foundation tool on the market today can do those steps. None. You have to put it in something else. Which means that, if we think back to the 5P framework, part of platform is: how are you going to get data in and out of our AI system? Because if a human being has to copy and paste, you’re not saving any time.

John Wall – 23:44
No.

Katie Robbert – 23:44
Again, that’s why you don’t want to skip over the requirements. Because as fun and exciting as it sounds to have AI Studio suddenly give you a bunch of scores, okay, then what do you do with it? Also, can we go back to the cost piece? What is this output going to cost us? Is it going to cost me 10 bucks?

Christopher Penn – 24:07
If you get to a million tokens? Yeah, it’s going to cost you 10 bucks. The pricing is all based on how many tokens you’re putting into and taking out of the system.
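
Using the per-million-token rates quoted in the episode ($2.50 for input, $10 for output at Gemini 2.5 Pro’s standard output tier), a back-of-envelope cost estimate for a scoring batch might look like this. The lead counts and token sizes are hypothetical:

```python
# Rates quoted in the episode, converted to dollars per token.
INPUT_RATE = 2.50 / 1_000_000
OUTPUT_RATE = 10.00 / 1_000_000

def batch_cost(n_leads: int, in_tokens_per_lead: int, out_tokens_per_lead: int) -> float:
    """Estimated API cost in dollars for scoring a batch of leads."""
    per_lead = in_tokens_per_lead * INPUT_RATE + out_tokens_per_lead * OUTPUT_RATE
    return n_leads * per_lead

# 10,000 leads at 5,000 tokens in and 1,000 tokens out each:
# 10,000 * (5,000 * $0.0000025 + 1,000 * $0.00001) = $225
cost = batch_cost(10_000, 5_000, 1_000)
```

This is exactly why shrinking the prompt (as Chris does next with the compressed ICP) matters: the input token count multiplies across every record in the CRM.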

Katie Robbert – 24:22
But I guess my question is, we’re not going to be using a million tokens today, right? We don’t.

Christopher Penn – 24:29
No, because we’re not using the API. We’re in AI Studio, the development playground, right now. However, we will be putting this into a production API. We’re going to talk about how that would work and what kind of costs would be involved.

Katie Robbert – 24:45
Don’t skip your requirements gathering because these are all things you want to know before you start doing stuff.

Christopher Penn – 24:52
So this looks better. Now we have the thinking block, where the model is allowed to think out loud, which you have to let it do. If you say, “Just give me the number,” you are going to get terrible results, because language models need to talk. The more they talk, the better the results get. The flip side is that the more they talk, the more it costs you, because every single piece of information that comes out costs you money. This one came up with a lead score of 80. Okay, good. That is in a format called JSON, JavaScript Object Notation, that machines can read. Next, here’s the challenge with our ICP: it’s 2,600 tokens. If we’re going to put this in production, can we shrink that down?
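
If the model is prompted to return JSON like `{"lead_score": 80}`, the automation around it needs to parse and validate that output. A hypothetical Python helper (the fence-stripping and range check are defensive assumptions, not part of any specific tool):

```python
import json
import re

def parse_lead_score(model_output: str) -> int:
    """Extract a numeric lead score from a model's JSON reply.

    Assumes the prompt asked for JSON like {"lead_score": 80}. Strips the
    markdown code fences that models sometimes wrap around JSON, then
    validates that the score is in the 0-100 range.
    """
    cleaned = re.sub(
        r"^```(?:json)?|```$", "", model_output.strip(), flags=re.MULTILINE
    ).strip()
    data = json.loads(cleaned)
    score = int(data["lead_score"])
    if not 0 <= score <= 100:
        raise ValueError(f"lead score out of range: {score}")
    return score
```

The validation step matters in production: a malformed reply should fail loudly rather than silently writing garbage into a CRM field.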

Christopher Penn – 25:36
We’ve talked on past episodes of the livestream about what sparse priming representations are. Sparse priming representations basically take all the features of a big ICP and crank them down to just the highlights. An example is the section I have highlighted in blue here, which is everything in that long ICP document, condensed down for the Gemini model to use. It contains all the key points, but like a CliffsNotes version. This is going to save you a lot of time. Let me take away that file and edit this in. Now, let’s delete our previous results. Instead of 3,000 to 5,000 tokens going in, what if it looks like this? There’s our ICP.

Christopher Penn – 26:32
Now we are only at 633 tokens, so we have shaved off nearly 90% of the prompt’s length without losing a lot of accuracy. Now if we rerun this with the exact same test information, it’s going to generate results. It should come up with a lead score that’s probabilistically in the same range. Not exactly the same, but close enough.

Katie Robbert – 27:03
Just a general question about these systems. You had already spent, I don’t know, 3,000 tokens on that first run, and then you deleted it. Does it reimburse you those tokens if you say, “Just kidding, I didn’t mean to run that”?

Christopher Penn – 27:24
Nope. Once you’re in production in the API, which is not this, but once you’re in production, every time it generates a token, that is a sunk cost.

Katie Robbert – 27:32
But I was talking about this.

Christopher Penn – 27:35
AI Studio is free, but it’s obviously a lot of copy and paste. You can’t run software within AI Studio. But this is meant for software developers, like what we’re doing right now, to try things and do exactly that, so they can say, “Oh, this didn’t work.”

Katie Robbert – 27:51
But again, I feel like, and I’m going to keep harping on it, going back to the five P’s, that’s where platform is going to be really critical: figuring out which platforms we can test this in. Can we do a proof of concept, versus going straight to the thing that is maybe the best but costs the most money, while you’re still in this testing phase?

Christopher Penn – 28:13
Exactly. This is why things like the SDLC matter, and why process matters. Done by you: we’re going to test this part. This is the engine. We are building the engine right now. You don’t just want to wing it and put a bad engine into production, because you’re going to waste millions of tokens, thousands of dollars. Here’s the part that’s really bad, which you don’t see with a lot of the LinkedIn AI bros: you won’t know it’s bad. You won’t know whether what the AI bro and their super-secret system put together is any good. It’s just handing back lead scores. You have no idea whether it’s actually right or not unless you do these phases first to test it out.

Katie Robbert – 29:00
So I’m getting a big AI bill and no new closed deals. That tells me it’s bad.

Christopher Penn – 29:09
That tells you it’s bad. But by that point, your AI consultant has moved on to the next sucker.

Katie Robbert – 29:13
Oh, for sure.

Christopher Penn – 29:15
You’re just stuck. So here, with the ICP compressed down to that very small thing, it still comes back with a lead score of 75; it was 80 last time. So it’s within the ballpark. If this were in production for real, we might do some more tuning on the ICP, see if maybe making it a little bit longer would increase its accuracy. But we’re in the ballpark. It didn’t come back with a five and say, “Oh, this is completely unqualified.” At this point, we’ve got a working engine, we’ve got a prompt, and it generates a result that is in a useful format for machines. We now have to figure out how to automate this.

Christopher Penn – 29:55
Because this is still not doing anything that you can’t do by hand with ChatGPT, for example. It’s a waste of your time.

Katie Robbert – 30:05
Yeah, because there’s things that we had talked about, you know, from the CRM data that aren’t necessarily reflected in our ICP. So, like, number of emails opened, are they a previous customer? How does that factor into the scoring?

Christopher Penn – 30:23
So that goes into the prompt itself: how do I want it to think about those factors? I might even give it an example. I might say, alignment: I want you to figure out how closely the sales notes mirror what’s in the ICP, or the number of emails, or things like that. So you would put this either in the example section or in the upper section to say, “This is how I want you to do your thinking.” Again, this is engine building. You have to sit here and tell the tool, “This is how I want you to think this process through.” Which goes back to requirements gathering.
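
A sketch of what assembling such a prompt might look like in code. The wording, factors, and output instruction below are illustrative assumptions, not the actual prompt from the episode:

```python
# Hypothetical prompt assembly for lead scoring. The wording and the
# listed factors are this editor's illustration, not Trust Insights' prompt.
SYSTEM_PROMPT = """You are a B2B lead scoring assistant.

Compare the lead below to this ideal customer profile (ICP):
{icp_summary}

How to think it through:
- Alignment: how closely do the sales notes mirror the ICP?
- Behavior: weight email opens and recency of last contact heavily.
- Fit: consider annual revenue, employee count, and title seniority.

Think out loud first, then return ONLY a final JSON object:
{{"lead_score": <integer 0-100>}}
"""

def build_prompt(icp_summary: str, lead: dict) -> str:
    """Combine the compressed ICP and one CRM record into a single prompt."""
    lead_block = "\n".join(f"{k}: {v}" for k, v in lead.items())
    return SYSTEM_PROMPT.format(icp_summary=icp_summary) + "\nLead:\n" + lead_block
```

Keeping the instructions in one template makes it easy to version the prompt alongside the rest of the pipeline, which is part of the requirements-gathering discipline Katie keeps coming back to.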

Katie Robbert – 31:12
It’s funny how it always goes back to that.

Christopher Penn – 31:14
Exactly. In the interest of time: we’ve got a working prompt, and we now need to figure out how to put this thing into action. You could use a cloud-based service like Google’s Gemini. There are a number of different models you could use: OpenAI’s models like GPT-4.1 or o3, or really any company’s models that are safe to use with proprietary data, including personally identifying information, because you are handing over the crown jewels of your company to an AI company. So please don’t use any free services, because if you’re not paying, you are the product. The thing to keep in mind is that all AI models are essentially in families. So if you do stuff in Gemini, the Gemma models, which are derived from Gemini, will inherit a lot of that.

Christopher Penn – 32:14
The same way Gemini does things, you can use OpenAI’s o3, then use a lightweight version like o4-mini to do the API-level processing. Or if a company makes a big foundation model and also makes an open model, like Mistral or Gemma, or, I wouldn’t use DeepSeek’s hosted services, but if you host DeepSeek on your own hardware, you could. Whatever the case is, you want to stay within that family. You don’t want to do this process in ChatGPT and then switch to the Qwen family from Alibaba, because they’re two different families. It’s like getting a book out of one library and returning it to a different library outside that library system. They’re like, “This doesn’t even match our Dewey decimal numbers. We don’t know where to put this.”

Christopher Penn – 33:05
Stay within the family. So because we’re in Gemini, we could and we will use the Gemma family of models. Why? Because they can run on your hardware. Which means, Katie, you don’t get a big bill.

Katie Robbert – 33:20
Oh, thank goodness.

Christopher Penn – 33:24
You don’t get a big bill. But you have to have the processing capacity to do this. For a lot of organizations, particularly once you start getting into, you know, mid-market and enterprise, they’re like, “You know what, we’ve already got a big enterprise account with Google Cloud. What’s another hundred thousand dollars on that account?” They won’t care. But for a company like Trust Insights, we’re like, “You know what? We don’t have a hundred thousand dollars to throw at this. No, we don’t have $10,000 to throw at this.”

Katie Robbert – 33:49
No, we don’t. How do we then get this from, okay, I have a lead score of 75 into our CRM, so that someone like John can be like, “Okay, great, this person, you know, they’re above the threshold, let me go chase them down.”

Christopher Penn – 34:05
Yep. So first things first, your CRM needs to have a new lead score field in it; you might call it AI lead score. That part’s important. Second thing is, your CRM had better have an API, so that other software can get information from your CRM and push data to your CRM. For example, you could just Google, “Show me the HubSpot API.” HubSpot has an API. Salesforce has an API. Pretty much everybody does.
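
For illustration, here is a hedged Python sketch of what pushing a score into such a field might look like using HubSpot’s CRM v3 objects API. The custom property name `ai_lead_score` and the contact ID are assumptions; in production you would create that property first and send the request with an authenticated HTTP client:

```python
# Sketch: build a PATCH update for a custom contact property via HubSpot's
# CRM v3 objects API. The property name "ai_lead_score" is an assumption;
# you would need to create it in your portal first.

HUBSPOT_BASE = "https://api.hubapi.com/crm/v3/objects/contacts"

def build_score_update(contact_id: str, score: int) -> tuple[str, dict, str]:
    """Return the URL, JSON payload, and HTTP method for the score update."""
    url = f"{HUBSPOT_BASE}/{contact_id}"
    payload = {"properties": {"ai_lead_score": str(score)}}
    return url, payload, "PATCH"

url, payload, method = build_score_update("12345", 68)
# In production, roughly: requests.patch(url, json=payload,
#     headers={"Authorization": f"Bearer {private_app_token}"})
```

Salesforce, ActiveCampaign, and the other major CRMs have equivalent record-update endpoints; only the URL and payload shape change.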

Katie Robbert – 34:37
I was going to ask, in this, the year of 2025, are there large software systems that don’t have APIs?

John Wall – 34:47
All hospital software, you don’t want anybody accessing that from outside. From the Internet.

Katie Robbert – 34:54
Yeah, hospital systems, sure. But I mean, like CRM systems, like, I guess I should clarify, in this, the year of 2025, are there marketing and sales-centric software systems that don’t have APIs?

Christopher Penn – 35:09
There are, there are. They’re not the big ones, but there are definitely some of the smaller players. Also, there are a lot of companies that have homegrown CRMs, and with those, it’s highly dependent on whether the developer thought to build an API or not.

Katie Robbert – 35:24
Fair.

Christopher Penn – 35:27
So because we know these systems have this, we can go to our old friend n8n, and we can say, “Hey, do you have a connector for your CRM?” Look, HubSpot does have a connector. Terrific. I can see all the different options that are available. For just starting out, if a new contact enters the database, that would be a great time to start the scoring process, so that you’re not doing batches and waiting for a bunch of processing. For today, though, I’m just going to use a manual trigger. Now, as I mentioned, we absolutely, positively cannot use live data on this, because you don’t want to give away your stuff. I have mocked together, as I mentioned, that pile of fake information. This part of the process is going to be completely fake.

Christopher Penn – 36:20
Desktop, live stream, and I think, what did I call that file? Synthetic CRM. Let’s test that. I have my CRM data. Now I need to turn that CRM data into something that an AI tool can understand. For us today, this is a CSV file. If you’re using HubSpot or Salesforce or ActiveCampaign, you might not need this step, because the data is already going to be in JSON format. So we can test this to make sure. Yep, 5,000 records in this CRM. Now, we know from looking at the Excel spreadsheet earlier that these are not all prospects. Some of them are closed-won, some of them closed-lost. We really care about the prospects, so we probably should do some filtering. Let’s add a filter.

Christopher Penn – 37:20
We want our deal status to equal prospect, so that we’re only getting our prospects. We have 880 prospects in our CRM. So far, so good. Now, the next thing is for testing. The worst thing you could do is try to run this on all 880 records, especially if you’re using a cloud API, which is going to cost you every time you run it during development. Put in a limiter, like just use two records, because otherwise you could be here for ages and ages. We run a quick test here and make sure that it came out with two items. You take a quick look and go, “Okay, yes, this looks like our data.” In production, a lot of this part would be replaced by your actual CRM.
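
Outside of n8n, the same filter-and-limit step can be sketched in plain Python. The column names and synthetic rows here are assumptions modeled on the fake CRM file described above:

```python
import csv
import io

# Sketch: keep only records whose deal status is "prospect", then cap the
# batch at two records while testing, exactly as the limiter node does.
synthetic_csv = """name,deal_status
Alice,prospect
Bob,closed-won
Cara,prospect
Dan,closed-lost
Eve,prospect
"""

records = list(csv.DictReader(io.StringIO(synthetic_csv)))
prospects = [r for r in records if r["deal_status"] == "prospect"]
test_batch = prospects[:2]  # development limiter: only two records per run
```

Once the end-to-end flow is proven on the small batch, you remove the slice and let the full prospect list through.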

Christopher Penn – 38:16
Our next step in the process: it’s time to put in the engine. Let’s put in a basic AI engine. We’re going to define this. For this part, we’re going to take the exact prompt that we were using earlier, that we tested out, and we’re going to paste it in here. Let’s go ahead and just tie this on the end. This is where you have to decide which AI model you want to use. This again goes back to all your requirements. I’m going to be using Ollama, which is running a local version of Gemma on my computer, because I don’t want to send Katie.

Katie Robbert – 38:58
A bill, thank you.

Christopher Penn – 39:01
Let’s take a look at our expression here. There you can see the fields from our CRM data are in here. There’s our prompt example. This looks good. Now, let’s go ahead and run a quick test. We should see if we did this right. Our local AI. There it is, churning away, thinking about the answers and processing that data. Once we’ve got that part, let’s see, we’re going to need to get the data out of it, because the data.
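
Done outside n8n, that call might look like this sketch against Ollama’s local `/api/generate` endpoint. The model tag `gemma3` is an assumption, so use whatever `ollama list` shows on your machine, and the request itself is left as a comment so nothing runs without a local server:

```python
# Sketch: build the JSON body for Ollama's local /api/generate endpoint,
# so the prompt and CRM record never leave your machine.

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_ollama_request(prompt: str, model: str = "gemma3") -> dict:
    """Build the request body Ollama's generate endpoint expects."""
    # stream=False returns the whole completion in one JSON response.
    return {"model": model, "prompt": prompt, "stream": False}

body = build_ollama_request("Score this lead against the ICP...")
# In production, roughly:
#   reply = requests.post(OLLAMA_URL, json=body).json()["response"]
```

The same pattern works with any local model family; staying local is what keeps the bill, and the data exposure, at zero.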

Katie Robbert – 39:38
I was going to ask, I’m like, “Okay, but then how do you get it?”

Christopher Penn – 39:43
Exactly. Now we need to add a Set step. The Set step is going to take the results from our AI and turn them back into something usable. Let’s test this step. Now, it cleans up this JSON, which is great; we want that. From there, we need to get just the lead scores. Remember, we have two parts to this prompt: there’s a thinking part, where the model gets to foam at the mouth, and then there’s the actual number. You may or may not want to keep the foaming-at-the-mouth part. You might want to keep it, maybe as an audit trail, but you might not. So let’s take our Set field here. We’re going to call this lead score, because that’s what it is.

Christopher Penn – 40:38
We have to extract out the lead score from that great big huge JSON array. Now let’s see if we did that right.
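
That extraction step can be sketched in Python as follows. It assumes the prompt asked for a JSON object with `thinking` and `lead_score` keys, and it strips the markdown fences some models wrap around their JSON:

```python
import json
import re

# Sketch: pull the numeric lead_score out of the model's JSON reply,
# keeping the "thinking" text as an optional audit trail.

def extract_score(raw: str) -> tuple[int, str]:
    """Strip code fences, parse the JSON, return (score, audit_trail)."""
    cleaned = re.sub(r"```(?:json)?", "", raw).strip()
    data = json.loads(cleaned)
    return int(data["lead_score"]), data.get("thinking", "")

raw_reply = '```json\n{"thinking": "Strong ICP match.", "lead_score": 68}\n```'
score, audit = extract_score(raw_reply)
```

Keeping the audit string alongside the score is what lets you diagnose later, as Katie notes below, where a bad score came from.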

Katie Robbert – 40:49
Now, as you’re putting that together, my thinking here is that as much as I probably don’t want to read through what you call the foaming at the mouth, it is probably a good thing to have, to your point, for an audit trail, because how else will you know if it goes wrong, where it went wrong? So you may not need to look at it until something goes wrong, but then you’ll be glad you have it.

Christopher Penn – 41:12
Exactly. Certainly, I think for the initial R&D part, you absolutely have to have this, the diagnostics. So let’s see if we got it. Yep, there’s the lead score. It came up with a 68 for our test record, which is great.

Katie Robbert – 41:35
Now you’re not chasing that one, John. Yeah.

Christopher Penn – 41:43
Now we have to convert that JSON data. In this case, I’m going to convert it to a CSV. In reality, this is where you would hook it back up to your HubSpot system and say, “Update my system.” The output here goes to my desktop, in the live stream folder, as a scored leads CSV. Now let’s tidy this up a little bit. This is essentially the flow to implement and get started with generative AI lead scoring. After this, once you’ve done your testing, and you’ve beaten it up, and you’ve looked for variations and all the weird stuff that can be in your CRM, you would change the upfront node to be your CRM, to say when a new contact comes in; maybe you might have some gating on that.
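
In plain Python, that final conversion step might look like this sketch; the records and field names are illustrative, and in production this node would instead push the score back to your CRM via its API:

```python
import csv
import io

# Sketch: merge each record with its score and write a scored-leads CSV,
# mirroring the last node of the n8n workflow described above.

scored = [
    {"name": "Alice", "lead_score": 68},
    {"name": "Cara", "lead_score": 41},
]

buf = io.StringIO()  # stands in for the output file on disk
writer = csv.DictWriter(buf, fieldnames=["name", "lead_score"])
writer.writeheader()
writer.writerows(scored)
output_csv = buf.getvalue()
```
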

Christopher Penn – 42:45
When a new contact comes in from certain forms on your website, run through this whole process, and then pop the score back on. This has to stay in operation: the server this is running on has to be on 24/7 so that the data gets processed. Or, the highest level of automation is agentic, right? Turn this into an agent. Here’s what I would do again if this was in production: n8n allows you to grab your configuration from its interface. You can see in here, it produces a nice JSON file that has the prompts and all the settings and stuff inside of it. JSON is a computer language.

Christopher Penn – 43:39
I can go to a generative AI tool and say, “Convert this into Python code, and make me a bespoke custom app that I can put into production and host on the server of my choice somewhere.” Then it runs all the time without me, and I’m no longer part of it. So that’s the highest evolution. Once you’ve done the prompting, you’ve done the testing, you’ve done the data cleansing, you’ve built the workflow, you’ve tested the workflow, you’ve built the automation, you’ve tested the automation, you build the software, you deploy the software. Now your bespoke lead scoring lives somewhere in the cloud, and it rains dollars on your sales, right?

John Wall – 44:21
Just a bucket of money, right?

Christopher Penn – 44:23
Yep. One thing to think about, though, is that this is a very basic process, right? You can obviously include other things, the page views and stuff like that. What you could do in this flow is add things to it. You might have a second part of the LLM chain that says, “Here’s who this company is, here are the products and services we know they’re interested in based on the pages they visited; write the suggested email for the sales rep to reach out.” If you have a standard sales playbook, which we talk about often, maybe there’s something from the playbook that automatically gets put in, and it gets put back into the CRM software. So now, instead of, you know, John having to go, “I’ve got 480 leads to follow up on today.”

Christopher Penn – 45:10
Now John can go and say “send, send, send.” You could fully automate that. I would not recommend it.
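
Such a second chain step might be sketched like this; every field name and the playbook snippet here are illustrative assumptions:

```python
# Sketch: a second LLM-chain step that turns a scored lead plus its page
# history into a draft-email prompt. The output is a prompt for another
# model call, not an email that gets sent automatically.

def build_followup_prompt(record: dict, playbook_snippet: str) -> str:
    """Turn a scored lead into a draft-email prompt for a second model call."""
    pages = ", ".join(record.get("pages_visited", []))
    return (
        f"Company: {record['company']}\n"
        f"Lead score: {record['lead_score']}\n"
        f"Pages visited: {pages}\n\n"
        f"Sales playbook guidance: {playbook_snippet}\n\n"
        "Draft a short, personalized outreach email for the sales rep. "
        "The rep reviews and sends it; do not send automatically."
    )

followup = build_followup_prompt(
    {"company": "Example Co", "lead_score": 68,
     "pages_visited": ["/pricing", "/services/ai-consulting"]},
    playbook_snippet="Lead with the prospect's stated pain point.",
)
```

Note the design choice baked into the last instruction: the human rep stays in the loop, matching Chris’s caution against fully automating the send.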

Katie Robbert – 45:20
What do you think, John? Are you feeling excited and motivated, or are you feeling replaced?

John Wall – 45:27
Oh, no, you’re never replaced, right? Because the goal is, in most organizations, once they light it up, the problem is always too many leads. Some people call them leads, other people call them contacts, other people call them bums. It’s a matter of how you label them. But yeah, anything that helps you sort and just get through the grunt work faster is going to help you get to the point where you’re actually talking with people and really making deals happen, trying to close business. So yeah, automate it all. That’s my mantra.

Katie Robbert – 46:01
Awesome. All right, Chris, I hope you know that we will be setting aside time to do this for Trust Insights, because there’s definitely some cleaning of our CRM that we could be doing, and better lead scoring than we have. So if you are inspired by this episode, great, give it a shot. If you have questions, you can certainly reach out to us in our free Slack community at trustinsights.ai/analyticsformarketers. Or if you’re just like, “You know what, I don’t even want to deal with it; you do it,” go to trustinsights.ai/contact, and you will get the one and only John Wall, who will help you set up lead scoring with generative AI. He will say, “Here’s what you do.” We’re going to set you up with Chris.

John Wall – 46:49
We’ve got stuff for Chris to do. I feel like we should have a George Wendt Cheers tribute. As a Boston-based company, I think so. We have to give a big “Norm!”

Katie Robbert – 46:58
Norm.

Christopher Penn – 47:01
The part I will end with is this: you really have to get your requirements gathering right. If you get that wrong, as these systems get more complex, it just gets to be more and more work to rework them. If you nail the requirements, you really spend some time and say, “If I could have AI do X and Y and Z within my HubSpot, or my Salesforce, or my ActiveCampaign, I would like it if it could do that.” That way, you can start having the machines help you plan and generate all this stuff. Because obviously, from what you saw today, from a very toy example, it’s not rocket surgery to put it all together. But the higher up that ladder you go, the more things can go wrong, and the harder it is to make changes afterwards.

Christopher Penn – 47:57
That is, if you didn’t get your requirements gathering done right up front. If you’re going to spend and invest money in a process like this, put 50, 60, 70% of your time and money into the requirements gathering. It pays dividends like crazy. Because the software part, you know, AI handles that, but the AI cannot guess what really works at your company. Spend some time building your ideal customer profile. Get that right, because it is the hinge on which this whole thing pivots. If the ICP is wrong, your lead scoring is wrong.

Katie Robbert – 48:33
I would add to that: definitely think through what it’s building off of, your ICP, but also what things about your ICP do you want to score? How do you want to score them? How heavily do you want to weight them? Not everything is going to carry the same weight, so really think those things through, because you can make a really sophisticated automated lead score if you do. I might say, “You know what? I don’t care about company size. I care about team size.” John could counter with a different argument. We need to be on the same page in order for this to work, because you don’t want to have.

Katie Robbert – 49:15
Well, here’s the Katie version of a lead score, and here’s the John version of a lead score. That, I feel like, opens it up to a lot of potential conflict and no sales being made.

Christopher Penn – 49:27
Exactly. A lot of people stealing each other’s lunch in the break room on Friday. That’s going to do it for this week’s show, folks. Thanks for tuning in, and we will talk to you on the next one. Thanks for watching today. Be sure to subscribe to our show wherever you’re watching it. For more resources and to learn more, check out the Trust Insights podcast at TrustInsights.ai/podcast and our weekly email newsletter at TrustInsights.ai/newsletter. Got questions about what you saw in today’s episode? Join our free Analytics for Marketers Slack group at TrustInsights.ai/analyticsformarketers. See you next time.

 


Need help with your marketing AI and analytics?

Get unique data, analysis, and perspectives on analytics, insights, machine learning, marketing, and AI in the weekly Trust Insights newsletter, INBOX INSIGHTS. Subscribe now for free; new issues every Wednesday!

Click here to subscribe now »

Want to learn more about data, analytics, and insights? Subscribe to In-Ear Insights, the Trust Insights podcast, with new episodes every Wednesday.


Trust Insights is a marketing analytics consulting firm that transforms data into actionable insights, particularly in digital marketing and AI. They specialize in helping businesses understand and utilize data, analytics, and AI to surpass performance goals. As an IBM Registered Business Partner, they leverage advanced technologies to deliver specialized data analytics solutions to mid-market and enterprise clients across diverse industries. Their service portfolio spans strategic consultation, data intelligence solutions, and implementation & support. Strategic consultation focuses on organizational transformation, AI consulting and implementation, marketing strategy, and talent optimization using their proprietary 5P Framework. Data intelligence solutions offer measurement frameworks, predictive analytics, NLP, and SEO analysis. Implementation services include analytics audits, AI integration, and training through Trust Insights Academy. Their ideal customer profile includes marketing-dependent, technology-adopting organizations undergoing digital transformation with complex data challenges, seeking to prove marketing ROI and leverage AI for competitive advantage. Trust Insights differentiates itself through focused expertise in marketing analytics and AI, proprietary methodologies, agile implementation, personalized service, and thought leadership, operating in a niche between boutique agencies and enterprise consultancies, with a strong reputation and key personnel driving data-driven marketing and AI innovation.
