In this week’s In-Ear Insights, Katie and Chris talk about the newest advances in natural language generation and walk through an example of what’s available now for creating content with the assistance of AI. Watch the demonstration, listen to the implications for marketers, and start formulating your AI-based content marketing strategy. Tune in to find out how!
What follows is an AI-generated transcript. The transcript may contain errors and is not a substitute for listening to the episode.
Christopher Penn 0:02
This is In-Ear Insights, the Trust Insights podcast.
In this week’s In-Ear Insights, we’re talking about natural language generation and some of the newest capabilities that are available to everybody with what’s going on.
So a quick bit of background: for three years now, there have been these models called Transformers, just a type of software, that have been used by artificial intelligence researchers to generate language. You give it something and it’s kind of like autocomplete, right? Where if you’re typing along on your phone, your phone’s going to try and guess the next word, this software guesses the next sentence.
And the models have been evolving over time: GPT-1 was three years ago, GPT-2 was two years ago, GPT-3 is this year.
And the company putting these things together, a consortium called OpenAI, has released access to GPT-3, but not the model itself, because of, you know, the usual claims about proprietary secrets and all this stuff.
And so a lot of competing software models, again, other companies trying to copy it, have been released.
And those are the ones that you can test out and run.
And so where we are today is we now have a plethora of models to choose from, again, different pieces of software, to create language. And instead of autocompleting a word or a sentence, they’re now starting to do things like paragraphs or entire documents.
And that’s the thing that I think is so fascinating and has such strong implications for marketing.
So Katie, when I talk about a model like, for example, GPT-J-6B, what’s your first inclination?
Katie Robbert 1:52
I understood the word plethora, and then everything else went over my head.
No, I understand, you know, the concept of the autocomplete based on, you know, things you have previously typed.
And so that’s where, as a marketer, I would need it broken down into: what does this mean for me? How do I implement something like this? How does this change my daily job function, besides, you know, my text messages to my group text suddenly getting, you know, more savvy or smarter than I am? You know, what does this mean for me as a marketer? And so, while I find GPT, BJ, 2QR, whatever it was, interesting, do I need to understand the mechanics of it? Or do I just need to understand how it impacts me in the immediate?
Christopher Penn 2:52
Yeah, it is how it impacts you and, more importantly, how it’s going to impact the people that you do business with, right, and what’s going to change for their business.
So let’s do an example.
Because I think an example probably would help illustrate this, I’ve pulled up a press release.
So this is from Cision.
This is a press release.
And I’m going to take a few lines of text here, and I’m going to go over to the 6B model here.
I’m going to put this in and say run this model.
And what this software is going to do is it’s going to read everything that is in the first few paragraphs of this press release.
Right and this goes on and on and on and on and on as press releases do.
And it’s going to attempt to infer what would the rest of this press release be about? Right.
And try to figure out, could it write the rest of it.
So you have the chainmail grill brush, the upset grill pro, the chainmail grill brush pictured.
So it’s starting to essentially try to autocomplete what the rest of this release should have said.
Let me pull up another example; this is one that I did yesterday.
Whoops, there we go.
So this is one that was about trees, and how they affect your plumbing.
And the rest of this release goes on and on and on, as they do.
And the revised version you can see here in the second half.
Its ways to avoid plumbing problems include having tree roots be placed five feet from the bottom of your home, planting fruit trees 12 feet away from the home, and keeping mulch down so that tree roots don’t go down as deep into the soil.
And when you compare it to the second half of the actual release, I actually kinda like the one the machine generated better, because it’s more helpful and makes more sense.
Oh, these are the things I should be doing to not ruin the pipes that go into my house with, you know, tree roots ripping them up.
And so from a marketing operations perspective, this is helping us essentially do the first draft on marketing copy.
And it’s not always perfect.
Sometimes not even great.
Sometimes you get salad.
But with each generation model, the occurrences of salad get less and less and less.
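Editor’s note: the demonstration above hands the model some opening text and lets it continue. As a rough illustration of that "autocomplete" idea only, here is a toy sketch in Python that counts which word tends to follow which and then extends a prompt one word at a time. It is not how GPT-J or any Transformer is implemented; the function names and the tiny training text are invented for this example.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def autocomplete(follows, prompt, max_words=5):
    """Extend the prompt by repeatedly appending the most common next word."""
    words = prompt.lower().split()
    for _ in range(max_words):
        candidates = follows.get(words[-1])
        if not candidates:
            break  # the model has never seen this word, so it stops
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)


# Hypothetical training text, invented for this illustration.
corpus = (
    "tree roots can damage pipes . "
    "tree roots can damage drains . "
    "tree roots grow toward water ."
)
model = train_bigrams(corpus)
print(autocomplete(model, "tree", max_words=3))
```

Given the prompt "tree", the sketch continues with the words it has seen most often in that position. A prompt it has never seen, such as "grill", comes back unchanged, which is one small way to see why the training data matters so much.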
Katie Robbert 5:24
So how, without getting too deep into the technical weeds: you fed it two paragraphs from a press release that was already written. How does it know where to take it next? Is it pulling from? I’m assuming, from what I know about AI, that, you know, it can only pull from what it knows.
So it was obviously fed other data in order to pull from. So is it going into its, you know, large, vast database of other content and looking for, okay, what content do I have on grills, because that seems to be the prevalent keyword? Or what content do I have on trees and plumbing, because those seem to be the prevalent keywords, and pull out what I know?
Christopher Penn 6:11
So where this particular model, and that’s actually a really important question, where did this thing get its data from? In the white paper that these folks released, there’s this thing called the Common Crawl, which is essentially a crawl of a massive number of websites; PubMed, which is the medical archive; arXiv, which has academic papers; GitHub; FreeLaw; Stack Exchange; the Patent and Trademark Office; Project Gutenberg, which has books; subtitles for TV shows.
So you can see this is a massive, massive corpus of text that they’ve pulled data from, that helps it, like you’re saying, infer the subject, and then look to see what are the things that it knows. Books3 is a huge corpus of public domain and copyright-released books.
So when it pulls up something like plumbing, it can look in those books and say, okay, well, what things do I know about? What’s actually very interesting about this particular academic paper, that I felt was very nice, is that in the appendix, they also tell you what is not included. So they’ve specifically excluded some things that they said they don’t want in the model because they cause problems: fan fiction, the Congressional Record, etc.
Katie Robbert 7:34
I find it interesting that Wikipedia is included.
Because I personally don’t find Wikipedia to be a quote unquote, credible source.
The reason and sort of a quick anecdote was because I have this friend who was responsible for updating Wikipedia for clients.
And so he took it upon himself to include a Wikipedia entry for his dad, who was just a regular guy, and wrote this whole article about how his dad invented peeps, which was not true.
But Wikipedia took the information and now it lives out there as something that is internet-true, I guess.
And so I have a little bit of pause when I see that Wikipedia is one of the sources of information. The other ones, PubMed, all those other abstracts, that all makes sense to me, because I know the methodology behind vetting that information, but Wikipedia to me stands out as the, hmm, is that really the best source of information?
Christopher Penn 8:43
Yep, I would actually argue Common Crawl also falls into that category too, because Common Crawl incorporates the extraction of data from places like Google News and things.
And there are plenty of news sources that are not sharing truthful news.
You know, without getting too political, there’s a whole bunch of things like yeah, that’s not actually true.
But part of the reason that these sources are included is because they contain well-structured language.
So understanding and being able to predict what’s the next word, you know, the logical word in a sentence, would be something that these sources would be helpful for, even if they’re factually not true.
Because one of the things about language generation is that the outputs are not supposed to be factually true.
They are supposed to make linguistic sense.
So when we craft, for example, a piece of fiction or a blog post, we will still have to edit it; you and I will have to edit it to make sure that what it is saying is factually correct.
It will just mean less editing of language, or, you know, in the case of some folks I know, less just staring at a blank page, thinking, I don’t know what to write about this topic.
Katie Robbert 9:52
Hmm, that makes sense.
And I think that understanding that nuance is going to be super helpful in terms of implementing technology like this into something like your content marketing program.
So I think that there’s the assumption that, well, if AI can write for me, then I’m out of a job, I no longer have to create original content.
Nope, you still have to edit it, make sure that it’s factually correct.
AI might get you to a place where you have a grammatically correct piece of content.
But the information might be a whole bunch of lies.
And so that’s where that human judgment piece comes in, and that always goes back to: will AI take my job? The answer is no, your job is just going to look different.
I’m definitely one of those people where I would have an easier time editing something than I would starting from scratch.
And so that would be helpful to me, if I could say I want to write about, you know, process development, but I don’t even know where to start, but the AI could get me started.
And then I’d be like, okay, now I know where to take this information, that would be something as a marketer, that would be super, super helpful in kickstarting the writing process.
Christopher Penn 11:05
And that, I think, is where there’s a lot of value to what these things can do, again, with the understanding that they’re not perfect, and that there can be some pretty substantial issues with it.
But it’s a good starting point.
So you know, for fun, we can take something that you said in a previous episode, and we put that in and say, try to create something new from this.
And you can give these prompts, like write a thing about this, or, in my case, if you’re working on a blog post and you’re not sure what to say next, put in what you’ve written so far, and it will do an autocomplete.
And then you say, Yeah, that makes sense.
Or Oh, no, I wanted to write about that instead.
And you know, to your point, it is useful.
This could be just like, so here, it actually interprets what you were saying, because of the way you were using language as an interview.
This is directly copied from our podcast, right?
Katie Robbert 12:09
Like, I don’t know any of these people.
I don’t know Kevin Balinese or Victor Parsec.
Christopher Penn 12:17
Neither do I.
But you can see that it picked up on the language pattern of a podcast and tried to construct more similar text.
So it doesn’t understand thematically what we’re talking about.
But it does understand the language type we’re working with.
Katie Robbert 12:35
That’s really interesting, because I’m looking down, I’m like, Well, I can’t use that.
And so that, to me, says, okay, so either we’re not using the right content to model off of, or there’s some heavy editing that would need to be done in place, which again, is that human interaction with the AI. Now, is this one of those things where, for you personally, for your account, the more that you use it on your content, the more it starts to understand your writing style? Or is it just sort of generic for everybody, and everybody kind of gets the same output? Like, does it learn about you specifically? Or is it just learning in general?
Christopher Penn 13:18
In these pre-trained models that have been released, these are the generics; these are the things that have been largely just put out there.
Because this particular model is available on GitHub, you can actually import it and bring it into, like, your own machine learning instance. You can then do what’s called fine-tuning, where you upload your blog posts, your podcasts, and things, and then, yes, it starts to learn and tune against the things that you provide. And the more things you provide it, obviously, the faster it begins to train and expect certain types of language that you would use that maybe the general model wouldn’t find as appropriate.
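Editor’s note: real fine-tuning adjusts the internal weights of a neural network, which is well beyond a few lines of code. Purely as an analogy, the toy sketch below starts from word-pair counts learned on "generic" text and then adds counts from your own text, weighted more heavily, so that the predicted next word shifts toward your language. All names, the weighting scheme, and both sample texts are assumptions invented for illustration.

```python
from collections import Counter, defaultdict


def count_bigrams(text, follows=None, weight=1):
    """Add next-word counts from text to a new or existing model."""
    if follows is None:
        follows = defaultdict(Counter)
    words = text.lower().split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += weight
    return follows


def next_word(follows, word):
    """Return the most frequent follower of word, or None if unseen."""
    candidates = follows.get(word.lower())
    return candidates.most_common(1)[0][0] if candidates else None


# Hypothetical "generic" text standing in for a large public corpus.
generic = "our data is big . our data is big . our data is messy ."
base = count_bigrams(generic)
before = next_word(base, "is")

# Hypothetical text standing in for your own blog posts and transcripts.
# Note: this updates the same model object in place, as a stand-in for tuning.
yours = "our data is actionable . your data is actionable ."
tuned = count_bigrams(yours, follows=base, weight=3)
after = next_word(tuned, "is")

print(before, "->", after)
```

Before the extra text is added, the most likely word after "is" comes from the generic material; afterwards it comes from the text you supplied. The more of your own material you add, the more the predictions lean your way, which is the intuition behind the fine-tuning described here.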
Katie Robbert 13:55
And so then, if I’m a marketer, which I believe I am some days, if I wanted to bring something like this into my team, into my practice, what are the things that I need to, you know, be aware of? What are my business requirements for bringing something like this into my practice?
Christopher Penn 14:19
You need a machine learning engineer that is fluent in Python.
And you need a machine learning instance of some kind, like Google Colab; for example, we have a $10-a-month plan with them.
And you import this code, and you have the engineer essentially copy the model and set up fine-tuning.
And then you and your engineer work together to say, I need copies of emails, I need copies of white papers and webinars and transcripts and things to give to the model to fine-tune.
And then they run this and, you know, they tune up the model.
At that point, you’ve essentially created a new model that’s tuned more to you, and you keep refining it and tuning it.
And over time, obviously, it spits back more of the things that you want to see.
But in terms of what you need up front, you need that machine learning instance, the tools, you need the ingredients, which is the data, and you need the chef, the person to actually make the thing.
Katie Robbert 15:20
Is there a version of this that exists where I could just literally navigate to a website, paste my stuff in, and then get that back? Or do I really have to have that machine learning instance? Like, are there companies that host this, where I could just sign up for an account?
Christopher Penn 15:39
There are starting to be; like, our friends over at MarketMuse offer a similar capability.
But for building that model truly customized to you, you kind of have to have it running in your own instance.
Because even SaaS companies will have data that is across their client base; they could not afford to maintain the computational cost of an individual model for every single company. They might have models that are tuned to an industry, like, our healthcare clients will use this model, our B2B tech companies will use this model.
And honestly, you know, for PR companies, PR firms and ad firms and stuff like that, that’s going to be part of the roadmap for them for the future if they want to survive: they will have to have the ability to offer custom, or at least industry-tuned, models for this sort of thing.
But yeah, if you want something that’s you, you got to make it for you.
Katie Robbert 16:34
Got it. Um, no, and I was thinking about, you know, some of our PR friends, where all they do is, you know, create blog content and press releases.
And so it sounds like if they don’t have the capabilities themselves, then they should be looking for partners like Trust Insights, who could build that infrastructure for them, and generate that content to pass off to them to give to their team to edit, which I think would make a lot of sense.
So it sounds like, if you don’t have the capabilities, you should start looking for a partner that does, so that you don’t fall behind in terms of where the industry is going with creating original content and churning out content. We were talking last week on the podcast, or a couple of weeks ago, I don’t remember now, time is sort of irrelevant, about how the changes to Google’s search engine are going to impact SEO.
And having all of that sort of 360 contextual content around a topic is really what’s going to set you up for success with people being able to find you and you staying at the top of the search results.
Well, in order to do that, you need to be able to create that content.
In order to create that content, you need to have the resources and if you’re lacking on resources, which we know a lot of teams are right now, you need to have something like this AI paragraph autocomplete system in your back pocket in order to keep up with the demand.
And so it sounds like there’s a lot of things that would need to happen.
First thing I would do is start looking for a partner that could help me.
Christopher Penn 18:10
I would say there’s actually even a step zero before that, which is, and, you know, the more savvy PR professionals do this already, people should be curating the content that they think is best in class, whether it’s theirs, whether it’s their clients’, whether it’s competitors’. You know, you have a library of: these are the blog posts that did really well in our industry, these are the press releases that actually got somebody to click on them.
These are the top-performing YouTube videos and the associated transcripts. You don’t need an engineer for that.
And it’s actually, frankly, a waste of their time to help curate and pull together that library. This is the best of the best that represents what we want to be, the kind of content we want to create. Have that compiled so that when you do get to the point of either bringing it in house, hiring an agency, you know, hiring a machine learning engineer, whatever, you have that corpus of materials that says, this is what I want, and it will drastically shorten the amount of time it will take to get that thing up and running.
Because otherwise, you know, as with any of these models, they’re only as good as the data you feed them.
So if you were having to wait for, you know, a new white paper every quarter, it’s gonna be a long time before that model spits out something that’s, that’s usable.
On the other hand, if you feed it, you know, 1,000 white papers from your industry that are all relevant and important, it will speed up the development of the model really fast.
So that step zero is to already be aggregating that content.
And I know some folks do that because, you know, they want to be able to train their teams to say like, this is what we expect you to create.
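Editor’s note: the "step zero" curation described above needs no machine learning at all. As one possible way to organize it, this Python sketch keeps only pieces above a performance threshold, drops verbatim duplicates, and writes the result as one JSON record per line. The field names, the score threshold, the sample library, and the choice of a JSON-lines layout are all assumptions made for illustration, not a format any particular tool requires.

```python
import json


def curate(items, min_score, min_words=5):
    """Keep unique, high-performing pieces, best first."""
    seen = set()
    kept = []
    for item in sorted(items, key=lambda i: i["score"], reverse=True):
        text = " ".join(item["text"].split())  # normalize whitespace
        if item["score"] < min_score or len(text.split()) < min_words:
            continue  # too weak or too short to be a good example
        key = text.lower()
        if key in seen:
            continue  # skip verbatim duplicates
        seen.add(key)
        kept.append({"source": item["source"], "text": text})
    return kept


def to_jsonl(records):
    """Serialize records as one JSON object per line."""
    return "\n".join(json.dumps(record) for record in records)


# Hypothetical content library; "score" might be clicks, views, or shares.
library = [
    {"source": "blog", "score": 920,
     "text": "Five ways to build a KPI map that your team will use."},
    {"source": "press", "score": 40,
     "text": "Company announces a thing nobody clicked on today."},
    {"source": "blog-copy", "score": 900,
     "text": "Five ways to build a KPI map  that your team will use."},
    {"source": "video", "score": 610,
     "text": "Transcript of our top performing analytics walkthrough video."},
]
corpus = curate(library, min_score=500)
print(to_jsonl(corpus))
```

The point of the sketch is the division of labor: a marketer can decide what counts as "best in class" and assemble the library, and an engineer can later pick up the compiled file without having to do the curation.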
Katie Robbert 19:48
Yeah, that’s interesting.
So it really becomes more of a research project first.
To put all of that together into one place. And, you know, something that John and I will be talking about on an upcoming live stream is how to research efficiently.
And, you know, first and foremost, it’s, you know, knowing what it is that you want to accomplish.
And so you’re not just going down that black hole of googling around all these different topics.
But you know, Chris, to your point about that step zero.
It’s really, what do we want our tone and message and branding to be, when we’re publishing content, even when we’re doing it on behalf of a client, it still has to have our stamp on it.
That’s why they came to us to write for them in the first place.
Christopher Penn 20:35
Think about this: the model is the cooked dish, right? To make a dish, you need a recipe, you need ingredients, you need tools, and you need a chef, someone with skills, right? So all that data that you collect, all those ideal things, that corpus, that’s your ingredients. Your recipe is the GPT pre-trained instance. Your tools are things like Python and Google Colab.
And then the chef is going to be that combination of the machine learning engineer and probably a subject matter expert, because, you know, depending on how specialized you are, they’ve got to look at things and tell you, does this even pass the sniff test? Like, if this thing spits out a press release and you don’t have the experience to know. For example, if it spits out something about viruses and talks about spike proteins and, you know, ACE2 receptors, that subject matter expert has to look at it and go, ah, that makes absolutely no sense.
Whereas the layperson will probably not know, you know, the difference between a furin spike and an ACE2 receptor, the expert can say, that’s right, that’s not right.
And then you can help retrain the model.
Katie Robbert 21:43
Yeah, I definitely don’t know the difference between those two things.
I would need an expert to help me. But I think that that’s also an important distinction: you know, you need to understand the content that’s coming back to you.
And that goes back to doing your upfront research. You know, it’s not a shortcut in terms of, well, I was assigned a topic that I don’t know anything about.
So let me just have the AI do it.
Well, that doesn’t help you.
If you still don’t know what the topic consists of, then you have no way to edit it, other than to make sure that it’s grammatically correct; it could still be full of lies.
Christopher Penn 22:19
So I would say we are one step closer to the marketing singularity, in the sense of, you know, machines doing more stuff and taking lower-value tasks off our plates.
But to your point, this is not a replacement for people.
This is not a replacement for expertise.
This is an augmentation that does help speed things up.
But I certainly would not just put a prompt in here and then immediately copy and paste into my blog; that would be a recipe for disaster.
Katie Robbert 22:54
I would agree with that.
And so if you’re interested in learning more, or you’re looking for a partner to help you or you just have general questions, you know, you know where to find us.
You can find us in our Slack group at TrustInsights.ai/analyticsformarketers, find us on our website, TrustInsights.ai, hit up our contact form, or find us on social media at Trust Insights on most channels.
Christopher Penn 23:18
All right, well, I hope that you have some fun building things, and we will talk to you next time.
Need help making your marketing platforms, processes, and people work smarter?
Visit TrustInsights.ai today and learn how we can help you deliver more impact.