INBOX INSIGHTS: All About Generative AI (4/12)
An overview of generative AI
We’ve well established that of the two of us, Chris is the technologist and I am not. Not to sell myself short, but I’ll never be as in the weeds with technology as he is. That’s not to say that I don’t understand it; I can speak the language well enough.
I joke that it’s like sports: I understand enough to hold an intelligent conversation. I know which teams are winning and who the star players are. I know the rules of the game and what the fans care about. Once you start digging into the stats of individual players and having strong opinions about local versus national commentators, I’m out.
With that context, I want to talk about what you, a marketer who isn’t a data scientist, need to know about generative AI. This is a topic that, for a lot of reasons, including the future of your career, you should understand well.
In this week’s podcast Chris and I talked about what generative AI is. We got pretty technical, so I wanted to provide a more user-friendly version.
If you want even more resources about AI, check out our friends over at the Marketing AI Institute.
Let’s start at the top. As stated in the podcast, Artificial Intelligence is an umbrella term for programming computers to replicate human intelligence tasks of all kinds. There are three main areas of Artificial Intelligence: Regression, Classification, and Generation. Regression is finding the information, Classification is organizing the information, and Generation is creating something with what you have found and organized. Personally, I struggle to remember the three types of AI most days. I can usually remember two, and never the same two. Try this instead: FOG – Find, Organize, Generate. First you have to find the information, then you have to organize it to be able to use it, then you can generate what it is that you need.
What I have learned is that you cannot have only one kind of AI. You need all three to work in concert to have a valuable output. Now that we all have a basic understanding, let’s dig deeper into generative AI.
Everyone, and I mean everyone, is talking about generative AI. Well, more specifically they are talking about ChatGPT and similar systems. The ChatGPT interface works by using generative artificial intelligence. Since we’re all obsessed with this shiny new toy and trying to figure out how to use it to do our work, let’s make sure we know what we’re getting into.
Generative AI is exactly what it sounds like: it generates things. Ok, that’s straightforward. If you recall back about twenty seconds ago, we stated that generative AI needs regression and classification to operate. Great, still not overly complex. Here’s where I want us to focus – what generative AI cannot do. So many of us have gotten wrapped up in letting AI write our content for us that I wanted to take a moment to understand where it’s generating from.
Generative AI relies on information. You need to feed it something to work with. This means that generative AI and systems like ChatGPT are not creating original content. Think about the music industry. You could argue that you cannot write an original melody anymore. There is not an infinite number of musical notes to use. If that’s true, that means all “new” music is referencing something that was already published.
Sometimes it’s a straight up sample, like how Montell Jordan’s hit, “This is how we do it” samples Slick Rick’s, “Children’s Story”. And sometimes it’s copyright infringement, like how Ed Sheeran went to court for seemingly sampling “Let’s get it on” in his song, “Thinking out loud.”
If you ask ChatGPT to write you a song, it will first go to all the resources it has to understand what a song is. Then it will reference those resources to write the song you asked for. Is it original content? Sort of. You run the risk that the lines written are an exact copy of something pre-existing, even if they are new to you.
The point is that generative AI cannot create something that has never existed before. It relies on finding existing reference material to understand what you’re asking it to do. This is why the topic of “AI generated content” is such a big deal in marketing. We work hard and take days (sometimes weeks) writing our blogs, white papers, and social copy. Systems like ChatGPT can write it in a matter of moments. However, unless you are diligently editing, you could be publishing something that is copying pieces from other posts. This won’t do you or your brand any favors.
In some cases you want generative AI to replicate what already exists. You can ask for a recipe for pizza dough. The system will find references for making dough, organize the information in a consumable way, and then generate a recipe for you. In this instance, referencing pre-existing content is what you’re after. But again, the AI isn’t making it up; it’s looking at what’s already out there. It needs to learn first and then create.
As generative AI gets more sophisticated, we need to understand the basic mechanics of how it works so that we can learn to use it properly. Over the next few months, it will be harder to tell the difference between human generated and AI generated content. To be fair, it’s already difficult and we didn’t even cover deep-fakes. The models that power generative AI will learn faster, become more refined, and generate content that mimics specific tones and voices more closely. If you want to stay a step ahead and use AI, you need to understand what it can and cannot do.
Are you using generative AI?
Reply to this email or come tell me about it in our free Slack Community, Analytics for Marketers.
– Katie Robbert, CEO
Do you have a colleague or friend who needs this newsletter? Send them this link to help them get their own copy:
In this week’s In-Ear Insights, Katie and Chris discuss generative AI, one of the three major branches of artificial intelligence. This includes tools like ChatGPT, Google Bard, and Microsoft Copilot. They start by defining artificial intelligence and the three big categories within it: regression, classification, and generation. Generative AI makes things and allows people to interact with artificial intelligence without having to be an expert. However, it can’t create something truly unique that has never been seen before, and it doesn’t do well with vagueness. The models are also being used unethically to create misinformation, disinformation, and deep fakes at a massive scale to create the appearance of credibility. They advise using generative AI ethically and being specific when using it.
Last week on So What? The Marketing Analytics and Insights Livestream, we discussed marketing mix modeling. Catch the episode replay here!
This Thursday at 1 PM Eastern on our weekly livestream, So What?, we’ll be discussing generative AI capabilities and fine-tuning. Are you following our YouTube channel? If not, click/tap here to follow us!
Here’s some of our content from recent days that you might have missed. If you read something and enjoy it, please share it with a friend or colleague!
- In-Ear Insights: What is Generative AI?
- Review your data before you make decisions
- Mailbag Monday: What should we be looking for with a prompt engineer?
- So What? Why should you be using Marketing Mix Modeling?
- A look at Trust Insights inbound links in honor of our 5th birthday
- INBOX INSIGHTS, April 5, 2023: Marketing Mix Modeling
- In-Ear Insights: What is Marketing Mix Modeling?
- Now with More Dummies
- Almost Timely News, April 9, 2023: What’s Coming With Generative AI
Take your skills to the next level with our premium courses.
Get skilled up with an assortment of our free, on-demand classes.
- The Marketing Singularity: Large Language Models and the End of Marketing As You Knew It
- Powering Up Your LinkedIn Profile (For Job Hunters) 2023 Edition
- Measurement Strategies for Agencies course
- Empower Your Marketing with Private Social Media Communities
- How to Deliver Reports and Prove the ROI of your Agency
- Competitive Social Media Analytics Strategy
- How to Prove Social Media ROI
- What, Why, How: Foundations of B2B Marketing Analytics
In this week’s Data Diaries, let’s talk about two different ways to get what you want out of large language models like OpenAI’s GPT-4 and the associated web interface, ChatGPT. The GPT family of models – GPT is a generic term that stands for generative pre-trained transformer – is astonishingly good at what it does. We’ve seen folks use the GPT family of models to do everything from write books and screenplays to extract data from websites to debate military strategy.
However, the fact that a large language model can do something doesn’t necessarily mean it does the thing well (or with factual correctness). The process of programming models to do what we want, called prompt engineering, is the way the vast majority of people get results from models. These prompts can be as short as “Translate this Danish text into English” or as long as several pages of text instructions.
We’ve discussed prompt engineering at length in various blogs and videos, including:
- Improving Prompt Engineering with the SDLC
- Prompt Engineering Livestream
- Prompt Categories and Engineering
So let’s focus on the second major way to get what you want. A public GPT model like OpenAI’s GPT-3.5 (the default model that ChatGPT uses) is a bit like a giant Swiss Army knife. Because it’s trained on vast amounts of data in every domain of business, it has some expertise in everything – but it doesn’t have deep expertise in any one thing, not in the way we’d normally work with a human expert. As a result, to get satisfactory results, sometimes you have to provide an enormous amount of data in the prompt, leading some folks to ask what the point of the service is if you have to do the work yourself anyway.
That’s a topic for another time, but the answer is that instead of using a big public model, a different strategy is to roll your own, to build your own GPT model that’s customized to your industry, perhaps even your company. How does that work? It’s a multi-step process, beginning with gathering your data, downloading one of the open source models available online for free, then retraining it (a process called fine-tuning) on your data specifically so that your data has precedence in the model, and then deploying it like any other piece of software within your company.
Why would we want to do this? There’s an inverse relationship between purpose and prompt; the more general a model is, the longer your prompt has to be to keep it on the rails. The more specific a model is, the shorter your prompt has to be because the model really can’t do much outside its specific task. Additionally, some industries may have sensitive data that they don’t want to send to a third party like OpenAI; healthcare, finance, and the military all come to mind as industries that would like to harness the power of generative AI, but can’t just go handing off restricted information to third parties.
For example, Bloomberg LP recently announced its own custom GPT model, BloombergGPT, that was trained specifically on Bloomberg’s 41 years of proprietary financial transaction and research data. There’s a very good chance that their large language model cannot even make jokes, and it certainly can’t do lyrical interpretations in the style of Weird Al or write guitar tabs. What it can do better than any public model is the kind of financial analysis that Bloomberg customers might want to do with natural language – and that’s data they will keep very close to the vest.
So, let’s say you wanted to start building your own fine-tuned model. What would be the process? Obviously, you’d want to define the goal, the purpose, in crystal clear requirements so that you knew what the model should and should not be able to do.
The next step would be gathering your data, the data you would need to train the model. This is by far the most complicated, time-consuming part of fine-tuning a model. You have to determine what tasks the model will be asked to perform, such as summarization, Q&A, extraction, rewriting, etc. and based on those tasks, provide labeled data so that the model can learn from it.
Let’s say you wanted a model that could interpret Google Analytics data. You’d need to provide it with Google Analytics datasets and a long series of existing analyses for it to learn from. It might look something like this:
c("Organic Social", "Direct", "Organic Search", "Email", "Referral", "Unassigned", "Organic Video")|c(3904, 1374, 439, 301, 265, 133, 9)|
That would be accompanied by a narrative like:
“In this examination of company’s default channel grouping traffic data, we see Organic Social overwhelmingly the largest driver of traffic to the website. This poses a substantial risk to the company as Organic Social is an unreliable channel for long-term performance. Our recommendation is to diversify traffic sources and focus efforts on sources that are under your control, such as email and SMS. Organic Video is unusually low despite the company’s marketing efforts; examine your organic video placements for correct tracking information.”
This would be one row in a spreadsheet (technically a JSONL file, but it’s functionally a spreadsheet) out of dozens, hundreds, perhaps even thousands of examples. These examples then get fed to the open source model and it changes what it knows.
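If it helps to see what one of those rows could look like in practice, here is a minimal sketch in Python that pairs the channel data with its narrative and writes it out as a single JSONL line. Treat it as an illustration, not a required format: the `prompt` and `completion` field names, the wording of the prompt, and the `build_record` and `to_jsonl` helpers are all assumptions made for this example, and the narrative is shortened. The fine-tuning tool you choose will dictate the actual schema.

```python
import json

# Channel names and traffic counts copied from the example row above.
channels = ["Organic Social", "Direct", "Organic Search", "Email",
            "Referral", "Unassigned", "Organic Video"]
traffic = [3904, 1374, 439, 301, 265, 133, 9]

# The analyst-written narrative that accompanies the data (abbreviated here).
narrative = (
    "In this examination of company's default channel grouping traffic data, "
    "we see Organic Social overwhelmingly the largest driver of traffic to "
    "the website."
)


def build_record(channels, traffic, narrative):
    """Pair one dataset with its narrative as a single training example.

    The "prompt" and "completion" field names are assumptions for
    illustration; use whatever schema your fine-tuning tool expects.
    """
    data_text = ", ".join(
        f"{name}: {count}" for name, count in zip(channels, traffic)
    )
    return {
        "prompt": f"Interpret this default channel grouping traffic data: {data_text}",
        "completion": narrative,
    }


def to_jsonl(records):
    """Serialize records as JSONL: one JSON object per line."""
    return "\n".join(json.dumps(record) for record in records)


record = build_record(channels, traffic, narrative)
jsonl_text = to_jsonl([record])
print(jsonl_text)
```

Each additional example you gather becomes one more record in the list passed to `to_jsonl`, which is why the file behaves like a spreadsheet with one row per example.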
So what? Here’s the key takeaway from this: if, as you evolve through your AI journey, you realize that at some point you’re going to need a custom model, now is the time to start thinking about gathering your data and making sure it’s ready for that inevitable request. Think about the primary tasks large language models are asked to do, and then think about what data you have available that would fit in those major categories. By the time your organization has evolved enough to need a custom model, you’ll be well prepared (and you might have even gotten a head start).
- Case Study: Exploratory Data Analysis and Natural Language Processing
- Case Study: Google Analytics Audit and Attribution
- Case Study: Natural Language Processing
- Case Study: SEO Audit and Competitive Strategy
Here’s a roundup of who’s hiring, based on positions shared in the Analytics for Marketers Slack group and other communities.
- Data Analyst at Southeast Brands
- Digital Analytics Technical Expert at AddData
- Director Of Growth at McGraw Hill
- Senior Analyst – Martech Engineering Innovation at Blast Analytics
- Senior Campaigns & Content Marketing Manager at Goldcast
- Senior Consultant – Martech Engineering Innovation at Blast Analytics
- Senior Data Engineer (M/F/D) at DCMN GmbH
- Senior Data Scientist at Foursquare
Are you a member of our free Slack group, Analytics for Marketers? Join 3000+ like-minded marketers who care about data and measuring their success. Membership is free – join today. Members also receive sneak peeks of upcoming data, credible third-party studies we find and like, and much more. Join today!
We heard you loud and clear. On Slack, in surveys, at events, you’ve said you want one thing more than anything else: Google Analytics 4 training to get ready for the July 1 cutoff. The newly-updated Trust Insights Google Analytics 4 For Marketers Course is the comprehensive training solution that will get you up to speed thoroughly in Google Analytics 4.
What makes this different than other training courses?
- You’ll learn how Google Tag Manager and Google Data Studio form the essential companion pieces to Google Analytics 4, and how to use them all together
- You’ll learn how marketers specifically should use Google Analytics 4, including the new Explore Hub with real world applications and use cases
- You’ll learn how to determine if a migration was done correctly, and especially what things are likely to go wrong
- You’ll even learn how to hire (or be hired) for Google Analytics 4 talent specifically, not just general Google Analytics
- And finally, you’ll learn how to rearrange Google Analytics 4’s menus to be a lot more sensible because that bothers everyone
With more than 5 hours of content across 17 lessons, plus templates, spreadsheets, transcripts, and certificates of completion, you’ll master Google Analytics 4 in ways no other course can teach you.
If you already signed up for this course in the past, Chapter 8 on Google Analytics 4 configuration was JUST refreshed, so be sure to sign back in and take Chapter 8 again!
Where can you find Trust Insights face-to-face?
- B2B Ignite, Chicago, May 2023
- ISBM, Chicago, September 2023
- Content Marketing World, DC, September 2023
- MarketingProfs B2B Forum, Boston, October 2023
First and most obvious – if you want to talk to us about something specific, especially something we can help with, hit up our contact form.
Where do you spend your time online? Chances are, we’re there too, and would enjoy sharing with you. Here’s where we are – see you there?
- Our blog
- In-Ear Insights on Apple Podcasts
- In-Ear Insights on Google Podcasts
- In-Ear Insights on all other podcasting software
Our Featured Partners are companies we work with and promote because we love their stuff. If you’ve ever wondered how we do what we do behind the scenes, chances are we use the tools and skills of one of our partners to do it.
- Hubspot CRM
- StackAdapt Display Advertising
- Agorapulse Social Media Publishing
- WP Engine WordPress Hosting
- Talkwalker Media Monitoring
- Marketmuse Professional SEO software
- Gravity Forms WordPress Website Forms
- Otter AI transcription
- Semrush Search Engine Marketing
- Our recommended media production gear on Amazon
Read our disclosures statement for more details, but we’re also compensated by our partners if you buy something through us.
Some events and partners have purchased sponsorships in this newsletter and as a result, Trust Insights receives financial compensation for promoting them. Read our full disclosures statement on our website.
Thanks for subscribing and supporting us. Let us know if you want to see something different or have any feedback for us!
Get unique data, analysis, and perspectives on analytics, insights, machine learning, marketing, and AI in the weekly Trust Insights newsletter, INBOX INSIGHTS. Subscribe now for free; new issues every Wednesday!
Want to learn more about data, analytics, and insights? Subscribe to In-Ear Insights, the Trust Insights podcast, with new 10-minute or less episodes every week.