This data was originally featured in the January 3rd, 2024 newsletter found here: INBOX INSIGHTS, JANUARY 3, 2024: PROCESS OVER OUTCOME, DATA AS AN AI ADVANTAGE
In this week’s Data Diaries, let’s talk about one of the major factors that will determine success or failure when it comes to generative AI: the quality of your data, and data as an AI advantage.
Right now, we’re still in the early days, the infancy of using tools like ChatGPT or Stable Diffusion. The use cases people are applying these tools to are mostly generic – “help me write a blog post about marketing automation”, for example. That requires no external data, relying on the data that’s already inside the models.
However, what’s inside the models is an aggregation of publicly available data. It’s good, certainly, but one thing it isn’t is unique. Everyone’s using the same base model and the same base data. What would differentiate you? Your data, the data you have behind closed doors that was never published on the Internet for language model makers to scrape.
Back in November of last year, OpenAI rolled out one of the first mass customization capabilities for non-technical users with their Custom GPTs. This facility lets the non-technical user customize ChatGPT to use their own data, function calls, and custom prompts. What makes a Custom GPT work really well? You guessed it – the data you upload.
How would you use your own data in an example like this? Say I was going to build a Custom GPT for email marketing. A while back, I wrote a book that is no longer publicly available on 52 tips for email marketing. I still have a digital copy on my hard drive, but the public version is gone and has been gone for a decade now. Much of the advice is still practical and useful, so I would take that PDF and load it into a Custom GPT. Now, my Custom GPT has knowledge that is unique, that is different than what others would be using if they were to build a Custom GPT for email marketing.
Here’s another example. Suppose I wanted to make a Custom GPT that could talk with a specific perspective about marketing analytics. As the administrator and owner of the Analytics for Marketers Slack group, I could export the entire archive of conversations from the last 5 years, anonymize and de-identify it to protect user privacy, and then load that conversation data into a Custom GPT. Those conversations aren’t publicly available; no search engine or crawler can get to it and use it. But my Custom GPT could leverage that knowledge to be a better conversationalist about marketing analytics by giving it more specific conversations just about marketing analytics.
A third example that many companies can probably relate to are conference calls. You’ve been on so many calls, and some of those calls have been super valuable. Maybe you have a call center with call recordings. Maybe you have a sales CRM that logs calls from your sales team. Whatever the case is, you have a treasure trove of audio data that you could use AI to transcribe, de-identify, then add to a generative AI system. That custom data would give an incredible amount of context to any AI system, and it’s uniquely yours.
So your first task is to examine the data you have access to that others don’t. What data do you have laying around, on laptops and servers, that might be useful to help a language model increase its context and understanding about your specific way of doing things?
As generative AI continues to evolve, your specific data will become more and more valuable – if you can find it and use it well. Now is the time to start inventorying it and preparing it for use with AI.
Need help with your marketing AI and analytics?
You might also enjoy:
Get unique data, analysis, and perspectives on analytics, insights, machine learning, marketing, and AI in the weekly Trust Insights newsletter, INBOX INSIGHTS. Subscribe now for free; new issues every Wednesday!
Want to learn more about data, analytics, and insights? Subscribe to In-Ear Insights, the Trust Insights podcast, with new episodes every Wednesday.