INBOX INSIGHTS: Data Governance, Hot Ones Text Analysis (10/25)
In case you weren’t aware, there were 11,038 Martech solutions available to you as of May 2023. We know that now, in Q4 2023, there are a lot more to add to that list. Many of those are AI-powered tools.
Many of us, me included, are working on annual plans, budgets, and overall “what the heck happened” summaries for the end of the year. As we’re planning, we’re looking at replacing older tech with new tech, experimenting with AI solutions, and skilling up for the new year.
With that, this is your data governance public service announcement.
Data governance, according to Google’s definition, is setting internal standards—data policies—that apply to how data is gathered, stored, processed, and disposed of. It governs who can access what kinds of data and what kinds of data are under governance. Data governance also involves complying with external standards set by industry associations, government agencies, and other stakeholders.
In simple terms, who has access to your data and where do you find it when you need it?
Part of your annual planning should be a data governance audit.
Who owns your data?
If I had a nickel for every time I heard, “That person who set that up doesn’t work here anymore” I might have…three dollars? That’s a lot of nickels. With 11,038 (and growing!) technology solutions, there is a good chance that you don’t know all the usernames and passwords to each account you use. I was working with a client recently that asked me to help them reclaim their accounts because the person who set them up left with little warning.
How do you fix this? You can’t prevent people from leaving an organization. Sometimes you have notice and sometimes you don’t. A best practice you can start implementing, even on older systems, is to have a general email address that is not tied to one specific person. For example, we use [email protected] when we request access to systems for our clients and trial new software for ourselves. This account doesn’t belong to either me or Chris, but we both have access to it.
With this email, we have a process in place for access. When we bring on contractors, we change the password and give them access. When we offboard contractors, we change the password again. We don’t have to worry about who owns the systems or who has access. The caveat here is that we’re a small company and can have this level of control over things. Sometimes Chris will sign up for a new tool with his personal email account to test it out. Once he’s evaluated it and we decide it’s right for the company, we’ll set up an account through [email protected].
What if you can’t set up a general account?
This is more common than not. Individuals own accounts and then grant access to agencies, contractors, and other employees. In this instance, you should be auditing your software at least quarterly to see who owns your systems. Outside of that quarterly audit, you should develop a process for employee and agency turnover. Because you’re auditing your systems and you know who owns them, you can have protocols in place for hand-offs. These could range from something as simple as a password reset to more complex solutions such as migrating data to new setups. Before you get to that, you first need to know who owns your data.
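One way to make that quarterly audit repeatable is to keep a simple inventory of systems and check it with a short script. The sketch below is only an illustration: the `SystemAccount` fields, the 90-day review window, and the example inventory are all assumptions made up for this example, not a description of any particular tool.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class SystemAccount:
    system: str          # name of the tool or platform
    owner_email: str     # login that owns the account
    last_reviewed: date  # when someone last confirmed the owner


def audit(accounts, active_emails, today, max_age_days=90):
    """Return (system, reason) pairs for accounts that need attention."""
    findings = []
    for acct in accounts:
        if acct.owner_email not in active_emails:
            findings.append((acct.system, "owner no longer active"))
        elif (today - acct.last_reviewed).days > max_age_days:
            findings.append((acct.system, "review overdue"))
    return findings


# Hypothetical inventory, for illustration only.
inventory = [
    SystemAccount("web analytics", "shared@example.com", date(2023, 9, 1)),
    SystemAccount("email platform", "former.employee@example.com", date(2023, 8, 15)),
    SystemAccount("CRM", "shared@example.com", date(2023, 3, 1)),
]
active = {"shared@example.com"}

for system, reason in audit(inventory, active, today=date(2023, 10, 25)):
    print(f"{system}: {reason}")
```

Anything the script flags is a candidate for the hand-off protocols described above: a password reset, a change of owner, or a migration.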
How do I get my data out of an account I don’t own?
Great question. Sometimes you can’t. This is why it’s important to know who owns the data and how you can regain ownership of your software. In the event you need your data and you don’t have access, you might need to stand up a new system, which is the last possible option you want to consider. This is a pain and you lose access to historical data. The upside, if you are someone who chooses to see one, is that setting up a new system is an opportunity to do it the right way. Again, not a great option but sometimes it’s your only one.
In less regulated industries and companies, we don’t give a lot of thought to the accounts tied to our systems. We bring on companies like Trust Insights to set up our systems, analyze our data, and report it back to us. And then we forget that we gave up access to our data to anyone because we’re focused on marketing tactics, revenue generation, and growth.
So, if you do nothing else to protect your data moving into 2024, make sure you know who owns it and who has access. Remove people who no longer work with it. Make sure the right people have ownership and the right level of access. Set up protocols for inevitable turnover. Create a process for trialing and setting up new systems.
Are you paying attention to your data governance? Reply to this email to tell me or come join the conversation in our Free Slack Group, Analytics for Marketers.
– Katie Robbert, CEO
Do you have a colleague or friend who needs this newsletter? Send them this link to help them get their own copy:
In this episode of In-Ear Insights, the Trust Insights podcast, Katie and Chris discuss the peculiar budget cuts that CMOs are making for 2024 and how it will impact marketing teams. We talk about the surprising reductions in spending for CRM, customer experience, and brand building. We analyze the disconnect between using AI to improve productivity while severely cutting staff. Katie and Chris examine the different types of creative thinking needed on marketing teams and the risks of letting go of divergent thinkers. We explain why AI alone can’t magically fix poor data tracking or replace most marketing roles. Katie and Chris provide helpful perspective on AI’s capabilities and limitations that decision makers should understand before slashing budgets. Tune in to gain insight into crafting budgets and strategies that balance AI and human skills.
Last time on So What? The Marketing Analytics and Insights Livestream, we looked at updates in video SEO. Catch the episode replay here!
This week on So What? The Marketing Analytics and Insights Live show, we’ll be talking about the CMO Survey’s latest results. Tune in Thursday at 1 PM Eastern Time! Are you following our YouTube channel? If not, click/tap here to follow us!
Here’s some of our content from recent days that you might have missed. If you read something and enjoy it, please share it with a friend or colleague!
- Understanding Customer Needs with a Purpose and Data
- So What? Revisiting Video SEO tactics
- Understanding customer needs with a purpose
- INBOX INSIGHTS, October 18, 2023: Be Remarkably Human, Generative AI and YouTube Consumption
- In-Ear Insights: Changing Your MarTech Stack
- Let’s take a look at the job market
- Now with More Games, Threads, and Fairness!
- Almost Timely News, October 22, 2023: The Generative AI Beginner’s Kit
Take your skills to the next level with our premium courses.
Get skilled up with an assortment of our free, on-demand classes.
- The Marketing Singularity: Large Language Models and the End of Marketing As You Knew It
- Powering Up Your LinkedIn Profile (For Job Hunters) 2023 Edition
- Measurement Strategies for Agencies course
- Empower Your Marketing with Private Social Media Communities
- How to Deliver Reports and Prove the ROI of your Agency
- Competitive Social Media Analytics Strategy
- How to Prove Social Media ROI
- What, Why, How: Foundations of B2B Marketing Analytics
Today, let’s look at an interesting use case for text analytics – and when generative AI isn’t the right answer. Over the last few days, I was listening to some neuroscience podcasts talking about how speech evolved in our brains, and how sound and rhythm are far more primitive and well-established in our brains than language is. That got me wondering: would that show up in data if we tested people put under duress? Would language devolve as the primitive brain focused more on managing stress and strain than on language formation?
How would we gather such data and put it to use? This is what separates more experienced data science and AI practitioners from layfolk: being able to frame out a problem, understand what data is available, and then construct the necessary infrastructure – people, processes, and platforms – to achieve the purpose, test the hypothesis, and create measurable performance data.
Let’s start with the user story. Why would someone need to perform this kind of text processing? In this specific instance, we might have a user story like “As a neuroscience enthusiast, I want to test whether physical duress diminishes the body’s language capacity so that I can better understand how the language center of the brain interacts with the body’s ‘survival brain’”.
Fortunately, we don’t need to set up clinical trials or sit before a human ethics review board to conduct this test and gather this data, because it’s already been gathered for us. There are 21 seasons of the YouTube series Hot Ones, an interview show in which host Sean Evans subjects people to increasingly spicy foods that cause substantial physical duress. If we were to download a selection of the episodes – say about half of them, or 122 episodes – transcribe them, and analyze the way in which people used language throughout, we might be able to understand the impact physical duress has on our ability to use language.
This is a critical point: this task is not something generative AI can accomplish, at least on its own. Language models like those that power ChatGPT can certainly do text analysis, but the entire process of acquiring transcripts and processing them in phases isn’t something you can do even with the most sophisticated prompting.
So how do we accomplish this task? We use an ensemble of AI and non-AI tools. For example, the non-AI tool yt-dlp can extract data straight from YouTube in a variety of formats, and none of that is the province of AI in any way.
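As a rough sketch of that download step, the snippet below builds a yt-dlp command that keeps only the audio track. The newsletter doesn’t say which options were actually used, so the audio-only MP3 choice, the output folder, and the helper names `build_ytdlp_command` and `download_audio` are assumptions for illustration.

```python
import subprocess


def build_ytdlp_command(video_url, out_dir="audio"):
    """Build a yt-dlp command that saves only the audio track as MP3."""
    return [
        "yt-dlp",
        "--extract-audio",                         # audio is enough for transcription
        "--audio-format", "mp3",
        "--output", f"{out_dir}/%(id)s.%(ext)s",   # name each file by its video ID
        video_url,
    ]


def download_audio(video_url, out_dir="audio"):
    """Run the command; requires yt-dlp and ffmpeg to be installed."""
    subprocess.run(build_ytdlp_command(video_url, out_dir), check=True)


# VIDEO_ID is a placeholder, not a real episode.
print(" ".join(build_ytdlp_command("https://www.youtube.com/watch?v=VIDEO_ID")))
```

Building the command separately from running it makes it easy to inspect what will be executed before pointing it at a long list of episodes.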
We could take YouTube closed captions directly, but YouTube’s built-in captioning software doesn’t do a great job of transcription, especially in dealing with Hot Ones’ guest antics as they endure ever spicier foods. OpenAI’s Whisper transcription model (which is a multimodal generative AI model, speech to text) does a much better job with this, so we’ll use that to convert our video downloads to text.
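For readers who want to try the transcription step, here is a minimal sketch that assumes the open-source `openai-whisper` Python package. The model size (`base`) and the helper names are assumptions; the newsletter doesn’t state which Whisper model or interface was used.

```python
def transcribe(audio_path, model_name="base"):
    """Transcribe one audio file with the open-source Whisper package."""
    import whisper  # pip install openai-whisper; imported lazily because it is heavy

    model = whisper.load_model(model_name)
    return model.transcribe(audio_path)  # dict with "text" and "segments"


def format_segments(segments):
    """Turn Whisper-style segments into '[MM:SS] text' lines."""
    lines = []
    for seg in segments:
        minutes, seconds = divmod(int(seg["start"]), 60)
        lines.append(f"[{minutes:02d}:{seconds:02d}] {seg['text'].strip()}")
    return lines


# Hand-written segments in the shape Whisper returns, for illustration only.
sample = [
    {"start": 0.0, "end": 4.2, "text": " Welcome to the show."},
    {"start": 65.5, "end": 70.0, "text": " That one is really hot."},
]
print("\n".join(format_segments(sample)))
```

Keeping the timestamps alongside the text is optional, but it leaves the door open to splitting an episode by time rather than by word count later on.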
We could take each transcript then and process it with generative AI. However, that processing will run into two major problems. First, transcripts are inherently fairly large chunks of text, and those chunks of text need to be broken up for generative AI to use. Second, generative AI is a prohibitively expensive application for what we’re trying to do. We really care about some basic measures of text analytics, like word counts, word diversity, and grade level. Our hypothesis – that language devolves under duress as the more primitive brain takes charge – doesn’t need generative AI to do that analysis. Old school text analytics will do that just fine.
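To give a sense of how little machinery these basic measures need, here is a small sketch using only the Python standard library. It isn’t the code used for the experiment; the tokenizing rule and the use of unique words divided by total words as the “diversity” measure are assumptions.

```python
import re


def basic_metrics(text):
    """Word count, average word length, and word diversity for a text."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return {"word_count": 0, "avg_word_length": 0.0, "diversity": 0.0}
    return {
        "word_count": len(words),
        "avg_word_length": sum(len(w) for w in words) / len(words),
        "diversity": len(set(words)) / len(words),  # unique words / total words
    }


print(basic_metrics("This is hot. This is very, very hot!"))
```

A lower diversity score means the speaker is repeating the same words more often, which is one signal we would expect to see if language degrades under duress.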
That said, generative AI CAN speed up the process of writing the code necessary to do that processing. We’ll use ChatGPT’s GPT-4 model to generate Python 3 code to accomplish the actual task, and the resulting code is efficient enough to run on pretty much any laptop.
Our code examines word count, word length, word diversity, grade level using the Flesch-Kincaid method, the SMOG readability index, and the Automated Readability Index as our quantitative distillation of these texts. First, we clean the data of any show episodes that are NOT standard interviews, such as hot sauce season announcements. Then we break each episode into 10 sections that roughly correspond to the 10 different hot sauces used, and chart the results.
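The sectioning and scoring step can be sketched as follows. This is an illustration, not the generated code from the experiment: splitting by equal word counts is an assumption (the newsletter doesn’t say how the 10 sections were cut), and only the Automated Readability Index is shown because, unlike Flesch-Kincaid and SMOG, it needs no syllable counting.

```python
import re


def split_into_sections(words, n=10):
    """Split a list of words into n consecutive, roughly equal sections."""
    size, extra = divmod(len(words), n)
    sections, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        sections.append(words[start:end])
        start = end
    return sections


def automated_readability_index(text):
    """ARI = 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43."""
    words = re.findall(r"[A-Za-z0-9']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return 0.0
    chars = sum(ch.isalnum() for w in words for ch in w)  # letters and digits only
    return 4.71 * chars / len(words) + 0.5 * len(words) / len(sentences) - 21.43


def score_episode(transcript, n=10):
    """Return one ARI score per section of the transcript."""
    sections = split_into_sections(transcript.split(), n)
    return [automated_readability_index(" ".join(s)) for s in sections]
```

Calling `score_episode` on each cleaned transcript yields ten scores per episode, which can then be averaged across episodes and charted by section.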
What we see is thoroughly unsurprising: as each episode of the show progresses, the language used by both host Sean Evans and his guest degrades. The data supports our hypothesis about physical duress; the most duress appears around 70-80% of the way through the show, which roughly corresponds to the hot sauce “Da Bomb”, a fan favorite for the amount of strain it places guests under.
This experiment is not something you can do out of the box with generative AI. Generative AI played a key role in making it happen, to be sure, but the non-AI portions were equally important to accomplish the purpose. While it’s important to try out AI on every task to learn what it’s good at and what it’s bad at, it’s equally important to know how to fit AI into the full suite of capabilities you have once you’ve determined what it can’t do.
Now, this is a fun example of how you might use this sort of data. What practical applications might this ensemble of techniques have? Are there conditions in the workplace where people might be subject to ever higher levels of stress, and would we find value in identifying that? Certainly. Think about all the call center calls and customer chats you collect on a regular and frequent basis. Knowing that language capacities diminish under duress, we might apply these same techniques to explore whether customers are under increasing strain, and at what point in the process that strain appears. Could we then provide a better customer experience, perhaps by monitoring customer interactions to detect language degradation? Absolutely – and it would be relatively straightforward to do so, with the techniques explored in this experiment.
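As one possible starting point for that kind of monitoring, the sketch below fits a straight line through per-message readability scores and flags conversations where the scores fall steeply. The threshold of -0.5 per message and the sample scores are assumptions chosen for illustration; a real threshold would need to be tuned against your own data.

```python
def trend_slope(scores):
    """Least-squares slope of scores against their position (0, 1, 2, ...)."""
    n = len(scores)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(scores) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(scores))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


def is_degrading(scores, threshold=-0.5):
    """Flag a conversation whose scores fall faster than the threshold per step."""
    return trend_slope(scores) < threshold


# Hypothetical per-message grade-level scores for one customer chat.
chat_scores = [8.0, 7.0, 6.0, 5.0, 4.0]
print(trend_slope(chat_scores), is_degrading(chat_scores))
```

A flagged conversation is a prompt for a human to look closer, not proof that a customer is under strain.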
Got questions about how to use AI effectively? Drop us a line at TrustInsights.ai/aiservices
Want the source transcripts? They’re available on our GitHub repository.
- Case Study: Exploratory Data Analysis and Natural Language Processing
- Case Study: Google Analytics Audit and Attribution
- Case Study: Natural Language Processing
- Case Study: SEO Audit and Competitive Strategy
Here’s a roundup of who’s hiring, based on positions shared in the Analytics for Marketers Slack group and other communities.
- Analytics Engineer (Contract) at Harnham
- Data Analyst at Indeed.com
- Head Of Analytics & Data Science at Intrepid Digital
- Java Engineer at Save the Children UK
- Marketing Analytics Manager at Beyond Finance
- Marketing Operations Manager at Birdeye
- Marketing Research Sales Director at InsightsNow
- Senior Analytics Implementation Engineer at Sky
- Senior Analytics Manager – Digital Consumer Engagement (Dce) at LEGO
- Senior Data Analyst at Cox Careers
- Senior Manager, Marketing Analytics at Asurion
- Sr. Principal Customer Success Manager at Tealium
- Web Analytics & Search Specialist at Career Portal
Are you a member of our free Slack group, Analytics for Marketers? Join 3000+ like-minded marketers who care about data and measuring their success. Membership is free – join today. Members also receive sneak peeks of upcoming data, credible third-party studies we find and like, and much more. Join today!
Now that you’ve had time to start using Google Analytics 4, chances are you’ve discovered it’s not quite as easy or convenient as the old version. Want to get skilled up on GA4? Need some help with your shiny new system? We can help in two ways:
Where can you find Trust Insights face-to-face?
- SMPS AEC AI, October 2023
- DigitalNow, Denver, November 2023
- Social Media Marketing World, San Diego, February 2024
- MAICON, Cleveland, September 2024
First and most obvious – if you want to talk to us about something specific, especially something we can help with, hit up our contact form.
Where do you spend your time online? Chances are, we’re there too, and would enjoy sharing with you. Here’s where we are – see you there?
- Our blog
- In-Ear Insights on Apple Podcasts
- In-Ear Insights on Google Podcasts
- In-Ear Insights on all other podcasting software
Our Featured Partners are companies we work with and promote because we love their stuff. If you’ve ever wondered how we do what we do behind the scenes, chances are we use the tools and skills of one of our partners to do it.
- StackAdapt Display Advertising
- Agorapulse Social Media Publishing
- WP Engine WordPress Hosting
- Talkwalker Media Monitoring
- Marketmuse Professional SEO software
- Gravity Forms WordPress Website Forms
- Otter AI transcription
- Semrush Search Engine Marketing
- Our recommended media production gear on Amazon
Read our disclosures statement for more details, but we’re also compensated by our partners if you buy something through us.
Some events and partners have purchased sponsorships in this newsletter and as a result, Trust Insights receives financial compensation for promoting them. Read our full disclosures statement on our website.
Thanks for subscribing and supporting us. Let us know if you want to see something different or have any feedback for us!
Get unique data, analysis, and perspectives on analytics, insights, machine learning, marketing, and AI in the weekly Trust Insights newsletter, INBOX INSIGHTS. Subscribe now for free; new issues every Wednesday!
Want to learn more about data, analytics, and insights? Subscribe to In-Ear Insights, the Trust Insights podcast, with new 10-minute or less episodes every week.