In this episode of In-Ear Insights, Katie and Chris walk through the new attribution modeling in Google Analytics 4. Watch as they explain the details, the different settings, which built-in model you should use most of the time, how to visualize your funnel, and explore some anomalies that might require you to use more than one attribution modeling source.
What follows is an AI-generated transcript. The transcript may contain errors and is not a substitute for listening to the episode.
Christopher Penn 0:02
This is In-Ear Insights, the Trust Insights podcast.
In this week’s In-Ear Insights, Google Analytics 4 has added attribution modeling, which was missing at launch.
And it is now available.
So we’re going to take some time today to walk through it, what’s in it, what’s not in it, and some important considerations.
Now, for those folks who are listening, you’ll want to hop on over to our YouTube channel at TrustInsights.ai/youtube, because the walkthrough is on screen.
So again, that’s TrustInsights.ai/youtube, where you can see the video, but we’ll talk through it as well.
So I’m going to start with this.
This is the standard GA4 interface.
Now, Katie, when you look at this, where would you expect attribution to be?
Katie Robbert 0:53
Oh, that is a great question.
So full disclosure, I am not super familiar with the GA4 interface.
So Chris is not asking me trick questions.
I am just like any other marketer who this is new to so I see things like life cycle, acquisition, engagement, monetization, retention, and then user demographics and tech.
Um, you know, none of these really scream attribution to me.
I mean, you could say lifecycle, maybe acquisition, or retention, user tech, but I, like I know from using GA3, that that’s really going to be things like what device? So yeah, my guess would be under the life cycle menu, acquisition.
Christopher Penn 1:47
And here is the problem with Google Analytics 4 attribution.
To find it, you have to go to the upper menus, where you see Reports, Explore, Advertising, and Configure.
And then it is in Advertising.
So even though the entire section is about attribution, it’s called Advertising.
I don’t know why it’s labeled that. Folks from the Google Analytics team, if you are watching this, please consider renaming this section to Attribution so that people know what it is, because it is, in fact, attribution.
Katie Robbert 2:19
Well, you know, I can see the other side of it, because advertising, at least to you and I, Chris, assumes that there’s money involved.
So paid ads, those kinds of things.
But advertising is really just another term for marketing, which is essentially what this breaks down to, which is, how did you reach your customers? How did you advertise yourself to them, whether there was money or not money behind it? And so did you advertise using email? Did you advertise using organic search? Did you advertise through your partner network? And so I can draw the line and the logical conclusion. Do I think that it’s intuitive based on how we’ve been trained to think about attribution? Absolutely not.
But I can understand why it’s been called advertising.
This is sort of, you know, not to go too deep down a rabbit hole.
But one of the issues with change management is inconsistent, like inconsistent things.
And so for years, it’s been called attribution.
And now they’ve just made the decision, without communicating it, that it’s called advertising.
And so that’s a problem.
So none of us know how to find the thing except for Chris, who literally clicks every button and says, “What does this button do?”
Christopher Penn 3:38
And here’s the funny part: I’m going to go into the admin settings for my property.
And if we look here, there’s a section called Attribution Settings.
So it doesn’t say, you know, advertising settings, and this is where you can go in and set the things you would expect in your attribution model.
But that’s not the label in the actual application.
So yeah, a little bit of a change management slash Who Moved My Cheese moment.
So, Katie, what’s up?
Katie Robbert 4:08
Well, it’s funny, because in thinking through the inconsistencies, it’s called Advertising.
But then when you look at the actual submenu, it’s called attribution again.
And so one might say that this is just a QA issue, where it was meant to be called attribution.
But somebody said, “Oh, advertising, the word’s spelled correctly.
Check that box.”
So just, you know, sort of thinking out loud as to why this came about.
It could be a mistake.
Christopher Penn 4:37
It very well could be.
So what do we have in here? Well, obviously everything in this is contingent upon you having set up conversions, which are GA4’s goals.
If you didn’t set up conversions, this section will be completely unhelpful.
So let’s go ahead and just see what conversions I’ve got set up.
There are four in here that do not matter, that are just pollution in my account, because I integrated Firebase for no good reason.
Don’t do that.
By the way, it’s just bad for your analytics.
So we got purchase, which is someone buying a book, downloading a file, subscribing to my newsletter, or going to my public speaking page to book, some speaking.
And what we see here right away is conversions by default channel grouping.
Well, we all thought, hey, default channel groupings are going away.
Right? That’s what we thought.
Because in GA3, one of the things that’s really tricky is that you have to get your default channel groupings correct,
or the application doesn’t show them correctly.
So they’re back here in GA4, like, where did this come from? If you look at the help files, GA4 has default channel groupings.
Now, they list what’s in this functionality.
The part that was surprising to me about default channel groupings in GA4 is that, unlike Google Analytics 3, GA4 default channel groupings are not case sensitive and cannot be edited.
So what this means is that you have to get your source medium pairings correct,
matching what Google says they must be, for your default channel groupings to work properly. If you don’t do that, your reports are going to be wrong, and you can’t fix it.
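To illustrate the point about source medium pairings, here is a minimal Python sketch of auditing your own UTM mediums against a channel rule table before they reach your reports. The rule table, the function names, and the sample mediums are all hypothetical; these are not Google’s published channel definitions, which you would need to look up and substitute. The matching is case insensitive, as described above.

```python
# Hypothetical rule table for illustration only.
# These are NOT Google's published default channel grouping definitions.
HYPOTHETICAL_CHANNEL_RULES = {
    "organic": "Organic Search",
    "email": "Email",
    "referral": "Referral",
    "cpc": "Paid Search",
}


def classify_medium(medium: str) -> str:
    """Return the channel for a medium, or 'Unassigned' if no rule matches."""
    # Lowercase first so that matching is case insensitive.
    return HYPOTHETICAL_CHANNEL_RULES.get(medium.strip().lower(), "Unassigned")


def audit_mediums(mediums):
    """Return the sorted, de-duplicated mediums that no rule would classify."""
    return sorted({m for m in mediums if classify_medium(m) == "Unassigned"})


# Sample mediums as they might appear in a company's UTM tags (made up).
tagged = ["Email", "organic", "e-mail", "social-post", "CPC"]
unassigned = audit_mediums(tagged)
print(unassigned)
```

Running a check like this against an export of your own tagged links would surface the mediums that would land in an unassigned bucket, so they can be corrected at the tagging stage, since the groupings themselves cannot be edited.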
Katie Robbert 6:27
So given that, so this is the current state today, given the way that Google is rolling out some of these features, I think there is a likelihood that somewhere down the line in the future, you could edit the channel groupings, but just not today.
So if you are using the GA4 data, then as Chris says, you have to get your source medium correct,
but with the caveat that they have to match the way that Google is defining them, not just how you’re defining them for your company.
And that’s, you know, that’s problematic, because that’s a whole new way of learning for a lot of people.
We know that a lot of companies are only now getting used to, “Okay, let me just standardize how we’re using source medium.”
Now they have to relearn it all over again, because Google has once again changed the rules.
Christopher Penn 7:22
So the one thing that I would say there is that you can’t do it on the snapshot page.
But later on, you can say let’s just use medium instead.
So if you’ve got a, you know, a source medium hierarchy that works for you, you can just ignore Google’s default channel groupings.
The danger is, of course, that if you have somebody who’s not trained on it, they’ll use the defaults, and the defaults are wrong.
So in the attribution section, we have our usual model comparison tool.
And what we’ve got here are the same models that everyone should be used to from previous versions of Google Analytics:
last click, first click, linear, position based, and time decay.
And if you want a refresher on this, we have, I don’t know, any number of podcasts and some webinars and things on what these different attribution models mean.
The model comparison tool is somewhat helpful in terms of just figuring out which of these models best reflects reality.
For my personal website here, we see that email gets slightly more credit than it should, and we see that referral gets slightly less credit than it should.
And organic social gets slightly less credit than it should.
But again, you can change this to either medium or source medium, if you want to get more granular, which I think is probably more useful.
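For readers who want a refresher on how the five rule-based models named above divide credit, here is a minimal Python sketch. The function name, the path format, the 40/20/40 position-based split, and the 7-day half-life for time decay are assumptions chosen for illustration; the episode does not state which parameters GA4 uses.

```python
from collections import defaultdict


def attribute(path, model, half_life_days=7.0):
    """Split one conversion's credit across the touchpoints in `path`.

    `path` is an ordered list of (channel, days_before_conversion) tuples.
    Returns {channel: credit}; the credits sum to 1.0.
    The 40/20/40 split and the 7-day half-life are assumed defaults.
    """
    n = len(path)
    if n == 0:
        return {}
    if model == "last_click":
        weights = [0.0] * (n - 1) + [1.0]
    elif model == "first_click":
        weights = [1.0] + [0.0] * (n - 1)
    elif model == "linear":
        weights = [1.0] * n
    elif model == "position_based":
        if n == 1:
            weights = [1.0]
        elif n == 2:
            weights = [0.5, 0.5]
        else:
            # 40% to first, 40% to last, 20% shared by the middle touches.
            weights = [0.4] + [0.2 / (n - 2)] * (n - 2) + [0.4]
    elif model == "time_decay":
        # A touch loses half its weight for every half-life before conversion.
        weights = [0.5 ** (days / half_life_days) for _, days in path]
    else:
        raise ValueError(f"unknown model: {model}")

    total = sum(weights)
    credit = defaultdict(float)
    for (channel, _), weight in zip(path, weights):
        credit[channel] += weight / total
    return dict(credit)


# A made-up path: referral 14 days out, email 7 days out, organic on the day.
path = [("referral", 14), ("email", 7), ("organic", 0)]
for model in ("last_click", "first_click", "linear", "position_based", "time_decay"):
    print(model, attribute(path, model))
```

Comparing the outputs side by side for the same path is, in miniature, what the model comparison tool does across all of your conversion paths.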
Katie Robbert 8:39
What’s interesting, too, is that it looks like, and maybe I’m mistaken, they’ve added a new model that gives more priority to ads.
But to be clear, the Ads-preferred model is only Google Ads, it’s not other types of paid advertising.
So proceed with caution.
Google is giving themselves the most credit if you choose to let them
Christopher Penn 9:09
Yes, in fact, if you look at the default channel grouping, some of these, like paid search, video, and display, require the Google ad network.
Right, so it is not just any display advertising, it’s only Google’s.
So just be aware that there’s definitely a bias built into the tool.
Katie Robbert 9:27
So this is Google, they can do that.
Christopher Penn 9:29
Exactly, it’s their tool, and you’re getting it for free.
So you can’t really complain that you’re not getting what you paid for, because you’re paying nothing for it, right?
Now, the conversion paths: this is somewhat of an improvement over the previous versions.
I’m going to go ahead and change this from default channel grouping to medium because, again, we’ve got all those mediums that may or may not be approved, and you can choose the model.
You have these models.
The one that’s least bad, and we’ve said this a lot, is time decay.
And what this now does: this part here is still unhelpful, right? It’s just the number of times something happened. But you now have a sort of funnel-ish view of your mediums to see, okay, what medium at each of these given touchpoints is helping move things along, helping convert? So I’m going to go ahead and do just people subscribing to my newsletter; I think that’s an easy thing.
So in the beginning referral, it sort of introduces people to the newsletter, in the middle, not a whole lot happens.
And then you know, sort of last touch, you can see, you know, two thirds of my conversion, touch points are sort of last touch, touch points.
None is no medium.
And I know what that is.
That is actually social media, because I screwed up one of my click tracking things on my personal website.
It just shows the importance of having good governance.
And then referral, email, and organic search.
What I find interesting here is that, on a medium basis, organic search really isn’t showing up a whole lot here.
And this conflicts with reality. In looking at attribution models from GA3, and our own attribution models from our website, this does not match up with what we’ve seen and pretty much know to be true about my website.
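The funnel-ish view described above can be approximated from raw conversion paths. The following Python sketch labels each touchpoint as early, mid, or late and counts mediums per stage. The 25%/50%/25% split, the handling of one-touch paths, the function names, and the sample paths are assumptions for illustration, not a confirmed description of how GA4 builds its report.

```python
from collections import Counter


def bucket_touchpoints(path):
    """Label each medium in an ordered path as 'early', 'mid', or 'late'.

    Assumes the first and last quarter of the path (at least one touch each)
    are early and late; everything between is mid.
    """
    n = len(path)
    if n == 0:
        return []
    if n == 1:
        # A lone touchpoint is treated as the last touch.
        return [(path[0], "late")]
    edge = max(1, round(n * 0.25))  # size of the early and late buckets
    labels = []
    for i, medium in enumerate(path):
        if i < edge:
            labels.append((medium, "early"))
        elif i >= n - edge:
            labels.append((medium, "late"))
        else:
            labels.append((medium, "mid"))
    return labels


def funnel_counts(paths):
    """Count how often each medium appears at each stage across all paths."""
    counts = {"early": Counter(), "mid": Counter(), "late": Counter()}
    for path in paths:
        for medium, stage in bucket_touchpoints(path):
            counts[stage][medium] += 1
    return counts


# Made-up converting paths, oldest touch first.
paths = [
    ["referral", "email", "email", "organic"],
    ["referral", "(none)"],
    ["email"],
]
counts = funnel_counts(paths)
for stage in ("early", "mid", "late"):
    print(stage, dict(counts[stage]))
```

Building the same view yourself from exported path data is one way to sanity-check what the interface shows, for example whether a medium like organic search is really as absent as the report suggests.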
Katie Robbert 11:31
So why is that? If I go back a couple of steps, I know that one of the issues with GA3 and the default channel grouping is that there are a lot of problems with it straight out of the box, which is why we always counsel and coach people to modify it so that it’s collecting data correctly.
And so I guess my first question is, in GA4, how confident are you in the default channel grouping that’s out of the box, given that you can’t edit it? We know that in GA3, email is a problem.
We know that social is a problem.
And then I guess my next question is, what’s the disconnect between GA3 and GA4 in terms of this data? It’s the same data.
Christopher Penn 12:30
So how confident am I in it? I don’t know the underlying model.
So I have no confidence in it, because I don’t know how it works, right? It’s a black box.
If I knew for sure what it was and how it worked, and sort of the underlying basis of the model, I might have more confidence in it.
I am making the assumption, and this is an assumption because it’s not published anywhere in Google’s documentation, that it’s still using the same models from the old Google Attribution product, which was using Shapley values.
But even that doesn’t necessarily match up. When I look at the Markov chain model version that we use for my newsletter, it is a very different story, because it looks across all the different touchpoints, early, middle, and late, to say this channel does the most heavy lifting.
And we can see here that organic search is 57% of the drivers of conversion, whereas here it’s a tiny little bit, 1% of early touchpoints and 3% of late touchpoints.
That doesn’t match up.
So I don’t know why this is the case.
And I can’t explain it because I don’t know what the underlying model is.
All I can say is that we are working with theoretically the same data, right? Because our Markov chain modeling pulls from Google Analytics data, right? So it’s not like it’s pulling from some different data sources.
It’s all the same data set.
Why this is the case, I don’t know.
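For context on what a Markov chain attribution model does, here is a minimal Python sketch of the general removal-effect technique: estimate transition probabilities between channels from observed paths, then measure how much the overall conversion probability drops when each channel is removed. This is not the Trust Insights model, whose details are not given in the episode; the state names, function names, and sample paths are all hypothetical.

```python
from collections import defaultdict

START, CONV, NULL = "(start)", "(conversion)", "(null)"


def transition_probs(paths):
    """Estimate first-order transition probabilities.

    `paths` is a list of (ordered_channel_list, converted_bool) tuples.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for channels, converted in paths:
        states = [START] + list(channels) + [CONV if converted else NULL]
        for a, b in zip(states, states[1:]):
            counts[a][b] += 1
    return {
        a: {b: c / sum(nxt.values()) for b, c in nxt.items()}
        for a, nxt in counts.items()
    }


def conversion_prob(probs, removed=None, iterations=200):
    """Probability of reaching CONV from START.

    If `removed` is given, any visit to that channel counts as a dead end.
    Solved by simple repeated updating, which is enough for small examples.
    """
    value = defaultdict(float)
    value[CONV] = 1.0
    for _ in range(iterations):
        for state, nxt in probs.items():
            if state == removed:
                continue
            value[state] = sum(
                p * (0.0 if b == removed else value[b]) for b, p in nxt.items()
            )
    return value[START]


def markov_attribution(paths):
    """Share of credit per channel, proportional to its removal effect."""
    probs = transition_probs(paths)
    base = conversion_prob(probs)
    channels = sorted({c for chs, _ in paths for c in chs})
    effects = {
        c: 1.0 - conversion_prob(probs, removed=c) / base for c in channels
    }
    total = sum(effects.values())
    return {c: e / total for c, e in effects.items()}


# Made-up journeys: (touchpoints in order, whether the visitor converted).
paths = [
    (["organic", "email"], True),
    (["organic"], True),
    (["social", "organic"], True),
    (["social"], False),
    (["email"], False),
]
shares = markov_attribution(paths)
print(shares)
```

Because a channel earns credit for every path it helps along, not only for where it sits in the path, a model of this family can rank a channel very differently from a position-weighted report built on the same data.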
Katie Robbert 14:07
Well, and I guess my question wasn’t around your confidence in the model.
My question was around your confidence in the way that, in Google Analytics 4, Google has classified the different data points.
And so that was back to sort of the default channel grouping.
Christopher Penn 14:25
I’m not using default channel grouping.
So I’m using source medium here, in this section, because I know default channel groupings are wrong.
Katie Robbert 14:33
So every single data point that you’re looking at has been categorized, then by you through a UTM tag.
Christopher Penn 14:41
That’s correct, or unknown source.
So when we turn on, you know, default channel groupings, we now have referral, organic search, organic social, etc., again based on Google’s definitions, and unassigned.
And even here, this still doesn’t match up with reality.
Katie Robbert 15:00
So, not letting my anxiety get the best of me:
if I’m someone who’s being asked to look at this information and make a decision with it, I don’t know that I can do that.
Because based on what I’m being told by you, someone who knows Google Analytics inside and out, you can’t yet trust the data.
So what am I, someone who doesn’t know it as well, supposed to do about it?
Christopher Penn 15:30
So I honestly don’t know what to say, except that you probably want more than one source of attribution modeling to go on.
So I’m hoping that you took our advice
and you didn’t just turn off your GA3 account, right? Because obviously, you can go back to GA3 and look at the same data. In fact, I’m actually curious to see what that would look like.
So let’s go to my old GA3 account.
Katie Robbert 16:04
Well, and I guess one of the points that we want to drive home is that GA4 is still very much almost experimental at this point.
Not all the features are rolled out; we didn’t know if attribution modeling was coming back.
And lo and behold, there it is.
However, there are still some challenges with it, in terms of its intuitiveness, in terms of the data that is being brought in, and in terms of the underlying model.
And so while we’re sharing this information with you about Google Analytics 4, we still encourage you to set it up as a parallel instance. That’s what it should be: parallel to Google Analytics 3.
So now is not the time to abandon ship on GA3, and go to GA4.
Christopher Penn 16:56
Yeah, so just looking very quickly, organic search ranked a lot higher in GA3.
So there’s a definite difference between these two models.
GA3’s models are older.
I would assume, and again this is an assumption, that they’re not using the same machine learning models that GA4 is.
So the big question that we need to ask of our friends at Google is, what is the underlying architecture of the GA4 attribution model? Again, you don’t have to give away all the secrets, just say what family of classification it is. Is it Markov chains? Is it Shapley values? Is it a neural network? If we know that, then we can start to say, okay, based on how we know that model works, this answer may or may not be sensible, or at least get some insight as to essentially how the attribution model is coming up with decisions.
But to your point, Katie, right now I would be real hesitant to make decisions on that model, because it would say, effectively, that organic social is the thing for early funnel interactions on my newsletter.
I know for sure, again, because we’ve built our own models, that that is very much not the case in any way, shape, or form. Organic social, at least for this example, which is my personal site, is horribly bad compared to organic search, compared to, you know, real referral traffic and stuff.
And it would lead me to making a bad decision, a decision that would not be productive.
Katie Robbert 18:30
So let me reverse the conversation a little bit.
So we’re talking about Google Analytics four and our inability to feel confident in the data right now.
How confident are you in the accuracy of the data in Google Analytics 3?
Christopher Penn 18:49
In the underlying data for my website, I am very confident. The data that we pull out through the API and use for the Trust Insights attribution modeling, I am very confident in that, because it’s one of those things where, as you’re writing the code, you have to hand inspect the data coming in to make sure that it’s structured properly, and I am very confident in that data.
And by extension, I am confident that what GA3 is processing and displaying is correct.
It makes mathematical sense.
If you look at the underlying data, the underlying data looks good; it’s not full of weird garbage.
And it makes intuitive sense too, based on things that we’ve run. For our site and for my own website, I’ve run surveys to my audience.
I’ve asked people; we’ve run focus groups and things.
And we can see the attribution data from people themselves saying, yeah, this is how I found you, or this is how I heard of you.
Again, the one thing that we constantly tell people: always, as much as possible, have a little thing on every form that says “How did you hear about us?” and let people tell you how they heard about you. It is so valuable for calibrating your attribution models because, you’re right, Katie, if you didn’t have that other data, you could have two different attribution models and you wouldn’t know which one’s correct, because you have no source of truth.
But when people say, “I searched for you,” okay, so clearly the model that says search is the thing is probably more accurate than the model that says that’s not the case.
Katie Robbert 20:26
And I think that that’s an important part of this as well, is there’s a lot of trust in what the machines spit out.
Because it’s machine learning.
It’s AI, it must be correct.
And so I think that it’s a good practice to question it, and make sure that there’s some sort of other validation that the data is accurate.
And so that’s sort of the second piece, I think, of the point that we’re making here. Number one, make sure that you haven’t turned off Google Analytics 3, because Google Analytics 4, while it’s good, is not where it needs to be yet.
And then the second is, if your spidey senses are telling you it’s not right, it’s probably not right.
So having that second data source in Google Analytics 3, and having that third data source in customer feedback data directly from the customers, is going to continue to validate whether or not the information coming out of this big black box machine learning thing is correct.
And so what we’re seeing, at least initially, is that we don’t feel confident that the data coming out of Google Analytics 4 for attribution modeling is correct yet.
Christopher Penn 21:40
And I can’t underscore that third data source enough.
You know, you’ve got to ask people, how did you hear about us? What made you come in today? Why did you choose us? It is such important information, because it is the customer telling you; it is not relying on an intermediary.
Because again, as much as we love Google Analytics, as much as we love the team there and the way they do their work, they are still an intermediary between us and the customer.
Right as marketers, they still are interpreting the customer.
And there is never any intermediary that is better than talking to the customer directly.
Katie Robbert 22:23
I agree with that.
And I think that that sort of goes back to one of the underlying pieces of knowledge about how machine learning works, machine learning is only as good as the data that you feed it.
And so it starts with the customer’s self-reported information.
And so if the data coming out of the machine learning model and the data that your customers are reporting are mismatched, which one do you think is wrong?
Christopher Penn 22:52
And again, there are other pieces of software. If you’re using a CRM and marketing automation system, like Mautic, or HubSpot, or Marketo, those pieces of software also have their own attribution models.
Salesforce.com has its own attribution modeling.
And you may want to put a few of these models together, you know, side by side and say, okay, which ones here, you know, agree with each other, which ones don’t, I think it’s a valuable exercise.
But again, your ground source of truth has to be what the customer told you.
And that’s a great way to evaluate all these different models and say, Okay, here’s what customers have told us.
Which model most correlates to what the customers told us.
If you can do that, you’ll be in a really good position to say which pieces of software we should trust to make decisions, particularly if, say, you’ve got a $50,000 advertising budget this quarter.
If one model says spend on Facebook, and another model says spend it all on LinkedIn, who do you believe? Because if you’re wrong, you’ve blown 50 grand.
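The side-by-side exercise described above, checking which model most correlates to what customers told you, can be sketched in a few lines of Python. The channel shares below are made-up numbers for illustration, not data from the episode, and the function names are hypothetical. Pearson correlation is one reasonable choice of comparison, assumed here; other distance measures would also work.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def rank_models(survey, models):
    """Rank attribution models by how well they track survey answers.

    `survey` maps channel -> share of "how did you hear about us" answers.
    `models` maps model name -> {channel: share of credit}.
    Returns (name, correlation) pairs, best match first.
    """
    channels = sorted(survey)
    truth = [survey[c] for c in channels]
    scores = {
        name: pearson(truth, [shares.get(c, 0.0) for c in channels])
        for name, shares in models.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Made-up shares for illustration only.
survey = {"organic search": 0.55, "email": 0.20, "referral": 0.15, "organic social": 0.10}
models = {
    "model_a": {"organic search": 0.50, "email": 0.25, "referral": 0.15, "organic social": 0.10},
    "model_b": {"organic search": 0.05, "email": 0.30, "referral": 0.25, "organic social": 0.40},
}
ranking = rank_models(survey, models)
for name, score in ranking:
    print(name, round(score, 3))
```

With a handful of channels the correlation is only a rough guide, but it gives a consistent way to decide which tool’s output to trust before committing budget.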
Katie Robbert 24:01
Right. So back to the original point of this podcast: surprise, Google Analytics 4 introduced attribution modeling.
They call it advertising, and we’re not sure if it’s wholly accurate as of yet.
Christopher Penn 24:18
So do the homework.
And again, if you have not asked your customers how they heard about you, now is the time. What’s the expression, Katie? Like, today is the day?
Katie Robbert 24:30
Today is a great day to start fixing things.
Christopher Penn 24:34
Today’s the day to put that on your forms, to ask people in a newsletter survey, to float a poll on your social media accounts. Today, and we’re recording this on August 16, so August 16th, is the day for you to start collecting that customer data.
If you’ve got questions about anything we’ve talked about in today’s show, please join us over at the Analytics for Marketers Slack group. It’s totally free: go to TrustInsights.ai/analyticsformarketers, where you can join over 1,900 other professionals talking about all the marketing and analytics challenges you’re facing.
And wherever it is that you’re tuning in to our show today, if there’s a channel you’d prefer to receive it on, we’re probably there. Go to TrustInsights.ai/tipodcast, where you can find the show in any number of formats.
Thanks for tuning in, and we’ll talk to you soon. Take care.
Need help making your marketing platforms, processes, and people work smarter?
Visit TrustInsights.ai today and learn how we can help you deliver more impact.