
So What? Dark Data Mining

So What? Marketing Analytics and Insights Live airs every Thursday at 1 pm EST.

You can watch on Facebook Live or YouTube Live. Be sure to subscribe and follow so you never miss an episode!


In this week’s episode of So What? we focus on Dark Data Mining. We walk through what it is in marketing, examples of dark data, and what you can do about it. Catch the replay here:


In this episode you’ll learn: 

  • what is it?
  • where do you find it?
  • what do you do about it?

Upcoming Episodes:

  • How do you benchmark a website’s performance? – TBD
  • Auditing your Tag Manager account – TBD


Have a question or topic you’d like to see us cover? Reach out here:

AI-Generated Transcript:

Katie Robbert 0:17
Well, hey, happy Thursday, happy April 1. We are not pulling any pranks today. You know, quite honestly, we just don’t have the energy. So welcome to So What?, the marketing analytics and insights live show. Today we’re talking about dark data: what is it, where do you find it within your own systems, and what to do about it. And so, you know, I remember when we first launched Trust Insights, we talked a lot about dark data. And what we realized very quickly was that people didn’t fully understand what it meant. I do remember I was either doing an interview or a podcast, and Chris, you commented afterwards that the example the interviewer gave was incorrect. I certainly don’t want to call anybody out. But basically, the way that we think about dark data is it’s data that’s collected that is not then used. And I think one of the examples that we give a lot is fitness data: if you have a fitness tracker, and you’re constantly tracking your steps and your heartbeat and all of these things, but you never look at the information, you never do anything with it, you never use it to make a decision or to make a change, that then is by definition dark data. Not because it’s something you can’t find, but because it’s something that literally just sits there on a shelf in the dark, never getting used. So Chris, dark data.

Christopher Penn 1:46
Today we’re looking at a very thin slice of it, specifically dark website traffic. This is something that pretty much everybody has. It is something that pretty much everybody does nothing about. And it impacts our ability to actually make decisions, to know what’s going on. So let’s do this: let’s go into our friend Google Analytics here. This is the Trust Insights account, this is Google Analytics 3, and we’re going to go into Acquisition, then All Traffic, and just do Source/Medium. Now, source/medium is pretty straightforward: where did traffic come from, and what general channel is it? And one of the things that you’ll notice, and this is true of everyone’s web analytics, is you’ll have this direct/none category here, right? This is literally dark web traffic. We don’t know what it is. This is traffic that came in that has no attribution whatsoever, so we can’t tell where it came from. We can’t tell if it was from an ad we were running, or maybe if it was somebody literally just typing in TrustInsights.ai, which, if you were that person, thank you. But we don’t know. And what we need to know is, given that this is the number three traffic source on our website, is there a problem? Do we have a dark data problem? Let’s flip over really quickly into bar graph mode here, and I’m going to extend this out. Let’s look at the last month. Actually, let’s look at the whole quarter, right, it’s quarter end. Let’s see what happened in Q1. For us, 10% of our traffic is missing, right? We have no attribution on it. And Katie, as the person making decisions, is that a problem for you?

Katie Robbert 3:33
It is. Um, you know, 10%, when you think about it, you’re like, oh, well, it’s only 10%. But especially where we have to be really strategic and tight about our budget, I want to know exactly where people are finding us so that I know exactly where I should be putting our time and resources and effort. And, you know, we’ve seen it upwards of 70% for some companies, which is a bigger problem, but 10% is not insignificant when you look at the whole thing. If you drop that 10% out, then what are you losing? And so I think my first question to you, Chris, is, yes, I want to know what that actually is. Why does it come in that way? Why can’t Google Analytics figure out what it is? Aren’t they Google? Aren’t they super smart? It’s their system. These are all the questions that come to mind for me, and I’m sure they come to mind for other marketers. And I know that the stock answer is, well, Google can’t figure it out. But there’s got to be something we can do. I absolutely want to know: is there a way to reduce that number over time?

Christopher Penn 4:49
Exactly. And so I think it’s really worth digging into that point of what causes this. Direct traffic comes from a few different things. One, it can come in from you typing in the URL. Two, it can come in from sources that have no attribution. So for example, when you click on a link in your email, if you’ve got no tracking codes in your email, it’s going to come in as direct/none. It simply will have no attribution, because your email client, like Microsoft Outlook on your desktop, is not recognized as a source of traffic. Then there are systems that are encrypted: the Apple Safari browser on iOS blocks an awful lot of tracking data. So if you’ve got a visitor coming from an encrypted system, it will strip off some of that attribution data unless it’s explicitly encoded with UTM tracking codes. And something we discovered this week, actually: if even one of Google’s stock tracking codes is malformed, you fat-fingered the word, all of it breaks. We tested this, Katie, on your website. We said, okay, utm_source equals whatever, and then we just mangled the little ampersand. And it came in as direct/none. We were like, oh, that’s interesting. So Google didn’t even attempt to preserve the working parts, it just threw the whole thing out. So that’s kind of what causes this. Now, our first step, obviously, is to figure out whether we have a problem. In this case, a number four traffic source, I would agree, is kind of a problem we want to know about. So now we have to figure out what the nature of the problem is. And the easiest way to do this is to think about where this traffic is going. If it was going to, say, one of our purchase pages, like, ah, that’s really not good. If it’s really just going to blog posts, okay, fine, I can live with unattributed traffic going to the blog. So how do we do this? Go ahead, Katie.
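
The malformed-tag failure Chris describes can be caught before a link ships. The following is a minimal sketch, not Google's validation logic: the function name `utm_problems`, the `REQUIRED` policy, and the example URLs are all assumptions made for illustration.

```python
from urllib.parse import urlsplit, parse_qsl

# Policy for this sketch only (an assumption, not a Google rule):
# every campaign link must carry these three tags.
REQUIRED = ("utm_source", "utm_medium", "utm_campaign")
KNOWN = REQUIRED + ("utm_term", "utm_content")


def utm_problems(url):
    """Return a list of human-readable problems with a URL's UTM tags."""
    params = dict(parse_qsl(urlsplit(url).query, keep_blank_values=True))
    problems = []
    for key in REQUIRED:
        if key not in params:
            problems.append(f"missing {key}")
        elif not params[key].strip():
            problems.append(f"empty {key}")
    for key, value in params.items():
        # A key that mentions "utm" but is not a known tag is probably a typo.
        if "utm" in key.lower() and key not in KNOWN:
            problems.append(f"unrecognized tag {key}")
        # A tag name inside a value usually means a separator went missing.
        if "utm_" in value.lower():
            problems.append(f"{key} value contains another tag name")
    return problems


if __name__ == "__main__":
    good = "https://example.com/?utm_source=newsletter&utm_medium=email&utm_campaign=q1"
    typo = "https://example.com/?utm_soruce=newsletter&utm_medium=email&utm_campaign=q1"
    lost_separator = "https://example.com/?utm_source=newsletterutm_medium=email&utm_campaign=q1"
    for url in (good, typo, lost_separator):
        print(url, "->", utm_problems(url) or "ok")
```

Running a check like this over links before they go out is cheaper than discovering the break later as a rise in direct/none traffic.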

Katie Robbert 6:45
I was gonna say, well, I slightly disagree with you. Because one of the things that we do as part of our own health check analysis is what we call the most valuable pages report, which is an attribution analysis based on the content from our own website. And so you’re saying, just as an example, it’s not as important if we don’t know the traffic going to the blog. However, if we see in our most valuable pages report that the blog is a big driver of conversion traffic, then yeah, I want to know what that is. So I guess I disagree with you that it’s okay, we can live with it. Because our blog, when you look at our most valuable pages report, is what, number six, number seven in terms of the top drivers of conversions for us. So therefore, I still want to know what it is. You can’t get out of this one, Chris Penn.

Christopher Penn 7:39
All right, so let’s go and figure out where the traffic goes. I’m going to pull up Google Data Studio here, switch over to Edit mode, and bring up a brand new scratch page, a place where we’re going to have just a little bit of fun. Let’s go ahead and clear everything that’s here. I’m going to add in a new table, and we’ll put in some heat map colors for fun. Okay. Now, the first thing we want is users; I want to know how many people, so I’ll swap out new users for all users. We have the page. I personally prefer the page URL itself, as opposed to the title, because sometimes, if you’re careless, you have overlapping titles. And so we see this is our traffic. Now the last thing I want to do is add in our source/medium. And I’m going to swap that here and put source/medium first, page second. Okay. Now, there are a lot of different sources here, so I want to filter this

Katie Robbert 8:48
by direct/none.

Christopher Penn 8:50
Yeah, direct only. I want to include a source/medium which equals direct/none.

Katie Robbert 9:07
Now, this is going to tell us when they come into our website from this mysterious spot, this is where they go,

Christopher Penn 9:16
Yes, this is where they go. Because again, if we’re getting a lot of direct traffic to major conversion pages, I would view that as kind of an emergency, right? That’s something we need to fix sooner rather than later. Because if we don’t fix it, then we could be, for example, spending money on ad campaigns where our ads are broken, and that would really suck. So in this case, we have, oh, that’s a little hard to read, let me straighten out our data. Okay, so the majority of it goes to the homepage by a substantial margin, then the blog, Contact Us, and then somebody came back via a bookmark to what is a newsletter subscription page, and then we have a lead gen thing there. So there’s not a ton of stuff on here that’s like, oh my goodness, we’re in a whole lot of trouble. It’s mostly just the homepage. So I actually feel pretty good about this. When your direct traffic goes to the homepage, it’s usually people just typing in the URL. So that’s not a bad thing.
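
The same table can be built outside Data Studio from an exported report. This is a rough sketch that assumes the export has already been read into (source/medium, page, users) rows; the function name and the numbers are made up for illustration and are not the figures shown on screen.

```python
from collections import Counter

DIRECT = "(direct) / (none)"  # how Google Analytics 3 labels unattributed traffic


def direct_traffic_by_page(rows, skip_pages=()):
    """Sum users per page for direct traffic only, biggest pages first.

    rows: iterable of (source_medium, page, users) tuples.
    skip_pages: pages to leave out, e.g. ("/",) to screen out the homepage.
    """
    totals = Counter()
    for source_medium, page, users in rows:
        if source_medium == DIRECT and page not in skip_pages:
            totals[page] += users
    return totals.most_common()


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    rows = [
        ("(direct) / (none)", "/", 120),
        ("google / organic", "/", 90),
        ("(direct) / (none)", "/contact", 12),
        ("(direct) / (none)", "/blog", 30),
        ("(direct) / (none)", "/contact", 3),
    ]
    print(direct_traffic_by_page(rows))
    print(direct_traffic_by_page(rows, skip_pages=("/",)))
```

If conversion pages show up near the top of the list, that is the "emergency" case described above; if the homepage dominates, most of the direct traffic is probably people typing in the URL.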

Katie Robbert 10:24
It’s not a bad thing. Um, and so I guess then the next question is, let’s say it was going to a services page, or let’s say it was going to the contact form, which actually is number three on this list for, I’m guessing, what, the past 30 days or so? What was the date range? Okay, year to date. I would want to know where that was coming from. So how do you start to resolve that in your own data? Because I know that we’ve worked with clients where the first time we get into their Google Analytics, upwards of 70% of their data is unattributable, but we’ve been able to reduce that number through the work that we’ve been doing down to like 35%, which still sounds really, really high. But compared to 70%, it’s now less than half.

Christopher Penn 11:18
Right, exactly. So the next thing to do is to try and figure out: does the direct traffic resemble something else? Is there a channel that it looks like, where you can say, okay, this has a correlation of sorts to another channel? Because there are some channels, like social media, for example, where there’s dark social, where you don’t know where the attribution is coming from. For example, when you use Slack, if you’re in our Analytics for Marketers Slack group at TrustInsights.ai/analyticsformarketers, if I paste a link in there and I don’t put tracking codes on it, it’s going to come in as direct, because Slack is not a browser, right? And so there is no attribution. So that would be an example of that. However, those interactions may look like, say, Twitter or Facebook or another social channel where there is attribution data. So we want to try and figure out whether there is a relationship. Does our dark traffic look like search traffic, for example? If people are searching on their iPhones, we’re not going to get that data. So one of the first things we’re going to want to do is go into our Audience data here, and look at our devices and technology. Again, there are some brands of technology, based on browser, that we know obscure data. In this case, our mobile traffic is 26% of our site, and of that, half, about 13%, is Apple, right? So, okay, hmm. Apple devices would not be sending us tracking data unless we explicitly encoded our URLs everywhere with UTM codes. So somebody coming to us from a Google search inside of the Safari browser on an Apple device would come in as direct, right? That would be an example of dark search. So we’ve already ascertained that we’re going to have some interference from mobile devices, for sure. Our next step is to figure out, again, let’s do the first version, let’s just eyeball it. Just take a look here. Let’s do.

Katie Robbert 13:34
So, Chris, while you’re pulling that up, a question came up while you were putting together that simple dashboard: what about looking at the path for those direct-to-the-homepage ones? It might give you more of a clue if there’s a pattern. What do you think about that approach?

Christopher Penn 13:49
One of the things that we do as a filtering mechanism to look at other direct traffic is actually screen out the homepage, because obviously, if someone is just typing in the URL there, we appreciate that’s actual direct traffic. So one of the things that you can do, if you want, is instead of page, you can specify, let’s go back here, landing page. So let’s do that. And landing page will be the first step in the journey.

Unknown Speaker 14:21
Put that in here.

Unknown Speaker 14:24
And go back to view.

Katie Robbert 14:27
So in this example, what is the difference between page and landing page?

Christopher Penn 14:32
Page is any page that was visited by that user; landing page is the first page that they went to. And what we’re seeing here is, again, well, there’s direct traffic to the contact page, right? But the first page in that particular journey was just the homepage, though. So if you wanted to, you could really get rid of page entirely.

Unknown Speaker 14:58
I think in this example, landing page is more helpful.

Christopher Penn 15:01
Yeah. So now we see we have 201 visits that went straight to the homepage as their very first stop. Second was unknown; I literally have no idea what happened there. That, by the way, could be from someone using a very highly secure browser. There are strict versions of Firefox that block everything, and there are certain ad blockers, like Ghostery, for example, that can even prohibit the loading of most tracking codes.

Christopher Penn 15:28
We have resources, blog, and so on and so forth; the contact page is spot number six. So this is looking pretty good. Okay, so we have our all users; we’re actually going to turn that off. We’re going to take our direct traffic, and let’s put in our organic search traffic.

Katie Robbert 15:51
This is based on your hypothesis.

Christopher Penn 15:55
Yes. And we’re going to put in email traffic.

Katie Robbert 16:00
Now, the reason that you’re doing that, Chris, for us specifically, is because we know the majority of the channels that we’re using are organic search, organic social, and email. We’re not currently running ads, so we don’t need to factor in paid search and paid social.

Christopher Penn 16:22
That’s correct.

Katie Robbert 16:24
Now, if you were a company that was doing all of those things, you would want to make sure that you have those segments created and then use them in this kind of an exercise. But for us, you know, for lack of resources, we aren’t doing more than the organic.

Christopher Penn 16:45
Exactly. So now we’re looking at our direct traffic, which is the orange line in there. It’s difficult to see: is there a relationship with any of the other lines? It’s difficult to say in this case, because you really can’t eyeball correlations, and you probably shouldn’t. But initially there doesn’t appear to be anything strongly obvious. So that brings us to a more complex version of this analysis, where instead of something very simple, we’d actually want to look at every source/medium, right? Because to your point, Katie, a larger organization might have 20 or 30 or 40 different campaigns and sources going on at the same time. They might have paid email, paid social, SEM, display ads running, and all that stuff means that instead of doing a basic version of this, we need to move to a more advanced version. So I’ve gone ahead and run it. We actually have a piece of software we wrote that does this, that says, okay, take your direct traffic and show us the correlations between direct traffic and existing known channels. And in this case, and this is for our website, this does not apply to anything else, Google organic has the strongest correlation to our dark web traffic, which suggests that our initial analysis, of Apple devices being about 13% of our overall traffic and dark search being the thing, is right. We also have In the Headlights, which has a 0.38 correlation, not as strong. And then after that we start going into essentially no statistical relationship: once you’re below 0.25, everything to the left of here is pretty much invalid. So you have some stuff from Talkwalker, email, some Facebook stuff; there’s a slightly stronger relationship with my email newsletter, with the company email newsletter, and Google organic. So we now have a sense of, okay, this is probably what our dark web traffic is. We’ve decomposed it to this point. The next question is, can we solve it? Can we solve any of this? The first thing

Katie Robbert 19:06
Before we go too far, Chris, I think one of the things that I just want to go back to is, so this is software that we created. This is our code.

Katie Robbert 19:16
No other marketing agency has this. And so if you’re looking at your data and saying, I have a bunch of direct/none, this is something that we uniquely do. Is there a version of this that other marketers can do that doesn’t require us handing over our proprietary code?

Christopher Penn 19:39
Yes. If you go into Google Analytics and you export this version that we’re doing here, hit export to Google Sheets, it will spit this out as a nice spreadsheet. You then have to break out column C into individual columns for your channels. And then, once you’ve done that, you run a correlation right inside the spreadsheet software of your choice, column by column.
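
For anyone who would rather script the column-by-column correlation than run it in a spreadsheet, here is a minimal sketch. It is not the Trust Insights software; it assumes you already have one number per day for direct traffic and for each known channel, computes a plain Pearson correlation, and drops anything under the 0.25 floor mentioned above. The function names and the sample series are illustrative assumptions.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have the same length, at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series cannot co-vary with anything
    return cov / sqrt(var_x * var_y)


def rank_channels(direct, channels, floor=0.25):
    """Return (channel, r) pairs with r >= floor, strongest first.

    Only positive correlations are kept, since the question is which
    known channel the direct traffic moves together with.
    """
    scored = [(name, pearson(direct, series)) for name, series in channels.items()]
    kept = [(name, r) for name, r in scored if r >= floor]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Made-up daily user counts for illustration only.
    direct = [10, 12, 9, 15, 20, 18, 11]
    channels = {
        "google / organic": [100, 120, 90, 150, 200, 180, 110],
        "newsletter / email": [5, 5, 5, 5, 5, 5, 5],
        "facebook / social": [20, 18, 21, 15, 10, 12, 19],
    }
    for name, r in rank_channels(direct, channels):
        print(f"{name}: {r:.2f}")
```

A correlation only says two series move together; it suggests where to look, it does not prove that the direct traffic came from that channel.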

Katie Robbert 20:13
Okay, I think that’s helpful, because again, not everyone has access to something like R, or has the skill set to run something like R and write their own code. So as long as there are other alternatives to do that. And remember, John, I saw you write this down: do not eyeball correlations. You were like, oh, crap, I’ve got to remember not to do that.

John Wall 20:37
Yeah, that was the economist in me twitching uncontrollably.

Christopher Penn 20:46
So we’ve ascertained through basic correlation analysis that organic search is the number one thing, with number two and number three being our email newsletter and then Facebook having the strongest correlations. So in the spirit of the name of the show, so what? What we have to do is start trying to get rid of as many things on this list as possible. The first thing, and I would say probably the easiest thing to do, would be to look at these different sources and say, what do you have control over? Right, so we have control over our email newsletter. So the first thing I would do in this particular instance is fire up our email newsletter software and just go in and double check: hey, for our newsletter, is there anything in here where we have links to our website that don’t have UTM tracking codes? Right. So here’s one of our calls to action. It’s got the UTM tracking codes, they’re spelled correctly, things like that. What else have we got here? We’ve got the homepage. Now, if I were to go back earlier in the quarter, oh, earlier in the quarter, the headline banner did not have UTM tracking codes on it. Right. So,

Katie Robbert 22:03
So, fired, Chris. Right. But I think that this is actually a really useful pro tip. This, I’m guessing, and it’s a big assumption, is a very, very common piece of the puzzle that’s overlooked. Because one of the things we say is you don’t need to UTM your own website. Well, this isn’t actually the website. This is an asset outside of the website that redirects to your website. So you do want to UTM tag it, even though it is your own URL. It’s a subtle difference, but it’s an important one.
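
The newsletter check described here, finding links back to your own site that carry no UTM tags, can be automated. The sketch below assumes you have the newsletter as an HTML string and treats a link as tagged when it has both `utm_source` and `utm_medium`; that rule, the class and function names, and the example domain are assumptions for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit, parse_qs


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def untagged_links(html, own_domain):
    """Return links to own_domain (or its subdomains) lacking UTM tags."""
    collector = LinkCollector()
    collector.feed(html)
    missing = []
    for href in collector.links:
        parts = urlsplit(href)
        host = (parts.hostname or "").lower()
        if host == own_domain or host.endswith("." + own_domain):
            params = parse_qs(parts.query)
            if "utm_source" not in params or "utm_medium" not in params:
                missing.append(href)
    return missing


if __name__ == "__main__":
    newsletter = """
    <a href="https://example.com/?utm_source=newsletter&amp;utm_medium=email">CTA</a>
    <a href="https://example.com/banner">Headline banner</a>
    <a href="https://other.example.org/page">Someone else's site</a>
    """
    print(untagged_links(newsletter, "example.com"))
```

Links to other people's sites are skipped on purpose, since tags on those would land in their analytics rather than yours.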

Christopher Penn 22:44
Exactly right. So yes, this is your email, and here we have a case where that was not tracked. Right. So even though we use a piece of software called Mautic that auto-appends UTM tracking codes, it’s not guaranteed; you don’t know for sure whether it did it correctly, or whether it was correctly interpreted. If you put them in yourself manually, and you double check your work, then you know: okay, yep, I made sure that I’ve got those tracking codes in place. The same is true for social media posts. The same is true for any link that you share anywhere outside of your website: stick those tracking codes on, and make sure that they’re using the conventions that are approved by Google. Which brings us to the next point. Google has a list of source/mediums that they strongly recommend. Let’s put this in.

Katie Robbert 23:36
So it’s funny that you say strongly recommend. It reminds me, one of my friends has a teenage daughter, and, you know, now I guess you just text with your kids, you don’t actually talk to them. And so she’s like, I strongly recommend that you come downstairs and do the dishes, and her daughter will respond, I know that’s not a question, you’re actually telling me to do it. And so it reminds me of the statement that you just made, where you’re saying Google strongly recommends these source/mediums. But I’m guessing it’s less of a recommendation and more of a do-it-this-way.

Christopher Penn 24:09
If you want your stuff to work, yes. Google has this nice article in support that lists the default channel definitions. So when they look at how to allocate tracking and give credit without a whole lot of extra work, they actually specify exactly: you know, for email, medium exactly matches email. By the way, that’s all lowercase. So if you had a capital E for email, it would not associate that with that channel grouping. So our recommendation definitely is make sure that you are adhering to Google’s definitions in any of the UTM tracking codes you’re using. That’s pretty straightforward. So the next thing that we recommend is, when you are doing all this tracking, make sure that you’re keeping it someplace, you know, sensible, so that it’s, I guess, governed. I know Katie can talk about some of the clients we’ve had that don’t have that level of governance.

Katie Robbert 25:10
Well, and that’s just it. So, you know, we joke, slash not joke, that governance is this awful, scary word. But really, what it just means, at least in this context, is having some kind of a repeatable process. And so the easiest way to make sure that there is compliance with correct UTM tracking is to put it in a spreadsheet. What we’ve done, and what we make available to our clients, is an automated spreadsheet. I put this sort of lightly, but basically it’s just a spreadsheet that auto-creates your UTM URL with the information that you put in. And so you can restrict it down to only the source/mediums that your agency or your company should be using, in a drop-down. So you can see in column F, it’s a drop-down menu. And so we don’t give people, ourselves, or our clients the option to put in things that aren’t on Google’s approved list if they’re using Google Analytics. And that’s one way to really keep better control over what those UTM URLs start to look like. With the source, there’s a little more flexibility, because you’re going to have, you know, we have like MarketingProfs, and So What?, and things that are unique to us, versus just your straight Facebook, Twitter, Instagram, email newsletter. You want to have a little bit more flexibility there. But again, really make sure that you’re thinking about it in a structured way.
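
The spreadsheet's drop-down idea translates directly into code: only accept a medium from an approved list, and match it exactly, including case. This is a sketch of that idea, not the Trust Insights spreadsheet. The `ALLOWED_MEDIUMS` set is an illustrative assumption (only the lowercase `email` rule is confirmed in the discussion above), so replace it with the values from Google's default channel definitions that your team actually uses.

```python
from urllib.parse import urlsplit, urlunsplit, urlencode, parse_qsl

# Illustrative allowlist (an assumption for this sketch). Matching is exact
# and case-sensitive, so "Email" is rejected just like a typo would be.
ALLOWED_MEDIUMS = {"email", "social", "referral", "cpc", "display", "affiliate"}


def build_utm_url(base_url, source, medium, campaign):
    """Return base_url with utm_source, utm_medium and utm_campaign appended."""
    source, campaign = source.strip(), campaign.strip()
    if medium not in ALLOWED_MEDIUMS:
        raise ValueError(f"medium {medium!r} is not on the approved list")
    if not source or not campaign:
        raise ValueError("source and campaign must not be empty")
    parts = urlsplit(base_url)
    # Keep existing non-UTM parameters, drop any stale UTM tags.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.startswith("utm_")
    ]
    query += [
        ("utm_source", source),
        ("utm_medium", medium),
        ("utm_campaign", campaign),
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(query), parts.fragment)
    )


if __name__ == "__main__":
    print(build_utm_url("https://example.com/page?ref=1#top", "newsletter", "email", "q1"))
```

Source is left free-form here, mirroring the point that sources need more flexibility than mediums.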

Christopher Penn 26:45
Exactly. So in terms of getting bad data out of Google Analytics, having those UTM tracking codes really is the easiest way to start slimming down that big pile of direct traffic. And again, if your site has, you know, 15, 20, 25, 30, 50, 70, 80% direct traffic, at that point you start running into real statistical problems, like, can you even rely on the decisions that are being made? So if you look at the steps we’ve taken today: number one, figure out whether there is a problem. Number two, look at the magnitude, the impact of the problem; if this traffic is going to key pages on your website, you need to do something about it. Number three, do the analysis to figure out where the source of the problem could be coming from. And then number four, implement, at least for what you have control over, fixes to slim down the problem. You’ll make inroads into it. Like Katie was saying, one of our clients had 70% of their traffic as direct. At that point, there was absolutely no way to even think their analytics were reliable. As we start cutting that down, even just this week we identified yet another set of problems that they had, which will probably chop off an additional 10 or 11% of that 30% remaining. Oh, good question here from our listeners: how would you ensure or encourage tracking links outside of marketers or typical content creators? Again, to what Katie was saying, a lot of tools do help enforce this already. If you’re using, you know, Salesforce CRM, you could actually build templates and stuff in Salesforce and HubSpot to have pre-approved content. One of the systems that I particularly like is having your own URL shortener, and having your own URL shortener means that people can’t change the tracking codes. So we use this system. It’s open source; it’s called YOURLS, Your Own URL Shortener. And all of our stuff goes into this, and as it goes in, it’s auto-UTM-tagged. So no matter what the link is, it goes in here. And then what you get out of it is a short link. So if you were to go into, say, the Trust Insights newsletter, I’m going to scroll down here, and you look at the news, right? These are all the links we ship. Every single one of them is a shortened URL. This shortened URL, you can’t rewrite it; you can’t break the UTM tracking codes that are embedded in it. In fact, not only do these have Google Analytics tracking codes, they also have, you can see in our shortener, StackAdapt code, so as you click on these URLs, we can track who clicks on them and then show them retargeting ads later, which is kind of a fun trick for another show.
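
To see why a shortener protects the tags, it helps to look at the mechanism in miniature. The sketch below is not the YOURLS API; it is a hypothetical in-memory stand-in (the `ShortLinks` class, the `go.example.com` domain, and the hash-based codes are all assumptions) showing that the short link exposes no parameters for anyone to edit or mistype, while the tagged URL is stored on the server side.

```python
import hashlib


class ShortLinks:
    """Hypothetical in-memory shortener: short code -> UTM-tagged URL."""

    def __init__(self, base="https://go.example.com/"):
        self.base = base
        self._targets = {}

    def shorten(self, tagged_url):
        """Store the tagged URL and return a short link that hides the tags."""
        code = hashlib.sha256(tagged_url.encode("utf-8")).hexdigest()[:8]
        existing = self._targets.setdefault(code, tagged_url)
        if existing != tagged_url:
            raise ValueError("short code collision; use a longer code")
        return self.base + code

    def resolve(self, short_url):
        """Return the stored tagged URL a redirect would send visitors to."""
        code = short_url[len(self.base):] if short_url.startswith(self.base) else short_url
        return self._targets[code]


if __name__ == "__main__":
    links = ShortLinks()
    tagged = "https://example.com/?utm_source=newsletter&utm_medium=email&utm_campaign=q1"
    short = links.shorten(tagged)
    print(short)
    print(links.resolve(short))
```

A real deployment would persist the mapping and issue an HTTP redirect on each click; the point of the sketch is only that whoever shares the short link never touches the query string.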

Katie Robbert 29:46
Well, and I think that this is a good point, Chris, because we haven’t even covered that you take all of this painstaking time to put together the UTM tracking codes, and then some websites, some systems and platforms strip them out altogether or change them. You know, calling you out, Facebook, we know that you do that. So knock it off. And so I think that’s one of the things where, even as you’re building them in, you need to make sure that you’re protecting all of that data that you’re so carefully building into your URLs. To go back to this question about how you ensure tracking links outside of typical content creators, that’s where the governance structure comes in. What we often see is, the larger the company, the more siloed it is, the more disconnected. Even if you have one central digital marketing team, you might have a bunch of other business lines doing their own version of digital marketing. And so it really needs to be a collaborative effort, with one or two people overseeing and QAing. It’s not policing, it’s really just QAing, making sure that everybody is getting the right credit for all of the really hard work that they’re doing, by ensuring that you have the right tracking codes on all of the URLs that are going out. If you have an email team, if you have a social team, if you have outside partners or agencies that are doing things on your behalf, make sure that this is part of the onboarding conversation, and that it’s something that’s revisited, probably quarterly, to make sure that these things are coming in correctly. Because they want to retain their contract with you, so they probably want to make sure that their stuff is getting the right credit in your data as well.

Christopher Penn 31:39
That really is the key. Attribution modeling that is shared publicly is the easiest way to get people to self-enforce, right? People will say, yeah, my thing did awesome this quarter, my budget should get more money, because my thing did awesome. If you pull up one of our standard attribution models, you know, my newsletter did 37% of the conversions for Q1. So, Katie, you can fire me, but I’m taking my newsletter with me.

Katie Robbert 32:12
Dang it. No, I know what the contract says. You owe me 50% of your newsletter.

Christopher Penn 32:22
But when you think about ad partners, co-marketing agreements, it’s like that: you show this model to people and you say, okay, this is what we’re using to make decisions for the coming quarter. What’s going to get budget? What’s going to get resources? Who got credit? What should we do more of? And as soon as all the parties involved see this, they go, oh, heck, I need to improve my tracking, because I want more budget, I want a bigger seat at the table, kind of thing. You know, when we were doing one of these attribution models for one of our clients, there was practically a knife fight between three different teams, because they were saying, no, no, we deserve credit. And there was a whole big thing of, oh, we didn’t have tracking codes on our stuff. Wouldn’t you know it, at next month’s meeting they had tracking codes on everything, and the contribution that they brought to marketing went from like 2% to 37%. And suddenly they got religion, like, yeah, with tracking codes, I look better in front of the boss. So, to that question, how do you encourage tracking links? It’s pretty easy. You have almost a scoreboard of sorts. That’s what your attribution model kind of is, a scoreboard: who’s going to have the biggest score this quarter?

Katie Robbert 33:36
Now, John, you do a newsletter for Marketing Over Coffee, and you have links that go outside of your website, you have sponsors. How often are people providing you that information up front? You know, when you have sponsors, they want to know where their stuff goes, and how well sponsoring with Marketing Over Coffee converts. What does that look like for you?

John Wall 33:59
Yeah, we spend a lot of time just encouraging sponsors to, you know, give us a trackable link, because it’s just like you said, this positive reinforcement. We want them to actually, at the end of the month, go back and say, oh my God, this newsletter drove traffic for us and was for real. And then another thing we see is there are brands that, once they get big enough, already have the religion, and they just say, you know, we’re just going to earmark X thousand dollars a month, and that’ll just go straight to the homepage, and we don’t care. But every other business, one that’s not at the point where it just always sets aside $200,000 for general branding, wants to know where their money is going. And so yeah, you really need to do that. And you can’t undersell what you were just talking about, the positive reinforcement. Because with a spreadsheet where, if you don’t use a UTM code, somebody comes by to break your finger, that’s not really motivation. You want to get positive results and get people on board with that, because going around every month trying to beat UTM codes into people is just horrible all around.

Katie Robbert 35:00
It is, and it's something that we've seen at big companies and small companies. It's that extra step where people think, oh, I don't have to worry about that, it comes into my Google Analytics correctly. So Chris, one of the other things we haven't really talked about is the actual infrastructure setup of Google Analytics: making sure that, if you don't have a UTM tracking code, at least your channel groupings are set up correctly in Google Analytics, so that you give yourself a little bit more of that advantage.

Christopher Penn 35:34
I think we want to save that for another show, because some things are changing really rapidly on the Google Analytics infrastructure front, to the point where what will be best practice in the next few months is a big, big departure from the way we've traditionally done it. We were talking on this week's Marketing Over Coffee episode about some of those infrastructure changes. The big one is server-side tagging. As companies like Apple crank up their privacy, to the point where some things like your Google Analytics tracking code may not necessarily work correctly, and as browsers end their support for third-party cookies, server-side tagging is going to become more important. No matter what the device is, no matter how private it is, at the end of the day there's a person consuming resources on your web server. Those server logs can then be used as a base for improving tracking, to the point where you may even be able to recover more of that direct traffic, because you may be able to see what it is just from the raw data on the hardware that you own or rent. But it's more complicated than what we've seen in the past. In the old days, you copied your tracking code out of Google Analytics, pasted it on your website, and you were done. Now you have to install a new server on Google Cloud and build a separate new container in Tag Manager. There's a lot to it. So I think that'd be a good, more advanced show down the road, once we have a way to explain it that isn't literally four hours of technobabble.
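
To make the server log point concrete, here is a rough Python sketch. It is not server-side tagging in Tag Manager, which is a separate setup; it only shows that a raw access log line already carries the requested URL (including any UTM parameters) and the referrer. It assumes the common "combined" access log layout, and the log lines, function name, and labels are made up for illustration.

```python
import re
from collections import Counter
from urllib.parse import parse_qs, urlsplit

# Matches the request, status, size, and referrer fields of a
# combined-format access log line (assumed layout).
LOG_PATTERN = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)"'
)


def traffic_hints(log_lines):
    """Count requests by UTM source if present, else by referrer host."""
    hints = Counter()
    for line in log_lines:
        match = LOG_PATTERN.search(line)
        if not match:
            continue  # skip lines that do not fit the assumed layout
        query = parse_qs(urlsplit(match.group("path")).query)
        if "utm_source" in query:
            hints["utm:" + query["utm_source"][0].lower()] += 1
            continue
        host = urlsplit(match.group("referrer")).netloc
        if host:
            hints["referrer:" + host] += 1
        else:
            hints["unknown"] += 1  # no tag and no referrer: still dark
    return hints


# Made-up sample lines for illustration only.
sample_log = [
    '203.0.113.5 - - [01/Apr/2021:13:00:00 -0400] "GET /blog/?utm_source=Newsletter&utm_medium=email HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '203.0.113.6 - - [01/Apr/2021:13:00:05 -0400] "GET /pricing HTTP/1.1" 200 2048 "https://www.example.com/post" "Mozilla/5.0"',
    '203.0.113.7 - - [01/Apr/2021:13:00:09 -0400] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
]

print(traffic_hints(sample_log))
```

Requests that carry neither a UTM tag nor a referrer still end up in the "unknown" bucket, which matches the point that logs can recover some of the direct traffic, not all of it.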

Katie Robbert 37:15
I would agree with that, because even as you were just describing it now, I'm like, wait, did I just black out? Where am I? What's going on? I mean, I was totally paying attention, I swear. No, but it is very technical, and it can be overwhelming. And so I think that we still want to sort it out a bit more before we try to re-explain it to people. So we're focusing on the things that are knowable right now, and that's within Google Analytics 3: making sure that you are setting up UTM tracking codes correctly. Now, if you don't have a spreadsheet set up, you can go to the free UTM builder as well. Google provides one of those, and they give you the same examples that they give on their support page. You can start to build it through that free UTM builder, which I believe Chris is going to pull up now, and it will give you the opportunity to also use a short link. So you put in your website URL, you put in your campaign source. It doesn't have to be a paid campaign, it can be organic. So they give you the examples. The one thing to note is they use lowercase for everything. If Google's giving you an example, follow their example and do it the way that they're doing it. And so you have up to those five categories: source, medium, campaign name, term, and content. Source and medium are the two things that we're really strict about. The other things are really sort of up to you. And then you can either copy the URL as is, or, you know, we talked a little bit about those URL shorteners, you can use Google's shortener, you can use Bitly, you can use... I'm sorry, say that again?

Christopher Penn 38:59
...is the only one supported now. Google has end-of-lifed their URL shortener.

Katie Robbert 39:03
Okay. Um, oh, it says authorization required. Okay. You know, you can build your own URL shortener, there are a lot of different options. We do recommend shortening the link because, again, it preserves those UTM tracking codes and doesn't give other services like Facebook the chance to strip out the UTM tracking codes. So you can build them one by one. It's not the most efficient, and we do recommend going the more automated spreadsheet route. And you may have other ways of doing it if it's built into your systems, like your social sharing system. I know that Buffer and Agorapulse and Sprout Social, a lot of those systems, have that kind of thing built in. A lot of the email marketing tools have some level of UTM tracking built in, so definitely check those settings. We were actually just talking with one of our partners about that. So there are a lot of different ways to ensure that you get that tracking set up correctly.
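
For anyone who would rather script the builder step than fill in a form one link at a time, here is a minimal Python sketch. It is not Google's builder or the spreadsheet mentioned in the episode. It follows the conventions described above as assumptions: source and medium are required, the other three fields are optional, and every value is lowercased. The function name and the example URLs are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_utm_url(url, source, medium, campaign=None, term=None, content=None):
    """Append UTM parameters to a URL, lowercasing every value."""
    if not source or not medium:
        raise ValueError("source and medium are required")
    tags = {
        "utm_source": source,
        "utm_medium": medium,
        "utm_campaign": campaign,
        "utm_term": term,
        "utm_content": content,
    }
    parts = urlsplit(url)
    # Keep existing query parameters, but drop any old UTM tags.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in tags
    ]
    params += [(key, value.strip().lower()) for key, value in tags.items() if value]
    return urlunsplit(parts._replace(query=urlencode(params)))


# Made-up rows, standing in for lines of a tracking spreadsheet.
rows = [
    ("https://www.example.com/offer", "Newsletter", "Email", "Spring_Launch"),
    ("https://www.example.com/offer?ref=1", "Twitter", "Social", None),
]
for url, source, medium, campaign in rows:
    print(build_utm_url(url, source, medium, campaign))
```

Lowercasing in one place means that "Email" and "email" cannot split into two separate rows in reporting because of how someone happened to type them.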

Christopher Penn 40:08
Yep. So in short, that's what dark traffic is, which is a very tiny example of dark data. It's stuff where you don't know what's in the box, so you can't use it to make decisions. There are ways to decode it to some degree, there are ways to mitigate it to some degree, and it will never be perfect. No matter what, there will always be some amount of data that you can't make use of. But at least you're going from 70% to 7%, hopefully, and making good use of it. So...

Katie Robbert 40:41
Before we wrap up, I do actually just want to acknowledge this comment. So, Chip, you're saying all these tools are great, but they're adding friction to the process. I disagree with you. I say that the process needs to be revised to add these tools in, because if you're not using them, then you're not collecting your data correctly. So I don't think they add friction to the process. I think your process is broken if you're not doing it.

Christopher Penn 41:11
And on that note, if you have comments or questions, hit us up in our Slack group. Thanks, and we'll talk to you next week. Until then, just enjoy the replay. Let's go ahead and head on out of here. Thanks for watching today. Be sure to subscribe to our show wherever you're watching it. For more resources and to learn more, check out the Trust Insights podcast at Trust slash ti podcast and our weekly email newsletter at Trust slash newsletter. Got questions about what you saw in today's episode? Join our free Analytics for Marketers Slack group at Trust slash analytics for marketers. See you next time.

