In this week’s In-Ear Insights, Katie and Chris discuss the differences between data science and data analytics. They delve into how data science focuses on exploring why things happen, while data analytics focuses on what happened. They also touch on the importance of requirements gathering and the role of data engineering. Tune in to the full episode to learn more about these topics and the overlap between data science, data analytics, and data engineering.
What follows is an AI-generated transcript. The transcript may contain errors and is not a substitute for listening to the episode.
Christopher Penn 0:00
In this week’s In-Ear Insights, let’s talk about all things data, data science, data analytics, data engineering.
Katie, when you hear these terms, particularly data science and data analytics, what to you is the difference between the two?
Katie Robbert 0:19
I have a hard time differentiating the two because I feel like data science is the person and data analytics is the action.
And so a data scientist is doing data analysis.
And this is where I sort of struggle.
And this is where I’m looking for more education: you know, is it just advanced analytics?
So you are an analyst who understands artificial intelligence and machine learning, versus what I think of as data science, where I always go back to my clinical background.
And so I always think about things like clinical trials and principal investigators.
But that’s not what we’re doing here; we’re working with marketing data, which is commercial.
And that’s not to say that it’s not similar in nature, or that you can’t have a principal investigator for a marketing data analysis project; it’s just that the criteria are different. You’re not necessarily doing clinical trials to determine the efficacy of, you know, some sort of drug intervention; you’re trying to figure out the ROI of your ad campaign.
And so I suppose if you take away, you know, the ad campaign versus, you know, drug intervention, at the end of the day, you’re still looking at just metrics and numbers.
But I think for me, it’s just sort of a mind shift, because one is more academic, and one is more commercial.
Christopher Penn 1:59
That’s, that’s a good starting place.
Here’s how I differentiate the two, what versus why.
So data analytics: the word analytics comes from the Greek word analyein, which means to unlock, to loosen up.
Analytics is all about what happened.
We look at Google Analytics, we look at Facebook analytics: what happened? Did we have more conversions last week or fewer? When you get to the science part, that’s the scientific method, hypothesis testing, things like that.
That’s the why. So conversions went up 40% last week: why did that happen? And if it’s a good thing, is it reproducible? Could we do it again? Or is it just a one-time thing? Did we get mentioned on Tumblr or Slashdot or something, and we got a lot of traffic for no reason?
And it’s probably not something we can make happen yet.
So to me, that’s the big dividing line: what versus why. Data science is used to explore why things happen.
We do things like retroactive A/B testing, and any kind of A/B testing asks: why did this work? You go on a website and say, we’re going to change the copy on the Trust Insights website from, you know, “pragmatic change management” to “do-it-yourself analytics,” and we see one gets more lift than the other.
Now we have a better idea of why we’re getting conversion rates higher or lower: because the copy changed. We changed the copy, and we tested it.
So to me, that’s, that’s the big difference.
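As a rough illustration of the copy-test example above, here is a minimal Python sketch of one common way to check whether a difference in conversion rate between two versions of a page is likely to be more than noise: a two-proportion z-test. The episode does not say which statistical test Trust Insights uses, so the test choice is an assumption, and the function name and visitor and conversion counts are hypothetical.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare variant B against control A.

    Returns (relative lift, z statistic, two-sided p-value).
    Assumes both groups have visitors and at least one conversion overall.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis of "no difference".
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    lift = (rate_b - rate_a) / rate_a
    return lift, z, p_value


# Hypothetical numbers: old copy (A) vs. new copy (B), 4,000 visitors each.
lift, z, p_value = two_proportion_z_test(conv_a=120, n_a=4000, conv_b=165, n_b=4000)
print(f"lift={lift:.1%}, z={z:.2f}, p={p_value:.4f}")
```

A small p-value suggests the lift is unlikely to be chance alone, which is the "why" step. It still says nothing about whether the result will reproduce next month.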
Katie Robbert 3:28
But I feel like that’s a little too loose in terms of the definition because, you know, I’ve run A/B tests on the content on our website, but I’m not a data scientist.
And so this is where I think the education piece is important.
Because, you know, I know the scientific method, I understand the scientific method, but did I follow it, you know, step by step by step? Maybe. You know, I wanted to understand: if we change the copy on our homepage, would we get more conversions? That’s my hypothesis.
That’s my experiment.
And then I ran the test, I did the control and the experiment, which is part of that, and then I got my results, and then I said, okay, what do I do now? And the “what do I do now” is I need to change the copy on the website, I need to find time to make updates. So if that is what makes a data scientist, then is it that all marketers are data scientists, but not all data scientists are marketers?
Christopher Penn 4:41
No, but I would say, if you successfully make toast and you don’t, like, absolutely burn it, right, and you remember to turn the toaster on, are you a cook?
Okay, have you successfully cooked something?
Katie Robbert 4:59
But I feel these are titles that people work hard to earn.
It’s sort of the, you know, I’m the CEO of my household, because I pay the bills and I clean the dishes.
You know? Yeah, I can say that.
But that doesn’t really hold any weight in the outside world.
Christopher Penn 5:18
I would argue, from the perspective of marketers trying to figure out where to direct their careers and professional growth, that it’s okay in this case, right? If you are using the scientific method and scientific principles, and you’re doing it correctly, I would say yes, you are practicing data science.
Now, if that is not your primary profession, you might not want to use data scientist, just like if you are not cooking all day in the kitchen commercially, you’re probably not a professional cook, you’re probably not a chef.
Right? But that doesn’t mean you don’t cook.
And one of the challenges that I think comes with titles is this sort of this very, very binary, either you are or you aren’t.
Well, it’s a spectrum.
The person who’s behind the grill at the local Mexican restaurant, you know, are they a Michelin-star chef? No.
Are they a good enough chef? Yes, I think their food is terrific.
It’s better than what I would do.
And so I think there’s that spectrum with things like data science and data analytics.
Are you an analyst if you can use Google Analytics?
You might not be a professional analyst, where that’s your job 40 hours a week, but you’re absolutely performing analytical tasks and getting analytical results.
If you’re doing scientific testing, you are performing data science tasks. Should you call yourself a data scientist? Probably not from a hiring perspective.
But is data science part of your role? Yes, it is.
Katie Robbert 6:45
And I feel like that’s really the conversation.
Because right now, there’s a lot of people looking for jobs, and, you know, the advice that we tend to give is to tailor your resume to highlight your experience for the job that you’re after.
If I start promoting myself as a data scientist because I’ve asked the question why, you know, am I misrepresenting myself? Because I’m really good at asking why.
But I don’t know how to use advanced techniques like, you know, retroactive A/B testing and regression and, you know, machine learning.
Christopher Penn 7:29
You might not be a qualified data scientist.
But that mindset would be a very good mindset to pair with a junior data scientist, someone who has a lot of skills but doesn’t have professional experience, right? Somebody who’s just out of a data science program, or just out of college, and they lack the real-world experience, they lack the practical application, to know when to ask why.
And, crucially, to know when to stop asking why and move on with your day, to stop falling down rabbit holes thinking that there is something there.
Now, again, I don’t think you could plausibly get hired as like a principal data scientist at say, Facebook, that’s probably not going to happen.
But I do think that your skills allow you, and there’s empirical proof literally sitting in front of you right now, to manage a data scientist quite capably.
Katie Robbert 8:26
So I could call myself a data science manager, but I can’t call myself a data scientist.
Christopher Penn 8:31
Because, as with everything, it’s a combination of skills, of experience, of theory.
And when you are someone who has the experience and, you know, some of the theory, yes, you lack the skills; you’re not a data scientist in that regard.
But that would permit you to manage it.
Because a lot of the time, you don’t need to know, you know, should I use the file library or the ff package to deal with a large JSON file in R? Do you really need to know that? No. What do you need to know? Hey, what’s the most efficient way to open and process a large JSON file, like a six-gigabyte JSON file? A data scientist, or actually a data engineer, would know those answers and say, okay, you should be using the ff package to read that file in chunks so that your R instance doesn’t completely explode and your computer doesn’t shut down.
But from a strategic perspective, you would be asking the questions: how can we make this code more memory efficient? How can we make this code higher performing? And knowing those questions to ask is critical, because a junior data scientist or a junior data engineer in particular is going to get so lost in the minutiae that they don’t see the big picture.
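To make the "read that file in chunks" idea concrete, here is a minimal sketch in Python rather than R. It does not use the ff package mentioned above, and it assumes the large file is newline-delimited JSON (one record per line), which is what makes chunked reading straightforward. The function names and the `converted` field are hypothetical.

```python
import json
from itertools import islice


def read_jsonl_in_chunks(path, chunk_size=10_000):
    """Yield lists of parsed records, at most chunk_size records per list.

    Only one chunk is held in memory at a time, so the whole file never
    has to fit in RAM.
    """
    with open(path, "r", encoding="utf-8") as handle:
        # Parse lazily, skipping blank lines.
        records = (json.loads(line) for line in handle if line.strip())
        while True:
            chunk = list(islice(records, chunk_size))
            if not chunk:
                break
            yield chunk


def count_conversions(path, chunk_size=10_000):
    """Count records flagged as converted (hypothetical field name)."""
    total = 0
    for chunk in read_jsonl_in_chunks(path, chunk_size):
        total += sum(1 for record in chunk if record.get("converted"))
    return total
```

The trade-off is that each chunk has to be processed on its own and the results combined afterward. That works well for counts and sums, and less well for analyses that need every record at once.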
Katie Robbert 9:55
I feel like data scientists versus data analysts versus data engineer, in some ways, is just kind of splitting hairs.
You know, because you could argue for both sides that you could argue that they all need to be their own separate disciplines.
Or you could argue that they’re all the same thing.
It’s just a matter of what job function you’re performing for any given task.
And so is it.
And this is just a genuine, you know, question: can you have data science and data engineering skills, but really you’re an analyst? Or is it that you’re a data scientist and you have analyst capabilities? Like, is it all one and the same, and it’s just dependent on who you’re talking to? Or are they really individual disciplines?
Christopher Penn 10:55
So this is where you do get to the gray area. Data analytics is a subset of data science, right? You cannot be a data scientist without analytic skills; you just can’t. You can be an analyst without data science skills, right? You can just be a straight-up analyst whose profession is to clean, prepare, and analyze data and stuff like that.
Both of them do need some level of data engineering skill, but data engineering truly is its own profession. It’s much closer to IT: you know, data storage structures, hierarchies, normalization of data, optimization, queries, index keys, all these different things that go into making data, particularly large amounts of data, as efficient as possible, and knowing when you need to normalize and denormalize datasets. A data scientist and a data analyst might need to know some aspects of that, but they are very rarely going to be in the weeds saying, okay, well, what type of file format should we use, NDJSON or Parquet? What’s the structure underneath that makes this data work the best? That’s really not something that data scientists and data analysts ever tangle with. For data analysts, you know, “is it in Excel or not?” is where that question usually starts, particularly when you get to massive data, like, you know, things like BigQuery and Redshift and all these cloud providers, where your compute power is just beyond what you can reasonably have on a desktop or even a server.
So again, I think it goes back to what the majority of your day is. If the majority of your day is looking at storage arrays and stuff, yeah, you’re probably a data engineer.
Do you need to have analytic skills to understand how efficient the running code is? Yes, you do.
Does that make you a data analyst? No, not really.
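For readers unfamiliar with the normalize versus denormalize decision mentioned above, here is a small sketch in plain Python. The order data, field names, and function names are all hypothetical. The point is only that normalized data stores each fact once, while denormalized data repeats facts so that analysis needs no joins.

```python
# Hypothetical denormalized export: customer details repeat on every order row.
orders_flat = [
    {"order_id": 1, "customer_id": "c1", "customer_email": "a@example.com", "amount": 40.0},
    {"order_id": 2, "customer_id": "c1", "customer_email": "a@example.com", "amount": 15.5},
    {"order_id": 3, "customer_id": "c2", "customer_email": "b@example.com", "amount": 99.0},
]


def normalize(rows):
    """Split flat rows into a customers lookup plus slim order rows."""
    customers = {}
    orders = []
    for row in rows:
        # Each customer's email is stored exactly once.
        customers[row["customer_id"]] = {"email": row["customer_email"]}
        orders.append(
            {
                "order_id": row["order_id"],
                "customer_id": row["customer_id"],
                "amount": row["amount"],
            }
        )
    return customers, orders


def denormalize(customers, orders):
    """Join customers back onto orders to rebuild analysis-friendly rows."""
    return [
        {**order, "customer_email": customers[order["customer_id"]]["email"]}
        for order in orders
    ]


customers, orders = normalize(orders_flat)
rebuilt = denormalize(customers, orders)
```

Roughly speaking, the normalized form is easier to keep consistent when a customer's email changes, and the denormalized form is easier to hand to an analyst. Deciding which one to store and when is the kind of question the episode assigns to data engineering.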
Katie Robbert 12:52
When I was managing development teams, we used to get into similar debates, conversations, arguments, whatever you want to call them, about the different kinds of roles. Very similar to data scientists versus data analysts versus data engineers, we had database architects, developers, front-end developers, and back-end developers.
And for someone who wasn’t close to the process, they would just say, Well, everybody’s working with, you know, the code and the technology, it doesn’t matter who I go talk to.
And the people who were in those roles, understandably, got very protective of the things that they do.
And, you know, we were constantly, as an organization, trying to figure out: have we made these roles too discrete, or can one person do more than one thing?
And what we found, at least in this instance, was we needed separate front end and back end developers.
So people who worked on the UI, versus people who worked on the underlying code that made the website run. We needed a separate database architect because of the amount of protected health information and personally identifiable information that we needed to figure out how to house and access, versus a developer who would just create the widgets and functions and things once the data was accessed.
And so we did need all of those discrete roles.
Plus, we needed a separate QA person, because, you know, you should never be testing your own thing, because you’re too close to it.
So that sort of, I’m looking at you, Chris.
But I see this as a similar conversation of Sure, you can have a data analyst who does data science and data engineering.
However, if you want to get more sophisticated, if you want to mature your data organization, you should probably have at least a data scientist and a data engineer and someone who does the analysis.
So those are three separate roles. Chris, you, because of the size of our company, perform all three roles.
But at some point, the goal is to have you as leading the data science and then have a team of analysts and engineers who are performing the functions as you’re directing them.
Christopher Penn 15:17
towards us machines.
Specialization correlates with scale.
Right? You nailed it exactly.
At a mom and pop cafe.
Yeah, you’re working the grill, you’re making salads, you’re doing all this stuff, right? When you are a massive multinational food chain, you have specialized roles for everything.
Right? You have the person who’s on the fry line, the person who’s at the burger line, the person who’s on the front end, the person who’s cleaning stuff, the person on the drive-thru. Could one person do all these tasks? Yes, and actually, particularly in retail companies, one person can change from role to role: today you’re on the burger line, tomorrow you’re managing the fry machine.
The same is true in data science, data analytics, and data engineering: when you’re small, you’ve kind of got to do it all.
And then as you scale, if you want to scale, you have to specialize, in the same way that, you know, the sushi chef should probably not also be making desserts.
Otherwise it’s, look, there’s tuna in my cupcake.
Just from the perspective of your data scientist, it comes down to: are you leveraging that person’s strengths to the maximum benefit of the organization?
And there will come a time when my skills are not best used putting together social media updates, right? Right now, at our scale, it’s fine; technically, GPT-3 is doing most of it.
You know, ChatGPT just cranks out tweets for us, that’s fine.
But as we as we scale, as we want to scale, we have to specialize.
Katie Robbert 16:55
And with that, you know, there’s pros and cons to that as well.
So when you have one person performing many job functions, you have a little bit more of a shared understanding.
So Chris, you then see the data science and the data analytics and the data engineering.
So you have the transparency into all of those pieces.
As you start to specialize, you run the risk of siloing.
And so your data scientist maybe doesn’t talk with your data engineer as often as they should.
And so you start to have that breakdown of communication instead of that collaboration.
And then if your data engineer decides to resign and move on to a different job, then you no longer have that skill set versus you have one person doing all three roles.
So there’s pros and cons, to each of those scenarios.
And it really just depends on the amount of risk you’re willing to take on. If I lose my data scientist, but I still have a data analyst and a data engineer, can things continue to move forward? Or, if I have one person doing all three jobs and I lose that person, can I replace them with one person? Can I afford to find three different people to replace that one person? And so there are definitely, you know, ways to think about approaching the specialization versus the generalization.
Christopher Penn 18:21
There are, and, you know, we saw that firsthand when we left our old company: they lost 100% of their data science capabilities immediately.
And a lot of the code that we’d written stopped working within three months, because code evolves and libraries change, and nobody kept anything up to date.
The antidote, at least from what you’ve taught me over the years, is twofold.
One, you have cross-training, people who cross-train with each other. And two, the thing that I used to hate but don’t anymore: standard operating procedures, which are the recipes, the cookbook. And this is the thing that got me over that fear that other people wouldn’t do it right: if I can look at the recipe and say the recipe is correct, you just need to follow the recipe, then, as long as the recipe is good enough, that person should be able to get the result that the recipe dictates. Now, there will be subtle variations and stuff.
And if the variations are too far off, then, you know, the recipe is not good enough.
Katie Robbert 19:20
And I feel like we can spend some time talking about how that comes together.
Because that’s sort of what we were talking about on our live stream last week: in terms of prompt engineering, it’s a lot like delegation, and creating those recipes is the same thing.
You need to be setting the expectation, and if you didn’t write down the recipe correctly, and you ask somebody to follow it, and they perform it incorrectly, you need to go back to the start and say, what did I not delegate correctly, versus what did this person not do correctly?
And so that becomes part of it: do you feel like you have the tools, then, to delegate to a data scientist, a data analyst, and a data engineer, versus one person who’s doing all three things?
Christopher Penn 20:07
When you look at a cookbook, a good cookbook is probably what, one half, two thirds, maybe even three quarters pictures, right? Like, here’s what this should look like.
Here’s what the finished dish should look like.
And, you know, there’s that whole Pinterest meme of the way it’s supposed to look versus the horrifying way it actually turns out. But the same is true for standard operating procedures.
I’m in the midst of writing a book on private social media communities.
And I have literally written out recipes, like, here’s how you do this: if you want to increase engagement for a specific topic, here’s the recipe for this.
And it’s, it’s just a question of how specific and how detailed you want to be to get the result.
And there are cases where you’re sometimes too specific, where you don’t need to list out every single thing, right? That’s, that’s an experience thing.
But going back to data science, data analytics, if you have recipes for, hey, here’s how you process a JSONL file.
And it’s a piece of code, but the code is reasonably well documented.
Then you can say, here’s us at the forks.
Fun tip, going back to last week’s live stream, you can put your code into some of these large language models and say, Please write the documentation for the code.
And so if you are a programmer, like me, who doesn’t like writing documentation, now machines can do it for you.
And then you can just tune it up.
But it’s a great way to improve your code without, again, from the technical person’s perspective, wasting all that time doing documentation.
Now, there’s, there’s there’s now help available to you.
Katie Robbert 21:48
Yeah, documentation is not a waste of time, by the way, for those listening.
It is, you know, again, a topic we could cover in depth, as to why it’s not a waste of time.
But back to your point, Chris, about scalability: that documentation is needed if you want to go from having a generalist to having different specialists, because then you have to do that cross-training.
And then you need to have those redundancies in place in case one of the, you know, three people decides to leave.
So going back to the original question, data science versus data analytics versus data engineering, what I’ve learned in this conversation, is that data science is the umbrella.
It’s sort of like the top of the pyramid.
And then within the data science discipline, you have functions of data analytics and data engineering, and probably data QA and other pieces that go into that.
Data design, storytelling, all of those things, they fall under that main umbrella of data science, is that an accurate or a mostly accurate understanding?
Christopher Penn 23:05
And what I would suggest doing is to make things as clear as possible in your own roles.
This is gonna sound familiar: write out user stories. Like, what’s the user story of a data scientist? Why do you need a data scientist in your organization? “As a manager, I need a data scientist to understand what will perform better for my email marketing program.”
When you say that, you go, okay, it’s pretty clear you need a data scientist.
But if you said, “As a marketing manager, I need a data scientist so that I can understand my Google Analytics,” now you need a data analyst.
Right? Because it’s a different role.
So that writing out those user stories will help you better understand, what do you actually need?
Katie Robbert 23:52
And I mean, it always goes back to requirements of some kind before you start to hire for any role.
What is the problem I’m trying to solve? If I’m trying to hire a data scientist, just so I can say I have a data scientist, probably not the best use of their time.
And I know, Chris, you’ve been approached before to join teams, organizations, agencies, just because of the fact that you hold the title of data scientist, but really they want someone who’s going to be in the trenches doing all of the, you know, one-on-one analysis, and so it’s a mismatch.
And it really does come down to: what’s the problem I’m trying to solve by having this role, by having this skill set? For us, it makes sense, because we deal with large quantities of data, and the number one question we have to ask is, why did that happen? What the heck is going on here?
Christopher Penn 24:51
And I will say that if you do the requirements gathering and the analysis properly, you will save yourself months of headache, you know, possibly some stomach lining, and potentially millions of dollars, because you will not waste an enormous amount of time having to redo everything. We have seen that time and again with any number of clients, where they jump into something without having done that prep work to say, okay, so what are we cooking? And, you know, we’re halfway through some sushi rolls and they’re like, yeah, but this is a pizzeria.
Nobody wants sushi.
I mean, you could make like a pizza flavored sushi roll, I guess.
But probably no one’s gonna want that, or very few people are gonna want that.
And, you know, as a commercial plug, I would say if you are considering things like a major machine learning or AI project, and you’re not sure about that requirements gathering part, we help with that.
Katie Robbert 25:56
I was watching, very early this morning, a show on Food Network called Worst Cooks in America, which is half entertaining, half anxiety-inducing.
It’s basically a formula reality show.
But on the episode that I was watching this morning, one of the quote-unquote worst cooks was having a full-on meltdown because they were overwhelmed.
And the chef who’s the coach was like, this is why I really need you to do your mise en place, which is basically “things in place,” the prep work that you do ahead of time, to say, I know I’m going to need two onions.
So here’s my two onions, I know I’m gonna need, you know, a green bell pepper, let me make sure I have that.
And so you get all of your things.
It’s your requirements gathering.
And she was sharing with him, she’s like, this is why we do it so that we don’t get overwhelmed.
So we don’t start to panic.
When everything is a mess, and we can’t find everything.
It’s the exact same thing with business requirements.
And Chris, the fact that you are the one who gave that PSA and not me, it makes me feel like my job is done.
If I can get the Chris Penn to start wanting people to do requirements documentation, my job is done.
Christopher Penn 27:18
Because we’ve had clients in particular who didn’t.
And we know just how painful that is.
You know, there’s just one client with the gift.
We’ve been working with him for a very long time.
And it took years to get the basics in place because they didn’t do any requirements gathering.
And even today, we are still fixing things that should have been fixed years ago.
But because they didn’t make some key decisions, didn’t have the right people in place, and had no processes at all, it’s been kind of a Sisyphean task, if you will, pushing that boulder up the hill.
As I get older and more experienced in the job, it’s like, yeah, you kind of want to stop doing the same thing over and over again. You want to maybe do something a little bit new, or you want to reduce some of the drudgery, and the drudgery can be reduced by that preparation.
So, as much fun as it isn’t sometimes, it’s like everything: over a long enough time horizon, the time you spend on the preparation is the time you save in production.
Katie Robbert 28:30
It’s absolutely true. A very practical example of that: I was making biscuits yesterday for dinner.
And because I’ve done it so often, I knew I was going to need salt, I knew I was going to need baking powder, I knew I was going to need, you know, the various ingredients.
Because when I first started making them, I didn’t have all of that stuff ready.
And then with biscuit dough all over my hands, I’m trying to find the other ingredients that I need.
And I’m just making a mess in my kitchen.
Now this is a very small, you know, manageable example.
But because I was prepared, they came together faster.
I didn’t make a mess in my kitchen.
I didn’t have, you know, big dough hands that I was touching everything with and then had to clean and sanitize.
It’s the same thing with creating those requirements.
And so if you’re considering a data scientist, versus a data analyst, versus a data engineer, as Chris has mentioned, write down that user story.
Write down those requirements, try to have a better understanding of the question that you’re trying to answer the problem that you’re trying to solve.
And that’s going to help you figure out what kind of role you need.
And you may need all three and that’s fine.
But then you can at least when you bring that person on, you can say this is the problem that you’re trying to solve.
This is the problem that the data engineer is trying to solve.
Instead of, you know, just sort of an “Okay, here’s the data.”
Christopher Penn 30:02
And if you’re thinking in your own career about, you know, the skills or professional growth that you want to do or nurture, look at those questions.
How do we work with this data? How do we store this data? How do we process this data? That’s data engineering. What happened in the data? What are the KPIs? What are the metrics? What happened last week? That’s data analysis, or analytic skills. Why did these things happen? What can we do to reproduce them? What can we do to fix things that went wrong? What should we be looking at next? Those are data science skills. And, kind of circling back to where we started, you may not have the title of data engineer or data scientist, but if you have those skills, they make you more valuable.
On the flip side, if you are a manager or an executive in your organization and you are looking at hiring, write out the user stories, write up the use cases, write out all this stuff, because you may find that you have people in your organization who have the skills. They may not have the title, but they have the aptitude.
And you could save yourself a lot of money and possibly help retain staff by letting them branch out and grow their capabilities on problems that need solving.
Katie Robbert 31:18
All right, well, I feel like I actually learned a lot about data science versus data analyst.
So Chris, thank you for sharing your experience and that information.
So as Chris mentioned, if you are looking to, you know, enhance your team, or even just get an understanding of the skill sets that you already have, you know, give us a shout at trustinsights.ai/contact.
And, you know, we’re happy to help you start to unravel what’s going on and what you need.
Christopher Penn 31:55
Exactly. And if you’ve got comments or questions about anything we’ve talked about in the show, or just want to chat in general, pop on over to our free Slack group. Go to trustinsights.ai/analyticsformarketers, where over 3,000 other marketers are asking and answering each other’s questions every single day.
And wherever it is you watch or listen to the show, if there’s a channel you’d prefer to have it on, we probably have it.
Go to trustinsights.ai/tipodcast.
You can find all the options there.
Thanks for tuning in, and we will talk to you next time.