
So What? How to get started with Data Portability

So What? Marketing Analytics and Insights Live airs every Thursday at 1 pm EST.

You can watch on YouTube Live. Be sure to subscribe and follow so you never miss an episode!


In this week’s episode of So What? we go through the importance of data portability, assessing your vendor, and the setup process.

Catch the replay here:

So What? How to Get Started With Data Portability


In this episode you’ll learn: 

  • What is data portability and why it’s important
  • How to assess a vendor’s data portability
  • What to do when your system doesn’t have data portability

Upcoming Episodes:

  • TBD


Have a question or topic you’d like to see us cover? Reach out here:

Note – The following transcription is AI-generated and may not be entirely accurate:

Katie Robbert 0:31
Well, hi everyone. Happy Thursday. Welcome to So What? The Marketing Analytics and Insights Live Show! I am Katie, joined by Chris and John. How’s it going, guys?

Christopher Penn 0:39

Katie Robbert 0:41
Not even trying to high five this week.

John Wall 0:46
High degree of difficulty.

Katie Robbert 0:50
This week we are talking about data portability, specifically how you get started. So we’re gonna talk about what it is, how to assess your vendor’s data portability, and what to do when your system doesn’t have data portability. Data portability is exactly what it sounds like: it is the portability of your data from one place to another. So think about using third-party systems like social media platforms, CRMs, email, marketing automation, you know, all that good stuff. How do you get the data out when you need to, not just for reporting, but what happens if you change systems? Or suddenly the company that owns the software goes under, or merges with another company, or whatever the situation is? The question that isn’t addressed enough is: how portable is the data? It’s not addressed enough when you’re doing those upfront requirements, when you’re assessing what system you should be using. So, you know, we often go through the five P’s, which, Chris, I’m guessing is where we would probably want to get started today when talking about data portability. Before you choose a system, or if you have a system that’s already been chosen for you, you want to go through the five P’s, which are purpose, people, process, platform, performance. And within there, you should be thinking about data portability. So Chris, where would you like to jump off on this conversation?

Christopher Penn 2:20
Let’s start at the very beginning: what is the purpose of data portability? Sometimes it’s that you’re changing vendors, right? That is probably the most common use case for marketers, like, hey, we want to save some money on CRM, so we’re moving from Salesforce to HubSpot, or from HubSpot to SugarCRM, or whatever the case is. But there are other purposes for data portability. You may want to inspect your data, you may want to use your data in different systems, you may want to just keep a copy of your data. Because in a world where so much of our stuff is in the magical cloud, sometimes the cloud goes away. And then you’re like, I have nothing, right? Anyone who has ever unsubscribed from a streaming service is like, oh yeah, that’s right, I have none of my favorite shows anymore, because I unsubscribed; it all just vanishes in an instant. Whereas those of us who are old and have gray hair have, like, binders of CDs and DVDs lying around, and we’re like, okay, if I unsubscribe from everything, I still have my pile of plastic discs.

Katie Robbert 3:23
It reminds me of when the cloud first became a thing. I was working at a company where, you know, God bless our director of IT at the time, he was trying very hard to educate the rest of the leadership team on the cloud and what it meant for our data, because we were working with sensitive data, HIPAA-compliant data. So there were a lot of considerations. And I just remember all of the leaders, for the next three months in meetings, like: Oh, so, you know, we’re going to have the cloud. We’re just going to put it in the cloud. Well, it’ll be in the cloud. So when are we getting the cloud? How do I get access to the cloud? Where’s the cloud? And it became this buzzword bingo game of how many times in a meeting somebody was gonna say the cloud. So I appreciate you bringing up that example, because I think, while this episode is not about cloud security and data storage in the cloud, it’s an important part of the conversation as we think about data portability.

Christopher Penn 4:27
There’s a wonderful t-shirt floating around the IT community, it says there is no cloud, it’s just someone else’s computer.

Katie Robbert 4:34
That’s exactly what it is.

Christopher Penn 4:37
So purpose is obviously the first place to figure out what it is you’re trying to do. The most important part of data portability, believe it or not, is actually the people and the reason for

Katie Robbert 4:49
Love it. Love it.

Christopher Penn 4:51
The reason for this is because there are different levels of data portability, so let me give you a few examples. Data portability means, essentially, can you get the data out of the system? This is Google Takeout, for example. Very simple: it says, hey, push this button, get your data. Many social networks have this; in fact, most companies are required to have something like this now, thanks to GDPR, the General Data Protection Regulation of the EU, which says a user may at any point request a copy of their data. California implemented this, Virginia implemented this, and a few other states in the USA implemented this as well, and various other privacy regulations around the world essentially say you’ve got to be able to get a copy of your data. Sometimes it’s easy: push the button, get the thing. It’s like putting coins in a gumball machine; the stuff comes out. Sometimes it’s not that easy. For example, with WordPress, if you want to export your WordPress data, this is the admin panel for the MySQL server that sits underneath WordPress. WordPress has this fancy web interface, but underneath there’s infrastructure: there’s typically the Apache web server, there’s the MySQL database, and then there’s the PHP programming language. To get your data out of WordPress, if you want the whole database, you have got to have a person on staff, who may or may not be you, who knows how to use MySQL or its variants. There are other forms of data export where you may be able to easily get the data, but the data is not in a format that you’re comfortable with. For example, in Google Tag Manager, you can say, I want my Tag Manager container. You hit that export button, and what you get is a big old JSON file, which is great. This is the data, this is the good stuff here. However, if you can’t read and work with JSON data, this is not so helpful. So that second P of people is so important, because you have to assess who you have on staff, and what their levels of skill are when working with data, to know what your data portability options are.
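For readers who have an export in hand but are not comfortable reading raw JSON, a short Python sketch like the one below can print an outline of what the file contains. It is only an illustration: the `summarize` helper and the `sample_export` data are made up for this example and do not reflect any particular vendor’s export layout.

```python
import json


def summarize(node, depth=0, max_depth=2):
    """Return indented lines describing the keys and value types of parsed JSON."""
    lines = []
    if isinstance(node, dict) and depth <= max_depth:
        for key, value in node.items():
            if isinstance(value, list):
                label = f"list of {len(value)} item(s)"
            elif isinstance(value, dict):
                label = "object"
            else:
                label = type(value).__name__
            lines.append("  " * depth + f"{key}: {label}")
            # Recurse into nested objects; for lists, peek at the first item only.
            child = value[0] if isinstance(value, list) and value else value
            lines.extend(summarize(child, depth + 1, max_depth))
    return lines


# Hypothetical export content, invented for this example.
sample_export = json.loads("""
{
  "exportTime": "2024-01-01",
  "container": {
    "name": "demo",
    "tags": [{"name": "GA4 config", "paused": false}]
  }
}
""")

print("\n".join(summarize(sample_export)))
```

With a real file, `json.load(open(path))` would replace the inline sample. An outline like this helps whoever is assessing the export decide whether the team can work with it.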

Katie Robbert 7:09
John, let me ask you this question, because you’ve worked with a lot of vendors throughout your career. I mean, I know we all have, but I’m interested in your perspective. What do you think is the reasoning behind making it so difficult for a user to get their data out of a piece of software that they have, you know, subscribed to?

John Wall 7:28
Oh, yeah, it’s lock-in. I mean, that’s by design, right? The goal is, yeah, exactly, handcuffs, as Chris is showing me. There’s no upside, really, for a vendor to make your data available in a format that’s easily transferable to a competitor; there’s literally no value. And so unless they have some kind of altruistic streak, and that’s part of their mission statement, a C-level team literally cannot get approval to do it, because you’re basically giving away your market share and competitive advantage. So the only way you kind of get into it is that usually everybody wants to integrate with things, and so whatever the integration infrastructure is also exposes at least some of your data, and then you can get around that. But it’s a tough thing too, because depending on the platform you’re on, you may have features that nobody else supports anyway. So getting that extra data is not going to help you unless you’re building your own system, because that data can’t be used by any other systems out there. That’s another thing that makes it complicated. And you can go down rabbit holes: just because you copy this data from one CRM system doesn’t mean you can directly load it into another one. You might have to remap things. And depending on how those data objects are set up, it can create a whole slew of headaches that may be, you know, not impossible, but it would be cheaper to hire somebody to just key in the data than to figure out how to work around that and where to go.
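The remapping John describes can be sketched in a few lines of Python. This is a minimal illustration, not any CRM’s actual import format: the field names in `FIELD_MAP`, the `remap` helper, and the sample record are all hypothetical.

```python
# Hypothetical mapping from the old system's column names to the new system's.
FIELD_MAP = {
    "First Name": "firstname",
    "Last Name": "lastname",
    "Email Address": "email",
}


def remap(record, field_map):
    """Split a record into fields the target accepts and fields it cannot take."""
    mapped, unmapped = {}, {}
    for field, value in record.items():
        if field in field_map:
            mapped[field_map[field]] = value
        else:
            unmapped[field] = value
    return mapped, unmapped


old_record = {
    "First Name": "Ada",
    "Last Name": "Lovelace",
    "Email Address": "ada@example.com",
    "Lead Score v2": 87,  # a feature the target system has no equivalent for
}

mapped, unmapped = remap(old_record, FIELD_MAP)
print("Loadable:", mapped)
print("Needs a decision:", unmapped)
```

Keeping the unmapped fields visible, rather than silently dropping them, is what surfaces the headaches early: every entry in `unmapped` is a feature the new platform may not support.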

Christopher Penn 9:01
Yep, it’s all about lock-in. This, by the way, is something really important, because it’s something that we pride ourselves on as a consulting company. If you’re working with any kind of agency, and they require you to use their credentials, their logins, and stuff like that, they’re essentially locking you in. They’re saying you won’t be able to get access to your data without us. So one of the things we’re very clear about in all of our consulting engagements is that we’re gonna use your stuff, so that if you do at some point choose to not do business with us anymore, you’re not being held hostage by us, because it’s a pain to de-integrate. And also, I mean, it’s not illegal, but it sure is kind of slimy.

Katie Robbert 9:48
We’re not slimy.

Christopher Penn 9:50
Not on most days. So, the third one is process, which is: how do you get your data out? What is the process? Let’s use a less common data portability use case. Let’s say, huge surprise, we want to use our data with generative AI, right? We want to be able to move our data from the systems it’s in into generative AI somehow, so that we can make use of the data. So the question is, what kind of process is involved in exporting the data? Then what sort of process is needed to get the data ready for generative AI? And then what’s the process for getting it into generative AI and getting generative AI to do something with it?

Katie Robbert 10:39
I hit the Chris Penn button. I say Hey, Chris, we got to move our data. Good luck.

Christopher Penn 10:48
And this is where process and platform get intertwined. So you have: okay, what platform are we working with? What’s the generative AI system that’s the target platform, and what data formats does it support? For example, ChatGPT typically supports your Microsoft Office-style files: Word documents, Excel documents, PowerPoints, PDFs, etc. Google Gemini supports all that, plus Google Drive stuff, plus different programming formats, audio, and video. Anthropic’s Claude supports, again, most Office documents and things like that. And they all, through their APIs, or application programming interfaces, can support just direct streams of data, typically in one of two formats, either JSON or CSV. Most of the time with APIs, it’s in JSON format. So part of the process is figuring out what we’ve got from the export side, and then what the import system accepts. And if what’s coming out of one system is incompatible with what’s going into the other system, you need to have something in the middle that translates between the two.
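The "something in the middle" can be very small when both sides use common formats. Below is a minimal Python sketch, using only the standard library, that turns a JSON list of flat records into CSV text. The function name `json_records_to_csv` and the sample records are assumptions for illustration; nested JSON would need flattening first.

```python
import csv
import io
import json


def json_records_to_csv(json_text):
    """Convert a JSON array of flat objects into CSV text."""
    records = json.loads(json_text)
    # Collect every key that appears, preserving first-seen order,
    # because exported records do not always share the same fields.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Hypothetical export from the source system.
exported = '[{"email": "a@example.com", "plan": "pro"}, {"email": "b@example.com", "seats": 3}]'
csv_text = json_records_to_csv(exported)
print(csv_text)
```

Missing fields are written as empty cells (`restval=""`), so the receiving system sees a consistent set of columns.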

Katie Robbert 11:57
Yeah, I mean, this is why it’s so important to do your requirements upfront, to figure out why you need to do this thing in the first place. And that’s where you have your set of user stories: as a persona, I want to, so that. In this scenario, I could see it being, you know, as the CEO, I want to look at my CRM and marketing automation data together, so that I can do a more robust attribution analysis. Or I want to look at my CRM, marketing automation, website analytics, and revenue data, so that I can do some sort of marketing mix model. And that’s where you’re like, okay, great, that sounds lovely. On paper, that sounds like a really good idea. How the heck are we going to get all of that data? We’ve run into this. We’ve had enterprise-sized clients who’ve wanted to do things like attribution models and marketing mix models, which I feel is a really common use case for understanding the requirements for data portability, where they said, I have all of these different systems, and I want to put together a marketing mix model, or even, I want to do a CDP, a customer data platform, a single view of the customer. And so what you end up having is all of these different systems that don’t necessarily talk to each other. So now you have to dig into the data portability: how do I get the data out? Once I get the data out, what do I have? How does it match up to the six other systems that I plan on using to do an analysis? And then once I get the data out, where the heck do I put it? Do I put it in this magical, mystical cloud thing that’s above my head, which is really just John’s computer? He’s taking all of your data. Or do I have some other, like, SQL database, or whatever the thing is? And so, you know, Chris, it feels very convoluted. And I feel like the more we’re trying to simplify it and explain it, the more tangled it gets.

Christopher Penn 14:04
It does. But that’s why having the user story and the five P’s is important, because it at least guides you in the right direction. So let’s look at a very practical example. We were talking earlier about Google Tag Manager, right? Let’s say I have my Google Tag Manager instance, and I know there’s some stuff in here that I’m pretty sure is not great. So how would I use data portability to make the job of cleaning up my Tag Manager instance easier? Well, you’ll notice there are two features, import container and export container. I can go into that and say, what version of my container do I want to export? Or do I want to import a container? By the way, importing a container is a bit of a trade secret. If you are a company that does a lot of tag management stuff, you can set up a test container or sample container of your best practices and export it. Then when you work with a new client, just import your container right into their instance, and boom, you’ve got all your best practices in one shot. It’s a huge time saver. So in this case, I hit the Export button, and I’ve got myself a JSON file. Now, here’s the thing about today’s generative AI models: they are extremely good at understanding formats like JSON. They have been trained on ridiculous amounts of this style of stuff. So I can go into a tool, I’m gonna go into Google’s Gemini 1.5 model, and I’m going to start off by saying, let’s talk about auditing Google Tag Manager, what are the best practices? If you want to check it out, there is a whole download for this general process of getting a tool prompted really quickly, go to trust questions, you can get the one-page PDF totally free, you don’t even need to fill out a form. And you’ll get some of these questions that we ask the language models to improve their performance rapidly.
Alright, so I’ve preloaded a lot of the knowledge about what constitutes good practices for tag management. So I’m gonna say: Great, I’m going to provide you with the JSON export of my Google Tag Manager container. Using these best practices, audit it. I’m gonna hit File, Upload here, and drag in my JSON file. Now, this is a beefy... oh, it says it ran into an error extracting from the file. It did not like that. All right, let’s try this: let’s make a copy of that file and call it a .txt file instead. This is part of data portability, knowing what your systems can and can’t handle. So go back to our File, Import, and bring in, now, this lovely text file. There we go, 64,000 tokens of information later. And now we’re going to have AI Studio just read this thing and, based on its knowledge, hopefully come up with some recommendations. It says, based on the provided export, here’s an audit. Version control: you have some version control problems. Naming conventions are inconsistent, right, so we’ve got tags that don’t have a good naming convention. Trigger logic is good. You have multiple thank-you page triggers. Oops, that’s bad. There are some third-party tags. So what we’ve done, in very short order, is use data portability to get data out of one system, do a bit of reformatting (we reformatted it as a text file), and then put it into a different system to make use of it. And the recommendations here are like, hey, you’ve got a bunch of tags that are paused; if you don’t need them, get rid of them. So this is an example of where that data portability really comes in handy.
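Some of the simpler checks from this walkthrough, such as finding paused tags, can also be done locally before anything is handed to an AI model. The Python sketch below is an illustration under stated assumptions: it assumes the export nests tags under `containerVersion` and `tag`, with `name`, `type`, and an optional `paused` flag on each tag. Verify that against your own export file. The sample data is invented.

```python
import json
from collections import Counter


def audit_container(export):
    """Summarize tags in a parsed container export (structure is assumed)."""
    tags = export.get("containerVersion", {}).get("tag", [])
    paused = [tag["name"] for tag in tags if tag.get("paused")]
    by_type = Counter(tag.get("type", "unknown") for tag in tags)
    return {"tag_count": len(tags), "paused": paused, "by_type": dict(by_type)}


# Invented sample mimicking the assumed export layout.
sample_export = json.loads("""
{
  "containerVersion": {
    "tag": [
      {"name": "GA4 - Config", "type": "googtag", "paused": false},
      {"name": "old pixel", "type": "html", "paused": true},
      {"name": "Thank You Page", "type": "html"}
    ]
  }
}
""")

report = audit_container(sample_export)
print(report)
```

The workaround of renaming the file for upload can be scripted the same way with the standard library, for example `shutil.copyfile("container.json", "container.txt")`, where both file names are placeholders.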

Katie Robbert 18:12
So one of the questions we get a lot from our customers, from our audience, and from our prospects is: what questions should I be asking a vendor if I’m considering their software? Obviously, data portability is a big question. But what are the specific questions around data portability that people should be asking? Or what are some things that they should be looking for? Because I feel like, if you’re doing a demo of a piece of software, you’re not necessarily getting all of the information you need. You may see an Export Data button, but if you’re doing a demo, that may not be enabled for you, so you can’t really see what you’re getting. So what are some of the questions that somebody who’s evaluating a vendor should be asking about data portability?

Christopher Penn 19:06
Tell me the process for exporting all of my data in one shot. That’s the question. Because most vendors will say, oh, we don’t support that. Like, okay, why not? And of course, that goes back to John’s answer: they don’t want to make it easy for you to leave. For example, if you’re in the HubSpot CRM system, it is a royal pain to export your data from that system. You can do it one page at a time: you have to export your contacts, export your companies, export your deals, export your notes, and all that stuff. And even then, it will take some time. They say in the documentation, oh, well, you can just use the API and export stuff like that, which you can do, but their API is also not a whole lot of fun to work with, and it requires a very high amount of technical skill to do that.

Katie Robbert 19:56
Well, and with that, I feel like they don’t give you all the information when they say, oh, you can just export it through our API. To your point, there’s a lot of technical skill involved. You also need to know how to develop against an API in order to get the data out, and know that what’s coming out is everything. I think that’s one of the challenges right now with Google Analytics 4: yes, they have an API, but it doesn’t give you everything, or it times out, or it’s different from what you get in the actual Google Analytics 4 interface.

Christopher Penn 20:37
To a degree. If your Google Analytics instance was set up well, there should be the automatic backup to a BigQuery database. That is your one-shot export, because you can take a BigQuery database and just dump the thing as a SQL file. It is gigantic, depending on the size of your site, and it has a very high bar of technical skill to do that. But at least the option does exist if you say, okay, just give me my database, I’m leaving now. Other systems? You can’t even do that. The option just isn’t there.

Katie Robbert 21:13
What do you think, John? Let’s say you were evaluating vendors, and you said, how do I get all my data? And they said, you can’t. Obviously you would have a, like, oh, what the hell reaction. But where do you go from there? What if you’ve inherited the system? You know, there are legacy systems. Like, what do you do?

John Wall 21:32
Yeah, well, there’s a lot of poker that goes into that, right? Because the first thing you have to think about, as an executive, is: do I even want to raise this question? Because if your managers and the boss and the board and everybody are like, yeah, we have to have this, and you know that there are no tools out there at all to do this, you’re just opening up a Pandora’s box of hell for yourself. But some other easy ways to get at it: as Chris was saying, ask about a full export. Have them send you a file. Tell them you want a sample file so you can see it and see how it’s structured and what the layout is. And then the other one, an even worse Pandora’s box, is the API, right? Because everybody’s like, oh, we have an API, we integrate. That’s cool. But then you get the instructions for the API and you dig in, and you’re like, oh, wait a minute, there are 37 fields in each of these entries, but you can access five of them, and you can’t write to any of them. And, God forbid, worse, you get burned. I had a company where we got destroyed because nobody had mentioned that, oh, by the way, each API call is like two cents. And since you’re exporting the whole database, that’s going to be... yeah, and it doesn’t allow you to batch the queries. It’s single, you know, every single dip. So we ended up getting a $20,000 bill for dumping some data, when the product is 150 bucks a month or whatever. Like, that happens. So yeah, it’s a full-on buccaneer pirate adventure trying to get this stuff straight.
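The arithmetic behind a bill like that is worth running before an export, not after. Here is a rough Python sketch. The two-cents-per-call price comes from John’s story; the record count of one million and the batch size of 100 are assumptions chosen for illustration (at two cents per unbatched call, one million calls works out to $20,000).

```python
def export_cost_dollars(record_count, cents_per_call, batch_size=1):
    """Estimate the bill for exporting every record through a metered API."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    calls = -(-record_count // batch_size)  # ceiling division
    return calls * cents_per_call / 100


# Hypothetical: one million records at 2 cents per call.
unbatched = export_cost_dollars(1_000_000, 2)
# Same export if the API allowed 100 records per call (it did not, in the story).
batched = export_cost_dollars(1_000_000, 2, batch_size=100)

print(f"unbatched: ${unbatched:,.2f}, batched: ${batched:,.2f}")
```

Two vendor questions fall out of this directly: what does each API call cost, and how many records can one call return?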

Katie Robbert 23:01
So, the big question: what do we do when your system doesn’t have good data portability, but it’s a system that you’ve been required to use? It’s a legacy system. Or it’s a system you chose, and you didn’t realize that data portability was going to be important. Like, what do we do?

Christopher Penn 23:19
It depends on the system. With most modern software, there is some kind of underlying database. And to John’s point, you may be able to say, hey, look, either send us our files or we’re canceling our contract. You can generally find an engineer somewhere in that company who’s like, fine, here’s the SQL database export, just don’t tell anyone we gave it to you. The other thing, and we covered this a few shows back, is to have systems in parallel, so that you have a fallback. For example, a few episodes ago, we looked at exporting your Universal Analytics data using a couple of different tools, and we looked at the Matomo web analytics system that runs in parallel with Google Analytics. So if you want an easier version of web analytics to work with, or you want to control the underlying database itself, you can do that. Those would be some good options. If you have to work with the system you’ve got, look for what export features it does have. Generally, most business systems will have some kind of spreadsheet export somewhere, or a file type that is generally compatible: Excel files, CSV files, JSON files, etc. And to John’s point, if you have the technical skill to do so, you may want to use that vendor’s API, because it may actually save you time. For example, Google Analytics 4 is kind of a pain in the butt, but Google Analytics does have a really good API. So one of the things that we do is we have a script that goes into the API and just yanks all the source/medium codes out of it, because we want to be able to see, okay, of our UTM tracking codes, what’s good? What’s bad? Are we doing things well or poorly? And we can produce charts from that; it runs against the GA4 API. Because yes, you can do this in the Explore hub, but it’s faster if it’s a piece of code that can just run on its own. You just schedule it to run once a month.
And you can take a look at it rather than have to mess around in the web interface. More importantly, as we were saying earlier, you may want to take this data out of here and bring it into something else. So one of the things that I have in this particular piece of code is that it then takes the data and dumps it out as a query, right? It takes the actual data, restructures it, and turns it into a query that I can copy and paste right into a system like Google’s Gemini and say, okay, take our data. This is an example of another use case of data portability. And it’s going to spit out the things that we’re doing well with our UTM tagging, the things that maybe we’re not doing so well, and then some recommendations, like, hey, here are some things you should do. So this is using the API to grab the data, using some intermediary code to transform it, and then putting it into an AI system. So those, I’d say, are the options.
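The "intermediary code" step, restructuring exported rows into text that can be pasted into an AI system, can be sketched without touching any API. The Python below is not the script Chris describes; it is a minimal illustration with invented rows, and the `build_prompt` function and its wording are assumptions. Fetching the rows from an analytics API is deliberately left out.

```python
def build_prompt(rows):
    """Turn source/medium rows into a plain-text prompt for a language model."""
    lines = ["You are auditing UTM tagging. Each row is source / medium: sessions."]
    # Largest traffic first, so the most important tagging issues lead the list.
    for row in sorted(rows, key=lambda r: r["sessions"], reverse=True):
        lines.append(f"- {row['source']} / {row['medium']}: {row['sessions']}")
    lines.append(
        "List what is tagged well, what is tagged poorly, and recommendations."
    )
    return "\n".join(lines)


# Invented rows standing in for an analytics export.
rows = [
    {"source": "newsletter", "medium": "Email", "sessions": 120},
    {"source": "google", "medium": "organic", "sessions": 900},
    {"source": "linkedin.com", "medium": "(not set)", "sessions": 45},
]

prompt = build_prompt(rows)
print(prompt)
```

Because the output is plain text, it is portable in the sense discussed in this episode: it can go into Gemini, ChatGPT, or any other system that accepts text.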

Katie Robbert 26:20
So are, are there ever any scenarios where data portability doesn’t matter?

Christopher Penn 26:30
Data portability doesn’t matter if it is entirely your system.

Katie Robbert 26:35
Like if you built it from the ground up, right?

Christopher Penn 26:37
So a long, long time ago, I built a CRM by hand, in PHP and MySQL, for a company I was working at in financial services. Portability didn’t matter, because it was our hardware, our software, our everything. I had to run the frickin’ server in the server room and make sure the air conditioner was working. So that was a case where we owned the whole thing, and it was so hyper-customized, there was pretty much no chance of ever moving it to any other architecture.

Katie Robbert 27:06
Did anyone ever need all of the data? Besides you?

Christopher Penn 27:14
Not by the time I left. It is possible that that did happen after I left the company, because it did get acquired a couple of years after I left, but I don’t know if the acquirer said, hey, give us your data. Like, we don’t remember how to get into this.

John Wall 27:32
Yeah, exactly. I just think about risk with that. You’re like the one man army. Once you’re gone, it’s like good luck.

Katie Robbert 27:41
Is there ever a scenario where data portability doesn’t matter when it’s not you, when you’ve not built it yourself?

Christopher Penn 27:48
If the data is unimportant. If the data just doesn’t matter, right? If you can discard it and just start fresh, then the portability doesn’t matter as much. And there are cases where sometimes that’s just what you want to do. So you may, for example, be working with, say, your Google Analytics account, right? And you’re like, you know what, our company is pivoting, we’re starting over, we’re just gonna start with a fresh new account, we’re not even going to try to bring over our past. We talked about that with Universal Analytics, right? With Universal Analytics, data portability doesn’t matter if you’ve had GA4 running since it came out in October 2020: you have four years of back data. The value of that much older data, other than for curiosity’s sake, is probably not super high. So data portability out of Universal Analytics may not be that important. Now, if you’re in financial services, and you’re required to keep 10 years of data, then yeah, in that case, it does matter. But going back to the older episode, that’s a case where the data portability really doesn’t matter for us.

Katie Robbert 28:50
And if you want to catch any of those older episodes, you can find them on our YouTube channel under the So What? playlist. So go to trust... John, have you ever been in a scenario where you’ve signed up for a piece of software just to use the features of it, and then said, I don’t care what data it collects?

John Wall 29:13
Oh, yeah, the one for me right now is text messaging, right? I’m using a vendor, and the features I get in there, I can’t get from any other vendor without paying like 10 or 20x. So it doesn’t matter that I can’t export it, because I don’t have any other options. It’s that or nothing, so I have to eat it. And one other thing, I don’t know if we’re gonna get into this: even if it’s completely locked down, if that data shows up on a screen somewhere, there are tools you can use to get that data out of there. In fact, I worked for a company that did a lot of stuff in the medical space. And because of the nature of all these products, there were no APIs, right? Nobody wants you to be able to mess around with your heart rate while you’re in the critical care unit. And so they had a whole team that created a suite of software that could read screens and do all kinds of scripting stuff, to basically scrape data and gather data in an automated fashion. So that’s a whole other chapter to this story: if there’s data in there, and you want to get it out, there are people who can get data out of things, especially if it’s a web-enabled product.

Christopher Penn 30:25
Fun fact, most of today’s modern multimodal models can read screen captures very easily. So you can take a screen capture of something like say, your phone’s home screen, and programmatically feed it to a tool like Gemini or ChatGPT, and say, transcribe what you see on screen, and it will do it for you.

Katie Robbert 30:44
Sounds like a lot of work, though. So I guess it really does go back to defining your five P’s and really understanding what the purpose of data portability is for this particular piece of software. You may say, you know what, I can’t manipulate the data or do my own analysis, but the reporting interface that I get from this piece of software is good enough, and I’m never going to need to export the entire database and save it somewhere else. That may be fine. If you find yourself in the scenario where you’re saying, I need to get all of this data, you may go through those different scenarios, like, what happens if I change vendors, or whatever the scenario is. Then data portability may become more of an issue, more of a risk, and if the software that you’re choosing doesn’t have it, that could be your problem down the line. And then you may find yourself using something like an SMS messaging system where you’re like, I don’t care, as long as I can use the thing, the data doesn’t matter. So it really comes down to running through the five P’s and making sure that you’re doing your user stories to figure out your why, your how, and how you are going to measure success, so that when you are choosing vendors, or working with vendors, or you have existing software, you’re really trying to understand: am I going through this exercise of making sure I have data portability because it’s a box I need to check? Or am I actually going to do something with the information? As Chris mentioned, a few episodes ago on the live stream, we talked about migrating your Universal Analytics, your GA3 data, somewhere into your GA4 data. And for us, we determined that that level of data portability didn’t matter. We didn’t need that data. But we first had to go through the five P’s and go through that exercise to figure out that, you know what, other than for someone like Chris who has that natural curiosity for the data, it’s not worth the effort and the resources that it would take. So I would really encourage you, before you say I have to have all that data, to really make sure that it’s something that you will be using down the line.

Christopher Penn 32:59
And one final piece on this: if you are building software, you also want to be thinking about data portability from the very beginning. So as soon as you start gathering requirements for the software, you have to think about (a) what kind of data portability should there be, and (b) how do we want to account for that work and how do we want to deal with it? So for example, last year, one of the things I was trying to do was replicate a piece of software that used to be available as a service on the web. And the company that made it went away. And this is a piece of software that essentially looks at a web page and tries to figure out all the tracking codes that are on it, like what kind of software does it use? Does it use Google Analytics? Does it use Adobe Analytics? Does it use Marketo, or Pardot, etc.? And we wrote a piece of software in Python, with the help of generative AI, to essentially replicate that service. Well, one of the big questions was, how do we want to store the data? Like, what kind of data do we want to have available? What we ended up settling on was having it write to a SQLite database, because this is sort of one big repository, but also having the option to say, well, I want it also as a CSV file, and things like that. So part of system design, as with any requirements, has to be: what format should the data be in? Is it something that’s standardized, that other systems can work with? And then how do we want to make that available? So I chose SQLite because it’s literally a single file, but it is compatible with pretty much every data science and AI system in the world. So there isn’t a single tool out there that can’t take that data and work with it. And that was a key consideration for the software, but it has to be part of your requirements gathering process to say, what do we want to do with data portability?
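
As a rough illustration of the storage choice Chris describes (one SQLite file as the main repository, with CSV as an optional export), here is a minimal Python sketch. It is not the software discussed in the episode: the `detections` table, its `url` and `tool` columns, and the function names are all hypothetical, chosen only to show the pattern using the standard library.

```python
import csv
import sqlite3
from contextlib import closing

# Hypothetical schema: one row per (page URL, detected tracking tool).
SCHEMA = """
CREATE TABLE IF NOT EXISTS detections (
    url  TEXT NOT NULL,
    tool TEXT NOT NULL,
    PRIMARY KEY (url, tool)
)
"""


def save_detections(db_path, rows):
    """Write (url, tool) pairs to a single-file SQLite database."""
    with closing(sqlite3.connect(db_path)) as conn:
        with conn:  # commits on success, rolls back on error
            conn.execute(SCHEMA)
            # Duplicate (url, tool) pairs are skipped rather than raising.
            conn.executemany(
                "INSERT OR IGNORE INTO detections (url, tool) VALUES (?, ?)",
                rows,
            )


def export_csv(db_path, csv_path):
    """Dump the whole table to a CSV file; return the number of data rows."""
    with closing(sqlite3.connect(db_path)) as conn:
        cursor = conn.execute(
            "SELECT url, tool FROM detections ORDER BY url, tool"
        )
        header = [column[0] for column in cursor.description]
        rows = cursor.fetchall()
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)
    return len(rows)
```

Calling `save_detections("scan.db", [("https://example.com", "Google Analytics")])` followed by `export_csv("scan.db", "scan.csv")` would leave both a database file and a flat file that most other tools can read, which is the portability requirement being described.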

Katie Robbert 34:47
Makes sense. Final thoughts, John?

John Wall 34:51
Yeah, think long and hard before you talk to anybody else in the organization about what you want to do with this. Have your own strategy locked in, because it can completely spin out of control and get ugly. But by the same token, do at least try and come up with it. I guess the real question is: you want to have a rescue plan, unless it’s way too dangerous to have rescue plans. So answer that question.

Christopher Penn 35:19
For vendors, when you’re doing vendor selection, that question of “Is there a one-click export of my data?” is, I think, a super valuable question to ask the vendor. And in your consideration process and your scoring rubric for how you evaluate RFPs and vendors, that should be a consideration: how easy is it to get to my data, and what level of technical skill do I need to process it once I get it?
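
One way to fold those questions into a scoring rubric is a simple weighted average. The sketch below is only an illustration: the criteria names, the weights, and the 0 to 5 rating scale are assumptions made for the example, not a rubric presented in the episode.

```python
# Hypothetical criteria and weights; adjust to your own requirements.
PORTABILITY_WEIGHTS = {
    "one_click_export": 4,   # can I export all of my data in one step?
    "standard_format": 3,    # CSV, JSON, SQLite, or something proprietary?
    "api_access": 2,         # can data be pulled programmatically?
    "low_skill_needed": 1,   # can a non-developer process the export?
}


def portability_score(ratings, weights=PORTABILITY_WEIGHTS):
    """Return the weighted average of 0-5 ratings for one vendor."""
    missing = set(weights) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    for name in weights:
        if not 0 <= ratings[name] <= 5:
            raise ValueError(f"rating for {name} must be between 0 and 5")
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[name] * ratings[name] for name in weights)
    return weighted_sum / total_weight
```

Each vendor in an RFP would get one dictionary of ratings, and the resulting scores can sit alongside the other rubric categories when comparing vendors.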

Katie Robbert 35:45
Makes sense. Do your requirements, and figure that stuff out up front before you get locked in.

Christopher Penn 35:51
Exactly. Because once you’re locked in, it’s very hard to get out. All right, that’s all in. We will see everyone next time. Thanks for watching today. Be sure to subscribe to our show wherever you’re watching it. For more resources. And to learn more, check out the Trust Insights podcast at trust I podcast and a weekly email newsletter at trust Got questions about what you saw in today’s episode. Join our free analytics for marketers slack group at trust for marketers See you next time.
