so what gen AI for video title card

So What? Using Generative AI for Video Creation

So What? Marketing Analytics and Insights Live

airs every Thursday at 1 pm EST.

You can watch on YouTube Live. Be sure to subscribe and follow so you never miss an episode!

In this episode, we show you how to use generative AI for video creation.

You’ll learn a step-by-step process to take your video from a simple idea to a full proof of concept. You’ll discover how to build a detailed creative brief that guides AI tools to produce on-brand content for you. You will see how to generate both unique video clips and custom audio tracks using different AI platforms. You will understand how to assemble all the AI-generated pieces into a polished final video. Watch the episode to see how you can save time and money on your next video project!

Watch the video here:

So What? Using Generative AI for video creation

Can’t see anything? Watch it on YouTube here.

In this episode you’ll learn:

  • What generative AI systems are best to create videos
  • How to prompt generative AI for video creation
  • Common pitfalls you’ll run into and how to avoid them

Transcript:

What follows is an AI-generated transcript. The transcript may contain errors and is not a substitute for listening to the episode.

Well, hey, everyone. Happy Thursday. Welcome to so what, the Marketing analytics and Insights live show. I’m Katie, joined by Chris and John. How’s it going, guys? I mean, we’ll take it. We will take it.

Rough configuration for High five. We got to keep it interesting. This week we are continuing our summer makeover series. And so if you are not aware, this summer we are making ourselves over while also showing off all the new and cool things happening in the industry, you know, so that whole like, you know, two birds are totally fine and fly away of their own volition, saying, because we don’t condone violence around here.

So this week we’re. It’s been a long week. We are doing you. We are using generative Al for video creation. Now, there’s a lot happening in the generative Al space and video creation is a big part of that and that’s something that we haven’t tackled a whole lot. So what we want to do on today’s episode is give an overview of what’s going on in the space with video creation, what some of the tools are, but then also do some demos of what you can do. And one of the things we’re hoping to accomplish on this show, you know, depending on processing time and all that good stuff, is to overhaul the so what intro and outro videos. So, Chris, we created those six years ago maybe.

I think we refreshed them like three or four years ago because when a new version of Camtasia came out, because they just use stock clipart from Camtasia. Point being, those are not Al generated. These are, as now we have to say, old school, like actual edited and animated video without the support of generative Al. And so, you know, as we’re going through how to use these tools, that’s what our focus is going to be. But you can apply what we’re demonstrating to whatever video needs you have. So as always, Chris, where the heck should we start?

As always, what am I going to say?

Let’s see, we’re going to start with the five Ps, I knew you were going to ask this question, so I was thinking about it and I was like, what am I going to say? So the five. It really has been a long week. So the five Ps, if you’re not familiar, are purpose, people, Process, platform, performance, purpose. What is the question we’re trying to answer? What is the problem we’re trying to solve? People, who’s involved internally and externally. Process. How are we going to do it? Platform. What are we going to Use to do it in performance. How do we measure success in this instance?

If we are talking about redoing the live stream intro and outro videos, the purpose of doing those, from my perspective, is we want to create engaging videos that demonstrate the brand and show people what they are about to watch. So what show is it? So the purpose is to create intro and outro videos that tell people what show they’re about to be watching on YouTube or on social media, wherever we’re posting it. People internally, us, we are the decision makers slash creators. Externally is our audience. That’s our icp. So we’re making it for them, not for us. Process. We’re going to go over that today. Platform same. We’ll go over that today. And performance in this instance is did we create engaging intros and outros?

And one of the things that if were sort of doing this in a more formal, you know, if were being held to KPIs, is we would look at engagement metrics. You know, are people watching it all the way to the end? Are there things in the videos that people can engage with? You know, so those are some metrics that you may want to include as you’re doing an exercise like this. John, Chris, is there anything I missed?

No, because that’s functionally what should be in a good creative brief as well. Right? So if you’ve ever worked with creative services team, they gave you a nice little form and you have to fill out this form. The good thing is to, in today’s world, in generative Al, you can have Al help with that and say like, hey, I need to fill out this creative brief. Ask me one question at a time until you have enough information to fill out the brief. And then you can literally turn on voice mode in the Al of your choice and do that and step through it, which makes everyone happier. You get what you want, the creative team gets what they want, or the Al in this case is going to be the creative team.

And it really is just a form of requirements gathering at the end of the day. That’s what a creative brief is. It’s no different than a software product requirements document.

Yep, exactly. And just as a side note, I personally, you know, and we talked about this on, you know, the podcast last week, I’m someone who struggles looking at a blank page. And so I find it easier to be asked a question and give a response versus sit down and look at a blank page and try to tell someone what I’m after. And I also personally find it easier to Talk it through versus to try to type it. Because if I’m typing it, I’m instantly thinking about things like, you know, sentence structure and grammar versus when you’re talking, you’re just sort of free flowing and you can clean up all those pieces later. So I do really like this technique, Chris, that you’ve really been encouraging us to use, which is the voice recorder. Because I feel like you can get even more detail rather than sitting down and typing somebody. At least for some of us.

Exactly. So to start off, what you should do is take with the Al tool of your choice and at this point it does not matter. You can use any of them. Start off by asking a very simple, straightforward question. What are the major components of a creative brief for video agency? And you will get a response that looks very much like this. Right? Project overview and background. Purpose, objectives and goals. Purpose and performance, Target audience, people Key message, process, tone and style Process call to action, Performance, technical specifications, deliverables, process, distribution channels, Process, budget and timeline. Competitive. So you get the idea. So these are the parts that we would want to come up with. So Katie, I’m going to start by saying, okay, let’s start building out some of these answers for the project overview.

I’m going to start this in a text document because you never ever, ever edit directly in Al. It just never goes well. It’s just one of those things that network connections. So our overall project is new intro titles. Titles for our live stream format is 16 by 9 video 19:20 by 10:80 which is the resolution, the length is. So our opening titles are actually 45 seconds long because there’s a 30 second pre roll where we and then we have the 15 second thing. So 45 seconds, 30 second countdown followed by 15 second video intro. Our audio is instrumental music. Okay, now we can start getting into the pieces of this creative brief. So we’ve got the just the bare bones basics there Objective. What is the objective?

The objective to create a engaging on brand intro video that lets people know what they’re about to watch.

What does that mean? What does on brand mean?

The Trust Insights brand. The style guide.

Adhering to the Trust Insights style guide. Attach as a PDF. Okay, what constitutes engaging.

Attractive to our icp?

What that guy said.

Attractive to our icp. What is that?

We would have to pull that from the icp. So for the sake of time in this episode, we might say that our ICP would say something like short direct, not a million colors. Like those are things that we. That’s detail that we would Pull out of our icp.

Okay. Visually compelling, but not overloading.

Yeah.

Okay. Is there a key message? I mean, the key message is watching, watch, don’t change your channel. Yeah, don’t. Tone and style.

Again, that should mirror what is in the Trust Insight style guide.

That’s going to be tricky because what’s in the style guide is a lot of things like colors and fonts and things. Not necessarily a video style. Is there a particular style of video that you think it best fits our.

Brand about business news format, like 60 Minutes or Picture News thing?

Okay, I think that’s good enough for now. But that’s a really good point. I mean, when we created the style guide five years ago, weren’t really thinking about video. We were thinking about things like blog posts, more style, static content. So that’s an opportunity for us to go back and revisit.

Yep. YouTube and Twitch. Budget is irrelevant here. Competitive landscape is not necessarily relevant here. We’ve got the brand guidelines, we’ve got the stakeholder approval process. Okay, so this is a good start for the overall creative brief. So let’s take this and say, okay, I’m going to provide you with some background information. You use it to synthesize a creative brief based on the 12 major components you identified. There’s our background information. Let’s go ahead. And from Google Drive, I’m going to attach our icp, because we should. And then from there, we’re also going to go and fish out our brand style guide from our design folder, which is 18.

While Chris is doing that. One of the things we always recommend if you’re building something using generative Al is to do that background information first. Make sure you have those knowledge blocks, those ICPs, those, you know, sales playbooks, all of those things. We’ve covered a lot of those on previous episodes, which you can find on the so what playlist at trustinsights, Al, YouTube. But you don’t want to get to this point and go, oh, now I have to step backwards and get all that stuff. Because you can reuse those knowledge blocks, your icp, your style guides, your, you know, what we do, what the heck we do, like all of that stuff. And you can use generative Al to build them, but when you get to doing activities like creating videos, you’ll want to have those in place so you can get right to the doing the thing.

Exactly. So I’ve put in the Trust Insights icp, our style guide, our marketing knowledge block that we already wrote and Our About Us knowledge block that we already wrote. So that contains all the information plus what we’ve done here to come up with this creative brief. And let’s see what Gemini decides to synthesize. And again, at this point, we are doing something that you can do in any generative Al system, Chat, GPT, Claude, Gemini does not matter. I just happen to use this because we pay for it.

Chris, as the more recent convert to documentation, what would you say to those who just want to open up the video tool and start messing around?

The video tools have very strict limits. If you’re paying for Gemini Pro, you get 770 credits per month. A single video generation of 8 seconds costs you 20 credits, so you can burn through those really fast. If you’re using vertex, it is 50 cents per second of video. So if you need to do five or 10 or 15 tries, that starts swiping the credit card real darn fast. When VO2 first came out, I tried it. I tried. I did like four videos and I looked at my Google Cloud thing and says, like, you have a bill for $3. Like what? Like for 28 seconds of video. I mean, granted, in the grand scheme of things, you know, hiring a big production company and stuff costs a lot more than $3. But I was still like, wow, that went fast.

So part of the reason for all this process is so that when you do get to the video creation part, you’re not going to blow a million bucks or burn all of your Al credits at once.

Smart advice. John, what would it be like in your world to be able to remember that much information that Chris just rattled off at any given time?

Well, it doesn’t have to be super solid either. It’s just critically important that, like, as you’re doing it, you’re making notes on the stuff you’ve already tried. Because I’ve had so many projects go awry like this, where you’re like, okay, we just have to pull some levers and make some stuff happen. But then once you’re on the fifth or sixth iteration, you’re like, oh, wait, did we try that one or not? And now you’re wasting time and money. So, you know, it doesn’t have to be a complete word for word transcript of everything you did, but at least make a list of all the decision points. And you’re like, hey, we tried X. And that went really poorly. So don’t try X again. And that way every time you go through, you’ve got a checklist of like, okay, don’t do that, don’t do this. And yeah, it’s just, it can be. A lot of people think it’s great to just kind of go in there and mess with everything, but the reality is, you know, if you’re going to have to stay there until a project’s done, you’ve got to start dropping some breadcrumbs along the way or it’s going to get painful.

Yep, exactly. So we have now our entire creative brief, which is all built out based on the information we provide. And this is really terrific because it has built out a lot of the information that you would want in a great brief. Like if you were a creative director and someone came back with this, which is like a four page brief, they’d be overjoyed, like, oh great, I don’t have to ask pester you for all these questions. So that’s what we, that’s the where we are right now. So our next step is. Okay, well this is good. Now we need a script. So we need like a shot by shot script so that we know what to generate. So say let’s come up with some script candidates aligned with the brief. The 32nd countdown is one portion that has a simple numeric countdown.

We’ll want to choose the an appropriate font and background for that. The second part is the 15 second lead in what should our script be for that? So now we can just start having the conversation because remember, we still got all of our knowledge blocks loaded into this chat, so that is still in memory. One of the things to remember about Generative Al, and I was just doing this on a webinar literally minutes ago, is that what we think is happening versus what’s actually happening are two different things. In Generative Al, this is how we feel like a chat works. We have a prompt and we have a response. Give a prompt and we think that the prompt is just the last thing we did. What is actually happening behind the scenes is that the entire conversation becomes a part of the next prompt.

When you load in things like your knowledge blocks, that helps because it becomes a part of the permanent conversation. It carries through to all the different. Pieces of the chat, which I feel like is not how we started with Generative Al, because I recall there were a lot of times when were talking through prompting and one of the instructions we would give people is to tell generative Al, don’t forget about the stuff we’ve already talked about. It sounds like that recall has actually gotten better. And now the advice is if you don’t want it to bring back up all your old stuff, start a new conversation.

Exactly. So we have the 32nd Countdown portion. The visual and typographic elements should be a solid, dark, solid background. Barlow Connect semi bold animation data squares could be used. Here’s the problem. This first section here is so deterministic. It is not worth your time trying to do this in Al. You are better off building this literally as a PowerPoint slideshow and then describing rolling through the slideshow. And that’s what you should do for this because it is so rigid, generative Al is not going to be able to do that. Well, I can tell you that from a lot of painful, expensive failures in the video generation. Okay, for the 15 second lead in we have animated lines of data streams, quick cuts of online text with the Barlow font, the Trust Insights logo animates and the host names and titles appear.

The data first visualization intro. A chaotic cloud of gray data squares drifts on the dark background. A clean red line cuts to the chaos. The bar chart animates and the screen transitions to a clean layout with the show title or the Meet the Experts intro. Our logo followed by a split screen with headshots, a visual transition to the title cards and a final shot of all three hosts. So of those three candidates, Katie, which one do you like the best? The first number one. Yep. John.

I’ll pick three. I totally see us all with the. Yeah, exactly. The high school yearbook shots.

Okay, I think this one looks good. We will see if we can get the video generation tools to do it because again, there’s still chunks of it that are very deterministic to do this well we have to generate it in five to eight second pieces because no video generation tool on the market right now does more than about that much. So you can’t do a two minute or five minute piece today. But putting a bookmark, this is July of 2025. By the time you watch this in July 2026, there probably will be just make a 90 minute movie right in one shot.

So I mean the good news is it’s only 15 seconds long. So you know, hopefully it won’t be too cumbersome.

Hopefully not. Give me a shot by shot. Do I want to do it that way? No, we don’t want to do it that way because what we want to do is we want to. Well actually yeah, we do it on shot by shot because then we want to optimize each shot for the video models. We’ll say great, give me a Shot by shot list with description and specifications for candidate one and let’s see what it comes up with. It should come up with something relatively clean. What we’re then going to have to do is format it for a visualization tool like Google’s VO3. There’s a lot of Al video models out there. If you’re in the OpenAl ecosystem, the tool to use is Sora, which is OpenAl’s video tool that generates. If you’re in the Google’s ecosystem, the tool to use is V03. Other options are some of the Chinese models like Cling and Halo. Halo. And there’s a couple others whose tongue twisters I can’t even remember properly. Let’s go ahead and export this to a Google sheet as our shot list.

John, you were talking about one the other day. Veo.

Yeah, that’s the Google one that.

Oh, that is the Google one.

There’s been a lot of interesting clips going around about that just in how well it can do deep fakes and things like that. So it’s raising a lot of eyes.

Yep. Now the one thing again, what we want to do is we want to do this in a way that is compatible with the way VEO prefers to be prompted. So I’m going to start a new Gemini gem. We call this VO3 optimizer. What I’ve done, and I did this ahead of the show because there’s just no time to do it in the show is I built a deep research report on VO3. And so this is this very, very long document, 24 pages of how VO works, prompt examples and stuff like that it requires to properly prompt it. Because again, these video credits are super expensive. I’m going to put in my system instructions that go with this and we’re going to hit save. Our VO optimizer has been created and now I can say let’s take shot one.

I’m going to copy this and say optimize for VO3. And we give it just the first line out of our spreadsheet and it should think through and deliver us two different VO3 prompts. Because there’s two different ways that you can use the tool. The first is within. If you’re a Google Workspace customer, you can use it right from Google Videos, which is a tool that you probably didn’t know you’re paying for, but it is a online video editor. The challenge with the built in V03 model there is that it’s limited to 5 to 8 seconds and you’re only allowed 1000 characters in the prompt. Right. So you have a very tight space if you’re using Flow, which is Google’s video editing tool for Google Pro users that has no restrictions on what you’re allowed to do. So I’m going to start taking. Let’s use Google videos for today video. Let’s go to Untitled Videos. Let’s start a blank video. And I’m going to take this first shot. I’m going to take the restricted prompt out of here and we’re going to put this in and I’m going to go to the generate tab and you can see there’s thousand character list. So let’s go ahead and hit create. This will take some time, so while it’s doing that, I’m going to take the second shot and start processing it.

Do you think. I mean, I feel like I know the answer to this, but do you think that not knowing how these specific models need to be prompted is where people are burning a lot of time, energy and budget? Because I remember when imaging first started to hit the market. That’s a whole different skill set in terms of how you prompt for images. And it sounds like video is no different where it’s a whole different, you know, skill set in terms of how you prompt a video versus how you prompt a text model.

Mm. It has issued a warning. It says, by the way, the model is notoriously unreliable. Generating specific legible on screen text directly prompted forwards attribution Rl will almost certain result in garbled or misspelled text breaking the professional aesthetic. You’re looking for the standard best practice to actually forbid on screen text using a negative prompt to ensure clean output. Therefore, instead of fighting the technology, we will direct it. We will translate the idea of each keyword into a powerful visual, symbolic visual that VO3 can render. So this comes right out of the system instructions and that the. The deep research. Let’s take a look. See how the first things. Yeah, you can see there’s the text. Yeah, that does not look great. That is something you’ll probably have to. We’ll probably have to fix up in post production. However, the visual itself is pretty darn cool. So we’ll need to. We’ll probably want to do that without the text. Let’s see what it came up with for the second prompt. So four individual prompts for keyword montage. Sleek digital visualization. No subtitles, no text. So this will be our second and let’s create a new track here. Paste this in. I’m going to go back to the Prompt one and see if we can knock out that text and regenerate it.

That way we don’t have to ask the tu he questions.

Exactly. Okay. No subtext.

That is funny though. That is one thing I’ve seen, even with the VO stuff, is any videos, if you look at for stuff that should be text, that’s always a giveaway. You still see like weird letters floating in the. Which is almost insane to me that it can do people perfectly. But text is, you know, still can’t be done.

That’s fundamentally because text is a deterministic output. It has to be in a certain order. It has to look a certain way and has to look coherent. Whereas people are random. It can generate randomish looking people. And that’s. Okay, let’s put this in and see how this turned out here. Okay, so we’re going to keep that one. Let’s discard this, go back to this one. For our next, which is the intro parts, we’re going to get rid of this initial intro part because that just looks terrible. And so as we go through and bake each of these, we have to also start thinking about what we want for the background music. Because VO3 can generate music, but it’s unique to each clip. It’s incoherent from clip to clip.

Well, and I think that’s a really good point is, you know, so usually for those who, you know, aren’t as familiar with, you know, video editing, video and audio tend to be two different tracks. And so you can have it combined if that’s how you recorded it. And that’s usually what you do, like for real life things. And so, like, if you hold up your phone and you record a video and you bring it into an editing system like a Camtasia or whatever the one is for like Apple. And all those things you’ll see, likely you’ll have the one track and then you’ll have the option to split the track into audio and video. And then you can, you know, tweak the audio mess with the video, or you can keep them combined. But when you’re creating content like this, using Al to start, and there is no audio and video track, you probably, you can use Al to generate audio, but you probably want to do it separately and then bring it into your editing tool because it is still going to be two tracks that are layered on top of each other. And then if you’re adding more effects, that’s going to be yet another track. So you have your effects track, your video track. Your audio track, but typically they’re separate. So I think, Chris, to your point, don’t worry about audio right now. You can layer that in afterwards. You can bring in if you have the rights to a song or if you create your own, that can be a separate track.

Exactly. The thing to keep in mind there is we need to come up with the audio. As I’m hacking away at the restricted prompt here right now, I’ll tell you how the original. So what music got created. That is just a bunch of loops. That is a bunch of loops from. From Apple’s garage band. And I was messing around one day and I’m like, all right, you know, we’ll just give this a try. And it worked fine at the time. Like, I did like three and a half minutes and we spliced out the part that. That sounded the least bad. Is that the way that, you know, it’s very much like sort of that. That tag techno.

Yeah, well, and. And this is the thing like, so if we’re going on personal opinion of music, we all have very different things that we enjoy when it comes to music. So I think this is again, where it’s a good opportunity to see, you know, is there data in our ICP that could give us some guidance to say what kind of music? Like, what would be theme or what would be, like, the tone or, you know, the style of music for trust insights as a brand? Because those are again, those are not things that we took into consideration when we put together our style guide. But we do have a lot of data. And I don’t. This is. It would be completely experimental. I don’t know necessarily that it translates to asking, you know, our icp, like, what kind of music do you like? What would you want to listen to? But I also don’t think it’s the worst place to start. Either it could come up with nothing or it could actually give us some insights to say based on this demographic and psychographics, they typically listen to anything but techno.

Ran. Yep. Give me three to five candidates for the instrumental background music for the intro video. Specify the genre of music followed by a short 200 character prompt omitting verbs and stop words of technical components for generating the music such as tempo, key, instrumentation, timbre, pitch. Order your results in descending order by probability of alignment with the ICP and return the probability as a 0 to 100 integer score. So let’s see what our ICP wants. Who knows? And by the way, if you are. If you are watching this live stream live, which Actually, a good number of you are. Several dozen of you are put in the chat. What kind of music do you think we should be doing? In fact, I’m going to hide the results here so you can leave a comment in the comments and tell me, like, oh, you know, what do you think our music should be? I need to get this fourth prompt here. Let’s see.

As death metal, country.

Well, and, you know, and that’s why I was hesitant to just let us, you know, the three of us make a decision on what it is, because we all. I know for a fact we have overlaps, but we all have very different things that we like. That looks a lot like Voltron or the Transformers right off the bat.

Decepticons.

But. Yeah. So, you know, and I think that this is one of the things, like, it’s a good reminder why you create an ICP in the first place is because it takes your personal opinions out of it. Instead of you deciding what you think your audience needs, it’s your audience telling you what they want, and you use that instead. Now, we may not have enough data to say, you know, oh, this is the kind of music you should use. What would likely be a better, you know, more in depth is if we did some kind of deep research around, you know, music taste or if there was any kind of research available from a Spotify or a Pandora or an Amazon, music that gave, you know, demographics and that kind of stuff, you could bring that into that conversation. We just don’t happen to have that. But that would be a way to sort of handle. If you don’t have, you know, those things handy, then I think that you could bring that in. I’m looking already. I’m like, lo fi makes sense. News theme feels a little too like Guy Smiley from the Muppets with, like, I don’t know.

In the background.

Yeah, exactly.

And minimalist electronic is more or less kind of what we have now.

Sure. Again, but this is why I wanted to take us out of it. I don’t. I personally don’t love theme music, but it’s not for me. And I think that’s the thing that I’m. That’s what I’m trying to convey is, like, if you’re making the decisions on behalf of your customers, you’re introducing your bias into it. So I would never pick electronic music personally, but this is telling us what our customers would ne. Would think that would make sense.

Exactly. So let’s see if we can get this working here. I need to log into my Music generation software. Oh, I need to do it from the client services account. That is a pain. So let’s see if I can do this.

For those who don’t know, we use different accounts at Trust Insights and I have recommended to Chris that he use different browser profiles versus what he currently has set up, which is all in the same profile. One person’s opinion. We all do it our own way.

Exactly. So let’s share screen. Now we need to share the tab so that we can share the audio, which is the reason for the switch because otherwise you. You can’t get the audio shared, at least in Chrome. So our first candidate here is actually. We want this in custom mode. We want no lyrics and we want our pitch. And we’re going to call this so what one. And let’s look at the advanced options. Keep that at 50% each and hit create. We’re going to then take the second prompt that it gave us, paste that in, and then we’ll call the so what to get that in and then go into a third. Paste that in and call that so what three in. Suno. Suno in particular gives you three candidates per. Two candidates per. Per track.

So let’s see, the first one is at least basically available for preview. Let’s give it a listen. I’m gonna fall asleep.

Oh, yeah, that’s all wrong.

Let’s. This is the second of the. That first prompt. The 95 alignment. That’s better.

It’s better if you. If I had to pick one of the two, sort of like one or two. One or two. I would pick two in that instance.

All right, let’s try a candidate too.

I need to get on my light.

Cycle for this one.

Yeah, I feel like I’m about to watch Stranger Things. Like there’s a demigorgon coming up behind. Me right now probably. It’s not bad though. I can see this, like, particularly for the 32nd countdown up front. This would not be bad. Okay.

Let’s listen to candidate two in that section. A bit more sedate.

Oh, this is more Stranger Things.

Yeah, I got. Rudy’s about to get the. Oh, definitely. All right, now let’s try out candidate three, the corporate loi. Free conference call dot com. It really is. Now let’s try candidate number two in that category.

Unfortunately. Like, I love lo fi, but unfortunately these ones do sound like hold music, which they really do, theoretically, is what it is. But it also looks like it should like break in and go. Your call is important to us. Please stand by.

Like, I thought there was A little bit of Warren G in that last one. I was feeling a little bit. A little street there.

So I would say if forced to pick one, I think the second candidate of the first batch.

So this one here, the Uptempo Broadcast News.

Yeah. But again, one person’s opinion, I actually think if were able to export the audio, I would probably take it to the community and have them vote.

Well, we can, we connect, we can download the audio. These just straight up MP3s. Which of the two candidates for the techno? One. One or two?

I think. Yeah, I think that one.

Okay, so let’s take that one. We’ll download that. And then of the Lo Fi hold music, which one of those? Oh, yeah, number two.

Number two.

All right, let’s download that. We’ll download it as well. Okay, so now we’ve got music. And what I also did was inside Google Videos. I went and I exported the. The track. So in Google Videos here, what I’ve done is for each of the generations, I’ve stuck them onto the timeline here and I’ve added padding on either side. And the reason for this is I want it as one great big video file that I can then take into my editing suite. And because we want to be able to work with these separately. And the editing suite, I personally use Adobe Premiere, not because I love it, but because we pay for it. It’s part of the Adobe Creative Cloud. There are.

You’re showing this, Chris, we can’t see it.

Oh. Part of the problem is it takes like 15 seconds for this thing to just. So let’s go ahead and. Yeah, that’s. Let’s start a new session.

This is one of the prime reasons why you have multiple monitors. If you’re going to run Premiere, you’re like, okay, I need three screens to make this work. Right?

Seriously, let’s go ahead and get.

But I think one of the things that you’re demonstrating, which is always useful for people to see the good, the bad and the ugly, is if you want good quality output, it’s not just like, I want a bumblebee landing on a flower, it’s going to be more than that. If you want the music, if you want to edit it, if you’re going to be using it for your YouTube channel, you probably have card bumpers or contact information, other things that you would want to lay over it. And so there’s more than just one tool. And Generative Al is just that. It’s a tool. It’s part of your stack. And so in a Perfect world. It’s your whole stack. In the realistic world, it’s one of many tools that you would need to complete a task.

Exactly. We have the three scenes. I’m going to now choose Scene Edit detection within Adobe that will identify the different scenes and then splice out the placeholders in between them, which is a useful way to adjust those individual clips back out. It saves a lot of time. Great.

Can you share your screen?

Oh, am I not doing that? Nope.

We’re all just in imagination land right now.

Okay, let’s try that again. I finally got my project imported. I’m going to drop in the video itself, hit keep settings. Let’s ditch the audio. So we’re going to unlink it. We’ll ditch the audio from the original clips because it just was not helpful. Then we’re going to choose Scene edit and we’re going to have it analyze the different scenes. And then we’re going to take out these interstitials which are just placeholders on either side. This gives us three clean clips and we can see them there. There’s the things. And then we can take the audio. Let’s remove these parts here. Now, obviously we don’t. We haven’t built the countdown timer yet. That would be in the first section. But then you would have these three clips here like so with the background music. That’s the process then of creating the final. You still have to. You should use a nonlinear editor to assemble the pieces, to get the transitions right and to get the audio right.

What you can see on the bottom half of Chris’s screen is what were talking about earlier, where you have the video and the audio as separate tracks. And so Generative Al put together just some, you know, standard audio to go with it. Chris was able to separate those tracks, delete that audio, and bring in the audio that we created in Suno.

Yep. And even this isn’t as far as I would go for something in production. So one of the things that is 100% true about generative Al made music is that it is not mastered properly. It is mastered for it to be very generic, which, you know, like all Al output is. And so you have to separately take use your audio editing tools to remaster it properly so that it is a little more punchy, will cut through road noise and wherever else people watch or listen to shows. So in all generative Al music cases, you absolutely should do the remastering. Otherwise it is very apparent that you are using Al generated music and it doesn’t sound as Good.

But if you have limited resources, it’s a good starting place.

Yes, it is good enough. It is good enough. So that’s the process from beginning to end, of what?

Do we get to see it?

Yeah, play the clip. Come on.

It’s not done yet. We haven’t done that. We don’t have the countdown timer.

No, I know, but do we get to see what we’ve done so far? Because we’re about to end the live stream and we’re leaving people hanging of, like, what we created.

All right, well, I could. I. It won’t play the video. I mean, it won’t play the audio because it’s premiere.

That’s okay.

See, something here’s from. From the ending to begin. From the beginning to the end.

Look at all that data flying around, putting together the hierarchy, and then Decepticons are there to save you. Oh, wait, they were the bad guys, right?

They were the bad guys.

All right.

And so that is the. The final sequence as generated there.

You know, considering we started with nothing, it’s not bad. I. I’m gonna be honest, I don’t think that this video is going to make the final cut for the live stream, but I think it’s a good practice proof of concept.

I do too. I think I would particularly for the third part, it do a nod to our heritage and have some light bulb and. Or lightning there. Then one of the tools that can work with uploaded logo as well is OpenAl Sora. So I probably take the Trust Insights logo as is and have Sora do animated version of that as the fourth and final part of the video.

Whereas I would probably take out the question mark and replace it with the Trust Insights logo.

Yep, we just throw on that and this is it.

We’re there. And the other thing is this completely was outsourced to Al. It did not. We. At no point did we offer our creative ideas in the process. Like, one of the things that we might have said, like, yeah, let’s use that gray square data stream that is part of our logo as like the centerpiece of the, you know, the sequences, right? We literally just threw caution to the wind, said, Al, you do it all, and you end up with this.

Well, considering that, you know, outside of us, you know, chatting for a few minutes here and there, in less than 45 minutes, you did a whole video in audio. You know, proof of concept that you could then bring to someone and say, is this what you wanted? You know, we built a creative brief, we built the prompts using best practices for the video creation, we built six different versions of audio based on our icp. Like, we didn’t accomplish nothing. So we have something to show if were, you know, bringing it back to a stakeholder to say, is this what you were thinking? Is this what you wanted? So I would say, job well done. We got a lot accomplished in a short amount of time.

We did. We did. And if folks would like the VO3 Optimizer Guide and prompt, we’re going to put it in our analytics for Marketer Score Slack community. So if you are not a member there, it’s free to join. If you’re not a member there, you should go there, because that’s where we’ll put the downloads from today’s show so that you can use them as well to make your own videos and things.

All right, John, gonna start remaking all your videos.

Yeah, absolutely. We just need the Transformers theme line this all up. It’s ready to go. Because it’s just amazing how it’s just what you were talking about. People used to spend thousands of dollars to make, like, 10 clips to show a client. Like, hey, what do you think? Because there’s no other way to get to this point of, like, okay, we like this. We don’t like that. We need more of this. We need less of that. And so, yeah, this is, you know, watching this. You’re saving thousands of dollars. Like, yeah, it turned out weird, and there was a lot of rando stuff in there, but nonetheless, you’re getting closer to the target without spending any money. So it’s huge.

Exactly. And you could at this point, if it was something as important to you, and especially if you wanted to own the copyright on it, at this point, you could take all the comps and go to a video production agency and say, like, this is what we want. We just need you to do it with humans so that we have a paper trail for copyright, which I think.

Is a really smart way to do it, because, I mean, I know from working with creative directors, you guys have probably experienced the same thing. What you’re describing often doesn’t translate into what they’re thinking and imagining. And then they say, well, can you draw it for me? And you’re like, I can do a stick figure. That’s about it. You know, so this is a really great way to at least get to that point of having a conversation to say, this is what we want.

Yep. And they can react to it and go, no, that’s. That makes no sense. Which, you know, it’s like, why Is there a Decepticons logo at the end. Of your video to see if anyone noticed?

See if anyone knows. All right, folks, that’s gonna do it for this week’s show. We’ll talk to you on the next one. Thanks for watching today. Be sure to subscribe to our show wherever you’re watching it. For more resources and to learn more, check out the Trust Insights podcast at trustinsights Al TI Podcast at. Our weekly email newsletter at trustinsights Al Newsletter Got questions about what you saw in today’s episode? Join our free analytics for Marketers Slack Group at trustinsights Al analyticsformarketers. See you next time.


Need help with your marketing AI and analytics?

You might also enjoy:

Get unique data, analysis, and perspectives on analytics, insights, machine learning, marketing, and AI in the weekly Trust Insights newsletter, INBOX INSIGHTS. Subscribe now for free; new issues every Wednesday!

Click here to subscribe now »

Want to learn more about data, analytics, and insights? Subscribe to In-Ear Insights, the Trust Insights podcast, with new episodes every Wednesday.


Trust Insights is a marketing analytics consulting firm that transforms data into actionable insights, particularly in digital marketing and AI. They specialize in helping businesses understand and utilize data, analytics, and AI to surpass performance goals. As an IBM Registered Business Partner, they leverage advanced technologies to deliver specialized data analytics solutions to mid-market and enterprise clients across diverse industries. Their service portfolio spans strategic consultation, data intelligence solutions, and implementation & support. Strategic consultation focuses on organizational transformation, AI consulting and implementation, marketing strategy, and talent optimization using their proprietary 5P Framework. Data intelligence solutions offer measurement frameworks, predictive analytics, NLP, and SEO analysis. Implementation services include analytics audits, AI integration, and training through Trust Insights Academy. Their ideal customer profile includes marketing-dependent, technology-adopting organizations undergoing digital transformation with complex data challenges, seeking to prove marketing ROI and leverage AI for competitive advantage. Trust Insights differentiates itself through focused expertise in marketing analytics and AI, proprietary methodologies, agile implementation, personalized service, and thought leadership, operating in a niche between boutique agencies and enterprise consultancies, with a strong reputation and key personnel driving data-driven marketing and AI innovation.

One thought on “So What? Using Generative AI for Video Creation

Leave a Reply

Your email address will not be published. Required fields are marked *

Pin It on Pinterest

Share This