So What? Intro to Knowledge Graphs

So What? Marketing Analytics and Insights Live airs every Thursday at 1 pm EST.

You can watch on YouTube Live. Be sure to subscribe and follow so you never miss an episode!

This deep dive will reveal how a knowledge graph for AI prevents your favorite models from forgetting crucial details during complex tasks. Mapping concepts as nodes and edges will empower your AI agents to retrieve facts without exhausting their context windows. You will discover hidden gaps in your content strategy and pinpoint technical debt that blocks your path to innovation. Adopting a knowledge graph for AI will transform how you manage proprietary data to ensure every generated response stays accurate.


In this episode you’ll learn:

What knowledge graphs are
Why knowledge graphs deliver better results with agentic AI
How to get started with knowledge graphs

Transcript:

What follows is an AI-generated transcript. The transcript may contain errors and is not a substitute for listening to the episode.

Katie Robbert – 00:00

Well, hey everyone. Happy Thursday. Welcome to “So What?”, the marketing analytics and insights live show. I’m Katie, joined by Chris and John.

Christopher Penn – 00:48

Hello.

Katie Robbert – 00:49

High fives all around. Well done. It only took us how many years to get that, and we mostly get it right.

John Wall – 00:57

Mostly get it right.

Katie Robbert – 00:58

We have a lot of expertise in other places, including what we’re talking about today: the introduction to knowledge graphs. This topic came up because Chris was talking about this a couple of weeks ago, and we realized it was probably a good idea to bring it to our audience. We want to really walk through what exactly a knowledge graph is.

Chris, I remember when you showed me a couple of weeks ago, it looked familiar. Anyone who has been with us long enough remembers the graphs we would show from events of who was talked about the most. Visually, it’s very similar looking. It’s a relational graph that shows how things are clustered and connected. Without getting too far ahead of myself, Chris, what are knowledge graphs, and what are we going to cover today?

Christopher Penn – 01:56

You reminded me to bring back the old stuff. A knowledge graph is any entity relationship mapping graph where you have nodes—which are some form of entity, like a person, an idea, a concept, a topic, or a word—and you have edges, which are the connections between them.

For example, if there were a knowledge graph of Trust Insights, there would be four nodes: Katie, Chris, John, and Kelsey. If I were to take our Slack conversations every time Katie mentioned Kelsey or Kelsey mentioned Katie, that would be an edge—a connection between the two. If you were to graph that out using our company Slack, you could see who talks to whom the most.
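The nodes-and-edges idea above can be sketched in a few lines of Python. This is only an illustration: the messages are made-up sample data, not real Slack exports, and `build_edges` is a hypothetical helper name. Each time an author mentions a colleague by name, the edge between the two gets one more count.

```python
from collections import Counter
import re

# Hypothetical sample messages as (author, text); not real Slack data.
messages = [
    ("Katie", "Kelsey, can you review the deck?"),
    ("Kelsey", "Sure Katie, sending notes to John too."),
    ("Chris", "John and Katie, the graph is ready."),
    ("Katie", "Thanks Chris!"),
]
people = ["Katie", "Chris", "John", "Kelsey"]  # the four nodes

def build_edges(messages, people):
    """Count undirected edges: one per (author, mentioned person) per message."""
    edges = Counter()
    for author, text in messages:
        for person in people:
            if person != author and re.search(rf"\b{re.escape(person)}\b", text):
                edges[frozenset((author, person))] += 1
    return edges

edges = build_edges(messages, people)
# Print the heaviest connections first.
for pair, weight in sorted(edges.items(), key=lambda kv: (-kv[1], sorted(kv[0]))):
    print(" - ".join(sorted(pair)), weight)
```

The edge weights are what answer "who talks to whom the most"; the node list alone says nothing about relationships.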

Almost 10 years ago, we would use Twitter data because it was the most reliable and most open data format. We would look at who was the most talked about at an event. One of the things about influence measures 10 years ago was that tools measuring social media influence had a bad tendency to measure who had the biggest mouth, which is not super helpful.

It was the old 1980s financial services commercial where the tagline was, “When E.F. Hutton talks, everyone listens”. That is the way our friend Mitch Joel describes influence, which is a great summary. It’s not who you know; it’s who knows you. These graphs represented a lot of that.

They came and went. Twitter was bought and became X, and new social networks are much less generous with their data. Social networks are now owned by AI companies that keep all that data for themselves, making it difficult to get. However, the concept of a knowledge graph—saying what the words, phrases, or topics are—is more relevant than ever because of the context window.

Katie Robbert – 04:30

Chris, isn’t this just a word cloud?

Christopher Penn – 04:36

No. A word cloud is what is said the most, which relates to who has the biggest mouth. If conferences chose speakers based on just who foamed at the mouth the most on LinkedIn, you’d end up with the most awful conference in the world.

Katie Robbert – 04:56

That is an important distinction because we got that question a lot when these network graphs were published. People were thrilled when they saw their name show up as a bubble because it meant they were being talked about a lot. They were being tagged in tweets, and people were resharing their content. It was a point of pride to see your name show up as a prominent bubble with different connections.

We had to clarify that it’s not just a word cloud. A word cloud shows the most mentioned person or the person who talks the most. I wouldn’t necessarily show up because I’m not someone who posts enough about what I’m doing at a conference. But if I’ve done my job right, people are posting about what I’m talking about and tagging me.

At MarketingProfs B2B, you would expect someone like Ann Handley to show up as a very large bubble as the MC and the face of the event. If she doesn’t, then she has not done a good job in that role because people aren’t talking about her or they’re talking about her negatively. It’s interesting to see how those bubbles came about. It was helpful for event planners to see who was being talked about the most to decide who they wanted on stage.
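To make the word cloud versus graph distinction concrete, here is a small sketch with invented event posts. The account names and the post data are assumptions for illustration. It compares raw posting volume (the word cloud view) with how many distinct people mention each person (the graph view).

```python
from collections import Counter, defaultdict

# Hypothetical event posts as (author, people tagged); illustrative only.
posts = [
    ("loud_larry", []),
    ("loud_larry", []),
    ("loud_larry", ["ann"]),
    ("attendee_1", ["ann", "katie"]),
    ("attendee_2", ["katie"]),
    ("attendee_3", ["ann"]),
]

# "Word cloud" view: who posts the most.
volume = Counter(author for author, _ in posts)

# Graph view: how many distinct people mention each person (in-degree).
mentioned_by = defaultdict(set)
for author, tagged in posts:
    for person in tagged:
        mentioned_by[person].add(author)
in_degree = {person: len(authors) for person, authors in mentioned_by.items()}

print(volume.most_common(1))
print(sorted(in_degree.items(), key=lambda kv: -kv[1]))
```

In this toy data the loudest account never appears in the in-degree table at all, because nobody mentions it.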

Christopher Penn – 06:38

Exactly. This technology is still useful because today’s AI tools only have so much memory when processing data. They have a short-term working memory called a context window. The context window is how much memory a model can hold before it fails or has to compress.

If you have ever used Claude Code or Claude Cowork and it says it is compacting its memory to keep working, it is basically making a summary. If you have a lot of valuable and relevant data, that is a bad thing because you lose details as things get summarized. It would be great if we had a way to provide access to those source documents without overwhelming the memory.

Agentic systems today can read files off your disk, which is a big step. However, even that can get overwhelming because sometimes you have too much information and need a library system to navigate it. There have been many conversations in the last six months about how to implement this. Andrej Karpathy, an OpenAI co-founder who moved to Anthropic this week, has proposed wikis, throwing it back 25 years.

Katie Robbert – 08:19

Wow, there’s a new concept.

Christopher Penn – 08:24

Those are good situationally, but it takes a while for a language model to traverse wikis. It can be very processor and memory intensive. Other systems exist, like MemPalace, released by Milla Jovovich and her partner. It uses compressed text to help a model remember things, which was not on my bingo card for this year.

Katie Robbert – 09:06

I feel like you just said a bunch of words. I got the Resident Evil reference and I know the actress, but what did they release?

Christopher Penn – 09:14

They released a similar system, a kind of compressed wiki, to help AI tools remember things. Users have said it seems to work reasonably well. But looking back, knowledge graphs are probably the best way to do that in an AI world because they allow you to search across a network of concepts without reading every document.

Reading every document kills memory in an AI. When you ask it to help you write a blog post and give it supporting materials, it reads everything, and its memory blows up. If one of those documents is 30,000 words, that’s not helpful.

With a knowledge graph, a tool like Claude can look at the graph and identify concepts related to enterprise AI. It defines the concepts and links to them. In a good knowledge graph, it has little pointers and indexes to just those documents. It no longer has to read everything all at once; it can read just the relevant ones, piece things together, and come up with useful answers. That is why knowledge graphs are suddenly hot again: they preserve AI memory and point it only at things that matter for the conversation.
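As a rough sketch of the "pointers and indexes" idea, the graph can be treated as two lookups: concept to related concepts, and concept to document paths. Everything here is hypothetical, including the concept names, the file paths, and the `documents_for` helper. The point is only that the lookup returns a short reading list instead of the whole folder.

```python
# Hypothetical graph: concept -> related concepts, and concept -> document paths.
related = {
    "enterprise ai": {"ai governance", "data privacy"},
    "ai governance": {"enterprise ai"},
    "data privacy": {"enterprise ai"},
    "email marketing": set(),
}
documents = {
    "enterprise ai": ["posts/enterprise-ai-roadmap.md"],
    "ai governance": ["posts/governance-checklist.md"],
    "data privacy": ["posts/privacy-basics.md"],
    "email marketing": ["posts/subject-lines.md"],
}

def documents_for(concept, related, documents, hops=1):
    """Return document paths for a concept and its neighbors within `hops` edges."""
    seen = {concept}
    frontier = {concept}
    for _ in range(hops):
        # Step one edge outward, skipping concepts already collected.
        frontier = {n for c in frontier for n in related.get(c, set())} - seen
        seen |= frontier
    return sorted(path for c in seen for path in documents.get(c, []))

to_read = documents_for("enterprise ai", related, documents)
print(to_read)
```

Only the documents returned would be loaded into the context window; the unrelated email marketing post is never read.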

Katie Robbert – 11:01

In an old-school sense, is it basically like the index in a very large book? If you’re looking at a cookbook and just want recipes related to chicken, you go to the index, find the versions of chicken recipes, and flip just to those pages versus going page by page?

Christopher Penn – 11:29

That is a good analogy. Imagine you have that index and you tap on the word “chicken” because you’re also looking for onions. You look at the chicken node, see onions, and tap on that to see sub-relationships. It’s a super helpful way to examine data.

Katie Robbert – 11:53

Brian has a question: “What are your thoughts on Obsidian and its knowledge graphs, especially related to Karpathy’s Second Brain idea?”

Christopher Penn – 12:02

Many people are using Obsidian, and its implementation is fine. It’s proprietary software—a personal knowledge base that links different data and creates a master index. I use two different systems: Joplin, which is a collection of notebooks similar to Evernote, and Logseq, which is a free tool for knowledge graphs. You can give it a folder of documents and it creates graphs around them.

The tool I use most these days is called Graphify. It is free, open-source, private, and works best with an LLM. You install it on the command line in a directory and tell it to make a graph of what is in there. If there is code, it doesn’t need an LLM because code has automatic references, functions, and libraries. It can make an abstract syntax tree, which is like a cookbook index with automatic sub-indexes. In a tool like Claude Code, it makes things faster and reduces mistakes. With an LLM, it can index all the documents in a folder and provide that same knowledge graph.
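The point that code can be indexed without an LLM can be illustrated with Python's standard `ast` module. This is not how Graphify works internally, which the episode does not describe; it is a minimal sketch, with a made-up `source` string and a hypothetical `call_edges` helper, showing that function-to-function edges fall out of the syntax tree directly.

```python
import ast

# Hypothetical source to index; any Python text would do.
source = '''
def fetch(url):
    return download(url)

def download(url):
    return url

def unused_helper():
    return 42
'''

def call_edges(source):
    """Map each top-level function to the names of functions it calls directly."""
    tree = ast.parse(source)
    edges = {}
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            calls = {
                n.func.id
                for n in ast.walk(node)
                if isinstance(n, ast.Call) and isinstance(n.func, ast.Name)
            }
            edges[node.name] = calls
    return edges

edges = call_edges(source)
print(edges)
```

Functions with no incoming or outgoing calls, like `unused_helper` here, are the kind of node that shows up as a straggler on a code graph.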

Katie Robbert – 13:58

My brain is already spinning because I understand how this would be applicable to me. I organize all my projects for Claude Desktop in a folder on my desktop with subfolders for different projects. Much of my work is relational. I might work on an event talk based on my knowledge of what we do at Trust Insights and our strategy, which live in other folders.

Right now, I am probably burning more usage than needed because I tell Claude to search the master folder and find what it needs. If I had a knowledge graph of the contents within each subfolder that connected all the topics, I could save usage within my Claude Desktop instance. I wouldn’t have to tell it to search everything.

Christopher Penn – 15:33

Exactly right.

Katie Robbert – 15:35

I win. I quit.

Christopher Penn – 15:39

Let’s look at an example of what a knowledge graph looks like. I’ll show a code one first. If you go to TrustInsights.ai/view, you’ll see our AI View tool. It allows you to do assessments for the third phase of GEO relevance. If you haven’t taken our course at TrustInsights.ai/geo101, this tool lets you inspect or compare pages.

On the back end is a ton of code. This is what the knowledge graph of the AI View code base looks like. You can see big clusters, like BM25, which is a search algorithm. There is the HubSpot client test library and rate limiter tests. There are many dense nodes with many connections around them.

When I tell Claude that something isn’t working right—maybe the rate limiter—Claude can look at the graph. It identifies the files and functions involved in rate limiting without reading the entire code base. It goes to those specific files and can tell me if the code is broken or if the user is an idiot.

Katie Robbert – 17:28

Is the success of a knowledge graph dependent on the governance and naming conventions of your files, or is it just finding keywords?

Christopher Penn – 17:52

It can be both. A knowledge graph is useful because you see little dots around the edge that are stragglers. Those indicate things like dead code or bad governance. Naming conventions that don’t make sense or technical debt become obvious. This is a useful diagnostic tool.

You can tell Claude to simplify the code base by looking at the graph and the stragglers to figure out why they are hanging out there. If there is a big chunk not connected to the main section, you might have dead code or features that were never implemented properly. It benefits from good coding standards and naming conventions.
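Finding the stragglers described above amounts to finding connected components and looking at everything outside the largest one. The sketch below assumes a tiny, invented graph of module names; the node names and the `components` helper are hypothetical.

```python
from collections import deque

# Hypothetical undirected graph of a code base: node -> neighbors.
graph = {
    "rate_limiter": {"api_client", "rate_limiter_tests"},
    "api_client": {"rate_limiter", "bm25_search"},
    "rate_limiter_tests": {"rate_limiter"},
    "bm25_search": {"api_client"},
    "legacy_export": {"legacy_format"},
    "legacy_format": {"legacy_export"},
    "old_helper": set(),
}

def components(graph):
    """Return connected components, largest first, using breadth-first search."""
    seen, result = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        queue, members = deque([start]), set()
        while queue:
            node = queue.popleft()
            members.add(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        result.append(members)
    return sorted(result, key=len, reverse=True)

main, *stragglers = components(graph)
print("stragglers:", stragglers)
```

A disconnected cluster is a candidate for review, not proof of dead code; something outside the graph, such as a scheduled job, may still call it.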

Katie Robbert – 19:14

A knowledge graph is an efficient way to start cleaning up technical debt. For those who are unaware, can you explain what technical debt is and why you should be cleaning it up?

Christopher Penn – 19:36

It is duct tape and chewing gum. A user reports an error, a developer patches it, and then they keep adding patches. It’s like when you see a car where the registration sticker is an inch wider than the license plate because they keep putting stickers over each other. Technical debt is band-aids on top of band-aids because no one fixed the underlying problem.

Over time, that makes code harder to maintain because fixing one thing breaks five others. It’s a house of cards barely held together. Human coders hate cleaning it, but machines are superb at it. With a knowledge graph, you can tell the machine to refactor a big file, remove technical debt, and reboot the system. Technical debt adds up when you band-aid things instead of revisiting the requirements or the spec.

Katie Robbert – 21:10

Technical debt makes your code less efficient and heavier. John, as someone who does home repairs, does this analogy resonate with you? If you could redo all the wiring correctly the first time, you wouldn’t have to jerry-rig a box and hope it doesn’t set on fire.

John Wall – 21:55

That ties in, and I spent years in software development with systems that did code management like this. Cruft piles up. If a previous version used one database and you transitioned to a new one, someone has to go back and pull out the old chunks of code. A diagram shows the freestanding clusters that no longer call anything, and you know they need to be pared off. In every system, cruft and entropy are problems.

I have bathroom plumbing with three different types of connectors—metal to plastic to PVC. You reach a point where you realize you could repair it a fourth time, or you could just cut it all out and put one pipe in.

This is fascinating because these network graphs were hugely successful for us when we initially did them. They would have five or six influencers on them, and they would go wild on LinkedIn because everyone wanted to brag about being the most important person at a show. It’s funny that these have very important functional implications beyond just a marketing stunt.

The idea of using a network graph as a dynamic index for an LLM is mind-blowing. You can feed the network index to the model so it doesn’t have to churn through everything; it just cherry-picks what it needs. You save tokens and make everything run faster.

Christopher Penn – 24:19

Let’s look at a non-code example. This is a knowledge graph of my LinkedIn posts. I write my posts in Joplin using markdown, and you can run Graphify in any folder of content. You need a language model to process it because it’s not code. In Claude Code, I use the Haiku model—which is fast and cheap—to build the graph.

You see clusters like “retiring old AI advice,” “generative AI,” and our “unofficial LinkedIn algorithm guide”. This is an index of 272 LinkedIn posts I’ve done. I can look at the stragglers, like the “Trust Insights Learning Survey” or “AI and Inequality,” and see things that are highly interconnected.

Katie Robbert – 26:11

I could see companies using a tool like this to vet whether their “experts” are truly experts and what they are actually talking about. You can see if there is a real narrative or if they’re just jumping on a bandwagon with one viral post while everything else is unrelated. It’s a way to vet experts in the field.

Christopher Penn – 27:36

For a content creator, I could tell Claude I want to write a new book using the 300,000 words of content I’ve already written. It could go to my knowledge graph, look at the biggest clusters related to generative AI, trace through them, and extract my real words in logical order.

It preserves my voice. The AI doesn’t have to remember all 300,000 words; it can traverse the graph and pick what it needs, like a chef in a kitchen. Because the graph maps the topology of my language, it can assemble patterns I might not even see.

Katie Robbert – 29:13

That would be helpful for me because I don’t feel like I have enough material for a book. It would help me see the clusters that could become chapters for an overall theme, like the 5P Framework. I write about it enough that I have material, but as a human, I wouldn’t know where to start. Seeing those clusters would be super helpful.

Christopher Penn – 30:16

This also tells you what you’re missing. You could take four or five deep research reports on enterprise AI governance, create a graph of them, and compare it to your own graph to fill in gaps. If you want to say something intelligent on LinkedIn without being a “me too,” you could look for topics no one is talking about, like network topology within large corporations as it relates to AI.
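Comparing two graphs to find gaps can be reduced, in its simplest form, to comparing topic weights. The topic names, counts, and the `thin` threshold below are all invented for illustration, and `gaps` is a hypothetical helper.

```python
# Hypothetical topic -> mention counts extracted from two corpora.
my_topics = {"generative ai": 40, "prompting": 25, "ai governance": 2}
research_topics = {"ai governance": 30, "network topology": 12, "generative ai": 35}

def gaps(mine, reference, thin=5):
    """Topics in the reference graph that are missing or thin in mine."""
    missing = sorted(t for t in reference if t not in mine)
    underweight = sorted(t for t in reference if 0 < mine.get(t, 0) < thin)
    return missing, underweight

missing, underweight = gaps(my_topics, research_topics)
print("missing:", missing)
print("underweight:", underweight)
```

A fuller comparison would also look at which connections between topics are missing, not just which nodes.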

Katie Robbert – 31:44

I see a use case for companies wanting to tighten their content strategy. A company might think they talk about a topic a lot, but a knowledge graph might show it’s a tiny, disconnected bubble. It’s a starting point for companies that aren’t getting known for what they want to be known for. If they aren’t writing about what they want to be known for, the LLM won’t associate them with it. Is there a way to marry SEO data to this so you can see the Venn diagram of what you want to be known for versus what you’re writing about?

Christopher Penn – 33:06

That will be in the Trust Insights GEO 301 course because 201 and 301 cover advanced stuff like that. You can take the same algorithms language models use for vectorization and decompose them against model embeddings. You can find the semantic whitespace that is missing or compare what shows up in search results.

An embedding space is just a big knowledge graph of math. You can see if who you are and what you talk about matches the semantic space inside a language model for that topic. If they don’t look alike, that’s why you aren’t in the results. There is no consumer-friendly way of doing this yet, which is why it’s a 301 course.
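The "do they look alike" check is commonly done with cosine similarity between embedding vectors. The sketch below uses tiny hand-written vectors as stand-ins; real embeddings come from a model and have hundreds or thousands of dimensions, and the variable names here are assumptions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical 3-dimensional embeddings standing in for real model output.
topic = [1.0, 0.0, 0.0]
on_topic_page = [0.9, 0.1, 0.0]
off_topic_page = [0.0, 1.0, 0.0]

print(round(cosine(topic, on_topic_page), 3))
print(round(cosine(topic, off_topic_page), 3))
```

A score near 1 means the page sits close to the topic in the embedding space; a score near 0 means it points somewhere else entirely.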

Katie Robbert – 34:44

Or you can just talk to John.

John Wall – 34:50

We have a guy.

Katie Robbert – 34:52

I think this is incredibly useful. My content often feels disconnected, but seeing a graph would help me understand if I’m over-indexing on one part of the 5P Framework and need to round it out. Content audits are still important, and generative AI sets the expectation that they should be more efficient. But doing it haphazardly eats up usage; we need a systematic way to put connections together.

Christopher Penn – 36:05

If you know what your knowledge graph looks like, you can see if you have introduced new principles or retired old ones since your last book. In the 2026 edition, you use the knowledge graph to extract that information and update the content.

Knowledge graphs date back to the 1970s and were popular in the 1990s in symbolic AI and ontologies. That fell out of favor because it was too rigid. In the 2000s, neural AI and deep learning became the rage, but they can still generate inconsistent results.

Folks in the field are now proposing neuro-symbolic AI, where you have an ontology or framework as a fixed guardrail. The neural network does the generation but only within the confines dictated by the symbolic network. This leads to higher-quality results. If you set brand standards in Claude Cowork, you are setting a guardrail that says it must be creative within those constraints.

Katie Robbert – 39:02

Is NotebookLM an example of a knowledge graph? You give it resources and it becomes like a wiki where you only search within those resources. Is it creating its own network graph on the back end to answer questions based on just that data?

Christopher Penn – 40:15

To a degree, yes. It’s not a semantic knowledge graph like I showed; it is Retrieval-Augmented Generation, or RAG. NotebookLM transforms documents into tokens and weights, then tells Gemini to query that network first. It is a hybrid system that has a knowledge graph of weights and tokens along with structured data. It can hunt down the specific source for an answer.

Many AI companies use databases like PostgreSQL with vectors and embeddings so they have a knowledge graph in the database and structured data in rows and columns. RAG failed with coding initially because you can’t have “concepts” in code; you need exact verbatim text or the whole thing blows up. But for a tool like NotebookLM, it is a hybrid of a RAG system and structured data.

Katie Robbert – 42:12

Could John use a knowledge graph to understand all our new business calls from the past year to see what people are asking for? We could find the main clusters for campaigns and identify edge cases to discourage.

Christopher Penn – 43:11

Yes. You would want to build it into a system where you have all the transcripts and notes of won deals versus deals that didn’t go anywhere. You can look at the two graphs side-by-side to see the difference.

Katie Robbert – 43:44

I’m curious about the pain points people talk about. John could audit our services for 2027 and ditch what nobody cares about while doubling down on the main pain points. We could update our website with the language people actually use. Right now, it’s murky because people say the same thing in different ways. Semantically bringing these clusters together would help us understand that information.

Christopher Penn – 45:05

We could do that before the end of the day because we use an AI transcription system. We have a folder of all our call transcripts from at least two years in markdown format. We just run Graphify on the folder and see the results.

Katie Robbert – 45:48

I want to make sure we’re thinking through use cases. We’ve talked about code, file organization to save usage, and understanding customer pain points. That’s powerful for companies struggling to speak the language of their customers.

Christopher Penn – 46:38

You can also use it with named entities. If you took all 5,000 papers from a conference like NeurIPS and used named people instead of topics, you could see who is cited the most. Someone did this with the Epstein files—3.5 million files—to see who was most talked about in a corpus otherwise too vast to process. If you have a large quantity of data and want to understand the semantic or logical structure, a knowledge graph is the application to use.

Katie Robbert – 47:44

That’s a good point because the default is usually just to give Gemini all the transcripts and ask for a summary. You get six key topics but no real insights. This is next-level.

Christopher Penn – 48:08

If you have a long-running client, you can see what themes keep showing up over and over. In a call center, you can see what people complain about the most. You can add features to show negative conversations in red and positive ones in green to see the real pain points.

John Wall – 48:49

When we use an LLM, it’s often data mining where we’re digging for a single solution. This is data exploration where you don’t know what’s out there. It surfaces different pockets of the data so you can ask questions of it instead of just getting a summarized list.

Christopher Penn – 49:17

You can even use it for strategy by taking a competitor’s open job positions and building a knowledge graph of the key skills. You can see exactly what their corporate strategy will be based on what they are hiring for most.
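A minimal version of the job-posting idea is a skill co-occurrence count: skills are nodes, and two skills share an edge each time they appear in the same posting. The postings below are invented, and the sketch assumes the skills have already been extracted from the posting text.

```python
from collections import Counter
from itertools import combinations

# Hypothetical postings reduced to skill lists; extraction is assumed done.
postings = [
    ["python", "sql", "llm"],
    ["llm", "python"],
    ["sql", "tableau"],
]

# Node weights: how many postings mention each skill.
skill_counts = Counter(skill for p in postings for skill in set(p))

# Edge weights: how many postings mention both skills in a pair.
pair_counts = Counter(
    pair for p in postings for pair in combinations(sorted(set(p)), 2)
)

print(skill_counts)
print(pair_counts.most_common(1))
```

The heaviest pairs hint at which skill combinations the company is hiring for together, which is a stronger signal than any single keyword count.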

Katie Robbert – 49:52

I’m jotting down notes. Sorry, not sorry. You’re both about to be very busy.

Christopher Penn – 49:58

That is an intro to knowledge graphs. For software, if your company pays for Copilot Premium, you have a Microsoft knowledge graph on the back end. For AI specifically, Graphify is excellent.

The gold standard is Neo4j, a server that handles billion-node knowledge graphs. LinkedIn is a knowledge graph; their technical papers show a six-petabyte in-memory graph updated every few seconds. LinkedIn runs on a knowledge graph because they work. Any final thoughts?

Katie Robbert – 51:03

Will you drop those links into our free Slack community, Analytics for Marketers?

Christopher Penn – 51:09

We can do that. We’ll have additional benefits coming soon for that community once I finish building them.

Katie Robbert – 51:22

I gave you one free pass and you didn’t use it.

Christopher Penn – 51:24

I’m using it now. I’ll tell you about it later. That’s it for this week’s show. Thanks for tuning in.

Be sure to subscribe and check out the Trust Insights podcast at TrustInsights.ai/tipodcast and our weekly newsletter at TrustInsights.ai/newsletter. Join our free Slack group at TrustInsights.ai/analyticsformarketers. See you next time.
