YOUR DATA IS LYING TO YOU
The 6C Data Quality Framework by Trust Insights
High-quality data is the foundation of every digital business. From service to sales to analytics, in industries from healthcare to consumer goods, data is the bedrock on which business is built. If you don’t trust your data, that is usually a sign of weak data quality. And if you want to introduce AI into your company, you have to start with strong data quality, because AI amplifies whatever you feed it, good or bad.
The 6C Framework gives you six criteria to evaluate your data: Clean, Complete, Comprehensive, Calculable, Chosen, and Credible. If your data doesn’t pass all six checks, you’re making decisions on a foundation you can’t trust. Data quality should be a top priority, but it tends to fall by the wayside in favor of other tasks, and poor data quality is hard to recover from.

THE SIX CRITERIA
1. CLEAN
Prepared well, free of errors. No duplicates, no typos, no formatting inconsistencies. Data is properly structured and error-free. If your spreadsheet has “N/A” mixed with blank cells mixed with zeros, your data isn’t clean. Clean data is the starting point — nothing else in the framework works if your data is full of errors.
Cleaning data isn’t glamorous work, but it’s the most important work. Every analysis, every dashboard, every AI model you build inherits the quality of the data it’s built on. Garbage in, garbage out isn’t just a cliché — it’s a law.
Ask yourself: If you handed this dataset to a stranger, would they find errors in the first five minutes?
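As a rough illustration of the "N/A mixed with blank cells" problem, the sketch below maps the different ways of writing "no value" to a single marker and then flags duplicate rows. The placeholder list and the function names are assumptions made for this example, not part of the 6C Framework. Zeros are deliberately left alone, because a zero can be a real value.

```python
# Assumed placeholder spellings that all mean "no value"; adjust for your data.
MISSING_MARKERS = {"", "n/a", "na", "null", "none", "-"}


def normalize_missing(value):
    """Return None for any 'no value' placeholder, else the stripped text."""
    if value is None:
        return None
    text = str(value).strip()
    return None if text.lower() in MISSING_MARKERS else text


def find_duplicates(rows):
    """Return every row that repeats an earlier row once placeholders are normalized."""
    seen = set()
    duplicates = []
    for row in rows:
        key = tuple(normalize_missing(v) for v in row)
        if key in seen:
            duplicates.append(row)
        else:
            seen.add(key)
    return duplicates
```

Comparing rows only after normalization matters here: a record with "N/A" and the same record with a blank cell are the same record, and a plain equality check would miss that.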
2. COMPLETE
No missing information. All required fields are populated. No gaps in time series. No blank records where values should exist. Missing data leads to missing insights — and bad decisions. When you have holes in your data, your analysis is based on assumptions, not facts.
Check for completeness early and often. A dataset that’s 80% complete might look fine in a dashboard, but the 20% that’s missing could be the 20% that changes everything. Especially in time series data, gaps can make trends disappear or appear where none exist.
Ask yourself: Are there gaps in your data that could change your conclusions if they were filled in?
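Two quick completeness checks can be sketched in a few lines: which calendar days are absent from a daily time series, and what share of records have every required field populated. The function names and the daily granularity are assumptions for this example; a weekly or hourly series would need a different step.

```python
from datetime import date, timedelta


def missing_days(observed_dates):
    """Return the calendar days absent between the first and last observed date."""
    observed = set(observed_dates)
    if not observed:
        return []
    start, end = min(observed), max(observed)
    total_days = (end - start).days + 1
    expected = (start + timedelta(days=i) for i in range(total_days))
    return [day for day in expected if day not in observed]


def fill_rate(records, required_fields):
    """Return the share of records in which every required field is populated."""
    if not records:
        return 0.0
    complete = sum(
        all(rec.get(field) not in (None, "") for field in required_fields)
        for rec in records
    )
    return complete / len(records)
```

A fill rate of 0.8 corresponds to the "80% complete" dataset described above; the list returned by `missing_days` shows where the holes in a trend line actually sit.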
3. COMPREHENSIVE
Must cover the questions being asked. The data actually addresses the business question you’re trying to answer. It has enough scope and depth to be useful. This is where many teams go wrong: they analyze the data they have instead of the data they need.
If you’re trying to understand customer behavior but only have website traffic data, your data isn’t comprehensive enough. If you’re evaluating a marketing campaign but only have impressions and not conversions, you’re working with half the picture. Comprehensive data covers the full scope of the question being asked.
Ask yourself: Does your data actually answer the question you’re asking, or are you making do with what’s available?
4. CALCULABLE
Must be workable and usable by business users. Data is in formats that can be computed, analyzed, and visualized. Numbers are numbers, dates are dates. If your revenue column has dollar signs, commas, and text mixed in, it’s not calculable. If dates are stored as free text in different formats, they’re not calculable.
Calculable also means accessible. If only one person on your team can work with the data because it’s locked in a proprietary format or requires specialized tools, it’s not serving the business. Data should be in formats that business users can open, explore, and analyze without needing a data engineer.
Ask yourself: Can a business user open this data and start working with it immediately, or does it need cleaning first?
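To make "numbers are numbers, dates are dates" concrete, here is a minimal sketch that converts revenue text with dollar signs and commas into a numeric value and reads dates written in a few different formats. The list of date formats is an assumption for this example, and slash-separated dates are read month-first (US style), which will not be right for every dataset.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation

# Assumed date formats for this example; extend to match your own data.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%B %d, %Y")


def parse_revenue(text):
    """Convert text such as '$1,234.50' to a Decimal, or None if it is not a number."""
    cleaned = text.strip().replace("$", "").replace(",", "")
    try:
        value = Decimal(cleaned)
    except InvalidOperation:
        return None
    # Decimal accepts 'NaN' and 'Infinity'; treat those as non-numeric too.
    return value if value.is_finite() else None


def parse_date(text):
    """Try each known format in turn; return a date, or None if none match."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None
```

Values that come back as `None` are the ones that are not yet calculable, so counting them gives a quick measure of how much work a column needs before anyone can chart or sum it.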
5. CHOSEN
No irrelevant or confusing data. Curated deliberately. Only data that serves the analysis is included. Noise has been removed. More data isn’t always better — the right data is better. Irrelevant columns and records create confusion and slow down analysis.
This is the flip side of Comprehensive. Where Comprehensive asks “do you have enough?”, Chosen asks “do you have too much?” A dataset with 200 columns when you only need 10 isn’t thorough — it’s overwhelming. Choose your data with intention.
Ask yourself: Is every field and record in your dataset there for a reason, or is there noise that could confuse the analysis?
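Choosing data with intention can be as simple as naming the fields an analysis needs and dropping everything else. The sketch below assumes records are stored as dictionaries; the function name and the field names in the usage note are hypothetical.

```python
def choose_columns(rows, wanted):
    """Keep only the wanted fields from each record; fail loudly if one is absent."""
    chosen = []
    for row in rows:
        missing = [name for name in wanted if name not in row]
        if missing:
            raise KeyError(f"record is missing chosen fields: {missing}")
        chosen.append({name: row[name] for name in wanted})
    return chosen
```

For example, `choose_columns(records, ["customer_id", "order_total"])` returns records with just those two fields. Raising an error on an absent field is a design choice: it surfaces a Comprehensive problem instead of silently returning less than was asked for.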
6. CREDIBLE
Must be collected in a valid way. Sources are trustworthy. Collection methods are sound. The data can withstand scrutiny and challenge. If someone asks “where did this data come from?” you should have a clear, defensible answer.
Credibility is the criterion that ties everything else together. Data can be clean, complete, comprehensive, calculable, and chosen — but if it was collected using a flawed methodology, surveyed a biased sample, or came from an unreliable source, none of that matters. Credibility is what makes your data trustworthy to stakeholders and decision-makers.
Ask yourself: Could you defend your data sources and collection methods to a skeptical executive?
GO DEEPER
Download the complete 6C data quality guide. Use it as a checklist every time you evaluate a dataset.