SAP BusinessObjects delivers great Business Intelligence solutions so that organizations can report off their existing data sources. But what is the point of reporting off data that isn’t accurate anyway? Although it is true that accurate data is pretty useless if you can’t get access to it, the converse is also true. What is the point of a great end-user enabled system that includes inaccurate data?
My Top 5 Customers – Really?
Take a look at the report below. (If you want to download this Xcelsius Model it is available below.)
Who are my top 5 customers?

Top 10 Customers
Did you say: General Electric, Procter & Gamble, PepsiCo, Home Depot and Walmart?
Well, sorry. I’m afraid that would be incorrect.
You see, what often happens in real-world situations is that organizations think they have more customers than they actually do. That’s because within their CRM system, employees are able to add the same customer multiple times with multiple spellings. This has happened in our case as well. Let’s apply BusinessObjects Data Quality to this real-world situation. With SAP BusinessObjects, you can take company names, customer names, addresses, etc. and standardize them, e.g. UPS = United Parcel Service = UPS Inc., WalMart = Wal*Mart = Wal-Mart, First Commerce Bank = 1st Commerce Bank.
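To make the idea of standardization concrete, here is a minimal sketch in Python. It is not how BusinessObjects Data Quality works internally; it only illustrates the principle of mapping many spellings to one canonical name. The `ALIASES` table and the `normalize` and `standardize` functions are hypothetical names made up for this example.

```python
import re

# Hypothetical alias table (an assumption for illustration):
# keys are normalized spellings, values are the canonical company name.
ALIASES = {
    "ups": "United Parcel Service",
    "ups inc": "United Parcel Service",
    "united parcel service": "United Parcel Service",
    "walmart": "Walmart",
    "wal mart": "Walmart",
    "1st commerce bank": "First Commerce Bank",
    "first commerce bank": "First Commerce Bank",
}


def normalize(name: str) -> str:
    """Lowercase the name and replace runs of punctuation/whitespace with one space."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


def standardize(name: str) -> str:
    """Return the canonical name if the spelling is known, else the trimmed input."""
    return ALIASES.get(normalize(name), name.strip())


for raw in ["UPS Inc.", "Wal*Mart", "Wal-Mart", "WalMart", "1st Commerce Bank"]:
    print(f"{raw!r} -> {standardize(raw)!r}")
```

A lookup table like this only catches spellings somebody has already seen. Real data quality tools add parsing and fuzzy matching on top, which is where most of the effort goes.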
My Top 5 Customers – Really!
Let’s have a look at this same report with Data Quality applied:

Top 10 Customers with Data Quality
Do you see the changes?
Walmart has jumped up into second place and United Parcel Service is now in fifth place. We can also see that our profitability at Walmart is higher than we thought (26.8% instead of 18.7%) and United Parcel Service is actually lower than we thought (28.6% instead of 26.3%). When you are making business decisions off your corporate data, it’s imperative that it is accurate and complete.
Here is the source data behind this chart and you can see how the lack of standardization has led to the incorrect results. I have highlighted the offending records for you:

Raw Customer Data Behind the Top 10 Customers Report
Once we apply data quality and standardize the names, the order changes and I have a new top 5! Oftentimes our biggest customers, vendors, partners and products don’t get the credit they deserve for contributing to our success. Once you’ve got data quality, you can know that you know that you know the true numbers.
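The effect on a ranking is easy to reproduce. The sketch below uses made-up revenue figures (they are not the numbers from my report) and a hypothetical `CANONICAL` mapping to show how one customer split across three spellings ranks lower than it should until the names are merged.

```python
from collections import defaultdict

# Made-up illustrative records: (customer name as keyed in the CRM, revenue).
records = [
    ("General Electric", 500),
    ("Walmart", 200),
    ("Wal-Mart", 180),
    ("Wal*Mart", 90),
    ("PepsiCo", 300),
]

# Hypothetical mapping from variant spellings to the canonical name.
CANONICAL = {"Wal-Mart": "Walmart", "Wal*Mart": "Walmart"}


def ranked_totals(rows, mapping=None):
    """Sum revenue per customer and return (name, total) pairs, largest first."""
    mapping = mapping or {}
    sums = defaultdict(int)
    for name, revenue in rows:
        sums[mapping.get(name, name)] += revenue
    return sorted(sums.items(), key=lambda item: item[1], reverse=True)


raw_ranking = ranked_totals(records)
clean_ranking = ranked_totals(records, CANONICAL)

print("Before:", raw_ranking)
print("After: ", clean_ranking)
```

In the raw ranking Walmart sits in third place with 200. After the three spellings are merged it totals 470 and moves up to second, ahead of PepsiCo.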
I’ve introduced this topic under the name of Data Quality, but Data Quality really falls under the broader topic of Data Stewardship or Data Governance.
You Don’t Know What You Don’t Know
The bottom line around data quality is that you don’t know what you don’t know. If you manage a data warehouse which accepts feeds from dozens of systems, then it’s highly likely that you have a data quality problem and don’t even know it. Data quality is a critical aspect of data warehousing, and operational systems are notorious for bad data. Last year, I read an excellent, practical guide to data quality called Data Quality Assessment. The book itself does not endorse a specific software vendor, but all the principles found in the book would apply to any organization looking to improve its corporate data quality.
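One cheap way to find out what you don’t know is to profile the customer names for near-duplicates before anyone builds a report on them. The following is a rough sketch using Python’s standard `difflib` module. The `similar_pairs` function and the 0.85 threshold are assumptions chosen for this example, not a recommendation from the book or from any SAP tool.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similar_pairs(names, threshold=0.85):
    """Return (name_a, name_b, score) for every pair whose spelling is suspiciously close."""
    pairs = []
    for a, b in combinations(names, 2):
        # ratio() is 1.0 for identical strings and approaches 0.0 for unrelated ones.
        score = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if score >= threshold:
            pairs.append((a, b, round(score, 2)))
    return pairs


suspects = similar_pairs(["Walmart", "Wal-Mart", "PepsiCo", "Home Depot"])
for a, b, score in suspects:
    print(f"Possible duplicate: {a!r} vs {b!r} (similarity {score})")
```

Comparing every pair grows quadratically with the number of names, so this is only practical for a quick look at a few thousand records. It also cannot tell that "UPS" and "United Parcel Service" are the same company; abbreviations like that need a lookup table or a dedicated matching tool.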
Downloads – See It Live
If you’d like to see an Xcelsius model of this chart live, I’ve made it available for download. The source code for the .xlf is also available:
http://trustedbi.com/files/Importance of Data Quality.zip
Truth Is Stranger Than Fiction
Sometimes in life you run across situations that are hard to believe. Here is an example where truth is stranger than fiction. When you want to get someone’s attention on data quality, just share this example. This data quality situation really happened and the results were disastrous. This video is from Timo Elliott. When you click on it, it will take you to his website:

Timo's Data Quality Presentation (2min)
Do you have any good stories to share? I’d love to hear them.