When Bad Data Quality isn’t Bad Data

24th April 2013 by Henrik Liliendahl Sørensen

There has been a quiz running on this blog with the question: What is the name of the current Pope of the Catholic Church? Find the current standing of answers in the figure to the right.


It’s good to see a lot of different answers and indeed, a problem with the quiz is that all answers may be correct. While Francis is the papal name in English chosen by Jorge Mario Bergoglio, the pope has other names in other languages, such as Frans in Danish and Norwegian, François in French, Franziskus in German and Francesco in Italian.

The quiz is actually bad, as it has not included other good answers such as Franciscus, the Latin name, Francisco, the Spanish name, and Franciszek, the Polish name. The question in the quiz is too simple. What is meant by “the name” should be clarified: Is it the birth name, the chosen name as Pope in a given language, or something else?
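
To make the point concrete, here is a minimal sketch in Python of how the quiz could judge answers once “the name” is tied to a language. The names are the ones mentioned above. The dictionary and the two helper functions (`languages_for`, `is_correct`) are hypothetical and only for illustration; they are not how the quiz on this blog works.

```python
from typing import List, Optional

# Hypothetical lookup: the papal name chosen by Jorge Mario Bergoglio,
# keyed by language. Only the names mentioned in this post are included.
PAPAL_NAME_BY_LANGUAGE = {
    "English": "Francis",
    "Danish": "Frans",
    "Norwegian": "Frans",
    "French": "François",
    "German": "Franziskus",
    "Italian": "Francesco",
    "Latin": "Franciscus",
    "Spanish": "Francisco",
    "Polish": "Franciszek",
}


def languages_for(answer: str) -> List[str]:
    """Return the languages in which the answer is the pope's chosen name."""
    wanted = answer.strip().casefold()
    return [
        language
        for language, name in PAPAL_NAME_BY_LANGUAGE.items()
        if name.casefold() == wanted
    ]


def is_correct(answer: str, language: Optional[str] = None) -> bool:
    """Judge an answer; if no language is given, any listed name counts."""
    matches = languages_for(answer)
    if language is None:
        return bool(matches)
    return language in matches
```

With the language left out, `is_correct("Francesco")` accepts the answer, while `is_correct("Francesco", "English")` rejects it. The raw data is the same in both cases; only the precision of the question differs.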

Such problems are in fact very common in what we often see as bad data quality, as they reflect two frequent issues that aren’t about the raw data:

What other issues have you encountered that were seen as bad data quality but weren’t about bad raw data?

