Monday, September 22, 2014

Are Your Spreadsheets the Problem?

Mr. Patrick Burns at Burns Statistics (no, not that Mr. Burns) provides an excellent overview for the hidden dangers that lurk in your spreadsheets. Guess what. The problems aren't just programming errors and the potential for their harm, but are errors that are inherent to the spreadsheet software itself. That's right. Before your analysts even make an error, the errors are already built in. Do you know what's lurking in your spreadsheets? Well, do you?

Before you answer that question, ask yourself these:
  1. What quality assurance procedures does our organization employ to ensure that our spreadsheets are free of errors of math, units conversion, and logic? 
  2. What effort does our organization undertake to make sure that the decision makers and consumers of the spreadsheet analysis comprehend the assumptions, intermediate logic, and results in our spreadsheets? 
  3. How do we ensure that spreadsheet templates (or repurposed spreadsheets or previously loved spreadsheets) are actually contextually coherent with the problem framing and subsequent decisions that the spreadsheets are intended to support? 
Each question actually addresses an hierarchically more important level of awareness and intention in our organizations. The first question addresses the simple rules of math and if they are satisfied. The second question addresses the level of agreement that the math/logic coordinates in a meaningful way and is capable of supporting valid and reasonable insights, inferences, or accurate predictions about the system or problem it describes and that everyone understands why. The last question, the most important question, IMHO, addresses whether our analyses point in the right direction of inquiry at all.

My suspicion is that errors of the first level run amok much more than people are willing to admit, but their prevalence is relatively easy to estimate given our knowledge about the rates at which programming errors occur, why they occur, and how they propagate geometrically through spreadsheets. Mr. Burns recommends that the programming language R is a better solution than spreadsheets and easier to adopt than might be currently imagined by your analysts. I agree. I happen to like R a lot, but I love Analytica as a modeling environment more. But the solution to our spreadsheet modeling problems isn't going to be completely resolved by our choice of software and programming mastery of it.

My greater suspicion is that errors of the second and third level are rarely addressed and pose the greatest level of risk to our organizations because we let spreadsheets (which are immediately accessible) drive our thinking instead of letting good thinking determine the structure and use of our spreadsheets. To rid ourselves of the addiction to spreadsheets and their inherent risks, we have to do the hard work first by starting with question 3 and then working our way down to 1. Otherwise, we're being careless at worst and precisely wrong at best.


(Originally published at LinkedIn.)