The History Behind Bad Data
In Book VII of his dialogue The Republic, the Greek philosopher Plato introduces his powerful imagery of learning and perception in the Allegory of the Cave. In this allegory, a group of prisoners have lived in a cave since childhood, their legs and necks chained so that they cannot move. Above and behind them a fire blazes at a distance, so that they can see only the shadows of what passes between the fire and the cave wall in front of them. A low wall has been built to hide men who carry statues and figures of animals, so that the shadows of these figures appear on the wall. Everything the prisoners know about these creatures is limited to the images on the wall. But were one of the prisoners to be freed, his eyes would at first be overwhelmed by the brightness of the outside world. After a time his eyesight would adjust, and he would be able to see the objects whose shadows he had seen represented on the cave wall, now observing the real things directly.
Since their reintroduction into the West during the early Italian Renaissance, Plato’s dialogues have sparked both metaphysical and epistemological debate over the centuries. Many will recognize one such debate from high school: whether a tree falling in the forest requires someone to hear it in order for it to make a sound.
This position is known philosophically as subjective idealism, first articulated in the 18th century by Bishop George Berkeley, who posited that an objective universe of matter outside of the mind was incoherent. In other words, the world exists only in the mind of the one perceiving it and in the mind of the Deity.
James Boswell, in his Life of Samuel Johnson, records the most famous rejoinder to this doctrine:

After we came out of the church, we stood talking for some time together of Bishop Berkeley’s ingenious sophistry to prove the nonexistence of matter, and that every thing in the universe is merely ideal. I observed, that though we are satisfied his doctrine is not true, it is impossible to refute it. I never shall forget the alacrity with which Johnson answered, striking his foot with mighty force against a large stone, till he rebounded from it — “I refute it thus.”
What Johnson was demonstrating is that when you kick the universe, the universe kicks back; that is, the stone asserts itself by resisting Johnson’s foot. Of course, we now know that the stone, or the forces that created it, existed long before anyone was around to perceive it. This knowledge is established through radiometric dating, and the results can be reproduced independently through application of the scientific method. We also know that our senses are not some metaphysical phenomenon, but part of our biological adaptation to perceive and survive in the world. Thus, modern biology and neuroscience fill in the details of why Johnson’s refutation remains so effective to this day. As with Plato, one can read all sorts of sophistry if one bothers to perform a Google search on whether Johnson actually refuted Berkeley on his own terms. Meanwhile, back in the real world, anyone kicking the rock will find that the universe kicks back, neither aware nor caring whether we wish to perceive it.
Do Metrics Need to Be Actionable?
Such an introduction to a discussion of project indicators is unusual, I know. But it is necessary in response to a misconception found too often in business systems: that any and all measures of performance are of the same nature. There is also a subtext of what I can only describe as “magical thinking” that influences what is chosen as a measure of performance. Many of us have probably heard some variation of the “perception is reality” meme. Anyone taking this bit of nonsense literally may find out the hard way (and many have) that external reality will adjudicate both the perception of their project and their fitness in it. Dave Gordon, in these pages of AITS, has addressed many of these issues and provided specific examples, as well as a call for examples of utilitarian metrics.
I will offer only a few brief comments on the assertion that a metric must be actionable to be of utility. Sometimes you simply need to know the condition of your project or any other system, whether or not you can do anything about it. Perhaps the metric is actionable; perhaps it isn’t. Whether it tells us something significant about the material condition of the system or object should be the measure of its usefulness, especially in the prosaic world of project management. But in furthering Dave’s call, I would like to present the two main categories of metrics I have encountered in recent observations, and propose a framework for identifying those metrics that possess greater utility.
Indirect Measures and the Problem with Error Capturing
The first category consists of indirect measures of the phenomenon we wish to monitor. As with the prisoners in the Allegory of the Cave, we often find ourselves perceiving only representations of the objects we wish to measure. Poor data quality is often the root cause of these weak measures. For example, when assessing the reliability, validity, and fidelity of underlying systems, summary-level data provided through different data streams is cross-checked to identify inconsistencies. Any conclusion drawn from an identified inconsistency is inferential in nature, requiring additional inquiry and work to determine materiality.
We can add further complexity to this category by noting that the level of detail provided through those multiple data streams is not consistent. Some streams will provide greater detail and some less, though they are derived from the same set of data. In this case, comparative summarization of the data is used to identify inconsistencies. A practical example would be a submission of financial data that provides a total through one data stream, but more detailed sub-accounts through another. The sub-accounts in the second stream should sum to the total in the first. Once again, any conclusion drawn from an inconsistency is inferential in nature and will require additional investigation.
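The cross-stream check described above can be sketched in a few lines. This is a minimal illustration, not any particular system’s implementation; the account names, data layout, and tolerance are illustrative assumptions.

```python
# Minimal sketch of cross-checking a totals-only data stream against a
# second stream that carries sub-account detail. All names are illustrative.

def cross_check(total_stream: dict, detail_stream: dict,
                tolerance: float = 0.01) -> list:
    """Compare each account's reported total against the sum of its
    sub-accounts from a second stream. Returns accounts needing follow-up.
    Note the limitation: the check can flag an inconsistency, but cannot
    explain it; that determination remains inferential."""
    findings = []
    for account, reported_total in total_stream.items():
        detail_sum = sum(detail_stream.get(account, {}).values())
        if abs(reported_total - detail_sum) > tolerance:
            findings.append((account, reported_total, detail_sum))
    return findings

# Stream A reports only totals; stream B reports sub-account detail.
stream_a = {"labor": 120_000.00, "materials": 45_500.00}
stream_b = {
    "labor": {"direct": 100_000.00, "overhead": 20_000.00},
    "materials": {"raw": 30_000.00, "purchased_parts": 15_000.00},  # 500 short
}

print(cross_check(stream_a, stream_b))
# Flags "materials": reported 45,500 vs. detail sum 45,000
```

Note what the output does and does not tell you: it identifies where the streams disagree, but determining which stream is wrong, and whether the discrepancy is material, still requires a separate data call.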
Such error capturing is common and has a long history in quality assurance programs adapted from industrial economics. The assumption in this approach is that errors will happen, and that catching them at some point in a feedback loop will eventually allow for correction of systemic faults. But is this really necessary under the current information management framework? In the past, such tolerance for suboptimization of data was justified on grounds of cost. But basic information economics, which I have outlined in previous articles, demonstrates that such arguments are no longer valid. In fact, not only are they no longer valid; one cannot afford not to leverage the advantages of maximizing data quality. The marginal cost associated with such data is far outstripped by the cost avoided in eliminating conventional after-the-fact quality assurance and investigation, as well as the cost and suboptimization caused by duplicated streams and inconsistencies across streams.
Direct Measures for Greater Utility
This leads us to the second category: direct measures, which become possible with such qualitative data improvement and which provide greater utility. Having eliminated measures based on summarization and inference, our metrics can now focus on the elements within the data itself that represent (at least in project management) work planning and execution. For example, rather than cross-checking an attribute against summarized data from different data streams, there is now one data stream, and the attribute for that element can be determined without need of inference or additional data calls.
Using our earlier example of financial data, roll-up verification across data streams is no longer required, since a single data stream provides both the sub-accounts and the summary. Internal checks within the accounts can verify whether an error or unauthorized intervention was made, and more detailed and precise assessments can determine whether there are anomalies material to the system being measured, since the level at which work and budget are associated can now be assessed.
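The contrast with the earlier cross-stream check can be made concrete. In the sketch below (again with an illustrative record layout, not any standard schema), each account in the single stream carries both its declared total and its sub-account detail, so an anomaly is located directly rather than inferred from a disagreement between streams.

```python
# Sketch of an internal consistency check on a single data stream.
# The record layout and field names are illustrative assumptions.

def internal_check(record: dict) -> list:
    """Verify, within one stream, that each account's declared total
    equals the sum of its sub-accounts. No second stream is needed, and
    no inference: the anomaly is pinned to a specific account."""
    anomalies = []
    for account in record["accounts"]:
        detail_sum = sum(item["amount"] for item in account["sub_accounts"])
        if round(detail_sum, 2) != round(account["total"], 2):
            anomalies.append({"account": account["name"],
                              "declared": account["total"],
                              "computed": detail_sum})
    return anomalies

record = {
    "accounts": [
        {"name": "labor", "total": 120_000.00,
         "sub_accounts": [{"code": "direct", "amount": 100_000.00},
                          {"code": "overhead", "amount": 20_000.00}]},
        {"name": "materials", "total": 45_500.00,   # detail sums to 45,000
         "sub_accounts": [{"code": "raw", "amount": 30_000.00},
                          {"code": "purchased_parts", "amount": 15_000.00}]},
    ]
}

print(internal_check(record))
```

Because the work-and-budget detail lives in the same record as the summary, the finding names the exact account and amounts involved; there is no follow-up data call merely to locate the discrepancy.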
Furthermore, we often use sub-elements of data from a common data set for different purposes. In project management, these can include assessments of systems quality, performance measurement, financial integrity, financial planning, risk management, and a number of other functions. By making qualitative improvements that expand the scope and depth of the data provided, the needs of multiple functions can be addressed by a single data stream.
New Ways to Measure Project Performance
Much has been made in the technology literature about Big Data. There is no doubt that the qualitative improvement needed to reach direct measures takes us in that direction. But in my experience, normalization and rationalization of the dataset related to project management can both reduce the size of the data (making it “Small Big Data”) and eliminate the suboptimization of customized data-mining efforts that would otherwise be the equivalent of approaching a repository of Babel.
After all, we only need to ask ourselves: what data is common across major projects? We have the project estimate, scope, schedule, baseline, actual financial expenditures, perhaps earned value, technical performance, and risk. This has been done. The U.S. defense industry uses the UN/CEFACT XML D09B standard to achieve consistency in data submission. This meets the standard of making data flat and therefore open, breaking down proprietary barriers to information. It also allows for more effective and appropriate integration of data across line-and-staff disciplines that more closely follow system processes and cause-and-effect. Thus, integration provides insights into project performance not previously measured.
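The common data elements listed above can be pictured as one flat, normalized record per reporting period. The sketch below is purely illustrative; it is not the UN/CEFACT XML D09B schema, and all field names are assumptions, but it conveys the kind of flat, openly serializable structure such a standard enables.

```python
# Illustrative flat record of common cross-project data elements.
# Field names and values are assumptions, not the D09B schema.

from dataclasses import dataclass, asdict

@dataclass
class ProjectPeriodRecord:
    project_id: str
    period: str            # reporting period, e.g. a year-month string
    scope_element: str     # work breakdown element the figures attach to
    baseline_cost: float   # budgeted cost of work scheduled
    earned_value: float    # budgeted cost of work performed
    actual_cost: float     # actual cost of work performed
    risk_score: float      # assumed 0-1 composite risk indicator

record = ProjectPeriodRecord(
    project_id="PRJ-001", period="2015-06", scope_element="1.2.3",
    baseline_cost=50_000.0, earned_value=47_500.0,
    actual_cost=52_000.0, risk_score=0.35)

# A flat record serializes trivially to any open format, which is what
# "flat and therefore open" buys: no proprietary structure to unwind.
print(asdict(record))
```

Because estimate, schedule, performance, and risk attributes sit side by side in one record, integration across line-and-staff disciplines becomes a query over a single structure rather than a reconciliation across proprietary silos.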
For more brilliant insights, check out Nick’s blog: Life, Project Management, and Everything