A Canticle for Leibowitz, by Walter M. Miller, is considered to be among the best science fiction novels of all time, and its message is apt for our current times. The story begins 600 years after a nuclear war has all but wiped out modern civilization in what is called the “Flame Deluge.”
From the ashes rise a number of human responses to explain and cope with the world that remains. Among these is the creation of the monastic Order of St. Leibowitz, based on the life of Isaac Edward Leibowitz, a twentieth century Jewish engineer who preserved as many books as he could from the violent anti-intellectual reaction that followed in the aftermath of the annihilation. His purpose was to preserve the knowledge of the world that existed prior to the holocaust in the hope that human civilization would recover and rebuild. He was martyred for his efforts.
Unfortunately, Leibowitz’s rationale did not survive the intervening generations. His books and documents came to be revered for their connection to the beatified saint and became known as “The Memorabilia.” In one particular passage, a monk finds new documents directly connected to Leibowitz:
On one wall of the stair well a half-buried sign remained legible. Mustering his modest command of pre-Deluge English, he whispered the words haltingly:
FALLOUT SURVIVAL SHELTER
Maximum Occupancy: 15
Provision limitations, single occupant: 180 days; divide by actual number of occupants. Upon entering shelter, see that First Hatch is securely locked and sealed, that the intruder shields are electrified to repel contaminated persons attempting entry, that the warning lights are ON outside the enclosure…
The rest was buried, but the first word was enough for Francis. He had never seen a “Fallout,” and he hoped he’d never see one. A consistent description of the monster had not survived, but Francis had heard the legends. He crossed himself and backed away from the hole. Tradition told that the Beatus Leibowitz himself had encountered a Fallout, and had been possessed by it for many months before the exorcism which accompanied his Baptism drove the fiend away.
The novice stared at the sign in dismay. Its meaning was plain enough. He had unwittingly broken into the abode (deserted, he prayed) of not just one, but fifteen of the dreadful beings! He groped for his phial of holy water…
Brother Francis goes on to find new relics, such as the Holy Shopping List: “Pound pastrami, can kraut, six bagels—bring home for Emma.” Not understanding the context or significance of the content, he worships the object itself. The Memorabilia, meant to be information, is treated as an object of worship, a fetish.
What makes A Canticle for Leibowitz a great book that transcends its genre is its ability to tell us something important about human nature and the human condition, especially regarding perception and how it affects our ability to interpret the world and our history. For those of us in the information business, words and concepts are invoked in much the same way, without those who use them really understanding their practical meaning. Terms such as “Web-based,” “Cloud,” and “Big Data” are often used in connection with organizational improvement.
Web Doesn’t Mean What You Think
Let’s take the first of these terms in the context of business. Web-based applications have been around for quite a long time, but what does the non-technical person asserting they need a “web-based” solution really mean?
In most cases where I have inquired further on their meaning, they tend to have in mind some vague concept of an application that is accessed through their browser, and will not require installing an application on every workstation that will use the solution. That is to say, they wish to avoid the traditional thick-client installation. What this is really describing is thin-client.
There are, of course, three main kinds of thin-client solutions, as described by PC Magazine. These are the use of shared services, desktop virtualization, or a browser-based solution. All three of these options can be viewed through a browser, and all three reduce the amount of local resources required in processing data.
Most non-techies have in mind the last approach mentioned—a browser-based one—without fully understanding the advantages and disadvantages to such an approach compared to the others, or even compared to thick-client installations. This misunderstanding comes as much from the industry as from any other source, especially through the assertions and marketing of software providers who have banked their success on one or other of these solutions.
The disadvantages of relying solely on browser-based solutions concern security and functionality. Regarding security, browser-based solutions often lead into Cloud, which I will discuss in more depth next. But on their own, browser-based solutions, which in most cases are written in some version of HTML, tend to require open ports and introduce other risks, and they have turned out not to be all that secure, despite assertions to the contrary. This is true even when internally hosted.
Furthermore, browser-based solutions tend to be very limited in terms of functionality. Solutions written in the new languages that leverage the .NET framework, however, tend to be more powerful and can leverage business and other rich user objects. Microsoft’s own guidance demonstrates how this can be achieved and the different methods that exist to deploy thin-client, including calling web services and utilizing the other methods already mentioned.
Now, of course, if functionality and flexibility aren’t all that important, say because your intended use of web applications is to deploy glorified PowerPoint charts and graphs, then these considerations will not matter to you. But in most business environments functionality, flexibility, and sustainability are important considerations. Thus, when picking a “Web” application, it is important to understand that most deployments will be some mix of thick- and thin-client deployment, depending on the roles and responsibilities of the various classes of user.
Icarus and the Cloud
Oftentimes discussions of web-based applications lead us to the Cloud. As with other marketing-based buzzwords, the precise definition of this term often varies according to the one marketing a solution purported to be a Cloud solution.
In short, Cloud computing is one of a group of solutions where software, and often the data storage related to that software, is hosted on computers outside of an organization’s network. This usually occurs across the Web, or perhaps secured via one of the methods I described in defining a Web application. Cloud computing is also described as Software-as-a-Service (SaaS). There is a subtle distinction often pushed by software companies and others between “pure” Cloud and SaaS, where the former is purported to come with greater customization. But such a distinction conflates Cloud/SaaS with the underlying technologies being rolled out under Fourth Generation software, which often have the look and feel of open source applications and come with many of their advantages.
For organizations that have a concern for operational security (OpSec), which at this point should be everyone, using Cloud and SaaS poses a significant security concern. Despite calming assertions to the contrary from many very large companies entering this market, no one is safe. I will repeat this for emphasis: No one is safe.
I need not go into detail over the plethora of major data hacks in recent years, many suffered by companies that either provide Cloud solutions or are heavily invested in them for their operations. But just for the purposes of example, the victims include eBay, Court Ventures, Living Social, Heartland, the U.S. Office of Personnel Management (OPM), Sony, Restaurant Depot, J.P. Morgan Chase, Anthem, Home Depot, and Adobe, and the list goes on.
Yet, like Icarus, who ignores his father’s advice and with wax wings climbs ever closer to the sun, software developers and executives continue to push Cloud solutions to an increasingly skeptical set of industries, which have good reason for their skepticism. The counterargument that I often get to these facts from my colleagues in industry is to point to all of the systems that are not hacked. True enough as far as it goes, but they must also keep in mind that, in the words of a speaker at an OpSec briefing I attended, “There are two kinds of organizations: those that have been hacked and those that do not know they have been hacked.” This is especially true of small businesses that do not have the internal resources for sophisticated detection.
Furthermore, once you are large or famous enough to garner some attention, chances are that you are a target for hacking. The best OpSec approach in that case is, in your vulnerability assessments, to compartmentalize highly sensitive and personal information, blocking from web access any data whose exposure would seriously damage the integrity of the organization. In other words, you need to host such information internally using an intranet with restrictions and monitoring of open ports. Doing so brings us back to solutions involving either thick- or thin-client deployments that use one of the distributed processing methods already mentioned. Furthermore, for those who depend on social networking, it is important to separate your social networking systems from business systems. Use a VPN, and when browsing is necessary for business purposes, do it using Tor.
The days when VPN and Tor were confined to the dark web and black hats are long over. The Tor Browser, an open source browser built on Firefox, comes from a project that is partially funded by the U.S. government, and it is used by some agencies for secure browsing. It is time to follow their example.
So what are we to make of the ideal of Cloud? In reality, Cloud computing is the manifestation of the need to make software solutions multi-functional across devices, overcoming the geographical separation and mobility of users. Furthermore, the desire is to scale. Sufficient technological advancements already exist to meet these needs: a number of secure solutions employing desktop virtualization and shared services will support intranet and, where necessary, mobile deployments that also include transactional (two-way) processing.
With such security concerns, is there still room for Cloud? In my assessment, not if you care about the data. Even apart from considerations of security, investment in Cloud compromises the flexibility of the customer in moving to alternative and competing technologies. It opens up a whole host of problems regarding data that is stored in a proprietary format in a table structure that may not be easily translated to an alternative.
No doubt the issue of proprietary data storage and table structure is a common one at the moment, even at the application level, as part of a strategy in which software providers try to enforce the “stickiness” of their applications. In essence, this issue is the same one the railroads had in the 19th century, where gauges were made incompatible to keep loads from passing through their lines. It is an old story of non-competitive strategies pursued in an industry that exists through public largesse, and one that needs to be addressed by companies in the criteria they use to select a solution. In the end economics will win out.
Needless to say, having your data in a proprietary format on someone else’s servers that may reside in a country that is not necessarily friendly to your concerns may not be the best way to meet your OpSec goals.
Still, Cloud and Web have that ring to them that suggests one other aspect that is appealing to some people, and that is the ability to scale to support larger organizations using more data. This then leads us to the buzz phrase “Big Data.”
How Big Is “Big” and Its Relevance
Like the previous terms, Big Data has gone through a number of definitions. The first use of the term arose within NASA in 1997, simply in relation to hardware limitations in dealing with datasets. In this case, it was in relation to data that undergirded computer graphics, in which the dataset was larger than could fit into main memory or in a location on a local disk for easy access. Since that time the term has been transformed into a relative one based on data size and complexity, and on the manner in which available software and hardware can deal with it.
Most significantly (and unfortunately) though, it has been converted into a buzz phrase used to justify any number of prescriptions for how to overcome the limitations of the moment, as if they were anything more than simply another approach, one that possesses no unique or compelling technological or intellectual authority.
For example, there are two common types of such prescriptions that I’ve come across in my travels. One is to use the term Big Data as a means of resurrecting and justifying labor-intensive data mining approaches that are both unnecessary and expensive. The other is to justify the use of correlation across data as a substitute for sound statistical, analytical, and causal methods when dealing with socio-economic and behavioral data.
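To make the second prescription concrete, here is a minimal sketch in Python. It is purely illustrative: the data is simulated, and the variable names (`hidden`, `a`, `b`) and noise levels are my own assumptions, not drawn from any real dataset. It shows two measures that correlate strongly only because a third, unmeasured factor drives both of them.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, xs):
    """Remove the linear effect of xs from ys using a least-squares fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [(y - mean_y) - slope * (x - mean_x) for x, y in zip(xs, ys)]


def simulate(n=5000, seed=0):
    """Hypothetical data: a hidden factor drives both measured variables."""
    rng = random.Random(seed)
    hidden = [rng.gauss(0, 1) for _ in range(n)]   # the unmeasured driver (assumed)
    a = [h + rng.gauss(0, 0.5) for h in hidden]    # measure A: no direct link to B
    b = [h + rng.gauss(0, 0.5) for h in hidden]    # measure B: no direct link to A
    return hidden, a, b


hidden, a, b = simulate()
raw = pearson(a, b)
adjusted = pearson(residuals(a, hidden), residuals(b, hidden))
print(f"raw correlation between A and B:   {raw:.2f}")
print(f"after accounting for hidden factor: {adjusted:.2f}")
```

In this simulation the raw correlation between A and B is strong, yet it all but vanishes once the hidden factor is accounted for. Neither measure influences the other, which is exactly what correlation alone cannot tell you.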
In both of these cases there seems to be a common thread: The collection of artifacts of information takes on more—or as much—importance as the content or significance of the information contained in the artifact. Our approach to the artifact is colored by our presuppositions and biases about its probable importance, and by our cultural, experiential, and educational filters.
What I mean by “artifact” is any sequence or string of data that conveys a message. Thus, the artifact can be a document, a data file, a snippet of text, or any other item that meets this definition. The relative importance of the data, and whether it is worth measurement or retention, is often forgotten amid the near-fetishistic behavior of collecting the artifacts themselves.
One can understand the reason for such behavior. Given that we do not always know what is contained in the data due to its relative opaqueness, the compulsion is to collect as much as possible and to draw conclusions from that data, even if it is premature. Such behavior has significantly impacted our social, economic, political, legal, and organizational systems, for in many cases it has skewed our perceptions of what is happening in the world around us.
In the end, though, Big Data simply describes a relative condition that can be approached through a number of technological solutions, ones that will allow stakeholders to derive importance from that data without imbuing it with some mystical power to solve all problems and vanquish all evil. There is no magic in technology, though as Arthur C. Clarke once wrote, “Any sufficiently advanced technology is indistinguishable from magic.” No doubt that is the case, and it is the reason why we must demystify these terms, so that we make them work for us and not against us.
For more brilliant insights, check out Nick’s blog: Life, Project Management, and Everything