Today’s article is inspired by Minister for Finance Biman Prasad’s speech at the Evidence Based Policy conference where he spoke of never allowing “independent institutions for evidentiary national data gathering and analysis to be gutted and manipulated to suit the political agendas”.
He also highlighted the need for sentience: the capacity to feel, to be aware, to have cognitive ability.
“Sentience” in data circles is often used to describe an organisation’s data-centric intent and vision.
There are five stages in the evolution of data capability that promise to deliver a sentient organisation, one capable of effective, awake, aware, sensitive, real-time decision making.
We will explore this later, but let’s first look at it in a historical context and how Total Quality Management (TQM) guru W. Edwards Deming brought it to life.
“In God we trust… All others must bring data.” – W. Edwards Deming, referring to the importance of data measurement and analysis when doing business.
Deming was a statistician and business consultant whose methods served as a catalyst for Japan’s recovery after the Second World War.
His philosophy and methods allowed individuals and organisations to plan and continually improve themselves, their relationships, processes, products, and services.
His philosophy is one of cooperation and continual improvement; his reference to data alluded to the avoidance of finger pointing and ambiguity and repositioned mistakes as opportunities for improvement.
It is said to have spurred the “agile”, “fail-fast” approach to projects.
Fact-based decision making
He joined the Census Bureau of the United States in the late 1930s and introduced statistical process control to their techniques, which contributed to a six-fold improvement in productivity.
Deming’s expertise as a statistician based on observed data was instrumental in his posting to Japan after the Second World War as an adviser to the Japanese Census.
He imparted his knowledge of evidence-based, or fact-based, statistical methods and organisation-wide quality to the Japanese.
The combination of the two techniques evolved to later become known as Total Quality Management (TQM).
Deming insisted TQM had to be embedded in an organisation’s processes rather than accommodated as a topic of some focus.
Deming’s work and writing constitute not so much a technique as a philosophy of management, and data is fast becoming exactly that.
My former CTO and Obama technology adviser impressed upon me that all organisations would have to become data organisations, or they’d cease to exist as a going concern.
Nearly 20 years on, that has become increasingly evident.
Test it, check it out.
Switch off all sources of data to management decision making for a few weeks and see how you go.
Volume, velocity, veracity, value of data
The truth is, data has been at the core of decision making for decades, since the 1930s and even earlier.
It is becoming more prominent in recent times as the volume of data increases exponentially, as the velocity at which we collect data increases exponentially, and as the need for the veracity, or accuracy, of data increases exponentially.
But here’s the thing, here’s the question: Are you deriving value from the data at all, if not exponentially?
To be relevant and competitive locally and globally, data has to be embedded in an organisation’s processes.
As Deming insisted, we can work on trust, but our decisions must be supported with data facts.
What is the Fiji Bureau of Statistics’ data potential?
It is as exciting as it is encouraging to know of the allocation of more than FJD7 million to the Fiji Bureau of Statistics.
The Strategic Planning Office, one imagines, would benefit greatly from an upgrade in the Bureau of Statistics data capability as would all parts of government, and indeed the private sector.
This column has often emphasised the importance of data to operational and strategic decision making.
The remainder of this article will cover what it means to be a data organisation and challenge you on your current understanding of your data capability and your ability to derive meaningful value from data.
What is the data potential of the Bureau of Stats and what value can be derived from the data they hold?
What is the potential of integrating additional data into their data repository?
The reference here is to detailed, granular data, and the engineering of that data into an integrated repository, including data sourced from across whole-of-government systems.
Data strategy and policies will be critical to the success of the Bureau of Stats in its ability to support the government’s economic growth plans.
Start with this question
What is the value of our data?
So, what is the value of your data?
That is a difficult question to answer, and impossible if you have not yet thought strategically about data or considered a data architecture, strategy, roadmap and data engineering effort.
It is also impossible if you haven’t yet taken a holistic view of data or brought data analysis to your decision making.
So how do you get to having a trusted source of data that can help make game changing decisions?
Ask how you’ve monetised your data.
The five stages of data evolution
First, we need to review and recognise where we currently are on the evolution path of data capabilities.
Some will believe they’re good: “We have all the data capabilities we need. We have this database, that business intelligence tool, the other executive dashboard software, some other web application…”
Data needs evolve along five major stages.
As you move through each stage, the complexity of your data workload grows and the sophistication of the data you hold grows.
This assumes the data is integrated in a repository in the first place and that, with each stage, more data is added without breaking the integrity of the data repository.
Which stage of data evolution are you at?
Stage 1: Reporting what happened: The data in the repository is updated in what is typically called batch mode.
That is, monthly, weekly, nightly or hourly, or any combination of those frequencies, depending on how often the reports are needed.
In the case of census data or elections data that would be every four or five years.
For the most part, the questions in a report environment are known in advance.
And so data repository structures can be optimised to deliver good performance even when queries require access to huge amounts of data.
This stage reports what happened in the past, so your decisions are made on data that is anywhere from an hour old to five years old.
The workload on the repository is relatively simple: standard, preformatted reports with some ad-hoc queries and follow-up questions of the same data.
This is not to be confused with the standard predefined reports and queries demanded of an operational system such as an inventory system or a births or business registry.
Stage 2: Analysing why something happened: The workload here typically requires more data for trend analysis, an increase in ad-hoc queries, and some level of analytics such as propensity to default or to not complete a prescribed educational course.
It answers questions such as: why did a student not complete a course? Why did a customer close their account?
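As a flavour of what Stage 2 looks like in practice, here is a minimal sketch in Python using the pandas library. The file name, column names and the closed-account flag are hypothetical, chosen only to illustrate an ad-hoc “why did customers leave?” question against an extract from the repository.

```python
import pandas as pd

# Hypothetical extract from the data repository; columns assumed:
# customer_id, tenure_months, complaints_last_year, avg_monthly_balance,
# closed_account (1 = closed, 0 = still active).
accounts = pd.read_csv("customer_accounts.csv")

# Ad-hoc Stage 2 question: how do closed accounts differ from open ones?
summary = accounts.groupby("closed_account")[
    ["tenure_months", "complaints_last_year", "avg_monthly_balance"]
].mean()

print(summary)  # compare averages to see which factors track with account closure
```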
Stage 3: Predicting what will happen: In this stage there is more of the Stage 1 and 2 activity, with the addition and growth of analytical modelling.
The organisation becomes increasingly entrenched in quantitative decision making, and the value proposition of understanding the “whats” and “whys” is complemented by the ability to apply AI techniques and sophisticated algorithms to support the data science.
You can experiment with the expected impact of a price or tax increase on specific products for certain segments of the population, or ask whether a student will graduate if they change certain classes or take on a different major, and how we can help them be successful.
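To make Stage 3 concrete, here is a minimal, hedged sketch of a propensity model in Python with scikit-learn. The student file, feature columns and completion flag are hypothetical; a real model would need far more care over data quality, feature selection and validation.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical student records: a few candidate features plus the outcome
# we want to predict (completed_course: 1 = completed, 0 = did not).
students = pd.read_csv("student_records.csv")
features = ["attendance_rate", "assignments_submitted", "first_year_gpa"]

X_train, X_test, y_train, y_test = train_test_split(
    students[features], students["completed_course"],
    test_size=0.2, random_state=42
)

# A simple propensity model: estimated probability of completing the course.
model = LogisticRegression().fit(X_train, y_train)
print("Hold-out accuracy:", model.score(X_test, y_test))

# Stage 3 in action: flag students with a low predicted completion
# probability so the institution can intervene early.
at_risk = X_test[model.predict_proba(X_test)[:, 1] < 0.5]
```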
Stage 4: Operationalising: Continuous (real-time) updates, time-sensitive queries and time-sensitive decisions and actions.
A customer is going to close their account based on current trends as of today: what action do we need to take to save that customer? Did they just make a big withdrawal that indicates they will close their account? What is the best offer we can make to stop them leaving?
Stage 5: Making it happen: Events are detected, conditions for triggering specific actions are activated, and either automatically or via alerts certain actions are executed.
If a customer meets a certain threshold, for example deposits five per cent below or above their usual level, can we activate an automated promotion offer?
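A Stage 5 trigger can be as simple as a rule attached to an event stream. The sketch below, in Python, uses the deposit example above; the threshold value and both functions are hypothetical placeholders for whatever transaction feed and campaign system an organisation actually runs.

```python
DEPOSIT_CHANGE_THRESHOLD = 0.05  # five per cent above or below the usual level

def send_promotion_offer(customer_id: str, offer: str) -> None:
    # Placeholder: in practice this would call the campaign or CRM system.
    print(f"Offer '{offer}' queued for customer {customer_id}")

def on_deposit_event(customer_id: str, current_deposits: float,
                     usual_deposits: float) -> None:
    """Called whenever a deposit event arrives from the transaction stream."""
    change = (current_deposits - usual_deposits) / usual_deposits
    if abs(change) >= DEPOSIT_CHANGE_THRESHOLD:
        # Condition met: fire the automated offer, or raise an alert instead.
        send_promotion_offer(customer_id, offer="automated_promotion")

# Example: deposits have dropped ten per cent below the usual level.
on_deposit_event("C-1042", current_deposits=900.0, usual_deposits=1000.0)
```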
In the later stages, decision makers focus less on what happened and more on why it happened, with analysis activities concerned with “drilling down” beneath the numbers on a predefined report and “slicing and dicing” data at a detailed level to satisfy the inquisitive.
Ad hoc analysis plays a big role as questions against the repository are not known in advance.
In the more advanced stages the underlying data structure becomes more important because the information repository is used much more interactively.
• Naleen Nageshwar is a data and digital strategy consultant. A Fijian citizen based in Sydney, he runs his own consulting practice, Data4Digital, and is managing partner for Australia, NZ and the Pacific for AlphaZetta data science and analytics consulting. Send questions and feedback to naleen@data4digital.com.