BASICS: Data-Centric vs Data-Driven: the whys and whats
Peeling this BASICS digital onion, we’ve left some details out (particularly in our last article). As stated, data-centric and data-driven applications deserve their own series, a journey we’re kicking off today. Before talking about BigData, strategies and mechanics, let’s untangle the messy data-centric vs data-driven dilemma.
Data-Centric vs Data-Driven
There’s a reason you’ve landed here. Chances are you’ve been caught in the “build a data-driven company” trend. You believe all your operations should focus on data-centric architectures and software. You’ve read countless blogs on “How to become a data-driven company in 3 easy steps”, “Why Data-driven enterprises require sheer commitment” or maybe you’ve flirted with subjects like “How to establish a data-driven enterprise via BigData and Apache Hadoop?” on Reddit, StackOverflow or Quora.
We understand. Going data-centric is seductive and, in many cases, paramount for taking your operations to the next level. After all, 90% of the Fortune 500 enterprises run or are investing in BigData initiatives – you wouldn’t want to miss out. Yet this whole data-centric/data-driven conundrum requires an ambiguity cleanse before delving into the data whys, hows, and whats.
As this conundrum stands, many believe that by transforming their organisation to a data-driven environment, they get an exclusive key to Valhalla, a place of data-centric processing that bears ripe profits and accurate decision making. In reality, there’s a long path to Odin’s absolution, especially when talking implementation.
I’ll have to play the “Captain Obvious” card here: if you want to grow, your decision-making, operations and delivery must be data-driven. This statement is far from being a mere editorial opinion.
Marc Altshuller, former General Manager of IBM’s Business Analytics Division, outlined in a 2017 Forbes Insights information brief how data becomes a differentiator for business success. Even a quick browse through that resource separates data-driven from data-centric approaches. Despite their similarities, the two differ in strategy and in results.
Clarifying what “Data-Driven” stands for
To understand what data-driven means, we must focus on culture, people, and their abilities. Consider it as a practice that establishes an efficient interpretation of the data, rather than a robust infrastructure or a high-end technical solution.
Let’s say you want to go out for a picnic. Normally, a clear sky, sunny weather and some free time would be enough to seduce you. Proactive and agile leaders will jump in without considering the “analytic” side of this action. To be fair, everything might go according to plan, except for those instances where it doesn’t.
You could be left one sandwich short of a picnic by traffic jams, emergency meetings or a cats-and-dogs shower. If conditions such as route, schedule and weather reports had been considered before venturing into that picnic, one could at least have packed an umbrella. Yes, you could think this analogy is a Moo Point – that is, until we bring out the big guns: statistics from a Forbes Insights/EY survey commissioned by IBM. These show that just 45% of senior executives include data analytics in their decision-making, especially when designing and executing business strategies.
Unsurprisingly, the subjects of the same study reporting the highest growth rates in revenues and profits were part of that 45%. Relying on data for most decisions will not only keep you dry but could also boost your business operations. Now that we’ve established the reason for data-drivenness, it’s high time for a proper definition. I found it in a “must-read” for any pioneer venturing into building a data-driven culture: “Creating a Data-Driven Organization: Practical Advice from the Trenches”, by Carl Anderson:
“Data-drivenness is about building tools, abilities, and, most crucially, a culture that acts on data.”
– this is perhaps as close as we can get to a “by-the-book” concept.
Does Data-Driven mean Data-Centric?
Presumably, your organisation ticks every checkbox there is for meeting data-driven requirements. Shouldn’t that make your operations, service or app data-centric? Alas, the answer here is simply No. This is where the ambiguity lies, as companies that achieve data-driven environments often believe they’ve reached the Valhalla of data-centricity and are worthy of wielding the Mjolnir of BigData to their benefit.
It is true that enabling a data-driven culture across your organisation will lead to ingesting large volumes of data, yet this does not make your practice data-centric. It is worth mentioning here that BigData processing is not data-centric by default. On the contrary: if there is no consistency across the datasets you’re harvesting and you just slap those down into data lakes, without a proper vision of how to harmonize those models, you’re getting less data-centric whilst becoming more data-driven.
Let’s not cross wires here: there’s nothing wrong with maintaining data lakes. We even understand why these are popular nowadays – it’s just that sometimes these approaches miss the mark for particular scopes. Take the ETL model (Extract, Transform, and Load): while it is essential for traditional data warehouse environments, operations and applications, it can simply take too long to execute. By the time you’ve had all that data scrubbed, normalized and cleaned, it might be too stale to bear any fruit. In simple terms: increased latency in getting those analytics can act as a show-stopper. This is the last thing you want on your plate in dynamic environments.
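To put the latency point in concrete terms, here is a minimal Python sketch. The function names, timestamps and freshness windows are hypothetical and only illustrate the arithmetic: data is useful only if its age at delivery still fits the window your decision needs.

```python
from datetime import datetime, timedelta


def age_at_delivery(extracted_at: datetime, loaded_at: datetime) -> timedelta:
    """How old the data already is when the ETL run hands it over."""
    return loaded_at - extracted_at


def is_still_useful(extracted_at: datetime, loaded_at: datetime, max_age: timedelta) -> bool:
    """True if the data is delivered within the freshness window (assumed business rule)."""
    return age_at_delivery(extracted_at, loaded_at) <= max_age


# Hypothetical run: extraction at midnight, load finished six hours later.
extracted = datetime(2024, 1, 1, 0, 0)
loaded = datetime(2024, 1, 1, 6, 0)

print(is_still_useful(extracted, loaded, max_age=timedelta(hours=1)))  # near-real-time need
print(is_still_useful(extracted, loaded, max_age=timedelta(days=1)))   # daily reporting need
```

The same six-hour run is fine for a daily report and a show-stopper for anything close to real time.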
Diving into Data Centricity
While data-drivenness relies heavily on culture and applications, data-centricity is, at its core, a predefined architecture that revolves around data (as stated in our previous post). Unlike data-driven practices, it has no direct dependencies on software and tools.
Data-centricity features development around your data, rather than building various data-models that fit a particular scope or application development initiative.
Differences between data-driven and data-centric models
Again, data-drivenness relies heavily on culture and applications, whereas data-centricity focuses on a predefined architecture or standardized model templates that catalogue and congregate any data you ingest. It focuses on development around your data model, rather than deploying various data models for a particular application or development initiative.
Some Use-Case scenarios
Let’s suppose you’re adding data-centricity to an insurance enterprise. By now, you have ingested petabytes of data about insurance officers, their portfolios and their main healthcare-provider network. If you’ve established a centralised data model, revolving around the persona rather than particular figures, useful patterns emerge for a variety of your apps within the organization: you can integrate a “persona” model and then map reusable attributes related to this model.
Instead of using a separate data silo for deploying a CRM, you can use the “persona” model to generate targets, convert them to leads and eventually turn those into closed opportunities, accounts and contacts (as per your sales management flow). Everything you need to link up those elements within your CRM is there, standardised and ready to be exploited to the benefit of your sales team. The same “persona” model is replicable for any node of your organisation to manage accounting, statistical analysis, investments, loss control ratios, agency relationships or legal issues.
The applications might vary, but the data model is the same, and if you need another one for a specific use case, you can always evolve your data strategy by developing a variety of scope-based models under the same data paradigm. Now, before getting tangled deep in the weeds, I must say though: adding data centricity to your enterprise does not necessarily mean using a single database or model. On the contrary, you build data centricity by combining models and various data under a convergent, integrated vision of the data, rather than centralising everything into a single model. The wider your scopes, the richer your vision. You might model your data to fit a use case, but always in a convergent way, making use of data governance, common sense and procedures that ensure integrity, accuracy and timeliness.
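The shared “persona” idea can be sketched in a few lines of Python. Everything here is an illustrative assumption rather than a prescribed schema: the `Persona` class, its fields and the two projection functions are hypothetical. The point is that a CRM and an accounting app read the same record instead of each keeping its own silo.

```python
from dataclasses import dataclass, field


@dataclass
class Persona:
    """Hypothetical shared model: one record per person, reused by every app."""
    persona_id: str
    full_name: str
    email: str
    role: str
    attributes: dict = field(default_factory=dict)  # reusable, scope-specific extras


def crm_lead(persona: Persona) -> dict:
    """CRM view: projects the shared persona into a sales target."""
    return {
        "lead_id": persona.persona_id,
        "name": persona.full_name,
        "contact": persona.email,
        "stage": "target",
    }


def accounting_entry(persona: Persona) -> dict:
    """Accounting view: the same persona, a different projection."""
    return {
        "account_holder": persona.persona_id,
        "name": persona.full_name,
        "portfolio_value": persona.attributes.get("portfolio_value", 0),
    }


officer = Persona(
    persona_id="p-001",
    full_name="Jane Doe",
    email="jane@example.com",
    role="insurance_officer",
    attributes={"portfolio_value": 125000},
)

print(crm_lead(officer))
print(accounting_entry(officer))
```

Both views carry the same identifier, so the apps stay linked without copying data between them. A scope-based model for another use case would simply be one more projection over the same core.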
Still confused? – That’s OK
“But wait – aren’t these principles of a data-driven enterprise?”
– this is yet another “Gordian Knot”. The principles might be similar, but the approach is closer to a technical perspective, especially when we’re talking data-centric architectures.
A data-driven enterprise can implement an “application-centric” approach: the exact opposite of data-centricity. This implies issues such as data replication for various scopes, delayed analytics, various department silos (or data partitions) and high maintenance costs.
A data-centric architecture focuses on a robust, permanent core: Data. It aims at solving all those app-centric issues and results in various benefits, such as:
- a “common language” for all of your organisational data;
- an optimized application and systems map;
- no data silos or partitions owned exclusively by an organisational unit;
- data owned exclusively by your organisation;
- convergent governance of organisational processes and data.
To tell data-drivenness and data-centricity apart, remember:
the latter makes any application ephemeral, with a limited life cycle, closely coupled to the application’s scope.
On the other hand, all of that Data remains there – always ready to serve yet another app scope.
It is worth mentioning, though, that running a data-driven organisation is a good step forward towards data centricity.
Evolving from data-driven to a data-centric organisation
As mentioned, running a data-driven enterprise is a huge step towards data centricity. When starting from scratch, you will have to design, implement, maintain and then supply a convergent data model to your organization. Starting your data-centric journey from a data-driven approach will accelerate the process. You already have the right data, the right metrics and the right habits in place. All that’s left to do is catalogue those. On the basis of these structures, defining a unified data model is easier than starting from scratch.
You would have to turn that model inside-out, starting from a technical perspective and working your way up to the business model. This will involve defining governance approaches based on metadata discovery. Next, focus on tagging policies for specific scopes, converge those tags into a Data Dictionary and establish data-oriented KPIs. Finally, compile all those elements into a data-centric environment. When you’re done, enrich and maintain your convergent data as you evolve.
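As a rough illustration of the tagging and Data Dictionary steps, here is a minimal Python sketch. The source systems, field names, tags and the coverage KPI are all hypothetical assumptions made for the example: discovered fields are grouped under a canonical tag, and a simple data-oriented KPI measures how many canonical terms every system maps to.

```python
from collections import defaultdict

# Hypothetical metadata discovered in two source systems, each field already tagged.
discovered_fields = [
    {"system": "crm", "field": "cust_email", "tag": "persona.email"},
    {"system": "billing", "field": "EMAIL_ADDR", "tag": "persona.email"},
    {"system": "crm", "field": "cust_name", "tag": "persona.full_name"},
    {"system": "billing", "field": "premium_amt", "tag": "policy.premium"},
]


def build_data_dictionary(fields):
    """Group source fields under the canonical tag they were assigned."""
    dictionary = defaultdict(list)
    for f in fields:
        dictionary[f["tag"]].append(f"{f['system']}.{f['field']}")
    return dict(dictionary)


def coverage_kpi(dictionary, systems):
    """Share of canonical terms that every listed system maps to (an assumed KPI)."""
    if not dictionary:
        return 0.0
    required = set(systems)
    covered = sum(
        1
        for sources in dictionary.values()
        if required <= {source.split(".", 1)[0] for source in sources}
    )
    return covered / len(dictionary)


data_dictionary = build_data_dictionary(discovered_fields)
for term, sources in data_dictionary.items():
    print(term, "<-", sources)
print("coverage:", coverage_kpi(data_dictionary, ["crm", "billing"]))
```

In this toy data only `persona.email` is mapped by both systems, so the KPI points straight at the terms that still need convergence.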
Don’t forget though:
whatever data project, initiative or auxiliary model you start, it must follow the unified core of your data-centric architecture.
From this point on, this core is no longer yet another system you’ve deployed. Now, it’s a business asset, hence anything that relates to it should have proper KPIs.
Of course, there’s more – but hey, this is a vs piece. We’ll elaborate on specific evolutive mechanisms sometime in the future. Until then, here’s your main handout:
Establishing an environment or building data-centric architectures starts with nurturing a data-driven culture.
If you got here, you’re a tough one. Congrats! I hope the data-driven / data-centric nonsense has phased out and we can now steadily move towards more data essentials. We understand this topic feels crammed into under 2000 words, but we promise there’s more on this. Think of it as an “all you can eat” data-related buffet we’re serving on this blog. We have lots of goodies scheduled to land on the table.