Ending Data Co-Dependence; Acquisition

Chapter 1

The IQVIA Healthcare Center of Excellence is part consulting firm, part advisory service and part relationship counselor. We help organizations realize that they don’t have to be in an endless cycle of fights with their critical data. We can help you learn to harmoniously acquiRe, Enrich, goVern, intEgrate, pRovision and visualizE (REVERE – get it?) your data in a way that will not only change how you think about data management, but make you love your data again.

Over the coming weeks we will cover the evolution of data management in a series of blogs that share our insights from more than 4,000 projects and more than 1,000 customers.

Don’t break up with your data. Let IQVIA’s Healthcare Center of Excellence help you make something beautiful out of your data.



Acquisition – let’s get physical.

Today, we’ll talk about the data provisioning pipeline and data acquisition.

When we hear ‘data provisioning’ in the healthcare industry, the term is rarely used in a way that means something, such as a description of a process that has some teeth. Provisioning isn’t just a smash-and-grab exercise. Provisioning is an art form. Not in a pompous and inaccessible way, but like a delicious pasta sauce: a recipe originally conceived through trial and error and then passed down, often by oral history, from one generation to the next, with improvements (in both process and technology) that tend to get better documented over time.

One can see how this mirrors our experience with data management in healthcare. What often starts as an undocumented manual process on someone’s desktop, built to corral some wild data from a feral spreadsheet, somehow ends up as a production process that still lives on someone’s desktop. Eventually we create a provisioning pipeline for that data that includes better acquisition strategies in phases of increasing sophistication. Unfortunately, we usually don’t get to this evolved state until our data causes real problems. The path goes something like this:


  • moving the data off a desktop to a file share
  • or better yet, the original source of the data
  • and then perhaps ultimately to a streaming or real-time source


  • Initially, finding a way to standardize acquisition from the source
  • then automating the data transfer
  • improving timeliness
  • then perhaps standardizing the process to include stepwise processing, error handling and alerting
  • ultimately driving toward a monitored, redundant, managed process with 99% uptime

Quality Improvement

  • Initially, we just want to make sure the file arrived
  • then we move to some basic quality exercises (e.g., file naming conventions and completeness)
  • possibly implementing proactive monitoring, intelligent data quality validation, and file- and field-level variance tracking
  • ultimately a partner engagement process for driving quality improvement back uphill to the system owner
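
To make the first few rungs of that ladder concrete, here is a minimal Python sketch of arrival, naming, completeness and file-level variance checks on a delivered CSV file. Everything specific in it is an assumption made for illustration: the `check_file` function, the `<source>_<YYYYMMDD>.csv` naming convention, and the 20% row-count threshold are hypothetical, not a description of any particular product or process.

```python
import csv
import re
from pathlib import Path

# Hypothetical naming convention: <source>_<YYYYMMDD>.csv
NAME_PATTERN = re.compile(r"^[a-z0-9]+_\d{8}\.csv$")


def check_file(path, expected_columns, previous_row_count=None, max_variance=0.2):
    """Return a list of findings; an empty list means every check passed."""
    path = Path(path)

    # 1. Did the file arrive at all?
    if not path.is_file():
        return [f"missing file: {path.name}"]

    findings = []

    # 2. Does the name follow the (assumed) convention?
    if not NAME_PATTERN.match(path.name):
        findings.append(f"bad file name: {path.name}")

    # 3. Completeness: expected columns present and at least one data row.
    with path.open(newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        row_count = sum(1 for _ in reader)
    missing = [col for col in expected_columns if col not in header]
    if missing:
        findings.append("missing columns: " + ", ".join(missing))
    if row_count == 0:
        findings.append("no data rows")

    # 4. File-level variance: compare the row count with the previous delivery.
    if previous_row_count:
        change = abs(row_count - previous_row_count) / previous_row_count
        if change > max_variance:
            findings.append(f"row count changed by {change:.0%}")

    return findings
```

A scheduler could call `check_file` for each expected delivery and route any non-empty result to whoever owns the relationship with the source system, which is where the partner engagement step begins.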


  • It starts with a basic inventory followed by an exasperated question: ‘What the heck is this data and where did it come from?’
  • on to instructions – how do I do what you did?
  • then basic metadata about the source including a data dictionary, business context, ownership, process, and sizing
  • finally, more advanced things like automated data domain analysis, relative quality, data element sensitivity and security/exposure risk, social context (e.g., who’s using the data and for what?)
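
One way to picture that progression is as a single record per source that fills in over time. The Python sketch below is illustrative only: the `SourceRecord` class, its field names and the four stage labels are hypothetical names chosen to mirror the list above, not an established metadata standard.

```python
from dataclasses import dataclass, field


@dataclass
class SourceRecord:
    """Hypothetical catalog entry for one data source."""
    name: str                      # basic inventory: what is it?
    origin: str                    # basic inventory: where did it come from?
    instructions: str = ""         # how do I do what you did?
    owner: str = ""
    business_context: str = ""
    data_dictionary: dict[str, str] = field(default_factory=dict)  # column -> meaning
    approx_rows: int = 0           # sizing
    sensitivity_reviewed: bool = False
    consumers: list[str] = field(default_factory=list)  # who uses it, and for what

    def documentation_stage(self) -> str:
        """Return the furthest stage reached without skipping an earlier one."""
        if not self.instructions:
            return "inventoried"
        if not (self.owner and self.business_context and self.data_dictionary):
            return "reproducible"
        if not (self.sensitivity_reviewed and self.consumers):
            return "described"
        return "profiled"
```

For example, `SourceRecord("claims", "payer SFTP drop").documentation_stage()` returns `"inventoried"`, which is an honest answer for most data that has just been discovered on a desktop.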

Taming wild data

We have all experienced pieces of this cycle – from data in the wild to a tamed data source that doesn’t bite your hand when you try to feed it. However, the process of acquisition is:

  • often unregimented and relegated to legacy tools
  • driven by noise and purely tactical needs instead of strategic improvement
  • not seen as the first, and largest, opportunity to improve the provisioning cycle
  • not designed with an eye to what we will ultimately use the data for – how it will be visualized and integrated

That last point – designing data acquisition for both storage and usage – gets lost in the ‘land everything’ mentality. We are starting to see a shift away from the mindless, ragged bulk ingestion that was a signature trait of the Hadoop years. The move to the cloud, edge computing, legal requirements (e.g., GDPR), semantic layer variability (e.g., self-service tools) and other transitions are driving a more thoughtful acquisition strategy. This means:

  • survey everything
  • land what I need and can use
  • acquire dynamically and variably based on usage, relative value and risk
  • move only what is necessary
  • design for self-service and bring-your-own-tools (BYOT) analytics
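
As a rough illustration of acquiring ‘dynamically and variably’, the Python sketch below turns usage, relative value and risk into an acquisition decision for each surveyed source. The `Candidate` fields, the 1-to-5 scales, the `min_queries` threshold and the four outcomes are all assumptions invented for this example. A real policy would be set by your own governance process.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical survey result for one data source."""
    name: str
    monthly_queries: int  # observed or expected usage
    value: int            # relative business value, 1 (low) to 5 (high)
    risk: int             # exposure risk, 1 (low) to 5 (high)


def acquisition_plan(candidate: Candidate, min_queries: int = 10) -> str:
    """Pick the lightest acquisition approach that still serves the usage."""
    # Survey everything, but do not move data nobody uses.
    if candidate.monthly_queries == 0:
        return "survey only"
    # When risk outweighs value, leave the data at its source.
    if candidate.risk > candidate.value:
        return "query in place"
    # Light usage does not justify a standing copy.
    if candidate.monthly_queries < min_queries:
        return "acquire on demand"
    # Heavily used, valuable, acceptable risk: land it.
    return "land on a schedule"
```

The point of the sketch is the ordering of the questions: usage first, then risk against value, and only then whether the data deserves a permanent landing spot.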

Data acquisition

In summary, we need to think about data acquisition as an active, engaged process where analytics design begins. Effective data capture starts with deeper thought about how the data is used, by whom, in combination with what other data, and what that guided analytical journey might look like. Data Governance and Data Quality are your friends, and they can help you mend your relationship with your data.

Thanks for reading.


Next: Enrichment – your data is more than it appears to be

In the next blog, we’ll talk about negative and positive data quality, real data enrichment/evolution, and creating more with less. Until next time.