Provisioning data for non-production in the healthcare sector: data masking or data creation?

Creating Usable and Compliant Test Data Using Data Creation

Your organization’s test data management initiatives are likely to focus on provisioning compliant test or development data, and the management of this data. Many healthcare organizations are looking to share and manipulate production-like data across their test and development teams in highly complex environments.

Recent regulatory initiatives worldwide mean that organizations can no longer use live data to create non-production environments. At the same time, many organizations want to reduce complexity, cut storage costs and increase productivity, and so are looking for alternatives to using and manipulating copies of large production databases in non-production environments.

Grid-Tools offers the flexibility to create data that models the referential and relational integrity of production environments, to create referentially correct, secure subset databases, or to mask and anonymize production data, providing your organization with rich, compliant data sets. The result is less data, but more variety.

In the past, data masking was seen as an adequate way to de-identify production data and provision it for testing. For the healthcare sector, however, this is an unsatisfactory solution. PHI (Protected Health Information, see below) is so complex that it often contains too many potential identifiers to mask patients adequately (e.g. the type or history of a disease often correlates strongly with age or gender). Masking data of this complexity sufficiently for a non-production environment often compromises the relational integrity that the ‘live’ data had, which makes the masked data useless for good testing. This raises the question: how do we provision useful, compliant test data for the healthcare sector if we don’t use data masking?

The answer is data generation! Grid-Tools offers a flexible range of solutions, including the award-winning tool Datamaker™, which generates data that models the referential and relational integrity of production environments. Using data generation tools can ensure compliance with regulatory initiatives by:

  • Generating data from ‘scratch’ – no need to hack or edit production data at all
  • Integrating data from a range of sources, from mainframes to spreadsheets, into one database
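
To make the idea of generating data from ‘scratch’ while preserving referential integrity more concrete, here is a minimal Python sketch. It is not Datamaker™ and does not reflect its API; the table layout, the field names, the placeholder diagnosis codes and both function names are assumptions made purely for illustration. It builds a parent table of patients and a child table of visits, where every visit refers to a patient that exists, without reading any production record.

```python
import random
from datetime import date, timedelta

# Placeholder codes invented for this sketch; they are not real clinical codes.
DIAGNOSIS_CODES = ["DX-001", "DX-002", "DX-003", "DX-004"]


def generate_test_data(n_patients, max_visits_per_patient=3, seed=0):
    """Generate synthetic patients and visits from scratch.

    Hypothetical helper: every visit row carries the patient_id of a
    patient row created in the same run, so the foreign key always resolves.
    """
    rng = random.Random(seed)  # a fixed seed makes the data set reproducible
    patients = []
    visits = []
    for patient_id in range(1, n_patients + 1):
        patients.append({
            "patient_id": patient_id,
            "birth_year": rng.randint(1930, 2015),
            "gender": rng.choice(["F", "M"]),
        })
        # Child rows are created only for a parent that already exists.
        for _ in range(rng.randint(0, max_visits_per_patient)):
            visits.append({
                "visit_id": len(visits) + 1,
                "patient_id": patient_id,
                "visit_date": date(2020, 1, 1) + timedelta(days=rng.randrange(365)),
                "diagnosis_code": rng.choice(DIAGNOSIS_CODES),
            })
    return patients, visits


def orphaned_visits(patients, visits):
    """Return the visits whose patient_id matches no patient (should be none)."""
    known_ids = {p["patient_id"] for p in patients}
    return [v for v in visits if v["patient_id"] not in known_ids]


if __name__ == "__main__":
    patients, visits = generate_test_data(n_patients=5, seed=42)
    print(len(patients), "patients,", len(visits), "visits")
    print("orphaned visits:", orphaned_visits(patients, visits))
```

Because the generator is seeded, the same call reproduces the same data set, which fits the idea of re-usable test data. A real tool would also need to model realistic value distributions and far richer schemas, which this sketch does not attempt.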

Improving Efficiency in Data Management

Using production data is not only non-compliant but also costly and time-consuming. Generating data with a tool such as Datamaker™ can greatly improve the efficiency of your data management.

  • Generating data from ‘scratch’ means no more developers hacking and re-coding to mask data
  • Referential and relational integrity is maintained, shortening test and development cycles
  • Less, but richer data is generated, increasing possible test scenarios and reducing disk space
  • Manual input is removed, leading to greater compliance and fewer errors in development
  • Data is re-usable: it can be stored for future testing, migrated across versions and upgraded