Test Data Management Process. Aka The “Test Data Lifecycle”
Test Data Management means different things to different people. However, there is a common thread that is independent of your tools, the supporting data security solutions, or the end goal of Software Testing.
Below we provide a list of the key lifecycle steps to ensure successful Test Data Management:
Define your Test Data Requirements
To ensure that your software testing & test data management process is effective, start by defining clear criteria for what data should be included in testing. This includes factors such as the business entities required to cover testing scenarios, the volume of data needed, its sources, and how often it needs to be refreshed. Using an automated data catalog can help you inventory and classify test data assets, as well as map out information supply chains. By taking these steps, you can make sure that your test data is meeting all the requirements of your software testing community.
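As a minimal sketch of how such criteria could be written down, the Python record below captures the entity, volume, sources, and refresh interval for one requirement. The class name `DataRequirement`, its fields, and the example values are illustrative assumptions, not part of any particular TDM tool.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class DataRequirement:
    """Hypothetical record describing one test data requirement."""
    entity: str               # business entity, e.g. "customer"
    min_rows: int             # volume needed to cover the scenarios
    sources: tuple            # systems the data is drawn from
    refresh_every_days: int   # how often the data must be refreshed

    def needs_refresh(self, last_refreshed: date, today: date) -> bool:
        # Stale once the refresh interval has fully elapsed.
        return today - last_refreshed >= timedelta(days=self.refresh_every_days)


# Example values are invented for illustration.
customer_req = DataRequirement(
    entity="customer",
    min_rows=5000,
    sources=("crm", "billing"),
    refresh_every_days=7,
)
```

Keeping requirements in a structured form like this makes it possible to check them automatically, for example to flag a data set whose refresh is overdue.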
Understand your Data
It is important to understand the structure and potential risks* of your test data to effectively manage it.
*Where sensitive data resides.
Good test data management practices include understanding your data, developing methods to profile your data, and using this information to support activities such as test and data engineering and data security practices that will help avoid data breaches.
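Profiling can start very simply. The sketch below scans sample rows and reports which columns contain values that look like sensitive data. The two patterns (email address and US-style social security number) and the function name `profile_columns` are assumptions chosen for illustration; a real profile would cover many more categories and data types.

```python
import re

# Illustrative patterns only; real profiling needs a much broader set.
PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "us_ssn": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
}


def profile_columns(rows):
    """Return {column: {category, ...}} for columns holding sensitive-looking values."""
    findings = {}
    for row in rows:
        for column, value in row.items():
            if not isinstance(value, str):
                continue  # this sketch only inspects text values
            for label, pattern in PATTERNS.items():
                if pattern.match(value):
                    findings.setdefault(column, set()).add(label)
    return findings


sample = [
    {"id": 1, "contact": "jane@example.com", "tax_id": "123-45-6789"},
    {"id": 2, "contact": "n/a", "tax_id": "unknown"},
]
report = profile_columns(sample)
```

Here `report` flags the `contact` and `tax_id` columns, which tells you where masking will be required later in the lifecycle.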
Extract Data
Extracting source data, usually production data, for use in testing can be a difficult and time-consuming process, especially if the required data is spread across many different systems and data sources. A test data management (TDM) system can help simplify and automate this process by integrating with production data systems and extracting test data according to predefined rules. A good TDM system should be adaptable, easy to sync with production data sources, and capable of rolling back the test data on demand.
Note: This extraction step should also consider re-syncing test data.
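To illustrate what extraction "according to predefined rules" can look like, here is a small sketch that selects a subset of customers with a rule and then pulls only the orders that belong to them, so the extract stays referentially consistent. The table shapes, field names, and the function `extract_subset` are hypothetical.

```python
def extract_subset(customers, orders, rule):
    """Select customers matching `rule`, plus only the orders that reference them."""
    selected = [c for c in customers if rule(c)]
    selected_ids = {c["id"] for c in selected}
    related = [o for o in orders if o["customer_id"] in selected_ids]
    return selected, related


# Invented sample data standing in for two source tables.
customers = [
    {"id": 1, "country": "AU"},
    {"id": 2, "country": "NZ"},
    {"id": 3, "country": "AU"},
]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 2},
    {"order_id": 12, "customer_id": 3},
]

# Predefined rule: only Australian customers are needed for this test cycle.
subset_customers, subset_orders = extract_subset(
    customers, orders, rule=lambda c: c["country"] == "AU"
)
```

Re-running the same rule against refreshed source data is one simple way to re-sync the extract.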
One of the key aspects of an effective test data management strategy is the ability to quickly and easily roll back the test data that was used for a specific test or purpose. This form of “data maintenance” is essential to avoid impacting other tests that are currently being conducted. An adaptable, easy-to-sync test data management tool or method is essential for this purpose.
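One lightweight way to get this kind of rollback, sketched here with Python's built-in `sqlite3` module and a SQL savepoint, is to wrap a test's changes so they are undone when the test finishes. The helper name `rollback_after` and the table are assumptions for illustration; dedicated TDM tools typically use snapshots or clones instead.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def rollback_after(conn, name="test_data_snapshot"):
    """Undo every change made inside the `with` block."""
    conn.execute(f"SAVEPOINT {name}")
    try:
        yield conn
    finally:
        conn.execute(f"ROLLBACK TO {name}")  # restore the pre-test state
        conn.execute(f"RELEASE {name}")      # discard the savepoint itself


# isolation_level=None disables the module's implicit transactions,
# so the savepoint statements are fully under our control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO customer VALUES (1, 'active')")

with rollback_after(conn):
    conn.execute("UPDATE customer SET status = 'closed' WHERE id = 1")

status_after = conn.execute(
    "SELECT status FROM customer WHERE id = 1"
).fetchone()[0]
```

After the block exits, `status_after` is back to the original value, so the next test sees unmodified data.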
Transform
Test Data Masking
When it comes to managing test data, ensuring adequate privacy and data security is essential. That is, you need to leverage test data masking tools to protect the test data, and ensure you are securing sensitive* data along the way.
*For example, Personally Identifiable Information (PII).
A well-defined data masking process will help to meet data compliance and data security requirements while maintaining the data’s integrity through Deterministic data masking.
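Deterministic data masking involves substituting a given column value with the same masked value every time it occurs, regardless of whether it’s in the same row, table, database/schema, or across different instances, servers, or database types. This consistency keeps joins and cross-system references intact after masking.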
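A minimal sketch of the idea, assuming a keyed hash (HMAC-SHA256) as the masking function: the same input always yields the same output, so a customer number masked in one table still matches the same customer number masked in another. The function name, the key, and the output length are illustrative assumptions, and commercial masking tools may use different techniques such as lookup tables or format-preserving encryption.

```python
import hashlib
import hmac

# Assumption: in practice the key would come from a secrets manager.
SECRET_KEY = b"replace-with-a-managed-secret"


def mask_value(value: str, key: bytes = SECRET_KEY, length: int = 12) -> str:
    """Deterministically replace `value` with a keyed-hash token."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    # Truncating keeps tokens short; collisions become possible but unlikely.
    return digest[:length]


# The same source value masks identically wherever it appears.
crm_token = mask_value("CUST-000123")
billing_token = mask_value("CUST-000123")
other_token = mask_value("CUST-000124")
```

Because `crm_token` and `billing_token` are equal, a join between the masked CRM and billing data still works, while the original identifier is no longer visible.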
Tip: A primary benefit of extracted & masked production data is that it delivers “realistic test data” and potentially simplifies test data preparation.
Fabricate Fake Test Data
One way to prepare test data is to fabricate it, which is often a quick form of test data preparation. This can be done by using a data fabrication solution to create test data based on real production guidelines. As such, your process should also include a means to easily generate synthetic data. This will give the test team a sufficient volume of test data for testing.
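As a rough sketch of fabrication, the snippet below generates synthetic customer records using only Python's standard library. The field names, value ranges, and the function `fabricate_customers` are invented for illustration; in a real process they would be derived from the production guidelines mentioned above.

```python
import random

FIRST_NAMES = ["Alex", "Sam", "Jordan", "Taylor", "Morgan"]
LAST_NAMES = ["Lee", "Patel", "Garcia", "Nguyen", "Smith"]


def fabricate_customers(count, seed=0):
    """Generate `count` synthetic customers; a fixed seed makes runs repeatable."""
    rng = random.Random(seed)
    customers = []
    for i in range(1, count + 1):
        customers.append({
            "id": i,
            "name": f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}",
            # A reserved test domain avoids producing real addresses.
            "email": f"user{i}@example.test",
            "balance": round(rng.uniform(0, 10_000), 2),
        })
    return customers


batch = fabricate_customers(100, seed=42)
```

Seeding the generator means a failing test can be reproduced against exactly the same fabricated data.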
Tip: Fake data, which avoids using production data, also avoids data security challenges such as the use of sensitive customer data and the inherent risk of a data breach.
Load Test Data
Provision of the Test Data
The process of making test data available in the target development and test environments is known as provisioning. This usually entails moving test data (the masked and/or fabricated data) from several source systems into the target environment, perhaps by using legacy backup & restore methods or more contemporary snapshotting & cloning techniques, all of which may be considered for inclusion in your CI/CD frameworks.
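The backup & restore style of provisioning can be sketched with Python's built-in `sqlite3` module, whose `Connection.backup` method copies one database into another. The in-memory databases, table, and the function name `provision` are stand-ins chosen for illustration; a real pipeline would target actual test environments and is likely to use the platform's own cloning tools.

```python
import sqlite3


def provision(source: sqlite3.Connection, target: sqlite3.Connection) -> None:
    """Copy the prepared (masked and/or fabricated) database into the target."""
    source.backup(target)


# Staging area holding data that has already been masked.
staging = sqlite3.connect(":memory:")
staging.execute("CREATE TABLE customer (id INTEGER, name TEXT)")
staging.execute("INSERT INTO customer VALUES (1, 'masked-3fa1')")
staging.commit()

# Target test environment, initially empty.
test_env = sqlite3.connect(":memory:")
provision(staging, test_env)

provisioned_rows = test_env.execute("SELECT id, name FROM customer").fetchall()
```

A step like this is easy to call from a CI/CD job so that each pipeline run starts from a freshly provisioned data set.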
Validate Test Data
Use methods to continually ensure that the test data deployed to Non-Production is secure and compliant with Information Privacy Regulations. This ensures that your organization is deploying valid test data and that its data security operations are protecting your customers.
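One simple validation, sketched below, is a leak check: compare a sensitive production column with the corresponding column in the test data and report any value that survived unmasked. The function name `find_unmasked` and the sample values are hypothetical, and the sketch assumes the check runs inside the secure zone where the production values are legitimately available.

```python
def find_unmasked(production_values, test_values):
    """Return production values that appear verbatim in the test data."""
    return sorted(set(production_values) & set(test_values))


# Invented sample data: one address slipped through unmasked.
production_emails = ["alice@corp.example", "bob@corp.example"]
test_emails = ["u_91ac@example.test", "bob@corp.example"]

leaks = find_unmasked(production_emails, test_emails)
```

A non-empty `leaks` list is a signal to block the deployment and rerun masking before the data reaches Non-Production.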
Test Data Mining
The Test Data Management process ensures that developers and testers can access the data they need quickly. This is achieved by implementing agile test data mining (or viewing) capabilities. This capability allows for greater efficiency and productivity among teams.
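At its simplest, a mining capability is a search over the available test data by the attributes a scenario needs. The sketch below shows that idea with an in-memory list; the function `find_records` and the record fields are assumptions for illustration, and a real tool would query the test databases directly.

```python
def find_records(rows, **criteria):
    """Return rows whose fields match every given criterion."""
    return [
        row for row in rows
        if all(row.get(field) == wanted for field, wanted in criteria.items())
    ]


# Invented pool of test data.
pool = [
    {"id": 1, "country": "AU", "status": "active"},
    {"id": 2, "country": "NZ", "status": "active"},
    {"id": 3, "country": "AU", "status": "closed"},
]

# A tester needs an active Australian customer for a scenario.
matches = find_records(pool, country="AU", status="active")
```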
Test Data Bookings
To avoid risks associated with test data contention and test data-related defects, establish methods for test data booking (or test data reservation). These reservation methods allow developers and testers to safely share environments and avoid having to redo work due to accidental collisions.
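A booking mechanism can be as simple as a registry that records who currently holds each test record. The sketch below is a single-process, in-memory illustration; the class `BookingRegistry` and its method names are hypothetical, and a shared team setup would need a central store with proper locking.

```python
class BookingRegistry:
    """Tracks which tester or test suite has reserved each test record."""

    def __init__(self):
        self._owners = {}

    def reserve(self, record_id, owner):
        """Reserve a record; returns False if someone else already holds it."""
        current = self._owners.setdefault(record_id, owner)
        return current == owner

    def release(self, record_id, owner):
        """Release a record, but only if `owner` is the one holding it."""
        if self._owners.get(record_id) == owner:
            del self._owners[record_id]


registry = BookingRegistry()
first_attempt = registry.reserve("customer-1", "alice")   # alice gets the record
second_attempt = registry.reserve("customer-1", "bob")    # bob is refused
registry.release("customer-1", "alice")
third_attempt = registry.reserve("customer-1", "bob")     # now bob succeeds
```

Checking the return value of `reserve` before a test modifies a record is what prevents two testers from colliding on the same data.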
In Summary
Overall, an effective test data management process should include steps to define your test data needs, extract data from sources, mask or obfuscate test data, generate test data, provision test data, validate test data, and easily access the test data. By taking these steps, you can ensure that your test data is adequate for testing purposes and meets all compliance and security requirements.