What is Test Data Management?
Test data management (TDM), a sister function of Test Environment Management, is the construction of test data sets that reliably represent an organization’s actual data so that IT teams (developers and testers) can test software effectively. Test types include Unit Testing, System Testing, Integration Testing, User Acceptance Testing, Performance Testing, and Security Testing.
Image based on the Holistic Test Data Management Framework
Test Data Management Functions
Test Data Management is a broad subject that covers many facets, often including:
- Test Data Requirements Gathering. Gathering the needs of software testing, i.e. identifying the data quality needed to build your test scenarios.
- Data Profiling / Data Discovery. Understanding your Production Data. This is critical if you wish to understand what valid test data looks like, identify where personally identifiable information hides, and understand data security risks and how to avoid data breaches.
- Data Extraction (often from Production data). Note: One benefit of using production data as a base is that it helps ensure realistic test data.
- Data Transformation
+ Data Sub-setting. That is, condensing or preparing test data so that it is both smaller & easier to handle.
+ Data Masking / Data Privatization. Protect customer PII by obfuscating sensitive data, e.g. sensitive customer data. Note: Masking usually works on a copy of the original sensitive data, i.e. production data. By masking data, for example through data encryption, you implement data privacy and reduce the opportunity for a data breach.
- Test Data Provisioning i.e. the deployment of test data.
- Data Cloning aka Data Virtualization (the concept of snapshotting & deploying "tiny" replica databases). Note: Data cloning is rapidly displacing the need for test data subsetting. Cloning can also be used beyond software testing and against real data.
- Data Fabrication / Data Synthetics. A method of test data preparation that uses a data generation tool to create synthetic data (fake test data) from scratch.
- DataOps (any data operation that will help orchestrate data activities e.g. export, import, snapshot).
- Test Data Mining. Methods that ease data access & help software testing teams find sample data for testing. Note: Data mining can also reduce the opportunity for data breaches by limiting what can be mined.
- Test Data Bookings. Another method to simplify data access is Test Data Booking Management. Test Data Reservation methods help testing teams reserve test data to avoid test data contention.
- Test Data Reporting e.g. Compliance, Size, Usage, Performance, etc.
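To make three of the functions above more concrete (sub-setting, masking, and fabrication), here is a minimal Python sketch. It is an illustration under stated assumptions, not how any particular TDM tool works: the function names, the `MASKING_KEY` value, the row layout, and the `example.test` addresses are all hypothetical, and keyed hashing is used here as just one possible masking technique.

```python
import hashlib
import hmac
import random

# Hypothetical key for this sketch only; a real setup would load it from a secrets store.
MASKING_KEY = b"example-masking-key"


def subset(rows, fraction, seed=0):
    """Data sub-setting: keep a reproducible random sample of the rows."""
    rng = random.Random(seed)
    keep = min(len(rows), max(1, round(len(rows) * fraction)))
    return rng.sample(rows, keep)


def mask_email(email):
    """Data masking: replace an email with a deterministic pseudonym.

    The same input always yields the same output, so records that share
    an address still match after masking, while the original address
    never appears in the test data.
    """
    digest = hmac.new(
        MASKING_KEY, email.lower().encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return f"user_{digest[:12]}@example.test"


def fabricate_customers(count, seed=0):
    """Data fabrication: build synthetic customer rows from scratch."""
    rng = random.Random(seed)
    return [
        {
            "id": i,
            "email": f"synthetic{i}@example.test",
            "balance": rng.randint(0, 10_000),
        }
        for i in range(1, count + 1)
    ]


# Made-up stand-in for an extract of production data.
production_like = [
    {"id": 1, "email": "Alice@corp.example", "balance": 120},
    {"id": 2, "email": "bob@corp.example", "balance": 4500},
    {"id": 3, "email": "carol@corp.example", "balance": 0},
    {"id": 4, "email": "dave@corp.example", "balance": 980},
]

# Subset first, then mask the sensitive column in the smaller copy.
masked_subset = [
    {**row, "email": mask_email(row["email"])}
    for row in subset(production_like, fraction=0.5)
]

# Fully synthetic rows need no masking because they contain no real data.
synthetic = fabricate_customers(3)
```

Fixing the random seed keeps the subset and the synthetic rows reproducible between test runs, which tends to make failures easier to investigate. Whether deterministic pseudonyms are acceptable depends on your own privacy requirements.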
Test Data Management Benefits
Implementing TDM best practices, and avoiding common Test Data Management Anti-Patterns, brings several benefits.
Here are a few:
- Software Development & DevOps Productivity through timely and fit-for-purpose data.
- Quality Engineering, i.e. effective testing using "real data". Using a production-like enterprise data set tends to mean fewer production defects.
- Reduced Risk & Improved Compliance with Data Privacy & Data Security Regulations.