Test Environment Management Tools Compared

Five years ago, if you were asked to recommend a “Test Environment Management” platforms you might have struggled.  In fact, you might have struggle to identify one, particularly if you would have considered your own DevTest teams’ behaviour. Lot of disruption, delays, misconfiguration and the inevitable use of Spreadsheets for tracking project bookings, MS Visio document for system information capture, Email for Reporting and perhaps if you were lucky, some test automation for platform health checks. Not exactly elegant nor scalable but undoubtedly better than complete chaos.

However, things have somewhat changed and with a raft of solutions now claiming to solve this problem, The Last Frontier of the SDLC, the question now is not “what” but “which” platform will meet our needs and address one of the SDLC’s biggest “Waste Areas”?

At TEM Dot we decided to compare six of the biggest players in this space across 10 key areas:

Key TEM Vendors

Key TEM Performance Areas

  1. Modelling
  2. Booking Management
  3. Coordination
  4. Ticketing
  5. Health Monitoring
  6. Automation & DevOps
  7. Data Management
  8. Reporting
  9. Extensibility
  10. Affordability

Test Environment Management Tool Scoring

Area-1 Environment Modelling

The ability to know what your Environments and Systems look like.

Historically think Visio or your CMDB (if you have one).

Gold Medal Position:   

Enov8 & ServiceNow both offer powerful Visual CMDBs & Component / discovery mapping.

Silver:

Plutora & Xebia offer modelling capability.

Bronze:

Apwide & Omnium modelling is achieved via tabular forms.

Area-2 Booking & Contention Management

The ability to capture environment requirements & manage contention on Environments & Systems.

Historically think Email & an attached Word document.

Gold Medal Position:   

Enov8 & Plutora offer advanced booking & contention analysis methods.

Silver:

Apwide, ServiceNow offer booking requests (ref ticketing) capability.

Bronze:

Xebia has no obvious environment booking or contention mechanism.

Area-3 Environment Coordination

Tracking Events & Release activity across space (Environments) & time (Month, Year etc).

Historically think a MS Project Plans.

Gold Medal Position:   

Apwide, Enov8, Plutora, ServiceNow offer Environment & Release based calendaring.

Note: Enov8 & Plutora offer Runsheets /Implementation Plans (respectively).

Service Now offers checklists.

Silver:

Xebia – Calendaring is release centric (opposed to environment centric).

Bronze:

Omnium (limited capability identified).

Area-4 Ticketing

Ticketing / IT Service Management to capture Environment Change Requests Incidents etc.

Historically think Remedy.

Gold Medal Position:   

ServiceNow has advanced ITSM methods.

Silver:

Apwide (using Jira), Enov8, Plutora have solid Ticketing / Requests functionality.

Bronze:

Omnium & Xebia dependent on other tools.

Area-5 Health Monitoring

The ability check Systems or Components or Interfaces are up.

Historically think Test Automation scripting or your server monitoring solutions like Zabbix.

Gold Medal Position:   

Enov8 & ServiceNow offer integration methods & native agents to monitor health.

Silver

Apwide & Plutora have APIs that logically allow system health updates.

Bronze:

Omnium & Xebia don’t play in this space.

Area-6 Automation & DevOps

The ability to automate key Environment Operations using code.

Think Jenkins or Puppet Jobs.

Gold Medal Position:   

Xebia is a powerful release orchestrator (its primary purpose).

Silver:

ServiceNow Orchestration automates IT & Business Processes.

Enov8 offers “agnostic” Scripting Hub (Orchestration Manager), Pipelines, Playbook, Webhooks & URL Triggers.

Bronze:

Apwide integration is very simple but can be achieved with Get/Post methods.

Plutora needs other tools to automate/integrate properly (like Dell Boomi). The SaaS only option can also be limiting.

Omnium integrates with other tools to automate.

Area-7 Data Management

The ability to manage one’s data e.g. Extract Data, Masking data, Provisioning Data etc.

Think Compuware File-Aid.

Gold Medal Position:   

Enov8 seems to be the only solution for Test Data. Enov8 offers support for Data (PII/Risk) Profiling & Masking and Data Bookings through “Data Compliance Suite”. Enov8’s Visual Orchestrate can also be used to schedule other Data Tools.

Silver:

Xebia & ServiceNow capabilities are limited but they can leverage their orchestrators and call other tools.

Bronze:

Apwide, Omnium & Plutora don’t appear to play in this space.

Area-8 Reporting

The ability to get & share insights about your Environments.

Historically think drawing pretty pictures & graphs with PowerPoint.

Gold Medal Position:   

A lot of the tools have solid reporting; however, focus is Environments: No Gold Medal yet.

Silver:

Enov8 seem to have best out-of-box Environment dashboards. Needs simpler customization.

ServiceNow Env Dashboard are limited but ultimately extensible.

Xebia have some solid report, but more deployment focused.

Plutora is reliant on a new “Tableau” extension. Getting there but seems disjoint.

Bronze:

Apwide leverages Jira’s native capabilities.

Omnium approach is somewhat “download/export” focused.

Area-9 Extensibility

The ability to have the product do whatever you want.

Think of Salesforce or SAP.

Gold Medal Position:   

ServiceNow – An Extensible Engine. You can use it to build anything.

Enov8 – An “Object Oriented” Extensible Engine. You can use it to build anything.

Silver:

Plutora has broad customization features so you can “partially” alter its behaviour.

Bronze:

Xebia allows customization of your processes but not the platform itself.

With Apwide & Omnium you basically get what you get.

Area-10 Cost

The money ball question. And potentially the most important for some.

Gold Medal Position:   

Low Cost of Entry – Apwide, Enov8 (Free Team Edition) & Omnium

Silver:

Medium – Plutora & Xebia

Bronze:

Expensive – ServiceNow (Just add another “0” for licensing & tailoring services)

The “Test Environment Management Tool” Score Card 

Test Environment Management Tools Comparison

Overall Test Environment Management Platform Rating

Final TEM Tool Positions

Position Player Findings
#1 Enov8 Very much a Test Environment centric solution.
#2 ServiceNow An extensible ITSM solution, expensive but powerful.
#3 Plutora More focused on Release Planning.
#4 Apwide Simple booking tool that has its place at the table.
#4 Xebia More focused on Continuous Delivery
#5 Omnium Inexpensive and will be the right fit for some.

Note: Scoring was limited to the ten key areas recognised by TEMDOT as the most important for successful Test Environment management. The scores do not reflect broader functionality i.e. functionality that may be deemed more important for your organization. If you feel there are inaccurate statements in this comparison or a tool missing, please reach out using our contact form.

DataOps Explained

Preamble

Companies—especially large internet companies—treat collections of data as an asset. And more and more companies are developing an appetite to leverage their data to compete. There are also increasing customer expectations for the fast release of high-quality products or services.

So how do you balance speed and quality? DataOps is your answer. Let’s take a look at what DataOps is and why it matters.

What Is DataOps?

The term DataOps is an abbreviation of the words data operations.

The speed of development and product release has decreased in the last 10 years due to technologies such as DevOps (development operations). As a result, we have a new problem: data and more data. To help draw insight from loads of raw data, companies use data analytics. Of course, there are various types, such as data mining, that help identify trends, patterns, and relationships in large data sets. Unfortunately, in our need-it-now economy, users of data analytics can’t—or won’t—wait for weeks or months to receive new analytics.

With the increased complexity of the emerging data ecosystem and the need to deliver insights more quickly, a new strategy is essential if we’re to gain value from massive amounts of data.

This is where DataOps comes in. It helps improve the delivery speed and robustness of analytics. In other words, DataOps is an automated, process-oriented methodology that helps analytics and data teams improve the quality of data analytics, as well as reduce its cycle time. To achieve this, DataOps combines agile development, DevOps, and statistical process control.

Similar to how DevOps brought together development and operations teams to handle software delivery problems, DataOps seeks to bring together data practitioners to deliver quality data for applications and business processes.

But do we really need another methodology?

Why DataOps Matters

In our current on-demand economy, a company has to rely on data from various sources to better understand their products, customers, and markets. This all sounds good until you factor in the dynamic nature of data. How do you effectively monitor the flow of a company’s data that includes prediction changes, business anomalies, trend changes, and more?

Someone could argue that we already have analytics to handle all of the data issues. But here’s the problem: Data analytics pipelines are in a deplorable state because of

  • Inadequate automation and orchestration
  • Minimal code and data reuse
  • Or a lack of coordination between the involved parties, such as IT, operations, and even business stakeholders.

In the end, we have poor-quality data that’s delivered too late to meet a business’s needs.

As more and more data is collected, the data pipelines become more complex. At the same time, large, more traditional enterprises realize the need to use all the data their company generates. Such information is becoming important even in everyday decisions.

Needless to say, all of these factors make it necessary for an organization to implement a new approach to govern the flow of data through its life cycle.

And here’s one more reason to consider using DataOps. Companies that have already implemented DevOps practices will find that implementing DataOps gives them a higher competitive edge. This is because the DevOps engineering framework may be regarded as preparation for DataOps. Organizations that rely on data need a similar high-quality and consistent framework that’s useful for fast data analysis.

Implementing DataOps in 7 Steps

DataOps is still a rising approach for data-driven organizations. DataKitchen, a company that developed a DataOps platform for data-driven enterprises, suggests seven steps for implementation. And the good news is you don’t have to discard your existing analytics tools.

Here are the seven steps to implementing DataOps.

Add Data and Logic Tests

This step requires that every time you make changes to an analytics pipeline, you have to add a test for the change. Testing applies to data, models, and logic. The idea is to make sure nothing will be broken in the analytics pipeline. These incremental, automated tests ensure that quality and integrity are built into the final output.

Use a Version Control System

In order for raw data to produce useful information, it goes through many processing steps. And all of these steps involve coding. In a similar manner to other software projects, the source files that data analysts use in the data pipeline require maintenance in a version control system such as Git. The aim of version control is to help keep track of changes and revisions. Keeping the code in a repository is also important, as it helps when there is a need for disaster recovery.

Branch and Merge

To maintain coding changes, data analytics should borrow the approach that software developers use to maintain their projects, which is to continuously update code source files. For instance, when a developer wishes to make changes, they pull out the relevant code from the repository. Changes are then made on the local copy (also called a branch) pulled from the repository. Once new changes are made and tested, the local copy (branch) is merged back into the repository.

Use Multiple Environments

Data analytics team members should have their own environment to work from. These environments will allow team members to work on subsets of data while isolating the rest of the organization from any effects of the ongoing maintenance or additions to the existing data.

Reuse and Containerize

Breaking down a data analytics pipeline into smaller components facilitates code reuse and containerization. By doing this, the data analytics team can move quickly as they leverage existing libraries or other code whenever they want to extend or develop new code.

Parameterize Your Processing

Borrowing the idea of parameters from software development will help in designing a robust data pipeline. And a flexible data-analytics pipeline will accommodate varying run-time circumstances.

Use Simple Storage

Simple storage helps make the whole data analytics pipeline readily available, and it eases the updating process.

What About Data Security?

There’s a lot of concern about how to gain insights from raw data in a robust yet fast manner. But we shouldn’t forget the consequences of data breaches across the globe. The costs you may incur for mishandling personally identifiable data is becoming too expensive. As you work toward building more and delivering faster, it’s important to consider the security of the data you handle.

When implementing DataOps, you must protect the data at every stage of its journey. Always keep in mind the bad guys who are ready to grab your data. And don’t forget the issue of accidentally sharing sensitive data that may cause you to fail to meet regulatory compliance.

Thankfully, there are solutions that help take these worries away, such as Data HotSpot—a product specifically designed for those in test data management and those who consume test data. With Data HotSpot, you are assured complete security, customer protection, brand protection, and penalty avoidance. That means you can implement DataOps and stay way ahead of your competitors with real-time or near real-time analytics.

Unlock the Value of Data

Today, there’s a need to avail data in real-time or near real-time because businesses rely on it to retain a competitive edge. As a result, it became necessary to create analytics methods that can quickly provide data for consumption by users or applications.

DataOps is a multidisciplinary approach that helps data analytics teams overcome the challenges of inflexible and poor-quality data. If an organization can implement DataOps properly, they will experience great improvements in producing robust and adaptive analytics.

As we’ve seen, DataOps matters today because it helps organizations create reliable and readily available data flows. And availability plays an important role in unlocking the value of an organization’s data.

Author: Alice Njenga

This post was written by Alice Njenga. Alice’s areas of expertise include technology, artificial intelligence, IoT, cloud computing, security, and telecommunication. She especially enjoys converting dense technical material to articles that are easy for the layman to understand.

DevOps-Metrics

Top 5 DevOps Metrics

When people start talking about DevOps, the idea of metrics usually comes along for the ride. To be able to monitor software after release, we need to know what data is important to us. There are so many options, it may seem overwhelming to know where to look. However, we can limit our options based on two key factors: what decisions we’ll make and how customer-focused they are. With that in mind, I’ll share what I believe to be the five most important DevOps metrics.

Metrics Are for Decisions

The thing about metrics is that they’re useless on their own. People often say, “We need to track this data!” But you need ask them only one question: what decisions will you make with that data? You may be surprised how often—usually after some mumbling—the answer is “I don’t know.” Any metric that doesn’t support a decision or set of decisions we may want to make ahead of time is simply noise. We want to eliminate noise from our minds and focus on what guides our decisions for our team.

Customers First, Then Everything Follows

Knowing what decisions our metrics will support is a good start, but it’s not enough. There are millions of decisions we could make about what we’re seeing. We need a North Star, a guiding light, that will be the anchor from which we can derive a strong set of metrics. This anchor is our customers. For any metric we use, we should be able to point back to how it helps our customers. After all, we ultimately owe them our existence.

Top Five Metrics

Without further ado, I give you the top five DevOps metrics you probably should measure for your team:

  • Customer usage
  • Highest and average latency
  • Number of errors per time unit
  • Highest lead time
  • Mean time to recovery

Customer Usage

The first metric on our list is customer usage. This is any measurement that tells us how much our customers, internal or external, are using our features. When delivering new or enhanced features, it’s important to get to production as soon as possible. But we can’t assume customers want or will use a feature just because we put it in production. This is true even if they specifically ask for the feature. We can weigh how popular a feature actually is against how popular someone claimed it would be or what we estimated it would be.

It’s helpful for us to know how often customers use a feature—even one they requested—after we release it to production and inform them of its existence. Customers often think they need something “right away.” This can cause us to scramble, putting this feature on the top of our backlog. The feature might then sit, inert, for weeks or months because the customers reprioritized their desires.

Internal customers commonly are on a longer cadence, unable to use the feature until they get to it in their own backlog. Tracking customer usage allows us to say, “I know you said this is really urgent, but the last time you said that, it took you six weeks to start using it. Please be sure this is as urgent as you say it is.” We can also use this data to enhance the feature, watching usage go up or down, using hypothesis-driven development.

A good application performance monitoring (APM) tool can track this metric for you. It usually comes in the form of request counts or percentage of traffic.

Highest and Average Latency

Knowing how often customers use your features is a great start. But how do we know if customers are delighted or frustrated with our applications? This is a hard question to answer, but our next metric can hint to us that customers may be frustrated. One of the leading causes of frustration is an application’s slowness. When the response time—that is, the latency—is too high, customers are likely to go elsewhere for their needs.

We want to give our applications the best chance to make customers happy. They’ll appreciate it and likely stick around. If you have internal customers, it may be tempting to say, “They have to use my application, so I don’t need to worry about latency.” Putting aside the potential ethics issue of not caring whether your users have a pleasant experience, that mindset is folly. Even if your direct customers are internal, it’s likely that they or a downstream app are responding to external customers. So, slowness for them is still ultimately hurting your organization’s success. Even if this isn’t the case, enough complaints to the right people may get your applications scrapped.

Two major signals to look for when measuring latency are average latency and the slowest five percent or so of requests. Looking at the average gives you a nice bird’s-eye view of the application as a whole. But even one feature or subset of requests can be enough to create disgruntled customers. This is why it’s also important to keep an eye on your slowest requests.

We can decide where to tune performance with this information. An APM tool can handily monitor all of this for you, in addition to usage.

Number of Errors Per Time Unit

In the same vein of finding out whether our customers are happy, we have the metric of number of errors per time unit. The benefits of this should be pretty clear. Errors with high business impact not only cost your organization money, but they can erode customer trust. Looking at our error rates help us nip these in the bud and find abnormalities that even our tests can’t prevent.

Note that I said “errors with high business impact.” Not all errors are created equal. Your error metrics should differentiate between types of errors. Small glitches and errors are unlikely to erode customer trust or cost a lot of money. For example, if the screen is green instead of blue, that usually won’t be a problem for most people. Also, some errors are caused by users and should be expected. User errors are still good to track because they can provide information about how hard a feature is to use. Just be sure to keep them separate in your monitoring tool.

With this metric in hand, we can decide where to enhance our resiliency. If we can’t control the source of an error, we can decide to escalate that error to the appropriate team. For user errors, we can decide where to focus our efforts on increasing usability.

APM tools are also a great fit for this metric.

Highest Lead Time

Ideally, the work you deliver in your team is set up as a value stream, creating a flow of work from inception to customer usage. This lets us easily identify the individual steps it takes for a piece of software, usually a user story, to reach the customer’s hands. Think of it like an assembly line, but for software features. It’s helpful for us to look at the lead time that a user story takes to go through each step. This helps our customers by increasing the speed by which we get features into their hands.

If we adopt a Theory of Constraints approach, there’s always one highest lead time in our value stream. If we keep finding and reducing that highest lead time, we’ll be ever faster in our ability to deliver software. Say, for example, our value stream has a “coding” step and a “QA testing” step. We can record each step as part of a Kanban board and record which user stories are in “coding” versus “QA testing.” At the end of our iteration, we may see that cards sit in “QA testing” for three days on average, whereas cards sit in “coding” for only two days. “QA testing” is our highest lead time. We can then inspect why it takes so long to do QA testing and make improvements from there.

Lead time comprises two factors: process time and wait time. Process time is the time someone is actively doing something with the user story. Wait time is how long the user story sits idle, finished from the previous step and waiting to be picked up by the next step. Knowing both of these values separately will help the team know what actions they can take to improve the lead time. The decisions you may take on this are varied, but it’s good to have a system in place to frequently inspect and adapt to this metric. A sprint retrospective is a great example of such a system. And, as stated earlier, a Kanban board is a great way to track this metric.

Mean Time to Recovery

The final metric, mean time to recovery, is somewhat of an extension of our error count metric. While it’s good to know how many errors we’re getting, it’s also important to know how fast we can resolve these errors. This goes back to business impact. Business impact is a function both of how often we receive an error and how long it takes to recover from that error. One error that lingers for minutes could have more impact than 20 errors that last only a few milliseconds.

Having both of these metrics will give us a good line of sight on our business impact on errors. This metric is also a good indicator of how equipped your team is to handle operational issues. It’s an often underinvested portion of a team’s tooling.

We can use this metric to decide where we want to improve our insight into our application, such as by adding more logging context. We can also use this metric to help us decide how to simplify our architecture or make our code more readable.

Many tools specialize in error tracking to make it easy to see how quickly the team resolves issues. Some APM tools also have error tracking features.

Strength in Measurement

The key to good measurement is to understand what decisions we’ll be making. These decisions will be most effective when we center our customers. Drawing from this, we can derive a set of strong metrics that ensure our team operates at its best. With these metrics, no challenges will stand in our way for long.

Author: Mark Henke

Mark has spent over 10 years architecting systems that talk to other systems, doing DevOps before it was cool, and matching software to its business function. Every developer is a leader of something on their team, and he wants to help them see that.