How to Achieve Continuous Delivery

How to Achieve Continuous Delivery

So you've decided to get on the continuous delivery train. Congratulations! You're about to turn your deployments from anxiety-inducing to yawn-inducing, which is a luxury non-CD shops will never know. Will it be easy in the short term? Well, let's put it this way: there will need to be some changes to how you approach things. And you'll have to convince your colleagues to change the way they approach things, too. But there are steps you can take to make this transition easier for all. That's what we'll talk about today—your path to successfully doing continuous delivery.

How to Achieve Continuous Delivery

What CD Even Is, Though

First, let me give you a quick summary of the subject at hand. Continuous delivery means packaging every significant code change and pushing it through an automated pipeline of steps until it reaches production. Commonly, these steps act as gatekeepers to the Great Beyond of your prod environment. They're your portcullises and moats. The job of these gatekeepers is to ensure your code is truly ready to enter the wild. Common steps include running automated unit tests, acceptance tests, and smoke tests. A continuous delivery pipeline also includes promoting your package to higher environments and smoke testing them. This is all done so we can eventually make deployments so uneventful that they're boring, reducing risk and saving a boatload of cash and frustration.

How Do We Get There?

If you've read the book Continuous Delivery by Jez Humble and David Farley or if you've perused a few blogs, you may feel overwhelmed at first. Continuous delivery can be a lot to take in. But fret not. We will eat this elephant one step at a time, making it a bearable and possibly fun process.

The principles we'll follow, straight out of Humble and Farley's Continuous Delivery, are to document our steps, continue with those steps even if—especially if—they become painful, and then automate them away. This requires a large dose of tenacity. You have to be willing to stick with it, possibly for an extended period of time. In my experience, this tenacity almost always pays off, often faster than you may think.

Document Your Existing Steps

Starting your journey is as simple as documenting all the steps it takes your system to go from code that's committed and pushed to when it's in production and available to consumers. And yes—you need to document every step. You'll be surprised at how many there are. Pull everyone involved as you need them. Any gaps in the documentation must be fleshed out. Talk to your developers, your QA specialists, your release manager, etc. Talk to your system administrator if you need to. Get it all in one visualization. By the end you should have something like this:

Make sure you understand who owns or commonly performs each step. Sometimes this is a system, but more often, it's a person. If you have trouble showing or understanding the steps, think of your pipeline as a conveyor belt with one thing moving through it: the software package. You may decorate this package with other things, like config files. It may also morph into a different kind of package—for instance, from an executable into a Docker container. But it's still one thing moving through each gated step.

Make Friends

Ultimately, continuous delivery and DevOps is not so much about the tooling but about the people. It's about collaboration and focusing on what matters, and it's about delegating boring deployment work to computers. However, not everyone may take that view. Many people have built up little kingdoms around their role in getting the code to production. They may see your initiative to automate as a threat.

You'll want to understand and have compassion for all the people involved in deploying your system. This may be as easy as giving a heads up. It may be as involved as being vulnerable with them and letting them share their concerns. And at the end of it, you may still have to add a silly button to your deployment server and let them push it. Just remember: it's as much about the people as it is the tooling.

Version Your Package

Ensure your software packages are versioned. You want to know you can grab any build you need and push it through your pipeline. This will make both troubleshooting and tracking easier. Also, ensure that your package is IT & Test Environment agnostic; that is, don't tie your built package to any specific environment, such as dev or prod. We'll wire in the environment-specific stuff later.

Publish Your Package

After you build, version, and unit test your package, publish it to a well-known place. This will make your package available to deploy to multiple environments without rebuilding and unit testing every time. It will also ensure you have a consistent build. You can use something as simple as a shared network drive, but many tools exist to make it easier. Maven and Gradle have the ability to publish built in. Many continuous integration servers also have some sort of publishing mechanism wired in, depending on your language.

Find the Biggest Pain Point

Now the fun part. Document what the biggest pain points are on your diagram, like I did here:

Green is already automated or low pain, and red is the highest pain. You want to ensure the team is doing this painful step as much as possible. The instinct for them will be to run away from it and avoid it. Instead, we want to equip the team with what they need to get rid of it.

Automate the Pain Away

The next step in our path to continuous delivery is to take the pain point from the previous step and figure out how to automate it. There are many, many tools available, depending on the step that's red for you. For testing steps, you can automate your tests. Use a unit testing framework, or Postman, or a more comprehensive testing tool. This is another place where people, namely QA specialists, may think of moving to CD as a threat. It also can be a whole initiative on its own.

For deployment steps, there are many tools to automate the publishing and pushing of your system onto a server. I highly recommend investing in a deployment server. It will save you loads of time automating your pipeline. However, if you're not confident in one or have budget troubles, you can automate with as little as your command shell and some SSH. Something more in the middle can be a task runner like Gradle or even some PowerShell modules. Parameterize these scripts by version number and environment.

You may not be able to automate your most painful step, or you may find it a steep learning curve. That's alright. If it's too difficult, find something similar to start with. You can also mitigate the manual parts down to a few button clicks. That helps. As long as you can remove enough of the pain your red step causes to make it no longer the most painful one, you're moving in the right direction.

Rinse, Repeat

Aggressively and continuously repeat the previous steps. Once you automate most of your painful steps away, your entire team will have an amazing change of attitude when it comes to deployments. It will feel like a large weight has been lifted off your shoulders. Ideally, with one button push, you can get your system from your package repository all the way to production. Most likely, you'll need to press one button per environment. Even then, it'll still be a breath of fresh air compared to the way it was before CD. Nonetheless, you'll want to keep automating until you achieve the one-button-push deployment.

I'm Done Now, Right?

Even though your deployment life will be much easier, I wouldn't stop there. Once you have an effective deployment pipeline, you can do many beautiful things with it. You can make it a zero-downtime deploy. Your team can add feature toggles so you can separate your deployments from your releases. And of course, you should continue to refine and evolve the pipeline as you find new or smaller pain points along the way.

You're on Your Way

As you can see, achieving continuous delivery is well within your reach. It's simple, yet you must persist through all the blockages. Focus on people and showing them how continuous delivery eases their role. Continually and aggressively knock one obstacle down at a time, and you'll be there sooner than you think!

Author Mark Henke. 

Mark has spent over 10 years architecting systems that talk to other systems, doing DevOps before it was cool, and matching software to its business function. Every developer is a leader of something on their team, and he wants to help them see that.

 

I booked the Test Environment

Test Environment Booking Forms and Demand Management

A booking form is a way to tell another department what you need and when you need it, plain and simple.

How do you predict when and how your test environment services will be at a premium? The ebb and flow of business cycles will inevitably cause uneven demands. For example, the end of the year is a busy time for HR as they prepare for a new year of benefits and ever-changing regulations. In contrast, the server management department might be in high demand after Patch Tuesday*. A buggy patch can wreak havoc! These business cycles affect the flow of demands on the test environment.

ITIL recommends that we not only measure but predict demand. Demand management is essential for business. It is a primary part of the Test Environment Management Use Case and falling short of demand means not being able to deliver adequate service. Conversely, overshooting causes waste. That waste translates into dollars. If you can predict demand, then you can manage it.

This post provides an introduction to how you can use booking forms to help manage demand. Seasonal trends are the most straightforward to account for, so let's address those first.

Example of a Test Environment Booking Form:

 

Handling Seasonal Demand

Many industries use booking forms as part of their process to manage demand. Take the case of when you bring your car to the dealership for repairs and they provide a rental car. The dealership puts in a booking form with the rental company. The rental company uses that form to make sure your vehicular needs are met.

Some industries are highly subject to seasonal demands. Hotels and airlines are notoriously swamped during the holiday season. They commonly use booking forms as part of their reservation system in order to manage and control demand. If you've ever experienced the "joys" of holiday travel, just imagine how much worse it would be without a reservation system. It would be total chaos!

Seasonality is ubiquitous in business. Some experience lulls in February. Others are slow during the summer months. Furthermore, the cadence varies by department. HR may be busiest in October, while the accounting department may be buzzing as the new fiscal year approaches. All of those peaks increase the demand for testing as new processes are put in place. A reservation system combined with past trends helps to manage the test environment under this kind of load.

Looking at past data is a great way to predict the seasonal cycles of a business. However, there are times when demand doesn't follow the typical cycles. We need a way to plan for those increases in demand.

Booking Forms Help Predict Demand

A straightforward way to notify another department goes a long way toward preparing for increased demand. It helps the supply side as well as the demand side. The supply side can adjust to meet future demand. Meanwhile, the demand side has a better chance of having its needs met on time.

Many industries use booking forms to communicate demand for services. For example, imagine a team that's working on a software project due to wrap up in September. They will need to book resources for deployment to production. Let's say they need new servers for deployment. Sometime in July, they'll fill out a booking form describing their needs. The form will go to the server team, so they can plan ahead. The server team can fulfill the request in August or even closer to the deliver-by date if that works better with their schedule.

Similarly, consider a new project that's just getting underway. The first round of testing is planned for two months out. That means now would be a good time for the project manager or test manager to put in a request for the testing resources. Let's say the test environment has unique requirements, such as specialized hardware. That request must be placed in advance so the acquisitions can be approved in time. Failure to submit a booking form on schedule would result in project delays. Or, it would cost a significant amount of social capital to recover from the slip-up!

Use Booking Forms for Infrastructure

A booking form is an integral part of a reservation system. When you have finite IT resources, such as physical servers, planning ahead is especially important. Virtualization helps when it comes to shifting demands for servers. But as demand increases up to and beyond physical capacity, something must be done.

In many industries, such as airlines, the cost of service increases with demand. Price adjustment is a standard method of controlling demand. Organizations with fixed assets, as in hotel rooms or airplane seats, need to keep them in use. By reducing prices during times of low demand, these companies increase usage. IT doesn't always have this luxury.

Instead, IT must chase demand. But rather than simply chasing it, we can predict it. In some cases, the PMO will have to adjust priorities to level the demand for limited resources.

Demand Management of Cloud Resources

More and more enterprises are shifting to cloud providers. Complete virtualization of infrastructure and newer technologies like containers are good news when it comes to handling demand. Cloud services make it possible to handle larger fluctuations in demand. And, they do so without reaching the limits of physical capacity. However, it's still important to control resource consumption.

Cost control is one of the primary issues with cloud management. Cloud services offer nearly limitless resource consumption, at a nearly limitless expense. They must be managed effectively to control costs.

Another issue is security. As much as unused servers are wasteful, they are also security liabilities. Cloud providers do a great job of keeping current. Still, consumers must maintain certain types of resources like Virtual Machines.

More VMs means more time spent keeping them running and secure. The waste starts to multiply as the infrastructure and security groups have to take these additional servers into account. The additional servers not only cost money, but they add to the overall workload. That added workload chips away at the capacity of the operations teams—taking little bits away from the things that really do matter.

Speaking of demand for services, let's talk about that demand for personnel.

Booking Forms for Services

Booking forms can be used to book anything from servers to DBAs and testers. Sometimes, the demand for testers can increase beyond capacity.

Let's say your enterprise is moving to the latest OS. Upgrading an OS is a huge undertaking and doing it all in one go could end in disaster. It can be ruinous to hastily make a drastic change to a single system, let alone the operating system used across the entire organization!

Thankfully, OS version upgrades are a rare occurrence. But when your company makes the switch, it's a fairly massive project. You need to test all applications in the IT inventory (and even several that are not). Doing so is the only way to ensure the upgrade will have a limited impact on the business.

However, this kind of massive change presents a problem. You can't throw all your resources into one project. Nor can you hold the project back by limiting supply. There are other projects, and you have to be able to manage resources effectively.

An OS upgrade is a case where you have a sharp increase in demand. Additionally, this fluctuation does not fall on a typical business cycle. You may need additional testers to ensure success. Booking forms would come in from several departments to have their applications tested. You could easily attribute the increase in demand to the OS upgrade project and adjust resources accordingly.

Similarly, you would need to prepare the test environment for the upgrade. Booking forms would flow to the departments that set up the various components within the test environment. A tracking system relates all the requests back to the project. And now we've come full circle—we can see what services are needed to support the project's testing needs.

Form Your Own Conclusions

In this post, we've seen how demand and capacity are related. Managing demand is important for cost control and service delivery. Booking forms are used across industries to communicate in advance. They're standardized and can be used to feed into capacity planning.

In a test environment, infrastructure and staffing demands can fluctuate seasonally. At other times, they can change sporadically. When you use booking forms, you're better able to predict and manage demand. Use them and you'll reduce seasonal stresses and keep surprises to a minimum, which keeps your department on top of things!

* Patch Tuesday is the second Tuesday of the month, when Microsoft releases patches for servers. A bad one can cause a lot of problems that look like buggy application code.

Contributor:

This post was written by Phil Vuollet.

Deployment Styles: Blue/Green, Canary, and A/B

Deployment Styles: Blue/Green, Canary, and A/B

These days, we seem to have an overwhelming number of deployment options. DevOps, continuous delivery, and similar practices have encouraged introspection on how to release valuable software to customers. You've probably heard of three popular options: blue/green deployments, canary releases, and A/B testing. And maybe you've wondered, aren't these all the same? Or, when should I use blue/green instead of A/B? They both have slashes in them, so they're probably equivalent, right?

This post will help you differentiate between these three deployment options and understand why they're valuable. And, as you'll see, each one is indeed valuable depending on the situation.

Deployment Styles: Blue/Green, Canary, and A/B

 

Deployment Vs. Release

Before we delve into the different styles, I want to make a couple of terms clear. Often the ideas of "release" and "deployment" are used interchangeably. But don't be deceived. These are related but different, and some strategies hinge on understanding the difference between them.

Deployment

Deployment is likely the term you understand best. It refers to updating executable code to a specific environment and running it. Strong practices like continuous delivery and continuous deployment focus on how to package this code and get it to the appropriate environment. They often encourage automation and eliminating disruption risks for your customers. Often, as soon as we deploy code customers can see it. However, deploying code need not be the factor that determines whether or not your customers see it. The goal of deployment is to ensure your code can run properly in the appropriate environment.

Release

Releasing is all about making the output of new code visible to your customers. The simplest way to do this is to deploy that code and let it immediately become visible. This is why this concept is often confused with deployment. However, there are many ways to hide code from customers even while it's deployed. The goal of releasing is ensuring a feature meets customer needs and is turned off when defective.

Blue/Green

Blue/green deploys are focused purely on deployment as a way of eliminating downtime and disruption for customers.

How Does It Work?

Let's say you're writing a blog post (go figure) and your audience is expecting to see something in an hour. But you don't like the post and want to rewrite it from scratch. Well, you're not sure you can finish in time, so you publish the old one. When you do have time, you can rewrite the post completely and publish it after the due date, replacing the older version. Depending on when customers visit the blog, they will either see the old post (and you'll get credit for meeting your deadline), or they'll see the new one and enjoy some quality content. Either way, no one will ever see an empty space where they were expecting a post.

Instead of blog posts, picture running software. When you need to deploy a new software version, instead of replacing the existing version you run the new one side by side with the old one. Once everything looks good, you can switch traffic to the new service. That way, customers are always able to get to your system.

What's It Good For?

Blue/green deployment drastically reduces risk in many situations. If you have a site where it costs significantly to be down, even for a few minutes, this option can save your bacon.

Canary

Canary deployments, also known as canary releasing, are usually release-focused. But, sometimes, they can also be deployment-focused. They are deployment-focused when you use your deployment scripts to only update the code on specific containers or servers. They're release-focused when you can change which canary features are visible to some users without redeploying.

How Does It Work?

Canary releasing can work many ways, but essentially you only release a new version or functionality to a small set of customers to start. You then monitor your system and the customers' responses to see if anything...weird...happens. The odd name for this deployment comes from coal miners lowering canaries into the shafts to detect noxious gases so they can see if it's safe before they descend themselves.

Think of canary releasing like a bottle of pop. You accidentally drop it on the ground. You're not sure if it will start spraying everywhere when you open the bottle, so instead of just turning the cap quickly and risking that the pop will explode, you turn the cap slowly to eke out a little air with every rotation. Eventually you can safely open the bottle and drink a refreshing beverage. Canary releasing is eking that software out to the world, user by user.

What's It Good For?

Canary is fantastic for lowering the risk of changes to production. The "no defects in production" mentality is a little overrated, and canary lets you mitigate the cost of defects without spending an enormous amount on preventive testing. (You should absolutely spend some effort on preventive testing, though.)

A/B Testing

A/B testing is a release strategy. It's focused on experimenting with features.

How Does It Work?

With A/B testing, implementation is similar to the process of canary releasing. You have a baseline feature, the "A" feature, and then you deploy or release a "B" feature. You then use monitoring or instrumentation to observe the results of the feature release. Hopefully, this will reveal whether or not you achieved what you wanted.

You're not only limited to two versions in a test. Netflix, for example, has multiple covers it displays as the graphic for the same show based on the version it predicts users will respond to and want to see. But be careful: It's healthy to only run one experiment at a time.

What's It Good For?

A/B testing from the 1,000-foot view may look a lot like canary releasing: You have a set of users seeing the new stuff, and a set with the old stuff. However, A/B has a much different intent. While the focus of canary releasing is risk mitigation, A/B testing's focus is on how useful a feature is for customers. It's the old argument of "build the thing right" vs. "build the right thing."

 

Working Together

Though these three options all tackle different goals, by no means are they mutually exclusive. You can have a system that's backed by blue/green deploys that set up features you can canary-release by region or customer last name. And you can set up certain key features as A/B experiments with your highest-paying customers. These all can work in harmony.

Get Past the Buzz, Choose a Style

We love throwing out buzzwords in this industry, and it can get confusing fast. This is especially true when these jargon-filled concepts operate similarly on the surface. I hope this article helped clarify the differences between these three deployment strategies. Pick one or more that work for you.

 

Author

This post was written by Mark Henke. Mark has spent over 10 years architecting systems that talk to other systems, doing DevOps before it was cool, and matching software to its business function.

Shakedown Cruise

Shakeout Testing With Synthetics: A TEM Best Practice

Shakeout Testing With Synthetics: A Test Environment Management Best Practice

Testing software is pretty easy, right?  You build it to do a thing.  Then you run a test to see if it does that thing.  From there, it's just an uneventful push-button deploy and the only unanswered question is where you're going to spend your performance bonus for finishing ahead of schedule.

Wait.  Is that not how it normally goes in the enterprise?  You mean testing software is actually sort of complicated?

Shakedown Cruise

 

Enterprise-Grade Software Necessarily Fragments the Testing Strategy

I've just described software testing reduced to its most basic and broad mandate.  This is how anyone first encounters software testing, perhaps in a university CS program or a coding bootcamp.  You write "Hello World" and then execute the world's most basic QA script without even realizing that's what you're doing.

  1. Run HelloWorld.
  2. If it prints "Hello World," then pass.
  3. Else, fail.

That's the simplest it ever gets, however.  Even a week later in your education, this will be harder.   By the time you're a seasoned veteran of the enterprise, your actual testing strategy looks nothing like this.

How could it?

Your application is dozens of people working on millions of lines of code for tens of thousands of users.  So if someone asked you "does it do what you programmed it to do," you'd start asking questions.  For which users?  In which timezone and on what hardware?  In what language and under which configuration?  You get the idea.

To address this complexity, the testing strategy becomes specialized.

  1. Developers write automated unit tests and integration tests.
  2. The QA department executes regression tests according to a script, and performs exploratory testing.
  3. The group has automated load tests, stress tests, and other performance evaluations.
  4. You've collaborated with the business on a series of sign-off or acceptance tests.
  5. The security folks do threat modeling and pen testing.

In the end, you have an awful lot of people doing an awful lot of stuff ahead of a release to see if the software not only does what you want it to, but also to see how it responds to adversity.

Sometimes the Most Obvious Part Gets Lost in the Shuffle

But somehow, in spite of all of this sophistication, application development organizations and programs can develop a blind spot.  Let me come back to that in a moment, though.  First, I want to talk about ships.

A massive ship is a notoriously hard thing to test.  Unlike your family car, it seems unlikely that someone is going to put it up on a lift and run it through some simulated paces.  And, while you could test all of its various components outside of the ship or while the ship is docked, that seems insufficient as well.

So you do something profound and, in retrospect, obvious.

You take it out on what's known as a shakedown cruise.  This involves taking an actual ship out into the actual sea with an actual crew, and seeing how it actually performs in an abbreviated simulation of its normal duties.  You test whether the ship is seaworthy by trying it out and seeing.  Does it do what it was built to do?

In the world of software, we have a similar style of test.  All of the other testing that I mentioned above is specialized, specific, and predictive of performance.  But a shakeout test is observational and designed to answer the question, "how will this software behave when asked to do what it's supposed to do."

And it's amazing how often organizations overlook this technique.

Shakeout Testing Is Important

Shakeout testing serves some critical functions for any environment to which you deploy it.  First and foremost, it offers a sanity test.  You've just pushed a new version of your site.  Is the normal stuff working, or have you deployed something that breaks critical path functionality?  It answers the question, "do you need to do an emergency roll-back?"

But beyond that, it also helps you prioritize behavioral components of your system.  If your shakeout testing all passes, but users report intermittent problems or lower priority cosmetic defects, you can make more informed decisions about prioritizing remediation (or not).  The shakeout test, done right, tells you what's important and what isn't.

And, finally, it provides a baseline against which you can continuously evaluate performance.  Do things seem to be slowing down?  Is runtime behavior getting wonky?  Re-run your shakeout testing and see if the results look a lot different.

Shakeout testing is your window into an environment's current, actual behavior.

Shakeout Testing Is Labor-Intensive, Especially With a Sophisticated Deployment Pipeline

Now, all of this sounds great, but understand that it comes with a cost.  Of course it does -- as the saying goes, "there is no free lunch."

Shakeout testing is generally labor intensive, especially if you're going to be comprehensive about it.  Imagine it for even a relatively simple and straightforward scenario like managing a bank account.  Sure, you need to know if you can log in, check your balance, and such.  But you're probably going to need to check this across all different sorts of bank accounts, each of which might have different features or workflows.

It quickly goes from needing to log in and poke around with an account to needing dozens of people to log in and poke around with their accounts.  Oh, and in an environment like prod, you probably want to do as much of this in parallel as possible, so maybe that's hundreds or even thousands of man-hours.

This becomes time-consuming and expensive, with a lot of potential ROI for making it more efficient.

Low-Code, No-Code, and Synthetics: Helping Yourself

As detailed in the article above, a natural next step is to automate the shakeout testing.  In fact, that's pretty much table stakes for implementing the practice these days.  The standard way to do this would involve writing a bunch of scripts or application code to put your system through the paces.

This is certainly an improvement.  You go from the impractical situation of needing an army of data entry people for each shakeout test run to needing a platoon of scripters who can work prior to deployment.  This makes the effort more effective and more affordable.

But there's still a lot of cost associated with this approach.  As you may have noticed, it isn't cheap to pay people to write and maintain code.

This is where the idea of low-code/no-code synthetics solutions come into play.  It is actually possible to automate health checks for your system's underlying components that eliminate the need for a lot of your shakeout testing's end to end scripting.

You can have your sanity checks and your fit for purpose tests in any environment without brute force labor or brittle automation.

Shakeout Testing Has a Maturity Spectrum

If you don't yet do any shakeout testing, then you should start.  If you haven't yet automated it, then you should automate it.  And if you haven't yet moved away from code-intensive approaches to automation, you should do that.

But wherever you are on this spectrum, you should actively seek to move along it.

It is critically important to have an entire arsenal of tests that you run against your software as you develop and deploy it.  It's irresponsible not to have these be both specialized and extensive.  But as you do this, it's all too easy to lose track of the most basic kind of testing the I mentioned in the lead in to the post.  Does the software do what we built it to do?  And the more frequently and efficiently you can answer that, the better.

Contributor: Erik Dietrich

Erik Dietrich is a veteran of the software world and has occupied just about every position in it: developer, architect, manager, CIO, and, eventually, independent management and strategy consultant.  This breadth of experience has allowed him to speak to all industry personas and to write several books and countless blog posts on dozens of sites.

Test Data Management

5 Steps to Better Test Data Management

Forward

I always say that it's important to test in production because nothing compares to a production environment. But it wouldn't be very professional of you to test only—and directly—to production. Testing in production usually gives the impression that you didn't care enough to test before you reached the production stage.

But I'd say that in order for you to even dare to test something in production, you need to have run a set of tests previously in a similar environment—including all the data you need for testing.

That's where test data management (TDM) comes in.

Test Data Management

TDM is the process of creating test data that's similar to the real data being used in a production environment. Developers and testers use this non-production data to be more confident that the new software changes aren't going to break anything during the release.

 

A good testing strategy is usually accompanied by testing data taken from production. Developers sometimes use funny, dummy, and non-real data as well. But what are the steps that you need to follow in order to create good TDM data?

Top 5 Considerations

#1 Only Use the Data You Really Need

If you don't know where to go for your next vacation, just book the next flight and start packing. You might have the best experience of your life...or the worst. If you don't know what data you really need for testing, you might just use it all. That approach has pros and cons, so when you test software without having an idea of which scenarios you need to test, you'll want to have an exact copy of the production database because it's the easiest way to start testing with real data. Otherwise, you'll end up spending too much time and money waiting to get your copy of the data for testing.

When you start creating your testing process by building the list of test cases you'll need, it becomes pretty obvious how much and what type of data you're going to need. More importantly, think about your testing process as an iterative one. If you start testing the login page, you don't need to have all the information from the user for that test case, such as their birthdate or home address.

As you keep iterating, you're going to need more testing data. And as you find more bugs, you're going to need more real data. Unless you need to run stress tests, subsetting data is going to be enough for the majority of the test cases. And even if you still need to validate that the system can handle high waves of traffic, you can also generate varied static data for that purpose. More on this later.

Taking small sets of your production database should be enough for most of the tests you'll run to validate the software. You'll also reduce costs and complexity when building only the test data you really need at the moment.

#2 Avoid Having Sensitive Data for Testing

We've seen a lot of recent GDP-related lawsuits involving big companies in Europe. Europe is taking data protection more seriously than other countries. Pretty soon, regulations like GDPR will be implemented on other continents too. If GDPR is already affecting other companies, we better avoid having unprotected sensitive data in our testing environments.

SOX compliance regulation fosters a separation of duties within an organization. I've worked with these type of regulations. In my experience, auditors only want to see that only certain people have access to the production environments. These people with privileged permissions are legally responsible for what happens with customer data.

Even with regulations in place, data is still leaked. We have to be prepared for that, so you should operate as if you expect the information you're storing will be stolen someday. Mask any data that could identify a person, or what's also called personally identifiable information (PII).

Use irreversible methods to mask data so that it's difficult to unmask it. And make sure you're constantly checking that PII is protected. Managing test data will be simpler and easier if you create subsets of the data to fulfill your different testing needs. And you won't have to worry about giving sensitive data to developers or whoever needs production data for testing.

Ideally, try to avoid having sensitive data. But since sometimes you can't avoid it, try to keep PII data at a minimum, and securely mask the data you need to have.

#3 Build Synthetic Data for Better Efficiency

Even though you decided to mask sensitive data—especially if the data is going to be used for testing purposes—you want to make the security gap as small as possible by not including sensitive data in your tests (even if it's masked). One way to improve security is to replace real data (like credit card numbers) with autogenerated dummy data. That's synthetic data, and it will help you get more efficient results in testing.

You can take advantage of the synthetic data approach by using more realistic data than just dummy data. For example, you might have a user called Joe in your records, but for testing purposes, you decided Joe will be called Jeremy. This gives you a chance to run machine learning experiments where you can learn more about "Jeremy's" preferences without knowing that Jeremy is actually Joe. You're protecting Joe, even if the data is leaked or misused.

Synthetic data makes real data more shareable because you only have the data you need. Why would you need to know a person's name if you're just trying to replicate a bug in production? You're only interested in knowing which paths through the system's workflow the user took. What matters is why the data ended up in a certain state that caused the software to break. You can then decide either to ignore the person's name or replace it with other "real" data.

If you need to have large amounts of data for performance testing, you can use synthetic data to double the size of the database. Along with the previous benefits we discussed, synthetic data makes your tests more efficient by only using the data you need to cover specific test scenarios.

#4 Create Test Data As a Self-Service Model

DBAs are in charge of generating testing data. They know the best ways to do it and what data is sharable among teams (as I explained in the previous section), and sometimes they're the only ones who have access to production databases. When this happens, the DBAs become a bottleneck, and the time spent in the testing stage increases.

That's why you should create test data as a self-service model. It's not just so you don't constantly interrupt DBAs when a developer or tester needs data. The ability to automatically have testing data will let you parallelize the boring task of manually generating data for testing. Do you need to reduce the testing time? Fine. Create more subsets of testing data in parallel and distribute the test cases.

Another benefit of having a self-service model is that you can easily drop and re-create environments on demand. By doing this, you ensure repetition and predictable results when preparing testing data. It's also easier to include TDM in your CI/CD pipeline, which brings you closer each day to one-click deployments.

Creating a self-service model is far from an easy task. So it's important that DBAs, developers, and testers work together to create this model. Not all of them have the same needs and objectives. Join experience, knowledge, and skill to create a better model for data testing.

#5 Keep Testing Data up to Date

Last but not least, keep your testing data up to date. Your software will continue evolving, so the test scenarios and the data they need will keep changing over time too. Some test scenarios will become obsolete, and so will their data. Try to always keep the house clean by making sure you're only generating the testing data you really need regarding its relevance in time.

This process takes discipline, and good communication within the team always helps. Developers need to inform everyone which tests are no longer needed and when it's OK to remove them. And either DBAs or testers need to keep confirming that the data they're using for tests is still valid and relevant.

Keeping data fresh might seem like common sense. But I've seen delivery pipelines where tests continue to grow, even though some of the features no longer exist. Sometimes we get too extreme about trying to have a high percentage of test coverage, which isn't efficient.

Having up-to-date testing data will help you have higher quality TDM.

Benefits: Better Test Results With Better Testing Data

I'd say that testing is the most important stage of any software release life cycle. The more quickly you can verify that everything is still working, the better. Always keep the mindset that parallelizing testing will help you to speed up the process. For that, you need to have better test data quality, and it's not always necessary to have an exact replica of what you have in production. In fact, if you don't, it may help you in the cost, security, or speed departments.

It's important that you start by defining what you truly need and iterate from that. Automation helps with repetitive and boring tasks, but you need to continue taking account of the human side of things in the equation to generate data for testing purposes.

TDM helps you provide only the data you need, on time and securely.

Author:  Christian Melendez 

Further reading suggestions: Holistic Test Data Management

 

OMG DevOps at Scale

The Keys to DevOps at Scale

"An organization that’s operating at scale can grow to meet greater demand without too much hassle."

When it comes to DevOps, it's important to know where organizations generally fall short, but every organization is different. We have to identify where there's waste and what inefficiencies prevent you from delivering software rapidly, consistently, and securely. In this post, we'll cover some keys to DevOps at scale so that you can make your DevOps initiative work in a big organization.

OMG DevOps at Scale

 

Set the Foundation to Create a Great Culture

People are at the heart of every DevOps initiative. Making sure they're effectively communicating is the first key to scaled DevOps.

In big organizations, people are used to delivering software in certain ways. These people aren't known for changing process and technologies often. That's because as a company grows, coordination and communication get more complicated.

A key thing then is to change how people work together to deliver software. And I'm not just talking about developers and operations. Marketing, product owners, managers, testing, and especially senior management needs to understand what DevOps means and how their work will be impacted. These people must be engaged in the DevOps journey.

You can start by doing workshops to discover waste and inefficiencies. Then you'd define the initial action items for the first sprint of many iterations that you'll need to do to increase efficiency. AWS developed a Cloud Adoption Framework (CAF) to help organizations get on board with the cloud. I happen to find CAF helpful for the things that also have to adopt DevOps.

Your team should make the effort to agree on how to better work together. They don't need to be best friends. But when they know each other and understand the needs of other teams, then they can find a balance that works for everyone.

Laying the foundation for people to collaborate is the hard part. Luckily, we're about to talk about the easiest problems to solve—technology problems.

Decouple Architecture to Deploy Frequently

When an organization is spending too much time fixing and debugging problems and they don't have time to reduce the technical debt, then applying DevOps might add more complexity.

Architecture is the next key to DevOps at scale. That's because big companies usually have a ton of interconnected systems where if you try to change something, you might break something else. Tightly coupled architectures need the coordination of many people to release software.

To support this section, I'll refer to what I initially heard from Jez Humble in a podcast about architecture in continuous delivery (CD). Testing and deployment are the main focuses of CD architecture. More specifically, it's important to ask these two questions:

  • Is it possible to do testing without requiring an integrated environment?
  • Is it possible to do releases independently of other systems or services?

Decoupled architectures give you the independence to test software without needing to install and configure other parts of the system. Instead, testing is done using mock objects because there's already a contract.

Deployments or releases can be done without having to update other applications. You're not breaking any agreed-upon contract, like response formats. It's also going to be possible to release frequently and in small batches—the two essential features of CD or DevOps.

Solidify Engineering Practices for Releases

The next key to DevOps at scale is to solidify your engineering practices.

The first time I heard about how Google runs production systems was when I read the Site Reliability Engineering (SRE) book. I loved how Ben Treynor defined SRE as "what happens when a software engineer is tasked with what used to be called operations."

A software engineer gets bored easily. Well, maybe that's true of everyone in tech. But the field that changes constantly is the one for programmers. DevOps actually started as a concept of agile infrastructure, where (predictably) some of the agile principles were applied to infrastructure.

As I said before, technology problems are the easiest ones to solve. That's because there are so many tools and practices that were born with the purpose of automating work. In the tools department, we have Jenkins, VSTS, Puppet, Chef, Ansible, Salt, and many more. But more importantly, there are practices like infrastructure as code, production-like environments, canary releases, blue/green deployments, and the most controversial one: trunk-based development.

Everything I mentioned there helps you have continuous integration (CI), which in turn allows CD. Deployments can then be a normal day-to-day operation. At that point, there's no difference when releasing a new feature or emergency bug fixes.

All changes should pass through the same process, forcing you to decrease deployment time by changing things in small batches. This will make developers happy. You'll improve the mean time to recover (MTTR), and the failure rate for deployments will be lowered, making operations folks happy. The outcomes will please both customers and business owners.

The folks heading up DevOps initiatives in big organizations need to have proper tools and practices in place.

Developers Should Be on Call

The next key to DevOps at scale is that developers have access to the code and are responsible for its performance at all times. It's their baby, and they need to keep watch over it.

AWS CTO Werner Vogels summed up the idea behind developers being on call when he said, "you build it, you run it." Operations folks usually end up calling developers when there are unknown problems they can't solve. Who better to fix a bug than those that introduced it in the first place?

That worked for AWS and Netflix, where developers can push their changes by themselves. But this would be ludicrous at some organizations, especially the big ones. Some companies are under regulations specifying that only certain people have access to production. In this case, having a developer on call is useless because they can't do anything to fix it.

So sometimes this becomes more of a mindset than anything else.

Developers could have access to a centralized logging tool or other metrics with read-only access. They can have visibility into what's happening. Then developers will tell operations how to fix it. And because they were interrupted in the middle of the night, they'll create something to fix it automatically. Developers will make changes in code to avoid that happening again.

Too many times I've seen and experienced that operation folks get tired of reporting the same issues over and over again without any response from developers. Having developers on call makes them aware of issues.

This key is related to the first one, about creating culture in the organization. When people are brought together, even if it's forced in cases like this, it makes the team more accountable for their actions. It also fosters continuous improvement, which is essential in every DevOps initiative.

Shift Change Management to the Left

DevOps is about shifting things to the left. Change management is one of those things that can be shifted—and doing so is another key to DevOps at scale.

Change management is very common in large organizations. They usually have a change advisory board (CAB) that evaluates and approves each change going to production. Most of the time, big organizations think DevOps is not for them because of change management and compliance. They can't automate that process because they're regulated.

Jez Humble has a really good talk about the topic. In it, he said he's been involved in some projects for the government, applying continuous delivery principles to highly regulated agencies.

The project cloud.gov was born after this, which is a platform that helps government agencies host projects in the cloud, applying continuous delivery principles. Cloud.gov assures
that all regulatory controls are in place, making the case for you to include automation for compliance and change management purposes.

Auditors actually love this because there's always a trail of every change in the application code that goes live. They can easily see for themselves when changes were approved, who approved them, when they were released, and who released them. This is more effective than sharing screenshots.

But it's not just about automation; it's about including those verifications and sign-offs early in the process, when we can pay attention and delay things cheaply. Pair programming or peer-based review is better than having managers reviewing the changes in the CAB meeting. They only evaluate the risk, not what the developers changed. What's better than having a peer changing the code with you?

What Are the Keys? People, Process, and Technology

You don't need containers, orchestrators, or microservices to do DevOps. Even if you adopt those new, hot technologies but don't apply everything we just talked about, you've simply wasted time and made things more complex. You can apply DevOps with the people and technology you already have. Process and culture will definitely need to change, and that's usually the hardest part. The technology part is usually the easiest.

It's also important that you increase the quality and the amount of feedback. Waste and inefficiencies will always be caught when they're monitored constantly. DevOps is about integrating people, process, and technology. And at scale, things might seem more complicated. You need to find your own way, and your journey might not be the same as others'. But DevOps also works for big organizations—I promise.

Author:  Christian Melendez 

Related Reading: Pitfalls with DevOps at scale.

 

 

Enterprise Release Management Success

5 Barriers to Successfully Implementing Enterprise Release Management

The sister of IT & Test Environments Management is Release Management, and when it comes to delivering the capability at scale, that is at an organizational level, then we are talking “Enterprise Release Management” (ERM). ERM is the overarching governance framework that brings our different stakeholders, projects and teams together so we can effectively scope, approve, implement and ultimately deploy our changes effectively and in unison.

One could say, ERM bridges the gap between corporate strategy and DevOps.

 

Enterprise Release Management Success

However, implementing an ERM framework isn’t necessarily done, as many companies still don’t do it, nor is it necessarily trivial. In fact, there are many barriers to successfully implementing ERM and ultimately ensuring effective and scalable delivery.

With that in mind, let's look at a few of the top barriers to ERM success.

  1. End-to-End Process Verification

One of the first steps to successfully implement ERM is to ensure that each stage in the end-to-end pipeline is complete and valid. It's critical that you don't overlook some very important processes in your pipeline like compliance or security assertions. Verifying these types of processes in the software development lifecycle (SDLC) is a barrier in ERM because it is complicated and requires a large number of revisions.

When a project starts, the workflow is simple. But as it grows, things get more and more complicated. You need more people, more dependencies, more checks, more systems, and more software changes. Then you need a standard process. Unified practices and processes help organizations ensure that every state is completed properly.

Integrating multiple projects and different departments is challenging. Every department has different goals and objectives, so it's not surprising that some departments have conflicts with each other. C-level executives put pressure on managers, who then put pressure on developers, to deliver on time. At the same time, developers pressure IT operators to release software changes faster, but IT ops are responsible for the system's stability and have to consider how every new change might put that stability at risk. This chain creates conflict constantly, sometimes forcing personnel involved in SDLC to break the rules and bypass some processes.

As you grow, it will become more difficult to ensure that every stage of the SDLC is completed. Help your teams understand why every piece of the ERM puzzle is important and be clear why some processes can't be left incomplete. A good set of checkmarks (or milestones and gates) will definitely make the verification process less painful. But for that, you need automation, which brings us to the next barrier.

  1. Manual Procedures

The easiest way to start managing portfolio releases is with manual procedures. But having a spreadsheet where you control the priorities and release dates of projects doesn't scale well when you need to integrate more people and projects. Automate every task, process, and documentation you can. Don't think about it—just do it. Breaking the barrier of manually managing releases will let you focus on how to speed up software delivery.

Not everything can or should be automated—you still need someone to decide which releases are good to go and when. But getting rid of even some manual procedures is going to change the way you do things in your organization. The process will start by centralizing requests (or registration) for changes, which can come from several sources like marketing, management, and customers. Then you can start recording how these requests move through SDLC.

Leaving trails in the end-to-end process helps create status and compliance reports. You can easily leave trails when you integrate automatically—not only teams and projects, but also build, configuration, and deployment tools like Jenkins, Ansible, or Spinnaker. When you integrate tools like these into your ERM, you'll have information on how much time it takes for the company to deliver software change and complete critical tasks and you'll be able to prove to auditors that you're taking care of everything the company needs in order to comply with regulations.

Manually managing enterprise releases will give you headaches. People will end up spending too much time on things that don't necessarily add value to the customer. Break this barrier with automation because life is too short to keep doing boring things manually.

  1. Visibility for the Entire Workflow

It's not easy to get the big picture of the release pipeline. There are too many required processes in the enterprise's journey to deliver software changes. And leaving a trail in each stage is the key to increasing visibility. A lack of visibility is a significant barrier to improving the software release process.

Knowing where you spend the majority of your time in the release pipeline lets you focus your work efficiently. Maybe you'll find out that you're spending too much time manually configuring applications. In that case, you might need to centralize and automate configuration management.

Or maybe you're spending too much time releasing software after it has been approved in a certain environment, like dev. That could be a sign that you're doing deployments differently in each environment. You might need to focus on preparing a good test environment framework.

But having more visibility isn't just about improving the process. If management wants updated information about how the project is going, they don't need to interrupt the team. Is the project going to be ready for the agreed release date? In which stage is it currently? Did the software pass all tests? Which team has the hot potato now? All sorts of information tell the story and status of a project, and it's all critical to the success of ERM. And you need to have that information up to date and in real time. Having enough visibility helps the team to communicate and coordinate better.

  1. Long Delays Between Releases

One of the main purposes of the ERM framework is to spend less time on releases. At least, that should be the expectation. The amount of time you spend on releases could become a barrier to succeeding in ERM.

ERM is a good framework to plan the date of a release, and that's good to know. But knowing that a release is going to happen in two months doesn't add a ton of value to customers or to management. By that time, it might be too late to release software changes, especially if you need to fix anything after the fact. When you have enough visibility, it becomes obvious where you can reduce the time a release takes.

The amount of time between releases can also increase if you fixed a bug that broke something else at the same time. You won't expect to have a high rate of deployment failures when you implement ERM.

Reasons may vary, but here's a list of things that I've seen affecting time between releases:

  • Needing to coordinate between too many teams, usually in a sequential manner, not in parallel.
  • Waiting to have enough changes to release because it's "less risky."
  • Lacking production-like environments, which help maintain consistency.
  • Hermetically building and packaging the artifacts more than once.
  • Dealing with messy and complex configuration management.

ERM should aim to break the pattern of long delays between releases by working to improve software architectures. There are practices like infrastructure as code, configuration management, self-provisioned environment, and others that ERM can orchestrate.

  1. Same Release Cadence for Everyone

Not only does waiting for someone else to finish their part of the job to release a software change cause a lot of problems, but it could also become a barrier to ERM's success. You could end up accumulating too many changes, which, as we just discussed, increases the risk of failure and the time between releases. What should the team that's waiting do? Keep waiting? I don't think so—they usually start working on something else, which means they could be interrupted constantly to help with the pending release.

Let me give you an example of a common scenario. The developers have just finished their code changes to use a new column from a table in the database. Now they need to wait for the DBAs to add the new column, and after that, the code is ready for the release. Seems like a decent system, right? Wrong. That's just a sign that the architecture is highly dependent on the database.

Developers should be able to release software without having to worry that any dependency—like the database—has everything the application needs. What if database changes take a while to be ready for the release? No problem; the application code is using feature flags, and it's just a matter of releasing the code with the feature turned off.

What if the database changes have problems after the release? No problem; the application is resilient enough to expect problems in the database. Now that's a way of working that doesn't need complex solutions. Code apps so that they're ready for a release at all times.

If you miss the train on a release, you need to wait for the next one to come, and that should be OK. It's better to wait to release something until you are sure that it won't have a negative impact on the company's revenue. But as soon as it's ready, ship it.

Conclusion: Agility Helps Break Barriers

As you can see, not all the barriers I described are related to a specific stage of deployment. Many teams play significant roles in the success of ERM, beginning when a project is planned in management offices to when it's being delivered to end users.

Automation; traceability; versioning control; decoupling architectures; releasing small, frequent changes; and other DevOps and Agile practices are going to help you successfully implement ERM. But also having the proper tools in place are key. Because if you only have a hammer, everything looks like a nail.

Barriers to a successful enterprise release management will always exist. But as long as you continue to find opportunities to improve, success is practically guaranteed.

Contribution by Christian. Christian Meléndez is a technologist that started as a software developer and has more recently become a cloud architect focused on implementing continuous delivery pipelines with applications in several flavors, including .NET, Node.js, and Java, often using Docker containers.

Top 20 DevOps Tools 2018. IT and Test Environment Management Winners.

Rise of the TEM Tools (DevOps Winners 2018)

After years in obscurity it appears that IT & Test Environment Management tools are now starting to get noticed.

This year saw several IT Environment Management products being nominated by CIOReview as the most promising DevOps tools. Most probably in recognition of the importance of managing your IT Environment & Operations to achieve "DevOps at Scale" and ultimately better manage you cloud landscape.

Congratulations to the winners Enov8 and the other CIO Review Top 20.

Top 20 DevOps Tools 2018. IT and Test Environment Management Winners.

--- News Source ---

Enov8, the Enterprise IT Intelligence company, has been named winner of this year’s, 2018, CIOReview “20 Most Promising DevOps Solution Providers”.

SYDNEY, NSW, AUSTRALIA, July 17, 2018 /EINPresswire.com/ -- Enov8, the Enterprise IT Intelligence company, from Sydney Australia, has been named winner of this year’s, 2018, CIOReview “20 Most Promising DevOps Solution Providers”.

A somewhat unique winner, Enov8 goes “against the grain” of providing a “simple” DevOps point solution, and provides a broader “business intelligence” platform that helps organizations run their Enterprise DevOps more effectively through the embracement of methods that help the organization better understand and manage their complex IT landscape and delivery chain operations.

In response to the news, Enov8’s Executive Director, Niall Crawford said: “CIOReview selected a great list* of companies this year. To be selected as the overall winner of this prestigious award was fantastic and I believe it was partly in recognition that the industry is evolving and organizations now need to address the ‘DevOps at Scale’ problem and, off-course, an endorsement of Enov8’s hard work and innovative approach in this space.”

CIOReview Top 20 List*: https://devops.cioreview.com/vendors/most-promising-devops-solution-providers-2018.html

Note: This award follows some of Enov8’s recent achievements which include various big wins with global enterprises (banking, telecom and retail) and significant partnership announcements with some of the world’s largest technology service provider.

About Enov8
Enov8 is a leading “Enterprise IT Intelligence” and “Enterprise DevOps” innovator, with solutions that support IT & Test Environment Management, Release Management and Data Management. Enov8’s philosophy is to help organizations be “Agile at Scale” through enhanced transparency and control of their IT landscape and operations.

About CIOReview
CIOReview is a leading technology magazine that has been at the forefront of guiding organizations through the continuously disruptive landscape and providing enterprise solutions to redefine the business goals of enterprises tomorrow. Recognized as a principal and reliable source of information, CIOReview offers a ground-breaking platform allowing decision makers to share their insights, which in turn provides entrepreneurs with analysis on information technology trends and the broader environment.

Reference: EINnews.

 

Service Virtualization for Test Environments

Test Environment Best Practice – Service Virtualization

Service Virtualization

Service virtualization is an evolution of methods like "mocking" and "stubbing." These techniques allow you to replace dependencies in test environments, enabling teams to keep testing their software without the constraints created by services out of their control.

It's fairly common to rely on external resources that are expensive to use and/or not available to every developer. Some teams try to work around these limitations by writing their own mock code. Simulating a few interactions might get them results in the short term, but it’s unproductive to keep this effort isolated.

Service Virtualization for Test Environments

An alternative is service virtualization, which allows teams across your organization to use the same simulated dependencies effectively. In this post, I’ll talk about the benefits of service virtualization and highlight how having a clear process, consistent governance, and operation standards are vital to implementing a platform like this.

What Are the Benefits of Service Virtualization?

When testing in the early phases of your development cycle, developers often don't have access to all the necessary systems that will make software testing fully representative. Reasons may vary, but common examples include systems on legacy platforms, prohibitive licensing costs, and systems that generate financial transactions or cost money each time they’re queried.

This kind of constraint creates bottlenecks, forcing your teams to wait if they want to access resources. Service virtualization will enable them to replace dependencies with components and data that reliably resemble the production site.

When Should We Use Service Virtualization?

Teams that have restricted access to service dependencies tend to produce code with lower test coverage. In many cases, these teams delay most of their testing until later development phases. But fixing bugs in the later stages of the development process is more expensive, not to mention frustrating for developers.

By using service virtualization as early as possible in your development process, you're enabling your teams to increase their capacity to fix bugs that would show up only in a production environment. This is more efficient, and it gets you closer to achieving true DevOps.

How Do We Implement It?

Once you’ve picked a certain framework or tool for service virtualization, select a service with a high rate of interaction problems. Figure out your best candidate for "problematic service dependency." Record enough data to replicate this component, set up a virtualized service, and configure your code to interact with it.

In order to standardize the use of this new resource, there should be a unique source of truth where you store (and can later find) your virtualized service specifications. One common option is a version control repository like Git. Having this source of truth will help you increase adoption among other development teams. As soon as other people start reaping the benefits, they’ll be interested in virtualizing other troublesome service dependencies.

You should also create guidelines to help teams follow standards and conventions. If your organization is implementing service virtualization through an open source framework, you’ll have to enforce standardization yourself; your virtualized services should follow a common set of rules so they’re reusable across teams.

How Do We Keep a Virtualized Service Close to the Real Thing?

"Recording sessions" is a common method of service virtualization. It facilitates the process of capturing data that can be later used to simulate inputs to client services. The whole purpose is to reply to requests made by your tested code with representative samples of data.

There are other non-functional requirements you could validate, like simulating network instability or unusual response times. This is valuable for developers of mobile applications who can't assume their client application is going to enjoy a stable connection at all times.

Last but not least, persuade people to have their virtualized services automated and continuously tested. Service virtualization will be effective only when it closely represents how data is actually served in production. In order to keep up with external changes, it's critical to have automatic and reproducible processes. If you start to slack on testing, your virtualized services will become obstacles.

Whose Work Is All This?

I've seen software shops where every possible form of testing is sent to a quality assurance (QA) team. I'm not a supporter of this method.

Testers have unique skills, but their role is assisting developers when they write tests for their own code. I see service virtualization as an enterprise-wide traversal process. It needs involvement from all these team members:

  • Developers. Because of their familiarity with your business codebase, developers can quickly reason how each service interaction will affect program behavior.
  • Operators. They can quickly map how interactions may turn into failure scenarios under real-world conditions. Even if your organization has transitioned to DevOps, operators usually control firewall restrictions and endpoint configurations that are necessary for implementing service virtualization.
  • Testers (QA). They are deeply involved with reproducing conditions that might occur when your end users operate your software. Testers also serve as an active audience in the integration process, which gives them a holistic understanding of your service interactions.
  • Test Environment Managers. To ensure these "Virtual Systems" are properly advertised, provisioned, configured and shared/coordinated across the consumers (Test Teams & Projects).

Wait, Isn't This More Work?

A group of early adopters may absorb additional work in the initial phases of service virtualization. But they'll learn and document processes that other teams can then include in their development cycles, which will improve efficiency overall. Make sure to raise awareness of your initial scope to avoid frustration caused by unmet expectations. Transparency and clear goals are very important.

How Do We Get Started?

Any decision to implement a new tool should be made within the boundaries of your budget; that's why it's vital to know what dependencies are worth simulating. Locating bottlenecks in your development cycle is the first step. Then, evaluate how much your cycle could improve by using service virtualization and select a service you can quickly implement as a prototype.

Consult your teams. Ask them for input on how to obfuscate data to avoid security issues, and be sure to involve the necessary stakeholders to provide the basic infrastructure. Emphasize the importance of putting all these pieces together through automatic and reproducible methods.

A Word of Advice: Remember to Measure

Find metrics for lead times, time to market, or any other relevant measurements from your software development cycle in order to find spots where you can quickly improve. This will help your team create valuable impact in the short term.

Project managers would rather assign long time-frames that might allow projects to finish ahead of schedule, rather than tight deadlines that might cause projects to push back their due dates. Pay attention to these patterns; you might detect hidden delays in processes like infrastructure deployment and testing reconfiguration. Look for these situations as opportunities to automate, standardize, and promote change.

Authored by: Carlos “Kami” Maldonado

Test Environment Management Best Practices: Using Containers

Containers have been gaining popularity since their inception in 2001, particularly in the last few years. According to the official Red Hat blog post on the history of containers, they were originally created in order to run several servers on a single physical machine. There are significant advantages to using containers. You may have either the systems under test or automated tests running in containers—or both! This post describes best practices for managing a test environment that runs containers.
Define Ownership It's important to define ownership of the container environment. If your test environment management (TEM) team is separate and has its own budget, most of the ownership will fall to them. Most of the guidance given by Docker regarding ownership of clusters in production applies equally to TEM.

Employing Containers

Use Containers for Tasks

The test environment itself may use containers for running tests on demand.

You can use containers to:

  1. Run applications/services.
  2. Perform tasks.

Tasks you can perform include smoke testing, load testing, and other types of automated tests. Since task containers are throwaways, you benefit from being able to free resources immediately after the task is run.

Use Clusters

Containers have scaled beyond the original intent of running multiple independent servers in relative isolation on a single machine. Clusters of host servers are common these days. You deploy containers to the cluster without having to directly manage servers for each application's requirements. Instead, you define the requirements for the container and the cluster management system runs the instance appropriately. Some noteworthy cluster management systems include Docker swarm and Kubernetes.

Cloud Hosting Services

Cloud services offer container hosting as a service, which is yet another level of abstraction from managing servers. These clusters are fully managed by the cloud provider. Results may vary based on your environment, but I've found that using cloud services to run tasks is beneficial, as it reduces the amount of scaling needed in a self-managed cluster. Also, hosting applications and services in self-managed clusters across your cloud VMs can lead to significant cost savings when running containers over longer periods of time.

Define Limits

Container memory, CPU, and swap usage limits should be set appropriately. If they are not set, the container instance will be allowed to use unlimited resources on the host server. This host server will kill processes to free up memory. Even the container host process can be killed; if that happens, all containers running on the host will stop.

Validate Configurations

Test for appropriate container runtime configurations. Use load testing in order to determine the appropriate limits for each application version and task definition. Setting the limit too low may cause application issues; setting it too high will waste resources; not setting it at all may result in catastrophic failure of the host.

Use a Source Control Management System

Container definitions are specified in a relatively straightforward text file. The host uses the text file to create and run the container instance. Often, container definitions will load and run scripts to perform more complex tasks rather than defining everything in the container definition (for Docker, that's the Dockerfile).

Use a source control management system such as Git to manage versions of container definitions. Using source control gives you a history of changes for reference and a built-in audit log. If a bug is discovered in production, the specific environment can be retrieved and rehydrated from source control. Because you can quickly recall any version of the environment, there is no need to keep a version active when it's not under test.

Create a New Container Per Version

It's best to create a new container for each version of each application. Containers are easy to run, stop, and decommission. Deploy a new container instance rather than updating in place. By deploying a new instance, you can ensure that the container has only the dependencies specific to that version of the application. This frees the running instances from bloat and conflicts.

Running instances have names for reference; use the application and version in the instance name, if possible.

If dependencies haven't changed, the container definition itself doesn't need to change. The container definition is for specifying the container itself (the OS and application dependencies). Specify versions of dependencies and base images rather than always using the latest ones. When dependencies change (including versions), create a new version of your container definition or script.

To clarify, here is a snippet from a Dockerfile for node 10.1.0 based on Linux Alpine 3.7:

FROM alpine:3.7

ENV NODE_VERSION 10.1.0

...

    && curl -SLO "https://nodejs.org/dist/v$NODE_VERSION/node-v$NODE_VERSION.tar.xz" \

    && curl -SLO --compressed "https://nodejs.org/dist/v$NODE_VERSION/SHASUMS256.txt.asc" \

    && gpg --batch --decrypt --output SHASUMS256.txt SHASUMS256.txt.asc \

    && grep " node-v$NODE_VERSION.tar.xz\$" SHASUMS256.txt | sha256sum -c - \

...

CMD [ "node" ]

If you were running node scripts, you might create your own Dockerfile starting with FROM node:10.1.0-alpine. This tells docker to use this specific base image (node 10.1.0 running on Linux Alpine) from the public image repository—Docker Hub. You would then use the remainder of your Dockerfile to install your application specific dependencies. This process is further described here.

Avoid Duplication

There should be a single source of truth for each container definition. All deployments in all environments should use that source of truth. Use environment variables to configure the containers per environment.

Design container definitions for reuse. If you find that only certain parts of a definition change, create a base file for the parts that stay stable and move the ever-changing parts into child files.

Monitor

Monitor your running container instances and the cluster environment. Monitoring allows you to flag events that could indicate defects in the system under test. These defects may go unnoticed without a way to measure the impact of the system on the environment.

When working with clusters, monitoring is essential for auto-scaling based on configurable thresholds. Similarly, you should set thresholds in your monitoring system to trigger alerts when a process is consuming more resources than expected. You can use these alerts to help identify defects.

Log Events

Logging events from your test environment can mean the difference between resolving an issue and letting it pass as a "ghost in the machine." Parallel and asynchronous programming are essential for boosting performance and reducing load, but they can cause timing issues that lead to odd defects that are not easily reproducible. Detailed event logs can give significant clues that will help you recognize the source of an issue. This is only one of many cases where logging is important.

Logs realize their value when they are accessed and utilized. Some logs will never make it to the surface, and that's OK. Having the proper tools to analyze the logs will make the data more valuable. Good log analysis tools make short work of correlating events and pay for themselves in time.

Use Alerting/Issue Tracking Strategically

Set up alerts for significant events only. A continual flood of alerts is almost guaranteed to be less effective. Raise the alarms when there is a system failure or a blocker. Batching alerts of lower priority is more efficient, as it causes less disruption to the value stream. Only stop the line when an event disrupts the flow. Checkpoints like gates and retrospectives are in place for a reason. Use them, along with issue tracking systems, to communicate non-critical issues.

Summary

Containers are being used nearly everywhere. They're continuing to gain traction, especially as cloud hosting providers are expanding their container hosting capabilities. Understanding how to manage containers and their hosting environment is important for your test environment management capabilities. You should now have a better idea of what to expect when using containers and how you can most effectively manage your container environment.

Author: Phil Vuollet

Phil uses software to automate process to improve efficiency and repeat-ability. He writes about topics relevant to technology and business, occasionally gives talks on the same topics, and is a family man who enjoys playing soccer and board games with his children.