The ultimate software QA process

I have worked with many teams and organisations with varying approaches to testing and QA. In this post I cover some of the approaches I’ve come across and their pros and cons, and will then present my ideal software quality process.

ALL the QAs

At one extreme of the QA-heaviness spectrum, I’ve worked with a few teams with many layers of manual testers. In one team, there were no automated tests whatsoever. When developers were done with a task, the BA would pick up the task and test it on the dev environment. If the BA was happy with it, the task would move forward in the testing pipeline, where it was later tested by a QA team. If the QAs were happy with it, it would then move to the next and last stage, where yet another group of QAs, called acceptance testers, would test the change again but in a broader context to make sure it hadn’t caused any regression elsewhere. The testing process in this team took anywhere from one to six months, depending on the size and criticality of the change as well as the state of the testing pipeline, which was always full. If I recall correctly, this team had 47 staff: eight developers, six BAs and the rest were testers!

In another team, there were some automated tests as well as three manual testers per developer, and even then the manual testing cycle took as long as development. I was invited to this organisation to help them improve their processes. When I asked why they had so many testers, they said their application was life critical and a bug could cost someone’s life. When I looked at their testing pipeline, there were fixes for several critical bugs sitting there waiting to be tested and released. When I asked them why they didn’t release the bug fixes, they said they couldn’t release them until they were thoroughly tested, and that took weeks. Oh the irony!

QA as the quality gatekeeper

When you rely very heavily on QAs for the quality of your software, you end up adding more and more QAs as the size and complexity of your app grow, and even then your release cycle gets slower over time.

The other side effect of this process is that developers become more and more insensitive towards software quality because “it’s someone else’s problem”. This then leads to bad blood between QAs and developers: QAs become increasingly unhappy because developers don’t care about quality as much as they should and QAs find a lot of silly bugs on every card. Developers also become increasingly unhappy at work because they keep getting pulled back to fix bugs on tasks they considered done.

You might think “but good devs don’t behave this way”. I have seen many good devs become lazy and insensitive towards quality when someone else is responsible for it.

This buck-passing mindset is what brings the application delivery cycle to a halt. When you hold your QAs responsible for the quality of the application, they make damn sure that no bug can get through. So they test the application as thoroughly as humanly possible, and when that takes a long time and the business wants features out more quickly, you hire more QAs. This is a slippery slope.

No QAs!

At the other extreme of this spectrum, I’ve worked with teams without any official QA role. In these teams, in the absence of anyone else to rely on for software quality, the developers write high quality code with good test coverage. They also manually test their own code against the provided acceptance criteria, and they do some manual testing for edge cases, security, accessibility, UX etc. Typically the change is made, reviewed and tested by different developers to ensure it’s looked at by as many people as possible.

When you don’t rely on QAs for software quality, the application can be released to production more frequently, because there is no back and forth between developers and QAs and there is no separate testing pipeline. In the teams I mentioned above, there were multiple releases to production every day: basically, if the developers were happy with a change and the CI pipeline was happy with the build and tests, the code would be released to production. More often than not, the release is configured to happen automatically at the end of a successful test run. So the speed of value delivery is significantly higher. Another, perhaps intangible, effect of this process is that developers are really happy at work because they feel ownership over the quality of the application, get to deliver value to customers at a rapid pace and aren’t bogged down by “unnecessary red tape”.

Some people get concerned that this results in some bugs getting released to production, and they are partially right, as I discuss below. That said, since you can release more frequently, the bugs can be dealt with very rapidly. A bug is found or reported? You write a failing test to prove the existence of the bug, you fix the code, and you go through the same process as above, and the bug fix is pushed to production an hour or two after the bug is found.
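As a minimal sketch of that "failing test first" step: the function `apply_discount` and the bug it had are hypothetical, invented purely for illustration. The first test is the one you would write before touching the code; it fails against the buggy version and passes once the fix (the clamping line) is in.

```python
import unittest


def apply_discount(price_cents: int, percent: int) -> int:
    """Hypothetical function under test, shown here AFTER the fix.

    Imagined bug: a discount above 100% produced a negative price.
    The fix clamps the percentage to the 0..100 range.
    """
    percent = max(0, min(100, percent))  # the one-line fix
    return price_cents - (price_cents * percent) // 100


class ApplyDiscountRegressionTest(unittest.TestCase):
    def test_discount_over_100_percent_never_goes_negative(self):
        # Written first: this failed before the fix, proving the bug exists.
        self.assertEqual(apply_discount(1000, 150), 0)

    def test_regular_discount_is_unchanged(self):
        # Guards against the fix breaking the normal case.
        self.assertEqual(apply_discount(1000, 25), 750)
```

The regression test stays in the suite afterwards, so the CI pipeline keeps guarding against the same bug on every later release.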

You still need QAs

Having devs as quality gatekeepers is not all rosy though. When it comes to software quality, QAs are a lot more detailed, thorough and picky than developers. As a QA, you have a different mindset. You look at the app through a different lens. Your job is basically to crash the application or make it misbehave. Developers are good at creating applications, QAs are good at breaking them.

Also, why make developers, who are good at creating and enjoy doing it, do something they’re not very good at and don’t necessarily enjoy? Having QAs in your team means a better quality outcome. And since your developers spend less time on manual testing, they will have more time for coding and you can get more features out.

The ultimate software QA process

You should have QAs in your team. Your mileage might vary, but I think this works very well with one QA per 5 to 8 developers depending on several factors (e.g. how good the devs are, how good the QAs are, how much test coverage you have, the maintainability of your codebase etc). Your QAs are not in a different team. They are PART of the team. Your QAs work closely with BAs to define the story and acceptance criteria before it’s picked up by developers. They attend the story kick off with BAs and developers to highlight and discuss corner cases, gotchas and pitfalls. As developers work through the stories, the QAs pair with them to refine the acceptance criteria, help define acceptance tests, help write automated tests, test the feature as it’s being coded, answer questions along with BAs and basically work on the story alongside the rest of the team. This way, by the time the story is done, it should be properly tested as well and there should be no surprises for anyone. So the story is not dev done. It is DONE and it could go to production if it passes the peer review and the build. But the work of QA is not done yet.

Exploratory testing

In the absence of test automation, most QAs come up with a large body of test cases and run through them manually. QAs should not follow test scripts. Computers are really good and fast at following scripts. Humans are too smart and valuable for that. If your QAs follow a test script, automate it. The automated version will run significantly faster, can even run overnight and, since computers don’t get tired or distracted, is not prone to human error! This will also free up your QAs to do what humans are good at: creative work.

I have found QAs to be very effective in exploratory testing. That’s when they do their best to break the system; but we just said the code is pushed to production if it passes tests! In this model, there is no place for a test environment, because it unnecessarily slows down deployment to production. That leaves QAs with one place to test the application: in production! After the code is released to production, QAs will test the feature and application in production. You can never ensure the quality of an application better than by testing it in production, where it runs on the production hardware, configuration, setup etc. And BTW this applies to performance and load testing too!

Change gradually

This sounds like a scary concept and admittedly takes a lot of discipline and practice to master; but I’ve done it in a few teams and nothing feels as good or effective in comparison. I appreciate it is difficult to go from a traditional QA-heavy mindset to what seems like a cowboy style of software delivery. There are different things you can do to bridge this gap and build the trust. For example you can use something like GitHub Flow, where you deploy branches to production. That way, if you find a bug in production, you can just push master back up again. Obviously there are considerations around database migrations if you have relational databases; e.g. your database migrations should be backwards compatible so the old version of the application can still work with the performed database changes; otherwise pushing master back to production could break the application or cause data loss (if you roll your database back as a result).
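To make the backwards compatibility point concrete, here is a small sketch using Python's built-in `sqlite3` module. The `users` table, the `email` column and the two insert functions are assumptions made up for this example; the point is only that the migration adds a nullable column and neither renames nor drops anything, so the previous release keeps working if you have to push it back.

```python
import sqlite3

# In-memory database standing in for production (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")


def old_app_insert(db: sqlite3.Connection, name: str) -> None:
    # Query issued by the PREVIOUS release: it knows nothing about new columns.
    db.execute("INSERT INTO users (name) VALUES (?)", (name,))


def new_app_insert(db: sqlite3.Connection, name: str, email: str) -> None:
    # Query issued by the NEW release, which uses the added column.
    db.execute("INSERT INTO users (name, email) VALUES (?, ?)", (name, email))


# Backwards-compatible migration: only ADD a nullable column.
conn.execute("ALTER TABLE users ADD COLUMN email TEXT")

new_app_insert(conn, "Ada", "ada@example.com")

# Simulate pushing the old release back after finding a bug:
# its queries still run against the migrated schema, with no DB rollback.
old_app_insert(conn, "Grace")

rows = conn.execute("SELECT name, email FROM users ORDER BY id").fetchall()
```

A migration that renamed or dropped `name` instead would break `old_app_insert`, which is exactly the situation where rolling the code back forces you to roll the database back too.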

Also, don’t forgo your test or staging environment instantly. You can use it while you’re building up and getting used to this process. Keep the test environment and run your manual regression testing on it as a stage before pushing the code to production. Every bug you find will obviously reduce your trust in the full continuous delivery model promised above. It means you either don’t have good test coverage or good tests, or your devs are not disciplined enough, or your QAs are not properly involved during the development of stories. You need to chip away at these issues gradually. As you get better at it and the number of bugs you find during this stage gets closer to zero, you build up trust in the process and at some point you will be able to comfortably drop the manual testing stage altogether.

You might need a manual test stage for some changes

Some changes are too big, complex or otherwise difficult to make in a backwards compatible way. It’s not that you can’t do it; but it might cost significantly more if you did it that way. You would want to test these changes manually to avoid any critical bugs slipping through or bringing production down on release. You can do this by pushing the code to a feature branch and deploying that branch to a test environment for manual testing (similar to how GitHub Flow works). You can then either merge the branch back to trunk and push to prod, or just push to production from the branch and only merge to master after a successful release.

Feature Toggles

I see feature toggles as an essential ingredient for continuous delivery. If your changes are wrapped in a feature toggle, then they can be deployed to production with less worry. QAs could then log in to production, toggle the feature on for themselves, test the system with the feature toggled on, and only toggle the feature on globally (AKA release it) when they are happy with its quality.
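A per-user toggle like that can be sketched in a few lines. Everything here (the `FeatureToggle` class, the `new-checkout` feature, the user ids) is hypothetical and only meant to show the shape of the idea; real toggle systems usually store this state outside the process so it can change without a deploy.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class FeatureToggle:
    """Hypothetical in-memory toggle: off by default, can be enabled
    for individual users or for everyone."""

    name: str
    enabled_globally: bool = False
    enabled_for_users: Set[str] = field(default_factory=set)

    def enable_for(self, user_id: str) -> None:
        # A QA turns the feature on just for themselves in production.
        self.enabled_for_users.add(user_id)

    def release(self) -> None:
        # Toggling on globally is the actual "release".
        self.enabled_globally = True

    def is_enabled(self, user_id: str) -> bool:
        return self.enabled_globally or user_id in self.enabled_for_users


def checkout_page(toggle: FeatureToggle, user_id: str) -> str:
    # The new code path ships to production but stays dark behind the toggle.
    return "new checkout" if toggle.is_enabled(user_id) else "old checkout"


toggle = FeatureToggle("new-checkout")
toggle.enable_for("qa-alice")
qa_view = checkout_page(toggle, "qa-alice")          # QA sees the new feature
customer_view = checkout_page(toggle, "customer-1")  # customers do not, yet
```

Once the QA is happy, calling `toggle.release()` switches every user to the new path without another deployment.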

You can also automate the feature toggle rollout. You basically start with the feature completely toggled off. When the QA is happy with the feature, they trigger an event in the system that will automatically roll out the feature to a small test group. After the rollout, the test group’s error count, conversion rate, memory usage patterns, CPU usage patterns etc are automatically compared with those of a control group. If the outcome is desirable, the system rolls the feature out to more users and continues with the same process until the feature is rolled out to all users. If the test group shows any spikes in the number of errors or any other important indicator, then the system automatically toggles the feature off. In one project, we went as far as identifying the PR that led to a runtime regression, raising a ticket with the data collected from the experiment and assigning it to the team as a high priority. This is a very interesting approach; but I’m going to park this here as it’s not the focus of this article.
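The rollout loop described above could look roughly like the sketch below. It is deliberately simplified and rests on several assumptions of mine: it compares only one indicator (error rate), the stage percentages and the 10% tolerance are arbitrary example values, and `get_error_rates` is a hypothetical callback standing in for whatever monitoring system you have.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical callback: given the current rollout percentage, return
# (test_group_error_rate, control_group_error_rate) from your monitoring.
ErrorRates = Callable[[int], Tuple[float, float]]


def staged_rollout(
    get_error_rates: ErrorRates,
    stages: Sequence[int] = (1, 5, 25, 50, 100),  # example percentages
    max_relative_increase: float = 0.10,          # example tolerance
) -> int:
    """Walk through the stages; return the final rollout percentage.

    Returns 0 (feature toggled off) as soon as the test group's error
    rate exceeds the control group's by more than the tolerance.
    """
    current = 0
    for percent in stages:
        current = percent  # roll the feature out to this share of users
        test_rate, control_rate = get_error_rates(current)
        if test_rate > control_rate * (1 + max_relative_increase):
            return 0  # spike detected: toggle the feature off for everyone
    return current


def healthy(percent: int) -> Tuple[float, float]:
    # Test group behaves like the control group at every stage.
    return (0.010, 0.010)


def regresses_at_25(percent: int) -> Tuple[float, float]:
    # Errors spike once a quarter of users have the feature.
    return (0.050, 0.010) if percent >= 25 else (0.010, 0.010)


healthy_result = staged_rollout(healthy)
regressed_result = staged_rollout(regresses_at_25)
```

In a real system each stage would also wait long enough to collect meaningful data, and would compare the other indicators (conversion rate, memory, CPU) in the same way before moving on.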

The ultimate software delivery process

Let’s zoom out and look at the entire software delivery process. The ultimate software delivery process is the one that allows for rapid and continuous development and delivery of high quality features to customers.

This idea of QAs working alongside the team is not a new concept. It is just an extension of the DevOps movement. DevOps was created out of the frustration caused by the functional silos between dev and ops teams. Dev would throw their application over the wall to the Ops team for them to deploy, monitor and operate it in production. That led to miscommunication, issues in production, slow releases, politics and heavy change management processes, to name a few. Devs throwing the application over the wall to a QA team is no different. You still end up with functional silos which result in more or less the same amount of pain, slowness and politics.

Now let’s expand this idea to every role in your team. If BAs analyse the requirements, fully specify what needs to happen and throw it over the wall to devs, then devs become code monkeys and just implement what’s written on the card. Instead you want discussion between BAs and devs. Devs should discuss the stories in detail with BAs, challenge things that may not work well and work with BAs to define the work in a way that makes sense, not just from the business’ point of view but also from the technical one. Likewise, you don’t want your UX designers to design interfaces, workflows and interactions without consultation with the rest of the team or your Product Manager(s) to come up with a roadmap in isolation.

Everyone is in this together and this is why cross functional teams work so well. The smoothest software development process is one achieved through heavy collaboration on every aspect of work, from analysis to design to development to testing and deployment. The days of waterfall software development where these tasks were performed sequentially and by different groups of people are long gone. A software development team should now be a self contained team composed of every role required to go from an idea to a running feature in production.

I would love to hear your thoughts if you have had success with other processes and approaches.