
I am working on a project with legacy code that does not have much code coverage. One idea to improve that is to enforce a rule that each code check-in must include a test, not only a unit test but a functional test as well, so that we can verify the existing code did not break.

I would like to get some ideas, both pros and cons, on this approach. It sounds like a huge undertaking, since it would need really good infrastructure so that developers can have their code fully tested before it is committed. How would one go about running automated unit tests as well as functional tests when code is checked in? Is this even a feasible idea?

  • I reopened this question because the underlying concept intrigues me. I don't think it is a good idea necessarily, but it is certainly worthy of a question.
    – maple_shaft
    Commented Jan 12, 2013 at 3:33
  • “enforce a rule that each code check-in must include a test, not only a unit test but a functional test as well”: Imagine a check-in made simply to correct a typo in the user interface. What would be the reaction of a developer forced to write a unit test and a functional test for that?
    Commented Mar 29, 2014 at 9:07

3 Answers


You should start by evaluating the problem you really want to solve. You have a nasty, confusing code base that is brittle and filled with technical debt. Those working on it are expected to know the software completely in order to know what other features they could be affecting; however, this is unrealistic, because the legacy code has gotten so bad over the years that your group has had a revolving door of developers.

There is hope to fix this, but I feel that enforcing functional tests at commit time is like learning to run before you have even started crawling.

1. Refactoring slowly

Get in the good habit of refactoring code as you work on various areas. Perhaps you are fixing a bug in one module, but the code is confusing, scattered, duplicated, and doesn't conform to any pattern or design. This is a good opportunity to take a few extra days to refactor that portion.

As you refactor, keep an eye on the ideal yet realistic design that you hope to achieve someday, and document your efforts well as you go.

2. Write unit tests for your REFACTORED code

How can you verify that the module or code you have refactored hasn't changed functional behavior? Write unit tests that verify the newly refactored code. I put the emphasis on writing unit tests only for refactored code, because unit tests written against poor legacy code almost always end up being poor unit tests.
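
To make this concrete, here is a minimal sketch of such a test in Python with pytest; the `billing` module and `normalize_invoice_total` function are hypothetical stand-ins for whatever you just refactored:

```python
# Minimal sketch (pytest style) of pinning down refactored behavior.
# `normalize_invoice_total` is a hypothetical stand-in for a routine you
# just refactored; the expected values should be captured from the
# observed behavior of the code *before* the refactor, so a passing
# test shows the refactor preserved it.

from decimal import Decimal

import pytest

from billing import normalize_invoice_total  # hypothetical module


def test_total_is_rounded_to_cents():
    assert normalize_invoice_total(Decimal("10.005")) == Decimal("10.01")


def test_negative_totals_are_rejected():
    with pytest.raises(ValueError):
        normalize_invoice_total(Decimal("-1"))
```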

On this note, enforcing that each check-in have a unit test would be counterproductive: you waste time writing unit tests for code that should be refactored anyway, and it encourages a rushed developer close to a deadline to check in incomplete or deficient unit tests.

3. Code review

Get your team in the habit of reviewing each other's code and designs. Enforce this procedure for any major check-in or feature, to make sure that code follows proper design, that modules are slowly being refactored, and that unit tests are being added to cover new and refactored code.

4. Automated builds and test suites

Establish automated builds that run after each check-in to verify that the code compiles, and configure them to run all existing unit tests and report any failures to the team. This is a superior means of enforcing commit quality.
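
Any CI server (Jenkins, TeamCity, etc.) handles this out of the box; purely as an illustration of the shape of the idea, here is a minimal sketch of a post-check-in runner in Python, where the notification addresses are hypothetical:

```python
#!/usr/bin/env python3
# Minimal sketch of a post-check-in test runner. A real CI server does
# this far more robustly; this only shows the shape of the idea. The
# e-mail addresses are hypothetical.

import smtplib
import subprocess
from email.message import EmailMessage


def run_suite():
    # Run the whole unit test suite; pytest returns non-zero on failure.
    result = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr


def notify_team(report):
    msg = EmailMessage()
    msg["Subject"] = "Build FAILED after check-in"
    msg["From"] = "ci@example.com"        # hypothetical
    msg["To"] = "dev-team@example.com"    # hypothetical
    msg.set_content(report)
    with smtplib.SMTP("localhost") as smtp:
        smtp.send_message(msg)


if __name__ == "__main__":
    ok, report = run_suite()
    if not ok:
        notify_team(report)
```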

After addressing these points, you should begin to see the benefits of functional testing in general. However, the negative aspects I mentioned about enforcing the existence of a unit test on every commit paint a pretty strong picture that it is not a worthwhile endeavor when you have so much room to improve code and feature quality elsewhere in the application.


    TL;DR: Do the analysis for your case. Weigh the cost against the benefit and do it if the benefits win.

    A lot depends on the cost of breaking the system versus the cost of writing a full functional test for every change. This can be further complicated by the developers' familiarity with test-driven development. It can be difficult, expensive, or both to create a test that captures all of the functional behavior and asserts on that state.

    We do ATDD (Acceptance Test Driven Development) at my work. The developers (yes, the developers) are responsible for writing a test before coding, built up just as you would a unit test. The test doesn't stop at a unit, but continues on until it covers the acceptance criteria. We use a standard xUnit framework and wire the test call directly to the top of the stack instead of starting up a server. Essentially we treat our system as the unit, rather than a function or object. Keeping the stack processing within the same process space as the test call eliminates timing issues around server startup and gives us more control over what we can change or even call in the stack. This allows us to create tests with mocks if we need a more controlled test, or even to write straight unit tests if we feel we need coverage for a specific function.
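
    To make the shape of this concrete, here is a minimal sketch of such a test; `handle_request` is a hypothetical top-of-stack entry point, the same function a real server would dispatch to:

    ```python
    # Minimal sketch of an acceptance test that treats the system as the
    # unit. `handle_request` is a hypothetical top-of-stack entry point,
    # called directly in-process, so no server startup is needed.

    import unittest

    from myapp.entry import handle_request  # hypothetical


    class PlaceOrderAcceptance(unittest.TestCase):
        def test_placed_order_is_retrievable(self):
            # Drive the full stack, exactly as a server dispatch would.
            created = handle_request("POST", "/orders", {"sku": "AB-1", "qty": 2})
            self.assertEqual(created["status"], 201)

            fetched = handle_request("GET", f"/orders/{created['id']}", None)
            self.assertEqual(fetched["body"]["sku"], "AB-1")
    ```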

    Cons

    • False-positive broken tests due to unmanaged resources (i.e., database, network, network services, etc.). This can be mitigated in many ways. In practice I have found failures from unmanaged resources to be both obvious and, with a little work, rare.

    • Functional tests (and most ATDD tests) only tell you that the system isn't working properly; they don't necessarily tell you which function, module, or object is defective. In practice, most errors I get come with a stack trace, so I know at least where the exception was thrown. While the origin of the exception is not always the same as the source of the error, it's usually not that hard to backtrack and find the source. There is also nothing preventing you from writing more granular tests (more classic unit tests) for more/better coverage.

    • Additional time upfront. This is true for all test-driven development in general: it costs more time upfront to develop the tests and the code. It can cost even more time specifically with ATDD, because you have to understand how to reset the system after a test so that you're always executing from a known state (a sketch of such a reset follows this list).
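
    As a minimal sketch of that reset, here is one way to do it with a pytest fixture; the `testdb` helper module is hypothetical:

    ```python
    # Minimal sketch of resetting the system to a known state around each
    # acceptance test. The `testdb` helpers are hypothetical; the point is
    # that every test starts and ends from the same baseline.

    import pytest

    import testdb  # hypothetical helper module for the test database


    @pytest.fixture(autouse=True)
    def known_state():
        testdb.load_fixtures("baseline")   # arrange: a known starting state
        yield                              # run the test
        testdb.truncate_all()              # clean up whatever the test created
    ```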

    Pros

    • Coverage of integration points (other functions, objects, databases, SOAP services, etc.). I started out doing only unit testing (any automated tests are better than none), and while that did prevent many kinds of defects, I found it didn't prevent the worst ones, which came from integration incompatibilities: a function or object returning information in a slightly different way than was expected. Sure, you can mitigate this with strong requirements, but the point is that otherwise you have no coverage of the actual integration points.

    • Coverage of the acceptance criteria, i.e. a working system. The near-certainty that your code works, which comes from having a suite of automated acceptance tests, is great. The fact that you can run the suite to verify everything works after a massive refactor/rewrite, after upgrading dependent systems, or after deploying to alternative platforms is a huge benefit.

    • Forces the developer to understand what the system behavior needs to be. Obviously this only works if the developer is the one writing the tests. I find this so valuable that I don't think I will ever consider changing this practice: it drives understanding and comprehension and eliminates ambiguities earlier in the process.

    Your Case, and The Differences

    In your case you have to weigh the cost of doing this style of development against the benefit. I would do a trial and see how it works for five changes, then review the costs and the benefits.

    UPDATE:

    I've simulated HTTP calls in two different ways:

    • A limited monkey patch: network calls to our own service become direct calls to the endpoint's handler, which keeps everything in the same process space. We limit the monkey patch because we may still need to make real network calls out to the DB, SOAP services, etc. (a sketch of this approach follows this list).
    • Our own HTTP GET and POST, which take data exactly as a user would send it, then transform it and pass it to our code's entry point (the top of our stack) the same way a server would.
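
    Here is a minimal sketch of the first approach, assuming the code under test uses the `requests` library; `myapp.routing.dispatch` and the host name are hypothetical:

    ```python
    # Minimal sketch of the "limited monkey patch": GETs aimed at our own
    # service are routed straight to the in-process handler, while calls
    # to other hosts (DB gateways, SOAP services, etc.) still go out for
    # real. `myapp.routing.dispatch` is a hypothetical top-of-stack router.

    from unittest import mock

    import requests

    import myapp.routing  # hypothetical

    OUR_HOST = "ourservice.internal"  # hypothetical
    _real_get = requests.get


    def patched_get(url, **kwargs):
        if OUR_HOST in url:
            # Stay in the test's process space: call the handler directly.
            return myapp.routing.dispatch("GET", url, **kwargs)
        return _real_get(url, **kwargs)  # real network for everything else


    def test_order_lookup():
        with mock.patch("requests.get", side_effect=patched_get):
            response = requests.get(f"http://{OUR_HOST}/orders/42")
            assert response.status_code == 200
    ```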

    Obviously this works better with RESTful services, because it eliminates having to test the UI component. However, you can automate the testing of the UI to some degree. What I have done in the past is use HTML parsing to assert that the data shows up in the page under the correct element, as identified by class, element, or context. This still lets you make calls and assert that the data the user requested is returned, but it leaves the rendering of the page as a gap (a sketch of the parsing assertion follows).
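
    For the HTML-parsing assertion, a minimal sketch using BeautifulSoup; `fetch_page` and the CSS class name are hypothetical:

    ```python
    # Minimal sketch of asserting that requested data shows up under the
    # right element in the returned HTML. BeautifulSoup does the parsing;
    # `fetch_page` and the class name are hypothetical.

    from bs4 import BeautifulSoup

    from myapp.testclient import fetch_page  # hypothetical in-process fetch


    def test_order_total_is_rendered():
        html = fetch_page("/orders/42")
        soup = BeautifulSoup(html, "html.parser")

        # The data must appear under the element identified by its class.
        total = soup.find("span", class_="order-total")
        assert total is not None
        assert total.get_text(strip=True) == "$19.99"
    ```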

    The classification of what kind of test it is depends on your definitions of functional/integration/acceptance tests. They are integration tests in that they cover integration points. They are functional tests in that they test a particular function of the system. They are acceptance tests in that they encode the success criteria by which we can say the feature has been implemented. They are always, always, ALWAYS automated tests.

    • Do you require developers to write the "functional" test before check-in, on top of unit/integration tests? In my case, I have to bring up the web container and deploy multiple wars/jars to perform the "functional" test, and over time the test suite will grow. Given that a developer's machine may not be able to handle that, I am thinking these should run on a remote box, with a Selenium hub/WebDriver performing the tests. One point I want to make clear: the goal of these functional tests is to validate that the developer is not breaking existing features, rather than code coverage.
      – bond
      Commented Jan 14, 2013 at 16:34
    • @bond: Our policy is to develop our acceptance tests in concert with the code (similar to unit tests). We do not deploy to or start up a server to run our test suite. Our tests are a lot like unit tests in that the test and the code are in the same process space. The HTTP is simulated, and the data is passed to the function at the top of the stack. This lets us retain granular control over the test should we need to do any monkey patching, mocking, setup, teardown, loading of fixtures, etc.
      – snakehiss
      Commented Jan 15, 2013 at 3:41
    • If you don't mind sharing, how are you simulating the HTTP (mocks)? I am assuming these are RESTful endpoints? It sounds more like a unit/integration test to me. My assumption for a functional test is that it actually tests through the UI that a user would see, more to check that the user experience is not broken in the process.
      – bond
      Commented Jan 15, 2013 at 17:22

    It really bugs me how often developers falsely assume they should (and could) do the work of professional testers. This could probably only be compared to the other notoriously bad intuition programmers have, about performance. Every time I see developers breaking their heads over SQA matters, it feels as if restaurant owners, worried about the quality of the products they use, decided to establish their own farm instead of looking for the right supplier.

    Check-in to a VCS is by design supposed to be reversible. Thus, it is only natural to check in first, then do functional testing, then revert if needed. That way the VCS has a documented trail of what was done and why, which can be greatly leveraged to help future maintainers avoid repeating the same mistake.

    • If the issue is that multiple programmers checking in concurrently can complicate this, the solution, again, is not to avoid check-ins but to perform and test them in independent branches, then merge.

    Given the above, the question would be better phrased as whether to do functional testing after each and every check-in. Taking into account that this is, as a rule, more time- and effort-consuming than build checks and unit tests, one had better think twice about whether it is worth it.

    I can easily imagine this making sense in mission-critical software, but for 99.99% of regular applications it is most likely overkill.

    The main consideration when deciding what approach to take to functional testing is, I believe, to clearly understand that these matters belong to professional testing, not development.

    Because of that, I think the most realistic solution would be to go to your testers (or hire them if you don't have QA at all 1, 2, 3) and check with them.

    • It is quite sad that while it is widely understood that the right answer to a legal question is typically "ask a lawyer", when it comes to matters clearly belonging to professional SQA it is rare to see the answer "ask a tester". And it is particularly painful to think of the consequences when developers waste their efforts doing the wrong job. It comes at a price: they have less time and focus to perform their primary duties, which in turn leads to worse design, code, and unit tests, and eventually to more bugs.
    • I disagree with the premise that it's natural to check in first and then do testing (unit or functional). There are certainly some tests that are harder or impractical to write without having the code first (whitebox, performance, etc.), but blackbox functional tests can (and I think should) be written before the code. This establishes a contract that must be met in order for the feature to be called complete.
      – snakehiss
      Commented Jan 12, 2013 at 16:16
    • @dietbuddha I am not talking about test development; black-box tests could indeed be written prior to the code, here you are right. When I was a tester, we designed and sometimes even wrote test code before developers began implementation; that's not a problem. What I am talking about assumes the functional tests are already there and ready to be executed; it's about developer code sitting on the developer's own machine (build checked, unit tested) and not yet checked in to the VCS. At that point it indeed makes no sense to block the check-in, precisely because check-ins are reversible.
      – gnat
      Commented Jan 12, 2013 at 16:24
    • ...oh, and regarding when and how to create and execute functional tests, I firmly believe this should be defined in a discussion between developer and tester, not by the developer alone. That's the point.
      – gnat
      Commented Jan 12, 2013 at 16:32
    • @gnat Hiring QA and testers isn't always in the budget, and doesn't always make sense (at least financially), but I really like your points on how a check-in is by its very nature supposed to be reversible. It is true that enforcing tests on it is counterintuitive.
      – maple_shaft
      Commented Jan 14, 2013 at 1:41
    • @maple_shaft Well, the need for testers would be easy to estimate if they did a trial run (as suggested in another answer) and recorded how much effort their developers waste on functional testing. Though given that they want a functional test at every check-in, I can predict the outcome with about 99.999% certainty. :) As for financial viability, there is a good note on that in the "Top Five (Wrong) Reasons..." article by Joel Spolsky (in my answer, the reference to this article is labelled "3").
      – gnat
      Commented Jan 14, 2013 at 6:34
