How to Incrementally Develop an Algorithm using Test Driven Development — The Prime Factors Kata

Photo by Bahador on Unsplash

This article was first published on February 20, 2020 on my personal blog here: https://www.andreadiotallevi.com/blog/how-to-incrementally-develop-an-algorithm-using-test-driven-development.

In the world of software development, test-driven development (TDD) is a discipline where failing tests are written first, and only then is the actual software code created, aiming to pass the newly-generated tests.

In this article, I will explore the fundamental components of test-driven development, with an example of how to incrementally develop an algorithm using the TTD cycle.

The example I will use is the prime factors kata, which challenges you to find all the prime factors of a given integer. For instance, the prime factors of is the array , because .

This article and the algorithm development closely follow this fascinating talk by Robert C. Martin, aka Uncle Bob, which inspired me to thoroughly master the underlying concepts behind test-driven development.

What is Test-Driven Development?

The core of the test-driven development cycle, also called red-green-refactor, revolves around three simple steps, which are repeated throughout the software development life cycle:

1. Red: write a new test and confirm it fails.

2. Green: write the simplest code to make the new test pass.

3. Refactor: optimise the code, making sure all the tests still pass.

1. Red: write a new test and confirm it fails

In test-driven development, each new cycle begins with writing a test. This test should be as simple as possible, testing a very specific component of a larger feature.

This is the first differentiating feature of TTD versus writing unit tests after the code is written, as it makes the developer focus on the requirements before writing the code.

We are now ready to write our first test, which expects the yet undefined function to return an empty array for the given integer . We will use the JavaScript language and the Jest framework for testing, while the full codebase can be found here.

Once the test is created, the next step is to confirm that the test fails. By confirming that the new test fails, and does so for the reasons we expect, we can be confident that the test is useful, and will be beneficial once we write the code necessary to pass the test. Let’s run the tests:

2. Green: write the simplest code to make the new test pass

After confirming that the test fails, we now must write the code that will allow the test to pass. The idea here is that we should not write any additional code that goes beyond the scope of the test, but only focusing on writing the least amount of code to make the test pass.

The code written here will likely be rough and not finalised, but that’s ok as the entire test-driven development process encourages constant refactoring, meaning our current code is likely to change numerous times in the future.

We can now start writing our production code and let the error message guide us. Let’s define the function:

Let’s run the tests again:

In order to make the test pass with the least amount of keystrokes, we can just return an empty array:

3. Refactor: optimise the code, making sure all the tests still pass

The process of writing the code necessary to allow the test to pass may have introduced some duplications or inconsistencies. That’s perfectly normal and expected and it depends on the fact that it is hard to write good code when we try to make it work.

The importance of the refactoring step is to take the time to locate those problem areas and to focus on simplifying the codebase, confirming that we haven’t accidentally introduced any additional bugs, or changed something that now causes a previously passing test to fail.

Incremental Algorithm Development

Now that we have analysed all three steps of test-driven development, we can continue with the algorithm design and learn how to incrementally develop the code test by test.

From a constant to a variable

It’s time for the next test, which expects the prime factors of to be an array with in it:

We expect the test to fail:

We can make this pass by modifying our algorithm. We still want to start with that empty array constant, but we want to be able to modify it before returning it. Therefore, we need to transform that constant into a variable named and introduce an statement, where we check whether the given number is greater than :

We expect the tests to pass:

Now, some of you might think we have just hacked an statement in and that there was no design in there. However, watch what happens to that statement as we continue, which is truly fascinating.

As the tests get more specific, the code gets more generic

The next test will be to expect the prime factors of to be an array with in it:

This test should fail. Notice how we are getting into a hypothesis experiment loop, as we write a test and expect it to fail:

Our hypothesis was correct. Now, we need to make this test pass with the fewest keystrokes possible. Inside the statement we can change the to and the tests will pass:

Note that we made again a change from a constant to a variable. A change from something specific to something more general. In fact, if you think carefully now, all the changes we have made to the production code have been towards generality. We didn’t have to do that, as we could have put more statements in. However, that would have violated this new rule: as the tests get more specific, the code gets more generic.

Recognising emerging patterns

We expect the prime factors of to be an array with two in it ():

We expect the test to fail:

If we look at the code, we can get it to pass by putting another if statement that checks whether the number is divisible by 2 and then reduces the original number by the factor 2:

Let’s run the tests again:

We can see the new test passed, but the test that failed is the test number 2. After the nested statement we need to avoid adding to the array, so we must add another statement that checks whether the number is greater than :

Now, we expect the tests to pass:

However, that new statement is an interesting one, as it is the same as the one above it. Since we have two statements in the row, we can take the second one and move it completely out of the loop and the tests still pass:

We have now two statements in the row with the same predicate and you can see how the last statement looks like the end condition of a loop. However, let’s move on for now.

Comprehensive test suite

The next three tests are the following:

And they all pass:

A while is a general form of an if statement

We expect the prime factors of to be an array with three in it ():

We expect this test to fail, as there is nothing in our code that puts three elements in the array:

How can we get this to pass with the fewest possible keystrokes? We can change the statement to a loop, that keeps checking whether the number is divisible by :

Let’s run the tests:

What happened there? A is a general form of an statement. An statement is a degenerate form of a loop. Again, this is a move towards generality. We have made the code more general simply by letting the statement continue until it is false, which is interesting. Let’s move on to .

For loops are concise while loops

We expect the prime factors of to be an array with two in it ():

We expect this test to fail as there is nothing in our code that can put two s in it:

If we look at the code, we realise that there is a little engine that factors out all the s. Therefore, we can repeat that loop and factor out all the s:

And we expect the tests to pass:

However, this violates the rule of making the code more general. In fact, by duplicating the code, we have made the code more specific. As we know what we want to do now, we just need to do it in a more general way.

First thing we need to do is change that into a variable called . Then, we need to take that code and put it into another loop, ending when the number is reduced to . So we will execute this loop while the number is greater than , by changing the first statement to a loop. Finally, we need to take the initialiser outsides the loop and increment it at the end of the loop block:

However, there is still some refactoring we can do. The first loop cannot exit until is equal to , therefore the following statement is not necessary anymore and can be removed (it really was the terminating condition of a loop):

There is still more refactoring we can do. While loops are wordy and can be transformed into loops:

And we can do the same for the outer loop:

Does it really work?

To check if our algorithm really works, let’s write a final test for a big number with a lot of prime factors:

It passes!

Although the algorithm works, there is still a small improvement we can make to speed the algorithm up in the case of large prime numbers. We don’t need to loop while is greater than , but we can loop while is greater than the square root of the original :

Conclusion

It is interesting to notice that this elegant algorithm did not come about because we thought it through beforehand, but emerged while we were developing it, with a little test case at the time.

This implies that it is possible to derive algorithms incrementally to solve algorithmic problems, if we assume we follow the rule that everything we do to the production code is done to make it more general.

I hope this article was helpful to you and will make you want to further explore the magic behind test-driven development! :)

Software engineer, architect, pianist, passionate about algorithms, web technologies and learning new skills.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store