Ways to Improve Your Development Process - Property-Based Testing

Defeating the lazy programmer. You know, yourself.
Jun 29 2021 by James Craig

At this point I've talked about a baseline along with the concept of fuzzing. Basic example-based unit tests get us to a point where we're somewhat confident that our software does what we expect when we give it known values. Fuzzing, on the other hand, gives us peace of mind that our software isn't going to blow up. That's great, but I'm lazy, so is there a way to get the advantages of both in one package? I don't know why I asked that as a question, as the answer is obviously yes or I wouldn't have brought it up. On top of that, you read the title, so you should already have an idea that I'm talking about property-based testing.

Property-Based Testing

Property-based testing generally came about in the late 1990s with libraries like QuickCheck. QuickCheck is a Haskell library that tests code automatically against the specifications (the properties) that the developer gives about how the code should behave. The library generates interesting inputs to see whether those specifications hold. When a specification fails, it does an interesting thing: it runs the code over and over again with a slightly modified set of input values in order to find the smallest input that still fails the test. This process is known as shrinking. For instance, if an array of three items fails, it may remove one item and see if the test continues to fail, removing one more each time until the test passes. This, for me, is one of the nicest bits about it: instead of staring at a huge string trying to figure out where the issue is, you usually get the minimal failing value. In the years since, there have been a number of improvements in this space, with people extending the idea to testing REST/GraphQL endpoints and machine learning algorithms, among other things.
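To make the shrinking idea concrete, here is a minimal hand-rolled sketch. It is not QuickCheck's actual algorithm, and the names `ShrinkSketch`, `NoNegatives`, and `Shrink` are made up for illustration. It only covers the "remove one item and see if it still fails" step described above, using a deliberately falsifiable property.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration of shrinking; real libraries use richer strategies.
public static class ShrinkSketch
{
    // Example property (an assumption for this sketch): "the list has no negative numbers".
    // It is easy to falsify, which gives us something to shrink.
    public static bool NoNegatives(IReadOnlyList<int> items) => items.All(x => x >= 0);

    // Repeatedly try removing a single element. Whenever the smaller list
    // still fails the property, keep it and start over from that list.
    public static List<int> Shrink(IReadOnlyList<int> failing, Func<IReadOnlyList<int>, bool> property)
    {
        var current = failing.ToList();
        var madeProgress = true;
        while (madeProgress)
        {
            madeProgress = false;
            for (var i = 0; i < current.Count; i++)
            {
                // Copy of the current list without the element at position i.
                var candidate = current.Where((_, index) => index != i).ToList();
                if (!property(candidate))
                {
                    current = candidate; // still fails, so keep the smaller input
                    madeProgress = true;
                    break;
                }
            }
        }
        return current;
    }
}
```

With this sketch, `ShrinkSketch.Shrink(new List<int> { 5, -3, 8, 12 }, ShrinkSketch.NoNegatives)` ends up with the single-element list containing -3, which is a lot easier to reason about than the original input.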

As far as how property testing relates to fuzzing, that's one of those debates that gets surprisingly heated. Generally, the way I view it is that property testing takes the concepts of fuzzing and unit testing and merges them together in an attempt to take the best features of both. For instance, property testing is similar to fuzzing in that it takes randomly generated values and checks whether the application has issues. With fuzzing, we're left with a black-box approach where the tool has no real idea of what the app is attempting to do. We are just looking for that exception to be thrown. With property testing, the developer knows basic facts about the application or bit of code in question. As such, we tell it what counts as a correct response and what counts as a failure, which makes it similar to unit tests. Another distinction is that fuzzing generally uses "dumb" data. It's usually completely random. With property tests, we tell it what the general shape of the randomized data should look like. This reduces the search space from the crazy levels that fuzzing deals with to one that's usually a lot more manageable, and it cuts the run time from hours, days, or weeks down to seconds or minutes.
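Here's a rough sketch of that difference in input shape. Everything in it (`ShapedInputSketch`, `SmallIntArray`, `SortedIsOrdered`, `HoldsForAll`, and the size and value ranges) is a hypothetical stand-in that I picked for illustration; a real library would supply the generators and the runner for you.

```csharp
using System;
using System.Linq;

// Hypothetical hand-rolled harness, not a real library API.
public static class ShapedInputSketch
{
    // Fuzz-style input: raw bytes with no knowledge of what the code expects.
    public static byte[] RandomBytes(Random rng, int length)
    {
        var buffer = new byte[length];
        rng.NextBytes(buffer);
        return buffer;
    }

    // Property-style input: we describe the shape, here a short array of small integers.
    public static int[] SmallIntArray(Random rng) =>
        Enumerable.Range(0, rng.Next(0, 21))          // length 0..20
                  .Select(_ => rng.Next(-1000, 1001)) // values -1000..1000
                  .ToArray();

    // The property: sorting keeps the element count and leaves neighbors in order.
    public static bool SortedIsOrdered(int[] items)
    {
        var sorted = items.OrderBy(x => x).ToArray();
        return sorted.Length == items.Length
            && sorted.Zip(sorted.Skip(1), (a, b) => a <= b).All(ok => ok);
    }

    // Run the property against `runs` shaped inputs; false means some input broke it.
    public static bool HoldsForAll(Func<int[], bool> property, int runs, int seed)
    {
        var rng = new Random(seed); // fixed seed so a failure can be reproduced
        for (var i = 0; i < runs; i++)
        {
            if (!property(SmallIntArray(rng))) return false;
        }
        return true;
    }
}
```

A call like `ShapedInputSketch.HoldsForAll(ShapedInputSketch.SortedIsOrdered, 200, 42)` checks the property against 200 shaped inputs. The raw bytes from `RandomBytes` would have to be parsed into something meaningful first, which is where much of a fuzzer's search time goes.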

Pros

  1. The run time for property tests is greatly reduced compared to fuzzing because we're limiting our search space to relevant data, which leaves us with more time to code.

  2. Because the data is mostly random, we end up with high branch coverage, similar to fuzzing.

Cons

  1. You have to have a high level of knowledge of the code in order to define the properties.

  2. Since we constrain the input values, we may never generate truly bizarre, security-critical inputs. Thus fuzzing is generally a better fit for security testing.

I will say that I generally like to find peer-reviewed papers that show a tangible benefit. Numbers on testing time or code coverage would be nice here, but amazingly this seems to be an area where research is lacking quite a bit. Property-based testing is generally accepted as very helpful, but finding results that show just how much is difficult at this point in time. That said, the practical experience of many individuals suggests that the above pros and cons hold true.

Looking at the pros alone, you may be thinking "Awesome, I'll just start using this instead of unit tests and fuzzing". The issue is that you have to have domain knowledge of the code that you are testing. And usually that knowledge has to be rather deep. You'll see why in a second. This will be a very simple example to show how deceptive it can be. Let's assume that we have code that adds 1 to an input value to get an index:

    // Returns the index that follows the given one.
    public int NextIndex(int value) => value + 1;

With property testing, we wouldn't be testing the specific resulting value because we'd be feeding the function random-ish values. Instead we'd be testing things that we know to be true. For instance, we may expect the result from the method above to be greater than the value that we sent in. That seems correct, but is that always the case? Since that's C# above, what happens if we pass in int.MaxValue? In a checked context the addition throws an OverflowException. By default, though, C# doesn't do overflow checking here, so we wrap around to int.MinValue. Do we want that to happen? I mean, maybe. Let's say that instead of an int, we're using a byte to hold the index of a circular buffer. When we hit 255, we loop back to 0. The fact is that without a lot more information about what the above code is being used for, you have no way of determining what the properties for that code should be. Similarly, you may make assumptions about the code only to find that those properties don't always hold true. Or what about an instance where you only have access to the library's API and not the internal code? In that instance you would have no idea how it picks the next index. That limits our ability to think up properties even more. So in instances where you are lacking information, a simple example-based unit test plus fuzzing may be as good as you can do.
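As a rough sketch of how those two readings lead to different properties, here is a small hand-rolled check. It assumes the method is meant to return the input plus one with default (unchecked) arithmetic. `NextBufferIndex`, `ResultIsGreater`, `BufferIndexWraps`, and `FirstFailure` are hypothetical names invented for this example.

```csharp
using System;

// Hypothetical sketch; the intended behavior of the method is an assumption.
public static class NextIndexProperties
{
    // Assumed intent: the input plus one, with wrap-around on overflow.
    public static int NextIndex(int value) => unchecked(value + 1);

    // Hypothetical circular-buffer variant that wraps from 255 back to 0.
    public static byte NextBufferIndex(byte value) => unchecked((byte)(value + 1));

    // Candidate property 1: the result is always greater than the input.
    public static bool ResultIsGreater(int value) => NextIndex(value) > value;

    // Candidate property 2: the index wraps to 0 after 255, otherwise it grows by one.
    public static bool BufferIndexWraps(byte value) =>
        NextBufferIndex(value) == (value == byte.MaxValue ? 0 : value + 1);

    // Try boundary values first, then random ones; return the first failing input, if any.
    public static int? FirstFailure(Func<int, bool> property, int randomRuns = 100, int seed = 1)
    {
        var rng = new Random(seed); // fixed seed so a failure can be reproduced
        int[] edges = { 0, 1, -1, int.MinValue, int.MaxValue };
        foreach (var edge in edges)
        {
            if (!property(edge)) return edge;
        }
        for (var i = 0; i < randomRuns; i++)
        {
            var value = rng.Next(int.MinValue, int.MaxValue);
            if (!property(value)) return value;
        }
        return null;
    }
}
```

Under these assumptions, `FirstFailure(NextIndexProperties.ResultIsGreater)` reports int.MaxValue as a counterexample, while `BufferIndexWraps` holds for every one of the 256 byte values. Neither result tells you which behavior is right. That still depends on what the index is used for.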

Property-Based Testing Tools

I'm going to be honest and say that I'm a bit biased on this one. You may be thinking that I'm saying that because of my own library, which uses these concepts, but no. I'm talking about FsCheck, a property-based testing library based on QuickCheck. Don't get me wrong, I'd love people to try out my library and give feedback, build upon it, etc. But FsCheck is a joy to work with. FsCheck is a library written in F# that works with C#, VB.NET, F#, etc. to bring property testing to the .NET world. If you're from the F# world then you probably know about it already. If you're from the C# world then this may be your first time hearing about it. The best part is that it even has a plugin for xUnit to make the experience pretty simple. And if you're using another language, you're in luck, as there is probably a library based on QuickCheck for you also. In all seriousness, they're everywhere.

That's the basic concept in a nutshell. So with all of that we have our first easy upgrade to our testing environment. Once you add it to your inventory you should be able to find a number of new bugs that you had never found before. Next time I'll talk about fault tolerance.

Items in the Series

  1. Unit Testing and Automation
  2. Fuzzing
  3. Property-Based Testing
  4. Mutation Testing
  5. Fault Injection