I recently had the chance to toy around with property-based testing. Property-based tests verify statements about the output of your code based on some input, where the same statement is verified for many different possible (admissible and inadmissible) inputs. Such tests rely heavily upon randomly generated objects and values. Even if you do not fully commit to property-based testing, having abstractions for generating random objects and values from your domain can simplify testing a lot:

  • In a typical unit-test-based setup, large portions of your tests consist of methods that construct test fixtures. Even if your test fixtures are parameterizable, it is cumbersome to combine fixtures. A generator-based approach can simplify your code base in that regard.
  • Oftentimes you find yourself in need of a test fixture for some class, but the actual test logic only cares about a single parameter that goes into the constructor of that class. You still have to come up with values for the rest of the parameters, even if they provide no value for the test. Not only does this increase the code inside your test case, but it clouds the distinction between parameters that are relevant for that particular test and those that are not. A generator-based approach can help here as well.

Since I develop with Java most of the time as part of my day job, I wondered if something similar, albeit very rudimentary, could be done using functional-style Java 8. Although there are good and feature-rich libraries out in the wild (cf. ScalaCheck for Scala, junit-quickcheck for Java), we will implement a simple combinator API for generators in this article for the sake of argument.

A First Draft of a Functional Interface for Generators

Let's start right away. A generator is something that knows how to generate values of a particular type. How it does that (by leveraging randomness, ...) is left to the actual implementation of the generator. We can express that in Java 8 with the following functional interface.

@FunctionalInterface
public interface Gen<T> {
    T sample();
}

A simple generator that we can implement using this interface is one that returns random non-negative integers.

public static Gen<Integer> nonNegativeInteger() {
    final Random r = new Random();
    return () -> {
        int i = r.nextInt();
        return i < 0 ? -(i + 1) : i;
    };
}
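The expression -(i + 1) in the listing above may look odd at first. A likely reason for preferring it over Math.abs is that Math.abs(Integer.MIN_VALUE) overflows and stays negative. The following is a minimal sketch of that edge case; the class name NonNegativeSketch and the helper toNonNegative are names chosen for this illustration only.

```java
import java.util.Random;

@FunctionalInterface
interface Gen<T> {
    T sample();
}

// Hypothetical holder class, used only for this illustration.
final class NonNegativeSketch {

    // Same mapping as in the listing above: folds negative ints onto 0..Integer.MAX_VALUE.
    static int toNonNegative(final int i) {
        return i < 0 ? -(i + 1) : i;
    }

    static Gen<Integer> nonNegativeInteger() {
        final Random r = new Random();
        return () -> toNonNegative(r.nextInt());
    }

    public static void main(final String[] args) {
        // Math.abs overflows for Integer.MIN_VALUE and returns a negative number.
        System.out.println(Math.abs(Integer.MIN_VALUE) < 0);  // true
        System.out.println(toNonNegative(Integer.MIN_VALUE)); // 2147483647
        System.out.println(toNonNegative(-1));                // 0
        System.out.println(nonNegativeInteger().sample() >= 0); // true
    }
}
```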

Combining Generators using map and flatMap

One aspect of functional programming is to provide a set of combinator functions to build more powerful abstractions from simpler ones. Let's assume that we want to implement a generator choose that is parameterizable with an integer-based range and provides values randomly selected from that range. Surely we could implement this in much the same way as the generator nonNegativeInteger.

public static Gen<Integer> choose(final int start, final int stopExclusive) {
    final Random r = new Random();
    return () -> {
        int i = r.nextInt();
        int j = i < 0 ? -(i + 1) : i;
        return start + j % (stopExclusive - start);
    };
}

This undoubtedly works, but it is somewhat cumbersome and we do not want to duplicate existing code. Since the generator choose differs from nonNegativeInteger only in the last part, would it not be better to just map the output of nonNegativeInteger to some other generator that performs the necessary transformation to satisfy the contract of choose?

map is such a combinator function. The following listing shows an implementation of map for our functional interface Gen<T>.

default <U> Gen<U> map(final Function<? super T, ? extends U> mapper) {
    return () -> mapper.apply(sample());
}

Our implementation of map seems pretty simple, but there is actually a lot going on here. map closes over the current Gen<T> instance and returns a new Gen<U> (please note that T and U do not necessarily represent different types). map accepts a mapper of type Function<? super T, ? extends U> which transforms the output of the inner Gen<T> to a value that respects the type restrictions of the outer generator Gen<U>. Thus, upon evaluation of the outer Gen<U> (by calling its sample method), a value of type T is generated from the inner Gen<T> and then passed to the mapper function to obtain a value of type U.
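To make the evaluation order tangible, here is a small sketch in which T is Integer and U is String. It uses a deterministic counter as the inner generator, which is a hypothetical helper introduced for this illustration, so that the result of each call to sample is predictable.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

@FunctionalInterface
interface Gen<T> {
    T sample();

    default <U> Gen<U> map(final Function<? super T, ? extends U> mapper) {
        return () -> mapper.apply(sample());
    }
}

// Hypothetical holder class, used only for this illustration.
final class MapSketch {

    // Hypothetical deterministic generator: yields 0, 1, 2, ... on successive calls.
    static Gen<Integer> counter() {
        final AtomicInteger next = new AtomicInteger();
        return next::getAndIncrement;
    }

    public static void main(final String[] args) {
        final Gen<Integer> inner = counter();
        // Building the outer generator does not sample the inner one yet.
        final Gen<String> outer = inner.map(n -> "item-" + n);
        // Each call to sample() draws a fresh value from inner and applies the mapper.
        System.out.println(outer.sample()); // item-0
        System.out.println(outer.sample()); // item-1
    }
}
```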

Having implemented map, we can reuse the generator nonNegativeInteger for our implementation of choose.

public static Gen<Integer> choose(final int start, final int stopExclusive) {
    return nonNegativeInteger().map(n -> start + (n % (stopExclusive - start)));
}

Much better and easier on the eye. There is another combinator function called flatMap which operates in a similar way. Instead of combining two generators using a mapper function that maps from T to U, we use a mapper function that maps from T to Gen<U>. This allows us to express a dependency between the output of the inner Gen<T> and the parameters for the outer Gen<U>.

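The following listing shows the implementation of flatMap for our functional interface Gen<T>. Note that the result is wrapped in a lambda: the inner generator is sampled anew each time the outer generator is sampled, rather than once when flatMap is called.

default <U> Gen<U> flatMap(final Function<? super T, Gen<U>> mapper) {
    // Defer sampling until the resulting generator itself is sampled.
    return () -> mapper.apply(sample()).sample();
}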
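As an illustration of such a dependency, consider a generator for lists whose length is itself generated. The sketch below assumes a flatMap that defers sampling until the resulting generator is sampled; the helpers below and listOfSize are hypothetical and exist only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.Function;

@FunctionalInterface
interface Gen<T> {
    T sample();

    // Assumption: sampling is deferred until the resulting generator is sampled.
    default <U> Gen<U> flatMap(final Function<? super T, Gen<U>> mapper) {
        return () -> mapper.apply(sample()).sample();
    }
}

// Hypothetical holder class, used only for this illustration.
final class FlatMapSketch {
    private static final Random RANDOM = new Random();

    // Hypothetical helper: integers in [0, boundExclusive).
    static Gen<Integer> below(final int boundExclusive) {
        return () -> RANDOM.nextInt(boundExclusive);
    }

    // Hypothetical helper: a list of exactly `size` samples drawn from `elements`.
    static <T> Gen<List<T>> listOfSize(final int size, final Gen<T> elements) {
        return () -> {
            final List<T> result = new ArrayList<>(size);
            for (int i = 0; i < size; i++) {
                result.add(elements.sample());
            }
            return result;
        };
    }

    // The length is generated first; the list generator is then parameterized with it.
    static Gen<List<Integer>> shortList() {
        return below(5).flatMap(size -> listOfSize(size, below(100)));
    }

    public static void main(final String[] args) {
        // Prints a list with 0 to 4 elements, each in [0, 100).
        System.out.println(shortList().sample());
    }
}
```

With map alone this would not be expressible, because the mapper would have to return a finished list rather than a generator that depends on the sampled length.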

How can we put flatMap to use? We could, for instance, implement a generator weighted that accepts a probability between 0.0 and 1.0 and two generators. If some random value is below the given probability, it selects the first generator, otherwise the second generator. Given that we have already implemented some generator doubleGen that generates random values between 0.0 and 1.0 (double itself is a reserved word in Java and cannot be used as a method name), we could implement weighted as follows:

public static <T> Gen<T> weighted(final double probability, final Gen<T> genT1, final Gen<T> genT2) {
    return doubleGen().flatMap(d -> d < probability ? genT1 : genT2);
}
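A quick way to convince ourselves that weighted behaves as intended is to count outcomes over many samples. The sketch below is self-contained and makes two assumptions: doubleGen is a hypothetical stand-in for the double generator mentioned above (built on Random.nextDouble), and flatMap defers sampling until the resulting generator is sampled.

```java
import java.util.Random;
import java.util.function.Function;

@FunctionalInterface
interface Gen<T> {
    T sample();

    // Assumption: sampling is deferred until the resulting generator is sampled.
    default <U> Gen<U> flatMap(final Function<? super T, Gen<U>> mapper) {
        return () -> mapper.apply(sample()).sample();
    }
}

// Hypothetical holder class, used only for this illustration.
final class WeightedSketch {

    // Hypothetical stand-in for the double generator: values in [0.0, 1.0).
    static Gen<Double> doubleGen() {
        final Random r = new Random();
        return r::nextDouble;
    }

    static <T> Gen<T> weighted(final double probability, final Gen<T> genT1, final Gen<T> genT2) {
        return doubleGen().flatMap(d -> d < probability ? genT1 : genT2);
    }

    public static void main(final String[] args) {
        final Gen<String> coin = weighted(0.8, () -> "heads", () -> "tails");
        int heads = 0;
        for (int i = 0; i < 10_000; i++) {
            if (coin.sample().equals("heads")) {
                heads++;
            }
        }
        // Expect a count in the vicinity of 8000.
        System.out.println("heads: " + heads + " of 10000");
    }
}
```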

Example: Generating Test Data

If the usage of map and flatMap is still too abstract for you, consider the following example. Suppose we have a simple representation of a User that consists of a username, an email address and a hashed password. In order to be able to generate instances of User, we have to provide generators for its attributes first. We will combine those generators later to obtain a Gen<User> that produces instances of User. So let's start off with a generator that produces valid email addresses.

The building blocks of an email address are identifiers for recipients/senders, some constant symbols like @, and domain names. We can start off with some basic generators for the latter that provide us with randomly chosen domain names. We will use the oneOf generator, which randomly selects a value from the provided ones.

private static Gen<String> topLevelDomainNameGen() {
    return oneOf("com", "de", "at", "ch", "ca", "uk", "gov", "edu");
}

private static Gen<String> domainNameGen() {
    return oneOf("habitat47", "google", "spiegel");
}
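The listings in this section rely on the core generators oneOf and constant, whose implementations are not shown in this article. The following is a minimal sketch of how they might look; it is a guess for illustration purposes and not necessarily the implementation from the accompanying repository. The class name CoreGens is hypothetical.

```java
import java.util.Random;

@FunctionalInterface
interface Gen<T> {
    T sample();
}

// Hypothetical holder class, used only for this illustration.
final class CoreGens {
    private static final Random RANDOM = new Random();

    // Always yields the same value.
    static <T> Gen<T> constant(final T value) {
        return () -> value;
    }

    // Picks one of the given values uniformly at random on every call to sample().
    @SafeVarargs
    static <T> Gen<T> oneOf(final T... values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("oneOf needs at least one value");
        }
        return () -> values[RANDOM.nextInt(values.length)];
    }

    public static void main(final String[] args) {
        System.out.println(constant("@").sample());         // @
        System.out.println(oneOf("com", "de", "at").sample()); // one of the three values
    }
}
```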

Using the core generators constant and oneOf as well as the generators we just wrote, we are now able to express a generator that is able to produce valid email addresses.

private static Gen<String> validEmailGen(final Gen<String> firstNameGen, final Gen<String> lastNameGen) {
    return firstNameGen
            .flatMap(firstName -> oneOf("-", ".", "_")
            .flatMap(nameDelimiter -> lastNameGen
            .flatMap(lastName -> constant("@")
            .flatMap(at -> domainNameGen()
            .flatMap(domainName -> constant(".")
            .flatMap(domainDelimiter -> topLevelDomainNameGen()
            .map(topLevelDomain -> firstName + nameDelimiter + lastName + at + domainName + domainDelimiter + topLevelDomain)))))));
}

Notice that we use flatMap to combine the individual generators. Each subsequent flatMap is called inside the lambda of the previous one, so that previously bound variables remain in scope. Also note that the last combinator must be a map, because its lambda returns a plain value rather than another generator. At that point we have full access to all values generated so far and can combine them to yield a String that represents the email address.

A Gen<User> can be expressed using the same technique.

public static Gen<User> userGen() {
    return alphaNumStringGen(8)
            .flatMap(firstName -> alphaNumStringGen(8)
            .flatMap(lastName -> validEmailGen(constant(firstName), constant(lastName))
            .flatMap(email -> alphaNumStringGen(14)
            .map(hashedPassword -> new User(firstName + " " + lastName, email, hashedPassword)))));
}

We can use the following piece of code to test-drive userGen.

final User user = userGen().sample();
System.out.println(user);

Executing this code yields

User{username='AXsIfyAs U1P1ZaO0', email='AXsIfyAs-U1P1ZaO0@spiegel.gov', hashedPassword='tojVq1Lz8aa2u4'}

in my case.
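The generator alphaNumStringGen used by userGen is not shown in this article. A minimal sketch of how such a generator could be written is given below; the alphabet, the class name StringGens and the implementation details are assumptions made for this illustration.

```java
import java.util.Random;

@FunctionalInterface
interface Gen<T> {
    T sample();
}

// Hypothetical holder class, used only for this illustration.
final class StringGens {
    // Assumed alphabet: ASCII letters and digits.
    private static final String ALPHABET =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
    private static final Random RANDOM = new Random();

    // Yields random strings of exactly `length` characters drawn from ALPHABET.
    static Gen<String> alphaNumStringGen(final int length) {
        return () -> {
            final StringBuilder sb = new StringBuilder(length);
            for (int i = 0; i < length; i++) {
                sb.append(ALPHABET.charAt(RANDOM.nextInt(ALPHABET.length())));
            }
            return sb.toString();
        };
    }

    public static void main(final String[] args) {
        // Prints a random 8-character alphanumeric string.
        System.out.println(alphaNumStringGen(8).sample());
    }
}
```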

Combining Generators with Predicates

It is not hard to imagine other combinator functions that play well together with existing functional interfaces, such as Predicate<T>. Suppose we want to implement the suchThat combinator, which is a function on Gen<T> that accepts a Predicate<? super T> and samples random values until it finds a value that satisfies the predicate.

default Gen<T> suchThat(final Predicate<? super T> predicate) {
    return () -> Stream
            .generate(this::sample)
            .filter(predicate)
            .findFirst()
            .get();
}

Please note that this implementation of suchThat does not guard against infinite loops when searching for an admissible sample. If no value that the generator can produce satisfies the predicate, sample will never terminate.
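One possible way to address the termination problem is to cap the number of attempts. The overload below, which takes a maxAttempts parameter, is a hypothetical variant and not part of the API presented in this article; it fails with an IllegalStateException once the budget is exhausted.

```java
import java.util.function.Predicate;
import java.util.stream.Stream;

@FunctionalInterface
interface Gen<T> {
    T sample();

    // Hypothetical variant: gives up after maxAttempts samples instead of looping forever.
    default Gen<T> suchThat(final Predicate<? super T> predicate, final int maxAttempts) {
        return () -> Stream
                .generate(this::sample)
                .limit(maxAttempts)
                .filter(predicate)
                .findFirst()
                .orElseThrow(() -> new IllegalStateException(
                        "no admissible sample after " + maxAttempts + " attempts"));
    }
}

// Hypothetical holder class, used only for this illustration.
final class BoundedSuchThatSketch {
    public static void main(final String[] args) {
        // A generator that can never satisfy the "is even" predicate.
        final Gen<Integer> alwaysOdd = () -> 1;
        try {
            alwaysOdd.suchThat(n -> n % 2 == 0, 100).sample();
        } catch (final IllegalStateException e) {
            System.out.println(e.getMessage()); // no admissible sample after 100 attempts
        }
    }
}
```

Failing loudly has the advantage that an overly strict predicate shows up as a test error instead of a hanging build.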

Let's put the implementation directly to use. Suppose we want to express a Gen<Integer> that randomly selects even values from a given range. We could simply reuse our implementation of choose and combine it with a suitable predicate and pass this to suchThat, like so:

choose(1, 100).suchThat(n -> n % 2 == 0)
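With generators like this in place, a very rudimentary property check is only a few lines away. The sketch below samples the generator repeatedly and verifies that a property holds for every sample. The forAll helper is a hypothetical addition for this illustration and is far simpler than what libraries such as junit-quickcheck provide.

```java
import java.util.Random;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Stream;

@FunctionalInterface
interface Gen<T> {
    T sample();

    default <U> Gen<U> map(final Function<? super T, ? extends U> mapper) {
        return () -> mapper.apply(sample());
    }

    default Gen<T> suchThat(final Predicate<? super T> predicate) {
        return () -> Stream.generate(this::sample).filter(predicate).findFirst().get();
    }

    // Hypothetical helper: true if the property holds for `runs` freshly sampled values.
    default boolean forAll(final int runs, final Predicate<? super T> property) {
        return Stream.generate(this::sample).limit(runs).allMatch(property);
    }
}

// Hypothetical holder class, used only for this illustration.
final class EvenNumbersSketch {

    static Gen<Integer> nonNegativeInteger() {
        final Random r = new Random();
        return () -> {
            final int i = r.nextInt();
            return i < 0 ? -(i + 1) : i;
        };
    }

    static Gen<Integer> choose(final int start, final int stopExclusive) {
        return nonNegativeInteger().map(n -> start + (n % (stopExclusive - start)));
    }

    public static void main(final String[] args) {
        final Gen<Integer> evens = choose(1, 100).suchThat(n -> n % 2 == 0);
        // Property: every sample is even and lies within [1, 100).
        System.out.println(evens.forAll(1_000, n -> n % 2 == 0 && n >= 1 && n < 100)); // true
    }
}
```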

Conclusion

Using functional generators for test data generation is an easy-to-implement technique that can enhance the readability and reusability of your test code and lead the way to more powerful testing strategies, like property-based testing. If you are interested in learning more about property-based testing (and I encourage you to do so), then you should check out either ScalaCheck for Scala or junit-quickcheck for Java.

The code presented in this article is available on GitHub. It includes more core generators to build upon and demonstrates their usefulness by example.

Hi there! I'm Markus!

I'm an independent freelance IT consultant, a well-known expert for Apache Kafka and Apache Solr, software architect (iSAQB certified) and trainer.
