Monday, December 21, 2015

Make Your Factories Beautiful

Every Java programmer worth the name knows about the Factory Pattern. It is a convenient and standardized way to reduce coupling by teaching a component how to fish rather than handing it the fish. In large systems, however, the pattern adds a lot of boilerplate code. For every entity you need a number of factories that produce different implementations of that entity, which is both tiresome and unnecessary to write. Getting rid of that boilerplate is only one of many new patterns that we have come to use at Speedment.
Here is a typical example where you want a car trader to be able to create instances of the Car interface without knowing the exact implementation.
Car.java
public abstract class Car {
    private final Color color;

    public interface Factory {
        Car make(Color color);
    }

    protected Car(Color color) {
        this.color = color;
    }

    public abstract String getModel();
    public abstract int getPrice();
}
Volvo.java
public final class Volvo extends Car {
    public Volvo(Color color) {
        super(color);
    }

    public String getModel() { return "Volvo"; }
    public int getPrice() { return 10_000; } // USD
}
Tesla.java
public final class Tesla extends Car {
    public Tesla(Color color) {
        super(color);
    }

    public String getModel() { return "Tesla"; }
    public int getPrice() { return 86_000; } // USD
}
VolvoFactory.java
public final class VolvoFactory implements Car.Factory {
    public Car make(Color color) { return new Volvo(color); }
}
TeslaFactory.java
public final class TeslaFactory implements Car.Factory {
    public Car make(Color color) { return new Tesla(color); }
}
CarTrader.java
public final class CarTrader {

    private Car.Factory factory;
    private int cash;

    public void setSupplier(Car.Factory factory) {
        this.factory = factory;
    }

    public Car buyCar(Color color) {
        final Car car = factory.make(color);
        cash += car.getPrice();
        return car;
    }
}
Main.java
    ...
        final CarTrader trader = new CarTrader();
        trader.setSupplier(new VolvoFactory());
        final Car a = trader.buyCar(Color.BLACK);
        final Car b = trader.buyCar(Color.RED);
        trader.setSupplier(new TeslaFactory());
        final Car c = trader.buyCar(Color.WHITE);
    ...
One thing you might not have noticed yet is that most of these components are redundant from Java 8 and up. Since the factory interface declares exactly one abstract method, it qualifies as a @FunctionalInterface, so we don’t need the factory classes. We can simply pass the constructor of the implementing class as a method reference!
Car.java
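public abstract class Car {
    private final Color color;

    @FunctionalInterface
    public interface Factory {
        Car make(Color color);
    }

    // The constructor and abstract methods are unchanged from the first version.
    protected Car(Color color) {
        this.color = color;
    }
    public abstract String getModel();
    public abstract int getPrice();
}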
Main.java
    ...
        trader.setSupplier(Volvo::new);
        trader.setSupplier(Tesla::new);
    ...
Notice that no changes are needed to the implementing classes Volvo and Tesla. Both factory classes can now be removed and you are left with a much more compact system!
(For simple examples such as this, the Factory interface is not needed at all. You could just as well make CarTrader take a Function<Color, Car>. The advantage of specifying an interface for the factory is that it is easier to understand and that it lets you change the parameters of the constructor without changing the code that uses the factory.)
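
The Function<Color, Car> alternative mentioned above could look roughly like the sketch below. The Color enum, the getColor() and getCash() accessors and the class name FunctionCarTrader are assumptions made only to keep the example self-contained; they are not part of the original listing.

```java
import java.util.function.Function;

class FunctionFactoryDemo {
    public static void main(String[] args) {
        final FunctionCarTrader trader = new FunctionCarTrader();
        trader.setSupplier(Volvo::new);   // constructor reference used as a factory
        final Car a = trader.buyCar(Color.BLACK);
        trader.setSupplier(Tesla::new);
        final Car b = trader.buyCar(Color.WHITE);
        System.out.println(a.getModel() + ", " + b.getModel()
            + ", cash = " + trader.getCash());
    }
}

// Hypothetical stand-in for whatever Color type the application uses.
enum Color { BLACK, RED, WHITE }

abstract class Car {
    private final Color color;
    protected Car(Color color) { this.color = color; }
    public Color getColor() { return color; }
    public abstract String getModel();
    public abstract int getPrice();
}

final class Volvo extends Car {
    Volvo(Color color) { super(color); }
    @Override public String getModel() { return "Volvo"; }
    @Override public int getPrice() { return 10_000; } // USD
}

final class Tesla extends Car {
    Tesla(Color color) { super(color); }
    @Override public String getModel() { return "Tesla"; }
    @Override public int getPrice() { return 86_000; } // USD
}

// Same trader as in the article, but the supplier is a plain Function
// instead of the dedicated Car.Factory interface.
final class FunctionCarTrader {
    private Function<Color, Car> factory;
    private int cash;

    void setSupplier(Function<Color, Car> factory) { this.factory = factory; }

    Car buyCar(Color color) {
        final Car car = factory.apply(color);
        cash += car.getPrice();
        return car;
    }

    int getCash() { return cash; }
}
```

Running the demo prints "Volvo, Tesla, cash = 96000". The trade-off is visible in the signature: if the constructors later take a second argument, every Function<Color, Car> in the code base has to change, whereas a named factory interface only changes in one place.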

Tuesday, December 8, 2015

Streaming over Maps with Java 8

In this article I will show you how Speedment Open Source streams efficiently over standard Java maps, expanding the Stream interface into something called a MapStream! This addition will make it easier to keep your streams concise and readable even in complex scenarios. Hopefully this will allow you to keep streaming without prematurely collecting the result.

One of the largest features in Java 8 was the ability to stream over collections of objects. By adding the .stream() method to the Collection interface, every collection in the Java language suddenly gained this new ability. Other data structures, like the Map interface, do not implement the method since they are not, strictly speaking, collections.

The MapStream will take two type parameters, a key and a value. It will also extend the standard Stream interface by specifying Map.Entry<K, V> as the type parameter. This will allow us to construct a MapStream directly from any Java map.

public interface MapStream<K, V> extends Stream<Map.Entry<K, V>> {
    ...
}

Java supports covariant return types: a subtype may change the return type of an overridden method as long as the new return type is a subtype of the old return type. We will use this when defining the MapStream interface so that each chaining operation returns a MapStream instead of a Stream.

public interface MapStream<K, V> extends Stream<Map.Entry<K, V>> {

    @Override 
    MapStream<K, V> filter(Predicate<? super Map.Entry<K, V>> predicate);

    @Override 
    MapStream<K, V> distinct();

    @Override
    MapStream<K, V> sorted(Comparator<? super Map.Entry<K, V>> comparator);

    ...
}
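
The return-type narrowing used in the overrides above can be seen in isolation in the following sketch. The class names (Animal, Dog, Shelter, Kennel) are made up for illustration and have nothing to do with Speedment.

```java
class CovariantReturnDemo {
    public static void main(String[] args) {
        final Kennel kennel = new Kennel();

        // No cast needed: the overriding method declares the narrower type.
        final Dog dog = kennel.adopt();
        System.out.println(dog.bark());

        // Through the parent type the same call is typed as Animal,
        // but the overriding method is still the one that runs.
        final Shelter shelter = kennel;
        final Animal animal = shelter.adopt();
        System.out.println(animal.name());
    }
}

class Animal {
    String name() { return "animal"; }
}

class Dog extends Animal {
    @Override String name() { return "dog"; }
    String bark() { return "Woof"; }
}

class Shelter {
    Animal adopt() { return new Animal(); }
}

class Kennel extends Shelter {
    // Allowed because Dog is a subtype of Animal.
    @Override Dog adopt() { return new Dog(); }
}
```

This prints "Woof" followed by "dog". The same mechanism is what lets a chained call on a MapStream keep its MapStream type without any casting at the call site.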

Some operations will still need to return an ordinary Stream. If the operation changes the type of the streamed element, we can’t ensure that the new type will be a Map.Entry. We can, however, add additional methods for mapping between types with key-value pairs.


    @Override
    <R> Stream<R> map(Function<? super Map.Entry<K, V>, ? extends R> mapper);
    
    <R> Stream<R> map(BiFunction<? super K, ? super V, ? extends R> mapper);

In addition to the Function that lets the user map from an Entry to something else, he or she can also map from a key-value pair to something else. This is convenient, sure, but we can also add more specific mapping operations now that we are working with pairs of values.


    <R> MapStream<R, V> mapKey(BiFunction<? super K, ? super V, ? extends R> mapper);

    <R> MapStream<K, R> mapValue(BiFunction<? super K, ? super V, ? extends R> mapper);

It doesn’t look like much, but the difference is apparent when using the API:


// With MapStream
final Map<String, List<Long>> map = ...;
MapStream.of(map)
    .mapKey((k, v) -> k + " (" + v.size() + ")")
    .flatMapValue((k, v) -> v.stream())
    .map((k, v) -> k + " >> " + v)
    .forEach(System.out::println);

// Without MapStream
final Map<String, List<Long>> map = ...;
map.entrySet().stream()
    .map(e -> new AbstractMap.SimpleEntry<>(
         e.getKey() + " (" + e.getValue().size() + ")",
         e.getValue()
    ))
    .flatMap(e -> e.getValue().stream()
        .map(v -> new AbstractMap.SimpleEntry<>(e.getKey(), v))
    )
    .map(e -> e.getKey() + " >> " + e.getValue())
    .forEach(System.out::println);
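
To try the key-value operations without implementing every method of the Stream interface, a small wrapper is enough. The sketch below is not Speedment's MapStream: PairStream is a hypothetical class that simply delegates to an inner stream of entries and offers mapKey, mapValue, flatMapValue and the BiFunction variant of map.

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.stream.Stream;

class PairStreamDemo {
    public static void main(String[] args) {
        final Map<String, List<Long>> map = new LinkedHashMap<>();
        map.put("a", Arrays.asList(1L, 2L));
        map.put("b", Arrays.asList(3L));

        PairStream.of(map)
            .mapKey((k, v) -> k + " (" + v.size() + ")")
            .flatMapValue((k, v) -> v.stream())
            .map((k, v) -> k + " >> " + v)
            .forEach(System.out::println);
    }
}

/** Hypothetical wrapper around a stream of entries (illustration only). */
final class PairStream<K, V> {
    private final Stream<Map.Entry<K, V>> inner;

    private PairStream(Stream<Map.Entry<K, V>> inner) { this.inner = inner; }

    static <K, V> PairStream<K, V> of(Map<K, V> map) {
        return new PairStream<>(map.entrySet().stream());
    }

    /** Replaces the key of every pair, keeping the value. */
    <R> PairStream<R, V> mapKey(BiFunction<? super K, ? super V, ? extends R> mapper) {
        final Stream<Map.Entry<R, V>> mapped = inner.map(e -> {
            final R key = mapper.apply(e.getKey(), e.getValue());
            return new AbstractMap.SimpleImmutableEntry<R, V>(key, e.getValue());
        });
        return new PairStream<>(mapped);
    }

    /** Replaces the value of every pair, keeping the key. */
    <R> PairStream<K, R> mapValue(BiFunction<? super K, ? super V, ? extends R> mapper) {
        final Stream<Map.Entry<K, R>> mapped = inner.map(e -> {
            final R value = mapper.apply(e.getKey(), e.getValue());
            return new AbstractMap.SimpleImmutableEntry<K, R>(e.getKey(), value);
        });
        return new PairStream<>(mapped);
    }

    /** Expands every value into zero or more values, repeating the key for each. */
    <R> PairStream<K, R> flatMapValue(
            BiFunction<? super K, ? super V, ? extends Stream<? extends R>> mapper) {
        final Stream<Map.Entry<K, R>> mapped = inner.flatMap(e -> {
            final Stream<? extends R> values = mapper.apply(e.getKey(), e.getValue());
            return values.map(r -> new AbstractMap.SimpleImmutableEntry<K, R>(e.getKey(), r));
        });
        return new PairStream<>(mapped);
    }

    /** Leaves the world of pairs and returns an ordinary Stream. */
    <R> Stream<R> map(BiFunction<? super K, ? super V, ? extends R> mapper) {
        return inner.map(e -> mapper.apply(e.getKey(), e.getValue()));
    }
}
```

With the two entries in the demo this prints "a (2) >> 1", "a (2) >> 2" and "b (1) >> 3". A wrapper like this cannot be passed where a Stream is expected, which is the reason the real MapStream extends Stream and overrides its methods with narrower return types.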

The full implementation of MapStream can be found here. If you are interested in more cool stuff, have a look at the Speedment GitHub page. Have fun streaming!

Friday, December 4, 2015

Parsing Java 8 Streams Into SQL

When Java 8 was released and people began streaming over all kinds of stuff, it didn’t take long before they started imagining how great it would be if you could work with your databases in the same way. Essentially relational databases are made up of huge chunks of data organized in table-like structures. These structures are ideal for filtering and mapping operations, as can be seen in the SELECT, WHERE and AS statements of the SQL language. What people did at first (me included) was to ask the database for a large set of data and then process that data using the new cool Java 8-streams.

The problem that quickly arose was that the latency alone of moving all the rows from the database to the memory took too much time. The result was that there was not much gain left from working with the data in-memory. Even if you could do really freaking advanced stuff with the new Java 8-tools, the greatness didn’t really apply to database applications because of the performance overhead.

When I began committing to the Speedment Open Source project, we soon realised the potential in using databases the Java 8-way, but we really needed a smart way of handling this performance issue. In this article I will show you how we solved this using a custom delegator for the Stream API to manipulate a stream in the background, optimizing the resulting SQL queries.

Imagine you have a table User in a database on a remote host and you want to print out the names of all users older than 70 years. The Java 8 way of doing this with Speedment would be:

final UserManager users = speedment.managerOf(User.class);
users.stream()
    .filter(User.AGE.greaterThan(70))
    .map(User.NAME.get())
    .forEach(System.out::println);

Seeing this code might give you shivers at first. Will my program download the entire table from the database and filter it in the client? What if I have 100 000 000 users? The network latency would be enough to kill the application! Well, actually no. As I said previously, Speedment analyzes the stream before termination.

Let’s look at what happens behind the scenes. The .stream() method in UserManager returns a custom implementation of the Stream interface that contains all metadata about the stream until the stream is closed. That metadata can be used by the terminating action to optimize the stream. When .forEach is called, this is what the pipeline will look like:

[Figure: The Java 8 pipeline]

The terminating action (in this case .forEach) will then begin to traverse the pipeline backwards to see if it can be optimized. First it comes across a map from a User to a String. Speedment recognises this as a Getter function since the User.NAME field was used to generate it. A Getter can be parsed into SQL, so the terminating action is switched into a Read operation for the NAME column and the map action is removed.

[Figure: The first intermediate operation is removed]

Next up is the .filter action. The filter is also recognised as a custom operation, in this case a predicate. Since it is a custom implementation, it can contain all the metadata required to use it in a SQL query, so it can safely be removed from the stream and appended to the Read operation.

[Figure: The source of the Java 8 pipeline is reached]

When the terminating action now looks up the pipeline, it will find the source of the stream. When the source is reached, the Read operation will be parsed into SQL and submitted to the SQL manager. The resulting Stream<String> will then be terminated using the original .forEach consumer. The generated SQL for the exact code displayed above is:

SELECT `name` FROM `User` WHERE `User`.`age` > 70;

No changes or special operations need to be used in the Java code!
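
The backwards traversal described above can be imitated with a toy planner. The sketch below is not Speedment's implementation: GreaterThan, ColumnGetter, Plan and SqlPlanner are hypothetical names, rows are modelled as plain maps, and the planner simply gives up and falls back to SELECT * as soon as it meets an operation it does not recognise.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

class PipelineToSqlDemo {
    public static void main(String[] args) {
        // The recorded intermediate operations, in the order they were chained.
        final List<Object> pipeline = Arrays.<Object>asList(
            new GreaterThan("age", 70),    // plays the role of User.AGE.greaterThan(70)
            new ColumnGetter("name"));     // plays the role of User.NAME.get()
        System.out.println(SqlPlanner.plan("User", pipeline).sql);
    }
}

/** An ordinary predicate that also exposes the metadata needed for a WHERE clause. */
final class GreaterThan implements Predicate<Map<String, Object>> {
    final String column;
    final int operand;
    GreaterThan(String column, int operand) { this.column = column; this.operand = operand; }

    @Override public boolean test(Map<String, Object> row) {
        return ((Integer) row.get(column)) > operand;
    }

    String toSql(String table) {
        return "`" + table + "`.`" + column + "` > " + operand;
    }
}

/** An ordinary mapping function that remembers which column it reads. */
final class ColumnGetter implements Function<Map<String, Object>, Object> {
    final String column;
    ColumnGetter(String column) { this.column = column; }
    @Override public Object apply(Map<String, Object> row) { return row.get(column); }
}

/** The query to send, plus the operations that must still run in memory. */
final class Plan {
    final String sql;
    final List<Object> remaining;
    Plan(String sql, List<Object> remaining) { this.sql = sql; this.remaining = remaining; }
}

final class SqlPlanner {
    static Plan plan(String table, List<Object> pipeline) {
        String select = "*";
        final List<String> where = new ArrayList<>();

        // Walk backwards from the terminating action towards the source.
        for (int i = pipeline.size() - 1; i >= 0; i--) {
            final Object op = pipeline.get(i);
            if (op instanceof ColumnGetter && i == pipeline.size() - 1) {
                select = "`" + ((ColumnGetter) op).column + "`";
            } else if (op instanceof GreaterThan) {
                where.add(0, ((GreaterThan) op).toSql(table));
            } else {
                // Unknown operation: fetch everything and run the whole pipeline in memory.
                return new Plan("SELECT * FROM `" + table + "`;", pipeline);
            }
        }

        final StringBuilder sql = new StringBuilder("SELECT ")
            .append(select).append(" FROM `").append(table).append('`');
        if (!where.isEmpty()) {
            sql.append(" WHERE ").append(String.join(" AND ", where));
        }
        return new Plan(sql.append(';').toString(), Collections.emptyList());
    }
}
```

For the pipeline in the demo this prints the same query as shown above: SELECT `name` FROM `User` WHERE `User`.`age` > 70;. The all-or-nothing fallback keeps the sketch correct at the cost of missing partial optimizations; a real implementation would have to reason about which operations can safely be reordered.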

This was a simple example of how streams can be simplified before execution by using a custom implementation, as done in Speedment. You are welcome to look at the source code and find even better ways to utilize this technology. It really helped us improve the performance of our system and could probably work for any distributed Java 8 scenario.


Until next time!