Purifying the API Surface Area
This is my first "Vavr One Log" post in 2019. I would like to take the chance to outline what we've achieved so far and show the direction of our journey.
If you're wondering why I chose the cover image ...
Simplification
The 1st Log "The Essence of Vavr One" was written roughly nine months ago, after brainstorming for a while and writing GitHub issues. I started to distill the essence of Vavr by filtering for the technical features that went well IMO. My goal was to simplify Vavr.
Vavr 0.9 is one big module. I analyzed the dependencies between packages and classes. It turned out that some features live more or less on their own, like Functions and Tuples. Tuple0 and Tuple1 are academic, Tuple2 acts as a Map entry and Tuple3 is returned by unzip3(). Besides these, internal DSLs rely on functions and tuples.
In the 2nd Log "Simplifying Vavr" I questioned Vavr language enhancements like pattern matching and for-comprehensions. They can't be as efficient as native language features. Internal DSLs have the overhead of new instance creation. Such language enhancements might be dead ends when Java evolves.
It wasn't my primary goal at that time, but while thinking and writing about my strategy for what a future Vavr could look like, I collected a lot of valuable feedback from different channels. Thank you!
Modularization
I was mad about modularizing Vavr. I wanted to slice features and ship them as separate modules. So starting with a first module containing only Either, Option and Try sounded like a plan. I wrote about it in the 3rd Log "A Safe Try" and shipped that module as the first alpha version of Vavr 1.0.
That first release showed me that it was neither a good idea to bet on Java 11 only, nor did it pay off to bet on the Java Platform Module System aka Jigsaw.
Option, Try and Either were nearly finished. But I had to remove cross-module dependencies because of modularization, like sequencing objects using collections. We increased maintainability at the cost of losing features.
I needed some time to realize that I don't want to remove core Vavr features my users rely on and care about. As a first step, the second alpha went one step back, removed Jigsaw modules and restored Java 8 compatibility.
Foundation
All the effort described above laid the foundation for the actual development. Along the way, I addressed several non-functional requirements, like
- Moving from Maven to Gradle
- Releasing to Bintray (we will move back to Sonatype)
- Porting tests from JUnit 4 to JUnit 5
- Improving CI builds by running tests on JDK 8, 9, 10, 11 & 12-ea
The next big task will be the enhancement of our code generator (I wrote it in Scala). At the moment we generate whole files but we need more fine-grained control.
I want to elaborate on code generation a bit, because it is an interesting topic. It is straightforward to expand a simple template, say Tuple, by generating N classes. However, when looking at the Vavr 0.9 code base, we see artifacts on the class level that are solely needed to remove code duplication.
Such utility classes are still present at runtime. Because they are discoverable via reflection, they are effectively part of the Vavr API. That is really bad. So my goal is to create a code generator that injects generated code into generated regions. Maybe we will also need protected regions that remain untouched during code generation.
The code generator is an essential tool because it will allow us to move code from utility classes and interface default methods to generator templates. The generated code will be more efficient because we remove unnecessary method calls and parameterized behavior (read: lambda parameters) of utility functions.
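To illustrate the difference, here is a minimal sketch. All names (`GeneratedCodeSketch`, `FoldUtil`, `sumViaUtility`, `sumGenerated`) are hypothetical and not taken from the Vavr code base. The first variant shares logic through a utility class with a lambda parameter, the second shows what a generator template could expand to instead.

```java
import java.util.function.IntBinaryOperator;

public class GeneratedCodeSketch {

    // Hand-written style: a helper class that exists only to avoid duplication.
    // It is still present at runtime and discoverable via reflection.
    static final class FoldUtil {
        private FoldUtil() {}

        static int fold(int[] values, int zero, IntBinaryOperator op) {
            int acc = zero;
            for (int value : values) {
                acc = op.applyAsInt(acc, value);
            }
            return acc;
        }
    }

    // Variant 1: the behavior is passed to the helper as a lambda parameter.
    public static int sumViaUtility(int[] values) {
        return FoldUtil.fold(values, 0, (a, b) -> a + b);
    }

    // Variant 2: what a generator template could expand to (an assumption).
    // The loop is written out in place, so neither a helper class nor a
    // lambda is involved.
    public static int sumGenerated(int[] values) {
        int acc = 0;
        for (int value : values) {
            acc = acc + value;
        }
        return acc;
    }

    public static void main(String[] args) {
        int[] values = {1, 2, 3, 4};
        System.out.println(sumViaUtility(values));
        System.out.println(sumGenerated(values));
    }
}
```

Both variants compute the same result. The second one duplicates the loop wherever it is needed, which is acceptable only because a generator, not a human, maintains the copies.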
Safety
I promised to make Vavr 1.0 absolutely safe. Examples of unsafe API are:
import static io.vavr.API.println;
var t1 = Try.of(some::call);
println(t1.get()); // 💥 NonFatalException
var t2 = Try.of(some::call);
println(t2.getCause().getMessage()); // 💥 UnsupportedOperationException
var o = Option.of(someValue);
println(o.get()); // 💥 NoSuchElementException
One could argue that these cases could be circumvented:
var t1 = Try.of(some::call);
if (t1.isSuccess()) {
    println(t1.get());
}
var t2 = Try.of(some::call);
if (t2.isFailure()) {
    println(t2.getCause().getMessage());
}
var o = Option.of(someValue);
if (o.isDefined()) {
    println(o.get());
}
But the main pain point is that the compiler can't statically check that the API is used in a safe way. Therefore, I deprecated the unsafe methods. I do not intend to remove them; they are only marked.
So you have two options.
- Use the unsafe API and maybe suppress warnings (discouraged)
- Write safe applications by using the functional replacements for their imperative counterparts (encouraged)
var t1 = Try.of(some::call);
t1.forEach(System.out::println);
var t2 = Try.of(some::call);
t2.onFailure(err -> println(err.getMessage()));
var o = Option.of(someValue);
o.forEach(System.out::println);
These are only a few alternatives. Others, like fold and getOrElse, help you to think in values.
// imperative style 🤭
Try<Integer> t = Try.of(some::call);
String result;
if (t.isSuccess()) {
    result = String.valueOf(t.get());
} else {
    result = t.getCause().getMessage();
}
// functional style 🥰
final String result =
    Try.of(some::call).fold(Throwable::getMessage, String::valueOf);
As I said, starting with Vavr 1.0, such imperative code will produce deprecation warnings. However, side-effecting API methods like Try.run, forEach and onFailure are perfectly okay because they are safe.
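To make the idea concrete, here is a minimal, self-contained sketch of a Try-like type. It is not Vavr's implementation: `MiniTry` and the `describe` helper are hypothetical names, and the exception thrown by the unsafe `get()` is chosen for illustration only. The sketch shows an unsafe getter marked with `@Deprecated` next to the safe alternatives `fold` and `getOrElse`.

```java
import java.util.concurrent.Callable;
import java.util.function.Function;

public class MiniTrySketch {

    // Hypothetical, heavily simplified Try-like type (not Vavr's implementation).
    public static final class MiniTry<T> {
        private final T value;          // set on success
        private final Throwable cause;  // set on failure

        private MiniTry(T value, Throwable cause) {
            this.value = value;
            this.cause = cause;
        }

        public static <T> MiniTry<T> of(Callable<T> callable) {
            try {
                return new MiniTry<>(callable.call(), null);
            } catch (Exception e) {
                return new MiniTry<>(null, e);
            }
        }

        // Unsafe: may throw. The annotation makes the compiler warn at call sites.
        @Deprecated
        public T get() {
            if (cause != null) {
                throw new IllegalStateException(cause);
            }
            return value;
        }

        // Safe: both cases must be handled, the result is always a value.
        public <R> R fold(Function<Throwable, R> ifFailure, Function<T, R> ifSuccess) {
            return cause != null ? ifFailure.apply(cause) : ifSuccess.apply(value);
        }

        // Safe: a fallback replaces the missing value.
        public T getOrElse(T other) {
            return cause != null ? other : value;
        }
    }

    // Turns the outcome of a computation into a String without any branching
    // at the call site.
    public static String describe(Callable<Integer> call) {
        return MiniTry.of(call).fold(Throwable::getMessage, String::valueOf);
    }

    public static void main(String[] args) {
        System.out.println(describe(() -> 42));
        System.out.println(describe(() -> Integer.parseInt("x")));
        System.out.println(MiniTry.of(() -> Integer.parseInt("x")).getOrElse(0));
    }
}
```

Callers of `fold` and `getOrElse` cannot forget the failure case, whereas a call to `get()` compiles with a deprecation warning and may still throw at runtime.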
Performance
Are you worried about the performance overhead of using functional alternatives instead of imperative getters? Then you trade safety for premature optimization.
The bottleneck of an application is often IO, not necessarily instance creation. Two years ago I led a migration project for a bank. We implemented new pricers for end-of-day reports and scenario analysis. The calculation covered a big part of the bank's trade portfolio. Even on a calculation grid of 100 nodes, the time window was tight for us.
We profiled our pricers and optimized them step-by-step. The bottlenecks were located within the algorithms and the underlying data structures, not in the glue code. We used primitive arrays in order to replace collections. We achieved the required speed by allocating enough memory for floating-point matrices and calculating values in-place (at well-defined points).
One of our results was a standalone calculation core, written in Java. Because we wrote it in a functional way, it was easy to understand and easy to test. Regarding speed, we exceeded expectations.
Purification
The title "Purifying the API Surface Area" means that I fight the demons of the past. A healthy, approachable and user-friendly API has a small surface area. Fewer methods and types are easier to remember and easier to maintain.
Currently I consolidate Vavr's ever-growing, redundant API (my personal Hydra demon) and compare types and methods with corresponding types of Java and Scala. I question every method signature. This is a lot of work because the API is the integral part of our library.
One goal is to stay backward compatible if possible, but there will be breaking changes. I set aside cosmetic actions like moving types and renaming packages. This is also the reason for not renaming the collections, which was controversially discussed on channels like GitHub and Gitter.
The controls in alpha-2 can be considered clean. My next target is the collection package plus some side-stories within the core package.
The image below shows all public Vavr 0.9 types we find in the API docs. Some interesting questionable types are tagged with 🤔.
There are several API design flaws in Vavr 0.9. For example I exposed static inner classes that have no purpose. Some interfaces should have been sealed abstract classes (read: abstract classes with private constructor), like Either, Option, Try, List and Stream.
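The "abstract class with private constructor" idiom can be sketched as follows. `Maybe`, `Some` and `None` are hypothetical names for illustration, not the Vavr 1.0 types. Because the constructor is private, only code inside the same top-level class can create subclasses, so the set of cases is closed.

```java
public class SealedSketch {

    // Hypothetical Option-like type with a closed set of subclasses.
    public abstract static class Maybe<T> {

        // Private constructor: only code nested in the same top-level class
        // can call it, so no subclass can be declared anywhere else.
        private Maybe() {}

        public abstract boolean isDefined();

        public abstract T getOrElse(T other);

        public static <T> Maybe<T> some(T value) {
            return new Some<>(value);
        }

        public static <T> Maybe<T> none() {
            return new None<>();
        }

        // The implementations are private, they are not part of the API.
        private static final class Some<T> extends Maybe<T> {
            private final T value;

            private Some(T value) {
                this.value = value;
            }

            @Override public boolean isDefined() { return true; }

            @Override public T getOrElse(T other) { return value; }
        }

        private static final class None<T> extends Maybe<T> {
            private None() {}

            @Override public boolean isDefined() { return false; }

            @Override public T getOrElse(T other) { return other; }
        }
    }

    public static void main(String[] args) {
        System.out.println(Maybe.some("a").getOrElse("fallback"));
        System.out.println(Maybe.<String>none().getOrElse("fallback"));
    }
}
```

With an interface, any user could add a third implementation. With this idiom, the library author keeps control over all cases while the public surface stays limited to `Maybe` and its factory methods.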
The work on modularization was not in vain. I got deep insights into the dependencies between features of the Vavr library. I was amazed when I discovered that I'm able to completely decouple Tuples and (Checked)Functions from most other Vavr features. None of the APIs that rely on them paid off, namely pattern matching, for-comprehensions, validation and try with resources.
API.Match (aka pattern matching), for example, is nice in some ways because it solves existing problems, like object deconstruction, handling of matched types without casting and checking complex conditionals. However, it is quite memory-hungry and slow compared to the usual code.
API.For (aka for-comprehensions) is nothing more than syntactic sugar for a chain of flatMap calls with a terminating map call. I discourage the use of API.For because it creates unnecessary instances, hence it is also quite memory-consuming.
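The desugared form is easy to write by hand. The following sketch uses `java.util.Optional` instead of a Vavr type so that it runs without any dependency; the class and method names are hypothetical.

```java
import java.util.Optional;

public class FlatMapChainSketch {

    // Combines three optional values. Conceptually this is what a
    // for-comprehension over three sources expands to: flatMap for every
    // source except the last one, which uses map.
    public static Optional<Integer> sum(Optional<Integer> a,
                                        Optional<Integer> b,
                                        Optional<Integer> c) {
        return a.flatMap(x ->
               b.flatMap(y ->
               c.map(z -> x + y + z)));
    }

    public static void main(String[] args) {
        System.out.println(sum(Optional.of(1), Optional.of(2), Optional.of(3)));
        System.out.println(sum(Optional.of(1), Optional.empty(), Optional.of(3)));
    }
}
```

If any source is empty, the chain short-circuits and the result is empty. No intermediate DSL objects are needed.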
Try.WithResources went completely in the wrong direction. It misapplies the Try monad by treating it as an equal substitute for Java's try / catch, which is wrong. The same applies to methods like andFinally.
The Validation applicative (read: collector of validation errors) is really nice. But it does not scale well. Input forms typically have many fields. We are not able to compose many Validation instances because Vavr restricts itself to functions and tuples of arity 8. We could increase the arity, however, it is a multiplier. Because of our code generator, the number of static nested classes would explode.
I also tagged some esoteric collections in the overview above. Foldable leaked into the public API; it was only needed to structure a specific class of methods. Also, it is questionable if the Ordered interface really needs to be exposed to the outside. With the removal of the Value interface, Ordered should be obsolete.
If in doubt, leave it out. What remains is a tidy Vavr 1.0, as shown in the image below.
Besides the changes I already mentioned, some functional interfaces went to places they belong. On the one hand this has to do with decoupling from (Checked)Function0..8. On the other hand functional interfaces like CharFunction or CharUnaryOperator will be obsolete when Java gets primitive generics (see JEP 218). Both could be expressed by Function<char, any T>.
There are still questions. For example I might re-introduce the Promise for concurrent write operations in order to ease the implementation of special use-cases.
Type Hierarchy
Currently I'm examining the collection API. A first change has already been made: I fixed the hierarchy of iterable types.
Value was meant to be an abstraction for monadic types but it turned out that it wasn't possible to find a common interface for filter. Also flatMap can't be a common method because of the lack of higher-kinded types in Java. Methods that were meant for single-valued types only leaked into the collections.
In the end, Value was a type that hosted mainly conversion methods. Currently I'm experimenting with generic conversion methods that scale better.
public interface Traversable<T> extends io.vavr.Iterable<T> {
    // generic conversion method
    default <C> C to(Function<Iterable<T>, C> fromIterable) {
        return fromIterable.apply(this);
    }
}
The following two conversions are equivalent, but I think that the fluent version is more readable.
// fluent type conversion
var set1 = List("a", "bb", "c")
    .map(String::length)
    .to(HashSet::ofAll);
// nested type conversion
var set2 = HashSet.ofAll(
    List("a", "bb", "c")
        .map(String::length)
);
This kind of conversion feels good so far. It is more user-friendly than the general list.collect(HashSet.collector()).
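The mechanics of such a conversion method can be tried out without Vavr. In the sketch below, `Convertible`, `wrap` and `sortedSetOfAll` are hypothetical stand-ins built on plain JDK collections; only the shape of `to` follows the experiment shown above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

public class ConversionSketch {

    // Hypothetical stand-in for Traversable: an Iterable with a fluent `to`.
    public interface Convertible<T> extends Iterable<T> {
        default <C> C to(Function<Iterable<T>, C> fromIterable) {
            return fromIterable.apply(this);
        }
    }

    // Wraps a java.util.List so that it can be converted fluently.
    public static <T> Convertible<T> wrap(List<T> list) {
        return list::iterator;
    }

    // A conversion target: copies any Iterable into a sorted set.
    public static <T extends Comparable<T>> Set<T> sortedSetOfAll(Iterable<T> elements) {
        Set<T> result = new TreeSet<>();
        for (T element : elements) {
            result.add(element);
        }
        return result;
    }

    // Fluent conversion: reads from left to right.
    public static Set<Integer> fluent() {
        return wrap(Arrays.asList(3, 1, 3, 2)).to(ConversionSketch::sortedSetOfAll);
    }

    // Nested conversion: same result, but reads inside out.
    public static Set<Integer> nested() {
        return sortedSetOfAll(wrap(Arrays.asList(3, 1, 3, 2)));
    }

    public static void main(String[] args) {
        System.out.println(fluent());
        System.out.println(nested());
    }
}
```

A single `to` method scales because every new target type only needs a function from `Iterable` to itself; the source type does not have to know about it.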
Collections
Our collections grew over time. Currently I'm unifying the API of the most basic operations, map and flatMap. Here are some interesting cases (I removed the generic bounds for better readability).
interface Traversable<T> extends io.vavr.Iterable<T> {
    <U> Traversable<U> flatMap(Function<T, Iterable<U>> mapper);
    <U> Traversable<U> map(Function<T, U> mapper);
}
interface Seq<T> extends Traversable<T> {
    @Override <U> Seq<U> flatMap(Function<T, Iterable<U>> mapper);
    @Override <U> Seq<U> map(Function<T, U> mapper);
}
interface Set<T> extends Traversable<T> {
    @Override <U> Set<U> flatMap(Function<T, Iterable<U>> mapper);
    @Override <U> Set<U> map(Function<T, U> mapper);
}
interface SortedSet<T> extends Set<T> {
    @Override <U> Set<U> flatMap(Function<T, Iterable<U>> mapper);
    @Override <U> Set<U> map(Function<T, U> mapper);
    <U> SortedSet<U> flatMap(Comparator<U> comparator, Function<T, Iterable<U>> mapper);
    <U> SortedSet<U> map(Comparator<U> comparator, Function<T, U> mapper);
}
interface Map<K, V> extends Traversable<Tuple2<K, V>> {
    @Override <U> Traversable<U> flatMap(Function<Tuple2<K, V>, Iterable<U>> mapper);
    @Override <U> Traversable<U> map(Function<Tuple2<K, V>, U> mapper);
    <K2, V2> Map<K2, V2> flatMap(BiFunction<K, V, Iterable<Tuple2<K2, V2>>> mapper);
    <K2, V2> Map<K2, V2> map(BiFunction<K, V, Tuple2<K2, V2>> mapper);
}
interface SortedMap<K, V> extends Map<K, V> {
    @Override <U> Traversable<U> flatMap(Function<Tuple2<K, V>, Iterable<U>> mapper);
    @Override <U> Traversable<U> map(Function<Tuple2<K, V>, U> mapper);
    @Override <K2, V2> Map<K2, V2> flatMap(BiFunction<K, V, Iterable<Tuple2<K2, V2>>> mapper);
    @Override <K2, V2> Map<K2, V2> map(BiFunction<K, V, Tuple2<K2, V2>> mapper);
    <K2, V2> SortedMap<K2, V2> flatMap(Comparator<K2> keyComparator, BiFunction<K, V, Iterable<Tuple2<K2, V2>>> mapper);
    <K2, V2> SortedMap<K2, V2> map(Comparator<K2> keyComparator, BiFunction<K, V, Tuple2<K2, V2>> mapper);
}
It might look a bit scary at first sight. But that's the crucial part that has to be done when creating the API for the collections. The above should already reflect the final state (modulo generic bounds).
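Why a sorted collection needs two map variants can be shown with plain JDK collections. The sketch below is an illustration under assumptions: `SortedMapSketch` and its static helpers are hypothetical, and `java.util.SortedSet` stands in for Vavr's SortedSet. Without a comparator, nothing is known about the order of the mapped elements, so only a plain Set can be promised. With a comparator, the result can be sorted again.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.function.Function;

public class SortedMapSketch {

    // Mapping without a comparator: the element type U may not be orderable,
    // so the most specific result type is a plain Set.
    public static <T, U> Set<U> map(SortedSet<T> source, Function<T, U> mapper) {
        Set<U> result = new LinkedHashSet<>();
        for (T element : source) {
            result.add(mapper.apply(element));
        }
        return result;
    }

    // Mapping with a comparator: the result is a sorted set again.
    public static <T, U> SortedSet<U> map(SortedSet<T> source,
                                          Comparator<U> comparator,
                                          Function<T, U> mapper) {
        SortedSet<U> result = new TreeSet<>(comparator);
        for (T element : source) {
            result.add(mapper.apply(element));
        }
        return result;
    }

    private static SortedSet<String> words() {
        return new TreeSet<>(Arrays.asList("bb", "a", "ccc"));
    }

    // Plain Set of lengths, in the iteration order of the sorted source.
    public static Set<Integer> lengths() {
        return map(words(), String::length);
    }

    // Sorted set of lengths, ordered by the given comparator.
    public static SortedSet<Integer> lengthsDescending() {
        return map(words(), Comparator.<Integer>reverseOrder(), String::length);
    }

    public static void main(String[] args) {
        System.out.println(lengths());
        System.out.println(lengthsDescending());
    }
}
```

The same reasoning applies to the key comparator of a sorted map: the overload that keeps the sorted type has to ask for the ordering of the new keys.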
It really amazed me that Scala seems to change their collections in a similar way for the upcoming version 2.13. So it isn't Java-specific that some map and flatMap overrides do not return the self-type.
Some unsafe collection methods will pass away, like avg, sum and product. The next step will be to distill the essence of Vavr's collection methods.
Vavr (formerly Javaslang) is and will stay a free library. If you want to support me, please donate my next ☕️🍻🥓.
Thank you & have fun!
- Daniel