Java Collection Views

Today we finished the first of three Java collection views: List view. And we put quite some effort into testing the view well against all of our sequential persistent collections.

We are currently preparing our first Vavr release (successor of Javaslang) that will contain the new List view. A separate blog post will follow...

Update: We renamed the collection decorator methods and updated the post accordingly.

Why views, aren't there already conversion methods!?

There is already a blog post about the upcoming Java interoperability improvements. In Vavr, Value is at the top of the type-hierarchy, similar to Object in Java. Value provides several conversion methods, between Vavr types and to Java types.

// = Some(1)
List(1, 2, 3).toOption();

// = Optional[1]
Some(1).toJavaOptional();

// = List()
Try(() -> { throw new Error(); }).toList();

// = []
Try(() -> { throw new Error(); }).toJavaList()

// and many more

This is already nice, but... especially when converting a Vavr collection to a Java List, we can do better.

Conversion to Java collections using the toJava*() methods requires to iterate the persistent collection and build a new Java collection by successively adding all elements. For sequential collections, this takes place in linear time and space.

The new List view takes a different approach, it fully implements java.util.List and delegates method calls to the underlying persistent Vavr collection. Such a view is created in constant time.

The List view has the same performance characteristics as the underlying persistent Vavr collection. However, a List view needs to be eagerly evaluated - we can't take advantage of Vavr's lazily evaluated collections here.

Different types of List views

A List view can be created for all of our sequential collections, namely those which implement Seq<T>.

  • List - a linear linked list (fast prepend, good memory footprint)
  • Stream - a lazy linked list (infinite lists)
  • Queue - a fifo queue
  • Array - a random-access array (best for fast reads)
  • CharSeq - an indexed character sequence (String on steroids)
  • Vector - an indexed sequence (fast for reads and writes)

Seq<T> provides two basic methods that return a java.util.List view:

  • asJava() - an immutable view, i.e. add, remove, etc. will throw
  • asJavaMutable() - a mutable view, i.e. add, remove, etc. work as expected
CharSeq hi = CharSeq("Hi");
java.util.List<Character> list;

// ['H', 'i']
list = hi.asJavaMutable();

// ['H', 'i', '!']
list.add('!');

// ['H', 'i']
list = hi.asJava();

// 💩 throws UnsupportedOperationException
list.add('!');

The poor immutable java.util.List view works as expected :)

Additionally we decided to create something ✨unique✨, a scoped list view, where mutations result in a new, modified persistent collection instance.

  • asJava(Consumer<j.u.List>) - returns the unmodified Vavr collection
  • asJavaMutable(Consumer<j.u.List>) - returns a possibly modified Vavr collection
// = CharSeq(Hi!)
final CharSeq greet = CharSeq("Hi").asJavaMutable(list -> {
    list.add('!');
}).asJava(msg -> {
    Try.run(() -> Twitter.post(msg))
       .onFailure(LOG::error);
});

The main idea is borrowed from the fact that using mutable variables is okay in a local scope, like a method body.

A scoped List view makes it possible to keep the code clean by moving the dirty, side-effecting parts into a minimal scope.

Secure Coding Guidelines: Mutability

We all know Oracle's Secure Coding Guidelines for Java SE: Mutability.

The Vavr List view helps us to comply with these rules while increasing the performance of our code. These are the main reasons:

Guideline 6-1: Prefer immutability for value types

Persistent collections are immutable by definition. An immutable List view is conform to the Java Collection API. However, Java's immutable List still has (and will ever have) the drawback of throwing when mutator methods are called.

Guideline 6-2: Create copies of mutable output values

Here we have a clear advantage over Java's java.util.List implementations. We do not need to make a defensive copy of the original collection. We can completely forget the clone() method.

Instead a mutable List view is created in O(1) without consuming additional memory for the List elements. The mutator methods, like add() and set(), operate on the persistent collection and create new instances under the hood. Efficient memory-sharing algorithms ensure that only a bare minimum of the data structure is rewritten.

Guideline 6-12: Do not expose modifiable collections

This one is interesting. Of course we can expose a modifiable List view, because modifications will create new persistent instances under the hood (without copying!). By definition, the state of the original persistent instance cannot be altered.

There are more benefits but these catched my eyes when reading the guidelines.

Some implementation details

We managed to create one simple List view implementation that works for all of our persistent collections. The List view unit tests run for both, our persistent collections and for java.util.ArrayList, in order to ensure that the results are the same.

Complying with the java.util.List API spec wasn't trivial. For example when removing an element from a lazily evaluated Stream, we get a new Stream in O(1) that eventually removes that element.

// = Stream(a, ?)
Stream<Character> alphabet = Stream.iterate('a', c -> (char) (c + 1)).take(7);

// = Stream(a, ?)
alphabet.remove('c');

A java.util.List view requires us to return true or false, depending on whether the element was removed or not.

// = [a, b, c, d, e, f, g]
java.util.List<Integer> list = alphabet.asJavaMutable();

// = true
list.remove((Character) 'c');

// = false
list.remove((Character) 'c');

As you see, the List view is mutable, the underlying Vavr collection is persistent/immutable.

// = [a, b, d, e, f, g]
println(list);

// = Stream(a, b, c, d, e, f, g)
println(alphabet);

Conclusion

Views are great to interoperate with standard Java APIs.

The new List view is faster than creating defensive copies of Java collections. It helps us to implement clean Java code along the Secure Coding Guidelines. The new scoped views separate side-effecting code while reflecting back the changes to the persistent collection.

Our new List view is production ready. It acts exactly like Java's ArrayList. The performance characteristics of a List view are generally equal to those of the underlying persistent collection.

Have a great May Day!

- Daniel