Implementing Json Decoders using Vavr
I'm really excited to feature our first guest post - thank you, Frederico!
Hello! I'm Frederico Honório, a software engineer from Portugal, and I'd like to show you a way to work with json values in Java, and how a combinator library can help with that. We'll also use Vavr, obviously. :)
What are decoders?
A decoder is a function that takes a json value and computes a value from it. This is useful because json has very little structure, which makes it cumbersome to deal with directly. In Java, this is often solved by libraries like Jackson or Gson, which can map json values to Java objects. But these approaches work through reflection, inspecting the shape of the classes we want at runtime.
Decoders are implemented in many functional programming languages, although I first discovered them while looking into the Elm language. Their main advantage is that they are declarative and composable. Hopefully I'll make this clear by the end of the article.
A small detour: modeling Json data
Being functions, decoders have to accept values, right? We'll start by
modeling the types of json values with a closed class hierarchy:
import io.vavr.collection.Map;
import io.vavr.control.Option;

public abstract class JValue {

    private JValue() {} // private so it can't be extended outside of this module

    public static class JString extends JValue {
        public final String value;

        public JString(String value) { this.value = value; }
        // omitted toString, hashCode, equals, but they are necessary
    }

    public static class JObject extends JValue {
        public final Map<String, JValue> entries;

        public JObject(Map<String, JValue> entries) { this.entries = entries; }
        // omitted toString, hashCode, equals, but they are necessary

        public Option<JValue> get(String key) {
            return entries.get(key); // this is a vavr map, so it returns Option<JValue>
        }
    }

    // you could also use an abstract method with multiple implementations
    public Option<JString> asJString() {
        return Option.when(this instanceof JString, () -> (JString) this);
    }

    public Option<JObject> asJObject() {
        return Option.when(this instanceof JObject, () -> (JObject) this);
    }
}
In order to keep it short we'll just implement the object and string json types, but the remaining variants should be straightforward. We're just defining wrappers for all possible json types.
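For illustration, here is roughly what one of the omitted variants could look like; the JNumber name and the BigDecimal representation are my own choices, not something fixed by the approach. These members would live inside JValue, next to JString and JObject:

public static class JNumber extends JValue {
    public final java.math.BigDecimal value; // an assumption: one numeric type for all json numbers

    public JNumber(java.math.BigDecimal value) { this.value = value; }
    // omitted toString, hashCode, equals, but they are necessary
}

public Option<JNumber> asJNumber() {
    return Option.when(this instanceof JNumber, () -> (JNumber) this);
}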
From now on we'll refer to a magical parse function that can parse a string into a JValue. It's often possible to write such a parser on top of an existing json library so that the resulting value is a structure like ours; you can find an example for Jackson here.
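As a rough idea of what that might look like, here is a sketch built on top of Jackson; the Json class name and the toJValue helper are hypothetical, and only the variants defined above (plus the JNumber sketch) are handled:

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import io.vavr.collection.HashMap;
import io.vavr.collection.Map;
import io.vavr.control.Try;

public class Json {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    // parse returns a Try so that malformed input becomes a Failure instead of an exception
    public static Try<JValue> parse(String raw) {
        return Try.of(() -> MAPPER.readTree(raw)).map(Json::toJValue);
    }

    private static JValue toJValue(JsonNode node) {
        if (node.isTextual()) {
            return new JValue.JString(node.asText());
        }
        if (node.isNumber()) {
            return new JValue.JNumber(node.decimalValue()); // assumes the JNumber sketch above
        }
        if (node.isObject()) {
            Map<String, JValue> entries = HashMap.empty();
            java.util.Iterator<java.util.Map.Entry<String, JsonNode>> fields = node.fields();
            while (fields.hasNext()) {
                java.util.Map.Entry<String, JsonNode> entry = fields.next();
                entries = entries.put(entry.getKey(), toJValue(entry.getValue()));
            }
            return new JValue.JObject(entries);
        }
        throw new IllegalArgumentException("unsupported json type: " + node.getNodeType());
    }
}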
Our first decoder
Now that we have data structures, we can start. A decoder is a function that implements the following interface:
import io.vavr.control.Either;

interface Decoder<T> {
    Either<String, T> decode(JValue json);
}
The left side of the Either will have an error message in case the decoder fails; otherwise the right side will have a value.
Our first decoder will just read string values, so:
Decoder<String> JStringD = new Decoder<String>() {
    @Override
    public Either<String, String> decode(JValue json) {
        return json.asJString()
                .map(wrapper -> wrapper.value)
                .toRight(() -> "expected a string, got: " + json.toString());
    }
};
Fortunately, since Decoder has a single method, we can use a lambda to define Decoder instances:
Decoder<String> JStringD = json -> json
        .asJString()
        .map(wrapper -> wrapper.value)
        .toRight(() -> "expected a string, got: " + json.toString());
And for JObject:
Decoder<JObject> JObjectD = json -> json
        .asJObject()
        .toRight(() -> "expected an object, got: " + json.toString());
Using our parse function, we can now parse the json value "hello" to get the java string hello:
JValue json = parse("\"hello\"").get(); // you would obviously handle the error
JStringD.decode(json); // Right(hello)
Yay, progress! Now to the fun part.
Combinators
Combinators are functions that build other functions; in our case, we can construct decoders based on other decoders. We'll now see some examples.
We often use primitive json types to encode more structured types, for instance we might have something like:
{"created": 1493856000}
where created is, in our code, a java.time.Instant instead of an int. So we need a way to transform a value that has been decoded. We'll call it map, since it is very similar in spirit to mapping over a List or an Option:
// a default method on the Decoder interface
default <U> Decoder<U> map(Function<T, U> f) {
    return json -> this.decode(json).map(f);
}
Essentially decA.map(f) creates a new decoder that uses the decode method of decA and then applies f to the result, if it succeeds. Assuming we have a JIntegerD that decodes integers, we can use it like this:
Decoder<Instant> instantD = JIntegerD.map(seconds -> Instant.ofEpochSecond(seconds));
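JIntegerD itself isn't defined in this post; a possible sketch, assuming the JNumber variant from earlier, could be:

Decoder<Integer> JIntegerD = json -> json
        .asJNumber()
        .map(wrapper -> wrapper.value.intValue())
        .toRight(() -> "expected an integer, got: " + json.toString());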
We'll also want to decode objects with multiple fields, so let's start by decoding a single field and worry about combining fields later. We can be sure that the user of our field decoder will need to provide a key, but how do we know that the value is valid? Well, we'll just accept another Decoder! Our function will accept a Decoder and return another Decoder.
And so, our field combinator is defined as follows:
public static <T> Decoder<T> field(String key, Decoder<T> valueDec) {
    return json -> JObjectD.decode(json)
            .flatMap(obj -> obj.get(key)
                    .toRight("missing field: " + key))
            .flatMap(val -> valueDec.decode(val)
                    .mapLeft(decError -> "field " + key + ": " + decError)
            );
}
Which reads:
- we ensure that the top value is a json object
- given a json object, we ensure the value for the key exists
- if it doesn't we provide an error message
- given the value is defined in the object, we apply the provided decoder
- if decoding fails we provide an error message referencing the field
We can now look at a value like {"name": "John", "age": 19} and read the name! But we kinda need both name and age, right? We'll get there, but let's do another combinator first.
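To make that concrete, here is what reading a single field might look like, using the parse sketch from earlier; the error strings follow the messages defined above:

JValue user = parse("{\"name\": \"John\", \"age\": 19}").get();
field("name", JStringD).decode(user);  // Right(John)
field("email", JStringD).decode(user); // Left(missing field: email)
field("age", JStringD).decode(user);   // Left(field age: expected a string, got: ...)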
It's often useful to deal with values with differing shapes that we unify to
a single type, for example we might want to accept both these values:
{"id": 123}
{"id": "123"}
by converting the first one to a string. Another example could be:
{"square": {"side": 5}}
{"circle": {"radius": 3}}
where both Square and Circle are subtypes of Shape. For this we'll make a new combinator, oneOf:
public static <T> Decoder<T> oneOf(Decoder<T> first, Decoder<T> second) {
    return json -> {
        Either<String, T> fstRes = first.decode(json);
        if (fstRes.isRight())
            return fstRes;
        Either<String, T> sndRes = second.decode(json);
        if (sndRes.isRight())
            return sndRes;
        // both are left
        return Either.left("both decoders failed: (" + fstRes.getLeft() + ") & (" + sndRes.getLeft() + ")");
    };
}
There's no reason to limit oneOf to only two decoders; this is only for demonstration purposes.
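For the id example above, a unifying decoder could look like this, reusing the JIntegerD sketch from earlier:

// accepts both {"id": 123} and {"id": "123"} and always yields a String
Decoder<String> idD = field("id", oneOf(JStringD, JIntegerD.map(Object::toString)));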
Some assembly required
We'll now see a way to chain decoders. One example is similar to the previous one, but in this case we have a versioned protocol:
{"version": "v1", "name":...}
{"version": "v2", "Name":...}
We can do this by first decoding the version field and selecting a decoder depending on that value. We'll write an andThen (could also be called flatMap) combinator for this:
// another default method on the Decoder interface
default <U> Decoder<U> andThen(Function<T, Decoder<U>> f) {
    return json ->
            this.decode(json)
                    .map(decoded -> f.apply(decoded).decode(json))
                    .getOrElseGet(Either::left);
}
This one might look a little strange, so let's break it down:
- First we apply the receiving decoder (this); in our example it decodes the version field
- If the decoder succeeded, apply f, which chooses the next decoder; in our example f takes the version string and selects the decoder
  - we then apply the selected decoder to the same value; in this case, we read the same object that contained the version
- If it fails we just return the error
And this is how we could use it for our example:
field("version", JStringD)
.andThen(version ->
version.equals("v1") ? v1Decoder :
version.equals("v2") ? v2Decoder :
// fail decoding since we don't know the version
);
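The v1Decoder and v2Decoder above are placeholders; assuming a hypothetical constructor Data(String name), and that v2 simply renamed the field, they might be as simple as:

Decoder<Data> v1Decoder = field("name", JStringD).map(Data::new);
Decoder<Data> v2Decoder = field("Name", JStringD).map(Data::new);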
With andThen we finally have a way to combine multiple fields into a single value. Given the constructor User(String firstName, String lastName), we can build a decoder for User:
Decoder<User> userDecoder = field("firstName", JStringD)
        .andThen(firstName -> field("secondName", JStringD)
                .map(secondName -> new User(firstName, secondName))
        );
Notice that the last lambda has both firstName and secondName in scope, because we have nested map inside andThen. This looks confusing, and it's easy to get the nesting wrong, so we'll write some combinators that express the pattern of combining multiple values into one. For example, combining 3 values:
public static <A, B, C, T> Decoder<T> map3(Decoder<A> aD, Decoder<B> bD, Decoder<C> cD, Function3<A, B, C, T> f) {
    return aD.andThen(_a ->
            bD.andThen(_b ->
                    cD.map(_c ->
                            f.apply(_a, _b, _c))));
}
Of course this isn't pretty either, but as a library function we only need to write it once. Now, given the constructors:
StreetName(String streetName)
User(String firstName, String lastName, StreetName streetName)
we can build the decoder for User:
Decoder<StreetName> streetNameD = JStringD.map(StreetName::new);

Decoder<User> userDecoder = map3(
        field("first_name", JStringD),
        field("last_name", JStringD),
        field("address", field("street", streetNameD)),
        User::new
);
Which is much cleaner, and is visually consistent with the structure we are expecting. We did it!
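For a quick end-to-end check, here is the decoder run against a matching document, using the parse sketch from before; how the Right value prints depends on User's toString:

JValue json = parse(
        "{\"first_name\": \"Ada\", \"last_name\": \"Lovelace\", "
      + "\"address\": {\"street\": \"Baker Street\"}}").get();
userDecoder.decode(json); // roughly Right(User(Ada, Lovelace, StreetName(Baker Street)))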
Conclusion
Is it better than the reflection-based approach? Here are the main benefits of decoders over it:
- Much more flexible: you don't need to load Jackson modules or play nice with other frameworks to work with the datatypes you want; just write a function.
- Checked at compile time: you can't say you have a decoder for User unless you provide all the values that the constructor requires.
- There are no implicit conversions (unless you want them). Many Java libraries will, for example, silently convert a json number into a string if you read the value as a string. This allows obvious schema errors to stay hidden or surface much later.
- Data is decoupled from its json representation.
The obvious downside is that you have to write more code. In the end it depends: if you've never had a problem with reflection-based json libraries, there might be no reason to change.
We have created a bunch of functions, condensing a series of verbose operations (e.g. is this an object? does it have a name field? is that field a string?) into a declarative API for defining an expected structure.
Additionally, the API is completely open for composition: you can combine decoders almost indefinitely (be mindful of the stack, though) for complex structures and build your own combinators. This combinator approach is also not constrained to json, and it offers interesting benefits for library design in other domains.
Some other examples where combinators are used are scodec, for working with binary data, and Parsec, for building parsers.
Here are some Java decoding libraries for json built this way:
- json-decoder, which I built and which motivated this blog post
- immutable-json, which has some differences and more features, and was an obvious inspiration
Finally, the code in this post is available here.
Hope you find this as interesting as I did!