Field access interfaces for more readable Java stream code.

An exploration of a way to improve the readability of Java stream code.

Introduction

Code using the Java stream API is not very readable. To be fair, this is partially because the stream API is based on the functional-programming paradigm, which most Java programmers are not familiar with. Another problem, arguably, is the rigidity of the API. Well, actually, it is the rigidity of the language. By that I mean that the strict syntax and strong typing force us to write code using terminology that does not make much sense (names like map and flatMap), verbose lambda expressions (there is no default variable name like 'it' in Groovy), and generic function types (Function).

In the previous post, I discussed Lombok's extension methods (https://dzone.com/articles/lomboks-extension-methods) and touched briefly on how they can be used to make stream code more readable. This post explores another way to make the code more readable. Disclaimer: because this is an exploration, it might not necessarily result in a technique that is more useful than the existing ones. The aim here is to document the attempts in case they are useful in some situations. Additionally, we will learn about classes in the java.util.function package along the way and might be able to use some of the techniques found here in our own code.

In this post, we will talk about field-access interfaces. A field-access interface is a representation of a field getter with added functionality. This functionality lets us write more readable stream code by allowing more expressive use of the field.

Field access

Accessing a field is normally done through its associated getter method. Java already has an easy way to refer to a getter method: the method reference.

@Data
public class Person {
  private String name;
}

val persons = Arrays.asList( ... some persons ...);
val personNames = persons.stream()
    .map(Person::getName)
    .collect(toList());
System.out.println(personNames);

Notice that I use lombok.Data, lombok.val (type inference), and a static import of java.util.stream.Collectors.toList() for brevity. The term Person::getName in the code is a method reference to getName, the getter of the field name. This reference is then converted to a Function<Person, String>, meaning that given a Person, a String (the name) can be returned. This is a shorter and better form of the lambda person->person.getName(). So what more is there to it, you may ask. Let's say we want to find the persons whose names start with 'smi', a typical use case for autocompletion (like looking for 'smith'). We can write...

  public List<Person> autoCompleteByName(String term, int limit) {
    val allPersons = getAllPersons();
    val foundPersons = allPersons.stream()
        .filter(person->person.getName().startsWith(term))
        .limit(limit)
        .collect(toList());
    return foundPersons;
  }

Because we are not just using getName but also doing some processing on its result (the startsWith check inside the filter), we can't just use the method reference. The code works, but the person->person ... part of the code might irritate some people (e.g., me). I wish Java had something like 'it' in Groovy (something like it might come in a future version of Java in the form of a single underscore _). So let's see if we can do better. We can extract that portion of the code out as a method. Like this...

  // The field type must be String here; an unbounded FIELD type has no startsWith method.
  private <HOST> Predicate<HOST> startsWith(Function<HOST, String> access, String value) {
    return host->access.apply(host).startsWith(value);
  }
  public List<Person> autoCompleteByName(String term, int limit) {
    val allPersons = getAllPersons();
    val foundPersons = allPersons.stream()
        .filter(startsWith(Person::getName, term))
        .limit(limit)
        .collect(toList());
    return foundPersons;
  }

The above code is reasonably readable. In most cases, we can be satisfied with stopping here. In fact, I would recommend this type of refactoring for most practical situations. But something inside me is still itching for a more naturally readable way. Perhaps it is the fact that the code calls verb(object, subject). So let's go further with the following code.

  @FunctionalInterface
  public interface StringField<HOST> extends Function<HOST, String> {
    public default Predicate<HOST> startsWith(String term) {
      return host -> this.apply(host).startsWith(term);
    }
  }

  private static final StringField<Person> personName = Person::getName;

  public List<Person> autoCompleteByName(String term, int limit) {
    val allPersons = getAllPersons();
    val foundPersons = allPersons.stream()
        .filter(personName.startsWith(term))
        .limit(limit)
        .collect(toList());
    return foundPersons;
  }

Now the filter is done using personName.startsWith(term), which reads more naturally (object.verb(subject)). Please read until the end (the discussion section) if you don't think this is a good idea. For now, let me explain how all this works. Basically, we create a StringField interface that extends Function<HOST, String>. The extension inherits the method String apply(HOST host), which stays abstract. Then we add a default method Predicate<HOST> startsWith(String term) so that the field can check whether the host's field value starts with the term. Since there is only one abstract method in the interface, we can annotate it with @FunctionalInterface. Then, we create a personName constant of the type StringField<Person> and assign it the getter reference Person::getName. When I use this technique, I normally put this constant in the host class, i.e., Person. The getter reference provides the implementation of the abstract method String apply(HOST host), where HOST is Person. With all that done, we can use personName.startsWith(term) to create the Predicate<Person> that the filter method wants.
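To try the idea outside of a larger code base, here is a minimal, self-contained sketch of the same technique. It is only an illustration: the StringFieldDemo class, the hand-written Person class (standing in for the Lombok @Data version), and the sample names are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class StringFieldDemo {

    // Plain-Java stand-in for the Lombok @Data class (assumption for this sketch).
    static final class Person {
        private final String name;
        Person(String name) { this.name = name; }
        String getName() { return name; }
    }

    // A getter (the inherited abstract apply) plus a predicate factory.
    @FunctionalInterface
    interface StringField<HOST> extends Function<HOST, String> {
        default Predicate<HOST> startsWith(String term) {
            return host -> this.apply(host).startsWith(term);
        }
    }

    // The method reference supplies the implementation of apply(Person).
    static final StringField<Person> personName = Person::getName;

    // Returns the names (not the persons) to keep the output easy to read.
    static List<String> autoCompleteByName(List<Person> persons, String term, int limit) {
        return persons.stream()
                .filter(personName.startsWith(term))
                .limit(limit)
                .map(personName) // a StringField is still a Function<Person, String>
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
                new Person("smith"), new Person("jones"), new Person("smiley"));
        System.out.println(autoCompleteByName(persons, "smi", 10)); // [smith, smiley]
    }
}
```

Note that the same personName constant serves both as the predicate factory in filter and as the plain getter function in map, because the field-access interface is still a Function.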

Armed with this technique, we can write many similar interfaces and methods. For example...

@FunctionalInterface
public interface FieldAccess<HOST, TYPE> extends Function<HOST, TYPE> {
    public default Predicate<HOST> is(TYPE value) {
        return host -> Objects.equals(this.apply(host), value);
    }
    public default Predicate<HOST> isNot(TYPE value) {
        return host -> !Objects.equals(this.apply(host), value);
    }
    public default Predicate<HOST> isNull() {
        return host -> Objects.isNull(this.apply(host));
    }
    public default Predicate<HOST> isNotNull() {
        return host -> Objects.nonNull(this.apply(host));
    }
}

@FunctionalInterface
public interface StringField<HOST> extends FieldAccess<HOST, String> {
  public default Predicate<HOST> startsWith(String term) {
      return host -> this.apply(host).startsWith(term);
  }
  public default Predicate<HOST> matches(String regex) {
      return host -> this.apply(host).matches(regex);
  }
  public default Predicate<HOST> isEmpty() {
      return host -> this.apply(host).isEmpty();
  }
  public default IntField<HOST> length() {
      return host -> this.apply(host).length();
  }
}

@FunctionalInterface
public interface CollectionField<HOST, TYPE, COLLECTION extends Collection<TYPE>>
                    extends FieldAccess<HOST, COLLECTION> {
    public default Predicate<HOST> contains(TYPE value) {
        return host -> this.apply(host).contains(value);
    }
    public default Predicate<HOST> contains(Predicate<? super TYPE> check) {
        return host -> this.apply(host).stream().anyMatch(check);
    }
    public default StreamField<HOST, TYPE> stream() {
        return host -> this.apply(host).stream();
    }
}

@FunctionalInterface
public interface StreamField<HOST, TYPE> 
                    extends FieldAccess<HOST, Stream<TYPE>> {
    
}

@FunctionalInterface
public interface IntField<HOST> extends ToIntFunction<HOST> {
    public default Predicate<HOST> equalTo(int value) {
        return host->this.applyAsInt(host) == value;
    }
    public default Predicate<HOST> lessThan(int value) {
        return host->this.applyAsInt(host) < value;
    }
}

I hope you can easily guess what each of the interfaces in the code above is about. With those field-access interfaces, you can write the following.

val total = orders.stream()
        .filter(orderStatus.isNot("CANCEL"))
        .filter(orderItems.contains(itemId.equalTo(1)))
        .flatMap(orderItems.stream())
        .mapToInt(itemPrice)
        .sum();

Except for the stream API method names, the code now reads mostly like English.
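For readers who want to run that pipeline, the following is a rough, self-contained sketch. The Order and Item classes, their fields, the field constants, and the sample data are all assumptions invented for this illustration, and the interfaces are trimmed down to the methods the pipeline actually uses.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Objects;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.ToIntFunction;
import java.util.stream.Stream;

class OrderTotalDemo {

    interface FieldAccess<HOST, TYPE> extends Function<HOST, TYPE> {
        default Predicate<HOST> isNot(TYPE value) {
            return host -> !Objects.equals(this.apply(host), value);
        }
    }

    interface IntField<HOST> extends ToIntFunction<HOST> {
        default Predicate<HOST> equalTo(int value) {
            return host -> this.applyAsInt(host) == value;
        }
    }

    interface StreamField<HOST, TYPE> extends FieldAccess<HOST, Stream<TYPE>> { }

    interface CollectionField<HOST, TYPE, COLLECTION extends Collection<TYPE>>
            extends FieldAccess<HOST, COLLECTION> {
        default Predicate<HOST> contains(Predicate<? super TYPE> check) {
            return host -> this.apply(host).stream().anyMatch(check);
        }
        default StreamField<HOST, TYPE> stream() {
            return host -> this.apply(host).stream();
        }
    }

    // Hypothetical domain classes for this sketch.
    static final class Item {
        private final int id;
        private final int price;
        Item(int id, int price) { this.id = id; this.price = price; }
        int getId() { return id; }
        int getPrice() { return price; }
    }

    static final class Order {
        private final String status;
        private final List<Item> items;
        Order(String status, Item... items) {
            this.status = status;
            this.items = Arrays.asList(items);
        }
        String getStatus() { return status; }
        List<Item> getItems() { return items; }
    }

    // Field-access constants backed by getter references.
    static final FieldAccess<Order, String> orderStatus = Order::getStatus;
    static final CollectionField<Order, Item, List<Item>> orderItems = Order::getItems;
    static final IntField<Item> itemId = Item::getId;
    static final IntField<Item> itemPrice = Item::getPrice;

    static int total(List<Order> orders) {
        return orders.stream()
                .filter(orderStatus.isNot("CANCEL"))
                .filter(orderItems.contains(itemId.equalTo(1)))
                .flatMap(orderItems.stream())
                .mapToInt(itemPrice)
                .sum();
    }

    public static void main(String[] args) {
        List<Order> orders = Arrays.asList(
                new Order("PAID", new Item(1, 10), new Item(2, 5)), // kept: 10 + 5
                new Order("CANCEL", new Item(1, 100)),              // dropped: cancelled
                new Order("PAID", new Item(3, 7)));                 // dropped: no item with id 1
        System.out.println(total(orders)); // 15
    }
}
```

One detail worth noticing: the pipeline sums the prices of all items in the orders that pass the filters, not only the item with id 1.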

Discussion

So, how do you like it? To me, the last code is much more pleasing to the eye. It makes the code look more like a DSL. It also makes the code friendlier to non-developers, such as domain experts, to the point that they might be able to help validate the intention of the code. However, the amount of effort needed to make it happen is quite large.

The effort might be worth it if the code deals a lot with domain-specific terminology, or if the logic that would otherwise go into a lambda often takes more than one line and would hurt readability there.

The technique might also make sense if a lot of the code deals with sub-objects while continuing to stream the parent objects. Operations such as

.filter(orderItems.contains(itemId.equalTo(1)))
.filter(orderItems.contains(itemPrice.lessThan(5)))
.filter(orderItems.contains(itemCategory.is("PART")))

When you do a lot of these, it might be worth creating field-access interfaces for them.

Another situation is when working with something like BigDecimal (used a lot in financial and scientific applications). Operations on BigDecimal are quite verbose and often involve hidden context (a MathContext, i.e., the precision and rounding mode in this case). Those context properties can be embedded into a BigDecimalField, out of sight.
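As a rough illustration of that last idea, a BigDecimalField might look something like the sketch below. Everything in it is hypothetical: the interface and its method names, the chosen MathContext (4 significant digits, HALF_UP), the Invoice class, and the tax factor are assumptions made up for this example.

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class BigDecimalFieldDemo {

    // Hypothetical field-access interface: the MathContext is fixed in one place.
    @FunctionalInterface
    interface BigDecimalField<HOST> extends Function<HOST, BigDecimal> {
        MathContext CONTEXT = new MathContext(4, RoundingMode.HALF_UP);

        default Predicate<HOST> greaterThan(BigDecimal value) {
            return host -> this.apply(host).compareTo(value) > 0;
        }

        // Derives a new field whose value is the original multiplied by factor,
        // rounded according to CONTEXT.
        default BigDecimalField<HOST> multipliedBy(BigDecimal factor) {
            return host -> this.apply(host).multiply(factor, CONTEXT);
        }
    }

    // Hypothetical domain class for this sketch.
    static final class Invoice {
        private final BigDecimal amount;
        Invoice(String amount) { this.amount = new BigDecimal(amount); }
        BigDecimal getAmount() { return amount; }
    }

    static final BigDecimalField<Invoice> invoiceAmount = Invoice::getAmount;
    static final BigDecimal TAX_FACTOR = new BigDecimal("1.07"); // assumed tax factor

    // Gross amounts (net amount times tax factor) that exceed the threshold.
    static List<BigDecimal> grossAmountsAbove(List<Invoice> invoices, BigDecimal threshold) {
        BigDecimalField<Invoice> grossAmount = invoiceAmount.multipliedBy(TAX_FACTOR);
        return invoices.stream()
                .filter(grossAmount.greaterThan(threshold))
                .map(grossAmount)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Invoice> invoices = Arrays.asList(new Invoice("19.999"), new Invoice("5.00"));
        System.out.println(grossAmountsAbove(invoices, BigDecimal.TEN));
    }
}
```

The stream code never mentions the precision or rounding mode; whether hiding them this way is a benefit or a risk depends on how much the rounding rules matter to the reader of that code.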

But in most cases, what you get might not be worth the amount of work. If there were something like a Lombok annotation that added all these field-access interfaces for us, the technique would be much more attractive.

Finally, some people like to understand how the code does its job at a glance. To me, what is more important is what the code does or attempts to do. It is better if it also shows how it does it. But very often, the "how" parts obscure the overall code and make it harder to find things or understand the big picture. I prefer to first see what the code is trying to do, then zoom in to see how it does it. This way I can scan the code and locate the part that I am looking for. Thus, I lean toward code that has its implementation details extracted out. That said, it is up to you and your team to agree on what is readable.

Conclusion

This article explored field-access interfaces as a way to make stream code more readable. A field-access interface can access the field and also includes additional operations that return the appropriate function objects required by the stream API. This makes the code read more naturally. However, it requires quite a lot of work to prepare these interfaces.

So, how do you like it? Do you see yourself using the technique? Do you think it is more work than it is worth? What techniques do you use to make your stream code more readable? Do you like code that explains what it does even if it hides the implementation? Or do you like code that shows you everything?

Happy coding! Nawa Man
