12 March 2018
CATEGORY
FunctionalFun
TAGS
FunctionalJavaStateClosureCacheLazyCounter

FunctionalFun -- States in Functions: Cache, Lazy and Counter

Breaking the functional rules and have some fun!

Introduction

Functional programming advocates for programs with no states. Sometimes, however, we need to use states ... not for holding data that change over time like entities; but for other facilitating purposes. There happen to be a way to hold state in Java functions without turning it into an object. This article discusses such way to add state for your functions with practical examples: cache, lazy-initialization with memorization and proving indexes.

States for the Functions

One of the important principal of Functional Programming (FP) is to avoid having states. States generally make it more difficult to reason about the code, basically, by adding temporal dimension to the logic. They also represent the side effect also known as the enemy number one of FP. But just like grey, there are fifty shades of functional purity. If having states provide practical benefits AND (big 'AND') they do not cause much problems -- such as unpredictability, it is sometime worth to consider. So, how will we go about having some states in our functions without turning it to an object? By staying a function, the state has no way to unexpectedly leak out as there is only one way in and out of the function.

Secret Storage Space

There is a secret storage space in Java that most people often overlook (`under exploited` might be a better word). That storage space is the closure constants. The closure constants are the constants that is attached to the lambda that use it. Consider the following code.

public List<Integer> getSmall(int threshold) {
  return list.stream()
      .filter(i -> i <= threshold)
      .collect(toList());
}

threshold is the closure constant for the lambda i -> i <= threshold. From the look of it, threshold does not seem to be special as it looked just like any other variables (or parameters) that can be accessed by the code in the scope. But in reality, it is attached to the lambda (created together with the lambda) and tagalong with the lambda through the adventure in filter method. If it still not clear to you, take a look at another code.

private Predicate<Integer> smallerThan(int threshold) {
  return i -> i <= threshold;
}

...
List<Integer> smalls
    = list.stream()
      .filter(smallerThan(5))
      .collect(toList());
...

As you can see, the lambda created inside smallerThan is returned out of the method. And threshold follows along as a good companion, eager to help answering if the given number is small. This can be seen as a partial application of function ... or some form of currying. Now that you know the secret place, let make use of it.

Quick cache

Let say you need to read some information from the database or over the network. This information does not get changed often so it is ok to cache it for a short period of time such as the life of a request. Using a closure constant to hold the value we can use it to create a cache. The following code demonstrates how to use it to create a cache for any functions.

public static <INPUT, OUTPUT> Function<INPUT, OUTPUT> cache(Function<INPUT, OUTPUT> inFunction) {
    val cache = new ConcurrentHashMap<INPUT, OUTPUT>();
    return in -> cache.computeIfAbsent(in, inFunction::apply);
}

NOTE: Lombok's val is used for brevity.
And then you can use the cache function like this ...

@RequestScoped
public class MrScopeBean {

  @Inject
  private CompetitionService competitionService;
  @Inject
  private UserService userService;
  @Inject
  private CityService cityService;

  private final Function<String, String> cityName = cache(cityService::getCityById).andThen(City::getName);

  /** Returns the winner name and the city. */
  public List<String> getWinnerNamesAndCities() {
    return competitionService
        .getWinners().stream()
        .map(userService::getUserById)
        .map(user -> user.getName() + " @ "+ cityName.apply(user.getCityId()))
        .collect(toList());
  }

}

The function returned from the method cache has a cached embedded inside. So multiple calls for the same city id will only trigger the actual service call once.

WARNING: Improperly implemented cache can cause memory leak!
Before we go on, let's get this warning out first. Caching improperly can cause nasty memory leak that results in the certain death of your application. So, put the great care when use cache in long running application. The cache implementation shown here are reasonable for short-to-medium live caching need. Since the actual cache map is with the cityName function, once the function is garbage collected, the whole map is freed to be garbage collected as well. Thus, putting the cached-function object (cityName) in request scoped object or bean, will limit its live span there -- given it is not leaked to outside this bean.

Lazy initialization with memorization

Let say you have an operation that is expensive to do and the result of the operation can be reused within an acceptable period of time. The function below might be useful.

public static <T> Supplier<T> lazy(Supplier<T> supplier) {
  val cacheKey  = new Object();
  val reference = new AtomicReference<Object>();
  return ()->{
    if (reference.compareAndSet(null, cacheKey)) {
      try {
        val value = supplier.get();
        reference.set((Supplier<T>)(()->value));
      } catch (RuntimeException e) {
        reference.set(e);
      }
    }
    while (!(reference.get() instanceof Supplier)) {
      if (reference.get() instanceof RuntimeException)
        throw (RuntimeException)reference.get();
    }
    return ((Supplier<T>)reference.get()).get();
  };
}

In this function, a supplier to perform and return something is given as a parameter. An AtomicReference is then used to stored an object as a cache key to mark that the first processing has started. When the computation is done, a supplier of the value is then set to the reference (replacing the cache key). In the case of exception, the exception will then be assigned to the reference (again, replacing the cache key). If other threads get in while the computation is not done yet, that threads will be in the while loop until the supplier is done or an exception is found as the value. It is done this way to ensure that the supplier only run when needed and only run once no matter how many time it is called (even calls from multiple threads). You can see how I test it here: ClosureConstantTest on GitHub

With all that, the function might be used like this.

@RequestScoped
public class MrScopeBean {

  @Inject
  private ConfigService configService;
  @Inject
  private UserService userService;

  private final Function<String, User> userById = userService::getById;
  private final Supplier<Configurations> configurations = lazy(configService::readConfigFromFile);

  public boolean isQualify(String id) {
    val userPoint = userById.andThen(User::getLoyatyPoint).apply(id);
    val qualPoint = configurations.get().getQualifiedPoint();
    return userPoint >= qualPoint;
  }
}

In this example code, reading configuration from a file is expensive so the lazy function is used to make it lazily applied and its result is memorized. Just like the cache example earlier, the configurations is only stored within the life of if the object configurations.

Counter: Index in a Stream

Let's say, you need to have access to the index of elements within the code the use Java8 stream API. For example, given a list, you need to filter it with a certain condition and return the indexes of the ones fitting the condition. Here is how you might approach this:

public static <INPUT, OUTPUT> Function<INPUT, OUTPUT> withIndex(BiFunction<INPUT, Integer, OUTPUT> body) {
  val index = new AtomicInteger();
  return input -> body.apply(input, index.getAndIncrement());
}

The funcion above will give you access to the index of each element so you can makes use of it.

public List<Index> getQualifiedWinersIndexes(List<User> winners) {
  return winners.stream()
    .map(withIndex(Pair::new))  // Pair is a tuple of 2
    .filter(pair->pair._1.getLoyatyPoint() > configurations.get().getQualifiedPoint())
    .map(pair._2)
    .collect(toList());
}

In the code above, the winners (Users) is streamed and mapped to a pair using withIndex. withIndex take a BiFunction of index and the element (the winner), then return a pair holding both index and the winner. Then, we filter the winner that has a qualified loyalty point. Finally, we collect only the index and return it back.

Or if you want to print the element with its index, you might make the one with BiConsumer instead.

public static <INPUT, OUTPUT> Consumer<INPUT> withIndex(BiConsumer<INPUT, Integer> body) {
  val index = new AtomicInteger();
  return input -> body.accept(input, index.getAndIncrement());
}
public void print(List<User> winners) {
  System.out.println(
    winners.stream()
    .filter(winner->winner.getLoyatyPoint() > configurations.get().getQualifiedPoint())
    .map(withIndex((winner, idx)->idx + ": " + winner.getName()))
    .collect(joining("\n"))
  );
}

The index value is actually the incremental of an atomic integer like a counter. So every time, the function is called, the index got incremented. Since the function is called once for each iteration, you get an incremental index for each of those. But that means the index is not exactly the index of the element in the original list, but the index of the element that pass through that operation. If the stream pass through filters before going though withIndex, for example, the result index will be the index of the elements after the filter as you see in the second example.

Conclusion

In a pure functional programming, functions must not have states. However, there are practical reasons for function to have states. Closure constants holding mutable objects can be used as private storage of a function. They can be used to store necessary states. Proper consideration must be given to ensure benefits without introducing complexity and unpredictability.

What do you think about states in functions? Will you use something like these in your code? Have you use closure constants for something rather than just currying?

Happy coding!
Nawa Man

Comments

Thank you for keeping the comment section positive, constructive and respectful environment. I do appreciate constructive criticism & respectful disagreement! I have ZERO tolerant for disrespect, harassment, threats, cyber-bullying, racism, sexism attacks, foul language, and spam. Comments will be actively moderated and abusive users will be blocked. Keep it civil! :-)