NashTech Blog

Caffeine Cache: A High-Performance Caching Library for Java

hanguyenvan

Introduction

In modern application development, caching is crucial for improving performance and reducing latency. Whether you’re fetching data from a database, making expensive computations, or calling external APIs, caching can significantly boost your application’s responsiveness. Enter Caffeine: a high-performance, near-optimal caching library for Java. Caffeine 3.x requires Java 11 or later; the 2.x line supports Java 8.

What is Caffeine Cache?

Caffeine is a modern, high-performance caching library for Java that provides an in-memory cache using a Google Guava-inspired API. It’s designed to be a superior alternative to Guava’s cache, offering significant performance improvements. Developed by Ben Manes, Caffeine leverages advanced algorithms and contemporary Java features to deliver exceptional caching performance.

Why Choose Caffeine?

  1. Blazing Fast Performance: Caffeine uses an eviction policy called W-TinyLFU (Window TinyLFU), which weighs both how recently and how frequently an entry has been used. This gives near-optimal hit rates, so the entries most likely to be requested again are the ones that stay in memory.
  2. Flexible Eviction Policies: Caches have limited memory. Caffeine provides multiple ways to decide what to remove (evict) when the cache is full:
    • Size-based: Limit the cache to a certain number of entries.
    • Time-based: Expire entries after a set duration (e.g., 10 minutes after write, or 1 minute after the last access).
    • Reference-based: Evict entries based on Java’s garbage collection rules (using weak or soft references).
  3. Fluent and Easy-to-Use API: Building a cache is a joy with Caffeine’s builder pattern. You can chain methods together to configure your cache in a clean, readable way.
  4. Asynchronous Operations: Need to load cache values from a remote service or database without blocking your main thread? Caffeine’s AsyncCache has you covered, returning a CompletableFuture for non-blocking workflows.
  5. Helpful Statistics: Wondering how effective your cache is? Caffeine can record stats like hit rate, eviction count, and load times, which are invaluable for monitoring and tuning.

Getting Started: Adding Caffeine to Your Project

First things first, let’s add the dependency. If you’re using Maven, add this to your pom.xml:

<dependency>
    <groupId>com.github.ben-manes.caffeine</groupId>
    <artifactId>caffeine</artifactId>
    <version>3.1.8</version> <!-- Check for the latest version on Maven Central -->
</dependency>

The Three Flavors of Caffeine Caches

Caffeine offers three main types of caches, each suited for different use cases.

1. Manual Cache (Cache)

This is the most straightforward type. You manually put values into the cache and use getIfPresent to retrieve them. If a value isn’t there, you get null.

The most powerful method here is get(key, mappingFunction). It performs an atomic “get-or-create” operation. If the key is present, it returns the value. If not, it computes the value using the provided function, stores it in the cache, and then returns it. This prevents multiple threads from trying to compute the same missing value simultaneously.

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

import java.util.concurrent.TimeUnit;

public class ProductService {

    private final ProductRepository repository = new ProductRepository();

    // Build a cache that holds up to 100 products and expires them
    // 10 minutes after they are last written to.
    private final Cache<String, Product> productCache = Caffeine.newBuilder()
            .maximumSize(100)
            .expireAfterWrite(10, TimeUnit.MINUTES)
            .build();

    public Product getProductById(String id) {
        System.out.println("Attempting to get product with ID: " + id);

        // The 'get' method is an atomic "get-or-compute" operation.
        // If the key is not in the cache, the lambda is executed to load the value.
        Product product = productCache.get(id, key -> {
            System.out.println("Cache miss! Fetching from database for ID: " + key);
            return repository.findById(key);
        });

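        // get() returns null when the loader returns null; nothing is cached in that case.
        if (product != null) {
            System.out.println("Returning product: " + product.getName());
        }
        return product;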
    }
}

If you call getProductById("P123") twice, you’ll see “Fetching from database” only the first time. The second call is served from memory without touching the repository.
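
To see why an atomic get-or-compute matters, here is a minimal sketch that uses only the JDK's ConcurrentHashMap.computeIfAbsent, which gives a comparable per-key guarantee. It illustrates the idea and is not Caffeine's implementation; the class name GetOrComputeSketch and the slowLookup helper are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; not part of Caffeine.
final class GetOrComputeSketch {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final AtomicInteger loads = new AtomicInteger();

    // Stand-in for an expensive database call (an assumption for this example).
    private String slowLookup(String id) {
        loads.incrementAndGet();
        return "product-" + id;
    }

    String get(String id) {
        // computeIfAbsent runs the function at most once per absent key,
        // even when several threads ask for the same key at the same time.
        return cache.computeIfAbsent(id, this::slowLookup);
    }

    // Number of times the expensive lookup actually ran.
    int loadCount() {
        return loads.get();
    }
}
```

Calling get("P123") twice on one instance leaves loadCount() at 1, which mirrors the single "Fetching from database" message described above.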

2. Loading Cache (LoadingCache)

LoadingCache is a self-populating cache. When you build it, you provide a CacheLoader that knows how to fetch a value for any given key. Then, you just call get(key), and the cache handles the rest. If the value is present, it’s returned; if not, the loader is automatically invoked.

This simplifies your code by centralizing the loading logic.

import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.LoadingCache;

import java.util.concurrent.TimeUnit;

public class UserService {

    private final UserRepository repository = new UserRepository();

    // A LoadingCache automatically fetches missing values using the provided loader.
    private final LoadingCache<Integer, User> userCache = Caffeine.newBuilder()
            .expireAfterAccess(1, TimeUnit.HOURS) // Expire after 1 hour of inactivity
            .build(userId -> repository.findById(userId)); // The CacheLoader lambda

    public User getUserById(Integer id) {
        System.out.println("Fetching user with ID: " + id);
        // Just call get(). The cache handles the loading logic automatically.
        return userCache.get(id);
    }
}

The calling code is cleaner. You don’t need to pass a loading function every time; it’s part of the cache’s definition.

3. Asynchronous Cache (AsyncCache and AsyncLoadingCache)

What if your data loading involves a network call that returns a CompletableFuture? An AsyncCache is the perfect fit. It stores CompletableFuture<V> as values, allowing your application to remain non-blocking while waiting for data.

import com.github.benmanes.caffeine.cache.AsyncLoadingCache;
import com.github.benmanes.caffeine.cache.Caffeine;

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class WeatherService {

    private static final Executor executor = Executors.newFixedThreadPool(10);

    // Build an async cache that loads data on a separate thread pool.
    private final AsyncLoadingCache<String, WeatherData> weatherCache = Caffeine.newBuilder()
            .maximumSize(100)
            .expireAfterWrite(10, TimeUnit.MINUTES)
            .buildAsync((city, executorService) -> fetchWeatherFromApi(city));

    public CompletableFuture<WeatherData> getWeatherForCity(String city) {
        // The get() method immediately returns a CompletableFuture.
        return weatherCache.get(city);
    }

    // Simulates a slow remote call by sleeping on a worker thread from the pool,
    // so the calling thread is never blocked.
    private CompletableFuture<WeatherData> fetchWeatherFromApi(String city) {
        return CompletableFuture.supplyAsync(() -> {
            System.out.println("Fetching weather for " + city + " from remote API...");
            try {
                Thread.sleep(1000); // Simulate network latency
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // Preserve the interrupt status
            }
            return new WeatherData(city, 25.5);
        }, executor);
    }
}
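
The key idea behind an async cache is that the future itself is the cached value. The sketch below shows that idea with JDK classes only; it is a simplification and not how Caffeine is implemented, and FutureCacheSketch is a hypothetical name. Unlike Caffeine's async caches, it does not remove entries whose future fails.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical illustration of caching futures; not Caffeine's implementation.
final class FutureCacheSketch<K, V> {
    private final Map<K, CompletableFuture<V>> futures = new ConcurrentHashMap<>();
    private final Function<K, CompletableFuture<V>> loader;
    private final AtomicInteger loads = new AtomicInteger();

    FutureCacheSketch(Function<K, CompletableFuture<V>> loader) {
        this.loader = loader;
    }

    CompletableFuture<V> get(K key) {
        // The future is stored as soon as the load starts, so callers that
        // arrive while the load is in flight receive the same pending future.
        return futures.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }

    // Number of times the loader was started.
    int loadCount() {
        return loads.get();
    }
}
```

Two callers asking for the same city before the first response arrives share one request instead of issuing two.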

Advanced Configuration Options

Eviction: by count or by weight

  • maximumSize(n): evict to keep at most n entries.
  • maximumWeight(w) + weigher: evict by “cost” (e.g., bytes). Use this if entries vary widely in size. It cannot be combined with maximumSize.

Cache<String, String> cache = Caffeine.newBuilder()
    .maximumSize(1000)  // Maximum number of entries
    .build();

// Or weight-based
Cache<String, String> weightedCache = Caffeine.newBuilder()
    .maximumWeight(10_000)
    .weigher((String key, String value) -> value.length())  // Explicit parameter types are required here
    .build();
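
To make the arithmetic of a weight bound concrete, here is a minimal sketch with JDK classes only. It evicts oldest-first, which is a deliberate simplification and not Caffeine's W-TinyLFU policy; WeightBoundSketch is a hypothetical name.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.ToIntFunction;

// Hypothetical illustration of a weight bound with oldest-first eviction.
final class WeightBoundSketch {
    private final Map<String, String> entries = new LinkedHashMap<>();
    private final ToIntFunction<String> weigher;
    private final long maximumWeight;
    private long totalWeight;

    WeightBoundSketch(long maximumWeight, ToIntFunction<String> weigher) {
        this.maximumWeight = maximumWeight;
        this.weigher = weigher;
    }

    void put(String key, String value) {
        // Replacing a value first gives back the weight of the old one.
        String previous = entries.remove(key);
        if (previous != null) {
            totalWeight -= weigher.applyAsInt(previous);
        }
        entries.put(key, value);
        totalWeight += weigher.applyAsInt(value);

        // Evict the oldest entries until the total weight fits the bound.
        Iterator<Map.Entry<String, String>> it = entries.entrySet().iterator();
        while (totalWeight > maximumWeight && it.hasNext()) {
            Map.Entry<String, String> eldest = it.next();
            totalWeight -= weigher.applyAsInt(eldest.getValue());
            it.remove();
        }
    }

    boolean contains(String key) {
        return entries.containsKey(key);
    }

    long totalWeight() {
        return totalWeight;
    }
}
```

With a bound of 10 and String::length as the weigher, inserting values of length 5, 4, and 3 pushes the total to 12, so the first entry is dropped and the total settles at 7.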

Expiration: fixed or per-entry

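  • expireAfterWrite(duration): TTL since write.
  • expireAfterAccess(duration): idle timeout.
  • refreshAfterWrite(duration): marks an entry as eligible for reload once it is older than the duration. The next read triggers the reload, and the old value is served until the new one is ready. It requires a LoadingCache and does not remove entries by itself.
  • expireAfter(Expiry<K, V>): variable TTL per entry.

// Expire after write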
Cache<String, String> cache1 = Caffeine.newBuilder()
    .expireAfterWrite(10, TimeUnit.MINUTES)
    .build();

// Expire after access
Cache<String, String> cache2 = Caffeine.newBuilder()
    .expireAfterAccess(5, TimeUnit.MINUTES)
    .build();

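// Custom per-entry expiration. Each method returns a duration in nanoseconds;
// returning currentDuration leaves the entry's remaining lifetime unchanged.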
Cache<String, String> cache3 = Caffeine.newBuilder()
    .expireAfter(new Expiry<String, String>() {
        @Override
        public long expireAfterCreate(String key, String value, long currentTime) {
            return TimeUnit.MINUTES.toNanos(10);
        }

        @Override
        public long expireAfterUpdate(String key, String value, 
                                      long currentTime, long currentDuration) {
            return currentDuration;
        }

        @Override
        public long expireAfterRead(String key, String value, 
                                    long currentTime, long currentDuration) {
            return currentDuration;
        }
    })
    .build();
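
The difference between expire-after-write and expire-after-access comes down to whether a read resets the entry's timestamp. The sketch below shows both modes with JDK classes and an injectable clock so the behavior can be checked without waiting. It is an illustration under those assumptions, not Caffeine's implementation, and ExpirySketch is a hypothetical name.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

// Hypothetical illustration of time-based expiration; not thread-safe.
final class ExpirySketch {
    private static final class Timed {
        final String value;
        long touchedAtNanos;

        Timed(String value, long touchedAtNanos) {
            this.value = value;
            this.touchedAtNanos = touchedAtNanos;
        }
    }

    private final Map<String, Timed> entries = new HashMap<>();
    private final long ttlNanos;
    private final boolean resetOnRead;  // false = after write, true = after access
    private final LongSupplier nanoClock;

    ExpirySketch(long ttl, TimeUnit unit, boolean resetOnRead, LongSupplier nanoClock) {
        this.ttlNanos = unit.toNanos(ttl);
        this.resetOnRead = resetOnRead;
        this.nanoClock = nanoClock;
    }

    void put(String key, String value) {
        entries.put(key, new Timed(value, nanoClock.getAsLong()));
    }

    // Returns the value, or null if it is absent or has expired.
    String getIfPresent(String key) {
        Timed timed = entries.get(key);
        if (timed == null) {
            return null;
        }
        long now = nanoClock.getAsLong();
        if (now - timed.touchedAtNanos >= ttlNanos) {
            entries.remove(key); // expired entries are dropped lazily on read
            return null;
        }
        if (resetOnRead) {
            timed.touchedAtNanos = now; // a read extends the lifetime
        }
        return timed.value;
    }
}
```

In after-write mode a 10-minute entry disappears 10 minutes after it was stored no matter how often it is read; in after-access mode each read pushes the deadline out by another 10 minutes.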

Refresh Cache

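Mark entries as stale after a specified duration, so that the next read triggers an asynchronous reload while the old value is still returned. This requires a LoadingCache (or AsyncLoadingCache):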

LoadingCache<String, String> cache = Caffeine.newBuilder()
    .refreshAfterWrite(1, TimeUnit.MINUTES)
    .build(key -> loadData(key));
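
How refresh differs from expiration is easiest to see in a small model: a stale read returns the old value and schedules a reload, rather than blocking on a fresh load. The sketch below models that with JDK classes, an injectable clock, and a caller-supplied Executor. It is a single-threaded simplification, not Caffeine's implementation, and RefreshSketch is a hypothetical name.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Executor;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical illustration of refresh-after-write semantics; not thread-safe.
final class RefreshSketch {
    private static final class Timed {
        String value;
        long writtenAtNanos;

        Timed(String value, long writtenAtNanos) {
            this.value = value;
            this.writtenAtNanos = writtenAtNanos;
        }
    }

    private final Map<String, Timed> entries = new HashMap<>();
    private final long refreshNanos;
    private final Function<String, String> loader;
    private final Executor executor;
    private final LongSupplier nanoClock;

    RefreshSketch(long refreshNanos, Function<String, String> loader,
                  Executor executor, LongSupplier nanoClock) {
        this.refreshNanos = refreshNanos;
        this.loader = loader;
        this.executor = executor;
        this.nanoClock = nanoClock;
    }

    String get(String key) {
        long now = nanoClock.getAsLong();
        Timed timed = entries.get(key);
        if (timed == null) {
            // A miss loads in the calling thread, like any other cache miss.
            timed = new Timed(loader.apply(key), now);
            entries.put(key, timed);
            return timed.value;
        }
        String current = timed.value;
        if (now - timed.writtenAtNanos >= refreshNanos) {
            // Stale: hand the reload to the executor and keep serving the old value.
            Timed entry = timed;
            executor.execute(() -> {
                entry.value = loader.apply(key);
                entry.writtenAtNanos = nanoClock.getAsLong();
            });
        }
        return current;
    }
}
```

The read that notices the entry is stale still gets the previous value; only a later read sees the reloaded one.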

Cache Statistics

Monitor cache performance with built-in statistics:

Cache<String, String> cache = Caffeine.newBuilder()
    .maximumSize(100)
    .recordStats()  // Enable statistics
    .build();

// Use the cache
cache.put("key1", "value1");
cache.getIfPresent("key1");
cache.getIfPresent("key2");  // Cache miss

// Get statistics
CacheStats stats = cache.stats();
System.out.println("Hit rate: " + stats.hitRate());
System.out.println("Miss rate: " + stats.missRate());
System.out.println("Load count: " + stats.loadCount());
System.out.println("Eviction count: " + stats.evictionCount());
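
In the snippet above, one lookup hits (key1) and one misses (key2), so the hit rate would be expected to come out at 0.5, with load and eviction counts of 0. The sketch below shows how such rates are derived from two counters. It is a plain-JDK illustration with the hypothetical name HitRateSketch, not Caffeine's CacheStats class.

```java
// Hypothetical illustration of how hit and miss rates follow from counters.
final class HitRateSketch {
    private long hits;
    private long misses;

    void recordHit() {
        hits++;
    }

    void recordMiss() {
        misses++;
    }

    long requestCount() {
        return hits + misses;
    }

    // Fraction of requests served from the cache. Reporting 1.0 for an
    // unused cache is a convention chosen for this sketch.
    double hitRate() {
        long requests = requestCount();
        return requests == 0 ? 1.0 : (double) hits / requests;
    }

    // Fraction of requests that were not found in the cache.
    double missRate() {
        long requests = requestCount();
        return requests == 0 ? 0.0 : (double) misses / requests;
    }
}
```

A hit rate that stays low under real traffic usually means the cache is too small or the entries expire before they are reused.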

Removal Listeners

Get notified when entries are removed, whether by eviction, expiration, or explicit invalidation:

Cache<String, String> cache = Caffeine.newBuilder()
    .maximumSize(100)
    .removalListener((key, value, cause) -> {
        System.out.printf("Removed key=%s, value=%s, cause=%s%n", 
                         key, value, cause);
    })
    .build();
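
To show the callback pattern in isolation, here is a minimal size-bounded map built on the JDK's LinkedHashMap that reports each eviction to a listener. It uses plain LRU ordering, which is a simplification and not Caffeine's policy, and it calls the listener synchronously; EvictionListenerSketch is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical illustration: a size-bounded LRU map that reports evictions.
final class EvictionListenerSketch<K, V> extends LinkedHashMap<K, V> {
    private final int maximumSize;
    private final BiConsumer<K, V> onEvict;

    EvictionListenerSketch(int maximumSize, BiConsumer<K, V> onEvict) {
        super(16, 0.75f, true); // true = access order, so reads refresh recency
        this.maximumSize = maximumSize;
        this.onEvict = onEvict;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        if (size() > maximumSize) {
            // Notify the listener before the least recently used entry is dropped.
            onEvict.accept(eldest.getKey(), eldest.getValue());
            return true;
        }
        return false;
    }
}
```

With a maximum size of 2, putting a and b, reading a, and then putting c evicts b, because b is the least recently used entry at that point.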

Best Practices

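  1. Choose an Appropriate Expiration Policy: Use expireAfterWrite for data that becomes stale a fixed time after it was loaded, and expireAfterAccess for data that only needs to stay cached while it is being used, such as sessions.
  2. Set a Maximum Size: Bound the cache with maximumSize or maximumWeight so it cannot keep growing until the JVM runs out of memory.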
  3. Enable Statistics in Development: Use .recordStats() to monitor cache effectiveness during development.
  4. Use Async Loading for I/O Operations: When loading involves network calls or disk I/O, use AsyncLoadingCache.
  5. Implement Proper Key Design: Ensure cache keys are immutable and implement proper equals() and hashCode().
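  6. Handle Null Values Carefully: Caffeine does not store null values. If a loader returns null, nothing is cached and the caller receives null, so the load is repeated on every request. To remember that a key has no value, cache an Optional or a sentinel object, or throw an exception from the loader.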
  7. Monitor in Production: Use cache statistics to tune cache size and eviction policies.
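
One way to handle missing values is to cache the absence itself. The sketch below wraps lookup results in Optional so that "not found" is remembered too. It uses the JDK's ConcurrentHashMap, which also declines to store null, as a stand-in for a real cache; NegativeCachingSketch and the lookup function are hypothetical.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical illustration of negative caching with Optional.
final class NegativeCachingSketch {
    private final Map<String, Optional<String>> cache = new ConcurrentHashMap<>();
    private final Function<String, String> lookup; // may return null for "not found"
    private final AtomicInteger lookups = new AtomicInteger();

    NegativeCachingSketch(Function<String, String> lookup) {
        this.lookup = lookup;
    }

    Optional<String> find(String key) {
        // Optional.empty() is a real, storable value, so a missing key is
        // looked up once and then answered from the cache.
        return cache.computeIfAbsent(key, k -> {
            lookups.incrementAndGet();
            return Optional.ofNullable(lookup.apply(k));
        });
    }

    // Number of times the underlying lookup actually ran.
    int lookupCount() {
        return lookups.get();
    }
}
```

Cached absences should usually expire sooner than cached values, so that a record created later becomes visible without a long delay.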

Conclusion

Caffeine is a powerful, production-ready caching library that should be your go-to choice for in-memory caching in Java applications. Its high performance, rich feature set, and excellent Spring integration make it ideal for modern applications.

Whether you’re building a microservice, a monolithic application, or a high-throughput system, Caffeine provides the caching capabilities you need with minimal overhead.

📌 Note: Caffeine is a local in-memory cache (single JVM). Cache data is not shared across multiple server instances. For distributed caching needs, use Redis, Hazelcast, or similar solutions.
