A Brief Guide to Learning Java Concurrency

Java concurrency is one of the most challenging topics for beginners. After reading many books and blogs, and using it in a production environment, I’ve realized that having a clear learning roadmap for Java concurrency can save a lot of time. In this blog, I’ll highlight the key points of Java concurrency without diving too much into the source code or overwhelming details.

Create a thread

For demonstration purposes, let's look at two ways of creating a thread in Java.
The first method is using the constructor of the Thread class directly and implementing the Runnable interface:
public static void main(String[] args) {
    new Thread(() -> {
        System.out.println(Thread.currentThread().getName());
    }, "T1").start();
}
If a thread needs to return a value or throw an exception, we can implement the Callable interface and use it with FutureTask. This allows us to retrieve the result or handle exceptions via the FutureTask. This is an asynchronous processing approach. For more flexible methods of handling asynchronous tasks, please refer to: CompletableFuture.
public static void main(String[] args) throws Exception {
    FutureTask<String> futureTask = new FutureTask<>(() -> Thread.currentThread().getName());
    new Thread(futureTask, "T2").start();
    System.out.println(futureTask.get());
}
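To give a taste of the more flexible approach mentioned above, here is a minimal CompletableFuture sketch (the class name and the values are mine, chosen only for illustration): supplyAsync runs a task on the common ForkJoinPool and thenApply chains a transformation onto its result.

```java
import java.util.concurrent.CompletableFuture;

public class CompletableFutureDemo {
    public static void main(String[] args) {
        // supplyAsync schedules the supplier on the common ForkJoinPool;
        // thenApply transforms the result once it is available.
        CompletableFuture<Integer> future = CompletableFuture
                .supplyAsync(() -> "42")
                .thenApply(Integer::parseInt);
        System.out.println(future.join()); // blocks until the pipeline completes
    }
}
```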

Volatile and JMM

The volatile keyword is a crucial stepping stone in understanding Java concurrency. You'll often see it in the java.util.concurrent package. Its primary function is to ensure memory visibility. But what exactly is memory visibility? To grasp this, we first need to understand the Java Memory Model (JMM). JMM isn’t a physical entity but a concept that helps Java developers understand how memory works in concurrency (and it's different from the JVM).
When two threads access a variable, each typically works with its own local memory. As a result, if thread A modifies a variable, that change might not immediately be visible to thread B, since thread A’s change isn’t flushed to main memory. The volatile keyword solves this problem by forcing the threads to access the variable's value directly from main memory, ensuring that updates are visible across threads.
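A minimal sketch of this visibility guarantee (the class name VolatileFlag is mine): a worker thread spins on a flag that the main thread later sets. Because the flag is volatile, the worker is guaranteed to see the update; without volatile, the worker could in principle spin forever on a stale cached value.

```java
public class VolatileFlag {
    // volatile guarantees that a write by one thread is visible to others
    private static volatile boolean done = false;

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (!done) {
                // spin until the main thread's write becomes visible
            }
            System.out.println("flag seen");
        });
        worker.start();
        Thread.sleep(100);
        done = true;   // visible to the worker because the field is volatile
        worker.join();
    }
}
```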
https://jenkov.com/tutorials/java-concurrency/java-memory-model.html

Race conditions

Race conditions are a common issue in concurrent programming. For example, imagine we want to implement a function that increments an int variable by 1. Now suppose two threads perform the same task. Thread A reads the value as 1 and increments it; at the same time, before thread A's change is flushed to main memory, thread B also reads the value as 1 and increments it. The result is that the value may be incremented only once. This example shows why volatile alone cannot guarantee atomicity: it ensures visibility but does not prevent race conditions.
To solve this, we have to prevent multiple threads from accessing the same resource at the same time. In Java, we can do this with the synchronized keyword or a Lock.
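The lost-update problem is easy to reproduce. In this sketch (class name RaceDemo is mine), two threads each increment a plain counter and a synchronized counter 100,000 times; the unguarded counter often ends up below 200,000, while the guarded one never does.

```java
public class RaceDemo {
    private int unsafeCount = 0;
    private int safeCount = 0;

    // the object's monitor makes the read-modify-write atomic
    private synchronized void safeIncrement() {
        safeCount++;
    }

    public static void main(String[] args) throws InterruptedException {
        RaceDemo demo = new RaceDemo();
        Runnable task = () -> {
            for (int i = 0; i < 100_000; i++) {
                demo.unsafeCount++;   // read-modify-write, not atomic: updates can be lost
                demo.safeIncrement(); // guarded, so no update is lost
            }
        };
        Thread a = new Thread(task, "A");
        Thread b = new Thread(task, "B");
        a.start(); b.start();
        a.join(); b.join();
        // unsafeCount may be anything up to 200000; safeCount is always 200000
        System.out.println("unsafe=" + demo.unsafeCount + " safe=" + demo.safeCount);
    }
}
```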

Synchronized

synchronized is built on Java objects (specifically the object header), so it can be applied in four contexts:
1. Instance method — locks the instance object:
public class Counter {
    private int value = 0;
    public synchronized void addOne() {
        this.value += 1;
    }
}
2. Static method — locks the .class object:
public class Counter {
    private static int count = 0;
    public static synchronized void addOne() {
        count += 1;
    }
}
3. Code block locking the instance — locks the instance object:
public class Counter {
    private int value = 0;
    public void addOne() {
        synchronized (this) {
            this.value += 1;
        }
    }
}
4. Code block locking the class — locks the .class object:
public class Counter {
    private static int value = 0;
    public void addOne() {
        synchronized (Counter.class) {
            value += 1;
        }
    }
}

Lock

We can also explicitly use the Lock interface to manage the timing of lock and unlock operations when dealing with race conditions. The Lock interface provides more flexibility than the synchronized keyword. One common implementation of this interface is the ReentrantLock class.
public class ReentrantLockLab implements Runnable {
    public static Lock lock = new ReentrantLock();
    public static int value = 0;

    @Override
    public void run() {
        lock.lock();
        try {
            value += 1;
        } finally {
            lock.unlock();
        }
    }
}
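One example of that extra flexibility is tryLock(), which returns immediately instead of blocking like lock(). In this sketch (class name TryLockDemo is mine), the main thread holds the lock, so the second thread's attempt fails and it can do something else instead of waiting:

```java
import java.util.concurrent.locks.ReentrantLock;

public class TryLockDemo {
    public static void main(String[] args) throws InterruptedException {
        ReentrantLock lock = new ReentrantLock();
        lock.lock(); // the main thread holds the lock
        Thread t = new Thread(() -> {
            // tryLock() returns false immediately when the lock is held elsewhere
            if (lock.tryLock()) {
                try {
                    System.out.println("acquired");
                } finally {
                    lock.unlock();
                }
            } else {
                System.out.println("busy, doing something else");
            }
        });
        t.start();
        t.join();
        lock.unlock();
    }
}
```

There is also a timed variant, tryLock(long, TimeUnit), which waits up to a deadline before giving up.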

More about synchronized

synchronized is optimized through biased locking. The general idea is that acquiring a lock can be expensive, but if a lock is always held by the same thread, it doesn’t need to incur the full cost of locking each time. Here's how it works: if a lock is consistently held by a single thread, it becomes a biased lock. This allows the thread to continue acquiring the lock without using Compare-And-Swap (CAS) operations. However, if another thread tries to acquire this lock and finds that it hasn’t been released and is biased towards another thread, the lock upgrades to a lightweight lock.
In the case of a lightweight lock, if another thread attempts to acquire it, the thread is not blocked but will keep trying to obtain the lock. If too many threads are competing for the lightweight lock, the lock is then upgraded to a heavyweight lock (or a mutex lock). Threads attempting to acquire a heavyweight lock are blocked until the lock is released.
This optimization was introduced in JDK 6, but biased locking was disabled by default and deprecated in JDK 15. For more details, see: JEP 374.

More about ReentrantLock

ReentrantLock can be either a fair or an unfair lock, depending on the constructor used. It is implemented with the help of AbstractQueuedSynchronizer (AQS), which provides a framework for building concurrent utility classes. For example, both Semaphore and CountDownLatch are also designed using AQS.
AQS supports Condition, which allows threads to communicate with each other. The await() method suspends a thread, while signal() wakes up a waiting thread, providing more flexible control over thread coordination. It's important to note that AQS can be implemented with either exclusive (mutex) locks or shared locks, but only mutex locks support Condition. Since ReentrantLock is a mutex lock, it supports Condition. However, Semaphore and CountDownLatch, which use shared locks, do not support Condition.
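A minimal await()/signal() sketch (class name ConditionDemo is mine): one thread waits on a condition until another thread flips a flag and signals it. Note that await() is always called inside a loop, which guards against spurious wakeups.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class ConditionDemo {
    private static final Lock lock = new ReentrantLock();
    private static final Condition ready = lock.newCondition();
    private static boolean prepared = false;

    public static void main(String[] args) throws InterruptedException {
        Thread waiter = new Thread(() -> {
            lock.lock();
            try {
                while (!prepared) { // loop guards against spurious wakeups
                    ready.await();  // releases the lock while parked
                }
                System.out.println("condition met");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
            }
        });
        waiter.start();
        Thread.sleep(100); // give the waiter time to park (demo only)
        lock.lock();
        try {
            prepared = true;
            ready.signal(); // wakes one waiting thread
        } finally {
            lock.unlock();
        }
        waiter.join();
    }
}
```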

Atomic class

Going back to the race condition example, what if we don’t want to use synchronized blocks or locks? Java provides atomic classes in the java.util.concurrent.atomic package for atomic operations. These classes are implemented using native methods from the Unsafe class. The key idea is to leverage the Compare-And-Swap (CAS) instructions provided by the CPU, allowing for atomic operations without the need for locks.
public class MyThread implements Runnable {
    private static final AtomicInteger atomicInteger = new AtomicInteger();

    @Override
    public void run() {
        for (int i = 0; i < 10; i++) {
            atomicInteger.incrementAndGet(); // atomic read-modify-write via CAS
        }
    }
}
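The CAS semantics are visible directly through compareAndSet, which succeeds only when the current value matches the expected value. A small sketch (class name CasDemo is mine):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasDemo {
    public static void main(String[] args) {
        AtomicInteger value = new AtomicInteger(1);
        // succeeds: the current value (1) matches the expected value
        System.out.println(value.compareAndSet(1, 2)); // true, value is now 2
        // fails: the current value is 2, not the expected 1
        System.out.println(value.compareAndSet(1, 3)); // false, value is still 2
        System.out.println(value.get());
    }
}
```

Methods like incrementAndGet retry this CAS in a loop until it succeeds, which is how they stay atomic without a lock.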

ThreadLocal

When writing code, we sometimes encounter situations where each thread needs its own instance of an object — for example, a non-thread-safe class like SimpleDateFormat. In such cases, we can use ThreadLocal. You can think of a ThreadLocal as a key-value map, where the key is the current thread and the value is the instance we put in.
private static final ThreadLocal<DateFormat> THREAD_LOCAL =
        ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"));

public static void main(String[] args) throws ParseException {
    Date date = THREAD_LOCAL.get().parse("2024-10-01 23:01:00");
    System.out.println(date);
}
ThreadLocal itself doesn't store any values; instead, the object is stored within each thread. Before a thread terminates, it's important to remove the object from ThreadLocal to prevent other threads from reading stale values and to avoid potential memory leaks. For more details, see: threadlocal-memory-leak.
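A common pattern for this cleanup is a try/finally around each unit of work, so the per-thread state is cleared before a pooled thread is reused. A minimal sketch (the class and method names are mine):

```java
public class ThreadLocalCleanup {
    private static final ThreadLocal<StringBuilder> BUFFER =
            ThreadLocal.withInitial(StringBuilder::new);

    static String handleRequest(String input) {
        try {
            return BUFFER.get().append(input).toString();
        } finally {
            // in a thread pool the same thread serves many requests,
            // so clear the per-thread state before the thread is reused
            BUFFER.remove();
        }
    }

    public static void main(String[] args) {
        System.out.println(handleRequest("a"));
        System.out.println(handleRequest("b")); // fresh buffer: "b", not "ab"
    }
}
```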

Life Cycle of a Thread

https://www.geeksforgeeks.org/lifecycle-and-states-of-a-thread-in-java/

ThreadPool

In most production environments, we typically don't create new threads manually. Instead, we use a ThreadPool to manage and optimize thread usage efficiently. Java provides several static methods in the Executors class for creating thread pools, but all of them internally use the constructor of ThreadPoolExecutor. Therefore, the key to understanding thread pools lies in understanding ThreadPoolExecutor. It's highly recommended to use the raw constructor to create a thread pool for better control and customization.
/**
 * Creates a new {@code ThreadPoolExecutor} with the given initial parameters.
 *
 * @param corePoolSize the number of threads to keep in the pool, even
 *        if they are idle, unless {@code allowCoreThreadTimeOut} is set
 * @param maximumPoolSize the maximum number of threads to allow in the pool
 * @param keepAliveTime when the number of threads is greater than the core,
 *        this is the maximum time that excess idle threads will wait for
 *        new tasks before terminating.
 * @param unit the time unit for the {@code keepAliveTime} argument
 * @param workQueue the queue to use for holding tasks before they are
 *        executed. This queue will hold only the {@code Runnable} tasks
 *        submitted by the {@code execute} method.
 * @param threadFactory the factory to use when the executor creates a new thread
 * @param handler the handler to use when execution is blocked because the
 *        thread bounds and queue capacities are reached
 * @throws IllegalArgumentException if one of the following holds:<br>
 *         {@code corePoolSize < 0}<br>
 *         {@code keepAliveTime < 0}<br>
 *         {@code maximumPoolSize <= 0}<br>
 *         {@code maximumPoolSize < corePoolSize}
 * @throws NullPointerException if {@code workQueue}
 *         or {@code threadFactory} or {@code handler} is null
 */
public ThreadPoolExecutor(int corePoolSize,
                          int maximumPoolSize,
                          long keepAliveTime,
                          TimeUnit unit,
                          BlockingQueue<Runnable> workQueue,
                          ThreadFactory threadFactory,
                          RejectedExecutionHandler handler) {...}
An explanation of the ThreadPoolExecutor parameters:
TimeUnit is an enum class that provides different time units.
BlockingQueue typically has two common implementations:
  • LinkedBlockingQueue: a First-In-First-Out (FIFO) queue with a default maximum length of Integer.MAX_VALUE.
  • SynchronousQueue: a queue that transfers tasks directly between threads without holding them, used in Executors.newCachedThreadPool.
ThreadFactory: use Executors.defaultThreadFactory() to create one, or supply a custom factory to control thread names and daemon status.
RejectedExecutionHandler : When the queue reaches its limit and the thread pool has hit its maximum number of threads, you can specify a policy to handle new tasks:
  • ThreadPoolExecutor.AbortPolicy: The default policy. It throws a RejectedExecutionException when a task is rejected. This is useful for projects where task rejection needs to be flagged immediately.
  • ThreadPoolExecutor.DiscardPolicy: Silently discards the task without throwing an exception.
  • ThreadPoolExecutor.DiscardOldestPolicy: Drops the task at the head of the queue and retries executing the new task. If the retry fails, this process repeats.
  • ThreadPoolExecutor.CallerRunsPolicy: The thread that submits the task executes it. This provides feedback control, slowing the rate of new task submissions.
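Putting the parameters together, here is a small sketch using the raw constructor (the sizes and class name PoolDemo are mine, chosen only for illustration): a pool of 2 threads with a bounded queue of 2, where CallerRunsPolicy absorbs the overflow, so all 20 tasks complete and none is rejected.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PoolDemo {
    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2,                         // core and maximum pool size
                0L, TimeUnit.SECONDS,         // keep-alive for excess threads
                new LinkedBlockingQueue<>(2), // bounded work queue
                Executors.defaultThreadFactory(),
                new ThreadPoolExecutor.CallerRunsPolicy()); // overflow runs on the submitter

        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < 20; i++) {
            pool.execute(completed::incrementAndGet);
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println(completed.get()); // all 20 tasks ran, none rejected
    }
}
```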
How should we estimate the size of the thread pool, and what is the best choice for the size of the blocking queue? Java Concurrency in Practice gives us a formula:
pool size = number of cores * percentage of target utilization * (1 + wait time / compute time);
In real-world projects, determining the optimal wait time and compute time for a thread pool can be challenging. Some strategies recommend setting the pool size based on CPU core count or TPS (Transactions Per Second), but there's no one-size-fits-all solution. In production, monitoring performance is crucial, and thorough testing before release is essential. Whenever possible, store thread pool configurations in a dynamically adjustable file.
As a starting point, set the core pool size slightly above the task capacity. If there is only one thread pool in the service:
  • For compute-intensive tasks, the core size should be close to the number of CPU cores, with the maximum size near the core size.
  • For IO-intensive tasks, the core size can be much larger than the CPU core count, and the max pool size can be 2 to 3 times the core size.
If a service has multiple thread pools, you’ll need to balance and allocate CPU resources across the pools efficiently.
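As a worked example of the formula above, with hypothetical numbers (8 cores, 100% target utilization, 50 ms of IO wait per 10 ms of computation), the estimate comes out to 48 threads:

```java
public class PoolSizeEstimate {
    public static void main(String[] args) {
        // hypothetical inputs; in practice, cores would come from
        // Runtime.getRuntime().availableProcessors() and the times from measurement
        int cores = 8;
        double utilization = 1.0;  // fraction of CPU we aim to use
        double waitTime = 50.0;    // ms spent blocked (IO) per task
        double computeTime = 10.0; // ms spent on CPU per task
        int poolSize = (int) (cores * utilization * (1 + waitTime / computeTime));
        System.out.println(poolSize); // 8 * 1.0 * (1 + 5) = 48
    }
}
```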

Final thoughts

Having a basic understanding of Java concurrency is sufficient for building small projects. While repeatedly reading Java Concurrency in Practice can help us proactively avoid concurrency issues, effective testing and monitoring are often more impactful. Additionally, with many services being distributed today, it’s important to manage threads across machines or pods in the cloud. Tools like Redis or others for implementing distributed locks are worth exploring for further learning. Good luck!

Reference

  1. https://jenkov.com/tutorials/java-concurrency/java-memory-model.html
  2. https://openjdk.org/jeps/374
  3. https://stackoverflow.com/questions/17968803/threadlocal-memory-leak
  4. https://tech.meituan.com/2020/04/02/java-pooling-pratice-in-meituan.html
  5. Java Concurrency in Practice