Concurrent Collections

Thread-safe collections for concurrent applications.

Synchronized vs Concurrent Collections

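Synchronized collections guard every operation with a single lock, while concurrent collections use finer-grained locking or lock-free techniques. Understanding the difference is crucial for performance!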

Visual: Lock Granularity

Java Concurrent Collections

ConcurrentHashMap

ConcurrentHashMap provides thread-safe hash map operations with high concurrency.

How It Works

Java 7: Segment locking (16 segments by default)
Java 8+: CAS operations + synchronized blocks for individual buckets
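
The segment-locking idea from Java 7 can be sketched in a few lines of Python. This only illustrates lock striping; it is not how ConcurrentHashMap is implemented. The class name `StripedDict` and its methods are hypothetical, and the default of 16 stripes simply mirrors the segment count mentioned above.

```python
import threading


class StripedDict:
    """Illustrative lock-striped map: each key hashes to one of N stripes,
    and every stripe has its own lock and its own dict."""

    def __init__(self, stripes=16):
        self._locks = [threading.Lock() for _ in range(stripes)]
        self._buckets = [{} for _ in range(stripes)]

    def _index(self, key):
        # Python's % always returns a non-negative result for a positive divisor.
        return hash(key) % len(self._locks)

    def put(self, key, value):
        i = self._index(key)
        with self._locks[i]:          # only this stripe is locked
            self._buckets[i][key] = value

    def get(self, key, default=None):
        i = self._index(key)
        with self._locks[i]:
            return self._buckets[i].get(key, default)

    def put_if_absent(self, key, value):
        """Return the existing value, or None if `value` was stored."""
        i = self._index(key)
        with self._locks[i]:          # check and insert under one lock
            if key in self._buckets[i]:
                return self._buckets[i][key]
            self._buckets[i][key] = value
            return None

    def __len__(self):
        # A whole-map view needs every stripe; lock in index order to avoid deadlock.
        for lock in self._locks:
            lock.acquire()
        try:
            return sum(len(bucket) for bucket in self._buckets)
        finally:
            for lock in self._locks:
                lock.release()


m = StripedDict()
m.put("a", 1)
print(m.put_if_absent("a", 99))  # existing value is returned, nothing is overwritten
print(len(m))
```

Two threads that touch keys in different stripes never wait on the same lock, which is the property segment locking was after. Under CPython's GIL this brings no real speedup, so treat it as a model of the locking scheme rather than an optimization.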

Visual: ConcurrentHashMap Architecture

import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class ConcurrentHashMapCache<K, V> {
    private final ConcurrentHashMap<K, V> cache = new ConcurrentHashMap<>();

    public V get(K key) {
        return cache.get(key);  // Thread-safe read
    }

    public void put(K key, V value) {
        cache.put(key, value);  // Thread-safe write
    }

    // Atomic operations

    // Returns the previous value, or null if the key was absent
    public V putIfAbsent(K key, V value) {
        return cache.putIfAbsent(key, value);  // Atomic!
    }

    // Removes the entry only if it is currently mapped to the given value
    public boolean remove(K key, V value) {
        return cache.remove(key, value);  // Atomic!
    }

    // The mapping function runs at most once per absent key
    public V computeIfAbsent(K key, Function<? super K, ? extends V> mappingFunction) {
        return cache.computeIfAbsent(key, mappingFunction);  // Atomic!
    }
}

package main

import (
  "fmt"
  "sync"
)

// sync.Map is Go's equivalent of ConcurrentHashMap
type Cache[K comparable, V any] struct {
  m sync.Map
}

func (c *Cache[K, V]) Get(key K) (V, bool) {
  val, ok := c.m.Load(key)
  if !ok {
    var zero V
    return zero, false
  }
  return val.(V), true
}

func (c *Cache[K, V]) Put(key K, value V) {
  c.m.Store(key, value)
}

// LoadOrStore is sync.Map's atomic putIfAbsent equivalent
func (c *Cache[K, V]) PutIfAbsent(key K, value V) (V, bool) {
  actual, loaded := c.m.LoadOrStore(key, value)
  return actual.(V), loaded
}

func (c *Cache[K, V]) Delete(key K) {
  c.m.Delete(key)
}

func main() {
  cache := &Cache[string, int]{}
  cache.Put("a", 1)
  cache.Put("b", 2)

  v, ok := cache.Get("a")
  fmt.Printf("Get a: %d, found: %v\n", v, ok)

  existing, loaded := cache.PutIfAbsent("a", 99)
  fmt.Printf("PutIfAbsent a=99: existing=%d, alreadyLoaded=%v\n", existing, loaded)
}

CopyOnWriteArrayList

CopyOnWriteArrayList copies its underlying array on every write, so reads and iteration never need a lock and always see a consistent snapshot.
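
The copy-on-write idea itself is small enough to sketch in Python. The class below is a hypothetical illustration, not a port of the Java class: writers build a new immutable tuple under a lock, and readers use whichever tuple they picked up. It assumes that replacing an attribute reference happens in one step, which holds in CPython.

```python
import threading


class CopyOnWriteList:
    """Illustrative copy-on-write list: writers publish a new snapshot,
    readers keep using the snapshot they already hold."""

    def __init__(self, items=()):
        self._snapshot = tuple(items)
        self._write_lock = threading.Lock()  # serializes writers only

    def add(self, item):
        with self._write_lock:
            # Build a new tuple; the old one stays untouched for current readers.
            self._snapshot = self._snapshot + (item,)

    def remove(self, item):
        with self._write_lock:
            items = list(self._snapshot)
            items.remove(item)  # raises ValueError if absent, like list.remove
            self._snapshot = tuple(items)

    def __iter__(self):
        # Iterates over the snapshot that was current when iteration began.
        return iter(self._snapshot)

    def __len__(self):
        return len(self._snapshot)


numbers = CopyOnWriteList([1, 2, 3])
reader = iter(numbers)   # reader picks up the current snapshot
numbers.add(4)           # writer publishes a new snapshot
print(list(reader))      # [1, 2, 3]: the reader's view did not change
print(list(numbers))     # [1, 2, 3, 4]
```

Every write costs a full copy, which is why the structure only pays off when reads dominate.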

Visual: Copy-On-Write

Example: CopyOnWriteArrayList

import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class CopyOnWriteExample {
    public static void main(String[] args) {
        List<String> list = new CopyOnWriteArrayList<>();

        // Multiple readers (no locking needed!)
        for (int i = 0; i < 10; i++) {
            final int readerId = i;
            new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    list.size();  // Lock-free read
                }
                System.out.println("Reader " + readerId + " finished");
            }).start();
        }

        // Occasional writer
        new Thread(() -> {
            for (int i = 0; i < 10; i++) {
                list.add("Item " + i);  // Creates copy
                try {
                    Thread.sleep(100);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }).start();
    }
}

package main

import (
  "fmt"
  "sync"
  "time"
)

// A Go approximation of CopyOnWriteArrayList: protect a slice with RWMutex.
// Reads use RLock (concurrent with each other, but not lock-free),
// writes use Lock (exclusive) and replace the slice with a fresh copy.
type CopyOnWriteSlice[T any] struct {
  mu   sync.RWMutex
  data []T
}

func (c *CopyOnWriteSlice[T]) Add(item T) {
  c.mu.Lock()
  defer c.mu.Unlock()
  // Copy-on-write: make a new slice
  newData := make([]T, len(c.data)+1)
  copy(newData, c.data)
  newData[len(c.data)] = item
  c.data = newData
}

func (c *CopyOnWriteSlice[T]) Len() int {
  c.mu.RLock()
  defer c.mu.RUnlock()
  return len(c.data) // shared read lock: readers do not block each other
}

func main() {
  list := &CopyOnWriteSlice[string]{}

  var wg sync.WaitGroup
  // 10 concurrent readers
  for i := 0; i < 10; i++ {
    wg.Add(1)
    readerID := i
    go func() {
      defer wg.Done()
      for j := 0; j < 1000; j++ {
        list.Len()
      }
      fmt.Printf("Reader %d finished\n", readerID)
    }()
  }

  // Occasional writer
  wg.Add(1)
  go func() {
    defer wg.Done()
    for i := 0; i < 10; i++ {
      list.Add(fmt.Sprintf("Item %d", i))
      time.Sleep(100 * time.Millisecond)
    }
  }()

  wg.Wait()
}

BlockingQueue Implementations

We covered these in Producer-Consumer, but here’s a quick reference:

Queue Type	Characteristics	Use Case
`ArrayBlockingQueue`	Array-backed, bounded	Fixed-size queues
`LinkedBlockingQueue`	Node-based, optionally bounded	Better throughput
`PriorityBlockingQueue`	Priority ordering	Priority-based processing
`DelayQueue`	Time-based scheduling	Scheduled tasks
`SynchronousQueue`	Zero capacity	Direct handoff
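
For readers coming from Python, two rows of the table have rough standard-library analogues: a `queue.Queue` created with `maxsize` behaves like a bounded blocking queue, and `queue.PriorityQueue` hands items out in priority order. The sketch below only illustrates those two behaviors. The helper names are hypothetical, and it is not a mapping of the Java API.

```python
import queue


def fill_until_full(capacity, items):
    """Offer items to a bounded queue without blocking; return (accepted, rejected)."""
    q = queue.Queue(maxsize=capacity)
    accepted, rejected = [], []
    for item in items:
        try:
            q.put_nowait(item)       # raises queue.Full instead of blocking
            accepted.append(item)
        except queue.Full:
            rejected.append(item)
    return accepted, rejected


def drain_by_priority(entries):
    """Push (priority, task) pairs and pop them lowest priority number first."""
    pq = queue.PriorityQueue()
    for entry in entries:
        pq.put(entry)
    ordered = []
    while not pq.empty():            # reliable here because only one thread is involved
        ordered.append(pq.get())
    return ordered


print(fill_until_full(2, ["a", "b", "c"]))
# (['a', 'b'], ['c'])
print(drain_by_priority([(3, "low"), (1, "urgent"), (2, "normal")]))
# [(1, 'urgent'), (2, 'normal'), (3, 'low')]
```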

Python Concurrent Collections

Thread-Safety of Built-in Types

In CPython, the GIL makes many single operations on built-in types effectively atomic, but compound operations such as read-modify-write are still not thread-safe.
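
One way to see why a compound operation is not atomic is to look at the bytecode CPython generates for it. The sketch below uses the standard `dis` module. The function name `increment` is only for illustration, and the exact instruction names in between vary across Python versions.

```python
import dis

counter = 0


def increment():
    global counter
    counter += 1


# Collect the names of the instructions that make up the function body.
opnames = [ins.opname for ins in dis.get_instructions(increment)]
print(opnames)

# The read and the write of `counter` are separate instructions, so the
# interpreter is not obliged to run them as one indivisible step.
load_at = opnames.index("LOAD_GLOBAL")
store_at = opnames.index("STORE_GLOBAL")
print(f"load at {load_at}, store at {store_at}")
```

How often a thread switch actually lands between the load and the store depends on the interpreter version, which is why such races can stay hidden for a long time.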

Visual: Python Thread-Safety

Example: Thread-Safe vs Unsafe Operations

import threading

# ❌ NOT thread-safe: Compound operation
counter = 0

def unsafe_increment():
    global counter
    counter += 1  # NOT atomic: read-modify-write

threads = [threading.Thread(target=unsafe_increment) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"Unsafe result: {counter}")  # May not be 10!

# ✅ Thread-safe: Single atomic operation
d = {}
def safe_operation():
    d['key'] = 'value'  # Atomic operation

threads = [threading.Thread(target=safe_operation) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"Safe result: {len(d)}")  # Always 1

# ✅ Thread-safe: Using locks
counter_safe = 0
lock = threading.Lock()

def safe_increment():
    global counter_safe
    with lock:
        counter_safe += 1

threads = [threading.Thread(target=safe_increment) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"Safe with lock: {counter_safe}")  # Always 10

package main

import (
  "fmt"
  "sync"
  "sync/atomic"
)

func main() {
  // ❌ NOT safe: race condition on plain int
  var unsafeCounter int
  var wg sync.WaitGroup
  for i := 0; i < 10; i++ {
    wg.Add(1)
    go func() {
      defer wg.Done()
      unsafeCounter++ // data race!
    }()
  }
  wg.Wait()
  fmt.Printf("Unsafe result: %d (may not be 10)\n", unsafeCounter)

  // ✅ Safe: atomic operation
  var atomicCounter int64
  for i := 0; i < 10; i++ {
    wg.Add(1)
    go func() {
      defer wg.Done()
      atomic.AddInt64(&atomicCounter, 1) // atomic increment
    }()
  }
  wg.Wait()
  fmt.Printf("Atomic result: %d\n", atomic.LoadInt64(&atomicCounter)) // Always 10

  // ✅ Safe: mutex-protected counter
  var safeCounter int
  var mu sync.Mutex
  for i := 0; i < 10; i++ {
    wg.Add(1)
    go func() {
      defer wg.Done()
      mu.Lock()
      safeCounter++
      mu.Unlock()
    }()
  }
  wg.Wait()
  fmt.Printf("Safe with mutex: %d\n", safeCounter) // Always 10
}

queue Module

Python’s queue module provides thread-safe queue implementations.

import queue
import threading

# FIFO Queue
fifo_queue = queue.Queue(maxsize=10)

# LIFO Queue (Stack)
lifo_queue = queue.LifoQueue(maxsize=10)

# Priority Queue
priority_queue = queue.PriorityQueue(maxsize=10)

def producer(q):
    for i in range(5):
        q.put(i)
        print(f"Produced: {i}")

def consumer(q):
    while True:
        try:
            item = q.get(timeout=1)
            print(f"Consumed: {item}")
            q.task_done()
        except queue.Empty:
            break

# Thread-safe operations
threading.Thread(target=producer, args=(fifo_queue,)).start()
threading.Thread(target=consumer, args=(fifo_queue,)).start()

package main

import "fmt"

func main() {
  // FIFO Queue: buffered channel
  fifoQueue := make(chan int, 10)

  // Producer
  go func() {
    for i := 0; i < 5; i++ {
      fifoQueue <- i
      fmt.Printf("Produced: %d\n", i)
    }
    close(fifoQueue)
  }()

  // Consumer
  for item := range fifoQueue {
    fmt.Printf("Consumed: %d\n", item)
  }

  // Priority queue: use a heap with a mutex (stdlib container/heap)
  // LIFO (stack): use a slice with a mutex
}
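
The Python consumer in this section stops after a one-second timeout, which can end it early if the producer is merely slow. A common alternative is to send a sentinel value that tells the consumer to exit. The sketch below assumes a single producer and a single consumer; `run_pipeline` and `_STOP` are hypothetical names.

```python
import queue
import threading

_STOP = object()  # unique sentinel; never equal to a real work item


def run_pipeline(items, maxsize=10):
    """Send items through a bounded queue and return what the consumer saw."""
    q = queue.Queue(maxsize=maxsize)
    consumed = []

    def producer():
        for item in items:
            q.put(item)          # blocks while the queue is full
        q.put(_STOP)             # signal that no more items are coming

    def consumer():
        while True:
            item = q.get()       # blocks until something is available
            try:
                if item is _STOP:
                    return
                consumed.append(item)
            finally:
                q.task_done()    # runs for the sentinel too

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    q.join()  # returns at once here: every put has a matching task_done
    return consumed


print(run_pipeline(range(5)))  # [0, 1, 2, 3, 4]
```

With several consumers, one sentinel per consumer would be needed, since each sentinel is taken off the queue by exactly one thread.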

multiprocessing.Manager

For shared state across processes (not threads):

import multiprocessing

def worker(shared_dict, shared_list, lock):
    # The read-modify-write spans several proxy calls, so guard it with a shared lock
    with lock:
        shared_dict['count'] = shared_dict.get('count', 0) + 1
        shared_list.append(shared_dict['count'])

if __name__ == '__main__':
    manager = multiprocessing.Manager()
    shared_dict = manager.dict()
    shared_list = manager.list()
    lock = manager.Lock()

    processes = []
    for _ in range(5):
        p = multiprocessing.Process(target=worker, args=(shared_dict, shared_list, lock))
        processes.append(p)
        p.start()

    for p in processes:
        p.join()

    print(f"Dict: {shared_dict}")
    print(f"List: {shared_list}")

package main

import (
  "fmt"
  "os"
  "strconv"
  "sync"
  "sync/atomic"
)

// For sharing state across goroutines (same process), use sync primitives.
// For sharing state across OS processes, use files, pipes, or shared memory.

func worker(sharedCounter *int64, sharedList *[]int, mu *sync.Mutex) {
  count := atomic.AddInt64(sharedCounter, 1)
  mu.Lock()
  *sharedList = append(*sharedList, int(count))
  mu.Unlock()
}

func main() {
  // Child mode: only taken when the binary is run as "<program> -child <n>".
  // This example does not spawn child processes itself.
  if len(os.Args) > 2 && os.Args[1] == "-child" {
    n, _ := strconv.Atoi(os.Args[2])
    fmt.Printf("Child %d working\n", n)
    return
  }

  var sharedCounter int64
  var sharedList []int
  var mu sync.Mutex
  var wg sync.WaitGroup

  for i := 0; i < 5; i++ {
    wg.Add(1)
    go func() {
      defer wg.Done()
      worker(&sharedCounter, &sharedList, &mu)
    }()
  }
  wg.Wait()

  fmt.Printf("Counter: %d\n", sharedCounter)
  fmt.Printf("List: %v\n", sharedList)
}

Comparison Table

Collection Type	Java	Python	Thread-Safety
HashMap/Dict	`ConcurrentHashMap`	`dict` + locks	Java: Full, Python: Limited
List	`CopyOnWriteArrayList`	`list` + locks	Java: Full, Python: Limited
Queue	`BlockingQueue` variants	`queue.Queue`	Both: Full
Set	`ConcurrentSkipListSet`	`set` + locks	Java: Full, Python: Limited

Practice Problems

Easy: Thread-Safe Cache

Design a thread-safe cache using concurrent collections.

Solution

import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class ThreadSafeCache<K, V> {
    private final ConcurrentHashMap<K, V> cache = new ConcurrentHashMap<>();

    public V get(K key) {
        return cache.get(key);
    }

    public void put(K key, V value) {
        cache.put(key, value);
    }

    // Computes and stores the value atomically if the key is absent
    public V computeIfAbsent(K key, Function<? super K, ? extends V> mappingFunction) {
        return cache.computeIfAbsent(key, mappingFunction);
    }
}

import threading

class ThreadSafeCache:
    def __init__(self):
        self._cache = {}
        self._lock = threading.RLock()

    def get(self, key):
        with self._lock:
            return self._cache.get(key)

    def put(self, key, value):
        with self._lock:
            self._cache[key] = value

    def compute_if_absent(self, key, mapping_function):
        with self._lock:
            if key not in self._cache:
                self._cache[key] = mapping_function(key)
            return self._cache[key]

package main

import (
  "fmt"
  "sync"
)

// sync.Map provides a thread-safe map with no need for explicit locking
type ThreadSafeCache[K comparable, V any] struct {
  m sync.Map
}

func (c *ThreadSafeCache[K, V]) Get(key K) (V, bool) {
  val, ok := c.m.Load(key)
  if !ok {
    var zero V
    return zero, false
  }
  return val.(V), true
}

func (c *ThreadSafeCache[K, V]) Put(key K, value V) {
  c.m.Store(key, value)
}

// ComputeIfAbsent calls fn only when the key is missing. Unlike Java's
// computeIfAbsent, two goroutines racing on the same key may both call fn;
// LoadOrStore ensures that only one of the results is kept.
func (c *ThreadSafeCache[K, V]) ComputeIfAbsent(key K, fn func(K) V) V {
  if val, ok := c.m.Load(key); ok {
    return val.(V)
  }
  val, _ := c.m.LoadOrStore(key, fn(key))
  return val.(V)
}

func main() {
  cache := &ThreadSafeCache[string, int]{}
  cache.Put("x", 10)

  v, ok := cache.Get("x")
  fmt.Printf("Get x: %d, found: %v\n", v, ok)

  result := cache.ComputeIfAbsent("y", func(k string) int { return 42 })
  fmt.Printf("ComputeIfAbsent y: %d\n", result)
}

Interview Questions

Q1: “What’s the difference between ConcurrentHashMap and synchronized HashMap?”

Answer:

synchronized HashMap (e.g., Collections.synchronizedMap): Locks the entire map for every operation (low concurrency)
ConcurrentHashMap: Fine-grained locking or CAS, and reads do not block (high concurrency)
Performance: ConcurrentHashMap typically scales much better when many threads access the map at once
Use ConcurrentHashMap: When you need thread-safe map with high concurrency

Q2: “When would you use CopyOnWriteArrayList?”

Answer:

Use when: Reads vastly outnumber writes (e.g., 100:1 ratio)
Perfect for: Event listeners, configuration, read-heavy scenarios
Don’t use when: Frequent writes (too expensive - creates copy each time)
Trade-off: Expensive writes for lock-free reads
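
The event-listener case can be sketched in Python as follows. Dispatch iterates over an immutable snapshot, so a listener may unregister itself mid-dispatch without disturbing the loop. `ListenerRegistry` and its methods are hypothetical names, and the sketch relies on CPython replacing an attribute reference in one step.

```python
import threading


class ListenerRegistry:
    """Illustrative copy-on-write listener list."""

    def __init__(self):
        self._listeners = ()                 # immutable snapshot
        self._write_lock = threading.Lock()  # taken by writers only

    def register(self, listener):
        with self._write_lock:
            self._listeners = self._listeners + (listener,)

    def unregister(self, listener):
        with self._write_lock:
            self._listeners = tuple(
                l for l in self._listeners if l is not listener
            )

    def fire(self, event):
        # No lock: iterate over whichever snapshot is current right now.
        for listener in self._listeners:
            listener(event)


registry = ListenerRegistry()
seen = []


def once(event):
    seen.append(("once", event))
    registry.unregister(once)   # safe even though fire() is mid-iteration


def always(event):
    seen.append(("always", event))


registry.register(once)
registry.register(always)
registry.fire("start")
registry.fire("stop")
print(seen)
# [('once', 'start'), ('always', 'start'), ('always', 'stop')]
```

Registration and removal are rare and pay for the copy, while `fire` runs on every event and pays nothing for synchronization.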

Q3: “Are Python’s built-in dict and list thread-safe?”

Answer:

Single operations: Effectively atomic in CPython (e.g., dict[key] = value, list.append(item)), though this is an implementation detail rather than a language guarantee
Compound operations: No, NOT thread-safe (e.g., if key in dict: dict[key] = value)
Solution: Use locks for compound operations or thread-safe collections
GIL: Provides some protection but doesn’t guarantee thread-safety for compound ops
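
A minimal sketch of the lock-based fix for a compound operation, assuming a shared word-count dictionary. `count_words` and its parameter are hypothetical. The lock covers the whole read-modify-write, so the totals do not depend on how the threads are scheduled.

```python
import threading


def count_words(chunks):
    """Count words from several chunks, one thread per chunk."""
    counts = {}
    lock = threading.Lock()

    def worker(words):
        for word in words:
            # Read-modify-write on a shared dict: hold the lock across
            # both the lookup and the update.
            with lock:
                counts[word] = counts.get(word, 0) + 1

    threads = [threading.Thread(target=worker, args=(chunk,)) for chunk in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counts


print(count_words([["a", "b", "a"], ["b", "a"], ["c"]]))
```

The totals are always a: 3, b: 2, c: 1, although the key order in the printed dictionary can vary with thread scheduling.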

Q4: “What’s the difference between ArrayBlockingQueue and LinkedBlockingQueue?”

Answer:

ArrayBlockingQueue: Array-backed, always bounded, fixed memory, one lock shared by producers and consumers in OpenJDK, slightly lower throughput
LinkedBlockingQueue: Node-based, optionally bounded, allocates a node per element, separate put and take locks in OpenJDK, typically higher throughput
Choose: ArrayBlockingQueue for fixed-size needs and predictable memory, LinkedBlockingQueue when throughput matters more

Key Takeaways

Next Steps

Continue learning concurrency:

Asynchronous Patterns - Futures and async/await
Lock-Free Programming - CAS and atomic operations

Mastering concurrent collections is essential for building thread-safe systems! 📦
