Use ConcurrentHashMap instead of Hashtable & synchronizedMap
Synchronized collection classes such as Hashtable, and the synchronized wrapper classes created by Collections.synchronizedMap, are thread-safe but suffer from poor concurrency, performance, and scalability. Some of their drawbacks are as follows.
1. Poor concurrency: When these collections are accessed by two or more threads, they achieve thread safety by making the collection's data private and synchronizing every public method, so that only one thread at a time can access the collection's (Hashtable / synchronizedMap) data. Because a single lock guards the whole collection, multiple threads contend for that collection-wide lock, which reduces performance.
2. ConcurrentModificationException:
When one thread is traversing a Hashtable or Collections.synchronizedMap through an Iterator while another thread changes it with a mutative operation (put, remove, etc.), the iterators implemented in the java.util collection classes fail fast by throwing ConcurrentModificationException. The exception is thrown when the hasNext() or next() method of the Iterator is called. The same error also occurs (see Code Part 1) when elements are added to the Hashtable or synchronizedMap after the iterator has been constructed. To iterate such a collection safely, the collection-wide lock must be held for the entire traversal; otherwise a ConcurrentModificationException may be thrown.
3. Scalability issues:
Scalability is a major problem with synchronized collections. When the workload of an application increases, adding resources such as processors and memory should also increase the throughput of the application; unfortunately, with these collections it does not. A scalable program can handle a proportionally larger workload with more resources. Because synchronized collections synchronize on a single common lock, access is restricted to one thread at a time, and other threads are blocked from using the collection even when there are enough resources to schedule them.
4. Compound operations: Some common sequences of operations, such as put-if-absent (check whether an element is in the collection before adding it) or iteration, require external synchronization (i.e. client-side locking, see Code Part 3) to avoid data races.
Code Part 1 :
// Map<Integer, String> hm = Collections.synchronizedMap(new HashMap<Integer, String>());
Map<Integer, String> hm = new Hashtable<Integer, String>();
// Map<Integer, String> hm = new ConcurrentHashMap<Integer, String>();
hm.put(1, "Blue");
hm.put(2, "Green");
hm.put(3, "Yellow");
Iterator<Map.Entry<Integer, String>> entries = hm.entrySet().iterator();
// Adding entries after the iterator has been created makes the fail-fast
// iterator throw ConcurrentModificationException during the loop below.
hm.put(4, "Red");
hm.put(5, "Orange");
while (entries.hasNext()) {
    Map.Entry<Integer, String> entry = entries.next();
    Integer key = entry.getKey();
    String value = entry.getValue();
    System.out.println("Key = " + key + ", Value = " + value);
}
To overcome these issues with the synchronized collections, a new map designed for concurrent access was introduced: ConcurrentHashMap. This class lives in the java.util.concurrent package and was added in JDK 1.5.
The main goals of ConcurrentHashMap are to provide
1. better concurrency
2. high scalability
3. thread safety
and it supports
1. full concurrency of retrievals: all readers can read the table concurrently; no lock is used for retrieval operations.
2. concurrency for writes: a limited number of writers can update the table concurrently.
3. full thread safety.
ConcurrentHashMap is most useful where read operations dominate (i.e. traversal is the dominant operation). A short example of its iteration behaviour is shown below.
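Unlike the fail-fast iterators of Hashtable and synchronizedMap, the iterators of ConcurrentHashMap are weakly consistent: they tolerate concurrent modification and never throw ConcurrentModificationException. The following sketch (the class name is illustrative, not from the original post) repeats the scenario of Code Part 1 with a ConcurrentHashMap; it prints the entries without failing, although the entries added after the iterator was created may or may not be visible to that traversal.

import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class WeaklyConsistentIterationDemo {
    public static void main(String[] args) {
        Map<Integer, String> hm = new ConcurrentHashMap<Integer, String>();
        hm.put(1, "Blue");
        hm.put(2, "Green");
        hm.put(3, "Yellow");

        Iterator<Map.Entry<Integer, String>> entries = hm.entrySet().iterator();

        // Modifying the map after creating the iterator is allowed: the weakly
        // consistent iterator does not throw ConcurrentModificationException;
        // the new entries may or may not be reflected during this traversal.
        hm.put(4, "Red");
        hm.put(5, "Orange");

        while (entries.hasNext()) {
            Map.Entry<Integer, String> entry = entries.next();
            System.out.println("Key = " + entry.getKey() + ", Value = " + entry.getValue());
        }
    }
}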
How is ConcurrentHashMap implemented? How does it work, and how is concurrency achieved?
Volatile fields and lock striping play the major role in achieving this concurrency.
Lock striping: Synchronizing every method on a single lock restricts access to a single thread at a time. Instead of using a single lock, ConcurrentHashMap uses a different locking mechanism, called lock striping, to allow the shared collection to be accessed concurrently, which improves scalability and performance. Using different locks so that different threads can operate on different portions of the same data structure is called lock striping. Splitting one lock into several improves scalability; for example, two locks allow two threads to work concurrently instead of one.
Lock splitting can sometimes be extended to partition locking on a variable-sized set of independent objects, in which case it is called lock striping.
Now let us see how the lock striping mechanism is applied to ConcurrentHashMap. The strategy is to subdivide the table into independent subsets, called segments, each guarded by its own lock, so that each subset (itself a hash table) can be accessed concurrently. It uses an array of 16 locks, each of which guards 1/16 of the hash buckets: for a table with N hash buckets, each lock guards roughly N/16 of them, and hash bucket n is guarded by lock n mod 16. Sixteen locks allow a maximum of 16 threads to modify the table at the same time. Mutative operations such as put() and remove() acquire a lock, whereas read operations do not.
Note: The number of locks can be increased; the concurrencyLevel argument of the ConcurrentHashMap constructor serves as a sizing hint for the number of segments.
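The following is a minimal, hypothetical sketch of the lock-striping idea only; it is not the actual ConcurrentHashMap source. An array of 16 locks guards an array of buckets, and the lock for a key is chosen from the bucket index mod 16, so threads that touch different stripes do not block each other.

import java.util.LinkedList;
import java.util.List;

// Illustrative striped-lock map (not the real implementation).
public class StripedMapSketch<K, V> {
    private static final int N_LOCKS = 16;

    private static class Entry<K, V> {
        final K key;
        V value;
        Entry(K key, V value) { this.key = key; this.value = value; }
    }

    private final List<Entry<K, V>>[] buckets;
    private final Object[] locks = new Object[N_LOCKS];

    @SuppressWarnings("unchecked")
    public StripedMapSketch(int numBuckets) {
        buckets = new List[numBuckets];
        for (int i = 0; i < numBuckets; i++) buckets[i] = new LinkedList<Entry<K, V>>();
        for (int i = 0; i < N_LOCKS; i++) locks[i] = new Object();
    }

    private int bucketFor(K key) {
        return (key.hashCode() & 0x7fffffff) % buckets.length;
    }

    public V put(K key, V value) {
        int b = bucketFor(key);
        synchronized (locks[b % N_LOCKS]) {   // lock only this stripe, not the whole map
            for (Entry<K, V> e : buckets[b]) {
                if (e.key.equals(key)) { V old = e.value; e.value = value; return old; }
            }
            buckets[b].add(new Entry<K, V>(key, value));
            return null;
        }
    }

    public V get(K key) {
        int b = bucketFor(key);
        // This simplified sketch locks for reads too; the real
        // ConcurrentHashMap avoids locking for retrievals.
        synchronized (locks[b % N_LOCKS]) {
            for (Entry<K, V> e : buckets[b]) {
                if (e.key.equals(key)) return e.value;
            }
            return null;
        }
    }
}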
Volatile fields: Some of the volatile fields declared in ConcurrentHashMap are:
transient volatile int count;

static final class HashEntry<K,V> {
    final K key;
    final int hash;
    volatile V value;
    volatile HashEntry<K,V> next;
    HashEntry(K key, int hash, HashEntry<K,V> next, V value) {
        .....
        .....
    }
}

transient volatile HashEntry<K,V>[] table;
From the source of ConcurrentHashMap
As we know, a volatile field ensures visibility, i.e. one thread reads the most up-to-date value written by another thread. For example, count is a volatile field used to track the number of elements. When one thread adds an element to the table, the count is increased by one; similarly, when a thread removes an element, the count is decreased by one. Other threads performing read operations then see the most recently written value of count.
Similarly, the HashEntry<K,V>[] table, value, and next fields are declared volatile. This ensures that all threads see the most recently written value of those fields at all times.
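As a minimal illustration of this visibility guarantee (my own sketch, not taken from ConcurrentHashMap), the reader thread below is guaranteed to eventually observe the writer's update because count is declared volatile; without volatile, the reader could keep spinning on a stale value.

public class VolatileVisibilityDemo {
    // Declared volatile so a write by one thread is visible to reads by others.
    private static volatile int count = 0;

    public static void main(String[] args) throws InterruptedException {
        Thread reader = new Thread(new Runnable() {
            public void run() {
                // Spins until it observes the writer's update; the volatile
                // read guarantees the new value becomes visible.
                while (count == 0) { /* busy-wait */ }
                System.out.println("Reader saw count = " + count);
            }
        });
        reader.start();

        Thread.sleep(100);   // give the reader a moment to start spinning
        count = 1;           // volatile write: published to all threads
        reader.join();
    }
}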
Issue: How do you protect or lock the entire collection? There is no support for locking the entire table in a way that prevents all access. The closest alternative is to acquire all of the segment locks, which is costlier than using a single lock.
ConcurrentHashMap provides three additional atomic update methods:
putIfAbsent(key, value) - if the key is not already present in the map, associate it with the given value; otherwise leave the existing mapping unchanged.
replace(key, value) - replace the value for the key only if the key is currently mapped to some value; the three-argument form replace(key, oldValue, newValue) replaces it only if the key is currently mapped to oldValue.
remove(key, value) - remove the entry only if the key is currently mapped to the given value.
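A short usage sketch of these methods (the class name and map contents are illustrative):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class AtomicMapOpsDemo {
    public static void main(String[] args) {
        ConcurrentMap<Integer, String> map = new ConcurrentHashMap<Integer, String>();

        // putIfAbsent: adds the mapping only if key 1 is absent; returns the
        // previous value, or null if there was none.
        map.putIfAbsent(1, "Blue");
        map.putIfAbsent(1, "Green");            // no effect, 1 is already mapped
        System.out.println(map.get(1));          // Blue

        // replace(key, value): succeeds only because key 1 is already mapped.
        map.replace(1, "Yellow");
        map.replace(2, "Red");                   // no effect, key 2 is absent
        System.out.println(map.get(1));          // Yellow

        // replace(key, oldValue, newValue): conditional on the current value.
        map.replace(1, "Yellow", "Orange");      // succeeds
        System.out.println(map.get(1));          // Orange

        // remove(key, value): removes only if key 1 is still mapped to "Orange".
        map.remove(1, "Orange");
        System.out.println(map.containsKey(1));  // false
    }
}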
The following program uses a ConcurrentHashMap to keep the files it has read in a cache.
Code Part 2 :
import java.util.*;
import java.util.concurrent.ConcurrentHashMap;
import java.io.*;
public class CacheUsingMap2 {
    // Maps a file name to its cached contents.
    private final ConcurrentHashMap<String, myFile> cache;

    public CacheUsingMap2() {
        cache = new ConcurrentHashMap<String, myFile>();
    }

    public String getFile2(String fname) {
        // putIfAbsent atomically caches the file the first time it is requested;
        // note that readFile() is still invoked on every call, but only the
        // first result is kept in the cache.
        cache.putIfAbsent(fname, readFile(fname));
        return cache.get(fname).getFileData();
    }

    public myFile readFile(String name) {
        File file = new File(name);
        String fileData = "";
        try {
            Scanner scan = new Scanner(file);
            scan.useDelimiter("\\Z");   // read the whole file as one token
            fileData = scan.next();
            scan.close();
        } catch (FileNotFoundException e) {
            System.out.println(e);
        }
        return new myFile(fileData);
    }

    public static void main(String[] args) {
        CacheUsingMap2 cache = new CacheUsingMap2();
        String filePath = "D:/Files/";
        System.out.println(cache.getFile2(filePath + "k.txt"));
        System.out.println(cache.getFile2(filePath + "k1.txt"));
        System.out.println(cache.getFile2(filePath + "k.txt"));
        System.out.println(cache.getFile2(filePath + "k1.txt"));
    }
}

class myFile {
    private final String fileData;

    public myFile(String data) {
        fileData = data;
    }

    public String getFileData() {
        return fileData;
    }
}
Code Part 3 :
Sample code to create a cache using Hashtable (implementing the put-if-absent operation), which requires client-side locking:
....
Hashtable<String, myFile> cache = new Hashtable<String, myFile>();
....
public String getFile(String fname) {
    // The containsKey check and the put must happen while holding the same
    // lock; otherwise two threads could both miss and both insert.
    synchronized (cache) {
        // alternatively: if (cache.get(fname) == null)
        if (!cache.containsKey(fname)) {
            cache.put(fname, readFile(fname));
        }
    }
    return cache.get(fname).getFileData();
}