Imagine having a main thread which creates a HashSet and starts a lot of worker threads passing HashSet to them.
Just like in code below:
void main() {
  final Set<String> set = new HashSet<>();
  final ExecutorService threadExecutor = 
  Executors.newFixedThreadPool(10);
  threadExecutor.submit(() -> doJob(set));
} 
void doJob(final Set<String> pSet) {
  // do some stuff
  final String x = ... // doesn't matter how we received the value.
  if (!pSet.contains(x)) {
    synchronized (pSet) {
      // double check to prevent multiple adds within different threads
      if (!pSet.contains(x)) {
        // do some exclusive work with x.
        pSet.add(x);
      }
    }
  }
  // do some stuff
}
I'm wondering is it thread-safe to synchronize only on add method? Is there any possible issues if contains is not synchronized?
My intuition telling me this is fine, after leaving synchronized block changes made to set should be visible to all threads, but JMM could be counter-intuitive sometimes.
P.S. I don't think it's a duplicate of How to lock multiple resources in java multithreading Even though answers to both could be similar, this question addresses more particular case.
HashMap allows null values and null keys. Both HashSet and HashMap are not synchronized.
Thread Safe HashSet Using ConcurrentHashMap Factory Method Basically, this method returns an instance that respects the java. util. Set interface and allows the usage of standard methods like add(), contains(), etc.
synchronized keyword is one of the way to achieve 'thread safe'. But Remember:Actually while multiple threads tries to access synchronized method they follow the order so becomes safe to access.
Thread synchronization is the concurrent execution of two or more threads that share critical resources. Threads should be synchronized to avoid critical resource use conflicts. Otherwise, conflicts may arise when parallel-running threads attempt to modify a common variable at the same time.
I'm wondering is it thread-safe to synchronize only on the
addmethod? Are there any possible issues ifcontainsis not synchronized as well?
Short answers: No and Yes.
There are two ways of explaining this:
Java synchronization (in its various forms) guards against a number of things, including:
In your example, synchronizing on add is sufficient to ensure that two threads cannot update the HashSet simultaneously, and that both calls will be operating on the most recent HashSet state.
However, if contains is not synchronized as well, a contains call could happen simultaneously with an add call.  This could lead to the contains call seeing an intermediate state of the HashSet, leading to an incorrect result, or worse.   This can also happen if the calls are not simultaneous, due to changes not being flushed to main memory immediately and/or the reading thread not reading from main memory.
The JLS specifies the Java Memory Model which sets out the conditions that must be fulfilled by a multi-threaded application to guarantee that one thread sees the memory updates made by another. The model is expressed in mathematical language, and not easy to understand, but the gist is that visibility is guaranteed if and only if there is a chain of happens before relationships from the write to a subsequent read. If the write and read are in different threads, then synchronization between the threads is the primary source of these relationships. For example in
 // thread one
 synchronized (sharedLock) {
    sharedVariable = 42;
 }
 // thread two
 synchronized (sharedLock) {
     other = sharedVariable;
 }
Assuming that the thread one code is run before the thread two code, there is a happens before relationships between thread one releasing the lock and thread two acquiring it.  With this and the "program order" relations, we can build a chain from the write of 42 to the assignment to other.  This is sufficient to guarantee that other will be assigned 42 (or possibly a later value of the variable) and NOT any value in sharedVariable before 42 was written to it.
Without the synchronized block synchronizing on the same lock, the second thread could see a stale value of sharedVariable; i.e. some value written to it before 42 was assigned to it.
That code is thread safe for the the  synchronized (pSet) { } part : 
if (!pSet.contains(x)) {
  synchronized (pSet) { 
  // Here you are sure to have the updated value of pSet    
  if (!pSet.contains(x)) {
    // do some exclusive work with x.
    pSet.add(x);
  }
}
because inside the synchronized statement on the pSet object : 
pSet has also its updated state guaranteed by the happens-before relationship with the synchronized keyword.   So whatever the value returned by  the first if (!pSet.contains(x)) statement for a waiting thread, when this waited thread will wake up and enter in the  synchronized statement, it will set the last updated value of pSet. So even if the same element was added by a previous thread, the second   if (!pSet.contains(x)) would return false.   
But this code is not thread safe for the first statement if (!pSet.contains(x)) that could be executed during a writing on the Set.
As a rule of thumb, a collection not designed to be thread safe should not be used to perform concurrently writing and reading operations because the internal state of the collection could be in a in-progress/inconsistent state for a reading operation that would occur meanwhile a writing operation.
While some no thread safe collection implementations accept such a usage in the facts, that is not guarantee at all that it will always be true.
So you should use a thread safe Set implementation to guarantee the whole thing thread safe.
For example with :
Set<String> pSet = ConcurrentHashMap.newKeySet();
That uses under the hood a ConcurrentHashMap, so no lock for reading and a minimal lock for writing (only on the entry to modify and not the whole structure).    
No,
You don't know in what state the Hashset might be during add by another Thread. There might be fundamental changes ongoing, like splitting of buckets, so that contains may return false during the adding by another thread, even if the element would be there in a singlethreaded HashSet. In that case you would try to add an element a second time.
Even Worse Scenario: contains might get into an endless loop or throw an exception because of an temporary invalid state of the HashSet in the memory used by the two threads at the same time.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With