Hi all,
I just got a weird error sent through from our application:
When I ran updates from two processes at the same time, MongoDB complained of a duplicate key error on a collection with a unique index, even though the operation in question was an upsert.
Test code:
import time
from bson import Binary
from pymongo import MongoClient, DESCENDING
bucket = MongoClient('127.0.0.1', 27017)['test']['foo']
bucket.drop()
bucket.update({'timestamp': 0}, {'$addToSet': {'_exists_caps': 'cap15'}}, upsert=True, safe=True, w=1, wtimeout=10)
bucket.create_index([('timestamp', DESCENDING)], unique=True)
while True:
    timestamp =  str(int(1000000 * time.time()))
    bucket.update({'timestamp': timestamp}, {'$addToSet': {'_exists_foos': 'fooxxxxx'}}, upsert=True, safe=True, w=1, wtimeout=10)
When I run the script in two processes at the same time, pymongo raises this exception:
Traceback (most recent call last):
  File "test_mongo_update.py", line 11, in <module>
    bucket.update({'timestamp': timestamp}, {'$addToSet': {'_exists_foos': 'fooxxxxx'}}, upsert=True, safe=True, w=1, wtimeout=10)
  File "build/bdist.linux-x86_64/egg/pymongo/collection.py", line 552, in update
  File "build/bdist.linux-x86_64/egg/pymongo/helpers.py", line 202, in _check_write_command_response
pymongo.errors.DuplicateKeyError: E11000 duplicate key error collection: test.foo index: timestamp_-1 dup key: { : "1439374020348044" }
Env:
mongodb 3.0.5, WiredTiger
single mongodb instance
pymongo 2.8.1
mongo.conf
systemLog:
   destination: file
   logAppend: true
   logRotate: reopen
   path: /opt/lib/log/mongod.log
# Where and how to store data.
storage:
   dbPath: /opt/lib/mongo
   journal:
     enabled: true
   engine: "wiredTiger"
   directoryPerDB: true
# how the process runs
processManagement:
   fork: true  # fork and run in background
   pidFilePath: /opt/lib/mongo/mongod.pid
# network interfaces
net:
   port: 27017
   bindIp: 0.0.0.0  # Listens on all interfaces; use 127.0.0.1 for the local interface only.
setParameter:
   enableLocalhostAuthBypass: false
Any thoughts on what could be going wrong here?
PS:
I retried the same case with the MMAPv1 storage engine and it works fine. Why?
I found something related here: https://jira.mongodb.org/browse/SERVER-18213
but this error still occurs after that fix, so it looks like the bug is not completely fixed.
Cheers
If you ever face this error, check your model carefully for any field that has a unique index. If uniqueness is not necessary, simply remove the unique index from the model; if it is necessary, make sure every document gets a unique value for that field.
Because of the unique constraint, MongoDB will only permit one document that lacks the indexed field. If more than one document has no value for the indexed field or is missing it entirely, the index build will fail with a duplicate key error.
In MongoDB, the upsert option is a Boolean. If it is true and one or more documents match the query filter, the update operation is applied to the matching documents. If it is true and no document matches, a new document is inserted into the collection.
In other words, upsert is a combination of update and insert (update + insert = upsert).
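The two branches of an upsert can be modelled with a toy function over a plain list of dicts. This is only a sketch of the semantics, not how the server implements it; the `upsert` helper and the `seen` field are made up for illustration.

```python
def upsert(docs, query, changes):
    """Toy model of update(..., upsert=True) on a list of dicts.

    Returns "updated" if a document matched the query, "inserted" otherwise.
    """
    for doc in docs:
        # A document matches when every key in the query has the same value.
        if all(doc.get(key) == value for key, value in query.items()):
            doc.update(changes)
            return "updated"
    # No match: build a new document from the query plus the changes.
    new_doc = dict(query)
    new_doc.update(changes)
    docs.append(new_doc)
    return "inserted"


docs = []
first = upsert(docs, {"timestamp": 0}, {"seen": 1})   # no match, so it inserts
second = upsert(docs, {"timestamp": 0}, {"seen": 2})  # match, so it updates in place
```

Note that the match check and the insert are two separate steps here, which is the window the rest of this thread is about.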
I found the bug at: https://jira.mongodb.org/browse/SERVER-14322
Please feel free to vote for it and watch it for further updates.
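An upsert checks for an existing document to update and, if none is found, inserts a new one.
My best guess is you are running into a timing issue where both processes perform the check before either has inserted, so both attempt the insert and the second one hits the unique index.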
First check what native query your Python library is sending underneath, and confirm it is what you expect on the native MongoDB side. Then, if you can reproduce this semi-regularly on WiredTiger but never on MMAPv1, raise a bug with MongoDB to confirm what their expected behavior is. It's sometimes hard to work out what they guarantee to be atomic.
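To make the suspected timing issue concrete, here is a minimal single-process sketch that replays the interleaving by hand and then shows a retry. `ToyCollection`, `apply_write` and `upsert_with_retry` are hypothetical names, and the local `DuplicateKeyError` is only a stand-in for `pymongo.errors.DuplicateKeyError`, so that no server is needed.

```python
class DuplicateKeyError(Exception):
    """Local stand-in for pymongo.errors.DuplicateKeyError."""


class ToyCollection:
    """Hypothetical in-memory collection with a unique index on 'timestamp'."""

    def __init__(self):
        self.docs = {}

    def find_one(self, timestamp):
        return self.docs.get(timestamp)

    def insert(self, timestamp, doc):
        if timestamp in self.docs:
            raise DuplicateKeyError("E11000 duplicate key: %r" % (timestamp,))
        self.docs[timestamp] = doc


def apply_write(coll, timestamp, found, value):
    """Second half of an upsert: update if the earlier check found a doc, else insert."""
    if found is not None:
        found.setdefault("_exists_foos", set()).add(value)
    else:
        coll.insert(timestamp, {"_exists_foos": {value}})


def upsert_with_retry(coll, timestamp, value, attempts=3):
    """Redo the whole check-then-write if the insert loses the race."""
    for attempt in range(attempts):
        found = coll.find_one(timestamp)
        try:
            apply_write(coll, timestamp, found, value)
            return
        except DuplicateKeyError:
            if attempt == attempts - 1:
                raise


coll = ToyCollection()
ts = "1439374020348044"

# Replay the interleaving: both writers check before either one writes.
found_a = coll.find_one(ts)
found_b = coll.find_one(ts)
apply_write(coll, ts, found_a, "fooxxxxx")      # writer A inserts
try:
    apply_write(coll, ts, found_b, "fooxxxxx")  # writer B acts on a stale check
    race_error = None
except DuplicateKeyError as exc:
    race_error = exc

# Starting again from the check, the writer now sees A's document and updates it.
upsert_with_retry(coll, ts, "fooyyyyy")
```

With pymongo, the analogous workaround would be to catch `pymongo.errors.DuplicateKeyError` around the `bucket.update(..., upsert=True)` call and issue it again; whether that is acceptable for your application is a judgment call.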
This is a good example of why Mongo ObjectIDs combine a timestamp, a machine id, a pid and a counter for uniqueness.
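If the intent is one document per write rather than one per timestamp, a key can be built on the same idea. The sketch below is not the real ObjectId byte layout; `make_unique_key` and its string format are assumptions for illustration, using only the standard library.

```python
import itertools
import os
import socket
import time

# Per-process counter, so two keys created in the same microsecond still differ.
_counter = itertools.count()


def make_unique_key():
    """Build a key from a timestamp, the host name, the pid and a counter."""
    return "%d-%s-%d-%d" % (
        int(1000000 * time.time()),  # microsecond timestamp, as in the question
        socket.gethostname(),        # distinguishes machines
        os.getpid(),                 # distinguishes processes on one machine
        next(_counter),              # distinguishes calls within one process
    )


keys = [make_unique_key() for _ in range(1000)]
```

Two processes calling this in the same microsecond still get different keys because their pids differ.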