The version conflict error is often seen when doing document indexing operations in Elasticsearch.
Each document in Elasticsearch has two fields, _seq_no and _primary_term, that record the state of the document.
When concurrent changes are made to a document, each change gets its own _seq_no and _primary_term.
Elasticsearch compares the _seq_no and _primary_term to prevent an older change from overwriting a newer one.
When you update a document, you can specify a sequence number and primary term to make sure that no other change was made between the time you read the sequence number/primary term and the time you make your change. If the sequence number and primary term you specify do not match the ones in the document, you get a 409 conflict error.
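The check described above is a compare-and-set. The sketch below is a toy in-memory model of it, not Elasticsearch code: the names ToyShard, StoredDoc and VersionConflict are made up for illustration, and the model assumes a single shard whose primary never changes.

```python
from dataclasses import dataclass
from typing import Optional


class VersionConflict(Exception):
    """Toy stand-in for the 409 version_conflict_engine_exception."""


@dataclass
class StoredDoc:
    source: dict
    seq_no: int
    primary_term: int


class ToyShard:
    """Hypothetical in-memory model of one primary shard (not Elasticsearch code)."""

    def __init__(self) -> None:
        self.docs = {}         # doc_id -> StoredDoc
        self.next_seq_no = 0   # every accepted write takes the next number
        self.primary_term = 1  # fixed: this toy never changes its primary

    def index(
        self,
        doc_id: str,
        source: dict,
        if_seq_no: Optional[int] = None,
        if_primary_term: Optional[int] = None,
    ) -> StoredDoc:
        current = self.docs.get(doc_id)
        if if_seq_no is not None or if_primary_term is not None:
            # Conditional write: the caller states which version it last read.
            if current is None:
                raise VersionConflict(f"[{doc_id}]: no document was found")
            if (current.seq_no, current.primary_term) != (if_seq_no, if_primary_term):
                raise VersionConflict(
                    f"[{doc_id}]: required seqNo [{if_seq_no}], "
                    f"primary term [{if_primary_term}]. current document has "
                    f"seqNo [{current.seq_no}] and primary term [{current.primary_term}]"
                )
        stored = StoredDoc(source, self.next_seq_no, self.primary_term)
        self.next_seq_no += 1
        self.docs[doc_id] = stored
        return stored


if __name__ == "__main__":
    shard = ToyShard()
    shard.index("123", {"title": "monday"})
    shard.index("123", {"title": "monday", "content": "this is monday"})
    try:
        # The caller believes the document is at seq_no 2, but it is at 1.
        shard.index("123", {"title": "monday"}, if_seq_no=2, if_primary_term=1)
    except VersionConflict as exc:
        print(f"rejected: {exc}")
```

An unconditional write always succeeds and takes the next sequence number, while a conditional write only succeeds if nobody else wrote in between.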
First, index a document into my_index:
PUT my_index/_doc/123
{
  "title": "monday"
}
This is the result you get.
{
  "_index": "my_index",
  "_id": "123",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 0,
  "_primary_term": 1
}
If you check what is in document 123:
GET my_index/_doc/123
The result is:
{
  "_index": "my_index",
  "_id": "123",
  "_version": 1,
  "_seq_no": 0,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "title": "monday"
  }
}
If you update document 123 again:
PUT my_index/_doc/123
{
  "title": "monday",
  "content": "this is monday"
}
You see something like this:
{
  "_index": "my_index",
  "_id": "123",
  "_version": 2,
  "result": "updated",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 1,
  "_primary_term": 1
}
The sequence number of document 123 has changed, while the primary term is still 1 (GET my_index/_doc/123):
{
  "_index": "my_index",
  "_id": "123",
  "_version": 2,
  "_seq_no": 1,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "title": "monday",
    "content": "this is monday"
  }
}
In the index API, we can also specify the if_seq_no and if_primary_term parameters:
PUT my_index/_doc/123?if_seq_no=2&if_primary_term=1
{
  "title": "monday"
}
Since the specified sequence number and primary term do not both match the document's current values, we get an error:
{
  "error": {
    "root_cause": [
      {
        "type": "version_conflict_engine_exception",
        "reason": "[123]: version conflict, required seqNo [2], primary term [1]. current document has seqNo [1] and primary term [1]",
        "index_uuid": "n_LbDCbbT4O5mwgGWLOOKA",
        "shard": "0",
        "index": "my_index"
      }
    ],
    "type": "version_conflict_engine_exception",
    "reason": "[123]: version conflict, required seqNo [2], primary term [1]. current document has seqNo [1] and primary term [1]",
    "index_uuid": "n_LbDCbbT4O5mwgGWLOOKA",
    "shard": "0",
    "index": "my_index"
  },
  "status": 409
}
For the delete API, we can also specify if_seq_no and if_primary_term:
DELETE my_index/_doc/123?if_seq_no=2&if_primary_term=1
Document 123 exists, but the sequence number and primary term do not match, so you get an error:
{
  "error": {
    "root_cause": [
      {
        "type": "version_conflict_engine_exception",
        "reason": "[123]: version conflict, required seqNo [2], primary term [1]. current document has seqNo [1] and primary term [1]",
        "index_uuid": "n_LbDCbbT4O5mwgGWLOOKA",
        "shard": "0",
        "index": "my_index"
      }
    ],
    "type": "version_conflict_engine_exception",
    "reason": "[123]: version conflict, required seqNo [2], primary term [1]. current document has seqNo [1] and primary term [1]",
    "index_uuid": "n_LbDCbbT4O5mwgGWLOOKA",
    "shard": "0",
    "index": "my_index"
  },
  "status": 409
}
If the document does not exist (for example, if you run DELETE my_index/_doc/183244?if_primary_term=1&if_seq_no=1), you see an error like this:
{
  "error": {
    "root_cause": [
      {
        "type": "version_conflict_engine_exception",
        "reason": "[183244]: version conflict, required seqNo [1], primary term [1]. but no document was found",
        "index_uuid": "n_LbDCbbT4O5mwgGWLOOKA",
        "shard": "0",
        "index": "my_index"
      }
    ],
    "type": "version_conflict_engine_exception",
    "reason": "[183244]: version conflict, required seqNo [1], primary term [1]. but no document was found",
    "index_uuid": "n_LbDCbbT4O5mwgGWLOOKA",
    "shard": "0",
    "index": "my_index"
  },
  "status": 409
}
This conflict error can also happen when you use the delete_by_query API.
sequence number and primary term#
The sequence number identifies an operation performed on a document; it increases with every change, as the responses above show. The primary term counts how many times the primary shard has changed, for example when a replica is promoted to primary. If you create a new index and index a document, you see that the primary term is 1.
The primary term was introduced to solve document versioning problems in a distributed system such as Elasticsearch. You can find more about the motivation in this official blog post [1].
Elasticsearch delete versioning#
Suppose a delete operation targets version 1000 of a document, and an index operation that targets version 999 of the same document arrives later. If the delete took effect immediately and Elasticsearch did not keep the version of the deleted document, the index operation would succeed. However, the index operation targets an older version of the document, so it should not succeed: the document should stay deleted rather than being indexed again.
The index.gc_deletes setting addresses this issue.
It specifies how long the version number of a deleted document is kept.
By default, it is kept for 60 seconds.
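To make the scenario concrete, here is a toy model of that retention window. It is a sketch under loud assumptions: ToyVersionedStore and its methods are hypothetical, versions are supplied by the caller (as with external versioning), and time is passed in explicitly as `now` instead of being read from a clock. It does not reflect how Elasticsearch stores tombstones internally.

```python
class ToyVersionedStore:
    """Toy model of caller-supplied versions with delete tombstones (not Elasticsearch code)."""

    def __init__(self, gc_deletes_seconds=60.0):
        self.gc_deletes_seconds = gc_deletes_seconds
        self.live = {}        # doc_id -> (version, source)
        self.tombstones = {}  # doc_id -> (version, deleted_at)

    def _last_known_version(self, doc_id, now):
        if doc_id in self.live:
            return self.live[doc_id][0]
        tombstone = self.tombstones.get(doc_id)
        if tombstone is not None:
            version, deleted_at = tombstone
            if now - deleted_at < self.gc_deletes_seconds:
                return version
            # Retention window is over: the deleted version is forgotten.
            del self.tombstones[doc_id]
        return None

    def index(self, doc_id, source, version, now):
        """Return True if the write is accepted, False if it is a stale version."""
        known = self._last_known_version(doc_id, now)
        if known is not None and version <= known:
            return False
        self.tombstones.pop(doc_id, None)
        self.live[doc_id] = (version, source)
        return True

    def delete(self, doc_id, version, now):
        """Return True if the delete is accepted; remember its version as a tombstone."""
        known = self._last_known_version(doc_id, now)
        if known is not None and version <= known:
            return False
        self.live.pop(doc_id, None)
        self.tombstones[doc_id] = (version, now)
        return True


if __name__ == "__main__":
    store = ToyVersionedStore(gc_deletes_seconds=60.0)
    store.index("doc", {"title": "v998"}, version=998, now=0.0)
    store.delete("doc", version=1000, now=1.0)
    # Within the window the stale write is rejected; after it, the version is forgotten.
    print(store.index("doc", {"title": "v999"}, version=999, now=2.0))
    print(store.index("doc", {"title": "v999"}, version=999, now=100.0))
```

A longer retention window protects against later-arriving stale writes at the price of remembering deleted versions for longer.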
ref:
- Elasticsearch version support (this is old, but still worth reading): https://www.elastic.co/blog/elasticsearch-versioning-support
- explanation of versioning in the delete API: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete.html#delete-versioning
- index.gc_deletes is listed under dynamic index settings: https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules.html#dynamic-index-settings
version conflict when using delete by query#
When we use delete_by_query to delete documents, we may see a ConflictError exception.
typical error#
The typical error reason looks like this:
version conflict, required seqNo [0], primary term [1]. current document has seqNo [1] and primary term [1]
This means that the delete is based on an older version of the document.
This can happen, for example, if you have a document in the index, index a new version of the document, and then immediately run delete_by_query to delete the document.
Below is a simple snippet to reproduce the error:
from elasticsearch import Elasticsearch
es_client = Elasticsearch(...)  # fill in your own connection details
index_name = "demo_index"
_id = "123456"
doc_old = {"title": "old title"}
doc_new = {"title": "new title"}
# start from a clean index
if es_client.indices.exists(index=index_name):
    es_client.indices.delete(index=index_name)
es_client.index(index=index_name, document=doc_old, id=_id)
es_client.indices.refresh(index=index_name)
search_response = es_client.search(index=index_name, seq_no_primary_term=True)
print(f"first search response: {search_response['hits']['hits']}")
# index updated doc
es_client.index(index=index_name, document=doc_new, id=_id)
# this refresh below is very important
# es_client.indices.refresh(index=index_name)
second_response = es_client.search(index=index_name, seq_no_primary_term=True)
print(f"second search response: {second_response['hits']['hits']}")
# without the refresh, this call raises a conflict error
deletion_result = es_client.delete_by_query(
    index=index_name, query={"bool": {"should": [{"term": {"_id": _id}}]}}
)
print(f"deletion result: {deletion_result}")
But why? After the second index operation using doc_new, the updated version of the document is not immediately available for search.
You have to wait for Elasticsearch to refresh the index to make the new version searchable, which happens every second by default and is controlled by the index.refresh_interval setting.
Alternatively, you can call the refresh API to refresh the index yourself.
Under the hood, delete_by_query() searches for the matching documents and then deletes them.
However, the document version it gets from the search is an older one, so the delete is rejected.
Actually if we compare the first response and second response, they are the same:
first search response: [{'_index': 'demo_index', '_id': '123456', '_seq_no': 0, '_primary_term': 1, '_score': 1.0, '_source': {'title': 'old title'}}]
second search response: [{'_index': 'demo_index', '_id': '123456', '_seq_no': 0, '_primary_term': 1, '_score': 1.0, '_source': {'title': 'old title'}}]
If you uncomment the es_client.indices.refresh() call in the above snippet, the delete_by_query call should run without issues.
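One way to package this fix is a small helper that always refreshes before deleting. This is only a sketch: the helper name is hypothetical, `client` is assumed to be an elasticsearch-py 8.x client like es_client above, and `conflicts` is passed through to the delete_by_query option that accepts "abort" (the default) or "proceed".

```python
def delete_by_query_after_refresh(client, index, query, conflicts="abort"):
    """Refresh `index` so the search phase sees the latest writes, then delete.

    Hypothetical helper: `client` is assumed to expose indices.refresh() and
    delete_by_query() like the elasticsearch-py client used in this post.
    """
    # Make recent index and delete operations visible to search first.
    client.indices.refresh(index=index)
    # conflicts="proceed" would skip conflicting documents instead of aborting.
    return client.delete_by_query(index=index, query=query, conflicts=conflicts)
```

It would be called as delete_by_query_after_refresh(es_client, index_name, {"term": {"_id": _id}}). Forcing a refresh has a cost, so this fits scripts and tests better than a hot code path. Note that it narrows the window for conflicts but cannot close it, since another writer can still change a document between the search and the delete.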
another error#
Another conflict error I have seen is somewhat different and has the following reason:
version conflict, required seqNo [0], primary term [1]. but no document was found
This can happen, for example, when you use delete_by_query to delete a document from an index and then immediately run delete_by_query again to delete the same document.
Try the code snippet below:
from elasticsearch import Elasticsearch
es_client = Elasticsearch(...)  # fill in your own connection details
index_name = "demo_index"
_id = "123456"
doc_old = {"title": "old title"}
# start from a clean index
if es_client.indices.exists(index=index_name):
    es_client.indices.delete(index=index_name)
es_client.index(index=index_name, document=doc_old, id=_id)
es_client.indices.refresh(index=index_name)
search_response = es_client.search(index=index_name, seq_no_primary_term=True)
print(f"first search response: {search_response['hits']['hits']}")
# first deletion: succeeds
deletion_result = es_client.delete_by_query(
    index=index_name, query={"bool": {"should": [{"term": {"_id": _id}}]}}
)
print(f"deletion result: {deletion_result}")
second_response = es_client.search(index=index_name, seq_no_primary_term=True)
print(f"second search response: {second_response['hits']['hits']}")
# es_client.indices.refresh(index=index_name)
# second deletion: without the refresh above, this raises a conflict error
deletion_result = es_client.delete_by_query(
    index=index_name, query={"bool": {"should": [{"term": {"_id": _id}}]}}
)
print(f"deletion result: {deletion_result}")
Remember the section on delete versioning.
After the first delete_by_query call, the deletion has been applied, but it is not yet visible to search.
The second search in the snippet can still show the document!
So the second call to delete_by_query() gets the stale version of the document from its search and tries to delete it, but the document is not there anymore (still searchable, but already deleted).
If we uncomment the es_client.indices.refresh() line, the second call to delete_by_query() will not run into errors.
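Both failure modes can be reproduced without a cluster in a toy model where writes apply at once but search only sees the state at the last refresh. Everything here (ToyIndex, VersionConflict, the simplified error text without primary terms) is made up for illustration and is not how Elasticsearch is implemented.

```python
class VersionConflict(Exception):
    """Toy stand-in for the 409 conflict error."""


class ToyIndex:
    """Toy model: writes are applied immediately, search sees the last refresh only."""

    def __init__(self):
        self.live = {}        # doc_id -> (seq_no, source): the current state
        self.searchable = {}  # snapshot of `live` taken at the last refresh
        self.next_seq_no = 0

    def index(self, doc_id, source):
        self.live[doc_id] = (self.next_seq_no, source)
        self.next_seq_no += 1

    def refresh(self):
        self.searchable = dict(self.live)

    def search(self):
        # Returns doc_id -> seq_no as seen by search (possibly stale).
        return {doc_id: seq_no for doc_id, (seq_no, _) in self.searchable.items()}

    def delete(self, doc_id, if_seq_no):
        current = self.live.get(doc_id)
        if current is None:
            raise VersionConflict(
                f"[{doc_id}]: version conflict, required seqNo [{if_seq_no}]. "
                "but no document was found"
            )
        if current[0] != if_seq_no:
            raise VersionConflict(
                f"[{doc_id}]: version conflict, required seqNo [{if_seq_no}]. "
                f"current document has seqNo [{current[0]}]"
            )
        del self.live[doc_id]
        self.next_seq_no += 1

    def delete_by_query(self):
        # Search first, then delete each hit at the version the search returned.
        for doc_id, seq_no in self.search().items():
            self.delete(doc_id, if_seq_no=seq_no)


if __name__ == "__main__":
    idx = ToyIndex()
    idx.index("123456", {"title": "old title"})
    idx.refresh()
    idx.index("123456", {"title": "new title"})  # not refreshed yet
    try:
        idx.delete_by_query()
    except VersionConflict as exc:
        print(f"stale version: {exc}")
    idx.refresh()
    idx.delete_by_query()  # succeeds
    try:
        idx.delete_by_query()  # search still shows the deleted document
    except VersionConflict as exc:
        print(f"already deleted: {exc}")
```

In both cases, calling refresh() between the write and the next delete_by_query() brings the searchable snapshot back in line with the live state, which is what the commented-out refresh lines in the snippets do.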
ref:
- sequence number and version number: https://stackoverflow.com/a/58762025/6064933
- Elasticsearch version conflict when using delete by query: https://stackoverflow.com/q/55382702/6064933
- document refresh: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-refresh.html
- https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#optimistic-concurrency-control-index
- https://www.elastic.co/guide/en/elasticsearch/reference/current/optimistic-concurrency-control.html
[1] You can check the local and global checkpoints mentioned in that post using the cat shards API.