I'm trying to debug a very strange discrepency between 2 seperate cosmos db collection that on face value are configured the same.
We recently modified some code that executed the following query.
OLD QUERY
SELECT * FROM c 
WHERE c.ProductId = "CODE" 
AND c.PartitionKey = "Manufacturer-GUID"
NEW QUERY
SELECT * FROM c
WHERE (c.ProductId = "CODE" OR ARRAY_CONTAINS(c.ProductIdentifiers, "CODE")) 
AND c.PartitionKey = "Manufacturer-GUID"
The introduction of that Array_Contains call in the production environment has tanked the performance of this query from ~3 RU/s ==> ~6000 RU/s. But only in the production environment.
And the cause seems to be that in Production, it's not hitting an index. See output below for the two environments.
DEV CONFIGURATION
Collection Scale: 2000 RU/s
Index Policy: (Notice the Exclude path for ETags)
{
    "indexingMode": "consistent",
    "automatic": true,
    "includedPaths": [
        {
            "path": "/*",
            "indexes": [
                {
                    "kind": "Range",
                    "dataType": "Number",
                    "precision": -1
                },
                {
                    "kind": "Range",
                    "dataType": "String",
                    "precision": -1
                },
                {
                    "kind": "Spatial",
                    "dataType": "Point"
                }
            ]
        }
    ],
    "excludedPaths": [
        {
            "path": "/\"_etag\"/?"
        }
    ]
}
PROD CONFIGURATION
Collection Scale: 10,000 RU/s
Index Policy: (Notice the absense of an Exclude path for ETags compared to DEV)
{
    "indexingMode": "consistent",
    "automatic": true,
    "includedPaths": [
        {
            "path": "/*",
            "indexes": [
                {
                    "kind": "Range",
                    "dataType": "Number",
                    "precision": -1
                },
                {
                    "kind": "Range",
                    "dataType": "String",
                    "precision": -1
                },
                {
                    "kind": "Spatial",
                    "dataType": "Point"
                }
            ]
        }
    ],
    "excludedPaths": []
}
When comparing the Output results from both environments, DEV is showing an index hit, while PROD is showing an index miss, despite no noticeable difference between the index policies.
Results in DEV
Request Charge:           3.490 RUs
Showing Results:          1 - 1
Retrieved document count: 1
Retrieved document size:  3118 bytes
Output document count:    1
Output document size:     3167 bytes
Index hit document count: 1
Results in PROD
Request Charge:           6544.870 RUs
Showing Results:          1 - 1
Retrieved document count: 124199
Retrieved document size:  226072871 bytes
Output document count:    1
Output document size:     3167 bytes
Index hit document count: 0
The only thing I've been able to find online is a reference in the documents to some change that occurred within Cosmos Collection stating that a "New Index Layout" is being used for newer collections, but there's no other mention of Index Layouts as a concept that I can find anywhere in the docs.
https://docs.microsoft.com/en-us/azure/cosmos-db/index-types#index-kind
Anyone got any idea where I can go from here in terms of debugging/addressing this issue.
Indexing mode Azure Cosmos DB supports two indexing modes: Consistent: The index is updated synchronously as you create, update or delete items. This means that the consistency of your read queries will be the consistency configured for the account. None: Indexing is disabled on the container.
Azure Cosmos DB is commonly used within web and mobile applications, and is well suited for modeling social interactions, integrating with third-party services, and for building rich personalized experiences. The Cosmos DB SDKs can be used build rich iOS and Android applications using the popular Xamarin framework.
Cosmos DB is a very good modern platform for cloud native and mobile applications. I work with different teams which are using Cosmos DB. I found that the service is excellent in some areas like scalability, DR, availability.
Your Dev container is newer and using our v2 Index which has significant improvements including on Array_Contains(). To learn more on how to upgrade your PROD container please email us at askcosmosdb at microsoft dot com.
Thanks.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With