26/10 Azure Cosmos DB - East US - Mitigated (Tracking ID JKWW-JP8)
Impact Statement: Between 00:43 UTC and 06:00 UTC on 26 Oct 2022, customers using Azure Cosmos DB in East US may have experienced issues accessing services. New connections to databases in this region may have resulted in an error or timeout. Existing connections would have remained available to accept new requests; however, if those connections were terminated, re-establishing them may have failed.
Preliminary Root Cause: We identified that this issue was caused by a code regression introduced in a recent deployment, which led to high resource utilization on one of Azure Cosmos DB’s clusters in East US.
Mitigation: We mitigated the issue by re-distributing the load to other healthy Azure Cosmos DB clusters within the region. We will be deploying a permanent hotfix over the next 48 hours.
Next Steps: We will continue to investigate to establish the full root cause and prevent future occurrences. Stay informed about Azure service issues by creating custom service health alerts: https://aka.ms/ash-videos for video tutorials and https://aka.ms/ash-alerts for how-to documentation.
Posted Oct 26, 2022 - 15:31 EDT
Microsoft appears to have resolved the issue. We are verifying that all functions have properly resumed operation.
Posted Oct 26, 2022 - 15:05 EDT
ClearID relies on Microsoft Azure Cosmos DB for some of its services, such as roles. The Cosmos DB instances we use are currently experiencing issues, and Microsoft is actively working to resolve the problem.
Posted Oct 26, 2022 - 14:53 EDT
This incident affected: ClearID (Demo (https://demo.clearid.io), Global (https://portal.clearid.io)).