Lokalise platform performance issues
Incident Report for Lokalise
Postmortem

To better handle the growing amount of data and number of users in Lokalise, we continually scale the application's resources. After a routine operational change, one that had been executed multiple times before and tested successfully in the staging environment, the Elasticsearch cluster that powers many Lokalise features suddenly became overloaded.

As more users came online, the service began struggling with the load, leading to increased latency and general slowness of the Lokalise application. We turned off filters, search, and statistics to make the application performant again in a limited mode, and continued working to resolve the issue.

The source of the issue was established quickly; however, restoring full performance took more than an hour before we could re-enable all functionality. It took this long because the Elasticsearch index that had to be relocated was very large. The root cause of the incident was an incorrect estimate of the resources required for scaling the backend service. This was unexpected: the metrics we had in place did not reveal the full extent of the service’s actual load.

We apologize for the inconvenience and frustration caused by the downtime experienced by our customers. Our team takes this incident seriously and is committed to taking all necessary measures to prevent similar incidents from occurring in the future. We appreciate your patience and understanding and will continue to work diligently to improve our system’s performance and reliability for you.

Posted Apr 25, 2023 - 14:58 UTC

Resolved
This incident has been resolved.
Posted Apr 19, 2023 - 10:17 UTC
Monitoring
A fix has been implemented, and the search, filtering, and statistics functionality has been restored. We are monitoring the results.
Posted Apr 19, 2023 - 08:29 UTC
Identified
The issue has been identified, and a fix is being implemented. The app is now accessible; however, we have disabled search, filtering, and statistics functionality while we work on the root cause.
Posted Apr 19, 2023 - 06:52 UTC
Update
We are continuing to investigate this issue.
Posted Apr 19, 2023 - 06:37 UTC
Update
We are continuing to investigate this issue.
Posted Apr 19, 2023 - 06:26 UTC
Update
We are continuing to investigate this issue.
Posted Apr 19, 2023 - 06:07 UTC
Investigating
The Lokalise platform (https://app.lokalise.com/) is experiencing performance issues related to loading times, navigation speed within projects, and search functionality. The API is also affected. Our team is investigating the root cause of these issues and working diligently to resolve them as quickly as possible.
Posted Apr 19, 2023 - 04:20 UTC
This incident affected: Lokalise API and Lokalise App.