This incident was related to database maintenance performed on January 19, during which management of Plazma table resources was migrated from the existing RDBMS to a dedicated Aurora database. The purpose of this change was to better isolate table resource workloads and improve long-term scalability.
The migration itself completed successfully, and both the original database and the newly introduced Aurora database were operating normally immediately after the maintenance.
On the following day, when internal batch processes began running, the newly introduced database experienced unexpected load. Investigation determined that the internal batch system was configured to connect to the master database endpoint instead of a reader endpoint. As a result, internal batch processing competed with user-facing requests for the same database resources, causing increased latency when accessing table resources.
To mitigate the impact and enable rapid recovery, temporary database tuning was applied. This tuning was intended solely as a short-term measure to stabilize the system and has since been fully reverted. The permanent fix consisted of correcting the configuration of the internal batch system so that it accesses the appropriate database endpoint.
Following these actions, system performance recovered in both the US and Tokyo regions, and all services returned to normal operation. No data loss or data corruption occurred.
To prevent similar issues in the future, we are reviewing our configuration and deployment practices, including reducing configuration differences between staging and production environments and strengthening validation of database endpoints used by internal workloads.
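As an illustration of what endpoint validation for internal workloads could look like, the following is a minimal sketch and not a description of the actual check. It relies on the Aurora naming convention in which reader endpoint hostnames contain `.cluster-ro-`. The workload name `internal-batch`, the function names, and the connection URL format are hypothetical and chosen only for this example.

```python
from urllib.parse import urlparse

# Hypothetical: workloads that should only ever use a reader endpoint.
READ_ONLY_WORKLOADS = {"internal-batch"}


def is_aurora_reader_endpoint(host: str) -> bool:
    """Return True if the hostname follows the Aurora reader endpoint
    naming convention, which includes a '.cluster-ro-' segment."""
    return ".cluster-ro-" in host.lower()


def validate_endpoint(workload: str, connection_url: str) -> None:
    """Raise ValueError if a read-only workload is configured with an
    endpoint that does not look like an Aurora reader endpoint."""
    host = urlparse(connection_url).hostname
    if host is None:
        raise ValueError(f"{workload}: no hostname in connection URL")
    if workload in READ_ONLY_WORKLOADS and not is_aurora_reader_endpoint(host):
        raise ValueError(
            f"{workload}: expected a reader endpoint, got {host!r}"
        )
```

A check of this kind could run at batch startup or as a deployment gate in both staging and production, so that a misconfigured endpoint fails before the workload begins issuing queries.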