8/8/2024 3:38 PM
Summary
On July 31, 2024, at approximately 19:04 CEST, an issue temporarily impacted access to content packages for some GCC customers. Thanks to a swift response, the issue was fully resolved by 23:22 CEST the same day, ensuring minimal disruption. This report highlights the incident, the root causes, the prompt actions taken, and proactive steps that have been implemented to enhance our service reliability moving forward.
Root Cause
The incident was linked to the handling of requests to the path that manages the cookies enabling access to content packages. Requests to this path returned 404 errors, meaning the server was temporarily unable to retrieve or set the cookies needed for session management. Such errors can be transient and influenced by various factors, including:
* Temporary Server or Network Challenges: Momentary issues may have impacted the server’s ability to process requests.
* Session State Management: Occasional challenges in session management might have led to brief disruptions.
* Caching Issues: The server’s caching mechanisms might have momentarily affected data retrieval.
Errors of this type tend to be short-lived, typically occur on the server side, and are beyond the direct control of the application itself. Restarting the web application resolved the issue by refreshing the server state and clearing any temporary conditions.
Resolution
After an investigation that included multiple tests, our team verified the app service configuration and restarted the affected web application. The restart:
* Cleared any disrupted session states.
* Reset the server state, addressing any transient challenges and restoring full functionality.
Impact
* Customer Experience: During the period of disruption, GCC customers had problems accessing content packages.
* Service Availability: SCORM services were impacted for approximately 4 hours and 18 minutes (19:04 to 23:22 CEST), during which our team was fully engaged in restoring service.
Preventive Measures
We already have auto-heal rules in place for all our app services, ensuring they can recover from most issues automatically. However, the path to obtain cookies that enable access to content packages was not previously covered by these rules. To further strengthen our system and prevent similar incidents, we have made the following enhancements:
* Optimized Auto-Heal Settings: We have updated our auto-heal rules to include the aforementioned path. This improvement ensures that any future issues related to this path will be automatically addressed, minimizing the need for manual intervention.
* Enhanced Error Handling and Alerts: We have introduced targeted alerts for repeated 404 errors during such requests. This will enable our team to respond faster and more proactively to any emerging issues.
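For readers who want a concrete picture of what an alert on "repeated 404 errors" can mean, the following is a minimal sketch in Python, not the actual alerting implementation. It counts 404 responses inside a sliding time window and signals once a threshold is reached. The class name `Repeated404Detector`, the threshold, and the window length are hypothetical values chosen for illustration only.

```python
from collections import deque
from datetime import datetime, timedelta


class Repeated404Detector:
    """Hypothetical helper: flags when too many 404s fall within a time window."""

    def __init__(self, threshold: int = 5, window: timedelta = timedelta(minutes=1)):
        self.threshold = threshold  # assumed value, for illustration only
        self.window = window        # assumed value, for illustration only
        self._hits = deque()        # timestamps of recent 404 responses

    def record(self, status_code: int, at: datetime) -> bool:
        """Record one response; return True when the alert condition is met."""
        if status_code == 404:
            self._hits.append(at)
        # Drop 404 timestamps that have fallen out of the sliding window.
        cutoff = at - self.window
        while self._hits and self._hits[0] <= cutoff:
            self._hits.popleft()
        return len(self._hits) >= self.threshold
```

In this sketch a single 404 never triggers the condition; only a burst within the window does. That is the usual reason for alerting on repeated errors rather than on each individual one.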
Conclusion
The incident was resolved quickly once the underlying issue was identified. We are committed to delivering a seamless experience for our customers and will continue to enhance our systems to prevent future disruptions.
We can confirm the issue regarding accessing content packages has now been fully resolved.
If you continue to have any issues, please contact the Zensai Support Team for additional assistance.
https://helpcenter.zensai.com/hc/en-us/articles/202908861-Submit-a-request-to-Zensai-Product-Support
We have implemented a fix and will continue to monitor the results to ensure the behavior is appropriately addressed.
7/31/2024 7:21 PM
We are aware of an issue currently impacting GCC customers regarding access to Content Packages.
We are actively investigating and will update this page as soon as possible.
For current system status information about LMS365, check out our system status page. During an incident, you can also receive status updates by subscribing to updates available on our status page. The summary of our post-mortem investigation is usually posted here a few days after the incident has ended. If you have additional questions about this incident, please log a ticket with us.