tag:status.pronto.io,2005:/historyPronto Status - Incident History2024-03-28T23:23:34-06:00Prontotag:status.pronto.io,2005:Incident/203185912024-03-21T07:29:14-06:002024-03-21T07:29:14-06:00Database Maintenance<p><small>Mar <var data-var='date'>21</var>, <var data-var='time'>07:29</var> MDT</small><br><strong>Resolved</strong> - We've wrapped up the loose ends on this database maintenance. Everything looks good and the Pronto platform is fully functional. Thanks for your patience.</p><p><small>Mar <var data-var='date'>21</var>, <var data-var='time'>05:09</var> MDT</small><br><strong>Identified</strong> - The database maintenance is taking longer than expected. Service may be unreliable until it is complete. We apologize for the disruption; our team is working as quickly as possible to restore full service.</p>tag:status.pronto.io,2005:Incident/203130312024-03-21T04:00:56-06:002024-03-21T04:00:56-06:00Database maintenance<p><small>Mar <var data-var='date'>21</var>, <var data-var='time'>04:00</var> MDT</small><br><strong>Completed</strong> - The scheduled maintenance has been completed.</p><p><small>Mar <var data-var='date'>21</var>, <var data-var='time'>02:00</var> MDT</small><br><strong>In progress</strong> - Scheduled maintenance is currently in progress. We will provide updates as necessary.</p><p><small>Mar <var data-var='date'>20</var>, <var data-var='time'>13:09</var> MDT</small><br><strong>Scheduled</strong> - Due to the outage yesterday, we will be performing high-priority database maintenance early tomorrow morning to increase stability and performance. This will result in downtime across the Pronto platform for a period of up to 2 hours from 02:00-04:00 MDT. 
If you have any questions or concerns please contact support@pronto.io.</p>tag:status.pronto.io,2005:Incident/203031982024-03-19T13:34:50-06:002024-03-19T13:34:50-06:00Database issues<p><small>Mar <var data-var='date'>19</var>, <var data-var='time'>13:34</var> MDT</small><br><strong>Resolved</strong> - The issue has been fully resolved. Thanks for your patience.</p><p><small>Mar <var data-var='date'>19</var>, <var data-var='time'>11:52</var> MDT</small><br><strong>Monitoring</strong> - A fix has been implemented and we are monitoring for stability.</p><p><small>Mar <var data-var='date'>19</var>, <var data-var='time'>11:41</var> MDT</small><br><strong>Investigating</strong> - There is a connectivity issue with our primary database. We are working urgently with our database vendor to understand the issue and get it fixed as soon as possible. We apologize for the issues and will send out another update as soon as we know more.</p>tag:status.pronto.io,2005:Incident/185598032023-09-20T16:25:30-06:002023-09-20T16:25:30-06:00Pronto System Outage<p><small>Sep <var data-var='date'>20</var>, <var data-var='time'>16:25</var> MDT</small><br><strong>Resolved</strong> - This incident has been resolved.</p><p><small>Sep <var data-var='date'>20</var>, <var data-var='time'>15:20</var> MDT</small><br><strong>Update</strong> - The system has recovered and is currently operational. We will continue to monitor the database and work to identify the cause.</p><p><small>Sep <var data-var='date'>20</var>, <var data-var='time'>15:20</var> MDT</small><br><strong>Monitoring</strong> - We have cleared out the hung queries and performance seems to be back to normal. We will continue monitoring for the next little while to ensure that the problem is resolved.</p><p><small>Sep <var data-var='date'>20</var>, <var data-var='time'>15:12</var> MDT</small><br><strong>Investigating</strong> - Database queries have begun to hang for an unknown reason and have resulted in system downtime. 
We are investigating this right now with our database vendor and will update here as soon as we know more. We apologize for the downtime.</p>tag:status.pronto.io,2005:Incident/184402462023-09-11T00:01:22-06:002023-09-11T00:01:22-06:00Realtime Messaging Maintenance (Phase 2)<p><small>Sep <var data-var='date'>11</var>, <var data-var='time'>00:01</var> MDT</small><br><strong>Completed</strong> - The scheduled maintenance has been completed.</p><p><small>Sep <var data-var='date'>10</var>, <var data-var='time'>22:00</var> MDT</small><br><strong>In progress</strong> - Scheduled maintenance is currently in progress. We will provide updates as necessary.</p><p><small>Sep <var data-var='date'> 8</var>, <var data-var='time'>15:23</var> MDT</small><br><strong>Scheduled</strong> - Our realtime messaging provider has scheduled a second window of urgent maintenance on their system to solve the problems of the past few days. During this time realtime messaging may be unstable. Messages can still be sent in Pronto, but unread badge counts and realtime messages appearing in chat may be affected. Usually a refresh of the browser or restarting the mobile app will reload any missed messages.</p>tag:status.pronto.io,2005:Incident/184402212023-09-09T00:01:33-06:002023-09-09T00:01:33-06:00Realtime Messaging Maintenance<p><small>Sep <var data-var='date'> 9</var>, <var data-var='time'>00:01</var> MDT</small><br><strong>Completed</strong> - The scheduled maintenance has been completed.</p><p><small>Sep <var data-var='date'> 8</var>, <var data-var='time'>22:00</var> MDT</small><br><strong>In progress</strong> - Scheduled maintenance is currently in progress. We will provide updates as necessary.</p><p><small>Sep <var data-var='date'> 8</var>, <var data-var='time'>15:20</var> MDT</small><br><strong>Scheduled</strong> - Our realtime messaging provider has scheduled urgent maintenance on their system to solve the problems of the past few days. 
During this time realtime messaging may be unstable. Messages can still be sent in Pronto, but unread badge counts and realtime messages appearing in chat may be affected. Usually a refresh of the browser or restarting the mobile app will reload any missed messages.</p>tag:status.pronto.io,2005:Incident/184226212023-09-07T09:28:27-06:002023-09-07T10:19:34-06:00Real-time web sockets issue<p><small>Sep <var data-var='date'> 7</var>, <var data-var='time'>09:28</var> MDT</small><br><strong>Resolved</strong> - This incident has been resolved.</p><p><small>Sep <var data-var='date'> 7</var>, <var data-var='time'>08:42</var> MDT</small><br><strong>Monitoring</strong> - Our provider has implemented a fix and we are monitoring the results.</p><p><small>Sep <var data-var='date'> 7</var>, <var data-var='time'>08:34</var> MDT</small><br><strong>Investigating</strong> - We are again seeing increased error rates and problems connecting to our real-time messaging service provider. They are investigating the issue.</p>tag:status.pronto.io,2005:Incident/184095762023-09-06T15:30:31-06:002023-09-06T15:30:31-06:00Real-time web sockets issue<p><small>Sep <var data-var='date'> 6</var>, <var data-var='time'>15:30</var> MDT</small><br><strong>Resolved</strong> - A fix has been implemented and we are now seeing normal connection rates. Thanks for your patience. As we gather more details about what happened we will share them in a post-mortem.</p><p><small>Sep <var data-var='date'> 6</var>, <var data-var='time'>14:36</var> MDT</small><br><strong>Update</strong> - Our provider has made strides in fixing the issue but it is not completely resolved yet. We continue to see sporadic connection issues. These issues may be resolved temporarily by refreshing your web browser or by restarting your app (on mobile). 
We will continue to post updates as we know more.</p><p><small>Sep <var data-var='date'> 6</var>, <var data-var='time'>11:01</var> MDT</small><br><strong>Update</strong> - Overall things are working better, but we are still seeing sporadic connection issues. We continue to work with our provider to identify root cause. We apologize for the disruption today. We will continue to post updates as we get them.</p><p><small>Sep <var data-var='date'> 6</var>, <var data-var='time'>09:01</var> MDT</small><br><strong>Investigating</strong> - We are again seeing increased error rates and problems connecting to our real-time messaging service provider. This issue seems to be identical to the one we experienced yesterday. Our service provider has acknowledged the issue and is working on a resolution. Thank you for your patience.</p>tag:status.pronto.io,2005:Incident/184000972023-09-05T18:44:29-06:002023-09-05T18:44:29-06:00Pronto system outage<p><small>Sep <var data-var='date'> 5</var>, <var data-var='time'>18:44</var> MDT</small><br><strong>Resolved</strong> - Our service provider has marked this issue as resolved. Our systems look good and everything is functioning as normal. Thank you for your patience.</p><p><small>Sep <var data-var='date'> 5</var>, <var data-var='time'>17:03</var> MDT</small><br><strong>Monitoring</strong> - Our service provider seems to have resolved the issue and Pronto is currently functional. We will continue to monitor the situation and await further updates from them until they have confirmed the solution.</p><p><small>Sep <var data-var='date'> 5</var>, <var data-var='time'>15:52</var> MDT</small><br><strong>Identified</strong> - We are again experiencing the same issue as before, causing a service disruption to Pronto. 
We will continue to monitor the situation and update when we have further information.</p><p><small>Sep <var data-var='date'> 5</var>, <var data-var='time'>15:31</var> MDT</small><br><strong>Monitoring</strong> - Our service provider seems to have resolved the issue and Pronto is currently functional. We will continue to monitor the situation and await further updates from them until they have confirmed the solution.</p><p><small>Sep <var data-var='date'> 5</var>, <var data-var='time'>15:16</var> MDT</small><br><strong>Identified</strong> - Our real-time messaging service provider is currently experiencing a major system outage. This event is also affecting Pronto. They are aware of the issue and are currently working to resolve it as quickly as possible.</p>tag:status.pronto.io,2005:Incident/172200832023-05-11T13:32:41-06:002023-05-11T13:32:41-06:00Pronto system outage<p><small>May <var data-var='date'>11</var>, <var data-var='time'>13:32</var> MDT</small><br><strong>Resolved</strong> - We are confident that the system is now fully back to normal. We will be working with our database vendor further to understand how to avoid this situation during future migrations. Thank you for your patience.</p><p><small>May <var data-var='date'>11</var>, <var data-var='time'>10:26</var> MDT</small><br><strong>Monitoring</strong> - The issue appears to be that when the migration was performed some unexpected database locks caused certain queries to hang, blocking other queries from running. We have cleared out the hung queries and performance seems to be back to normal. We will continue monitoring for the next little while to ensure that the problem is resolved.</p><p><small>May <var data-var='date'>11</var>, <var data-var='time'>10:13</var> MDT</small><br><strong>Investigating</strong> - After the team performed a routine database migration, queries began to hang for an unknown reason and have resulted in system downtime. 
We are investigating this right now with our database vendor and will update here as soon as we know more. We apologize for the downtime.</p>tag:status.pronto.io,2005:Incident/161398142023-02-14T13:44:28-07:002023-02-14T15:14:38-07:00Slow performance and request failures<p><small>Feb <var data-var='date'>14</var>, <var data-var='time'>13:44</var> MST</small><br><strong>Resolved</strong> - We identified a rarely occurring slow database query that caused a cascading effect on the database. We have restored the production database performance and are in the process of testing a permanent fix.</p><p><small>Feb <var data-var='date'>14</var>, <var data-var='time'>13:34</var> MST</small><br><strong>Investigating</strong> - We are currently investigating an issue where requests are failing. We will update this incident as we learn more.</p>tag:status.pronto.io,2005:Incident/127556022022-11-08T11:22:14-07:002022-11-08T11:22:14-07:00Real-time web sockets issue<p><small>Nov <var data-var='date'> 8</var>, <var data-var='time'>11:22</var> MST</small><br><strong>Resolved</strong> - This incident has been resolved.</p><p><small>Nov <var data-var='date'> 8</var>, <var data-var='time'>10:39</var> MST</small><br><strong>Update</strong> - We are continuing to monitor for any further issues.</p><p><small>Nov <var data-var='date'> 8</var>, <var data-var='time'>09:03</var> MST</small><br><strong>Monitoring</strong> - Performance seems to be improving. 
We are continuing to monitor and will post any further updates until the issue is resolved.</p><p><small>Nov <var data-var='date'> 8</var>, <var data-var='time'>08:48</var> MST</small><br><strong>Identified</strong> - The vendor has acknowledged there's an error on their side and they're working to resolve it.</p><p><small>Nov <var data-var='date'> 8</var>, <var data-var='time'>08:45</var> MST</small><br><strong>Investigating</strong> - Our websocket provider is currently experiencing an outage that is affecting sending, receiving, and loading messages. Customers may see prolonged progress spinners and message send errors. We are in contact with our vendor and will provide updates here as we get them.</p>tag:status.pronto.io,2005:Incident/127377712022-11-06T19:14:17-07:002022-11-06T19:14:17-07:00Message send errors<p><small>Nov <var data-var='date'> 6</var>, <var data-var='time'>19:14</var> MST</small><br><strong>Resolved</strong> - The vendor has updated their SSL cert and all is back to normal.</p><p><small>Nov <var data-var='date'> 6</var>, <var data-var='time'>17:53</var> MST</small><br><strong>Identified</strong> - The SSL certificate of one of our vendors expired. We are contacting them so they can update it.</p><p><small>Nov <var data-var='date'> 6</var>, <var data-var='time'>17:35</var> MST</small><br><strong>Investigating</strong> - Attempting to send a message is resulting in a server error. Meetings may also be affected. The team is investigating.</p>tag:status.pronto.io,2005:Incident/107837532022-08-08T10:28:21-06:002022-08-08T10:28:21-06:00Failed web deployment<p><small>Aug <var data-var='date'> 8</var>, <var data-var='time'>10:28</var> MDT</small><br><strong>Resolved</strong> - The deployment has been successfully rolled back. 
We will be taking a look at what happened before re-deploying.</p><p><small>Aug <var data-var='date'> 8</var>, <var data-var='time'>10:13</var> MDT</small><br><strong>Identified</strong> - We are currently performing a rollback to the previous version which will take a few minutes.</p><p><small>Aug <var data-var='date'> 8</var>, <var data-var='time'>10:01</var> MDT</small><br><strong>Investigating</strong> - During an attempt to deploy new changes to the Pronto web app, there was a failure resulting in blank screens and loading errors. We are currently investigating. The mobile app and APIs are not affected.</p>tag:status.pronto.io,2005:Incident/107606362022-08-04T10:42:15-06:002022-08-04T10:42:15-06:00Real-time web sockets issue<p><small>Aug <var data-var='date'> 4</var>, <var data-var='time'>10:42</var> MDT</small><br><strong>Resolved</strong> - This incident has been resolved.</p><p><small>Aug <var data-var='date'> 4</var>, <var data-var='time'>09:33</var> MDT</small><br><strong>Update</strong> - We are seeing some elevated error rates once again and are working with our vendor.</p><p><small>Aug <var data-var='date'> 4</var>, <var data-var='time'>09:26</var> MDT</small><br><strong>Monitoring</strong> - Our websockets vendor has implemented a fix and we are monitoring the results. Right now the system appears to be back to normal.</p><p><small>Aug <var data-var='date'> 4</var>, <var data-var='time'>09:23</var> MDT</small><br><strong>Update</strong> - We are continuing to investigate this issue.</p><p><small>Aug <var data-var='date'> 4</var>, <var data-var='time'>09:22</var> MDT</small><br><strong>Investigating</strong> - Our websocket provider is currently experiencing an outage that is affecting sending, receiving, and loading messages. Customers may see prolonged progress spinners and message send errors. 
We are in contact with our vendor and will provide updates here as we get them.</p>tag:status.pronto.io,2005:Incident/106700202022-07-22T18:30:25-06:002022-07-22T18:30:26-06:00Notification delay<p><small>Jul <var data-var='date'>22</var>, <var data-var='time'>18:30</var> MDT</small><br><strong>Resolved</strong> - Since our last update we have been carefully monitoring notification delays and the numbers look great and have stayed that way for about 4 hours now. We believe that this issue is now fully resolved and will close this incident. The root cause was that a database upgrade appears to have reset some of the optimizations that were used to ensure fast database access. After reconfiguring the database with the proper optimizations, performance returned to normal levels.</p><p><small>Jul <var data-var='date'>22</var>, <var data-var='time'>13:57</var> MDT</small><br><strong>Monitoring</strong> - After making some more database optimizations all our numbers are looking good again. We will continue monitoring for the next few hours, but for now notifications are back to normal delivery times.</p><p><small>Jul <var data-var='date'>22</var>, <var data-var='time'>10:07</var> MDT</small><br><strong>Investigating</strong> - We are once again experiencing delayed notifications, due to some slow database queries. We are working again with our vendor to investigate and will update here as we learn more. We sincerely apologize for the ongoing issue. The team is working diligently to solve this issue for good.</p>tag:status.pronto.io,2005:Incident/106640842022-07-21T20:42:54-06:002022-07-21T20:42:54-06:00Delayed notifications<p><small>Jul <var data-var='date'>21</var>, <var data-var='time'>20:42</var> MDT</small><br><strong>Resolved</strong> - Notifications have returned to normal after the fixes made by our database vendor. We will continue to monitor notification delay over the next 24 hours to ensure no further issues. 
Thanks again for your patience.</p><p><small>Jul <var data-var='date'>21</var>, <var data-var='time'>19:02</var> MDT</small><br><strong>Monitoring</strong> - Notification delivery times have returned to normal after our database vendor applied some optimizations. Initial findings indicate a positive impact on all affected database operations. We will continue to monitor the results.<br /><br />Thank you for your patience.</p><p><small>Jul <var data-var='date'>21</var>, <var data-var='time'>17:32</var> MDT</small><br><strong>Update</strong> - Overall notification delays have improved, but we are still seeing some periodic spikes. We continue to investigate and will update here again once we know more. Thank you for your patience.</p><p><small>Jul <var data-var='date'>21</var>, <var data-var='time'>14:26</var> MDT</small><br><strong>Investigating</strong> - We are currently experiencing issues with notifications being delayed. We believe this is due to a database being slower than normal after an upgrade. We are working with our database vendor to identify and solve the issue. The rest of the Pronto platform is working normally.</p>tag:status.pronto.io,2005:Incident/104446822022-06-30T12:07:15-06:002022-06-30T12:08:33-06:00Platform issues<p><small>Jun <var data-var='date'>30</var>, <var data-var='time'>12:07</var> MDT</small><br><strong>Resolved</strong> - After reviewing logs and metrics, all evidence points to a temporary, transient network issue within AWS that lasted from 16:53 - 17:00 UTC. We have opened a case with AWS support and will update with a post-mortem if there are any changes to our assessment.</p><p><small>Jun <var data-var='date'>30</var>, <var data-var='time'>11:30</var> MDT</small><br><strong>Update</strong> - We are continuing to investigate this issue.</p><p><small>Jun <var data-var='date'>30</var>, <var data-var='time'>11:00</var> MDT</small><br><strong>Update</strong> - The Pronto service is now back up. 
We are still investigating the cause.</p><p><small>Jun <var data-var='date'>30</var>, <var data-var='time'>10:53</var> MDT</small><br><strong>Investigating</strong> - Most requests are currently failing to the backend API servers. We are actively investigating and will update here once we know more.</p>tag:status.pronto.io,2005:Incident/100678702022-05-29T02:00:28-06:002022-05-29T00:34:28-06:00Database maintenance<p><small>May <var data-var='date'>29</var>, <var data-var='time'>02:00</var> MDT</small><br><strong>Completed</strong> - The scheduled maintenance has been completed.</p><p><small>May <var data-var='date'>29</var>, <var data-var='time'>00:01</var> MDT</small><br><strong>In progress</strong> - Scheduled maintenance is currently in progress. We will provide updates as necessary.</p><p><small>May <var data-var='date'>24</var>, <var data-var='time'>16:05</var> MDT</small><br><strong>Scheduled</strong> - We will be performing database maintenance to enhance overall application stability and performance. Pronto will be unavailable during this short window.</p>tag:status.pronto.io,2005:Incident/97417732022-04-17T03:26:30-06:002022-04-17T03:26:30-06:00Database maintenance<p><small>Apr <var data-var='date'>17</var>, <var data-var='time'>03:26</var> MDT</small><br><strong>Completed</strong> - The scheduled maintenance has been completed.</p><p><small>Apr <var data-var='date'>17</var>, <var data-var='time'>02:50</var> MDT</small><br><strong>Verifying</strong> - Verification is currently underway for the maintenance items.</p><p><small>Apr <var data-var='date'>17</var>, <var data-var='time'>02:00</var> MDT</small><br><strong>In progress</strong> - Scheduled maintenance is currently in progress. 
We will provide updates as necessary.</p><p><small>Apr <var data-var='date'> 8</var>, <var data-var='time'>14:03</var> MDT</small><br><strong>Scheduled</strong> - We will be performing maintenance on the Pronto platform during a 4-hour window on Sunday, April 17, from 2am - 6am MDT. Pronto services may be unavailable during this time. The purpose of this maintenance window is to make some database changes that will improve performance and long-term maintainability.</p>tag:status.pronto.io,2005:Incident/91559822022-01-24T19:14:52-07:002022-01-24T19:14:52-07:00Database issue<p><small>Jan <var data-var='date'>24</var>, <var data-var='time'>19:14</var> MST</small><br><strong>Resolved</strong> - The Pronto platform is back to normal. We will be analyzing data from this incident over the coming days to understand what went wrong and how to better prevent a similar event in the future.</p><p><small>Jan <var data-var='date'>24</var>, <var data-var='time'>18:15</var> MST</small><br><strong>Monitoring</strong> - Some of the affected components have been restored and we are starting to see some improvement. The Pronto service is back up, but may have degraded performance for a little while as the other components are restored.</p><p><small>Jan <var data-var='date'>24</var>, <var data-var='time'>17:50</var> MST</small><br><strong>Identified</strong> - As mentioned in the previous update, the issue is due to some underlying hardware failures. Our database vendor is working to move the affected components to new hardware.</p><p><small>Jan <var data-var='date'>24</var>, <var data-var='time'>17:31</var> MST</small><br><strong>Update</strong> - Our database vendor has identified a hardware issue in the database cluster and is working to remediate it.</p><p><small>Jan <var data-var='date'>24</var>, <var data-var='time'>16:35</var> MST</small><br><strong>Update</strong> - Our backend database provider has acknowledged an issue on their platform and is working to diagnose it. 
We will continue to update here as soon as we know more.</p><p><small>Jan <var data-var='date'>24</var>, <var data-var='time'>16:17</var> MST</small><br><strong>Investigating</strong> - We have been alerted to an issue with our backend database that is causing a system outage. We are currently investigating.</p>tag:status.pronto.io,2005:Incident/89539592021-12-28T20:09:05-07:002021-12-28T20:22:31-07:00Degraded performance on user queries<p><small>Dec <var data-var='date'>28</var>, <var data-var='time'>20:09</var> MST</small><br><strong>Resolved</strong> - Our workarounds have been deployed successfully and performance has returned to normal. Both User Search and User Count endpoints have been re-enabled and all Pronto functionality is restored. At this point we believe the root cause was a database-level bug. We are working with our database vendor to confirm the bug and get a permanent fix. In the meantime, now that we know how to work around it, we expect no further problems from this.</p><p><small>Dec <var data-var='date'>28</var>, <var data-var='time'>18:26</var> MST</small><br><strong>Update</strong> - We are continuing to work on a fix for this issue. We have deployed some workarounds and seen some promising results. We are working to identify and address the remaining slow areas using a similar approach. We will provide another update after testing and deploying those additional workarounds.</p><p><small>Dec <var data-var='date'>28</var>, <var data-var='time'>15:59</var> MST</small><br><strong>Identified</strong> - We've identified the very slow queries and are attempting some workarounds to speed them up. 
It is still unclear what the root cause of the slow queries is, but we are hopeful that this workaround will resolve the immediate issues to get performance back to normal and allow us to reenable all endpoints.</p><p><small>Dec <var data-var='date'>28</var>, <var data-var='time'>13:53</var> MST</small><br><strong>Update</strong> - We have temporarily disabled two endpoints that are causing the issue in order to protect the rest of the application. These two endpoints are:<br /><br />1. User search - Attempting to search for a user in any context will not work. This includes starting a new DM (existing DMs are unaffected), adding new users to a group, or searching for users in org management.<br />2. User counts - In org management the overall user count will not be displayed<br /><br />We apologize for this loss of functionality, but deemed it necessary in order to prevent further issues across the Pronto app. We are working with our database vendor directly to diagnose and solve this issue as quickly as possible. Thanks for your patience.</p><p><small>Dec <var data-var='date'>28</var>, <var data-var='time'>13:31</var> MST</small><br><strong>Investigating</strong> - We are currently seeing degraded performance on user-related queries in the main Pronto database. User searches in org management and also in the client apps are currently taking a long time or timing out. User online status is also affected and may not be reflecting the correct state right now. We are actively investigating and will update as soon as we know more.</p>tag:status.pronto.io,2005:Incident/87838092021-12-07T18:36:41-07:002021-12-07T18:36:41-07:00AWS outage<p><small>Dec <var data-var='date'> 7</var>, <var data-var='time'>18:36</var> MST</small><br><strong>Resolved</strong> - All Pronto services are now back to normal. Push notifications are now being delivered in real-time and other async jobs such as URL previews are speedy once again. Canvas integration has also been re-enabled. 
Canvas course syncing will need some time to catch up, but should be up to date for all customers within the next 6 hours. Thank you for your patience today. We will spend some time analyzing this event to see what changes we can make to be more resilient to a similar failure in the future.</p><p><small>Dec <var data-var='date'> 7</var>, <var data-var='time'>17:07</var> MST</small><br><strong>Update</strong> - AWS has implemented their root cause mitigation plan and core Pronto services are once again working well. We are still experiencing some minor latency with push notifications as scaling on that service has not yet been restored by AWS engineers. Canvas integration is also still disabled for the same reason. We are hopeful that these issues will both be resolved quickly.</p><p><small>Dec <var data-var='date'> 7</var>, <var data-var='time'>15:29</var> MST</small><br><strong>Monitoring</strong> - As AWS starts to see significant recovery, we also are seeing some Pronto services scaling up again. Push notifications are still delayed, but response times are improving on the core Pronto services. Canvas integration is still disabled. We will continue to provide updates as services recover.</p><p><small>Dec <var data-var='date'> 7</var>, <var data-var='time'>14:24</var> MST</small><br><strong>Update</strong> - We just saw a major increase in traffic from an integration platform, perhaps as it itself was recovering. This caused our small cluster to get overloaded. To mitigate we have temporarily disabled the Canvas integration platform until we are once again able to scale the Pronto services. 
This mitigation appears to have worked and Pronto core services are back up, albeit with slower response times than normal.</p><p><small>Dec <var data-var='date'> 7</var>, <var data-var='time'>14:17</var> MST</small><br><strong>Update</strong> - We are continuing to work on a fix for this issue.</p><p><small>Dec <var data-var='date'> 7</var>, <var data-var='time'>14:05</var> MST</small><br><strong>Update</strong> - As expected, traffic increases finally pushed Pronto over the edge and we are now experiencing a system-wide outage due to our inability to scale because of the AWS outage. We will continue to do whatever we can within our power to bring Pronto back up. We sincerely apologize for the disruption we know this is causing you.</p><p><small>Dec <var data-var='date'> 7</var>, <var data-var='time'>12:15</var> MST</small><br><strong>Update</strong> - AWS says they are starting to see some signs of recovery, but do not have an ETA for full recovery at this time. We have tried various ways to scale Pronto servers, but because AWS internal APIs are failing this has not been successful. Thus, Pronto is currently running on less than half the capacity we normally would have at this time of day. Push notifications continue to be delayed, and general response times are increasing. We expect that if AWS has not recovered their services in the next hour we will start to see much higher latency and an increase in error rates on Pronto core services. We will continue to explore alternatives in the meantime and will keep you up to date. Thanks for your patience.</p><p><small>Dec <var data-var='date'> 7</var>, <var data-var='time'>11:15</var> MST</small><br><strong>Identified</strong> - AWS has identified the root cause and is working towards recovery. 
Pronto core services are still running smoothly for now (except for delays in push notifications and other async jobs as noted in the last update), but because of the outage we are unable to automatically or manually scale up our servers as we normally would. As traffic increases in the next couple of hours this could result in slower response times across Pronto services. We are investigating alternative ways to scale up our servers in the meantime and will continue to keep you updated.</p><p><small>Dec <var data-var='date'> 7</var>, <var data-var='time'>10:48</var> MST</small><br><strong>Investigating</strong> - There seem to be problems in the us-east-1 AWS region resulting in some services being slow or having increased error rates. Core Pronto services are not currently impacted, but push notifications and other async jobs such as URL previews may be delayed. We are monitoring the situation and will post updates as we learn more. AWS status is available here: https://status.aws.amazon.com/</p>tag:status.pronto.io,2005:Incident/82285232021-10-23T05:12:18-06:002021-10-23T05:12:18-06:00Database upgrades<p><small>Oct <var data-var='date'>23</var>, <var data-var='time'>05:12</var> MDT</small><br><strong>Completed</strong> - Our database upgrades have been completed and verified. We will continue to monitor throughout the day. Have a great weekend!</p><p><small>Oct <var data-var='date'>23</var>, <var data-var='time'>05:06</var> MDT</small><br><strong>Update</strong> - All Pronto services have been verified and are back online. We are now monitoring performance to ensure everything is running smoothly.</p><p><small>Oct <var data-var='date'>23</var>, <var data-var='time'>04:48</var> MDT</small><br><strong>Verifying</strong> - We have finished our database updates. 
We are now in the process of verification.</p><p><small>Oct <var data-var='date'>23</var>, <var data-var='time'>04:00</var> MDT</small><br><strong>In progress</strong> - Scheduled maintenance is currently in progress. We will provide updates as necessary.</p><p><small>Oct <var data-var='date'>14</var>, <var data-var='time'>07:53</var> MDT</small><br><strong>Scheduled</strong> - We will be performing maintenance on the Pronto platform during a 3 hour window on Saturday, October 23 from 4am - 7am MDT. All Pronto services will be unavailable during this time. The goal of this maintenance window is to increase database resiliency and reliability to be better prepared to withstand a provider outage like the one that happened on Sep 26th. Pronto operations engineers have prepared carefully and have contingency rollback plans in place. There is no risk to customer data.</p>tag:status.pronto.io,2005:Incident/82285702021-10-15T05:08:27-06:002021-10-15T05:08:27-06:00Database configuration<p><small>Oct <var data-var='date'>15</var>, <var data-var='time'>05:08</var> MDT</small><br><strong>Completed</strong> - All systems are back to normal after the change was applied. We will continue to monitor throughout the day.</p><p><small>Oct <var data-var='date'>15</var>, <var data-var='time'>05:03</var> MDT</small><br><strong>Verifying</strong> - The database configuration change has been applied and we're running checks.</p><p><small>Oct <var data-var='date'>15</var>, <var data-var='time'>05:00</var> MDT</small><br><strong>In progress</strong> - Scheduled maintenance is currently in progress. We will provide updates as necessary.</p><p><small>Oct <var data-var='date'>14</var>, <var data-var='time'>08:00</var> MDT</small><br><strong>Scheduled</strong> - We will be performing a small change to our current database configuration in preparation for next week's larger database upgrade. 
This configuration requires a restart of the main database cluster, during which time the Pronto platform will be unavailable. The downtime should last no more than 15 minutes.</p>