Metakgp:Incident Reports/2019-07-17 Visual Editor stopped working for a few hours
Impact
Visual Editor was unusable for about 15 hours on 2019-07-17, from approximately 01:00 to 16:00 IST.
Trigger
01:10 Release of PR 49
Detection
06:45 Detected during post-release testing of the MediaWiki upgrade to v1.33
> visual editor still not running
Timeline
Notes:
- Dates and times must always be entered in India Standard Time (UTC +5:30)
- Event (column 3) must be written in the present tense
Date | Time | Event | Notes |
---|---|---|---|
2019-07-17 | 01:10 | PR 49 is released | links is a deprecated Docker feature, and PR 49 replaced it with the recommended alternative, networks. However, the networks were not configured correctly: the parsoid container could not connect to the nginx container, and parsoid needs this connection to work. |
2019-07-17 | 01:15 | [INCIDENT BEGINS] Visual editor becomes unusable | |
2019-07-17 | 06:45 | [INCIDENT DETECTED] Visual editor errors are noticed for the first time | The error log from the parsoid container clearly says that the nginx container is not accessible |
2019-07-17 | 16:10 | [INCIDENT MITIGATED] PR 64 is released | This PR puts the parsoid and nginx containers in the same network, hence enabling communication between them |
2019-07-17 | 16:12 | [INCIDENT ENDS] Visual editor is usable again | Verified both as an anonymous user and as a logged-in user, with and without captcha |
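The failure in the timeline comes down to whether the hostname `nginx` resolves from inside the parsoid container. A minimal sketch of such a check follows, assuming a Python interpreter is available wherever it is run; the function name `can_resolve` is hypothetical and is not part of the Metakgp setup or of any PR mentioned here.

```python
import socket


def can_resolve(hostname: str, port: int = 80) -> bool:
    """Return True if `hostname` resolves from the current network namespace.

    Hypothetical helper: it performs the same kind of lookup (getaddrinfo)
    that failed for parsoid, so a False result would correspond to the
    ENOTFOUND error seen during the incident.
    """
    try:
        socket.getaddrinfo(hostname, port)
        return True
    except socket.gaierror:
        # Name resolution failed, e.g. the two containers do not share a network.
        return False


if __name__ == "__main__":
    # "nginx" is the hostname from the parsoid error message.
    # The result depends on where this runs, so no expected output is shown.
    print(can_resolve("nginx"))
```

Run from inside the parsoid container, a check like this could serve as a quick post-release test for a change that touches container networking.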
Incident Analysis
What went well? | What went wrong? | Where did we get lucky? |
---|---|---|
The incident was detected during routine post-release testing for a subsequent PR | Several PRs were released together or in quick succession, which made it hard to detect this problem right after PR 49 was released | Parsoid's error message made the reason for the error apparent |
Bug fix PR 64 was approved quickly and released the same day | | |
Notes / Discussion
Error message inside the parsoid container
{ "name":"parsoid", "hostname":"a815b3504bee", "pid":36, "level":60, "err":{ "message":"Config Request failure for \"http://nginx/api.php\": Error: getaddrinfo ENOTFOUND nginx nginx:80", "name":"lib/index.js"
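The `message` field above names the host that could not be resolved. As an illustrative sketch only, the snippet below pulls that hostname out of such a message with a regular expression. The helper name `unresolved_host` and the pattern are assumptions made for this example, not part of parsoid; the sample string is the `message` value from the log line above with its JSON escaping removed.

```python
import re
from typing import Optional

# Matches the Node.js DNS failure text and captures the hostname that follows it.
_ENOTFOUND = re.compile(r"getaddrinfo ENOTFOUND (\S+)")


def unresolved_host(message: str) -> Optional[str]:
    """Return the hostname from a 'getaddrinfo ENOTFOUND' message, or None."""
    match = _ENOTFOUND.search(message)
    return match.group(1) if match else None


# The "message" value from the parsoid log entry, unescaped.
message = (
    'Config Request failure for "http://nginx/api.php": '
    "Error: getaddrinfo ENOTFOUND nginx nginx:80"
)

print(unresolved_host(message))  # prints: nginx
```

A check along these lines could be pointed at the parsoid logs after a release to flag name-resolution failures early.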