Release Note: Resolved issue with absence attendance records not being generated correctly

On August 3, 2015, we were alerted that the system was not generating absence attendance records correctly. Our investigation showed that this was an unnoticed aftereffect of the July 31 outage.

We learned that when Amazon Web Services (AWS) went down, a locking mechanism that normally prevents certain types of background jobs from running simultaneously was left in a locked state, which prevented those jobs from running at all. As a result, a backlog of what we call “Reminder” jobs built up unnoticed.

Once we identified the root cause, we re-ran the missing jobs and regenerated the absence records. No data was lost; the records were simply unavailable for the 3 days between the AWS outage and the fix on August 3, 2015.

To avoid similar issues in the future, we implemented the following measures in the system:

We replaced the locking mechanism on the jobs with one that has a proper timeout. Even if a job were to get stuck again, the lock would expire a few minutes later instead of blocking other jobs.
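For readers curious about the general idea, a lock with a timeout can be sketched roughly as below. This is only an illustration, not our production code: the class name `ExpiringLock`, the `ttl_seconds` parameter, and the in-memory storage are all assumptions made for the example.

```python
import threading
import time
import uuid


class ExpiringLock:
    """Hypothetical in-memory lock whose hold expires after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self._clock = clock              # injectable so expiry can be tested
        self._guard = threading.Lock()   # protects the fields below
        self._owner = None
        self._expires_at = 0.0

    def acquire(self):
        """Return an owner token if the lock is free or expired, else None."""
        with self._guard:
            now = self._clock()
            if self._owner is not None and now < self._expires_at:
                return None  # someone else holds an unexpired lock
            # Free, or the previous holder's lease ran out: take it over.
            self._owner = uuid.uuid4().hex
            self._expires_at = now + self.ttl_seconds
            return self._owner

    def release(self, token):
        """Release only if the caller still owns the lock."""
        with self._guard:
            if token is not None and token == self._owner:
                self._owner = None
                return True
            return False
```

The key property is that a holder that disappears without releasing (for example, during an outage) only blocks others until the timeout passes. The owner token keeps a late-returning job from releasing a lock that has since been taken over by another job.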

We added alerting that fires when a large number of “Reminder” jobs are waiting, so that we will notice such issues in the future.
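A backlog alert of this kind can be as simple as comparing the number of waiting jobs against a threshold. The sketch below is illustrative only; the function name `check_backlog`, the `threshold` value, and the `notify` callback are assumptions for the example rather than a description of our actual monitoring setup.

```python
def check_backlog(pending_count, threshold, notify):
    """Call notify(message) if pending_count exceeds threshold.

    Returns True when an alert was sent, False otherwise.
    All names here are hypothetical.
    """
    if pending_count > threshold:
        notify(f"{pending_count} Reminder jobs waiting (threshold {threshold})")
        return True
    return False
```

In practice a check like this would run on a schedule, with `notify` wired to whatever channel the team already watches.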

If you have any questions, please let us know by clicking here.

— Piotr Banasik
