Failed assertion in vaccination list monitor

Project: NAADSM
Version: 3.2.17
Component: Code
Category: bug report
Priority: high
Assigned: Neil Harvey
Status: active

The scenario in the attached ZIP file crashes on occasion with the following error message:

ERROR:../models/vaccination-list-monitor.c:321:
vaccination_list_monitor_handle_vaccination_canceled_event:
assertion failed: (p != NULL)

Description
Attachment: vaccListAssertionFailure.zip
Size: 1.15 MB

Comments

#1

Title: Failed assertion in vaccination_list_monitor » Failed assertion in vaccination list monitor

#2

Assigned to: guest » Neil Harvey

Working on this one... trying to find an RNG seed that will reproduce the error. Tried seeds 1-200 so far with no crash; it must be quite a rare error condition.

#3

Even though I could run this scenario to completion on my laptop (as noted in the comment above), it would always crash shortly after starting on Sharcnet.

The crashes were occurring in GLib memory-handling statements, so I updated the GLib libraries to 2.28.7, but that did not help.

I noticed that the memory use (reported in the post-job email sent from Sharcnet) always seemed to be about the same. So I tried using the --mpp option to sqsub (the Sharcnet job-submission command) to say that the program needs more memory. Sure enough, the higher the value I specified for memory use, the more days the simulation ran before aborting.

It appears that Sharcnet has started enforcing a user-stated memory limit in addition to a user-stated runtime limit. I was not aware they were doing that. And it seems that if a program exceeds the user-stated memory limit, the resulting abort looks just like an ordinary segmentation fault.
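
To illustrate why an exceeded memory limit can look like a plain crash, here is a minimal sketch. It assumes the limit behaves like a cap on the process's address space (`RLIMIT_AS` on Linux); how Sharcnet actually enforces the `--mpp` value is not known here, and the helper names `cap_address_space` and `try_alloc` are made up for this example. Once the cap is reached, `malloc` returns NULL, and code that dereferences the result without checking gets a segmentation fault. GLib's `g_malloc` aborts the program when an allocation fails, which would fit crashes showing up in GLib memory-handling calls.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/resource.h>

/* Hypothetical helper: lower the soft limit on this process's virtual
 * address space to `bytes`. Returns 0 on success, -1 on failure. */
int cap_address_space(rlim_t bytes)
{
    struct rlimit lim;

    assert(bytes > 0);
    if (getrlimit(RLIMIT_AS, &lim) != 0)
        return -1;
    lim.rlim_cur = bytes;            /* the hard limit is left unchanged */
    return setrlimit(RLIMIT_AS, &lim);
}

/* Hypothetical helper: try to allocate and touch `bytes` of memory.
 * Returns 1 if the allocation succeeded, 0 if malloc returned NULL. */
int try_alloc(size_t bytes)
{
    /* volatile keeps the compiler from optimizing the allocation away */
    char *volatile p = malloc(bytes);

    if (p == NULL)
        return 0;                    /* limit reached: nothing was allocated */
    memset(p, 0, bytes);             /* touch the memory so it is really used */
    free(p);
    return 1;
}
```

With the cap set to, say, 256 MB, a 1 MB request should still succeed while a 1 GB request fails. A program that skips the NULL check would crash at the first write to the failed allocation, with nothing in the crash itself pointing at a memory limit.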

I notified Jaleal, Ellie, and Margaret of this. I don't know yet whether using the --mpp option to raise the memory limit will solve the particular failed assertion in issue #2513, but at least I'm once again able to do test runs of this scenario in the Sharcnet environment.

#4

Seeds 67 and 105 seem to produce this error condition on Sharcnet when used with the scenario "KSv23ID492".

Update - I was mistaken - still haven't found a seed to produce this error.