Thank you for the response. Here is what we have for those lines:
preemptive_sched: true ALL
I think we have preemption configured correctly for the goals we are trying to achieve. As I mentioned, preemption works correctly when the rerunnable flag is not set. It is only when I set -r n on the preemptible job that preemption stops working. I don't want to say for sure that this is a bug, but I have tested this multiple times, and setting -r n somehow prevents the job from being preempted.
Could this be related to our not having a checkpoint script, since our preempt_order is set to CR? Maybe PBS Pro is just taking longer to preempt the job because it tries to checkpoint the job first and is not able to?
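For reference, here is a sketch of how the two scheduler settings mentioned above would sit together in sched_config. Only the values come from our setup. The quoting and the comments are my assumption of the usual syntax, not a copy of our file.

```
# Enable preemption for all scheduling cycles
preemptive_sched: true ALL

# Preemption methods, tried in order: C = checkpoint, R = requeue
# (S, suspend, is not in our list)
preempt_order: "CR"
```

If I am reading this right, a job submitted with -r n and without a checkpoint script might have neither method available to it, which would fit what I am seeing.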
I can provide job scripts if you want to try to reproduce the problem.