PP-832: Which scheduler to talk to while taking over from Primary


Hi All,

PP-832 describes an issue in a failover setup: while taking over from the primary, the secondary checks only once whether it can communicate with the scheduler on the primary host, and then proceeds based on that single result.

I have written a summary of this issue and proposed two approaches to fix it over here.

I request the community to provide feedback.



Thanks for writing the EDD. I vote for solution 1. The second step in this solution is:

  • Any time the scheduler on the primary goes down, the secondary server will spawn a local scheduler.

I am assuming the primary server is up in this case even though the scheduler went down. If so, why can’t the primary server itself start the scheduler locally? That way we don’t lose fairshare usage for one cycle, the other scheduler configuration, etc.

If the second step in solution 1 applies only when the secondary becomes active, then we need to rephrase the above statement to reflect this.
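
To make the two readings of that step concrete, here is a rough Python sketch of the decision taken once the scheduler on the primary is found to be down. It is only an illustration of the question, not PBS code: the names `Spawner`, `spawner_as_written`, and `spawner_as_suggested` are hypothetical, and the "nobody" case is my assumption for the situation where neither server is serving yet.

```python
from enum import Enum


class Spawner(Enum):
    """Which side starts a scheduler after the primary's scheduler goes down."""
    PRIMARY = "primary"
    SECONDARY = "secondary"
    NOBODY = "nobody"


def spawner_as_written() -> Spawner:
    # Reading of the EDD statement as written: the secondary always
    # spawns a local scheduler when the primary's scheduler goes down.
    return Spawner.SECONDARY


def spawner_as_suggested(primary_server_up: bool, secondary_active: bool) -> Spawner:
    # Suggested reading: the secondary spawns a scheduler only when it is
    # the active server; otherwise a live primary restarts its own.
    if secondary_active:
        return Spawner.SECONDARY
    if primary_server_up:
        return Spawner.PRIMARY
    # Assumption: neither server is serving, so failover must happen first.
    return Spawner.NOBODY


if __name__ == "__main__":
    print(spawner_as_written().value)
    print(spawner_as_suggested(primary_server_up=True, secondary_active=False).value)
```

The two functions differ only when the primary server is still up and the secondary has not taken over, which is exactly the case I am asking about.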

Also, Multisched is already checked in. If we are not considering Multisched here, the EDD should mention that this applies only to the default scheduler for now, and that it can be enhanced to cover multiple schedulers as part of the Multisched failover interface.