This is a TPP-related bug that shows up in an IP failover situation.
At some customer sites, failover is configured in such a way that, when the primary machine fails, the cluster manager starts the daemons on the secondary and fails over the IP address along with them.
The daemon (e.g. mom) is therefore started on another host by the cluster manager but keeps the same IP address as the primary, because the IP address itself is failed over; this is the usual way cluster managers perform failover.
Because the primary machine went down abruptly, the mom's TCP connection to pbs_comm has not yet been broken, so when a new connection arrives from the restarted mom, pbs_comm keeps rejecting it, saying that the IP address is already registered.
The proposed solution fixes the situation as follows:
Without the fix, when a connection arrives, pbs_comm checks whether the IP address is already registered (and the registration is still in place) and, if so, drops the new connection.
With the fix, instead of dropping the new connection, pbs_comm will close the older connection so that the new connection can be accepted.
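To make the difference between the two behaviours concrete, here is a minimal sketch in Python. It is not the pbs_comm code: the Registry class, the replace_stale flag, and the integer connection ids are all hypothetical names invented for illustration, and "closing" a connection is modelled simply by recording its id.

```python
class Registry:
    """Hypothetical stand-in for pbs_comm's table of registered IP addresses."""

    def __init__(self, replace_stale=True):
        # replace_stale=False models the old behaviour, True the proposed one.
        self.replace_stale = replace_stale
        self.by_addr = {}   # IP address -> id of the connection that registered it
        self.closed = []    # ids of connections closed to make room for a newcomer

    def register(self, addr, conn_id):
        """Return True if the new connection is accepted, False if it is dropped."""
        old = self.by_addr.get(addr)
        if old is not None:
            if not self.replace_stale:
                # Old behaviour: the address is already registered, drop the newcomer.
                return False
            # Proposed behaviour: close the older connection instead.
            self.closed.append(old)
        self.by_addr[addr] = conn_id
        return True
```

In this sketch, a restarted mom that reconnects from the failed-over address is refused for as long as the stale entry exists when replace_stale is False, whereas with replace_stale set to True the stale entry is replaced and the new connection is accepted immediately.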
Please review the interface change document for the above solution.