Alternate security model to CLUSTERADDR2

Hi All,

This proposal is to eliminate the use of IS_CLUSTERADDR because of its scalability issues and to replace it with an alternate scheme.

You can find my initial solution on this page:

https://pbspro.atlassian.net/wiki/spaces/PD/pages/1320878087/Alternate+security+model+to+CLUSTERADDR2

Let me know your comments.

Could you please describe any plans for backward compatibility and how a site would migrate from the old mechanism? Will they need to drain the cluster, or can this be done while jobs are running?

The packets have a fixed content for a given IP address, making spoofing easier. Just wait for a node to go down (e.g., for provisioning) and grab its IP address.

I think it would be better to generate a packet containing several bytes of random data, followed by a timestamp, then the IP address. Take this and encrypt it with the shared key and send that.

The receiver decrypts it using the shared key, ignores the random data, and validates the timestamp and IP address.

Replays are caught by the timestamp. And every packet is unique, making partial-known-plaintext attacks harder.
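
As a rough illustration of the packet layout described above (random bytes, then a timestamp, then the IP address), here is a minimal Python sketch. It is not PBS code. One deliberate swap: Python's standard library has no symmetric cipher, so this sketch authenticates the packet with an HMAC tag over the shared key instead of encrypting it. The names `build_packet` and `verify_packet`, the 8-byte nonce, the 300-second skew window, and the IPv4-only format are all assumptions made for the example.

```python
import hashlib
import hmac
import os
import socket
import struct
import time
from typing import Optional

NONCE_LEN = 8        # assumed number of random bytes at the front
TS_LEN = 8           # timestamp packed as an unsigned 64-bit integer
IP_LEN = 4           # IPv4 only in this sketch
TAG_LEN = 32         # SHA-256 digest size
MAX_SKEW_SECS = 300  # assumed tolerance between sender and receiver clocks


def build_packet(key: bytes, ip: str, now: Optional[float] = None) -> bytes:
    """Return nonce || timestamp || IPv4 address || HMAC tag."""
    ts = int(time.time() if now is None else now)
    body = os.urandom(NONCE_LEN) + struct.pack("!Q", ts) + socket.inet_aton(ip)
    tag = hmac.new(key, body, hashlib.sha256).digest()
    return body + tag


def verify_packet(key: bytes, packet: bytes, sender_ip: str,
                  now: Optional[float] = None) -> bool:
    """Check the tag, then the claimed IP address and the timestamp window."""
    if len(packet) != NONCE_LEN + TS_LEN + IP_LEN + TAG_LEN:
        return False
    body, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return False
    # The random bytes are ignored; they only make every packet unique.
    (ts,) = struct.unpack("!Q", body[NONCE_LEN:NONCE_LEN + TS_LEN])
    ip = socket.inet_ntoa(body[NONCE_LEN + TS_LEN:])
    current = time.time() if now is None else now
    return ip == sender_ip and abs(current - ts) <= MAX_SKEW_SECS
```

A sender would call `build_packet(shared_key, its_own_ip)`, and the receiver would call `verify_packet(shared_key, packet, ip_seen_on_the_connection)`. A replay is only rejected once it falls outside the skew window, so the window size is a trade-off between clock tolerance and replay exposure.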

Hey @mkaro,

This is only the proposal for replacing the CLUSTERADDR functionality. At the time of implementation we can circle back to what is required. Technically, it is possible to support both protocols, old and new, since we will version this as well, as part of the new PING protocol.

Hi @dtalcott,

Yes, you are right about the possibility of IP spoofing. However, IP spoofing from outside a subnet is a bit difficult: you can send a message to the target, but the reply will usually make its way back to the real owner of that IP (or to nobody), since the routers will do their job (unless you have access to the routers themselves). Nevertheless, it is possible.

Our view was that this only replaces the CLUSTERADDR2 message. The overall IM exchange sequence is protected by other authentication mechanisms, such as munge (and, in future, TLS etc.). This is, therefore, not a replacement for authentication, just a replacement for the CLUSTERADDR2 message.

That said, we can introduce a timestamp. Our feeling was that this would need the clocks to be pretty well synchronized. Is that something that can be enforced? Of course, I think munge mandates that anyway…

If mandating that clocks be synchronized is not an issue, then we can definitely use a timestamp.

Of course, we could also store the last packet's timestamp and check that the incoming timestamp is greater than the stored one; that does not need the clocks to be synced.
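
To make the last-timestamp idea concrete, here is a small Python sketch, assuming the receiver keeps one stored timestamp per sender IP. The class name `ReplayGuard` and its `accept` method are made up for this example and are not part of PBS.

```python
from typing import Dict


class ReplayGuard:
    """Remember the newest timestamp seen from each sender (hypothetical helper)."""

    def __init__(self) -> None:
        self._last_seen: Dict[str, int] = {}

    def accept(self, sender_ip: str, timestamp: int) -> bool:
        """Accept a packet only if its timestamp is strictly newer than the stored one."""
        last = self._last_seen.get(sender_ip)
        if last is not None and timestamp <= last:
            return False  # replayed or out-of-order packet
        self._last_seen[sender_ip] = timestamp
        return True
```

Only the sender's own clock matters here, so the two hosts need not agree on the time. The catch is that the first packet from any sender is always accepted, and the stored values are lost if the receiver restarts unless they are persisted somewhere.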

I’ve thought about this more and have decided the original proposal is adequate. The only purpose of the CLUSTERADDR2 message is to add an IP address to a Mom’s list of in-the-same-cluster hosts. It does not guarantee that the (current) holder of that IP address is not a rogue of some kind.

Note that this new method does not protect a Mom from a Mom that was removed from the server via qmgr. The old Mom still has the secret, so it can send valid CLUSTERADDR2 messages. Under the old method, the server would push out a new list with that old Mom removed.

In a similar vein, is this a risk in a cloud environment where the same image might be used for multiple instances, and thus the same secret? Could a Mom from one instance add itself to the Moms in another instance if it knew their IP addresses?

Yes, you are right.

The shared key should be removed from the Mom once the node is removed from the cluster, to prevent it from falling into the wrong hands.
In a cloud environment, the key should be added and removed on the fly; making the key part of the cloud image is insecure.