PP-289: unique job ids up to 1 trillion


I like the idea of using the stdint types. Since we don’t have negative sequence numbers, we could use uint64_t to take the maximum from 9,223,372,036,854,775,807 (INT64_MAX) to 18,446,744,073,709,551,615 (UINT64_MAX).

The upper bound really controls how many job IDs can be active (queued, running, held, exiting, etc.) at any given time. This will always be something less than infinity, so I think having an upper bound is fine for our purposes.

Any decimal number beyond about ten digits starts getting a little intimidating for us puny humans. Even with hexadecimal we’re looking at a string of 16 hex digits to represent UINT64_MAX. That’s a little better than 20, but not much. Even with hexatrigesimal (base 36) it’s still 13 characters, with the possibility of some vulgar words embedded.


Hi All,
Thanks for the input. Based on the recent replies, we can consider the following approaches:

Approach 1: Use long long with the upper bound.
Approach 2: Use int64_t. The issue here might be that Microsoft Visual Studio is not fully C99 compliant, so we are not sure whether int64_t is supported.
Approach 3: Use long long or unsigned long long with no upper bound. Let it go to the max value and wrap around.
Approach 4: Use long or unsigned long with no upper bound, ignoring the 32-bit limitation. On both architectures the value wraps around on reaching its max, though the max values differ.
Approach 5: Use long or unsigned long with an upper bound, again ignoring the 32-bit limitation. On 64-bit, wrap-around happens at the upper bound, while on 32-bit it happens earlier, on reaching the type's max value.

As of now we are working on “Approach 1”.
Please vote on which approach we should go ahead with, or let me know if I have missed something.


FWIW, I wrote a small program and ran it on a VM to see just how long it would take to increment a uint64_t and measure the amount of time it took to consume each bit in the variable. Obviously, performance will vary, but it gives an idea of how large the domain should be. I gave up after 48 bits…

$ gcc -Ofast -o counter counter.c
$ ./counter
2 (1 bits) (0 seconds)
4 (2 bits) (0 seconds)
8 (3 bits) (0 seconds)
16 (4 bits) (0 seconds)
32 (5 bits) (0 seconds)
64 (6 bits) (0 seconds)
128 (7 bits) (0 seconds)
256 (8 bits) (0 seconds)
512 (9 bits) (0 seconds)
1024 (10 bits) (0 seconds)
2048 (11 bits) (0 seconds)
4096 (12 bits) (0 seconds)
8192 (13 bits) (0 seconds)
16384 (14 bits) (0 seconds)
32768 (15 bits) (0 seconds)
65536 (16 bits) (0 seconds)
131072 (17 bits) (0 seconds)
262144 (18 bits) (0 seconds)
524288 (19 bits) (0 seconds)
1048576 (20 bits) (0 seconds)
2097152 (21 bits) (0 seconds)
4194304 (22 bits) (0 seconds)
8388608 (23 bits) (0 seconds)
16777216 (24 bits) (0 seconds)
33554432 (25 bits) (0 seconds)
67108864 (26 bits) (0 seconds)
134217728 (27 bits) (0 seconds)
268435456 (28 bits) (0 seconds)
536870912 (29 bits) (0 seconds)
1073741824 (30 bits) (0 seconds)
2147483648 (31 bits) (1 seconds)
4294967296 (32 bits) (2 seconds)
8589934592 (33 bits) (5 seconds)
17179869184 (34 bits) (10 seconds)
34359738368 (35 bits) (19 seconds)
68719476736 (36 bits) (39 seconds)
137438953472 (37 bits) (77 seconds)
274877906944 (38 bits) (155 seconds)
549755813888 (39 bits) (310 seconds)
1099511627776 (40 bits) (619 seconds)
2199023255552 (41 bits) (1239 seconds)
4398046511104 (42 bits) (2485 seconds)
8796093022208 (43 bits) (4963 seconds)
17592186044416 (44 bits) (9914 seconds)
35184372088832 (45 bits) (19812 seconds)
70368744177664 (46 bits) (39610 seconds)
140737488355328 (47 bits) (79361 seconds)
281474976710656 (48 bits) (158542 seconds)


@varunsonkar - sorry, I only looked at your reply now. Yes, internal variables like sv_jobidnumber exist today since we increment the jobid inside the PBS server. However, if we loaded the next ID from a database sequence we could easily read that large number as a string…so internally we would never deal with the ID as a number at all. If and when we move towards a multiple server approach, we would need to get the job ID sequence etc. from the database anyway instead of incrementing inside the server.

Anyway, for the time being, loading the sv_jobidnumber from the database into an int64_t would be fine. The postgres field corresponding to sv_jobidnumber is defined as integer, which can only hold this range:

integer | 4 bytes | typical choice for integer | -2147483648 to +2147483647


So, along with changing the variable in C, we will need to change the database column type to bigint. This will also need an ALTER statement for upgrades.


Hi @subhasisb,
Thanks for the reply.
Yes, we will change the database column type to “BIGINT”. We will also handle the cases you mentioned, such as upgrades.


Thanks Varun.

I think we do not need to validate the implementation here. The only real question I saw in this discussion was whether we need to support 1 trillion jobs for a 32-bit build as well, and I believe I heard the answer as “yes”. I believe we need to support it for Windows, where we currently build PBS in 32-bit mode.

As for how to hold a 64-bit value properly in the implementation, that should be adequately covered during code review.

Given the above, I sign off on the design.


Hi All,
I have updated the EDD to mention the limit on the length of “job name”, which is affected by this implementation. Please review the updated EDD and provide comments/sign-off.


Hi @mkaro,
As per our discussion, we will go with the “unsigned int64_t” approach.
Please have a look at the EDD and provide your sign-off/comments.


@varunsonkar: Please use uint64_t as opposed to “unsigned int64_t”. Otherwise, that sounds fine.


Thanks @mkaro,
We will use “uint64_t” while implementing.


EDD looks good to me. I sign off.


At the PBS Pro User Group meeting I got face-to-face feedback expressing a strong desire not to have to deal with job IDs longer than ~7 digits. Users and admins regularly have to speak the job ID numbers out loud, and it is bothersome when they get long. The customer in question has flipped the job ID counter multiple times in the past few years and basically wants the ability to preserve existing behavior. Can we add the max_sequence_id interface from v1 of the EDD back in (updated with the direction of the discussion around uint64_t)?


That looks like a good interface that meets the customer need. That said, I don’t see a discussion here of why it was dropped. Was there a specific reason?