Unix Timestamp Compression

I'm working on some wrapper scripts to create Azure resources using Azure CLI and it uses part of the Unix timestamp in the resource names. The resource names follow a specific convention and must be unique. Along with Azure's inconsistent naming convention rules, the generated timestamp_id must be sortable and less in length than the Unix timestamp. A 10-digit or 32-bit timestamp is sufficient for our needs.

<abrev><3_digit_code><product><env><timestamp_id>

Base58

For example, x123bobprod4Kk7tC uses Base58 encoding works for most Azure resources, but not all. Some resources (Storage Account and KeyVault) are also restricted to 24 characters in length. Base58 excludes some ambiguous characters (0, O, l, I). We use a byte_size of 4 to account for the 32-bit Unix timestamp. The byteorder can either be big or little endianness. We use an encoding of ASCII which means each character is 8-bits in Python. Where UTF-8 is variable between 1-4 bytes (the first 128 characters are encoded with 8-bits.

encoded_base58 = base58.b58encode(timestamp.to_bytes(byte_size
    , byteorder=byte_order)).decode()

Boo

Base16

We can use Base16 or Hexadecimal encoding which would give us a timestamp_id of 82022B29 with length 8:

encoded_base16 = base64.b16encode(shortened_timestamp.to_bytes(byte_size
    , byteorder=byte_order)).decode()

Base32

With Base32 encoding we also get a timestamp_id with length 8, but since the character set it is encoded to is either all uppercase using the 24 letters and 10 digits we use up 56-bits to encode the original 32-bit integer as a string, instead of 64-bits.

encoded_base32 = base64.b32encode(timestamp.to_bytes(byte_size
    , byteorder=byte_order)).decode().lower()

This would look like QIBCWKI=, we get a padding character, but it will be there until we roll over to an 11-digit Unix timestamp which won't happen until 2340ish... yeah these scripts will still be relevant. 😉

If that were the case we would increase to a byte-size of 5 and see what else we can skimp on. Maybe Microsoft Azure will change to less restrictive naming conventions by that time.

Considerations

The other alternative I was looking at was to shrink the original timestamp. That is we can assume to strip the first digit of the timestamp giving us 9 digits and assuming it's a 1 since it will be sometime in 2055 before it increments. This just shaves 2-bits off giving us a 30-bit, but still yields a Base32 8-character string with padding character.

Unix Timestamp Compression

Using Base58, Base32, and Base16 for Azure Resource Names

Base58

Base16

Base32

Considerations

Repl

References