Unix Timestamp Compression
Using Base58, Base32, and Base16 for Azure Resource Names
I'm working on some wrapper scripts that create Azure resources using the Azure CLI, and they use part of the Unix timestamp in the resource names. The resource names follow a specific convention and must be unique. On top of Azure's inconsistent naming rules, the generated timestamp_id must be sortable and shorter than the Unix timestamp. A 10-digit, or 32-bit, timestamp is sufficient for our needs.
<abrev><3_digit_code><product><env><timestamp_id>
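As a rough illustration of the pattern above, the parts can be joined and checked against a length limit. This is only a sketch: build_resource_name, its parameter names, and the way the sample name is split into parts are assumptions, and the default limit of 24 characters applies only to some resource types.

```python
def build_resource_name(abbrev: str, code: str, product: str, env: str,
                        timestamp_id: str, max_length: int = 24) -> str:
    """Join the name parts in convention order and enforce a length limit."""
    name = f"{abbrev}{code}{product}{env}{timestamp_id}"
    if len(name) > max_length:
        raise ValueError(
            f"{name!r} is {len(name)} characters, the limit is {max_length}"
        )
    return name


# Hypothetical split of a name that follows the convention.
example_name = build_resource_name("x", "123", "bob", "prod", "4Kk7tC")
print(example_name, len(example_name))
```

Keeping the limit as a parameter makes it easy to pass a different value for resource types with other rules.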
Base58
For example, x123bobprod4Kk7tC uses Base58 encoding, which works for most Azure resources, but not all. Some resources (Storage Account and KeyVault) are also restricted to 24 characters in length. Base58 excludes some ambiguous characters (0, O, l, I). We use a byte_size of 4 to account for the 32-bit Unix timestamp. The byteorder can be either big or little endian. The encoded string is ASCII, which means each character takes 8 bits in Python, whereas UTF-8 is variable between 1 and 4 bytes per character (the first 128 characters are encoded with 8 bits).
encoded_base58 = base58.b58encode(timestamp.to_bytes(byte_size, byteorder=byte_order)).decode()
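To make the Base58 step concrete without the third-party base58 package, here is a minimal pure-Python sketch. The helper names b58encode and b58decode are defined locally for illustration, and the alphabet is assumed to be the Bitcoin-style one that the package uses by default; the byte order of big is also just a choice for this example.

```python
import time

# Bitcoin-style Base58 alphabet: no 0, O, I, or l.
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(data: bytes) -> str:
    """Encode bytes as Base58, keeping each leading zero byte as '1'."""
    number = int.from_bytes(data, byteorder="big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = ALPHABET[remainder] + encoded
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def b58decode(text: str) -> bytes:
    """Reverse b58encode, restoring leading zero bytes."""
    number = 0
    for char in text:
        number = number * 58 + ALPHABET.index(char)
    leading_zeros = len(text) - len(text.lstrip("1"))
    body = number.to_bytes((number.bit_length() + 7) // 8, byteorder="big")
    return b"\x00" * leading_zeros + body


byte_size = 4
byte_order = "big"  # assumption for this sketch
timestamp = int(time.time())
timestamp_id = b58encode(timestamp.to_bytes(byte_size, byteorder=byte_order))
print(timestamp_id, len(timestamp_id))
```

Since 58**6 is larger than 2**32, a 4-byte value never needs more than 6 Base58 characters, which matches the length of the ID in the sample name.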
Base16
We can use Base16, or hexadecimal, encoding, which would give us a timestamp_id of 82022B29 with length 8:
encoded_base16 = base64.b16encode(shortened_timestamp.to_bytes(byte_size, byteorder=byte_order)).decode()
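A self-contained version of the Base16 step might look like the following. The input value and the byte order are assumptions chosen because they reproduce the ID quoted above; the actual values used by the scripts are not stated.

```python
import base64

byte_size = 4
byte_order = "little"            # assumption: reproduces the quoted ID
shortened_timestamp = 690684546  # assumption: 1690684546 without its leading 1

raw = shortened_timestamp.to_bytes(byte_size, byteorder=byte_order)
encoded_base16 = base64.b16encode(raw).decode()
print(encoded_base16)  # 82022B29

# Reverse the steps to recover the integer.
decoded = int.from_bytes(base64.b16decode(encoded_base16), byteorder=byte_order)
print(decoded)  # 690684546
```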
Base32
With Base32 encoding we also get a timestamp_id with length 8, one character of which is padding. The character set can be written in either all uppercase or all lowercase, using the 26 letters and the 6 digits 2-7. As a string, the 7 significant characters use up 56 bits to encode the original 32-bit integer, instead of the 64 bits that the 8 Base16 characters need.
encoded_base32 = base64.b32encode(timestamp.to_bytes(byte_size, byteorder=byte_order)).decode().lower()
This would look like QIBCWKI= before lowercasing. We get a padding character, and it will be there until the timestamp outgrows 4 bytes, which for an unsigned value is 2106; the rollover to an 11-digit Unix timestamp won't happen until 2286... yeah, these scripts will still be relevant. If that were the case, we would increase to a byte_size of 5 and see what else we can skimp on. Maybe Microsoft Azure will change to less restrictive naming conventions by that time.
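The Base32 behaviour described here can be checked with the standard library alone. In this sketch the input value and byte order are assumptions chosen to reproduce the quoted string, and dropping the padding character is only one possible way to handle it, not something the scripts are stated to do.

```python
import base64

byte_size = 4
byte_order = "little"            # assumption: reproduces the quoted string
shortened_timestamp = 690684546  # assumption: example input

raw = shortened_timestamp.to_bytes(byte_size, byteorder=byte_order)
encoded_base32 = base64.b32encode(raw).decode()
print(encoded_base32)  # QIBCWKI=

# Lowercase the ID and drop the padding (an assumption, see above).
timestamp_id = encoded_base32.lower().rstrip("=")
print(timestamp_id)  # qibcwki

# Decoding needs the padding restored to a multiple of 8 characters.
padded = timestamp_id + "=" * (-len(timestamp_id) % 8)
decoded = int.from_bytes(
    base64.b32decode(padded, casefold=True), byteorder=byte_order
)
print(decoded)  # 690684546

# Five bytes are exactly 40 bits, so 8 Base32 characters with no padding.
five_byte_id = base64.b32encode((10**10).to_bytes(5, byteorder="big")).decode()
print(five_byte_id, len(five_byte_id))
```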
Considerations
The other alternative I was looking at was to shrink the original timestamp. That is, we can strip the first digit of the timestamp, giving us 9 digits, and assume it's a 1, since it will be 2033 before it increments to 2. This just shaves 2 bits off, giving us a 30-bit value, but it still yields an 8-character Base32 string with a padding character.
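The arithmetic behind this can be verified in a few lines. The sample timestamp is an assumption, and using a modulo to strip the leading digit is just one way to do it.

```python
import base64

timestamp = 1690684546  # assumption: example 10-digit Unix timestamp

# Strip the leading digit, which is assumed to be 1.
shortened_timestamp = timestamp % 10**9
print(shortened_timestamp)  # 690684546

# The largest 9-digit number fits in 30 bits.
print((10**9 - 1).bit_length())  # 30

# Restoring the original value only requires adding the assumed digit back.
restored = 10**9 + shortened_timestamp
print(restored == timestamp)  # True

# A 30-bit value still occupies 4 bytes, so Base32 output stays at
# 7 characters plus 1 padding character.
encoded = base64.b32encode(
    shortened_timestamp.to_bytes(4, byteorder="big")
).decode()
print(len(encoded), encoded.endswith("="))  # 8 True
```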