Each site subdomain is mapped to a unique Unix username.

Subdomain limits:

  • 253 characters total (including the top-level domain)
  • allowed characters: a-z, A-Z, 0-9 and "-"

Unix username limits:

  • 32-character limit (some tools support longer names, but not all of them)
  • recommended to match: [a-z][a-z0-9-]*
  • may be user-visible (e.g., for ssh)
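
The two sets of limits above can be checked mechanically. The following is a minimal sketch, not the actual implementation: the function and constant names are made up for illustration, and it assumes the 253-character limit applies to the subdomain joined to its parent domain with a dot.

```python
import re

# Limits taken from the two lists above; all names here are hypothetical.
SUBDOMAIN_RE = re.compile(r"[a-zA-Z0-9-]+")
USERNAME_RE = re.compile(r"[a-z][a-z0-9-]*")
MAX_HOSTNAME_LEN = 253
MAX_USERNAME_LEN = 32


def is_valid_subdomain(subdomain: str, domain: str) -> bool:
    """Check the character set and the total length including the domain."""
    full_name = f"{subdomain}.{domain}"
    return (
        SUBDOMAIN_RE.fullmatch(subdomain) is not None
        and len(full_name) <= MAX_HOSTNAME_LEN
    )


def is_valid_username(username: str) -> bool:
    """Check the 32-character limit and the recommended pattern."""
    return (
        len(username) <= MAX_USERNAME_LEN
        and USERNAME_RE.fullmatch(username) is not None
    )
```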

Format for short subdomains

This format will be used most of the time. It makes it easy to see, for example, which hostname a process is running for.

First 2 characters: "b-". This prefix avoids conflicts with system usernames and briefly indicates the top-level domain name.

Remaining 30 characters: the subdomain, if it is short enough.
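
As a rough sketch of the short format (the function name is hypothetical, and lowercasing the subdomain is an assumption made here so that the result fits the recommended username pattern):

```python
PREFIX = "b-"
MAX_USERNAME_LEN = 32


def short_username(subdomain: str):
    """Return "b-" + subdomain, or None if it does not fit in 32 characters."""
    name = subdomain.lower()  # assumption: usernames are kept lowercase
    if len(name) > MAX_USERNAME_LEN - len(PREFIX):
        return None  # too long: a hash-based format is needed instead
    return PREFIX + name
```

A return value of None signals that the subdomain is longer than 30 characters and needs one of the hash formats.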

Format with hash for longer subdomains

Any hash has the potential for collisions. For relatively short DNS subdomains drawn from a limited character set, the chance of a collision is very low. In the unlikely case that one happens, the system should just print an error, and the user can pick a different subdomain. This means that the system has to explicitly check for a hash collision at site creation time.
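
The creation-time check could look roughly like the sketch below. The in-memory dict stands in for whatever store really holds the username assignments, and the names UsernameCollision and claim_username are invented for this example.

```python
class UsernameCollision(Exception):
    """Raised when two different subdomains map to the same username."""


def claim_username(taken: dict, subdomain: str, username: str) -> None:
    """Record username -> subdomain, refusing a clash with another site.

    `taken` is a hypothetical stand-in for the real site database.
    """
    owner = taken.get(username)
    if owner is not None and owner != subdomain:
        raise UsernameCollision(
            f"username {username!r} is already used by {owner!r}; "
            "please pick a different subdomain"
        )
    taken[username] = subdomain
```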

I think it follows that we do not have to choose a hash that is cryptographically strong.
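
To get a feel for how rare accidental collisions are, the usual birthday approximation can be evaluated for different hash sizes. This is only an estimate: it assumes hash values are uniformly distributed and says nothing about collisions crafted on purpose. The function name is made up for this sketch.

```python
import math


def collision_probability(n_sites: int, hash_bits: int) -> float:
    """Approximate chance that any two of n_sites share a hash value.

    Birthday approximation: p = 1 - exp(-n * (n - 1) / (2 * 2**bits)).
    """
    exponent = -n_sites * (n_sites - 1) / (2 * 2**hash_bits)
    return -math.expm1(exponent)  # expm1 keeps precision for tiny values
```

For one million sites and a 64-bit hash, this comes out on the order of 1e-8, which is consistent with treating a collision as an error case rather than something to design around.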

md5_base64, with + replaced by _, and / by -

  • 22 characters
  • could spell a rude word, if unlucky
  • other 10 characters could be "b5-" + <7 characters of subdomain> to avoid any overlap with other usernames, encode the hash type, and make the subdomain semi-discoverable
  • small potential for collisions
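
A sketch of this variant follows. It assumes that the hash is taken over the lowercased subdomain and that the 7 visible characters are the first 7 of the subdomain; neither detail is specified above, and the function name is hypothetical.

```python
import base64
import hashlib


def md5_base64_username(subdomain: str) -> str:
    """Build "b5-" + 7 subdomain characters + 22 hash characters."""
    name = subdomain.lower()  # assumption: hash the lowercased subdomain
    digest = hashlib.md5(name.encode("ascii")).digest()
    # 16 bytes encode to 24 base64 characters, 2 of which are "=" padding.
    b64 = base64.b64encode(digest).decode("ascii").rstrip("=")
    # Substitutions as described above: "+" becomes "_", "/" becomes "-".
    b64 = b64.replace("+", "_").replace("/", "-")
    return "b5-" + name[:7] + b64  # assumption: first 7 characters
```

Note that the base64 part can contain uppercase letters and "_", so usernames in this format fall outside the recommended [a-z][a-z0-9-]* pattern.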

md5_hex

  • 32 characters
  • no rude words
  • small potential for overlap with (32 character only!) system usernames
  • small potential for collisions

sha1_base64, with + replaced by _, and / by -

  • 27 characters
  • could spell a rude word, if unlucky
  • other 5 characters could be "bsh1-" to avoid any overlap with other usernames, and encode the hash type (or "b1-" + <2 characters of subdomain>, but 2 characters are too few to be useful)
  • smaller potential for collisions (160-bit hash rather than 128-bit)
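
The sha1 variant with the "bsh1-" prefix might be sketched as follows, again assuming the hash is taken over the lowercased subdomain (the function name is hypothetical):

```python
import base64
import hashlib


def sha1_base64_username(subdomain: str) -> str:
    """Build "bsh1-" + 27 hash characters (32 characters in total)."""
    name = subdomain.lower()  # assumption: hash the lowercased subdomain
    digest = hashlib.sha1(name.encode("ascii")).digest()
    # 20 bytes encode to 28 base64 characters, 1 of which is "=" padding.
    b64 = base64.b64encode(digest).decode("ascii").rstrip("=")
    b64 = b64.replace("+", "_").replace("/", "-")
    return "bsh1-" + b64
```

Unlike the md5_base64 variant, nothing of the subdomain itself remains visible in the username.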

half of md5_hex

  • 16 characters
  • no rude words
  • other 16 characters can be "b5-" + <12 characters of subdomain> + "-" to avoid any overlap with other usernames, encode the hash type, and make the subdomain semi-discoverable
  • larger potential for collisions, as only 64 bits of the hash are kept
  • this is the format currently in use