Internationalized time point unique time zone abbreviations

http://mm.icann.org/pipermail/tz/2012-June/018020.html

Introduction

If time zone abbreviations are unique at any given point in time, they can be used to find IANA time zones faster. The current abbreviations are not unique: IST, for example, could stand for Indian Standard Time, Iran Standard Time, Ireland Standard Time, or Israel Standard Time (and there has been talk about a single Indonesian Standard Time¹ to replace the country's three time zones).

The English abbreviation can also conflict with the native-language abbreviation. The abbreviation for Asia/Jakarta is WIT (Western Indonesian Time, UTC+7), but in Indonesian, WIT stands for Waktu Indonesia Timur (Eastern Indonesian Time, UTC+9), while Western Indonesian Time is abbreviated as WIB (Waktu Indonesia Barat).

Since the IANA time zone database cutoff point is 1970-01-01T00:00:00Z and the ISO 3166 alpha-2 country codes were first published in the early 1970s, one could easily use the country codes to keep the abbreviation sets of different countries apart.

One rule sometimes used in the IANA time zone database for creating new abbreviations is to use the ISO alpha-2 country codes. The idea presented here applies such a rule rigidly.

Apart from that, the idea presented also applies rigid rules for marking offset changes such as daylight saving time, and uses a fixed length of five characters in long mode and four characters in short mode.

Features

If the first two letters are an ISO 3166-1 alpha-2 country code, one can:

derive the country of the time zone to which the abbreviation belongs from the abbreviation and from knowledge of the current ISO country code for that country. This is not possible with current IANA time zone database abbreviations like CET, IST, EDT, GALT, EAST, CT (e.g. Cuba Time), CUT (1924 Central Ukraine Time).
group the abbreviations by country, by alphabetic sorting of the abbreviations alone. This is not possible with current IANA time zone database abbreviations like CST (US), CT (Cuba), PST (US).

If the year is known, one can identify the time zone for any given abbreviation. This is not possible with current IANA time zone database abbreviations like IST.

The system reduces the number of IANA time zones that some abbreviations can refer to, e.g. IST is used for both Asia/Jerusalem and Asia/Kolkata, while INCT would refer only to Asia/Kolkata.
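The grouping feature can be illustrated with a minimal sketch. It assumes the abbreviations follow the proposed format, so that the first two characters are the country code; the helper name country_of is hypothetical and the sample list is taken from the examples in this document.

```python
from itertools import groupby

# Proposed abbreviations taken from the examples in this document.
abbreviations = ["USPT", "CAET", "USET", "CLCT", "USCT", "CLET", "CUCT"]


def country_of(abbreviation: str) -> str:
    """Return the two-letter prefix (component CO) of a proposed abbreviation."""
    return abbreviation[:2]


# Plain alphabetic sorting already places the abbreviations of one country
# next to each other, so groupby can collect them in a single pass.
grouped = {
    country: list(members)
    for country, members in groupby(sorted(abbreviations), key=country_of)
}

for country, members in grouped.items():
    print(country, members)
```

With the current abbreviations (CST, CT, PST) the same sort would interleave US and Cuban entries.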

Definitions

General

D0.1) DST - Daylight saving time
D0.2) Format: "<CO><L><S>T" #abbreviations would be four or five letters long. Details in D1 to D4.

Components

D1) CO - country code and similar

Regex /[A-Z]{2}/
ISO 3166-1 alpha-2 country code or,
a special code like "EU" or,
a code from the private use area to define larger regions, e.g.:
- XA - ASEAN
- XE - "East" for UTC offset zones having positive offset, e.g. UTC+02
- XW - "West" for UTC offset zones having negative offset, e.g. UTC-02
- ZZ - the whole world, used for UTC and UTC offset zones, details see examples.

D2) L - location code and similar

Regex /[A-Z0-9]{1}/
D2.1) A character from the set [A-Z], unique for each real time zone within the country. Preferably not from the set [SDF] or any letter agreed in D3 for offset changes or reserved there.
D2.2) A character from the set [0-9] for numbered zones, e.g. in Russia.
D2.3) The letter C (Common Time) for the most common time, maybe also N derived from "National Time".
D2.4) The letter Z could be used if there is only one time zone. The location code could in principle be dropped for countries that have only one zone, but it is mandatory per D0.2).
D2.5) For UTC offset zones that start with ZZ, when using date-time group (DTG) letters: the letter G, not the letter L, so that altering it into an E (or an E into it) is harder. Otherwise either E for East or W for West, as defined in D1.

D3) S - seasonal offset code and similar

Regex /[A-RU-Z]{1}/
D3.1) The letter "S" is not used since in some contexts it stands for "standard time" in others for "summer time".
D3.2) T is not allowed, to avoid confusion with the T in the final position.
D3.3) For DST use the letter "D".
D3.4) For wartime maybe the letter W or F as in forward time. In D2 W stands for West, but this does not interfere here.
D3.5) For double summer time - to be defined, maybe M for midsummer time
D3.6) In the absence of any extra rules ("standard time"), the letter Z is optional. Consider also "C" - "Common Time", which would sort before the special letters.
D3.7) For UTC offset zones: a digit.

D4) T - time

Regex /[T]{1}/

As currently done in English to indicate "Time".
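As a rough sketch, the component regexes from D1 to D4 can be combined into one pattern that splits an abbreviation into its parts. The names ABBREVIATION, Parts and parse are hypothetical. Two assumptions are made here: the S component is treated as optional (short mode), and digits are admitted in S because D3.7 and the UTC offset examples use them, although the D3 regex itself does not list them.

```python
import re
from typing import NamedTuple, Optional

# CO = two letters, L = one letter or digit, S = optional seasonal code
# (no "S" and no "T", digits admitted as an assumption), then the final "T".
ABBREVIATION = re.compile(
    r"(?P<co>[A-Z]{2})(?P<l>[A-Z0-9])(?P<s>[A-RU-Z0-9])?T"
)


class Parts(NamedTuple):
    country: str             # component CO (D1)
    location: str            # component L (D2)
    seasonal: Optional[str]  # component S (D3), None in short mode


def parse(abbreviation: str) -> Optional[Parts]:
    """Split a proposed abbreviation into its components, or return None."""
    match = ABBREVIATION.fullmatch(abbreviation)
    if match is None:
        return None
    return Parts(match["co"], match["l"], match["s"])


for sample in ("USET", "USEDT", "XE01T", "IST"):
    print(sample, parse(sample))
```

An existing abbreviation such as IST is rejected because it is too short to hold a country code, a location code and the final T.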

Examples

E1) No DST

CUCT - Cuba (Common) Time #in IANA tzdb northamerica 8.54 is CT
THCT - Thailand (Common) Time
USET - US Eastern Time
USCT - US Central Time
USPT - US Pacific Time
CAET - Canada Eastern Time
CLCT - Chile (Common) Time #Continental Time
CLET - Chile Eastern Time #Easter Island, which in the IANA tzdb is EAS%sT
ECCT - Ecuador (Common) Time #Continental Time
ECGT - Galapagos Time #which in the IANA tzdb is GALT

RUOT - Omsk Time #in IANA tzdb is OMST, could be read as Oman Summer/Standard Time
RUMT - Moscow Time #Maybe RUCT - Russia Central/Common Time or RUNT - Russia National Time
RUKT - Kaliningrad Time
RUIT - Irkutsk Time
RUVT - Volgograd Time #in IANA tzdb was VOLT
Optional: RU1T - Russia First Time Zone
Optional: RU2T - Russia Second Time Zone

EUCT - Central European Time #Some countries that use this time are not in the EU.
EUWT - Western European Time #See comment for EUCT
EUET - Eastern European Time #See comment for EUCT

XACT - ASEAN Common Time

XE01T - UTC+01
XE08T - UTC+08
XE13T - UTC+13
XW06T - UTC-06
XW11T - UTC-11

ZZZT - UTC
ZZE1T - UTC+01
ZZE8T - UTC+08
ZZEAT - UTC+10 # A = hexadecimal for 10
ZZEDT - UTC+13 # D = hexadecimal for 13
ZZW6T - UTC-06
ZZWBT - UTC-11 # B = hexadecimal for 11
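The hexadecimal variant above can be sketched as a small function. The name offset_abbreviation is hypothetical, and the sketch assumes whole-hour offsets only, since a single hexadecimal digit cannot express minutes.

```python
def offset_abbreviation(hours: int) -> str:
    """Build the ZZ-prefixed abbreviation for a whole-hour UTC offset."""
    if not -15 <= hours <= 15:
        raise ValueError("offset does not fit into one hexadecimal digit")
    if hours == 0:
        return "ZZZT"  # UTC itself, short form
    direction = "E" if hours > 0 else "W"  # East = positive, West = negative
    # Format the absolute offset as one uppercase hexadecimal digit.
    return f"ZZ{direction}{abs(hours):X}T"


for hours in (0, 1, 8, 10, 13, -6, -11):
    print(f"{hours:+03d}", offset_abbreviation(hours))
```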

#Letters from the NATO(?) date-time group
#taken from http://de.wikipedia.org/wiki/Date_Time_Group
ZZGZT - UTC±00
ZZGAT - UTC+01
ZZGHT - UTC+08
ZZGKT - UTC+10
ZZGST - UTC-06
ZZGXT - UTC-11
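The date-time group variant can be sketched in the same way. The function name dtg_abbreviation is hypothetical; the letter assignment assumed here is the usual one, with A to M (skipping J) for UTC+01 to UTC+12, N to Y for UTC-01 to UTC-12, and Z for UTC.

```python
import string

# A..M without J cover UTC+01..UTC+12; N..Y cover UTC-01..UTC-12.
EAST = [c for c in string.ascii_uppercase[:13] if c != "J"]
WEST = list(string.ascii_uppercase[13:25])


def dtg_abbreviation(hours: int) -> str:
    """Build the ZZG-prefixed abbreviation for a whole-hour UTC offset."""
    if hours == 0:
        letter = "Z"
    elif 1 <= hours <= 12:
        letter = EAST[hours - 1]
    elif -12 <= hours <= -1:
        letter = WEST[-hours - 1]
    else:
        raise ValueError("no date-time group letter for this offset")
    return f"ZZG{letter}T"


for hours in (0, 1, 8, 10, -6, -11):
    print(f"{hours:+03d}", dtg_abbreviation(hours))
```

Note that this scheme produces the letters S and T in the fourth position (UTC-06 and UTC-07), which D3.1 and D3.2 otherwise exclude.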

E2) With DST

#for meaning of the third letter see section D2
USEDT
USCDT
USPDT
CAEDT
RUMDT # Daylight saving time or "decree time"
EUCDT #preferred over CEST, since D3.1 excludes the letter S

F) Further considerations

http://mm.icann.org/pipermail/tz/2012-October/018393.html - Suggesting extension for "historical contexts". This may be really valuable since from

"2011-06-12 18:14 VECT"

one could not know under which regime it was defined. This issue can be seen with Royal Jordanian and some other airlines, which had issued tickets before a change of official time was made.

Feedback

Hank W [1]:

I suggest
changing USXX to NAXX, since one set of abbreviations is shared 
by every North American country except Cuba.  
Also, I’d like to suggest simply appending a “+” to indicate 
daylight saving time/ summer time (e.g., NAET+).

Reply by Tobias: NA is the ISO 3166-1 alpha-2 country code for Namibia. A non-country code could only be selected from the private set of codes: AA, QM to QZ, XA to XZ, and ZZ. An option would be XN.
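The check behind this reply can be sketched as follows, using only the user-assigned ranges named above (AA, QM to QZ, XA to XZ, ZZ). The function name is_user_assigned is hypothetical.

```python
def is_user_assigned(code: str) -> bool:
    """Return True if code lies in the private-use ranges listed above."""
    if len(code) != 2 or not (code.isascii() and code.isalpha() and code.isupper()):
        return False
    first, second = code
    return (
        code in ("AA", "ZZ")
        or (first == "Q" and "M" <= second <= "Z")  # QM to QZ
        or first == "X"                             # XA to XZ
    )


for code in ("NA", "XN", "XA", "ZZ", "EU"):
    print(code, is_user_assigned(code))
```

Under this check NA is rejected and XN is accepted; EU is also rejected, which is why D1 lists it separately as a special code.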

IIRC the DST start and end rules in Mexico have not always been the same as those in the USA. So a North American code could not be used to identify a set of start and end rules, a feature that /might/ be desired.

The "+" is language neutral and maybe easier to read if there are already so many letters. These would be two advantages over the "D" from the proposal. But it is not in the set of "RFC 3986 section 2.3 Unreserved Characters", so it might be percent-encoded in URLs, which might be undesired. Other environments may also not allow the "+" to be used the way the "D" can be. But there could be two variants, one for programming, one for display.
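The URL concern can be checked with the Python standard library. This is a minimal sketch; the helper needs_percent_encoding is hypothetical, and the unreserved set is written out from RFC 3986 section 2.3 (letters, digits, "-", ".", "_", "~").

```python
import string
from urllib.parse import quote

# Unreserved characters as listed in RFC 3986 section 2.3.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")


def needs_percent_encoding(abbreviation: str) -> bool:
    """Return True if any character falls outside the unreserved set."""
    return any(ch not in UNRESERVED for ch in abbreviation)


for candidate in ("USEDT", "NAET+"):
    # safe="" makes quote() encode everything except unreserved characters.
    print(candidate, needs_percent_encoding(candidate), quote(candidate, safe=""))
```

The letter-only form passes through unchanged, while the "+" variant becomes NAET%2B once encoded.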

You changed the order from DT to TD/T+. So with your proposal the first letters would stay fixed, which might be desired.