New proposed versions of IDNA documents

After working for another few months, we have new versions of the IDNAbis documents ready. The series consists of four documents.

Let me go through them in order. Most of the text below is directly from the Abstract of each of the documents.

Internationalizing Domain Names for Applications (IDNA): Issues and Rationale - draft-klensin-idnabis-issues-05.txt

A recent IAB report identified issues that have been raised with Internationalized Domain Names (IDNs). Some of these issues require tuning of the existing protocols and the tables on which they depend. Based on intensive discussion by an informal design team, this document provides an overview of some of the proposals that are being made, provides explanatory material for them, and then further explains some of the issues that have been encountered.

Internationalizing Domain Names in Applications (IDNA): Protocol - draft-klensin-idnabis-protocol-02.txt

This document supplies the protocol definition for a revised and updated specification for internationalized domain names. The rationale for these changes, the relationship to the older specification, and some new terminology are provided in other documents, specifically the issues document. This document defines internationalized domain names (IDNs) and a mechanism called Internationalizing Domain Names in Applications (IDNA) for handling them in a standard fashion. IDNs use characters drawn from a large subset of the Unicode repertoire, but IDNA allows the non-ASCII characters to be represented using only the ASCII characters already allowed in so-called host names today. This backward-compatible representation is required in existing protocols like DNS, so that IDNs can be introduced with no changes to the existing infrastructure. IDNA is only meant for processing domain names, not free text.
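To make the ASCII-compatible representation concrete, here is a minimal sketch in Python. It uses the "idna" codec from Python's standard library, which implements the older IDNA specification (RFC 3490), not the revised protocol proposed in this draft, so treat it only as an illustration of the general idea. The helper names to_ascii and to_unicode and the domain bücher.example are chosen here for illustration.

```python
def to_ascii(domain: str) -> str:
    """Convert a domain name to its ASCII-compatible form.

    Relies on Python's built-in "idna" codec (IDNA as in RFC 3490),
    which processes each dot-separated label on its own.
    """
    return domain.encode("idna").decode("ascii")


def to_unicode(domain: str) -> str:
    """Convert an ASCII-compatible domain name back to Unicode."""
    return domain.encode("ascii").decode("idna")


if __name__ == "__main__":
    # A label with a non-ASCII character gets an "xn--" prefixed form.
    print(to_ascii("bücher.example"))
    # Labels that are already plain ASCII pass through unchanged.
    print(to_ascii("example.com"))
    # The conversion is reversible.
    print(to_unicode("xn--bcher-kva.example"))
```

Because only the ASCII form goes on the wire, the DNS itself never sees the non-ASCII characters; the conversion happens in the application.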

The Unicode Codepoints and IDN - draft-faltstrom-idnabis-tables-03.txt

This document specifies rules for deciding whether a codepoint, considered in isolation, is a candidate for inclusion in an Internationalized Domain Name. It defines an algorithm that tells whether a codepoint is in one of the categories ALWAYS, MAYBE YES, MAYBE NO, CONTEXT, NEVER or UNASSIGNED.

If you find the tables in this document to be hard to parse due to the page breaks, you can also find version -05 in HTML (that is easier to parse) here.

An IDNA problem in right-to-left scripts - draft-alvestrand-idna-bidi-01.txt

In the original version of IDNA (RFC 3490), there were some basic rules for labels with codepoints with different directionality. Findings (specified in the issues document) show that those rules were not good enough, and in some cases plain wrong. This memo discusses some problems with scripts presenting challenges, including one resulting from a constraint on the use of combining characters at the end of an RTL domain label, causing some words to be declared invalid as IDN labels, and proposes a means for ameliorating this problem.
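The combining-character problem can be illustrated with a rough sketch of the old rule. RFC 3490 inherits its bidi check from Stringprep (RFC 3454, section 6): if a string contains any right-to-left character (bidi class R or AL), it must not contain any left-to-right character (class L), and both its first and last characters must be right-to-left. The Python below is a simplified approximation: the function name passes_old_bidi_rule is made up for this example, it looks up bidi classes in whatever Unicode version ships with the interpreter rather than the tables the RFC is based on, and it skips the prohibited-character step.

```python
import unicodedata

RTL_CLASSES = ("R", "AL")


def passes_old_bidi_rule(label: str) -> bool:
    """Approximate the RFC 3454 section 6 bidi check for one label.

    Simplification: uses the interpreter's current Unicode data and
    ignores the separate list of prohibited characters.
    """
    classes = [unicodedata.bidirectional(ch) for ch in label]
    if not any(c in RTL_CLASSES for c in classes):
        return True  # no right-to-left characters, rule does not apply
    if "L" in classes:
        return False  # mixing left-to-right letters into an RTL label
    # First and last characters must both be right-to-left.
    return classes[0] in RTL_CLASSES and classes[-1] in RTL_CLASSES


if __name__ == "__main__":
    # Two Hebrew letters (alef, bet): accepted.
    print(passes_old_bidi_rule("\u05d0\u05d1"))
    # Alef followed by the combining point patah (bidi class NSM):
    # rejected, because the last character is not class R or AL.
    print(passes_old_bidi_rule("\u05d0\u05b7"))
```

The second case shows the constraint the memo is concerned with: a label that ends in a combining mark fails the last-character test even though every character in it belongs to a right-to-left script.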