
Latin Script Diacritics PDP Working Group

The Latin Script Diacritics PDP Working Group (often shortened to the Latin Diacritics PDP or LD PDP) is a GNSO Policy Development Process (PDP) working group tasked with determining in which limited circumstances, and through what mechanism, a single registry operator may simultaneously operate a base ASCII gTLD and the Latin script diacritic version of the same gTLD string when the two labels are not variants of each other under the Root Zone Label Generation Rules (RZ-LGR) but may be visually similar. The working group operates under an "Open Model", is open to all interested participants from across ICANN structures, and follows the GNSO Working Group Guidelines and the Annex A PDP framework in the ICANN Bylaws.[1] [2]

Background

The PDP arises from long-standing issues around the treatment of Latin script diacritics in domain names. The Latin script is used by a majority of the world’s literate population, and over a thousand languages employ Latin letters, many of which rely on diacritics (for example, French "é", Spanish "ñ", Portuguese "ã", Turkish "ş", or Vietnamese tone marks). In many languages, omission of these diacritics changes the correctness or meaning of a word, so a spelling without diacritics is often viewed by native speakers as a compromise or "workaround" rather than an equivalent form.[3]

Historically, most domain names have been registered using US-ASCII characters, while Latin-script Internationalized Domain Names (IDNs) are supported via Punycode representations in the DNS. For many communities, especially those whose languages systematically use diacritics, this leads to a tension between using a "workaround" ASCII label and the linguistically correct diacritic label (for example, ".deja" versus ".déjà").[3] The issue is amplified at the top level, where the visual similarity between an ASCII gTLD and a Latin-diacritic gTLD can trigger string similarity rules in the new gTLD program, blocking delegation of both labels under existing policy.
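
As a minimal illustration of the ASCII-versus-diacritic tension described above, the following Python sketch converts the two example labels into the form carried in the DNS. It relies on Python's built-in "idna" codec, which implements the older IDNA 2003 rules rather than the IDNA 2008 rules used for current IDN registrations, so it should be read as an approximation for illustration only. The helper names to_a_label and to_u_label are hypothetical and do not come from any ICANN document.

```python
# Sketch only: Python's built-in "idna" codec follows IDNA 2003 (RFC 3490),
# which is an assumption made for illustration, not the registry-side rules.

def to_a_label(label: str) -> str:
    """Return the ASCII-compatible (Punycode-based) form of a single label."""
    return label.encode("idna").decode("ascii")


def to_u_label(a_label: str) -> str:
    """Return the Unicode form of an ASCII-compatible label."""
    return a_label.encode("ascii").decode("idna")


ascii_label = "deja"                # the ASCII "workaround" spelling
diacritic_label = "d\u00e9j\u00e0"  # "déjà", written with escapes to keep it precomposed

# A plain ASCII label is carried in the DNS unchanged.
print(to_a_label(ascii_label))

# The diacritic label is carried as a distinct "xn--" label.
print(to_a_label(diacritic_label))

# Converting back recovers the original diacritic spelling.
print(to_u_label(to_a_label(diacritic_label)))
```

The point of the sketch is that the two spellings are entirely separate labels at the DNS level, even though they may look alike when displayed to users.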

At the same time, the Latin Generation Panel for the Root Zone LGR determined that characters with and without diacritics are, in general, distinct letters and are therefore not variants of each other in the variant management sense used for IDNs. Only a few specific cases are treated as variants through cross-script relationships or diacritic-to-diacritic relationships. As a result, the Latin script has very few allocatable variants in the Root Zone LGR, and ASCII/Latin-diacritic pairs are normally not treated as variants.[3] [4]

Because ASCII/Latin-diacritic label pairs are not variants, the policy framework from the Expedited PDP on Internationalized Domain Names (EPDP on IDNs) for managing variants cannot, by itself, provide a path for a single registry operator to manage both labels at the top level. Also, the string similarity rules in the new gTLD program create a non-negligible probability that ASCII and diacritic labels will be judged confusingly similar, blocking both labels from being delegated or contracted under existing policy.[4]

Similar questions have already been addressed in the country-code space, where an exception procedure was developed to allow delegation of the Greek-script IDN ccTLD ".ευ" alongside ".eu" under a single manager despite potential confusing similarity, subject to conditions designed to mitigate user confusion. That ccTLD experience (including follow-up work in ccPDP4 for IDN ccTLDs) is explicitly referenced as one source of input that might inspire, though not necessarily directly determine, a solution for gTLDs.[3]

Origin and Initiation of the PDP

Concerns over Latin script diacritics had been raised within the ICANN community for several years, including by French-speaking communities and At-Large actors. They pointed to cases such as ".quebec" and a potential ".québec" to illustrate the problem: an operator may wish to run both the ASCII and diacritic forms of a string that native speakers consider distinct spellings yet practically equivalent for localization and identity, but policy and LGR rules provide no path to do so when the labels are not variants.[3][5]

In June 2023, the ALAC Chair sent a detailed letter to the GNSO Council highlighting the need for a policy solution for Latin script diacritics at the top level. That correspondence became one of the triggers for the GNSO Council to request that ICANN org study the issue and advise whether a dedicated PDP was warranted.[5]

The GNSO Council received a briefing on the topic during its meeting at ICANN 78 in October 2023 and subsequently requested that ICANN org prepare an Issue Report on Latin script diacritics. On May 16, 2024, the Council adopted a resolution directing staff to prepare a Preliminary Issue Report.[1] [3] The Preliminary Issue Report, published on July 17, 2024, defined the issue as the specific circumstance in which an ASCII gTLD and the Latin script diacritic version of that gTLD are not variants of each other but may be visually similar, with no current mechanism for the same registry to operate both labels.[3]

The Preliminary Issue Report was put out for Public Comment from July 18 to August 27, 2024. Forty-one submissions were received; thirty-seven supported initiating a PDP, two were mixed, one opposed a PDP, and one was considered out of scope. Supportive comments emphasized multilingualism, end-user equity, and alignment with Universal Acceptance efforts, while concerns focused on potential user confusion and security risks in environments where many visually similar labels might coexist.[6]

Taking this input into account, ICANN org produced the Final Issue Report on September 12, 2024. The report recommended that the GNSO Council initiate a PDP limited to a single policy question: what mechanism, if any, is needed to allow a single registry operator simultaneously to operate a base ASCII gTLD and the Latin script diacritic version of the gTLD where the labels are not variants but may be visually similar.[4] [1]

On November 13, 2024, during ICANN 81, the GNSO Council adopted a resolution initiating the PDP on Latin Script Diacritics. A draft charter included in the Final Issue Report was updated and then formally adopted as the working group charter on December 19, 2024.[1]

The working group held its first meeting in March 2025.[7]

Formation and Membership

On January 3, 2025, the GNSO Council issued an Expression of Interest (EOI) seeking volunteers for the Latin Script Diacritics PDP Working Group and candidates for the role of Working Group Chair. The announcement described the group as an open PDP WG, inviting participation from GNSO Stakeholder Groups and Constituencies, other Supporting Organizations and Advisory Committees, and interested individuals who could meet the expected skills profile.[2]

The EOI specified that, taken together, WG members should have familiarity with the Latin RZ-LGR and string similarity processes, a practical understanding of running ASCII and Latin-diacritic labels, and experience with GNSO PDP work. The WG was also expected to apply the GNSO Working Group Guidelines, use the Consensus Playbook, and follow the Statement of Participation and Statement of Interest (SOI) requirements.[2]

The PDP uses an "Open Model": anyone can join as a Member at any point provided they commit to getting up to speed and avoid reopening previously closed topics without new information. WG membership and meeting logistics are maintained on the ICANN Community Wiki space created for the PDP.[2][8]

Leadership

The working group’s leadership is composed of:

Mandate and Scope

The WG’s mandate is deliberately narrow. According to the charter and EOI, the group is to examine a single issue: to identify the limited circumstances in which an ASCII gTLD and the Latin script diacritic version of that gTLD can be simultaneously delegated, and to determine the appropriate mechanism that would allow a single registry operator to operate both, under conditions that maintain security, stability, and user trust.[2][1]

Key elements of the scope include:

  • Non-variant pairs: The PDP concerns only pairs of labels that are not variants of each other under the Latin RZ-LGR. If ASCII and diacritic labels were variants, they would already be treated under the IDN variant framework arising from the EPDP on IDNs, and no special policy would be needed.[4][2]
  • Potential visual similarity: The PDP assumes that many ASCII/diacritic pairs may be judged visually confusingly similar under the string similarity rules of the new gTLD program, which today prevent both labels from being delegated or contracted in such cases.[3][4]
  • Same-entity operation: The central question is framed around a single registry operator running both labels, analogous to the exception procedure developed in the ccTLD space for cases such as ".eu"/".ευ", while still protecting end users from confusion.[3][4]
  • Use of Latin RZ-LGR as a baseline: The WG is expected to use the Latin RZ-LGR, including the variant sets and exclusions identified by the Latin Generation Panel, as a key reference when delineating what is and is not in scope for a potential policy solution.[2]

The Final Issue Report and charter also anticipate that the WG will need to consider potential scoping constraints to avoid an unmanageably large set of ASCII/diacritic pairs, such as limiting eligibility to certain application types (for example, Community, Geographic, or .brand TLDs), requiring that the ASCII label be a documented workaround for the diacritic label, or limiting solutions to fully diacritic versus fully ASCII equivalents rather than including pluralization or alternative spellings.[4]
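
One of the candidate constraints above, limiting eligibility to a fully diacritic label and its fully ASCII equivalent, can be approximated in code. The sketch below is an illustration under stated assumptions, not the working group's definition and not the RZ-LGR: it treats a pair as in scope only when removing combining marks from the diacritic label yields exactly the ASCII label. The function names are hypothetical, and letters that Unicode does not decompose into a base letter plus a mark (for example "ø" or "ß") are deliberately left unmatched.

```python
import unicodedata


def strip_diacritics(label: str) -> str:
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", label)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def is_ascii_diacritic_pair(ascii_label: str, diacritic_label: str) -> bool:
    """Hypothetical scoping check, assumed for illustration only.

    True when ascii_label is pure ASCII, diacritic_label contains at least
    one non-ASCII character, and stripping the diacritics from the second
    label gives exactly the first.
    """
    base = ascii_label.lower()
    marked = unicodedata.normalize("NFC", diacritic_label.lower())
    if not base.isascii() or marked.isascii():
        return False
    return strip_diacritics(marked) == base


# Fully ASCII versus fully diacritic spelling of the same string.
print(is_ascii_diacritic_pair("quebec", "qu\u00e9bec"))

# A plural or alternative spelling is not the same base string.
print(is_ascii_diacritic_pair("quebecs", "qu\u00e9bec"))
```

A check of this kind only narrows the candidate set. Whether a given pair is actually eligible would still depend on the other conditions under discussion, such as application type or evidence that the ASCII label is a documented workaround.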

While the PDP is primarily focused on the top level, its work is closely linked to the EPDP on IDNs Phases 1 and 2, which define "same entity" rules and variant management at both the top and second levels. The charter anticipates that any Latin-diacritic framework will need to be coherent with the IDN variant rules and respect the principle that variants are treated as the "same" label, while Latin diacritics are not.[4][5]

Key Issues and Deliberations

Several key themes have emerged in the WG’s deliberations:

Distinguishing Diacritics from Variants

Building on the Latin Generation Panel’s work, the WG revisited why code points with and without diacritics are not treated as variants in the Root Zone LGR, and how that choice interacts with community expectations in languages where the diacritic and non-diacritic forms are viewed as closely related or substitutable in everyday practice. This includes recognizing that in many languages, spelling without diacritics is viewed as an accommodation to technical constraints, not as a correct form.[3][4]

The group also discussed the risk of conflating "Latin diacritic sets" (ASCII plus associated Latin-diacritic labels handled under the PDP) with IDN variant sets defined in the RZ-LGR. In late 2025 meetings, WG discussions explored how combining Latin-diacritic sets and IDN variant sets could lead to complex multi-label relationships that would need careful SSR and policy analysis.[8]

String Similarity and Exception Mechanisms

Another major area of work has been understanding how existing string similarity rules in the new gTLD program, as affirmed by the Subsequent Procedures PDP, would apply to ASCII/diacritic pairs, and what kind of exception or special mechanism might be needed to enable same-entity operation without undermining the goal of minimizing user confusion.[4][3]

The WG has looked at the ccTLD precedent for .eu/.ευ (an exception procedure tied to same-entity operation and specific mitigation measures) as one possible model, while recognizing that gTLD policy may require a different set of conditions and cannot simply copy the ccTLD solution.[3]

Relationship to EPDP on IDNs and Universal Acceptance

The WG relies on the EPDP on IDNs Phases 1 and 2 as a baseline for understanding "same entity" relationships and variant management across scripts. It has worked through EPDP recommendations to determine which can be applied directly to Latin-diacritic cases, which might inspire Latin-specific adaptations, and where distinct rules may be needed because diacritics are not variants under the RZ-LGR.[4][5]

Universal Acceptance considerations also feature in the deliberations. For many end users and service providers, diacritic-based labels still face support constraints in software and systems; the PDP therefore needs to balance the linguistic correctness and identity benefits of diacritics with UA realities and potential user-experience risks when ASCII and diacritic labels coexist.[3][6]

Stress Testing and Case Studies

By ICANN 84 (October 2025), the WG had begun using stress-test scenarios and case studies to examine edge cases where ASCII labels, Latin-diacritic labels, and IDN variants might interact, including situations where multiple scripts and label sets could create complex relationships. These stress tests are used to refine preliminary recommendations and ensure that any policy framework can handle difficult cases without unintended consequences for security, stability, or user trust.[8][5]

Community Input and Perspectives

The public comment forum on the Preliminary Issue Report attracted contributions from registry operators, registrars, At-Large structures, individual community members, and other stakeholders. A large majority of commenters supported initiating the PDP, highlighting benefits for linguistic accuracy, cultural representation, and parity for communities using Latin script diacritics. A smaller number of submissions raised concerns about potential phishing and confusion risks, or questioned whether the issue justified a full PDP.[6]

Regional and community perspectives, such as those documented in the APRALO Bytes update of November 2025, have emphasized that Latin-diacritic issues have been a long-standing concern, especially in francophone and other communities where diacritics are integral to language. These updates also underscore the expectation that any eventual policy should be narrowly tailored, technically robust, and closely aligned with IDN variant and UA work streams.[5]

Status and Timeline

As of late 2025, the Latin Script Diacritics PDP Working Group had:

  • completed its initial pass through the EPDP on IDNs recommendations to assess their applicability to Latin-diacritic cases;
  • developed and refined a framework for distinguishing Latin-diacritic sets from IDN variant sets and for scoping eligible ASCII/diacritic pairs; and
  • begun drafting preliminary recommendations and stress-testing them through case studies and WG discussions.[8][5]

According to an APRALO status update in November 2025, the WG anticipated publishing an Initial Report for public comment in early 2026 (with a target of January 2026) and delivering a Final Report to the GNSO Council around August 2026, subject to the pace of deliberations and community feedback.[5]

References