Internationalized Domain Names in Applications (IDNA)
=====================================================

- Support for the Internationalised Domain Names in Applications
- (IDNA) protocol as specified in `RFC 5891 <https://tools.ietf.org/html/rfc5891>`_.
- This is the latest version of the protocol and is sometimes referred to as
- “IDNA 2008”.
+ Support for the Internationalized Domain Names in
+ Applications (IDNA) protocol as specified in `RFC 5891
+ <https://tools.ietf.org/html/rfc5891>`_. This is the latest version of
+ the protocol and is sometimes referred to as “IDNA 2008”.

- This library also provides support for Unicode Technical Standard 46,
- `Unicode IDNA Compatibility Processing <https://unicode.org/reports/tr46/>`_.
+ This library also provides support for Unicode Technical
+ Standard 46, `Unicode IDNA Compatibility Processing
+ <https://unicode.org/reports/tr46/>`_.

- This acts as a suitable replacement for the “encodings.idna” module that
- comes with the Python standard library, but which only supports the
- older superseded IDNA specification (`RFC 3490 <https://tools.ietf.org/html/rfc3490>`_).
+ This acts as a suitable replacement for the “encodings.idna”
+ module that comes with the Python standard library, but which
+ only supports the older superseded IDNA specification (`RFC 3490
+ <https://tools.ietf.org/html/rfc3490>`_).

Basic functions are simply executed:

@@ -27,24 +29,19 @@ Basic functions are simply executed:
Installation
------------

- To install this library, you can use pip:
+ This package is available for installation from PyPI:

.. code-block:: bash

-     $ pip install idna
-
- Alternatively, you can install the package using the bundled setup script:
-
- .. code-block:: bash
-
-     $ python setup.py install
+     $ python3 -m pip install idna


Usage
-----

- For typical usage, the ``encode`` and ``decode`` functions will take a domain
- name argument and perform a conversion to A-labels or U-labels respectively.
+ For typical usage, the ``encode`` and ``decode`` functions will take a
+ domain name argument and perform a conversion to A-labels or U-labels
+ respectively.

.. code-block:: pycon

@@ -65,8 +62,8 @@ You may use the codec encoding and decoding methods using the
    >>> print(b'xn--d1acufc.xn--80akhbyknj4f'.decode('idna'))
    домен.испытание

- Conversions can be applied at a per-label basis using the ``ulabel`` or ``alabel``
- functions if necessary:
+ Conversions can be applied at a per-label basis using the ``ulabel`` or
+ ``alabel`` functions if necessary:

.. code-block:: pycon

@@ -76,20 +73,22 @@ functions if necessary:
Compatibility Mapping (UTS #46)
+++++++++++++++++++++++++++++++

- As described in `RFC 5895 <https://tools.ietf.org/html/rfc5895>`_, the IDNA
- specification does not normalize input from different potential ways a user
- may input a domain name. This functionality, known as a “mapping”, is
- considered by the specification to be a local user-interface issue distinct
- from IDNA conversion functionality.
+ As described in `RFC 5895 <https://tools.ietf.org/html/rfc5895>`_, the
+ IDNA specification does not normalize input from different potential
+ ways a user may input a domain name. This functionality, known as
+ a “mapping”, is considered by the specification to be a local
+ user-interface issue distinct from IDNA conversion functionality.

- This library provides one such mapping, that was developed by the Unicode
- Consortium. Known as `Unicode IDNA Compatibility Processing <https://unicode.org/reports/tr46/>`_,
- it provides for both a regular mapping for typical applications, as well as
- a transitional mapping to help migrate from older IDNA 2003 applications.
+ This library provides one such mapping, that was developed by the
+ Unicode Consortium. Known as `Unicode IDNA Compatibility Processing
+ <https://unicode.org/reports/tr46/>`_, it provides for both a regular
+ mapping for typical applications, as well as a transitional mapping to
+ help migrate from older IDNA 2003 applications.

- For example, “Königsgäßchen” is not a permissible label as *LATIN CAPITAL
- LETTER K* is not allowed (nor are capital letters in general). UTS 46 will
- convert this into lower case prior to applying the IDNA conversion.
+ For example, “Königsgäßchen” is not a permissible label as *LATIN
+ CAPITAL LETTER K* is not allowed (nor are capital letters in general).
+ UTS 46 will convert this into lower case prior to applying the IDNA
+ conversion.

.. code-block:: pycon

@@ -102,36 +101,38 @@ convert this into lower case prior to applying the IDNA conversion.
    >>> print(idna.decode('xn--knigsgchen-b4a3dun'))
    königsgäßchen

- Transitional processing provides conversions to help transition from the older
- 2003 standard to the current standard. For example, in the original IDNA
- specification, the *LATIN SMALL LETTER SHARP S* (ß) was converted into two
- *LATIN SMALL LETTER S* (ss), whereas in the current IDNA specification this
- conversion is not performed.
+ Transitional processing provides conversions to help transition from
+ the older 2003 standard to the current standard. For example, in the
+ original IDNA specification, the *LATIN SMALL LETTER SHARP S* (ß) was
+ converted into two *LATIN SMALL LETTER S* (ss), whereas in the current
+ IDNA specification this conversion is not performed.

.. code-block:: pycon

    >>> idna.encode('Königsgäßchen', uts46=True, transitional=True)
    'xn--knigsgsschen-lcb0w'

- Implementors should use transitional processing with caution, only in rare
- cases where conversion from legacy labels to current labels must be performed
- (i.e. IDNA implementations that pre-date 2008). For typical applications
- that just need to convert labels, transitional processing is unlikely to be
- beneficial and could produce unexpected incompatible results.
+ Implementors should use transitional processing with caution, only in
+ rare cases where conversion from legacy labels to current labels must be
+ performed (i.e. IDNA implementations that pre-date 2008). For typical
+ applications that just need to convert labels, transitional processing
+ is unlikely to be beneficial and could produce unexpected incompatible
+ results.

``encodings.idna`` Compatibility
++++++++++++++++++++++++++++++++

Function calls from the Python built-in ``encodings.idna`` module are
mapped to their IDNA 2008 equivalents using the ``idna.compat`` module.
- Simply substitute the ``import`` clause in your code to refer to the
- new module name.
+ Simply substitute the ``import`` clause in your code to refer to the new
+ module name.

Exceptions
----------

- All errors raised during the conversion following the specification should
- raise an exception derived from the ``idna.IDNAError`` base class.
+ All errors raised during the conversion following the specification
+ should raise an exception derived from the ``idna.IDNAError`` base
+ class.

More specific exceptions that may be generated as ``idna.IDNABidiError``
when the error reflects an illegal combination of left-to-right and
@@ -149,29 +150,31 @@ tables for performance. These tables are derived from computing against
eligibility criteria in the respective standards. These tables are
computed using the command-line script ``tools/idna-data``.

- This tool will fetch relevant codepoint data from the Unicode repository
- and perform the required calculations to identify eligibility. There are
+ This tool will fetch relevant codepoint data from the Unicode repository
+ and perform the required calculations to identify eligibility. There are
three main modes:

- * ``idna-data make-libdata``. Generates ``idnadata.py`` and ``uts46data.py``,
-   the pre-calculated lookup tables using for IDNA and UTS 46 conversions. Implementors
-   who wish to track this library against a different Unicode version may use this tool
-   to manually generate a different version of the ``idnadata.py`` and ``uts46data.py``
-   files.
+ * ``idna-data make-libdata``. Generates ``idnadata.py`` and
+   ``uts46data.py``, the pre-calculated lookup tables using for IDNA and
+   UTS 46 conversions. Implementors who wish to track this library against
+   a different Unicode version may use this tool to manually generate a
+   different version of the ``idnadata.py`` and ``uts46data.py`` files.

* ``idna-data make-table``. Generate a table of the IDNA disposition
-   (e.g. PVALID, CONTEXTJ, CONTEXTO) in the format found in Appendix B.1 of RFC
-   5892 and the pre-computed tables published by `IANA <https://www.iana.org/>`_.
+   (e.g. PVALID, CONTEXTJ, CONTEXTO) in the format found in Appendix
+   B.1 of RFC 5892 and the pre-computed tables published by `IANA
+   <https://www.iana.org/>`_.

- * ``idna-data U+0061``. Prints debugging output on the various properties
-   associated with an individual Unicode codepoint (in this case, U+0061), that are
-   used to assess the IDNA and UTS 46 status of a codepoint. This is helpful in debugging
-   or analysis.
+ * ``idna-data U+0061``. Prints debugging output on the various
+   properties associated with an individual Unicode codepoint (in this
+   case, U+0061), that are used to assess the IDNA and UTS 46 status of a
+   codepoint. This is helpful in debugging or analysis.

- The tool accepts a number of arguments, described using ``idna-data -h``. Most notably,
- the ``--version`` argument allows the specification of the version of Unicode to use
- in computing the table data. For example, ``idna-data --version 9.0.0 make-libdata``
- will generate library data against Unicode 9.0.0.
+ The tool accepts a number of arguments, described using ``idna-data
+ -h``. Most notably, the ``--version`` argument allows the specification
+ of the version of Unicode to use in computing the table data. For
+ example, ``idna-data --version 9.0.0 make-libdata`` will generate
+ library data against Unicode 9.0.0.


Additional Notes
@@ -180,25 +183,28 @@ Additional Notes
* **Packages**. The latest tagged release version is published in the
  `Python Package Index <https://pypi.org/project/idna/>`_.

- * **Version support**. This library supports Python 3.5 and higher. As this library
-   serves as a low-level toolkit for a variety of applications, many of which strive
-   for broad compatibility with older Python versions, there is no rush to remove
-   older intepreter support. Removing support for older versions should be well
-   justified in that the maintenance burden has become too high.
-
- * **Python 2**. Python 2 is supported by version 2.x of this library. While active
-   development of the version 2.x series has ended, notable issues being corrected
-   may be backported to 2.x. Use "idna<3" in your requirements file if you need this
-   library for a Python 2 application.
-
- * **Testing**. The library has a test suite based on each rule of the IDNA specification, as
-   well as tests that are provided as part of the Unicode Technical Standard 46,
-   `Unicode IDNA Compatibility Processing <https://unicode.org/reports/tr46/>`_.
-
- * **Emoji**. It is an occasional request to support emoji domains in this library. Encoding
-   of symbols like emoji is expressly prohibited by the technical standard IDNA 2008 and
-   emoji domains are broadly phased out across the domain industry due to associated security
-   risks. For now, applications that wish need to support these non-compliant labels may
-   wish to consider trying the encode/decode operation in this library first, and then falling
-   back to using `encodings.idna`. See `the Github project <https://github.com/kjd/idna/issues/18>`_
-   for more discussion.
+ * **Version support**. This library supports Python 3.5 and higher.
+   As this library serves as a low-level toolkit for a variety of
+   applications, many of which strive for broad compatibility with older
+   Python versions, there is no rush to remove older intepreter support.
+   Removing support for older versions should be well justified in that the
+   maintenance burden has become too high.
+
+ * **Python 2**. Python 2 is supported by version 2.x of this library.
+   While active development of the version 2.x series has ended, notable
+   issues being corrected may be backported to 2.x. Use "idna<3" in your
+   requirements file if you need this library for a Python 2 application.
+
+ * **Testing**. The library has a test suite based on each rule of the
+   IDNA specification, as well as tests that are provided as part of the
+   Unicode Technical Standard 46, `Unicode IDNA Compatibility Processing
+   <https://unicode.org/reports/tr46/>`_.
+
+ * **Emoji**. It is an occasional request to support emoji domains in
+   this library. Encoding of symbols like emoji is expressly prohibited by
+   the technical standard IDNA 2008 and emoji domains are broadly phased
+   out across the domain industry due to associated security risks. For
+   now, applications that wish need to support these non-compliant labels
+   may wish to consider trying the encode/decode operation in this library
+   first, and then falling back to using `encodings.idna`. See `the Github
+   project <https://github.com/kjd/idna/issues/18>`_ for more discussion.
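
The fallback suggested in the Emoji note could be sketched roughly as follows. This is only an illustration, not code from the library: the helper name `encode_with_fallback` is hypothetical, and the sketch assumes that `idna.encode` and `idna.IDNAError` behave as the README describes. It tries the strict IDNA 2008 conversion first and falls back to the standard library's `idna` codec (the older IDNA 2003 rules) only when the strict conversion rejects the name.

```python
def encode_with_fallback(domain: str) -> bytes:
    """Hypothetical helper: strict IDNA 2008 first, legacy codec second."""
    try:
        import idna  # third-party package described in this README
    except ImportError:
        idna = None  # keeps the sketch runnable without the package

    if idna is not None:
        try:
            return idna.encode(domain)
        except idna.IDNAError:
            # The name is not valid under IDNA 2008, for example because
            # a label contains a symbol such as an emoji.
            pass

    # Standard-library codec (encodings.idna), which implements the
    # older, superseded IDNA 2003 specification.
    return domain.encode("idna")


if __name__ == "__main__":
    print(encode_with_fallback("example.com"))
    print(encode_with_fallback("\u2603.example"))  # label with a symbol
```

Because the fallback accepts labels that IDNA 2008 deliberately prohibits, an application would probably want to log or otherwise flag names that only convert through the legacy path, given the security concerns the note mentions.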