
ISO/IEC 14651:2020

Status: Current

Information technology — International string ordering and comparison — Method for comparing character strings and description of the common template tailorable ordering

This document defines the following.

— A reference comparison method. This method is applicable to two character strings to determine their collating order in a sorted list. The method can be applied to strings containing characters from the full repertoire of ISO/IEC 10646. This method is also applicable to subsets of that repertoire, such as those of the different ISO/IEC 8-bit standard character sets, or any other character set, standardized or not, to produce ordering results valid (after tailoring) for a given set of languages for each script. This method uses collation tables derived either from the Common Template Table defined in this document or from one of its tailorings. This method provides a reference format. The format is described using the Backus-Naur Form (BNF). This format is used to describe the Common Template Table. The format is used normatively within this document.

— A Common Template Table. A given tailoring of the Common Template Table is used by the reference comparison method. The Common Template Table describes an order for all characters encoded in the Unicode 13.0 standard,[27] included in ISO/IEC 10646:2020. It allows for a specification of a fully deterministic ordering. This table enables the specification of a string ordering adapted to local ordering rules, without requiring an implementer to have knowledge of all the different scripts already encoded in the Universal Coded Character Set (UCS).

NOTE 1 This Common Template Table is to be modified to suit the needs of a local environment. The main worldwide benefit is that, for other scripts, often no modification is required and the order will remain as consistent as possible and predictable from an international point of view.

NOTE 2 The character repertoire used in this document is equivalent to that of the Unicode Standard version 13.0[27].

— A reference name. The reference name refers to this particular version of the Common Template Table, for use as a reference when tailoring. In particular, this name implies that the table is linked to a particular stage of development of the ISO/IEC 10646 Universal coded character set.

— Requirements for a declaration of the differences (delta) between the collation table and the Common Template Table.
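The relationship between a template table, a tailoring delta, and a multi-level comparison can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the weights are invented and are not taken from the Common Template Table, only three levels are modelled, and the names `TOY_TEMPLATE`, `tailor`, `sort_key` and `compare` are hypothetical. It does not use the BNF reference format defined in the standard.

```python
from typing import Dict, Tuple

# (level 1: base letter, level 2: accent, level 3: case).
# Invented weights for illustration; NOT the real Common Template Table.
Weights = Tuple[int, int, int]

TOY_TEMPLATE: Dict[str, Weights] = {
    "a": (10, 0, 0),
    "A": (10, 0, 1),
    "á": (10, 1, 0),
    "ä": (10, 2, 0),   # treated as an accented variant of "a" by default
    "b": (20, 0, 0),
    "z": (30, 0, 0),
}


def tailor(template: Dict[str, Weights], delta: Dict[str, Weights]) -> Dict[str, Weights]:
    """Return a new table: the template with the declared differences applied."""
    table = dict(template)
    table.update(delta)
    return table


def sort_key(s: str, table: Dict[str, Weights]) -> Tuple[Tuple[int, ...], ...]:
    """Build one tuple per level, so level 1 is compared over the whole
    string before level 2 is consulted, and so on."""
    weights = [table[ch] for ch in s]
    return tuple(tuple(w[level] for w in weights) for level in range(3))


def compare(a: str, b: str, table: Dict[str, Weights]) -> int:
    """Return -1, 0 or 1, like a conventional comparison function."""
    ka, kb = sort_key(a, table), sort_key(b, table)
    return (ka > kb) - (ka < kb)


# Hypothetical delta: make "ä" a separate letter after "z",
# as in Swedish ordering.
TOY_DELTA: Dict[str, Weights] = {"ä": (40, 0, 0)}
tailored = tailor(TOY_TEMPLATE, TOY_DELTA)

words = ["z", "ä", "b", "a"]
print(sorted(words, key=lambda w: sort_key(w, TOY_TEMPLATE)))  # ['a', 'ä', 'b', 'z']
print(sorted(words, key=lambda w: sort_key(w, tailored)))      # ['a', 'b', 'z', 'ä']
```

Keeping the delta as a separate, small mapping mirrors the idea that a tailoring only states its differences from the template; characters the delta does not mention keep their template order.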

This document does not mandate the following.

— A specific comparison method; any equivalent method giving the same results is acceptable.

— A specific format for describing or tailoring tables in a given implementation.

— Specific symbols to be used by implementations, except for the name of the Common Template Table.

— Any specific user interface for choosing options.

— Any specific internal format for intermediate keys used when comparing, nor for the table used. The use of numeric keys is not mandated either.

— A context-dependent ordering.

— Any particular preparation of character strings prior to comparison.

NOTE 1 It is normally necessary to prepare character strings prior to comparison, even though this document does not prescribe such preparation (see Annex C).

NOTE 2 Annex D describes the problems that gave rise to this International Standard, together with their anticipated solutions.
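As an illustration of the string preparation mentioned in NOTE 1, the sketch below applies Unicode normalization before comparison. This is an assumption about one common kind of preparation, not a summary of Annex C, whose content is not reproduced on this page. The function name `prepare` is hypothetical; `unicodedata.normalize` is part of the Python standard library.

```python
import unicodedata


def prepare(s: str) -> str:
    """Normalize to NFC so that canonically equivalent strings
    become identical code point sequences before comparison."""
    return unicodedata.normalize("NFC", s)


precomposed = "caf\u00e9"    # "é" as a single code point
decomposed = "cafe\u0301"    # "e" followed by a combining acute accent

print(precomposed == decomposed)                    # False
print(prepare(precomposed) == prepare(decomposed))  # True
```

Without this step, two strings that look identical to a reader could be looked up differently in a collation table and end up in different positions in a sorted list.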

Get this standard Prices exclude GST
PDF (Single user document)
$5.00 NZD
HardCopy
$50.00 NZD
Networkable PDF
Price varies
Pages: 52

Related Information

Similar Standards

  • BS 4505-2:1990

    Digital data transmission, Character structure for start/stop and synchronous transmission

  • BS 4505-3:1981

    Digital data transmission, Method for use of longitudinal parity to detect errors in information messages

  • BS 4636-3:1986

    Punched cards, Specification for representation of 7-bit and 8-bit coded character sets on 12-row punched cards

  • BS 6429:1989

    Method for conversion between the two coded character sets of BS 4730 (ISO 646) and BS 6692:Part 2 (ISO 6937-2) and the CCITT international telegraph alphabet No. 2 (ITA 2)
