This is a help-text file for use with the survey tool. You can modify existing text or add a new row, where the Key is one that the program knows about and the Text to Insert is what should appear as help text. The software that interprets this file expects a particular format, so don't make arbitrary changes (see the end).
Key | Text to Insert | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
territory_language_information | The
language data is provided for localization testing, and is under
development for CLDR 1.5. The main goal is to provide approximate
figures for the literate, functional population for each language in
each territory: that is, the population that is able to read and write
each language, and is comfortable enough to use it with computers.
The GDP and Literacy figures are taken from the World Bank where available, otherwise supplemented by FactBook data and other sources. The GDP figures are "PPP (constant 2000 international $)". Much of the per-language data is taken from the Ethnologue, but is supplemented and processed using many other sources, including per-country census data. (The focus of the Ethnologue is native speakers, which includes people who are not literate, and excludes people who are functional second-language users.) The percentages may add up to more than 100% due to multilingual populations, or may be less than 100% due to illiteracy or because the data has not yet been gathered or processed. Languages with a small population may be omitted. Official status is supplied where available, formatted as {O}. Hovering with the mouse shows a short description.
|
||||||||||||
language_territory_information |
The language data is provided for localization testing, and is under development for CLDR 1.5. For information on the meaning of the different values, see Territory-Language Information.
|
||||||||||||
detailed_territory_currency_information |
The following table shows when currencies were in use in different countries. See also Decimal Digits and Rounding. The digits column shows the number of digits to use; if there is special rounding (such as for CH), that is in parentheses. The Countries column shows the countries in which the currency is, or has been, officially used.
|
||||||||||||
languages_and_scripts | This
table shows some information about the scripts commonly used with
different languages. This information is not complete, and is being
enhanced over time. The table is sorted by language; for the same
information sorted by script, see Scripts and Languages. The following conventions are used in the table:
|
||||||||||||
scripts_and_languages | This
table shows some information about the scripts commonly used with
different languages. This information is not complete, and is being
enhanced over time. The table is sorted by script; for the same
information sorted by language, see Languages and Scripts. The following
conventions are used in the table:
|
||||||||||||
territory_containment_un_m_49 |
The Territory Containment table shows the organization of territories and regions according to UN M.49, starting with the World. (CLDR supplements this table with the QO code for outlying areas that would not otherwise be included.) The last column lists the timezone IDs for each country.
|
||||||||||||
windows_tzid |
The Windows-Tzid table shows the mapping from Windows timezone IDs to the standard TZIDs.
|
||||||||||||
character_fallback_substitutions | The Character Fallback Substitutions table shows recommended fallbacks for use when a charset or supported repertoire does not contain a desired
character, using the data from characters.xml. There may be more than one possible
fallback. The recommended usage is that when a character value is not in the desired repertoire,
the following process is applied, and the first value that is wholly in the desired
repertoire is used.
The Explicit, NFC, and NFKC substitutes are shown in the chart by different colors. Note that the character fallbacks do lose information, and should not be used where there is a viable alternative, such as HTML escapes.
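The "first value wholly in the repertoire" rule can be sketched as follows. This is only an illustration: the function name `fallback`, the candidate order (NFC, then explicit substitutes, then NFKC), and the sample substitutes are assumptions, not the data or order defined in characters.xml.

```python
import unicodedata

def fallback(ch, explicit, repertoire):
    """Return the first candidate made up only of characters in `repertoire`.

    `explicit` is a hypothetical ordered list of explicit substitutes for `ch`.
    The candidate order below is an assumption made for this sketch.
    """
    if ch in repertoire:
        return ch  # no fallback needed
    candidates = [
        unicodedata.normalize("NFC", ch),
        *explicit,
        unicodedata.normalize("NFKC", ch),
    ]
    for cand in candidates:
        # A candidate qualifies only if every character in it is supported.
        if all(c in repertoire for c in cand):
            return cand
    return None  # no usable fallback; the caller must decide what to do

ASCII = {chr(i) for i in range(0x20, 0x7F)}
print(fallback("\u00a9", ["(C)"], ASCII))  # copyright sign, explicit substitute
print(fallback("\ufb01", [], ASCII))       # "fi" ligature, NFKC substitute
```

Returning `None` when nothing fits leaves room for the caller to use an alternative that does not lose information, such as an HTML escape.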
|
||||||||||||
aliases |
Aliases show how to map deprecated codes or aliases onto the ones that should be used to access CLDR data. Most other metadata is not shown in tables; the source data should be consulted. Codes are shown in brackets before or after the English name, e.g. "Vanuatu [VU]".
|
||||||||||||
likely_subtags |
There are a number of situations where it is useful to be able to
find the most likely language, script, or region, if that information is otherwise missing.
For example:
Conversely, given a locale, it is useful to find out which fields (language, script, or region) may be superfluous, in the sense that they contain the likely tags. For example, "en_Latn" can be simplified down to "en" since "Latn" is the likely script for "en"; "ja_Jpan_JP" can be simplified down to "ja". The likelySubtag supplemental data provides default information for computing these values. This data is based on the default content data, the population data, and the suppress-script data in [BCP47]. It is heuristically derived, and may change over time. The chart shows how the data "fills in" the missing fields in the source values to get the target values.
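The two directions described above (filling in missing fields, and dropping superfluous ones) can be sketched roughly as below. The `LIKELY` table holds a handful of illustrative entries, and the function names and lookup order are assumptions made for this sketch; it is not the full algorithm or data used by CLDR.

```python
# Illustrative entries only; the real data lives in the likelySubtag supplemental data.
LIKELY = {
    "en": ("en", "Latn", "US"),
    "ja": ("ja", "Jpan", "JP"),
    "zh": ("zh", "Hans", "CN"),
    "zh_Hant": ("zh", "Hant", "TW"),
}

def parse(tag):
    """Split 'lang[_Script][_REGION]' into (lang, script, region)."""
    parts = tag.split("_")
    lang, script, region = parts[0], "", ""
    for p in parts[1:]:
        if len(p) == 4 and p.isalpha():
            script = p  # four letters: treat as a script code
        else:
            region = p
    return lang, script, region

def fmt(lang, script, region):
    return "_".join(p for p in (lang, script, region) if p)

def add_likely(tag):
    """Fill in the missing script and region from the most specific match."""
    lang, script, region = parse(tag)
    for key in (fmt(lang, script, region), fmt(lang, "", region),
                fmt(lang, script, ""), lang):
        if key in LIKELY:
            _, s, r = LIKELY[key]
            return fmt(lang, script or s, region or r)
    return tag

def remove_likely(tag):
    """Return the shortest form that fills back in to the same full tag."""
    full = add_likely(tag)
    lang, script, region = parse(full)
    for trial in (lang, fmt(lang, "", region), fmt(lang, script, "")):
        if add_likely(trial) == full:
            return trial
    return full

print(add_likely("en"))             # en_Latn_US
print(remove_likely("en_Latn"))     # en
print(remove_likely("ja_Jpan_JP"))  # ja
print(remove_likely("zh_Hant_TW"))  # zh_Hant
```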
|
||||||||||||
language_plural_rules |
Languages vary in how they handle plurals of nouns or unit expressions ("hours", "meters", and so on). Some languages have two forms, like English; some languages have only a single form; and some languages have multiple forms (see Slovenian below). CLDR uses short, mnemonic tags for these plural categories:
These categories are used to provide localized units, with more natural ways of expressing phrases that vary in plural form, such as "1 hour" vs "2 hours". While they cannot express all the intricacies of natural languages, they allow for more natural phrasing than constructions like "1 hour(s)".
These categories are only mnemonics; the names don't necessarily imply the exact contents of the category. For example, for both English and French the number 1 has the category one (singular). In English, every other number has a plural form, and is given the category other. French is similar, except that the number 0 also has the category one and not other or zero, because the form of units qualified by 0 is also singular.
Note that these categories may be different from the forms used for pronouns or other parts of speech. In particular, they are solely concerned with changes that would need to be made if different numbers, expressed with decimal digits, are used with a sentence. If there is a dual form in the language, but it isn't used with decimal numbers, it should not be reflected in the categories. That is, the key feature to look for is: if you were to substitute a different number for "1" in a sentence or phrase, would the rest of the text be required to change? For example, in a caption for a video: "Duration: 1 hour" → "Duration: 3.2 hours".
Plural rule syntax: Each plural rule is a condition with a boolean result that specifies whether that rule (i.e. that plural form) applies for a given numeric value n; conditions have the following syntax:
condition = and_condition ('or' and_condition)*
Examples:
The following is draft data for this version of CLDR, so please look it over carefully. Known omissions are:
|
||||||||||||
error_locale_header|error_index_header |
Please review
and correct them. Note that errors in sublocales are often fixed by
fixing the main locale. This
list is only generated daily, and so may not reflect fixes you have
made until tomorrow. (There were production problems in integrating it
fully into the Survey tool. However, it should let you see the problems
and make sure that they get taken care of.)
The table below gives a count for each of the following kinds of items. The focus is on correcting the problems, and getting enough votes for "minimal approval" (status=contributed -- high enough to get incorporated into most implementations).
|
The text to insert can be fairly arbitrary HTML. The software that reads this table will search the first column (e.g. between <td> and </td>) and return the contents of the second column.
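As a rough sketch of that lookup, assuming each row is written as two adjacent <td> cells: the function name `help_text`, the regular expression, and the sample rows are hypothetical and are not taken from the survey tool's actual code.

```python
import re

# Matches two adjacent cells: <td>key</td><td>text</td>.
# Assumes cells carry no attributes and the text cell contains no nested <td>.
ROW = re.compile(r"<td>(.*?)</td>\s*<td>(.*?)</td>", re.DOTALL)

def help_text(table_html, key):
    """Return the second-column contents of the row whose first column is `key`."""
    for first, second in ROW.findall(table_html):
        if first.strip() == key:
            return second.strip()
    return None  # unknown key: no help text available

SAMPLE = """
<table>
<tr><td>windows_tzid</td><td><p>The Windows-Tzid table shows the mapping.</p></td></tr>
<tr><td>aliases</td><td><p>Aliases show how to map deprecated codes.</p></td></tr>
</table>
"""

print(help_text(SAMPLE, "aliases"))
```

A lookup of this kind depends on the exact cell layout, which is why arbitrary changes to the table format are risky.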
WARNING