[gut] Fwd: Re: Web Query: tex-locale: "wrong" spaces in group-sep with fr-FR
Denis Bitouzé
2018-10-04 07:02:20 UTC
Hello,

following up on the thread "[Typo] Taille de l'espace séparant les
milliers" (size of the space separating thousands), here is the exchange
I had with Nicola Talbot. She suggests contacting the Unicode CLDR
(Common Locale Data Repository) team.

What do you think?

-------------------- Start of forwarded message --------------------
Subject: Re: Web Query: tex-locale: "wrong" spaces in group-sep with
fr-FR
To: ***@univ-littoral.fr
From: Dr Nicola L C Talbot <***@dickimaw-books.com>
Date: Tue, 2 Oct 2018 19:52:09 +0100

Hi Denis,

the tex-locale package simply obtains the information from texosquery,
which in turn gets it from the Java virtual machine. The aim of
tex-locale is to remove the need for separate language packages so it's
not like datetime2. The locale information is dependent on the Java
setup. If you are using Java 8 (which is needed for texosquery-jre8)
then the locale information is obtained according to the data provider.

With older versions of Java, the only data provider was the JRE (Java
runtime environment) itself. With Java 8, there are several data
providers. The default is still the JRE (although that will change with
Java 9), but it's now possible to switch to a different provider by
defining the java.locale.providers variable according to your order of
preference. The bash script texosquery-jre8.sh which is used by
Unix-like systems sets this to "CLDR,JRE" by invoking Java with:

java -Djava.locale.providers=CLDR,JRE -jar "$jarpath" "$@"

It's possible to also set this variable globally.

The separators used depend entirely on the data provider. The aim of the
CLDR (common locale data repository) is to provide a common set of
locale data that can be consistently used across all applications.

There's more information about java.locale.providers at
https://docs.oracle.com/javase/8/docs/technotes/guides/intl/enhancements.8.html
and further information about the CLDR at http://cldr.unicode.org/

I've tried texosquery-jre8 -D fr-FR with various data providers and get
the following result for the numeric group separator:

CLDR and JRE: 0x00A0 (no-break space)
SPI and HOST: 0x002C (comma)
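
As a rough illustration of where such values come from, here is a minimal Java sketch that asks the JVM for the fr-FR grouping separator and prints its code point. It is not texosquery's own code: the class name GroupSeparatorProbe and the helpers hex and groupSeparator are made up for this example, and it assumes the separator is read through the standard java.text.DecimalFormatSymbols API.

```java
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical probe class, not part of texosquery.
public class GroupSeparatorProbe {
    /** Formats a character as a four-digit hexadecimal code point, e.g. 0x00A0. */
    public static String hex(char c) {
        return String.format(Locale.ROOT, "0x%04X", (int) c);
    }

    /** Returns the grouping separator reported by the active locale data provider. */
    public static char groupSeparator(Locale locale) {
        return DecimalFormatSymbols.getInstance(locale).getGroupingSeparator();
    }

    public static void main(String[] args) {
        // The property is unset unless -Djava.locale.providers=... was given.
        String providers = System.getProperty("java.locale.providers", "(default)");
        char sep = groupSeparator(Locale.FRANCE);
        System.out.println("providers=" + providers + " group-sep=" + hex(sep));
    }
}
```

Running it as java -Djava.locale.providers=CLDR,JRE GroupSeparatorProbe and again with another provider order should show whether the separator changes; the value printed depends on the Java version and on the locale data it ships.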

If 0x00A0 is inappropriate, the issue would need to be taken up with the
Unicode CLDR team.

If the separator isn't appropriate to your requirements, you can change
it within TeX by modifying the underlying locale attribute using
\LocaleSetDialectAttribute as in your example, but the main aim of
tex-locale is to bring TeX in line with other applications that share
the CLDR.

Regards
Nicola

Hi Nicola,

thanks a lot for your `tex-locale' package.

I'm not using https://www.dickimaw-books.com/ to report this because the
package isn't listed there.

As the following MWE shows, the `fr-FR' locale leads to "big" spaces in
`group-sep' (I guess non-breaking interword spaces). After some
discussions on French forums dedicated to TeX with people who know French
typography better than I do, and after a look at
https://www.bipm.org/en/publications/si-brochure/section5-3-4.html, it
looks like these spaces are "wrong".

If I understand correctly, `tex-locale' is not responsible for this since
it just "look[s] up the locale information from the operating system".
Maybe the OS defines the numeric group separator as an interword space
because only monospace fonts were considered.

Nevertheless, do you plan to provide multilingual support (as you did
for `datetime2') so that users who know their locale well can override
the locale information looked up from the OS?

And is it possible to change the percent separator?

All the best.
``` latex
\documentclass[french]{article}
\usepackage[locale=FR]{siunitx}
\usepackage[date=full,time=full,timedata]{tex-locale}
\newcommand{\test}{%
\CurrentLocaleIfNumericUsesGroup{yes}{no}.
\par
Currency Symbol: \CurrentLocaleCurrency
\par
\texosqueryfmtnumber{\CurrentLocaleCurrencyPattern}{1234567}{0}{0}\par
\par
\texosqueryfmtnumber{\CurrentLocaleIntegerPattern}{123456}{0}{0}
(via \textsf{tex-locale})
\par
\num{123456} (via \textsf{siunitx})
\par
\texosqueryfmtnumber{\CurrentLocaleDecimalPattern}{123456}{789876}{0}
\par
\num{123456,789876} (via \textsf{siunitx})
\par
\texosqueryfmtnumber{\CurrentLocalePercentPattern}{0}{65}{0} (via
\textsf{tex-locale})
\par
\SI{65}{\percent} (via \textsf{siunitx})
}
\begin{document}
\section{Default}
\test
\section{Modified}
\LocaleSetDialectAttribute{\CurrentTrackedDialect}{groupsep}{\,}%
\test
\end{document}
```
--
Denis
Arthur Reutenauer
2018-10-04 09:46:00 UTC
Post by Denis Bitouzé
following up on the thread "[Typo] Taille de l'espace séparant les
milliers" (size of the space separating thousands), here is the exchange
I had with Nicola Talbot. She suggests contacting the Unicode CLDR
(Common Locale Data Repository) team.
What do you think?

That is a very good idea: my impression of the CLDR development process
is that few of the contributors have real typographic expertise.

Regards,

Arthur
Denis Bitouzé
2018-10-04 12:46:41 UTC
Post by Arthur Reutenauer
Post by Denis Bitouzé
following up on the thread "[Typo] Taille de l'espace séparant les
milliers" (size of the space separating thousands), here is the exchange
I had with Nicola Talbot. She suggests contacting the Unicode CLDR
(Common Locale Data Repository) team.
What do you think?
That is a very good idea,

Perfect.

Post by Arthur Reutenauer
my impression of the CLDR development process is that few of the
contributors have real typographic expertise.

Would you be willing to report the current shortcomings to the CLDR
team, at least as far as the fr_FR locale is concerned? You know
typography better than I do.

Regards.
--
Denis