[PATCH] Pay more attention to the locale's coding systems on startup

stephen at xemacs.org stephen at xemacs.org
Sun Nov 12 22:50:12 EST 2006


VETO

Correctness before convenience, please.  I agree with you that the
language environment stuff is broken in how it handles various
environments, especially those using UTF-8, but this patches an
inconvenience without fixing the real problem.

Aidan Kehoe writes:

 > 1. It doesn’t deal with the @modifier syntax in Unix locale specifications,
 > which are structured like so: 

To deal with the @modifier syntax you'll need to provide a hook.  At
least in POSIX.1 the @modifier stuff was basically implementation-
and/or site-specific.  There are common ones; you can provide a
default implementation for those if you like.
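
A minimal sketch of what such a hook could look like, written in Python
only for illustration (it is not XEmacs code).  It assumes the common
language[_territory][.codeset][@modifier] layout; the names
modifier_hooks, euro_hook and parse_locale are hypothetical, and the
"@euro" default reflects glibc's convention, not anything POSIX
mandates.

```python
import re

# Common XPG-style layout: language[_territory][.codeset][@modifier]
_LOCALE_RE = re.compile(
    r"^(?P<language>[A-Za-z]+)"
    r"(?:_(?P<territory>[A-Za-z0-9]+))?"
    r"(?:\.(?P<codeset>[^@]+))?"
    r"(?:@(?P<modifier>.+))?$"
)

# Hypothetical hook table: modifier name -> function adjusting the parsed
# fields.  Sites can register their own entries here.
modifier_hooks = {}


def euro_hook(fields):
    # Assumption (glibc convention): "@euro" with no explicit codeset
    # means ISO-8859-15.  An explicit codeset is left alone.
    if fields["codeset"] is None:
        fields["codeset"] = "ISO-8859-15"
    return fields


modifier_hooks["euro"] = euro_hook  # one default implementation


def parse_locale(name):
    """Split a locale name into fields, then run the modifier hook, if any."""
    m = _LOCALE_RE.match(name)
    if m is None:
        return None
    fields = m.groupdict()
    hook = modifier_hooks.get(fields["modifier"])
    if hook is not None:
        fields = hook(fields)
    return fields
```

A modifier with no registered hook (say "sr_RS@latin") is parsed and
passed through untouched, so site-specific modifiers are neither
guessed at nor rejected.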

 > 2. It’s possible for me to start up in, for example, the directory
 > "/tmp/aidan/?? ??????!" with a LC_CTYPE setting of en_US.UTF-8, and have
 > XEmacs treat the current directory’s name as being encoded in UTF-8.

LC_CTYPE is a user preference on a multilingual system, but the file
system encoding is typically a site-wide parameter, sometimes enforced
by (some subsystem of) the OS.  As I understand it, under Mac OS X
with HFS+, file names are UTF-8 in the maximally decomposed canonical
form.  Period.  IIRC Japanese VFAT file systems are always
Shift JIS.  If file-name-coding-system is reset to something else,
these systems will generate very confusing errors.  On other systems
I've used, the file system is EUC-JP even though I used ja_JP.UTF8 or
en_US.ISO8859-1 for the default content encoding.  Depending on what
I'm doing, it's often convenient to change locales.  If XEmacs were to
pay attention to the locale for this parameter, hilarity (not to
mention annoyance) would ensue.

This should be handled by setting file-name-coding-system in the
site-start.el file, not by paying attention to user locale settings.
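
To make the failure mode concrete, here is a small Python illustration
(not XEmacs code) of an EUC-JP file system seen by a user running in a
UTF-8 locale.  SITE_FILE_NAME_CODING and both helper functions are
hypothetical names; the point is only that the site-wide setting
decodes the name and the locale's codeset does not.

```python
# Assumption: the site's file system stores names in EUC-JP.
SITE_FILE_NAME_CODING = "euc_jp"


def decode_with_locale(raw_name, locale_codeset):
    """Decode a file name using the user's locale codeset (the approach
    being vetoed).  Returns None when the bytes are not valid in it."""
    try:
        return raw_name.decode(locale_codeset)
    except UnicodeDecodeError:
        return None


def decode_with_site_setting(raw_name):
    """Decode a file name using the site-wide coding system only."""
    return raw_name.decode(SITE_FILE_NAME_CODING)


# Bytes as they would sit on disk for a directory named "Japanese
# language" in Japanese (U+65E5 U+672C U+8A9E), encoded as EUC-JP.
on_disk = "\u65e5\u672c\u8a9e".encode("euc_jp")
```

Here decode_with_locale(on_disk, "utf-8") fails to decode at all, while
decode_with_site_setting(on_disk) recovers the name whatever locale the
user happens to be running in.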

 > 2006-11-12  Aidan Kehoe  <kehoea at parhasard.net>
 > 
 > 	* mule/mule-cmds.el (get-language-environment-from-locale):
 > 	If the Unix locale matches, and the language environment doesn't
 > 	use the coding system specified in that locale, create a new one
 > 	on the fly that does, and return that language environment. 
 > 	<== #### NOTE NOTE NOTE bogus trailing whitespace

I wish you'd not do this.  Some of my private tools expect the regexp
"\n\n\d{4}-\d{2}-\d{2}  " to match start-of-log-entry, and for "\n\n"
to separate stanzas in a log entry.
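
For the curious, a rough Python equivalent of what those tools do (the
sample log text and the names ENTRY_START and split_stanzas are made up
for illustration).  A separator line that contains a stray tab or space
is no longer an empty line, so both the entry regexp and the stanza
split miss it.

```python
import re

# Start of a log entry: empty line, then "YYYY-MM-DD" and two spaces.
ENTRY_START = re.compile(r"\n\n\d{4}-\d{2}-\d{2}  ")


def split_stanzas(entry):
    """Split one log entry into stanzas on truly empty lines."""
    return entry.split("\n\n")


# Hypothetical sample entry with clean separators.
clean = (
    "2006-11-12  A. Hacker  <hacker@example.org>\n"
    "\n"
    "\t* a.el (foo): First change.\n"
    "\n"
    "\t* b.el (bar): Second change.\n"
)

# Same entry, but the second separator line holds a stray tab.
dirty = clean.replace("change.\n\n\t* b.el", "change.\n\t\n\t* b.el")
```

split_stanzas(clean) yields the header plus two stanzas, while
split_stanzas(dirty) glues the two stanzas together; likewise
ENTRY_START stops matching when the line before the date is
whitespace-only instead of empty.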



More information about the XEmacs-Patches mailing list