From jrgn.keil at googlemail.com  Fri Sep 18 18:48:31 2009
From: jrgn.keil at googlemail.com (Juergen Keil)
Date: Fri, 18 Sep 2009 17:48:31 +0200
Subject: [patch] bug 88: 8-bit input not accepted any more in 8-bit locales on Solaris
Message-ID:

I'm using the tcsh binary that is included with OpenSolaris, and after
a recent OS upgrade noticed that 8-bit characters don't work any more
in tcsh in Solaris' 8-bit locales. Part of the OS upgrade was an update
from tcsh 6.14.07 to 6.16.00.

Full details are available in bug 88:

    http://bugs.gw.com/view.php?id=88

Root cause for the problem should be this piece of code in sh.c:

    {
        int k;

        for (k = 0200; k <= 0377 && !Isprint(CTL_ESC(k)); k++)
            continue;
        AsciiOnly = MB_CUR_MAX == 1 && k > 0377;
    }

When tcsh is compiled with WIDE_STRINGS, the Isprint() macro is
expected to be used with wchar_t arguments, but here it is called with
the values 0200 .. 0377 (0x80 .. 0xff hex). This might work on some
systems / locales, where Unicode character codes are used as wchar_t
values. But it does not work in Solaris' ISO8859-x 8-bit locales.

As a result AsciiOnly is set to TRUE in Solaris ISO8859-x locales, and
this breaks input for 8-bit characters.

I think the tcsh 8-bit problem on Solaris can be fixed by changing the
way "AsciiOnly" is set (see the attached sol-8bit.patch file):

- on multibyte locales (MB_CUR_MAX > 1), AsciiOnly is always set to
  false. This includes UTF-8, and locales like
  zh_CN.{EUC,GBK,GB18030}

- on single-byte locales, AsciiOnly is set to true when there are no
  printable characters in range 0x80 .. 0xff. Because we test this
  for single-byte locales only, isprint() from ctype.h can be used.
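The two rules above can be sketched in standalone C. This is only an
illustration under stated assumptions, not the contents of
sol-8bit.patch: the function name ascii_only() is hypothetical, and the
CTL_ESC() mapping that tcsh applies in sh.c is left out.

```c
#include <ctype.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not tcsh code): returns 1 when the current
 * locale should be treated as ASCII-only, 0 otherwise.
 */
int
ascii_only(void)
{
    int k;

    /* Multibyte locales (UTF-8, EUC, GBK, ...) are never ASCII-only. */
    if (MB_CUR_MAX > 1)
        return 0;

    /*
     * Single-byte locale: look for a printable character in
     * 0x80 .. 0xff. Plain isprint() is valid here because each value
     * is representable as an unsigned char.
     */
    for (k = 0200; k <= 0377 && !isprint(k); k++)
        continue;

    /* The loop ran past 0377 only if nothing printable was found. */
    return k > 0377;
}
```

A caller would normally run setlocale(LC_ALL, "") first, so that
MB_CUR_MAX and isprint() reflect the user's locale rather than the
default "C" locale.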
(This avoids having to use iconv(), mbtowc() and friends to construct
valid wchar_t values for tcsh's Isprint() macro)

From christos at zoulas.com  Fri Sep 18 21:10:30 2009
From: christos at zoulas.com (Christos Zoulas)
Date: Fri, 18 Sep 2009 14:10:30 -0400 (EDT)
Subject: PR/88 CVS commit: tcsh
Message-ID: <20090918181030.5429856550@rebar.astron.com>

Module Name:    tcsh
Committed By:   christos
Date:           Fri Sep 18 18:10:30 UTC 2009

Modified Files:
    tcsh: sh.c

Log Message:
from Juergen Keil, fix AsciiOnly setting per PR/88.

To generate a diff of this commit:
cvs rdiff -r3.146 -r3.147 tcsh/sh.c

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.

From christos at zoulas.com  Fri Sep 18 23:56:10 2009
From: christos at zoulas.com (Christos Zoulas)
Date: Fri, 18 Sep 2009 16:56:10 -0400
Subject: [patch] bug 88: 8-bit input not accepted any more in 8-bit locales on Solaris
In-Reply-To: from Juergen Keil (Sep 18, 5:48pm)
Message-ID: <20090918205610.6351E5654E@rebar.astron.com>

On Sep 18, 5:48pm, jrgn.keil at googlemail.com (Juergen Keil) wrote:
-- Subject: [patch] bug 88: 8-bit input not accepted any more in 8-bit locale

Thanks a lot for the explanation and the fix. I have committed it already.

christos