[PATCH] Non-Latin-1 escapes can lead to corrupted ELC code.

Aidan Kehoe kehoea at parhasard.net
Sun May 6 19:46:40 EDT 2007


Without this patch, the following test file is compiled incorrectly: 

(defvar Pravda "\u05bf\u05e0\u05d0\u05d2\u05d4\u05d0")

The bug dates from my introduction of Unicode escapes; before that, we
didn’t allow non-Latin-1 characters to be specified with ?\x or ?\. FSF have
had a similar problem for much longer, since they’ve always allowed
non-Latin-1 characters to be specified with ?\x and ?\, but there it
manifests itself on load, not on compile.

ChangeLog addition:

lisp/ChangeLog addition:

2007-05-07  Aidan Kehoe  <kehoea at parhasard.net>

	* bytecomp.el (byte-compile-insert-header):
	Check for any Unicode escapes in the source file text when
	deciding whether Mule support is necessary for it, and whether to
	use escape-quoted as the .elc coding system. 
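
As a rough illustration of the check described in the entry above, here is a
minimal sketch in Python rather than Emacs Lisp, so that it does not depend on
the string and regexp syntax of any particular Emacs. The names NEEDS_MULE and
needs_mule are hypothetical, and the pattern is an assumption based on the
description: a character outside Latin-1, or a \uXXXX or \UXXXXXXXX escape,
anywhere in the source text (comments included). It is not the code in
bytecomp.el.

```python
import re

# Assumed pattern (hypothetical name): a literal character above U+00FF,
# or a backslash-u escape with 4 hex digits, or a backslash-U escape with
# 8 hex digits. Matching is case-sensitive, so \u and \U stay distinct.
NEEDS_MULE = re.compile(
    r"[^\x00-\xff]"
    r"|\\u[0-9a-fA-F]{4}"
    r"|\\U[0-9a-fA-F]{8}"
)


def needs_mule(source_text: str) -> bool:
    """Return True if the text would need the escape-quoted coding system.

    Like the approach in the patch, this also fires on matches inside
    comments; it makes no attempt to parse the Lisp source.
    """
    return NEEDS_MULE.search(source_text) is not None
```

For the test file quoted above, needs_mule(r'(defvar Pravda "\u05bf\u05e0")')
is true because of the escapes, while a file containing only Latin-1 text such
as "café" yields false and could be written as raw-text-unix.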


XEmacs Trunk source patch:
Diff command:   cvs -q diff -Nu
Files affected: lisp/bytecomp.el
===================================================================
Index: lisp/bytecomp.el
===================================================================
RCS file: /pack/xemacscvs/XEmacs/xemacs/lisp/bytecomp.el,v
retrieving revision 1.19
diff -u -u -r1.19 bytecomp.el
--- lisp/bytecomp.el	2004/08/13 21:19:15	1.19
+++ lisp/bytecomp.el	2007/05/06 23:32:41
@@ -1842,10 +1842,16 @@
 	  (save-excursion
 	    (set-buffer byte-compile-inbuffer)
 	    (goto-char (point-min))
-	    ;; mrb- There must be a better way than skip-chars-forward
-	    (skip-chars-forward (concat (char-to-string 0) "-"
-					(char-to-string 255)))
-	    (eq (point) (point-max))))
+            ;; Look for any non-Latin-1 literals or Unicode character
+            ;; escapes. Also catches them in comments, which is actually
+            ;; irrelevant to us, but implementing a more complex algorithm
+            ;; is not worth the trade-off.
+            (let ((case-fold-search nil))
+              (re-search-forward 
+               (concat "[^\000-\377]" 
+                       #r"\\u[0-9a-fA-F]\{4,4\}\|\\U[0-9a-fA-F]\{8,8\}")
+               nil t)))
+	    (eq (point) (point-max)))
       (setq buffer-file-coding-system 'raw-text-unix)
     (insert "(or (featurep 'mule) (error \"Loading this file requires Mule support\"))
 ;;;###coding system: escape-quoted\n")

-- 
On the quay of the little Black Sea port, where the rescued pair came once
more into contact with civilization, Dobrinton was bitten by a dog which was
assumed to be mad, though it may only have been indiscriminating. (Saki)


