GNU bug report logs - #30219
ert explainer for `equal' fails for equal-but-for-multibyte-ness strings


Package: emacs;

Reported by: Philipp <p.stephani2 <at> gmail.com>

Date: Mon, 22 Jan 2018 22:56:02 UTC

Severity: minor

Tags: confirmed, patch

Found in version 27.0.50

Done: Mattias Engdegård <mattiase <at> acm.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 30219 in the body.
You can then email your comments to 30219 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#30219; Package emacs. (Mon, 22 Jan 2018 22:56:04 GMT)

Acknowledgement sent to Philipp <p.stephani2 <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Mon, 22 Jan 2018 22:56:04 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Philipp <p.stephani2 <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 27.0.50; (should (equal ...)) bug for string equality
Date: Mon, 22 Jan 2018 23:55:13 +0100
Define the following ERT test:

(ert-deftest foo ()
  (should (equal "a\xFF" "a\u00FF")))

Then run M-x ert.  Instead of reporting an ordinary test failure, the
test trips an internal assertion in ERT's explainer for `equal'.  This is
because ERT incorrectly assumes that two arrays are equal if their
lengths and elements are equal, but that's not the case when comparing
unibyte and multibyte strings.
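
The mismatch described above can be checked directly. The following is a minimal sketch, assuming the default reader behavior in which "\xFF" on its own yields a unibyte string while "\u00FF" forces a multibyte one. The names prefixed with bug30219- are hypothetical and exist only for this illustration.

```elisp
;; Hypothetical names, used only for this illustration.
(defvar bug30219-unibyte "a\xFF")     ; read as a unibyte string
(defvar bug30219-multibyte "a\u00FF") ; read as a multibyte string

(defun bug30219-summary (a b)
  "Return a plist comparing strings A and B along several axes."
  (list :same-length   (= (length a) (length b))
        ;; `append' with a nil tail turns a string into a list of its
        ;; elements, i.e. the values `aref' would return.
        :same-elements (equal (append a nil) (append b nil))
        :multibyte     (list (multibyte-string-p a) (multibyte-string-p b))
        :equal         (equal a b)))

(bug30219-summary bug30219-unibyte bug30219-multibyte)
;; => (:same-length t :same-elements t :multibyte (nil t) :equal nil)
```

Length and elements agree, yet `equal' returns nil, which is the situation the explainer does not expect.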


In GNU Emacs 27.0.50 (build 10, x86_64-apple-darwin17.3.0, NS appkit-1561.20 Version 10.13.2 (Build 17C205))
 of 2018-01-22 built on p
Repository revision: 3558d96b60393893a346f4382b813ca0738f9d9b
Windowing system distributor 'Apple', version 10.3.1561
System Description:  Mac OS X 10.13.2

Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.

Configured using:
 'configure --without-threads --with-modules --without-pop
 --with-mailutils --enable-gcc-warnings=yes --enable-checking
 --enable-check-lisp-object-type 'CFLAGS=-ggdb3 -O0''

Configured features:
NOTIFY ACL GNUTLS LIBXML2 ZLIB TOOLKIT_SCROLL_BARS NS MODULES JSON

Important settings:
  value of $LANG: de_DE.UTF-8
  locale-coding-system: utf-8-unix

Major mode: Lisp Interaction

Minor modes in effect:
  tooltip-mode: t
  global-eldoc-mode: t
  eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Load-path shadows:
None found.

Features:
(shadow sort mail-extr emacsbug message rmc puny seq byte-opt gv
bytecomp byte-compile cconv cl-loaddefs cl-lib dired dired-loaddefs
format-spec rfc822 mml easymenu mml-sec password-cache epa derived epg
epg-config gnus-util rmail rmail-loaddefs mm-decode mm-bodies mm-encode
mail-parse rfc2231 mailabbrev gmm-utils mailheader sendmail rfc2047
rfc2045 ietf-drums mm-util mail-prsvr mail-utils elec-pair time-date
tooltip eldoc electric uniquify ediff-hook vc-hooks lisp-float-type
mwheel term/ns-win ns-win ucs-normalize mule-util term/common-win
tool-bar dnd fontset image regexp-opt fringe tabulated-list replace
newcomment text-mode elisp-mode lisp-mode prog-mode register page
menu-bar rfn-eshadow isearch timer select scroll-bar mouse jit-lock
font-lock syntax facemenu font-core term/tty-colors frame cl-generic
cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao
korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech
european ethiopic indian cyrillic chinese composite charscript charprop
case-table epa-hook jka-cmpr-hook help simple abbrev obarray minibuffer
cl-preloaded nadvice loaddefs button faces cus-face macroexp files
text-properties overlay sha1 md5 base64 format env code-pages mule
custom widget hashtable-print-readable backquote kqueue cocoa ns
multi-tty make-network-process emacs)

Memory information:
((conses 16 204871 8960)
 (symbols 48 20154 1)
 (miscs 40 57 145)
 (strings 32 28911 1996)
 (string-bytes 1 771735)
 (vectors 16 35241)
 (vector-slots 8 721940 14772)
 (floats 8 52 64)
 (intervals 56 208 0)
 (buffers 992 11))




Changed bug title to 'ert explainer for `equal' fails for equal-but-for-multibyte-ness strings' from '27.0.50; (should (equal ...)) bug for string equality' Request was from Noam Postavsky <npostavs <at> users.sourceforge.net> to control <at> debbugs.gnu.org. (Fri, 09 Feb 2018 02:34:02 GMT)

Severity set to 'minor' from 'normal' Request was from Noam Postavsky <npostavs <at> users.sourceforge.net> to control <at> debbugs.gnu.org. (Fri, 09 Feb 2018 02:34:02 GMT)

Added tag(s) confirmed. Request was from Noam Postavsky <npostavs <at> users.sourceforge.net> to control <at> debbugs.gnu.org. (Fri, 09 Feb 2018 02:34:02 GMT)

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30219; Package emacs. (Wed, 09 Oct 2019 21:30:02 GMT)

Message #14 received at 30219 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Philipp <p.stephani2 <at> gmail.com>
Cc: 30219 <at> debbugs.gnu.org
Subject: Re: bug#30219: 27.0.50; (should (equal ...)) bug for string equality
Date: Wed, 09 Oct 2019 23:29:15 +0200
Philipp <p.stephani2 <at> gmail.com> writes:

> Define the following ERT test:
>
> (ert-deftest foo ()
>   (should (equal "a\xFF" "a\u00FF")))
>
> Then run M-x ert.  The test triggers an assertion.  This is because ERT
> incorrectly assumes that two arrays are equal if their lengths and
> elements are equal, but that's not the case when comparing unibyte and
> multibyte strings.

I tried putting

(ert-deftest foo ()
  (should (equal "a\xFF" "a\u00FF")))

into one of the test files and I got

1 unexpected results:
   FAILED  foo

So I'm not able to reproduce this bug.  But the code in question hasn't
changed since this was reported, so I'm not sure why I can't reproduce
it.

Does anybody else see this?  For reference, this is the code in question:

      ((pred arrayp)
       (if (/= (length a) (length b))
           `(arrays-of-different-length ,(length a) ,(length b)
                                        ,a ,b
                                        ,@(unless (char-table-p a)
                                            `(first-mismatch-at
                                              ,(cl-mismatch a b :test 'equal))))
         (cl-loop for i from 0
                  for ai across a
                  for bi across b
                  for xi = (ert--explain-equal-rec ai bi)
                  do (when xi (cl-return `(array-elt ,i ,xi)))
                  finally (cl-assert (equal a b) t))))



-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30219; Package emacs. (Thu, 10 Oct 2019 14:35:01 GMT)

Message #17 received at 30219 <at> debbugs.gnu.org:

From: Mattias Engdegård <mattiase <at> acm.org>
To: 30219 <at> debbugs.gnu.org
Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, Philipp <p.stephani2 <at> gmail.com>
Subject: Re: bug#30219: 27.0.50; (should (equal ...)) bug for string equality
Date: Thu, 10 Oct 2019 16:34:03 +0200
The bug is directly reproducible, and not very surprising given the subtleties in mixed uni/multibyte string comparisons. ERT treats strings as arrays to be compared element-wise via `aref', which isn't how `equal' works.

There are several solutions, none perfect. I'll be back with a patch later today or tomorrow.
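
To illustrate the point about element-wise comparison, here is a simplified sketch of an `aref'-style walk over two arrays. The function bug30219-naive-explain is a hypothetical stand-in, not ERT's actual code: it omits the length check and the recursion, and only shows why such a walk finds nothing to report for the strings in this bug.

```elisp
(require 'cl-lib)

(defun bug30219-naive-explain (a b)
  "Hypothetical, simplified stand-in for an element-wise array explainer.
Return (array-elt INDEX X Y) for the first pair of differing
elements of arrays A and B, or nil if every pair is `eql'."
  (cl-loop for i from 0
           for ai across a
           for bi across b
           unless (eql ai bi)
           return (list 'array-elt i ai bi)))

;; An ordinary mismatch is located as expected.
(bug30219-naive-explain "ab" "ac")          ; => (array-elt 1 98 99)

;; For the strings from the report, every element matches ...
(bug30219-naive-explain "a\xFF" "a\u00FF")  ; => nil
;; ... although `equal' still says the strings differ.
(equal "a\xFF" "a\u00FF")                   ; => nil
```

When the walk returns nil for two values that are not `equal', a final sanity check such as the `cl-assert' in the code quoted earlier in the thread has no explanation to fall back on and signals an error instead.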





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30219; Package emacs. (Thu, 10 Oct 2019 19:30:03 GMT)

Message #20 received at 30219 <at> debbugs.gnu.org:

From: Mattias Engdegård <mattiase <at> acm.org>
To: 30219 <at> debbugs.gnu.org
Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, Philipp <p.stephani2 <at> gmail.com>
Subject: Re: bug#30219: 27.0.50; (should (equal ...)) bug for string equality
Date: Thu, 10 Oct 2019 21:29:01 +0200
> There are several solutions, none perfect. I'll be back with a patch later today or tomorrow.

Please try this patch.

[0001-Correctly-explain-test-failures-with-mixed-uni-multi.patch (application/octet-stream, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30219; Package emacs. (Fri, 11 Oct 2019 05:44:02 GMT)

Message #23 received at 30219 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 30219 <at> debbugs.gnu.org, Philipp <p.stephani2 <at> gmail.com>
Subject: Re: bug#30219: 27.0.50; (should (equal ...)) bug for string equality
Date: Fri, 11 Oct 2019 07:43:18 +0200
Mattias Engdegård <mattiase <at> acm.org> writes:

>        ((pred arrayp)
> +       ;; For mixed unibyte/multibyte string comparisons, make both multibyte.
> +       (when (and (stringp a)
> +                  (xor (multibyte-string-p a) (multibyte-string-p b)))
> +         (setq a (string-to-multibyte a))
> +         (setq b (string-to-multibyte b)))

I guess that makes sense.  We (possibly) lose text properties here, but
we don't care about those anyway in this context, I think?
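
The effect of the quoted `when' form can be sketched in isolation. The helper bug30219-normalize-pair below is hypothetical and only mirrors the idea of the patch: it compares the two `multibyte-string-p' results with `eq' instead of calling `xor', on the assumption that `xor' may not be available as a function in older Emacs versions. It does not address the text-property question raised above.

```elisp
(defun bug30219-normalize-pair (a b)
  "Return (A . B), making both strings multibyte if exactly one is.
Hypothetical helper mirroring the idea of the quoted patch."
  (if (and (stringp a) (stringp b)
           ;; `multibyte-string-p' returns t or nil, so `eq' detects
           ;; whether the two strings differ in multibyteness.
           (not (eq (multibyte-string-p a) (multibyte-string-p b))))
      (cons (string-to-multibyte a) (string-to-multibyte b))
    (cons a b)))

(let ((pair (bug30219-normalize-pair "a\xFF" "a\u00FF")))
  (list (aref (car pair) 1) (aref (cdr pair) 1)))
;; => (4194303 255)
```

After conversion, the byte 255 of the unibyte string becomes the raw-byte character #x3FFFFF (4194303), which no longer compares equal to the character U+00FF (255). An element-wise walk can then report a mismatch at index 1.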

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30219; Package emacs. (Fri, 11 Oct 2019 09:22:02 GMT)

Message #26 received at 30219 <at> debbugs.gnu.org:

From: Mattias Engdegård <mattiase <at> acm.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 30219 <at> debbugs.gnu.org, Philipp <p.stephani2 <at> gmail.com>
Subject: Re: bug#30219: 27.0.50; (should (equal ...)) bug for string equality
Date: Fri, 11 Oct 2019 11:21:33 +0200
On 11 Oct 2019, at 07:43, Lars Ingebrigtsen <larsi <at> gnus.org> wrote:
> 
> I guess that makes sense.  We (possibly) lose text properties here, but
> we don't care about those anyway in this context, I think?

Oops, that wasn't intended. Second try.

[0001-Correctly-explain-test-failures-with-mixed-uni-multi.patch (application/octet-stream, attachment)]

Added tag(s) patch. Request was from Mattias Engdegård <mattiase <at> acm.org> to control <at> debbugs.gnu.org. (Fri, 11 Oct 2019 12:20:02 GMT)

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30219; Package emacs. (Sun, 13 Oct 2019 18:11:01 GMT)

Message #31 received at 30219 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 30219 <at> debbugs.gnu.org, Philipp <p.stephani2 <at> gmail.com>
Subject: Re: bug#30219: 27.0.50; (should (equal ...)) bug for string equality
Date: Sun, 13 Oct 2019 20:10:11 +0200
Mattias Engdegård <mattiase <at> acm.org> writes:

>        ((pred arrayp)
> +       ;; For mixed unibyte/multibyte string comparisons, make both multibyte.
> +       (when (and (stringp a)
> +                  (xor (multibyte-string-p a) (multibyte-string-p b)))
> +         (setq a (string-to-multibyte a))
> +         (setq b (string-to-multibyte b)))

Looks good to me, I think.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Reply sent to Mattias Engdegård <mattiase <at> acm.org>:
You have taken responsibility. (Sun, 13 Oct 2019 18:32:02 GMT)

Notification sent to Philipp <p.stephani2 <at> gmail.com>:
bug acknowledged by developer. (Sun, 13 Oct 2019 18:32:02 GMT)

Message #36 received at 30219-done <at> debbugs.gnu.org:

From: Mattias Engdegård <mattiase <at> acm.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 30219-done <at> debbugs.gnu.org, Philipp <p.stephani2 <at> gmail.com>
Subject: Re: bug#30219: 27.0.50; (should (equal ...)) bug for string equality
Date: Sun, 13 Oct 2019 20:30:52 +0200
On 13 Oct 2019, at 20:10, Lars Ingebrigtsen <larsi <at> gnus.org> wrote:
> 
> Looks good to me, I think.

Thank you, pushed to master and bug closed.





bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 11 Nov 2019 12:24:06 GMT)

This bug report was last modified 4 years and 157 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.