GNU bug report logs - #7781: 23.2.91; ispell problem with hunspell and UTF-8 file
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 7781 in the body.
You can then email your comments to 7781 AT debbugs.gnu.org in the normal way.
Report forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Mon, 03 Jan 2011 23:08:02 GMT) Full text and rfc822 format available.
Acknowledgement sent to Reuben Thomas <rrt <at> sc3d.org>: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Mon, 03 Jan 2011 23:08:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
With the following text, and using emacs -Q, I get the errors you can
see in the messages log below when using hunspell to spell-check a UTF-8
buffer with some extended characters in it.
I did test this with emacs -Q, but the current session, in which I
reproduced the problem and am now composing this bug report, was not
started with -Q (this is so submitting the bug report works properly!).
I am running a freshly bzr-pulled build of the emacs-23 branch.
Text follows
----cut here----
---
title: Kindle 3 is a good first attempt
tags: computing, books
format: markdown
date: Mon, 03 Jan 2011 20:53:13 +0000
post-id: 2585181001
---
Giving my girlfriend a Kindle for Christmas was the carrot in a multi-pronged strategy to avoid needing more bookshelves (the stick being “I will start giving away your books” and my contribution being to archive books I’ve read (or return the many that aren’t even mine). This therefore required that I stocked it with books before she got her hands on it, which in turn was all the excuse I needed to play with the thing.
My lazy solution was simply to download all of [Feedbooks](http://www.feedbooks.com); I [wrote some scripts](http://rrt.sc3d.org/Software/Kindle/) to make this actually lazy, rather than brain-numbingly dull. In the process I found that while the Kindle is nice to hold and great to read, it struggles to cope with a large collection of books (even though the nearly 3,000 volumes of Feedbooks only half-filled its 4Gb memory), and is woeful as a research tool. And, of course, Amazon’s first-mover-evil surfaced early.
Here are the problems I had:
1. Amazon’s own store doesn’t seem to contain free books. I think it’s poor form not to give people a straightforward choice of free editions of out-of-copyright works. The Kindle may be a loss leader, but at £109 it’s still not cheap. Feedbooks, rather than integrating easily into the Kindle, like, say, a 3rd-party software provider into Ubuntu’s Software Center, provide a catalogue which itself is in the form of a book, doesn’t automatically update, and offers a list ordered only by title. In other words, it’s useless; one is better off using the built-in web browser to search the online catalogue…
2. …or better, another browser, since the Kindle’s is woefully slow (and I don’t just mean the screen update). It’s just about usable, and hence useful in an emergency, but is no good as, for example, an online research tool to use in parallel with the books you have downloaded, although…
3. …offline search is awful too. With just the few ebooks that come loaded on the device, it was slow; with the thousands of books I loaded, it simply locked up the device, even when trying to search in the manual, presumably already indexed. The Kindle seems to index its contents in the background, but even now, over a week later, search doesn’t work. The only effective navigation is by a book’s table of contents, and, to choose which books to read, the user-definable collections, though…
4. …collections are a pain to set up for many books, as you have to select each book manually; there is no way I have found to select a range. (Fortunately, I was able to define collections programmatically, but this will be beyond most users.)
In summary, it’s a lovely device, but the software is rather toytown. Amazon could improve it (and indeed, the 3.0.3 firmware update, at the experimental stage when I checked, claims, vaguely, “performance improvements”), but given that their main interest is in selling books and Kindles, I’m not hopeful that it will happen before the next hardware iteration; whether it happens at all depends on competition, and there should be plenty of that, to go by the number of other ebook readers.
----cut here----
In GNU Emacs 23.2.91.3 (i686-pc-linux-gnu, GTK+ Version 2.22.0)
of 2011-01-03 on mord
Windowing system distributor `The X.Org Foundation', version 11.0.10900000
Important settings:
value of $LC_ALL: nil
value of $LC_COLLATE: nil
value of $LC_CTYPE: nil
value of $LC_MESSAGES: nil
value of $LC_MONETARY: nil
value of $LC_NUMERIC: nil
value of $LC_TIME: nil
value of $LANG: en_GB.UTF-8
value of $XMODIFIERS: nil
locale-coding-system: utf-8-unix
default enable-multibyte-characters: t
Major mode: Text
Minor modes in effect:
longlines-mode: t
buffer-face-mode: t
flyspell-mode: t
show-paren-mode: t
savehist-mode: t
minibuffer-electric-default-mode: t
iswitchb-mode: t
icomplete-mode: t
global-auto-revert-mode: t
desktop-save-mode: t
smart-quotes-mode: t
mouse-wheel-mode: t
use-hard-newlines: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
auto-encryption-mode: t
auto-compression-mode: t
column-number-mode: t
line-number-mode: t
transient-mark-mode: t
Recent input:
M-x r e p o r t - e m <tab> <return> h u n s p e l
l SPC <M-backspace> i s p e l l SPC w i t h SPC h u
n s l e <backspace> <backspace> s p e <backspace> <backspace>
p e <backspace> <backspace> <backspace> p e l l SPC
f a i l s C-g <down> <down> <down> <down> <down> <down>
<down> <up> <up> <up> <up> <up> <up> <up> <up> <up>
<up> <up> <up> <up> <up> <up> <up> M-x i s p e l l
<return> SPC SPC SPC M-x i s p e <backspace> <backspace>
<backspace> <backspace> <up> <up> <return>
Recent messages:
Scanning for "hard" Perl constructions... done
Applying style hooks... done
Scanning for "hard" Perl constructions... done
Scanning for "hard" Perl constructions... done
Scanning for "hard" Perl constructions... done
Scanning for "hard" Perl constructions... done
Lazy desktop load complete
Quit
Spell-checking Kindle 3 is a good first attempt using hunspell with british+accs dictionary...
Spell-checking region using hunspell with british+accs dictionary...done
ispell-process-line: Ispell misalignment: word `Feedbooks' point 1363; probably incompatible versions
Load-path shadows:
/usr/local/share/emacs/23.2.91/site-lisp/auctex/tex-style hides /usr/share/emacs/site-lisp/auctex/tex-style
/usr/local/share/emacs/23.2.91/site-lisp/auctex/tex-buf hides /usr/share/emacs/site-lisp/auctex/tex-buf
/usr/local/share/emacs/23.2.91/site-lisp/auctex/context hides /usr/share/emacs/site-lisp/auctex/context
/usr/local/share/emacs/23.2.91/site-lisp/auctex/bib-cite hides /usr/share/emacs/site-lisp/auctex/bib-cite
/usr/local/share/emacs/23.2.91/site-lisp/auctex/tex-fold hides /usr/share/emacs/site-lisp/auctex/tex-fold
/usr/local/share/emacs/23.2.91/site-lisp/auctex/tex-jp hides /usr/share/emacs/site-lisp/auctex/tex-jp
/usr/local/share/emacs/23.2.91/site-lisp/auctex/context-nl hides /usr/share/emacs/site-lisp/auctex/context-nl
/usr/local/share/emacs/23.2.91/site-lisp/auctex/toolbar-x hides /usr/share/emacs/site-lisp/auctex/toolbar-x
/usr/local/share/emacs/23.2.91/site-lisp/auctex/tex-mik hides /usr/share/emacs/site-lisp/auctex/tex-mik
/usr/local/share/emacs/23.2.91/site-lisp/auctex/context-en hides /usr/share/emacs/site-lisp/auctex/context-en
/usr/local/share/emacs/23.2.91/site-lisp/auctex/texmathp hides /usr/share/emacs/site-lisp/auctex/texmathp
/usr/local/share/emacs/23.2.91/site-lisp/auctex/tex-info hides /usr/share/emacs/site-lisp/auctex/tex-info
/usr/local/share/emacs/23.2.91/site-lisp/auctex/tex-fptex hides /usr/share/emacs/site-lisp/auctex/tex-fptex
/usr/local/share/emacs/23.2.91/site-lisp/auctex/tex-font hides /usr/share/emacs/site-lisp/auctex/tex-font
/usr/local/share/emacs/23.2.91/site-lisp/auctex/latex hides /usr/share/emacs/site-lisp/auctex/latex
/usr/local/share/emacs/23.2.91/site-lisp/auctex/font-latex hides /usr/share/emacs/site-lisp/auctex/font-latex
/usr/local/share/emacs/23.2.91/site-lisp/auctex/tex-bar hides /usr/share/emacs/site-lisp/auctex/tex-bar
/usr/local/share/emacs/23.2.91/site-lisp/auctex/multi-prompt hides /usr/share/emacs/site-lisp/auctex/multi-prompt
/usr/local/share/emacs/23.2.91/site-lisp/auctex/tex hides /usr/share/emacs/site-lisp/auctex/tex
Features:
(shadow sort mail-extr message sendmail ecomplete rfc822 mml mml-sec
password-cache mm-decode mm-bodies mm-encode mailcap mail-parse rfc2231
rfc2047 rfc2045 qp ietf-drums mailabbrev nnheader gnus-util netrc
time-date mm-util mail-prsvr gmm-utils wid-edit mailheader canlock sha1
hex-util hashcash mail-utils emacsbug preview prv-emacs byte-opt
warnings tex-buf noutline outline font-latex bytecomp byte-compile latex
tex-style tex nxml-uchnm rng-xsd xsd-regexp rng-cmpct rng-nxml rng-valid
rng-loc rng-uri rng-parse nxml-parse rng-match rng-dt rng-util rng-pttrn
nxml-ns nxml-mode nxml-outln nxml-rap nxml-util nxml-glyph nxml-enc
xmltok sgml-mode conf-mode newcomment make-mode vc-git cperl-mode
longlines face-remap filladapt flyspell auto-dictionary-autoloads
dictionary-autoloads js2-mode-autoloads package reporter completing-help
ff-paths uniquify paren savehist minibuf-eldef iswitchb icomplete
autorevert time cus-start cus-load desktop server change-mode advice
help-fns advice-preload php-mode derived etags cc-langs cl cl-19 cc-mode
cc-fonts cc-menus cc-cmds cc-styles cc-align cc-engine cc-vars cc-defs
speedbar sb-image ezimage dframe easymenu assoc lua-mode regexp-opt
comint ring whitespace etags-update smart-quotes edmacro kmacro ispell
ffap muse-autoloads emacs-goodies-el emacs-goodies-custom
emacs-goodies-loaddefs easy-mmode devhelp preview-latex tex-site
auto-loads tooltip ediff-hook vc-hooks lisp-float-type mwheel x-win
x-dnd font-setting tool-bar dnd fontset image fringe lisp-mode register
page menu-bar rfn-eshadow timer select scroll-bar mldrag mouse jit-lock
font-lock syntax facemenu font-core frame cham georgian utf-8-lang
misc-lang vietnamese tibetan thai tai-viet lao korean japanese hebrew
greek romanian slovak czech european ethiopic indian cyrillic chinese
case-table epa-hook jka-cmpr-hook help simple abbrev loaddefs button
minibuffer faces cus-face files text-properties overlay md5 base64
format env code-pages mule custom widget hashtable-print-readable
backquote make-network-process dbusbind system-font-setting
font-render-setting gtk x-toolkit x multi-tty emacs)
--
http://rrt.sc3d.org/
Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Fri, 07 Jan 2011 13:07:01 GMT)
Message #8 received at 7781 <at> debbugs.gnu.org (full text, mbox):
2011/1/4 Reuben Thomas <rrt <at> sc3d.org>:
> With the following text, and using emacs -Q, I get the errors you can
> see in the messages log below when using hunspell to spell-check a UTF-8
> buffer with some extended characters in it.
>
> I did test this with emacs -Q, but the current session, in which I
> reproduced the problem and am now composing this bug report, was not
> started with -Q (this is so submitting the bug report works properly!).
>
> I am running a freshly bzr-pulled build of the emacs-23 branch.
Hi, Reuben,
I can also reproduce this with Emacs 23.2. After splitting the original
lines, I could narrow the problems down to two lines:
-- Cut here -- 8< ----- minimal.txt: utf-8
of out-of-copyright works. The Kindle may be a loss leader, but at £109
it’s still not cheap. Feedbooks, rather than integrating easily into
-- Cut here -- 8< ----- End of minimal.txt
In the first line, the currency sign seems to cause conversion errors
when iso-8859-1 is used, although hunspell should have ignored it. I get
tons of
UTF-8 encoding error. Missing continuation byte in 0. character position:
for that line when using
$ cat minimal.txt | hunspell -d en_US -a -i iso-8859-1
In the second line, the unusual apostrophe seems to confuse hunspell
when UTF-8 is used. Comparing what aspell and hunspell give for the
same text, I get
$ cat minimal.txt | aspell --encoding=utf-8 -d en_US -a
& Feedbooks 6 22: Feed books, Feed-books, Feedback's, Feedbags, ...
$ cat minimal.txt | hunspell -d en_US -i utf-8 -a
& Feedbooks 8 24: Feed books, Feed-books, Feedback, Feedbags, ...
Do not worry about the first number; it is the number of suggestions.
However, the position given by the second number differs. It seems that
hunspell is not counting that apostrophe as a single (multibyte)
character, but as three bytes.
This looks to me like a hunspell bug. I found no reference to this
problem on the hunspell SourceForge site, but noticed that Hunspell
1.2.14 was released yesterday. I need to check whether it has any
related news.
--
Agustin
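The two offsets reported above (22 from aspell, 24 from hunspell) can be reproduced without either speller. The following Python sketch is not part of the original message, and its variable names are illustrative. It takes the second line of minimal.txt and compares the character index of "Feedbooks" with its byte index in the UTF-8 encoding. The two differ by 2 because the apostrophe U+2019 occupies three bytes.

```python
# Second line of minimal.txt; \u2019 is the typographic apostrophe.
line = "it\u2019s still not cheap. Feedbooks, rather than integrating easily into"

# Offset counted in characters (the value aspell reported).
char_offset = line.index("Feedbooks")

# Offset counted in UTF-8 bytes (the value the unpatched hunspell reported).
byte_offset = line.encode("utf-8").index(b"Feedbooks")

# Number of bytes the apostrophe takes in UTF-8.
apostrophe_bytes = len("\u2019".encode("utf-8"))

print(char_offset, byte_offset, apostrophe_bytes)  # 22 24 3
```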
Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Fri, 07 Jan 2011 14:24:02 GMT)
Message #11 received at 7781 <at> debbugs.gnu.org (full text, mbox):
Thanks very much for your investigation, Agustin.
I tried hunspell 1.2.14 and got exactly the same error.
Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Fri, 11 Feb 2011 16:53:01 GMT)
Message #14 received at 7781 <at> debbugs.gnu.org (full text, mbox):
forwarded 7781 https://sourceforge.net/tracker/?func=detail&aid=3178449&group_id=143754&atid=756395
thanks
2011/1/7 Agustin Martin <agustin.martin <at> hispalinux.es>:
> 2011/1/4 Reuben Thomas <rrt <at> sc3d.org>:
>> With the following text, and using emacs -Q, I get the errors you can
>> see in the messages log below when using hunspell to spell-check a UTF-8
>> buffer with some extended characters in it.
> Do not worry about the first number; it is the number of suggestions.
> However, the position given by the second number differs. It seems that
> hunspell is not counting that apostrophe as a single (multibyte)
> character, but as three bytes.
>
> This looks to me like a hunspell bug. I found no reference to this
> problem on the hunspell SourceForge site, but noticed that Hunspell
> 1.2.14 was released yesterday. I need to check whether it has any
> related news.
I opened a hunspell bug report for the bad count problem:
https://sourceforge.net/tracker/?func=detail&aid=3178449&group_id=143754&atid=756395
It seems I no longer see the other problem.
Cheers,
--
Agustin
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Sun, 01 Jan 2012 21:45:02 GMT)
Message #19 received at 7781 <at> debbugs.gnu.org (full text, mbox):
Those who want to compile a bug fix in Hunspell for themselves can find
fixes (based on Hunspell 1.2.8 and Emacs V23) to spell check
word-separated Thai in UTF-8 from Emacs at
http://homepage.ntlworld.com/richard.wordingham/thai/hunspell-1.2.8-jrw1.1.zip
- the byte v. character count problem was just one of those met and resolved. The full list is:
On Hunspell:
Bad UTF-8 char count in pipe mode - ID: 3178449
No Encoding of Word for Suggestions in Piped Mode
(https://sourceforge.net/tracker/?func=detail&aid=3468022&group_id=143754&atid=756395)
Multidictionary guesses dictionary for suggestions
(https://sourceforge.net/tracker/?func=detail&aid=3468039&group_id=143754&atid=756395)
Hunspell 1.2.8 Groups Thai TIS-620 Chars in Lower/Upper Case Pairs
(https://bugs.launchpad.net/ubuntu/+source/hunspell/+bug/910452) (fixed
in Release 1.2.14)
On the Thai dictionary:
th_TH Affix File Inadequate for Hunspell
(https://bugs.launchpad.net/ubuntu/+source/openoffice.org-dictionaries/+bug/910447)
There is also a problem with the size of the window holding correction
in Thai (probably depending on the choice of font); the addition of
(fit-window-to-buffer) at the appropriate point in ispell.el (as in the
zip file) fixes that.
Richard.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Sat, 13 Apr 2013 23:46:02 GMT)
Message #22 received at 7781 <at> debbugs.gnu.org (full text, mbox):
As far as I can see, the hunspell team hasn't fixed the bug in more
than 2 years. Maybe for them it is not a bug but a feature.
The problem is that hunspell reports a byte position instead of a
character position for multibyte input, while Emacs expects a
character position. With the attached patch I propose to do the
conversion in the ispell-parse-output function.
Thanks,
Nikolay Suschenko
[ispell.el.patch (text/plain, attachment)]
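The kind of conversion proposed in this message can be illustrated independently of ispell.el. The sketch below is in Python rather than Emacs Lisp and is not the attached patch. The function name `byte_to_char_offset` is hypothetical. It assumes the line was sent to the speller as UTF-8 and that the reported byte offset falls on a character boundary.

```python
def byte_to_char_offset(line: str, byte_offset: int, encoding: str = "utf-8") -> int:
    """Convert a byte offset into the encoded line to a character offset."""
    prefix = line.encode(encoding)[:byte_offset]
    # Strict decoding raises UnicodeDecodeError if byte_offset splits a character.
    return len(prefix.decode(encoding))


line = "it\u2019s still not cheap. Feedbooks, rather than"
print(byte_to_char_offset(line, 24))  # 22
```

The conversion is only valid when the `encoding` argument matches what was actually used on the pipe to the speller.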
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Sun, 14 Apr 2013 05:47:02 GMT)
Message #25 received at 7781 <at> debbugs.gnu.org (full text, mbox):
> Date: Sat, 13 Apr 2013 23:12:38 +0400
> From: Николай Сущенко <sckol <at> yandex.ru>
>
> As far as I can see, the hunspell team hasn't fixed the bug in more
> than 2 years. Maybe for them it is not a bug but a feature.
Hunspell bug resolution process could use some speedup.
> The problem is that hunspell reports a byte position instead of a
> character position for multibyte input, while Emacs expects a
> character position. With the attached patch I propose to do the
> conversion in the ispell-parse-output function.
Sorry, no. I tried that initially, but this work-around has problems
(don't remember the details, though).
It is much better to rebuild Hunspell with this bug fixed. I can give
you a patch for that if you need it (I think there's a patch in the
bug database as well). I fixed my hunspell long ago, and never looked
back. Or ask your distribution's maintainers to release a fixed
hunspell distro.
Thanks.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Sun, 14 Apr 2013 06:39:01 GMT)
Message #28 received at 7781 <at> debbugs.gnu.org (full text, mbox):
Hi, Eli
Please send me this patch, I'll ask the hunspell developers to include it.
Could you also recall which concrete problems this workaround causes?
For me it works fine, but I haven't tested it with different languages
and encodings. If there are problems, I could try to fix them, but as
of now Emacs doesn't work with hunspell+utf-8 at all, at least on
Slackware and Arch.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Sun, 14 Apr 2013 07:13:02 GMT)
Message #31 received at 7781 <at> debbugs.gnu.org (full text, mbox):
> Date: Sun, 14 Apr 2013 10:33:39 +0400
> From: Николай Сущенко
> <sckol <at> yandex.ru>
> CC: 7781 <at> debbugs.gnu.org
>
> Please send me this patch, I'll ask the hunspell developers to include it.
Attached. This is a small part of a much larger patch, most of it for
Windows-specific problems. If you have problems compiling the patched
hunspell, let me know: it could be that I omitted some hunk that is
needed for this part.
> Could you also recall which concrete problems this workaround causes?
> For me it works fine, but I haven't tested it with different languages
> and encodings.
One problem is that you assume the encoding of the communications with
hunspell is UTF-8, and thus matches the internal representation of
text in Emacs buffers and strings (only then will byte-to-position
give correct results). But that assumption is false: hunspell
supports any encoding that it can convert to/from UTF-8 (it uses
libiconv internally). The "usual" choice of the encoding is the one
used by the dictionary. Not every dictionary out there is in UTF-8.
> If there are problems, I could try to fix them
I don't think you can fix this on the Emacs side, because Emacs cannot
easily and/or quickly convert between bytes and characters in an
arbitrary multibyte encoding.
When I discovered this problem, I also tried fixing it on the Emacs
side first, but then I realized that this kind of solution has too
many problems, and instead fixed it in hunspell.
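The encoding objection above can be made concrete with a small Python sketch. It is illustrative only: the sample string is adapted from the test text earlier in this thread and the variable names are made up. The same word has a different byte offset depending on the encoding used on the pipe, so a converter that assumes UTF-8 gives a wrong answer when the offset was really counted in ISO-8859-1 bytes.

```python
text = "but at \u00a3109 Feedbooks"  # \u00a3 is the pound sign

char_offset = text.index("Feedbooks")
latin1_offset = text.encode("iso-8859-1").index(b"Feedbooks")
utf8_offset = text.encode("utf-8").index(b"Feedbooks")

# Treat the ISO-8859-1 byte offset as if it were a UTF-8 byte offset
# and convert it to characters: the result no longer points at the word.
misconverted = len(text.encode("utf-8")[:latin1_offset].decode("utf-8"))

print(char_offset, latin1_offset, utf8_offset, misconverted)  # 12 12 13 11
```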
--- src/tools/hunspell.cxx~0 2011-01-21 19:01:29.000000000 +0200
+++ src/tools/hunspell.cxx 2013-02-07 10:11:54.443610900 +0200
@@ -710,13 +748,22 @@ if (pos >= 0) {
fflush(stdout);
} else {
char ** wlst = NULL;
- int ns = pMS[d]->suggest(&wlst, token);
+ int byte_offset = parser->get_tokenpos() + pos;
+ int char_offset = 0;
+ if (strcmp(io_enc, "UTF-8") == 0) {
+ for (int i = 0; i < byte_offset; i++) {
+ if ((buf[i] & 0xc0) != 0x80)
+ char_offset++;
+ }
+ } else {
+ char_offset = byte_offset;
+ }
+ int ns = pMS[d]->suggest(&wlst, chenc(token, io_enc, dic_enc[d]));
if (ns == 0) {
- fprintf(stdout,"# %s %d", token,
- parser->get_tokenpos() + pos);
+ fprintf(stdout,"# %s %d", token, char_offset);
} else {
fprintf(stdout,"& %s %d %d: ", token, ns,
- parser->get_tokenpos() + pos);
+ char_offset);
fprintf(stdout,"%s", chenc(wlst[0], dic_enc[d], io_enc));
}
for (int j = 1; j < ns; j++) {
@@ -745,13 +792,23 @@ if (pos >= 0) {
if (root) free(root);
} else {
char ** wlst = NULL;
+ int byte_offset = parser->get_tokenpos() + pos;
+ int char_offset = 0;
+ if (strcmp(io_enc, "UTF-8") == 0) {
+ for (int i = 0; i < byte_offset; i++) {
+ if ((buf[i] & 0xc0) != 0x80)
+ char_offset++;
+ }
+ } else {
+ char_offset = byte_offset;
+ }
int ns = pMS[d]->suggest(&wlst, chenc(token, io_enc, dic_enc[d]));
if (ns == 0) {
fprintf(stdout,"# %s %d", chenc(token, io_enc, ui_enc),
- parser->get_tokenpos() + pos);
+ char_offset);
} else {
fprintf(stdout,"& %s %d %d: ", chenc(token, io_enc, ui_enc), ns,
- parser->get_tokenpos() + pos);
+ char_offset);
fprintf(stdout,"%s", chenc(wlst[0], dic_enc[d], ui_enc));
}
for (int j = 1; j < ns; j++) {
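The counting loop added by the patch can be tried in isolation. The sketch below is a Python transliteration, not hunspell code, and the function name `utf8_char_offset` is hypothetical. It counts every byte in the prefix that is not a UTF-8 continuation byte, that is, every byte whose top two bits are not `10`.

```python
def utf8_char_offset(buf: bytes, byte_offset: int) -> int:
    """Count the characters that start within buf[:byte_offset]."""
    char_offset = 0
    for b in buf[:byte_offset]:
        # Continuation bytes look like 10xxxxxx; any other byte starts a character.
        if (b & 0xC0) != 0x80:
            char_offset += 1
    return char_offset


buf = "it\u2019s still not cheap. Feedbooks".encode("utf-8")
print(utf8_char_offset(buf, 24))  # 22
```

As in the patch, this is only meaningful when the input/output encoding is UTF-8; for single-byte encodings the byte offset is already the character offset.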
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Sat, 20 Apr 2013 18:49:01 GMT)
Message #34 received at 7781 <at> debbugs.gnu.org (full text, mbox):
Thank you; this patch worked well for me. However, somebody has already
proposed another patch:
https://sourceforge.net/tracker/?func=detail&aid=3610147&group_id=143754&atid=756397
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Sun, 27 Apr 2014 21:31:02 GMT)
Message #37 received at 7781 <at> debbugs.gnu.org (full text, mbox):
Hi,
I'm using a patched hunspell
(http://sourceforge.net/p/hunspell/patches/57/) and it works well with
text-mode and message-mode. But unfortunately it does not work with
context-mode or latex-mode.
Example:
--8<---------------cut here---------------start------------->8---
\documentclass{article}
\begin{document}
bla
\end{document}
--8<---------------cut here---------------end--------------->8---
Running ispell fails with this error:
ispell-process-line: Ispell misalignment: word `bla' point 41; probably incompatible versions
Do you know a solution?
I'm using bzr emacs and git auctex.
TIA for any help,
--
Peter
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Mon, 28 Apr 2014 15:38:02 GMT)
Message #40 received at 7781 <at> debbugs.gnu.org (full text, mbox):
> From: Peter Münster <pmlists <at> free.fr>
> Date: Sun, 27 Apr 2014 23:30:25 +0200
>
> I'm using a patched hunspell
> (http://sourceforge.net/p/hunspell/patches/57/) and it works well with
> text-mode and message-mode. But unfortunately it does not work with
> context-mode or latex-mode.
>
> Example:
>
> --8<---------------cut here---------------start------------->8---
> \documentclass{article}
> \begin{document}
> bla
> \end{document}
> --8<---------------cut here---------------end--------------->8---
>
> Running ispell fails with this error:
>
> ispell-process-line: Ispell misalignment: word `bla' point 41; probably incompatible versions
I cannot reproduce this. If I start "emacs -Q" and try spell-checking
your example (with Hunspell being the speller), it works just fine for
me: I get suggestions to replace "bla". Same thing if I load AUCTeX
into "emacs -Q" (does AUCTeX even change anything about
spell-checking?).
Does this work for you in "emacs -Q"? If so, I suggest to review your
customizations to look for those which somehow cause this.
If "emacs -Q" doesn't work either, please provide a detailed
reproduction recipe starting from "emacs -Q".
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Mon, 28 Apr 2014 16:19:02 GMT)
Message #43 received at 7781 <at> debbugs.gnu.org (full text, mbox):
On Mon, Apr 28 2014, Eli Zaretskii wrote:
> I cannot reproduce this. If I start "emacs -Q" and try spell-checking
> your example (with Hunspell being the speller), it works just fine for
> me: I get suggestions to replace "bla". Same thing if I load AUCTeX
> into "emacs -Q" (does AUCTeX even change anything about
> spell-checking?).
Hi Eli,
It's not AUCTeX, I've just tested with normal latex-mode.
> If "emacs -Q" doesn't work either, please provide a detailed
> reproduction recipe starting from "emacs -Q".
Here is a reproduction recipe:
- create minimal latex file /tmp/test.tex
- start emacs:
LANG=C emacs -Q --eval '(setq ispell-program-name "hunspell")' /tmp/test.tex
- M-x ispell
Here are more details about my system:
In GNU Emacs 24.4.50.2 (x86_64-suse-linux-gnu, GTK+ Version 3.10.4)
of 2014-04-20 on micropit
Repository revision: 116996 dancol <at> dancol.org-20140420144613-8e4t4swlxauwl4w7
Windowing system distributor `The X.Org Foundation', version 11.0.11403901
System Description: openSUSE 13.1 (Bottle) (x86_64)
Configured using:
`configure --without-toolkit-scroll-bars'
Configured features:
XPM JPEG TIFF GIF PNG RSVG IMAGEMAGICK SOUND GPM DBUS GCONF GSETTINGS
NOTIFY LIBSELINUX LIBXML2 FREETYPE M17N_FLT LIBOTF XFT ZLIB
Important settings:
value of $LANG: C
value of $XMODIFIERS: @im=ibus
locale-coding-system: nil
Major mode: LaTeX
Minor modes in effect:
shell-dirtrack-mode: t
tooltip-mode: t
electric-indent-mode: t
mouse-wheel-mode: t
tool-bar-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
line-number-mode: t
transient-mark-mode: t
Recent input:
M-x i s p <tab> <return> M-x r e p o r t - e m <tab>
<return>
Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.
Starting new Ispell process hunspell with default dictionary...
Spell-checking test.tex using hunspell with default dictionary...done
ispell-process-line: Ispell misalignment: word `bla' point 41; probably incompatible versions
Load-path shadows:
None found.
Features:
(shadow sort gnus-util mail-extr emacsbug message dired format-spec
rfc822 mml easymenu mml-sec mm-decode mm-bodies mm-encode mail-parse
rfc2231 mailabbrev gmm-utils mailheader sendmail rfc2047 rfc2045
ietf-drums mm-util help-fns mail-prsvr mail-utils ispell tex-mode
compile shell pcomplete comint ansi-color ring latexenc time-date
tooltip electric uniquify ediff-hook vc-hooks lisp-float-type mwheel
x-win x-dnd tool-bar dnd fontset image regexp-opt fringe tabulated-list
newcomment lisp-mode prog-mode register page menu-bar rfn-eshadow timer
select scroll-bar mouse jit-lock font-lock syntax facemenu font-core
frame cham georgian utf-8-lang misc-lang vietnamese tibetan thai
tai-viet lao korean japanese hebrew greek romanian slovak czech european
ethiopic indian cyrillic chinese case-table epa-hook jka-cmpr-hook help
simple abbrev minibuffer nadvice loaddefs button faces cus-face macroexp
files text-properties overlay sha1 md5 base64 format env code-pages mule
custom widget hashtable-print-readable backquote make-network-process
dbusbind gfilenotify dynamic-setting system-font-setting
font-render-setting move-toolbar gtk x-toolkit x multi-tty emacs)
Memory information:
((conses 16 87695 6922)
(symbols 48 19137 0)
(miscs 40 44 125)
(strings 32 14709 4542)
(string-bytes 1 418678)
(vectors 16 10601)
(vector-slots 8 389507 5806)
(floats 8 67 64)
(intervals 56 250 165)
(buffers 960 13)
(heap 1024 42866 735))
--
Peter
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Mon, 28 Apr 2014 16:49:02 GMT)
Message #46 received at 7781 <at> debbugs.gnu.org (full text, mbox):
> From: Peter Münster <pmlists <at> free.fr>
> Cc: 7781 <at> debbugs.gnu.org
> Date: Mon, 28 Apr 2014 18:18:10 +0200
>
> - create minimal latex file /tmp/test.tex
> - start emacs:
> LANG=C emacs -Q --eval '(setq ispell-program-name "hunspell")' /tmp/test.tex
> - M-x ispell
Works fine for me, sorry.
Maybe your Hunspell is not patched enough. Mine has many more patches
than the one you mentioned. Most of them are Windows-specific or
related to encoding/decoding non-ASCII characters, something that
doesn't sound relevant for your use case. But who knows? You might
take a look at the file DIFFS in this archive, where you will find all
the changes I made to Hunspell:
http://sourceforge.net/projects/ezwinports/files/hunspell-1.3.2-3-w32-src.zip/download
Or maybe wait for someone on Unix to try reproducing your recipe.
One other idea is to try spell-checking your sample file outside of
Emacs, maybe you will see something that will give some ideas.
Finally, are you sure the 'hunspell' executable Emacs finds on PATH is
indeed the one you intend? (Try putting a full absolute file name
into ispell-program-name.)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Mon, 28 Apr 2014 17:18:02 GMT)
Message #49 received at 7781 <at> debbugs.gnu.org (full text, mbox):
On Mon, Apr 28 2014, Eli Zaretskii wrote:
> Maybe your Hunspell is not patched enough.
Perhaps.
> Mine has many more patches than the one you mentioned. Most of them
> are Windows-specific or related to encoding/decoding non-ASCII
> characters, something that doesn't sound relevant for your use case.
> But who knows? You might take a look at the file DIFFS in this
> archive, where you will find all the changes I made to Hunspell:
>
> http://sourceforge.net/projects/ezwinports/files/hunspell-1.3.2-3-w32-src.zip/download
Indeed. I'll take a look when I have some more time.
> Or maybe wait for someone on Unix to try reproducing your recipe.
Yes, let's see.
> One other idea is to try spell-checking your sample file outside of
> Emacs, maybe you will see something that will give some ideas.
No. Here is the result:
--8<---------------cut here---------------start------------->8---
$ hunspell -a -d en_US -i UTF-8 /tmp/test.tex
@(#) International Ispell Version 3.2.06 (but really Hunspell 1.3.2)
& documentclass 8 1: document class, document-class, documentations, documentation, documents, documentary, underclassmen, underclassman
*
*
*
& bla 15 0: alb, bl, la, blat, bola, blag, blah, blab, lab, baa, bra, boa, Ila, Ala, Ola
*
*
--8<---------------cut here---------------end--------------->8---
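The output above follows the ispell pipe-mode (`-a`) conventions: `*` means the word was found, `& word count offset: suggestions` reports a misspelling with suggestions, and `# word offset` reports one without. The Python sketch below is a minimal parser for those three reply types only. The function name `parse_ispell_line` and the returned tuple layout are hypothetical. It shows where the offset that Emacs checks against the buffer comes from.

```python
def parse_ispell_line(line: str):
    """Return (word, offset, suggestions) for a misspelling, else None."""
    if line.startswith("& "):
        head, _, tail = line.partition(": ")
        _, word, _count, offset = head.split(" ")
        return word, int(offset), tail.split(", ")
    if line.startswith("# "):
        _, word, offset = line.split(" ")
        return word, int(offset), []
    return None  # "*", the version banner, blank lines and other replies


reply = "& bla 15 0: alb, bl, la, blat, bola, blag, blah, blab, lab, baa, bra, boa, Ila, Ala, Ola"
word, offset, suggestions = parse_ispell_line(reply)
print(word, offset, len(suggestions))  # bla 0 15
```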
> Finally, are you sure the 'hunspell' executable Emacs finds on PATH is
> indeed the one you intend?
Yes. And after switching to "M-x text-mode", there is no more problem.
--
Peter
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Mon, 28 Apr 2014 17:33:01 GMT)
Message #52 received at 7781 <at> debbugs.gnu.org (full text, mbox):
> From: Peter Münster <pmlists <at> free.fr>
> Cc: 7781 <at> debbugs.gnu.org
> Date: Mon, 28 Apr 2014 19:17:36 +0200
>
> after switching to "M-x text-mode", there is no more problem.
Maybe you should activate the debugging code in ispell.el and see what
is being submitted to hunspell and what it returns.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Mon, 28 Apr 2014 18:28:01 GMT)
Message #55 received at 7781 <at> debbugs.gnu.org (full text, mbox):
On Mon, Apr 28 2014, Eli Zaretskii wrote:
>> after switching to "M-x text-mode", there is no more problem.
>
> Maybe you should activate the debugging code in ispell.el and see what
> is being submitted to hunspell and what it returns.
Please find attached 2 debug-outputs, one with latex-mode and one with
text-mode. Both are created with `ispell-buffer-with-debug'.
Do you see what is going on there?
--
Peter
[ispell-debug-latex.txt (text/plain, attachment)]
[ispell-debug-text.txt (text/plain, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Tue, 29 Apr 2014 10:04:02 GMT)
Message #58 received at 7781 <at> debbugs.gnu.org (full text, mbox):
On Mon, Apr 28, 2014 at 06:18:10PM +0200, Peter Münster wrote:
> On Mon, Apr 28 2014, Eli Zaretskii wrote:
>
> > I cannot reproduce this. If I start "emacs -Q" and try spell-checking
> > your example (with Hunspell being the speller), it works just fine for
> > me: I get suggestions to replace "bla". Same thing if I load AUCTeX
> > into "emacs -Q" (does AUCTeX even change anything about
> > spell-checking?).
>
> Hi Eli,
>
> It's not AUCTeX, I've just tested with normal latex-mode.
>
>
> > If "emacs -Q" doesn't work either, please provide a detailed
> > reproduction recipe starting from "emacs -Q".
>
> Here a reproduction recipe:
>
> - create minimal latex file /tmp/test.tex
> - start emacs:
> LANG=C emacs -Q --eval '(setq ispell-program-name "hunspell")' /tmp/test.tex
> - M-x ispell
>
> Here are more details about my system:
>
> In GNU Emacs 24.4.50.2 (x86_64-suse-linux-gnu, GTK+ Version 3.10.4)
> of 2014-04-20 on micropit
I cannot reproduce it here with emacs-snapshot 24.3.50.1 in Debian. What does
'ps -aux' show for the hunspell call when run in an xterm?
--
Agustin
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Tue, 29 Apr 2014 10:14:01 GMT)
Message #61 received at 7781 <at> debbugs.gnu.org (full text, mbox):
On Tue, Apr 29 2014, Agustin Martin wrote:
> Cannot reproduce it here with emacs-snapshot 24.3.50.1 in Debian. What does
> 'ps -aux' show for hunspell call when run in xterm?
hunspell -a -d en_US -i UTF-8
--
Peter
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Tue, 29 Apr 2014 10:21:01 GMT)
Message #64 received at 7781 <at> debbugs.gnu.org (full text, mbox):
On Tue, Apr 29 2014, Agustin Martin wrote:
> Cannot reproduce it here with emacs-snapshot 24.3.50.1 in Debian.
Could you please send the ispell-debug buffer, created with
`ispell-buffer-with-debug'? Then we could compare it with mine. There
are perhaps differences.
--
Peter
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Tue, 29 Apr 2014 10:23:02 GMT)
Message #67 received at 7781 <at> debbugs.gnu.org (full text, mbox):
On Tue, Apr 29, 2014 at 12:13:04PM +0200, Peter Münster wrote:
> On Tue, Apr 29 2014, Agustin Martin wrote:
>
> > Cannot reproduce it here with emacs-snapshot 24.3.50.1 in Debian. What does
> > 'ps -aux' show for hunspell call when run in xterm?
>
> hunspell -a -d en_US -i UTF-8
That is what is expected. I am clueless about this.
--
Agustin
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Tue, 29 Apr 2014 10:41:02 GMT)
Message #70 received at 7781 <at> debbugs.gnu.org (full text, mbox):
On Tue, Apr 29, 2014 at 12:20:50PM +0200, Peter Münster wrote:
> On Tue, Apr 29 2014, Agustin Martin wrote:
>
> > Cannot reproduce it here with emacs-snapshot 24.3.50.1 in Debian.
>
> Could you please send the ispell-debug buffer, created with
> `ispell-buffer-with-debug'? Then we could compare it with mine. There
> are perhaps differences.
Please find it attached. Apart from the misalignment problem the only
difference seems to be that I have lots of dicts installed and the
~/.openoffice.org/ path.
--
Agustin
[ispell-debug-buffer-amd-7781.txt (text/plain, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Tue, 29 Apr 2014 11:56:01 GMT)
Message #73 received at 7781 <at> debbugs.gnu.org (full text, mbox):
On Tue, Apr 29 2014, Agustin Martin wrote:
> Please find it attached. Apart from the misalignment problem the only
> difference seems to be that I have lots of dicts installed and the
> ~/.openoffice.org/ path.
There is probably not enough information in the debug buffer.
Could you please try this:
mv /usr/bin/hunspell /usr/bin/hunspell-orig
And create the file /usr/bin/hunspell with the following content:
--8<---------------cut here---------------start------------->8---
#!/bin/bash
tee /tmp/hunspell-input | hunspell-orig "$@" | tee /tmp/hunspell-output
--8<---------------cut here---------------end--------------->8---
This is what I get:
input:
--8<---------------cut here---------------start------------->8---
!
+
^bla
--8<---------------cut here---------------end--------------->8---
output:
--8<---------------cut here---------------start------------->8---
@(#) International Ispell Version 3.2.06 (but really Hunspell 1.3.2)
& bla 15 0: alb, bl, la, blat, bola, blag, blah, blab, lab, baa, bra, boa, Ila, Ala, Ola
--8<---------------cut here---------------end--------------->8---
I guess that you get "bla 15 1" because of the "^" before the "bla".
That would mean that my hunspell needs another patch. Which one,
please?
Thanks for your efforts,
--
Peter
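For readers following along, the captured input above can be read as an ispell-style pipe-mode session: "!" selects terse mode, "+" selects TeX/LaTeX mode, and a leading "^" marks the rest of the line as text to check. The following is only a minimal sketch of how a client might assemble such input; the function name `build_pipe_input` and its parameters are hypothetical and are not taken from ispell.el.

```python
def build_pipe_input(lines, tex_mode=False, terse=True):
    """Return text that a client might write to `hunspell -a` on stdin.

    Hypothetical helper for illustration only; it is not ispell.el code.
    """
    commands = []
    if terse:
        commands.append("!")  # terse mode: no "*" replies for correct words
    if tex_mode:
        commands.append("+")  # TeX/LaTeX mode
    # A leading "^" asks the speller to treat the rest of the line as text,
    # so the first character is never taken for a command.
    commands.extend("^" + line for line in lines)
    return "\n".join(commands) + "\n"


if __name__ == "__main__":
    # Reproduces the three input lines captured by the wrapper script.
    print(build_pipe_input(["bla"], tex_mode=True), end="")
```

Because of the "^" prefix, a correct reply for "bla" reports offset 1 rather than 0.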
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Tue, 29 Apr 2014 12:49:01 GMT)
Message #76 received at 7781 <at> debbugs.gnu.org (full text, mbox):
I've just tried unpatched hunspell: no problem with TeX-mode.
It's the patch on sf.net that breaks the TeX-mode, the character
position is always 0:
https://sourceforge.net/p/hunspell/patches/57/#d425
I'll build hunspell with Eli's patch now.
Sorry for the noise...
--
Peter
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Tue, 29 Apr 2014 13:58:01 GMT)
Message #79 received at 7781 <at> debbugs.gnu.org (full text, mbox):
> From: Peter Münster <pmlists <at> free.fr>
> Date: Tue, 29 Apr 2014 14:48:43 +0200
> Cc: 7781 <at> debbugs.gnu.org
>
> I've just tried unpatched hunspell: no problem with TeX-mode.
> It's the patch on sf.net that breaks the TeX-mode, the character
> position is always 0:
> https://sourceforge.net/p/hunspell/patches/57/#d425
That's what I thought. If I invoke Hunspell like ispell.el does for a
LaTeX buffer, i.e.
hunspell -a -d en_US -i UTF-8
and then type "^bla RET" into Hunspell, I get this as output:
& bla 15 1: alb, bl, la, bola, blah, blab, lab, baa, ala, bra, boa, Ila, Ala, Ola, Ula
As you see, I get "15 1". If you get 0 instead of 1, then that's the
cause of the problem, because the part of your debug output marked
below:
ispell-process-line: Ispell misalignment error:
[Word from ispell pipe]: [bla], actual (point,line,column): (41,2,16)
^^^^^^^
clearly shows that ispell.el is confused about where the word "bla"
begins in the buffer; the correct data is 42,3,0. Also note that just
before reading Hunspell's output, ispell.el correctly identified both
the word and its location:
ispell-region: string pos (42->45), eol: 45, [in-comment]: [nil], [add-comment]: [nil], [string]: [^bla
]
> I'll build hunspell with Eli's patch now.
I think that will solve the problem.
(I have no idea why visiting the same file in Text mode avoids the
problem. The only difference is that in Text mode, ispell.el does not
skip the first 2 lines, but instead submits them to Hunspell. Why
this makes the difference, I don't know, but probably the lone "^bla"
somehow triggers the bug in the patch you installed, whatever that bug
is.)
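The effect of a wrong offset can be illustrated outside Emacs. The sketch below parses a "&" reply line of the form shown in this thread and checks whether the reported offset really points at the word within the submitted string. It is an illustration under stated assumptions, not the logic of ispell.el: the names `parse_miss` and `is_aligned` are hypothetical, and the offset is assumed to count from the start of the submitted line, including the leading "^".

```python
import re

# "& <word> <count> <offset>: <suggestion>, <suggestion>, ..."
MISS_LINE = re.compile(r"^& (\S+) (\d+) (\d+): (.*)$")


def parse_miss(line):
    """Split a '&' reply into (word, count, offset, suggestions), or None."""
    match = MISS_LINE.match(line)
    if match is None:
        return None
    word, count, offset, suggestions = match.groups()
    return word, int(count), int(offset), suggestions.split(", ")


def is_aligned(sent, word, offset):
    """True if `word` really starts at `offset` within the submitted string."""
    return sent[offset:offset + len(word)] == word


if __name__ == "__main__":
    # Reply lines quoted in this thread: offset 1 (working) and 0 (broken).
    good = "& bla 15 1: alb, bl, la, bola, blah, blab, lab, baa, ala, bra, boa, Ila, Ala, Ola, Ula"
    bad = "& bla 15 0: alb, bl, la, blat, bola, blag, blah, blab, lab, baa, bra, boa, Ila, Ala, Ola"
    for reply in (good, bad):
        word, count, offset, _ = parse_miss(reply)
        print(word, count, offset, is_aligned("^bla", word, offset))
```

With offset 0 the three characters at that position are "^bl", not "bla", which is the kind of mismatch a client would report as a misalignment.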
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Tue, 29 Apr 2014 14:31:03 GMT)
Message #82 received at 7781 <at> debbugs.gnu.org (full text, mbox):
On Tue, Apr 29 2014, Eli Zaretskii wrote:
> (I have no idea why visiting the same file in Text mode avoids the
> problem. The only difference is that in Text mode, ispell.el does not
> skip the first 2 lines, but instead submits them to Hunspell.
No. In latex-mode, emacs switches hunspell into TeX-mode with the "+".
> Why this makes the difference, I don't know, but probably the lone
> "^bla" somehow triggers the bug in the patch you installed, whatever
> that bug is.)
No. In normal mode, the "^bla" works fine. The patch on sf.net just
breaks the TeX-mode: every position becomes 0.
Your patch works nicely, thanks!
I should have tested hunspell on the command line before reporting the
problem. Now I know how to do that.
--
Peter
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Tue, 29 Apr 2014 15:26:01 GMT)
Message #85 received at 7781 <at> debbugs.gnu.org (full text, mbox):
> From: Peter Münster <pmlists <at> free.fr>
> Cc: agustin.martin <at> hispalinux.es, 7781 <at> debbugs.gnu.org
> Date: Tue, 29 Apr 2014 16:30:07 +0200
>
> On Tue, Apr 29 2014, Eli Zaretskii wrote:
>
> > (I have no idea why visiting the same file in Text mode avoids the
> > problem. The only difference is that in Text mode, ispell.el does not
> > skip the first 2 lines, but instead submits them to Hunspell.
>
> No. In latex-mode, emacs switches hunspell into TeX-mode with the "+".
It does both, evidently. Compare this part of your debug output (in
LaTeX buffer):
ispell-region: First skip: \documentclass at (pos,line,column): (1,1,0).
ispell-region: Continue spell-checking with hunspell and default dictionary...
ispell-region: string pos (41->41), eol: 45, [in-comment]: [nil], [add-comment]: [nil], [string]: [nil]
ispell-region: string pos (42->45), eol: 45, [in-comment]: [nil], [add-comment]: [nil], [string]: [^bla
]
with this (in Text buffer):
ispell-region: string pos (1->24), eol: 24, [in-comment]: [nil], [add-comment]: [nil], [string]: [^\documentclass{article}
]
ispell-region: string pos (24->24), eol: 41, [in-comment]: [nil], [add-comment]: [nil], [string]: [nil]
ispell-region: string pos (25->41), eol: 41, [in-comment]: [nil], [add-comment]: [nil], [string]: [^\begin{document}
]
ispell-region: string pos (41->41), eol: 45, [in-comment]: [nil], [add-comment]: [nil], [string]: [nil]
ispell-region: string pos (42->45), eol: 45, [in-comment]: [nil], [add-comment]: [nil], [string]: [^bla
]
ispell-region: string pos (45->45), eol: 60, [in-comment]: [nil], [add-comment]: [nil], [string]: [nil]
ispell-region: string pos (46->60), eol: 60, [in-comment]: [nil], [add-comment]: [nil], [string]: [^\end{document}
]
As you see, in the second case, the TeX directives are also sent to
Hunspell for checking, while in the first case they are not.
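A comparison like the one above can also be done mechanically. The sketch below pulls the submitted strings out of debug lines in the format quoted in this message; it is a rough illustration only. The regular expression is an assumption based solely on the excerpts shown here, it would break on submitted text that itself contains "]", and the name `submitted_strings` is hypothetical.

```python
import re

# Matches "string pos (A->B) ... [string]: [TEXT]" where TEXT may span lines.
ENTRY = re.compile(r"string pos \((\d+)->(\d+)\).*?\[string\]: \[(.*?)\]", re.DOTALL)

LATEX_LOG = r"""ispell-region: string pos (41->41), eol: 45, [in-comment]: [nil], [add-comment]: [nil], [string]: [nil]
ispell-region: string pos (42->45), eol: 45, [in-comment]: [nil], [add-comment]: [nil], [string]: [^bla
]
"""

TEXT_LOG = r"""ispell-region: string pos (1->24), eol: 24, [in-comment]: [nil], [add-comment]: [nil], [string]: [^\documentclass{article}
]
ispell-region: string pos (24->24), eol: 41, [in-comment]: [nil], [add-comment]: [nil], [string]: [nil]
ispell-region: string pos (25->41), eol: 41, [in-comment]: [nil], [add-comment]: [nil], [string]: [^\begin{document}
]
ispell-region: string pos (42->45), eol: 45, [in-comment]: [nil], [add-comment]: [nil], [string]: [^bla
]
"""


def submitted_strings(log):
    """Return (start, end, text) for every non-nil string in a debug log."""
    return [
        (int(start), int(end), text.rstrip("\n"))
        for start, end, text in ENTRY.findall(log)
        if text != "nil"
    ]


if __name__ == "__main__":
    print("LaTeX buffer:", submitted_strings(LATEX_LOG))
    print("Text buffer: ", submitted_strings(TEXT_LOG))
```

Run on the two excerpts, the LaTeX log yields only the "^bla" entry, while the Text log also lists the TeX directives.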
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Tue, 29 Apr 2014 16:35:01 GMT)
Message #88 received at 7781 <at> debbugs.gnu.org (full text, mbox):
On Tue, Apr 29 2014, Eli Zaretskii wrote:
>> > The only difference is that in Text mode, ispell.el does not skip
>> > the first 2 lines, but instead submits them to Hunspell.
>>
>> No. In latex-mode, emacs switches hunspell into TeX-mode with the "+".
>
> It does both, evidently. Compare this part of your debug output (in
> LaTeX buffer):
Sorry. I just wanted to say: "No, it's not the *only* difference." ... ;)
--
Peter
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Thu, 25 Sep 2014 09:55:02 GMT)
Message #91 received at 7781 <at> debbugs.gnu.org (full text, mbox):
I have sent a message to the upstream maintainer informing him of the
situation and asking for the patch to be included in the next release.
--
http://rrt.sc3d.org
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Thu, 16 Oct 2014 13:38:01 GMT)
Message #94 received at 7781 <at> debbugs.gnu.org (full text, mbox):
Control: tag 7781 + upstream fixed-upstream
On Fri, Feb 11, 2011 at 06:00:53PM +0100, Agustin Martin wrote:
> forwarded 7781 https://sourceforge.net/tracker/?func=detail&aid=3178449&group_id=143754&atid=756395
> thanks
>
> 2011/1/7 Agustin Martin <agustin.martin <at> hispalinux.es>:
> > 2011/1/4 Reuben Thomas <rrt <at> sc3d.org>:
> >> With the following text, and using emacs -Q, I get the errors you can
> >> see in the messages log below when using hunspell to spell-check a UTF-8
> >> buffer with some extended characters in it.
>
> > Do not worry about first number, is the number of suggestions. However
> > position in second number differ. Seems that hunspell is not
> > considering that apostrophe as a single (multibyte) char when
> > counting, but as three components
> >
> > Looks to me an hunspell bug. I found no reference to this problem in
> > hunspell sf site, but noticed that Hunspell 1.2.14 was released
> > yesterday. Need to check if that has some related new.
>
> Opened an hunspell bug report for bad count problem
>
> https://sourceforge.net/tracker/?func=detail&aid=3178449&group_id=143754&atid=756395
Reuben Thomas wrote:
> I have sent a message to the upstream maintainer informing him of the
> situation and asking for the patch to be included in the next release.
Proposed patch has been integrated in hunspell upstream by caolan mcnamara.
Regards,
PS: My old hispalinux.es address is failing silently and I do not know if I will
ever be able to get it fixed. Please use my current gmail address for replies.
--
Agustin
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Thu, 16 Oct 2014 13:55:01 GMT)
Message #97 received at 7781 <at> debbugs.gnu.org (full text, mbox):
> Date: Thu, 16 Oct 2014 15:37:24 +0200
> From: Agustin Martin <agustin6martin <at> gmail.com>
>
> > Opened an hunspell bug report for bad count problem
> >
> > https://sourceforge.net/tracker/?func=detail&aid=3178449&group_id=143754&atid=756395
>
> Reuben Thomas wrote:
> > I have sent a message to the upstream maintainer informing him of the
> > situation and asking for the patch to be included in the next release.
>
> Proposed patch has been integrated in hunspell upstream by caolan mcnamara.
Do you mean there's now an official release of Hunspell with this bug
fixed? If so, where can one find it?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Thu, 16 Oct 2014 14:10:02 GMT)
Message #100 received at 7781 <at> debbugs.gnu.org (full text, mbox):
On Thu, Oct 16, 2014 at 04:54:16PM +0300, Eli Zaretskii wrote:
> > Date: Thu, 16 Oct 2014 15:37:24 +0200
> > From: Agustin Martin <agustin6martin <at> gmail.com>
> >
> > > Opened an hunspell bug report for bad count problem
> > >
> > > https://sourceforge.net/tracker/?func=detail&aid=3178449&group_id=143754&atid=756395
> >
> > Reuben Thomas wrote:
> > > I have sent a message to the upstream maintainer informing him of the
> > > situation and asking for the patch to be included in the next release.
> >
> > Proposed patch has been integrated in hunspell upstream by caolan mcnamara.
>
> Do you mean there's now an official release of Hunspell with this bug
> fixed? If so, where can one find it?
I am afraid it only means that the fix has been pushed to the upstream VCS:
http://hunspell.cvs.sourceforge.net/viewvc/hunspell/hunspell/src/tools/hunspell.cxx?r1=1.60&r2=1.61
More good news is that this is not the only bug just handled:
http://sourceforge.net/p/hunspell/bugs/228/
[hunspell:bugs] #228 Some problems with Emacs and init string in pipe mode
has been changed to closed-accepted and pushed to the repo (r1.62).
Regards,
--
Agustin
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Fri, 28 Aug 2020 12:01:01 GMT)
Message #103 received at 7781 <at> debbugs.gnu.org (full text, mbox):
Reuben Thomas <rrt <at> sc3d.org> writes:
> With the following text, and using emacs -Q, I get the errors you can
> see in the messages log below when using hunspell to spell-check a UTF-8
> buffer with some extended characters in it.
>
> I did test this with emacs -Q, but the current session, in which I
> reproduced the problem and am now composing this bug report, was not
> started with -Q (this is so submitting the bug report works properly!).
>
> I am running a freshly bzr-pulled build of the emacs-23 branch.
>
> Text follows
I tried this but couldn't reproduce the bug using current master and
Hunspell 1.7.0. Having read the bug report, IIUC, this was a bug in
Hunspell and not in Emacs?
Are you still able to reproduce this using a recent Emacs and Hunspell?
If I don't hear back from you within a couple of weeks, I'll just
close this bug as unreproducible.
Best regards,
Stefan Kangas
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Fri, 28 Aug 2020 12:37:02 GMT)
Message #106 received at 7781 <at> debbugs.gnu.org (full text, mbox):
> From: Stefan Kangas <stefan <at> marxist.se>
> Date: Fri, 28 Aug 2020 05:00:11 -0700
> Cc: 7781 <at> debbugs.gnu.org
>
> Reuben Thomas <rrt <at> sc3d.org> writes:
>
> > With the following text, and using emacs -Q, I get the errors you can
> > see in the messages log below when using hunspell to spell-check a UTF-8
> > buffer with some extended characters in it.
> >
> > I did test this with emacs -Q, but the current session, in which I
> > reproduced the problem and am now composing this bug report, was not
> > started with -Q (this is so submitting the bug report works properly!).
> >
> > I am running a freshly bzr-pulled build of the emacs-23 branch.
> >
> > Text follows
>
> I tried this but couldn't reproduce the bug using current master and
> Hunspell 1.7.0. Having read the bug report, IIUC, this was a bug in
> Hunspell and not in Emacs?
>
> Are you still able to reproduce this using a recent Emacs and Hunspell?
Some (old) versions of Hunspell had a bug, whereby the mis-spelled
words were reported with offsets in bytes, not in characters. When
this happens, ispell.el reports "misalignment" errors.
I don't remember when (or even if) Hunspell fixed that problem (in the
version I use I fixed it myself), but if 1.7.0 has that problem fixed,
you will not see the problem.
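The byte-versus-character discrepancy described above is easy to demonstrate. The sketch below compares the two offsets for a word that follows a curly apostrophe, using a phrase from the text in the original report. It is only an illustration of the arithmetic; the helper names are hypothetical and nothing here is taken from Hunspell or ispell.el.

```python
def char_offset(line, word):
    """Offset of `word` in `line`, counted in characters."""
    return line.index(word)


def byte_offset(line, word, encoding="utf-8"):
    """Offset of `word` in `line`, counted in encoded bytes."""
    return line.encode(encoding).index(word.encode(encoding))


if __name__ == "__main__":
    # U+2019 (the curly apostrophe) occupies three bytes in UTF-8, so every
    # word after it is shifted by two when offsets are counted in bytes.
    line = "books I\u2019ve read"
    print("characters:", char_offset(line, "read"))
    print("bytes:     ", byte_offset(line, "read"))
```

A client that expects character offsets but receives byte offsets will look two positions too far to the right for each such apostrophe earlier on the line, which matches the misalignment errors reported here.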
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#7781; Package emacs. (Fri, 28 Aug 2020 12:57:01 GMT)
Message #109 received at 7781 <at> debbugs.gnu.org (full text, mbox):
tags 7781 + notabug
close 7781
thanks
Eli Zaretskii <eliz <at> gnu.org> writes:
> Some (old) versions of Hunspell had a bug, whereby the mis-spelled
> words were reported with offsets in bytes, not in characters. When
> this happens, ispell.el reports "misalignment" errors.
>
> I don't remember when (or even if) Hunspell fixed that problem (in the
> version I use I fixed it myself), but if 1.7.0 has that problem fixed,
> you will not see the problem.
Thanks, so this is not a bug in Emacs. I'm therefore closing this bug report.
If this conclusion is incorrect, please reopen the bug report.
Best regards,
Stefan Kangas
Added tag(s) notabug. Request was from Stefan Kangas <stefan <at> marxist.se> to control <at> debbugs.gnu.org. (Fri, 28 Aug 2020 12:57:01 GMT)
bug closed, send any further explanations to 7781 <at> debbugs.gnu.org and Reuben Thomas <rrt <at> sc3d.org>. Request was from Stefan Kangas <stefan <at> marxist.se> to control <at> debbugs.gnu.org. (Fri, 28 Aug 2020 12:57:02 GMT)
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 26 Sep 2020 11:24:07 GMT)
This bug report was last modified 4 years and 288 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.