GNU bug report logs - #49982
27.2; ispell.el fails to find a Hunspell dictionary to use as default despite ispell-dictionary being set


Package: emacs;

Reported by: Kisaragi Hiu <mail <at> kisaragi-hiu.com>

Date: Tue, 10 Aug 2021 15:13:01 UTC

Severity: normal

Found in version 27.2

Fixed in version 29.1

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.



Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#49982; Package emacs. (Tue, 10 Aug 2021 15:13:02 GMT)

Acknowledgement sent to Kisaragi Hiu <mail <at> kisaragi-hiu.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Tue, 10 Aug 2021 15:13:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Kisaragi Hiu <mail <at> kisaragi-hiu.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 27.2; ispell.el fails to find a Hunspell dictionary to use as default
 despite ispell-dictionary being set
Date: Wed, 11 Aug 2021 00:12:06 +0900

This configuration should be everything that's needed for ispell.el to
work with Hunspell, regardless of system locale:

    (setq ispell-program-name (executable-find "hunspell")
          ispell-dictionary "en_US")

However, when system locale (the LANG environment variable) does not 
have a corresponding Hunspell dictionary, 
`ispell-find-hunspell-dictionaries` returns the error "Can't find 
Hunspell dictionary with a .aff affix file", despite ispell-dictionary 
being set.

ispell.el relies on Hunspell to load a default and report it, but
Hunspell just errors out if it can't find a dictionary for the system
locale. And because ispell.el is trying to get Hunspell's default
dictionary, it doesn't pass `ispell-dictionary' onto Hunspell.

This behavior is surprising. If `ispell-dictionary` is non-nil, that
means the user has already specified their preferred dictionary, and it
should not matter that Hunspell cannot find the dictionary it would use
when a preferred dictionary isn't specified.

It's ispell.el that needs to be fixed here because the user specifies
their preference in Emacs, and it is its job to communicate that
preference to Hunspell.

`ispell-find-hunspell-dictionaries` should pass "-d
${ispell-dictionary}" to Hunspell if `ispell-dictionary` is set. This 
invocation:

    hunspell -d "en_US" -D /dev/null

works as expected regardless of the system locale.
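
The proposal can be pictured as a small change in how the probe's command line is assembled. Below is a minimal sketch in Emacs Lisp, assuming a hypothetical helper `my/hunspell-probe-args` (it is not an ispell.el function); `null-device` is Emacs's own variable naming the system null device.

```elisp
;; Hypothetical helper, not part of ispell.el: build the argument list
;; for the dictionary-discovery call described above.
(defun my/hunspell-probe-args (dictionary)
  "Return the arguments for a `hunspell -D' probe.
When DICTIONARY is a non-empty string, prepend \"-d DICTIONARY\"."
  (append (when (and (stringp dictionary)
                     (not (string= dictionary "")))
            (list "-d" dictionary))
          ;; The null device is passed as the file argument, as in
          ;; the shell invocation above.
          (list "-D" null-device)))

;; With a preferred dictionary the probe no longer depends on LANG:
(my/hunspell-probe-args "en_US")
;; Without one, the locale-based default is probed as before:
(my/hunspell-probe-args nil)
```

On GNU/Linux, where `null-device` is "/dev/null", the first call yields the same arguments as the shell invocation above.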

* System info

In GNU Emacs 27.2 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.27,
 cairo version 1.17.4) of 2021-03-27 built on juergen Windowing system
 distributor 'The X.Org Foundation', version 11.0.12013000 System
 Description: Arch Linux

Hunspell 1.7.0; hunspell -D is

    SEARCH PATH:

.::/usr/share/hunspell:/usr/share/myspell:/usr/share/myspell/dicts:/Library/Spelling:/home/kisaragi-hiu/.openoffice.org/3/user/wordbook:/home/kisaragi-hiu/.openoffice.org2/user/wordbook:/home/kisaragi-hiu/.openoffice.org2.0/user/wordbook:/home/kisaragi-hiu/Library/Spelling:/opt/openoffice.org/basis3.0/share/dict/ooo:/usr/lib/openoffice.org/basis3.0/share/dict/ooo:/opt/openoffice.org2.4/share/dict/ooo:/usr/lib/openoffice.org2.4/share/dict/ooo:/opt/openoffice.org2.3/share/dict/ooo:/usr/lib/openoffice.org2.3/share/dict/ooo:/opt/openoffice.org2.2/share/dict/ooo:/usr/lib/openoffice.org2.2/share/dict/ooo:/opt/openoffice.org2.1/share/dict/ooo:/usr/lib/openoffice.org2.1/share/dict/ooo:/opt/openoffice.org2.0/share/dict/ooo:/usr/lib/openoffice.org2.0/share/dict/ooo
    AVAILABLE DICTIONARIES (path is not mandatory for -d option):
    ... [truncated]
    /usr/share/hunspell/en_US-large
    ... [truncated]

* Reproduction

- Notice how Hunspell does not return LOADED DICTIONARY under, for 
example, ja_JP:

    export LANG=ja_JP
    hunspell -D /dev/null
    # Output:
    # ... [truncated]
    # Can't open affix or dictionary files for dictionary named "ja_JP".

- Now, in Emacs with LANG set to ja_JP, set ispell up with Hunspell as 
usual.

    (setq ispell-program-name (executable-find "hunspell")
          ispell-dictionary "en_US")

- Observe the error.

    (ispell-start-process)
    ;; -> ispell-find-hunspell-dictionaries: Can’t find Hunspell
    ;;    dictionary with a .aff affix file




Message #8 received at 49982 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Kisaragi Hiu <mail <at> kisaragi-hiu.com>
Cc: 49982 <at> debbugs.gnu.org
Subject: Re: bug#49982: 27.2;
 ispell.el fails to find a Hunspell dictionary to use as default
 despite ispell-dictionary being set
Date: Tue, 10 Aug 2021 19:03:07 +0300

> Date: Wed, 11 Aug 2021 00:12:06 +0900
> From:  Kisaragi Hiu via "Bug reports for GNU Emacs,
>  the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>
> 
> This configuration should be everything that's needed for ispell.el to
> work with Hunspell, regardless of system locale:
> 
>      (setq ispell-program-name (executable-find "hunspell")
>            ispell-dictionary "en_US")
> 
> However, when system locale (the LANG environment variable) does not 
> have a corresponding Hunspell dictionary, 
> `ispell-find-hunspell-dictionaries` returns the error "Can't find 
> Hunspell dictionary with a .aff affix file", despite ispell-dictionary 
> being set.
> 
> ispell.el relies on Hunspell to load a default and report it, but
> Hunspell just errors out if it can't find a dictionary for the system
> locale. And because ispell.el is trying to get Hunspell's default
> dictionary, it doesn't pass `ispell-dictionary' onto Hunspell.
> 
> This behavior is surprising. If `ispell-dictionary` is non-nil, that
> means the user has already specified their preferred dictionary, and it
> should not matter that Hunspell cannot find the dictionary it would use
> when a preferred dictionary isn't specified.
> 
> It's ispell.el that needs to be fixed here because the user specifies
> their preference in Emacs, and it is its job to communicate that
> preference to Hunspell.
> 
> `ispell-find-hunspell-dictionaries` should pass "-d
> ${ispell-dictionary}" to Hunspell if `ispell-dictionary` is set. This 
> invocation:
> 
>      hunspell -d "en_US" -D /dev/null
> 
> works as expected regardless of the system locale.

Thanks for the report and the analysis.

Frankly, I'm a bit wary of making the proposed change unconditionally.
First, yours is an unusual use case, I think: when Hunspell is
installed, the dictionary that corresponds to the locale is always
installed, because otherwise Hunspell will not work reliably from the
shell command line.  And second, relying on the non-nil value of
ispell-dictionary is fragile: the value could be a remnant from some
previous invocation or from an unsuccessful customization that has
nothing to do with the user's choice or his/her current intent.

Moreover, if you manually set ispell-dictionary, then what would be
the purpose of calling ispell-find-hunspell-dictionaries at all?

So maybe we should add a new user option that would force using the
value of ispell-dictionary right from the start.  That would at least
avoid the risk of breaking somebody else's use case.

I wonder if anyone else has an opinion about this.




Message #11 received at 49982 <at> debbugs.gnu.org:

From: Kisaragi Hiu <mail <at> kisaragi-hiu.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 49982 <at> debbugs.gnu.org
Subject: Re: bug#49982: 27.2; ispell.el fails to find a Hunspell dictionary to
 use as default despite ispell-dictionary being set
Date: Wed, 11 Aug 2021 03:51:22 +0900

Thank you for the response! Let me try to add some clarifications (that
hopefully don't sound too harsh):

> First, yours is an unusual use case, I think: when Hunspell is
> installed, the dictionary that corresponds to the locale is always
> installed, because otherwise Hunspell will not work reliably from the
> shell command line.

I'm fairly certain my use case isn't unusual.

There are no easily installable Hunspell dictionaries for, among other 
languages:

- Any variant of Chinese (Mandarin)
- Japanese
- Kazakh
- Khmer
- Malay

Every user of any of these languages who tries to set up Hunspell
along with ispell.el and Flyspell has to find or invent a poorly
documented workaround.

- [[https://texwiki.texjp.org/?Hunspell][TeXJP (Japanese) mentions]] 
"add[ing] the DICTIONARY or WORDLIST environment variables if needed" 
(「また、必要に応じて環境変数DICTIONARYやWORDLISTを指定しておきます。」)
- [[https://home.hirosaki-u.ac.jp/heroic-2020/1575/][Hirosaki University 
Information Technology Center PC lab's tutorial to spellchecking in 
Emacs]] sets DICTIONARY to en_US
- 200ok.ch (developer of Organice)'s 
[[https://200ok.ch/posts/2020-08-22_setting_up_spell_checking_with_multiple_dictionaries.html][tutorial 
for using multiple dictionaries for Hunspell + ispell.el]] mentions

    ;; Configure `LANG`, otherwise ispell.el cannot find a 'default
    ;; dictionary' even though multiple dictionaries will be configured
    ;; in next line.
    (setenv "LANG" "en_US.UTF-8")

- [[http://blog.binchen.org/posts/what-s-the-best-spell-check-set-up-in-emacs/][Chen Bin's blog post on setting up spell check]]
  uses this block:

    ;; find aspell and hunspell automatically
    (cond
     ;; try hunspell at first
     ;; if hunspell does NOT exist, use aspell
     ((executable-find "hunspell")
      (setq ispell-program-name "hunspell")
      (setq ispell-local-dictionary "en_US")
      (setq ispell-local-dictionary-alist
            ;; Please note the list `("-d" "en_US")` contains ACTUAL
            ;; parameters passed to hunspell
            ;; You could use `("-d" "en_US,en_US-med")` to check with
            ;; multiple dictionaries
            '(("en_US" "[[:alpha:]]" "[^[:alpha:]]" "[']" nil
               ("-d" "en_US") nil utf-8)))

      ;; new variable `ispell-hunspell-dictionary-alist' is defined in
      ;; Emacs
      ;; If it's nil, Emacs tries to automatically set up the dictionaries.
      (when (boundp 'ispell-hunspell-dictionary-alist)
        (setq ispell-hunspell-dictionary-alist
              ispell-local-dictionary-alist)))
     ;; (the aspell fallback clause is not shown in this excerpt)
     )

  "Emacs tries to automatically set up the dictionaries" refers to
  ispell-set-spellchecker-params running
  ispell-find-hunspell-dictionaries after seeing that
  ispell-hunspell-dictionary-alist is nil.

My use case is not unusual. Fixing this bug would eliminate the need
for these workarounds.

(From the command line you just pass in -d yourself. Setting environment 
variables is also a native way of configuring programs in the CLI; in 
Emacs generally wrapper packages like ispell.el define user options 
instead of asking users to do `setenv` themselves.)

> And second, relying on the non-nil value of
> ispell-dictionary is fragile: the value could be a remnant from some
> previous invocation or from an unsuccessful customization that has
> nothing to do with the user's choice or his/her current intent.

ispell-dictionary is a user option, not an internal variable. Nothing
in ispell.el changes ispell-dictionary besides the command to help the
user change the preferred dictionary, `ispell-change-dictionary`, so
the value cannot be a remnant from a previous invocation.

By default ispell-dictionary is nil, which tells ispell.el to use the
spell checker's default, as is evident from its defcustom:

    (defcustom ispell-dictionary nil
      "Default dictionary to use if `ispell-local-dictionary' is nil."
      :type '(choice string
                     (const :tag "default" nil))
      :group 'ispell)

In fact, the user can set ispell-dictionary in their init.el when 
they're using aspell and have it work as expected. That's why I consider 
this a bug.

> Moreover, if you manually set ispell-dictionary, then what would be
> the purpose of calling ispell-find-hunspell-dictionaries at all?

I don't call ispell-find-hunspell-dictionaries myself --- turning on 
flyspell eventually calls it.

The error actually occurs when flyspell-mode-on calls
ispell-set-spellchecker-params, which in turn calls
ispell-find-hunspell-dictionaries to set up internal variables.

This is how Chen Bin's workaround works: it sets
ispell-local-dictionary-alist first, then sets
ispell-hunspell-dictionary-alist to it, preventing
ispell-set-spellchecker-params from triggering the error.

ispell-find-hunspell-dictionaries in fact always returns nil, and is
only used for its side effects: setting up
- ispell-hunspell-dictionary-alist,
- ispell-hunspell-dict-paths-alist,
- and ispell-dicts-name2locale-equivs-alist.

I'd like to hear more perspectives on this as well.




Message #14 received at 49982 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Kisaragi Hiu <mail <at> kisaragi-hiu.com>
Cc: 49982 <at> debbugs.gnu.org
Subject: Re: bug#49982: 27.2; ispell.el fails to find a Hunspell dictionary to
 use as default despite ispell-dictionary being set
Date: Tue, 10 Aug 2021 22:29:43 +0300

> From: Kisaragi Hiu <mail <at> kisaragi-hiu.com>
> Cc: 49982 <at> debbugs.gnu.org
> Date: Wed, 11 Aug 2021 03:51:22 +0900
> 
> Thank you for the response! Let me try to add some clarifications (that 
> hopefully don't sound too harsh):
> 
>  > First, yours is an unusual use case, I think: when Hunspell is
>  > installed, the dictionary that corresponds to the locale is always
>  > installed, because otherwise Hunspell will not work reliably from the
>  > shell command line.
> 
> I'm fairly certain my use case isn't unusual.
> 
> There are no easily installable Hunspell dictionaries for, among other 
> languages:
> 
> - Any variant of Chinese (Mandarin)
> - Japanese
> - Kazakh
> - Khmer
> - Malay
> 
> Every user of any of these languages who tries to set up Hunspell
> along with ispell.el and Flyspell has to find or invent a poorly
> documented workaround.
> 
> - [[https://texwiki.texjp.org/?Hunspell][TeXJP (Japanese) mentions]] 
> "add[ing] the DICTIONARY or WORDLIST environment variables if needed" 
> (「また、必要に応じて環境変数DICTIONARYやWORDLISTを指定しておきます。」)
> - [[https://home.hirosaki-u.ac.jp/heroic-2020/1575/][Hirosaki University 
> Information Technology Center PC lab's tutorial to spellchecking in 
> Emacs]] sets DICTIONARY to en_US
> - 200ok.ch (developer of Organice)'s 
> [[https://200ok.ch/posts/2020-08-22_setting_up_spell_checking_with_multiple_dictionaries.html][tutorial 
> for using multiple dictionaries for Hunspell + ispell.el]] mentions

Indeed, defining DICTIONARY in the environment is the way to control
the default dictionary.  It is documented in the Hunspell's man page.
Why cannot it be the solution for when no Hunspell dictionary could be
found that matches the locale?  Using $DICTIONARY should solve your
problem both inside Emacs and outside it.




Message #17 received at 49982 <at> debbugs.gnu.org:

From: Kisaragi Hiu <mail <at> kisaragi-hiu.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 49982 <at> debbugs.gnu.org
Subject: Re: bug#49982: 27.2; ispell.el fails to find a Hunspell dictionary to
 use as default despite ispell-dictionary being set
Date: Wed, 11 Aug 2021 20:17:20 +0900

> Indeed, defining DICTIONARY in the environment is the way to control
> the default dictionary.  It is documented in the Hunspell's man page.
> Why cannot it be the solution for when no Hunspell dictionary could be
> found that matches the locale?  Using $DICTIONARY should solve your
> problem both inside Emacs and outside it.

I don't know, maybe I'm biased here. Hunspell has its quirks, but isn't 
it ispell.el's job to work around quirks in spellcheckers, and not the 
end user's? ispell.el worked around Hunspell 1.7's new output quirk. Why 
can't it work around this quirk?

*My* problem is already solved by using the workaround. The bug is that 
nobody should have to use the workaround.

Using environment variables to configure subprocesses is always 
something that a user can do, but, as you know, there's a reason why 
ispell.el exposes spellchecker options through Emacs user options.

Besides, which dictionary one specifies in `DICTIONARY` doesn't actually 
matter, it just needs to be one that exists, as it will be overridden by 
ispell-dictionary when ispell.el actually starts spellchecking. You can 
do (in emacs -Q):

    (setenv "LANG" "ja_JP") ; trigger the quirk
    (setenv "DICTIONARY" "en_US") ; tame ispell-find-hunspell-dictionaries
    (setq ispell-program-name (executable-find "hunspell")
          ispell-dictionary "en_GB")
    (flyspell-mode)

and see that it's spellchecking color to colour. (Try typing "color" 
then running M-x flyspell-auto-correct-previous-word)

---

ispell-dictionary is ispell.el's way of specifying the main dictionary. 
The manual:

> Spell-checkers look up spelling in two dictionaries: the standard
> dictionary and your personal dictionary.  The standard dictionary is
> specified by the variable ‘ispell-local-dictionary’ or, if that is
> ‘nil’, by the variable ‘ispell-dictionary’.  If both are ‘nil’, the
> spelling program’s default dictionary is used.

The spelling program's default should only ever have an effect when both
ispell-local-dictionary and ispell-dictionary are nil.
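
That precedence amounts to a simple fallback chain. Below is a minimal sketch, using a hypothetical function `my/effective-dictionary` whose two arguments stand in for the two variables; it illustrates the documented order and is not copied from ispell.el.

```elisp
;; Hypothetical illustration of the precedence described in the manual.
(defun my/effective-dictionary (local global)
  "Return the dictionary name chosen from LOCAL and GLOBAL.
LOCAL stands in for `ispell-local-dictionary' and GLOBAL for
`ispell-dictionary'.  A nil result means the spelling program's
own default dictionary should be used."
  (or local global))

(my/effective-dictionary nil "en_GB")     ; the global setting is used
(my/effective-dictionary "fr_FR" "en_GB") ; the local setting takes priority
(my/effective-dictionary nil nil)         ; nil: fall back to the program default
```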




Message #20 received at 49982 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Kisaragi Hiu <mail <at> kisaragi-hiu.com>
Cc: 49982 <at> debbugs.gnu.org
Subject: Re: bug#49982: 27.2; ispell.el fails to find a Hunspell dictionary to
 use as default despite ispell-dictionary being set
Date: Wed, 11 Aug 2021 15:12:40 +0300

> From: Kisaragi Hiu <mail <at> kisaragi-hiu.com>
> Cc: 49982 <at> debbugs.gnu.org
> Date: Wed, 11 Aug 2021 20:17:20 +0900
> 
>  > Indeed, defining DICTIONARY in the environment is the way to control
>  > the default dictionary.  It is documented in the Hunspell's man page.
>  > Why cannot it be the solution for when no Hunspell dictionary could be
>  > found that matches the locale?  Using $DICTIONARY should solve your
>  > problem both inside Emacs and outside it.
> 
> I don't know, maybe I'm biased here. Hunspell has its quirks, but isn't 
> it ispell.el's job to work around quirks in spellcheckers, and not the 
> end user's?

Not when the spell-checker is basically not configured correctly.

> ispell.el worked around Hunspell 1.7's new output quirk.

That was something users could do nothing on their end to solve.

> Using environment variables to configure subprocesses is always 
> something that a user can do, but, as you know, there's a reason why 
> ispell.el exposes spellchecker options through Emacs user options.

That's not what I meant.  I meant to suggest that you set DICTIONARY
in the init files of your interactive shell, so that it would allow
you to use Hunspell both inside Emacs (because Emacs inherits the
environment variables of its parent shell) and outside Emacs.  I
didn't mean to suggest that you (or others) should inject DICTIONARY
into the environment of the Hunspell sub-process by doing something in
Emacs, like setenv etc.

> Besides, which dictionary one specifies in `DICTIONARY` doesn't actually 
> matter, it just needs to be one that exists, as it will be overridden by 
> ispell-dictionary when ispell.el actually starts spellchecking.

It should be the dictionary you want to use by default.  In your case,
I assume it's en_US.




Message #23 received at 49982 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Kisaragi Hiu <mail <at> kisaragi-hiu.com>
Cc: 49982 <at> debbugs.gnu.org
Subject: Re: bug#49982: 27.2; ispell.el fails to find a Hunspell dictionary
 to use as default despite ispell-dictionary being set
Date: Mon, 22 Aug 2022 14:57:59 +0200

Kisaragi Hiu <mail <at> kisaragi-hiu.com> writes:

> This configuration should be everything that's needed for ispell.el to
> work with Hunspell, regardless of system locale:
>
>     (setq ispell-program-name (executable-find "hunspell")
>           ispell-dictionary "en_US")
>
> However, when system locale (the LANG environment variable) does not
> have a corresponding Hunspell dictionary,
> `ispell-find-hunspell-dictionaries` returns the error "Can't find
> Hunspell dictionary with a .aff affix file", despite ispell-dictionary
> being set.

I've now fixed this in Emacs 29 (by first using our current rules, and
then trying again with -d ispell-dictionary).
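
A minimal sketch of that two-step lookup follows, for illustration only; it is not the committed change. `my/find-default-dictionary` and its PROBE argument are hypothetical: PROBE stands in for running `hunspell -D` with the given extra arguments and returning the name of the loaded dictionary, or nil if none was loaded.

```elisp
;; Hypothetical sketch of "current rules first, then retry with -d".
(defun my/find-default-dictionary (probe dictionary)
  "Return the default dictionary reported by PROBE, or nil.
PROBE is called with a list of extra command-line arguments and
returns the loaded dictionary name, or nil when nothing loaded.
First try with no extra arguments (the locale-based default); if
that fails and DICTIONARY is non-nil, retry with \"-d DICTIONARY\"."
  (or (funcall probe nil)
      (and dictionary
           (funcall probe (list "-d" dictionary)))))

;; Stub probe simulating a locale that has no dictionary: it only
;; succeeds when an explicit "-d NAME" is supplied.
(my/find-default-dictionary
 (lambda (args) (cadr (member "-d" args)))
 "en_US")
```

With the stub probe the first attempt returns nil and the retry returns "en_US"; when `dictionary` is nil the function returns nil, which corresponds to the original error case.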




bug marked as fixed in version 29.1, send any further explanations to 49982 <at> debbugs.gnu.org and Kisaragi Hiu <mail <at> kisaragi-hiu.com> Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Mon, 22 Aug 2022 12:59:02 GMT)

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 20 Sep 2022 11:24:12 GMT)

This bug report was last modified 1 year and 217 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.