GNU bug report logs - #23097
24.5; ispell.el: lines with both CASECHARS and NOT-CASECHARS get sent to the spell checker

Please note: This is a static page, with minimal formatting, updated once a day.

Package: emacs; Reported by: Nikolay Kudryavtsev <nikolay.kudryavtsev@HIDDEN>; dated Wed, 23 Mar 2016 18:12:01 UTC; Maintainer for emacs is bug-gnu-emacs@HIDDEN.

Message received at 23097 <at> debbugs.gnu.org:


Date: Wed, 23 Mar 2016 20:22:42 +0200
Message-Id: <83fuvh2gwd.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Nikolay Kudryavtsev <nikolay.kudryavtsev@HIDDEN>
In-reply-to: <56F2DC47.2090600@HIDDEN> (message from Nikolay Kudryavtsev on
 Wed, 23 Mar 2016 21:11:19 +0300)
Subject: Re: bug#23097: 24.5;
 ispell.el: lines with both CASECHARS and NOT-CASECHARS get sent to
 the spell checker
References: <56F2DC47.2090600@HIDDEN>
Cc: 23097 <at> debbugs.gnu.org

> From: Nikolay Kudryavtsev <nikolay.kudryavtsev@HIDDEN>
> Date: Wed, 23 Mar 2016 21:11:19 +0300
> 
> Each entry in ispell-dictionary-alist has elements called CASECHARS and 
> NOT-CASECHARS. They are used for defining what gets sent to the spell 
> checker and what does not.
> 
> One use case for them is that, if you have two dictionaries for 
> languages with totally different alphabets, you can spellcheck a file 
> where both languages are mixed together. In theory.

Don't you need to restart the spell-checker each time you switch the
dictionaries?  AFAIK, only Hunspell supports such mixed
spell-checking, and with Hunspell you don't need to break the line
into separate words in that case.  With any other spell-checker, you
need to restart it whenever you switch languages.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#23097; Package emacs. Full text available.

Message received at submit <at> debbugs.gnu.org:


From: Nikolay Kudryavtsev <nikolay.kudryavtsev@HIDDEN>
To: bug-gnu-emacs@HIDDEN
Subject: 24.5; ispell.el: lines with both CASECHARS and NOT-CASECHARS get sent
 to the spell checker
Message-ID: <56F2DC47.2090600@HIDDEN>
Date: Wed, 23 Mar 2016 21:11:19 +0300

Each entry in ispell-dictionary-alist has elements called CASECHARS and 
NOT-CASECHARS. They define what gets sent to the spell checker and what 
does not.

One use case for them is that, if you have two dictionaries for 
languages with totally different alphabets, you can spellcheck a file 
where both languages are mixed together. In theory.

Here's what happens in practice:
If a line contains only CASECHARS, it gets sent to the spell checker.
If a line contains only NOT-CASECHARS, it does not get sent to the spell 
checker.
If a line contains both CASECHARS and NOT-CASECHARS, the whole line gets 
sent to the spell checker.

Sending the whole line defeats the purpose of NOT-CASECHARS. I think the 
reasonable behavior in this case would be to send the line word by word.
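
To make the word-by-word idea concrete, here is a minimal sketch. It is 
not code from ispell.el: the helper name my-ispell-casechars-words is 
hypothetical, and it assumes that CASECHARS is a regexp matching a single 
character and that words are separated by spaces or tabs. It only shows 
which words of a line would be candidates for sending.

```elisp
;; Hypothetical helper for illustration only; not part of ispell.el.
(defun my-ispell-casechars-words (line casechars)
  "Return the words of LINE that consist only of CASECHARS characters.
CASECHARS is a regexp matching one character, as in the CASECHARS
element of an `ispell-dictionary-alist' entry."
  (let ((case-fold-search nil)          ; assumption: match case-sensitively
        ;; Anchor at both ends so the whole word must match.
        (word-re (concat "\\`\\(?:" casechars "\\)+\\'")))
    (delq nil
          (mapcar (lambda (word)
                    (and (string-match-p word-re word) word))
                  ;; Assumption: words are separated by spaces or tabs.
                  (split-string line "[ \t]+" t)))))

(my-ispell-casechars-words "kat doh" "[kcat]")
;; => ("kat")
```

With a filter like this, a mixed line would contribute only its 
CASECHARS words, and a line with no such words would contribute nothing.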

Here's how to reproduce this with aspell.
1. Starting from emacs -Q, evaluate this:
(setq ispell-program-name "aspell")
(defun ispell-set-my-dictionaries ()
  ;; Drop the stock "english" entry, then add one whose CASECHARS is
  ;; [kcat] and whose NOT-CASECHARS is [dogh].
  (setq ispell-dictionary-alist
        (delq (assoc "english" ispell-dictionary-alist)
              ispell-dictionary-alist))
  (add-to-list 'ispell-dictionary-alist
               '("english" "[kcat]" "[dogh]" "[']" nil ("-B") nil
                 iso-8859-1)))
;; Re-apply the custom entry whenever ispell rebuilds its parameters.
(advice-add 'ispell-set-spellchecker-params :after
            #'ispell-set-my-dictionaries)
2. Run ispell-change-dictionary and select "english".
3. Run ispell-buffer on a buffer containing this:
kat
doh
kat doh

The "kat" on the first line gets sent to aspell, since it consists only 
of CASECHARS. This is fine. The "doh" on the second line is ignored, 
since none of its characters are in CASECHARS. This is fine too. On the 
line with both words, however, not only "kat" gets sent but also "doh", 
and that is what we don't want to happen.
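
The three outcomes above are consistent with a simple per-line rule: a 
line is sent whole as soon as it contains any CASECHARS character. The 
sketch below is a toy model of that rule, not the actual ispell.el 
logic; the function name my-ispell-line-sent-p is hypothetical.

```elisp
;; Toy model of the observed behavior; not the real ispell.el code.
(defun my-ispell-line-sent-p (line casechars)
  "Return t if LINE has at least one character matching CASECHARS."
  (let ((case-fold-search nil))         ; assumption: match case-sensitively
    (and (string-match-p casechars line) t)))

(mapcar (lambda (line)
          (cons line (my-ispell-line-sent-p line "[kcat]")))
        '("kat" "doh" "kat doh"))
;; => (("kat" . t) ("doh") ("kat doh" . t))
```

Under this model "kat doh" is sent in full, "doh" included, which matches 
what the recipe shows.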

-- 
Best Regards,
Nikolay Kudryavtsev





Acknowledgement sent to Nikolay Kudryavtsev <nikolay.kudryavtsev@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs@HIDDEN. Full text available.
Report forwarded to bug-gnu-emacs@HIDDEN:
bug#23097; Package emacs. Full text available.
Last modified: Wed, 23 Mar 2016 18:30:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.