GNU bug report logs - #22815
[module] allow different encodings for copy_string_contents and make_string


Package: emacs;

Reported by: Atsuo Ohki <ohki <at> gssm.otsuka.tsukuba.ac.jp>

Date: Fri, 26 Feb 2016 07:31:02 UTC

Severity: wishlist

Tags: wontfix

Found in version 25.0.91

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 22815 in the body.
You can then email your comments to 22815 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#22815; Package emacs. (Fri, 26 Feb 2016 07:31:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Atsuo Ohki <ohki <at> gssm.otsuka.tsukuba.ac.jp>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Fri, 26 Feb 2016 07:31:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Atsuo Ohki <ohki <at> gssm.otsuka.tsukuba.ac.jp>
To: bug-gnu-emacs <at> gnu.org
Subject: 25.0.91; emacs-module.*
Date: Fri, 26 Feb 2016 16:28:09 +0900
 This is not a bug report, but a feature enhancement for emacs-module.*

 The emacs-module API has two functions related to Lisp strings,
 `copy_string_contents' and `make_string'.
 Both assume UTF-8 encoded strings; the API would be more flexible
 if the caller could specify the encoding.

 I made the following patch to accept an encoding argument.


--- src/emacs-module.c-ORIG	2016-02-05 03:15:31.000000000 +0900
+++ src/emacs-module.c	2016-02-26 10:20:31.831515000 +0900
@@ -502,17 +502,22 @@
 
 static bool
 module_copy_string_contents (emacs_env *env, emacs_value value, char *buffer,
-			     ptrdiff_t *length)
+			     ptrdiff_t *length, emacs_value coding)
 {
   MODULE_FUNCTION_BEGIN (false);
   Lisp_Object lisp_str = value_to_lisp (value);
+  Lisp_Object lisp_coding;
   if (! STRINGP (lisp_str))
     {
       module_wrong_type (env, Qstringp, lisp_str);
       return false;
     }
 
-  Lisp_Object lisp_str_utf8 = ENCODE_UTF_8 (lisp_str);
+  if (coding == module_nil) lisp_coding = Qutf_8;
+  else if ((lisp_coding = value_to_lisp(coding)) == Qnil) lisp_coding = Qutf_8;
+
+  Lisp_Object lisp_str_utf8 =
+      code_convert_string_norecord (lisp_str, lisp_coding, false);
   ptrdiff_t raw_size = SBYTES (lisp_str_utf8);
   if (raw_size == PTRDIFF_MAX)
     {
@@ -545,16 +550,23 @@
 }
 
 static emacs_value
-module_make_string (emacs_env *env, const char *str, ptrdiff_t length)
+module_make_string (emacs_env *env, const char *str,	
+		    ptrdiff_t length, emacs_value coding)
 {
   MODULE_FUNCTION_BEGIN (module_nil);
+  Lisp_Object lisp_coding;
   if (length > STRING_BYTES_BOUND)
     {
       module_non_local_exit_signal_1 (env, Qoverflow_error, Qnil);
       return module_nil;
     }
   Lisp_Object lstr = make_unibyte_string (str, length);
-  return lisp_to_value (code_convert_string_norecord (lstr, Qutf_8, false));
+
+  if (coding == module_nil) lisp_coding = Qutf_8;
+  else if ((lisp_coding = value_to_lisp(coding)) == Qnil) lisp_coding = Qutf_8;
+
+  return lisp_to_value (
+	code_convert_string_norecord (lstr, lisp_coding, false));
 }
 
 static emacs_value
--- src/emacs-module.h-ORIG	2016-02-05 03:15:31.000000000 +0900
+++ src/emacs-module.h	2016-02-25 14:25:04.785587000 +0900
@@ -151,7 +151,7 @@
   emacs_value (*make_float) (emacs_env *env, double value);
 
   /* Copy the content of the Lisp string VALUE to BUFFER as an utf8
-     null-terminated string.
+     (or specified coding) null-terminated string.
 
      SIZE must point to the total size of the buffer.  If BUFFER is
      NULL or if SIZE is not big enough, write the required buffer size
@@ -165,11 +165,13 @@
   bool (*copy_string_contents) (emacs_env *env,
                                 emacs_value value,
                                 char *buffer,
-                                ptrdiff_t *size_inout);
+                                ptrdiff_t *size_inout,
+                                emacs_value coding);
 
-  /* Create a Lisp string from a utf8 encoded string.  */
+  /* Create a Lisp string from a utf8 (or specified) encoded string.  */
   emacs_value (*make_string) (emacs_env *env,
-			      const char *contents, ptrdiff_t length);
+			      const char *contents,
+			      ptrdiff_t length, emacs_value coding);
 
   /* Embedded pointer type.  */
   emacs_value (*make_user_ptr) (emacs_env *env,




#===========================================================================
In GNU Emacs 25.0.91.1 (x86_64-ibm-freebsd10.2, X toolkit, Xaw3d scroll bars)
 of 2016-02-26 built on smr00
Configured using:
 'configure x86_64-ibm-freebsd10.2
 --srcdir=/usr/src/local/GNU/emacs/emacs-25.0.91 --with-x
 --x-includes=/usr/local/include --x-libraries=/usr/local/lib
 --with-x-toolkit=lucid --without-xim --with-modules --without-pop
 --without-xpm --without-jpeg --without-tiff --without-gif --without-png
 --without-rsvg --without-imagemagick --without-gpm --without-dbus
 --without-gconf --without-gsettings --without-selinux --without-gnutls
 LDFLAGS=-L/usr/local/lib'

Configured features:
XAW3D SOUND NOTIFY ACL LIBXML2 FREETYPE XFT ZLIB TOOLKIT_SCROLL_BARS
LUCID X11 MODULES

Important settings:
  locale-coding-system: nil

Major mode: Fundamental

Minor modes in effect:
  tooltip-mode: t
  global-eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent messages:
Loading /usr/local/share/emacs/site-lisp/site-start-25.x.el (source)...
Loading /usr/local/share/emacs/25.0.91/site-lisp/canna-leim.el (source)...
byte-code: Canna is not built into this Emacs
No docstring slot for setup-japanese-environment-internal

Load-path shadows:
None found.

Features:
(shadow sort mail-extr emacsbug message idna dired format-spec rfc822
mml mml-sec epg epg-config gnus-util mm-decode mm-bodies mm-encode
mail-parse rfc2231 mailabbrev gmm-utils mailheader sendmail rfc2047
rfc2045 ietf-drums mm-util help-fns help-mode easymenu cl-loaddefs pcase
cl-lib mail-prsvr mail-utils time-date japan-util disp-table mule-util
tooltip eldoc electric uniquify ediff-hook vc-hooks lisp-float-type
mwheel x-win term/common-win x-dnd tool-bar dnd fontset image regexp-opt
fringe tabulated-list newcomment elisp-mode lisp-mode prog-mode register
page menu-bar rfn-eshadow timer select scroll-bar mouse jit-lock
font-lock syntax facemenu font-core frame cl-generic cham georgian
utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean
japanese eucjp-ms cp51932 hebrew greek romanian slovak czech european
ethiopic indian cyrillic chinese charscript case-table epa-hook
jka-cmpr-hook help simple abbrev minibuffer cl-preloaded nadvice
loaddefs button faces cus-face macroexp files text-properties overlay
sha1 md5 base64 format env code-pages mule custom widget
hashtable-print-readable backquote kqueue dynamic-setting
font-render-setting x-toolkit x multi-tty make-network-process emacs)

Memory information:
((conses 16 87847 5715)
 (symbols 48 19548 0)
 (miscs 40 44 85)
 (strings 32 14122 4881)
 (string-bytes 1 406710)
 (vectors 16 9853)
 (vector-slots 8 453785 38367)
 (floats 8 162 258)
 (intervals 56 200 18)
 (buffers 976 11))




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22815; Package emacs. (Fri, 26 Feb 2016 09:12:02 GMT) Full text and rfc822 format available.

Message #8 received at 22815 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Atsuo Ohki <ohki <at> gssm.otsuka.tsukuba.ac.jp>
Cc: 22815 <at> debbugs.gnu.org
Subject: Re: bug#22815: 25.0.91; emacs-module.*
Date: Fri, 26 Feb 2016 11:10:57 +0200
> From: Atsuo Ohki <ohki <at> gssm.otsuka.tsukuba.ac.jp>
> Date: Fri, 26 Feb 2016 16:28:09 +0900
> 
>  This is not a bug report, but a feature enhancement for emacs-module.*
> 
>  The emacs-module API has two functions related to Lisp strings,
>  `copy_string_contents' and `make_string'.
>  Both assume UTF-8 encoded strings; the API would be more flexible
>  if the caller could specify the encoding.

Thanks.

But why cannot the module convert the string to UTF-8 before passing
it to Emacs?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22815; Package emacs. (Fri, 26 Feb 2016 09:17:02 GMT) Full text and rfc822 format available.

Message #11 received at 22815 <at> debbugs.gnu.org (full text, mbox):

From: ohki <at> gssm.otsuka.tsukuba.ac.jp
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Atsuo Ohki <ohki <at> gssm.otsuka.tsukuba.ac.jp>, 22815 <at> debbugs.gnu.org
Subject: Re: bug#22815: 25.0.91; emacs-module.*
Date: Fri, 26 Feb 2016 18:16:05 +0900
Eli Zaretskii writes:
> > From: Atsuo Ohki <ohki <at> gssm.otsuka.tsukuba.ac.jp>
> > Date: Fri, 26 Feb 2016 16:28:09 +0900
> > 
> >  This is not a bug report, but a feature enhancement for emacs-module.*
> > 
> >  The emacs-module API has two functions related to Lisp strings,
> >  `copy_string_contents' and `make_string'.
> >  Both assume UTF-8 encoded strings; the API would be more flexible
> >  if the caller could specify the encoding.
> 
> Thanks.
> 
> But why cannot the module convert the string to UTF-8 before passing
> it to Emacs?
> 

 Because an external module could use a coding system other than UTF-8.
 I have used the emacs-module capability to plug in a CANNA server interface
 (the CANNA server is a rather outdated kana-kanji converter for Japanese),
 and the CANNA server uses euc-jp encoding for communication.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22815; Package emacs. (Fri, 26 Feb 2016 09:47:01 GMT) Full text and rfc822 format available.

Message #14 received at 22815 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: ohki <at> gssm.otsuka.tsukuba.ac.jp
Cc: 22815 <at> debbugs.gnu.org
Subject: Re: bug#22815: 25.0.91; emacs-module.*
Date: Fri, 26 Feb 2016 11:46:38 +0200
> From: ohki <at> gssm.otsuka.tsukuba.ac.jp
> Cc: Atsuo Ohki <ohki <at> gssm.otsuka.tsukuba.ac.jp>, 22815 <at> debbugs.gnu.org
> Date: Fri, 26 Feb 2016 18:16:05 +0900
> 
> > But why cannot the module convert the string to UTF-8 before passing
> > it to Emacs?
> 
>  Because an external module could use a coding system other than UTF-8.
>  I have used the emacs-module capability to plug in a CANNA server interface
>  (the CANNA server is a rather outdated kana-kanji converter for Japanese),
>  and the CANNA server uses euc-jp encoding for communication.

I was asking why couldn't the plug-in do the conversion, e.g., by
using libiconv?  Emacs is not the only piece of software that knows
how to convert from one encoding to another.
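
A minimal sketch of the libiconv route mentioned above, assuming a POSIX
iconv that knows the name "EUC-JP". The helper name eucjp_to_utf8 and the
4x output-buffer bound are illustrative assumptions, not part of Emacs or
of the reporter's plugin. The resulting UTF-8 bytes are what a module
would then hand to the existing make_string entry point.

```c
#include <iconv.h>
#include <stdlib.h>

/* Hypothetical helper (not part of emacs-module.h): convert LEN bytes
   of EUC-JP text at SRC into a freshly malloc'ed, NUL-terminated UTF-8
   string.  On success, store the UTF-8 byte length (excluding the NUL)
   in *OUT_LEN and return the buffer; the caller frees it.  Return NULL
   on any failure (unknown encoding name, invalid or truncated input,
   out of memory).  */
static char *
eucjp_to_utf8 (const char *src, size_t len, size_t *out_len)
{
  iconv_t cd = iconv_open ("UTF-8", "EUC-JP");
  if (cd == (iconv_t) -1)
    return NULL;

  /* Assumed generous bound: 4 output bytes per input byte, plus NUL.  */
  size_t cap = len * 4 + 1;
  char *out = malloc (cap);
  if (out == NULL)
    {
      iconv_close (cd);
      return NULL;
    }

  /* iconv takes char ** for the input but does not modify the bytes.  */
  char *in_ptr = (char *) src;
  size_t in_left = len;
  char *out_ptr = out;
  size_t out_left = cap - 1;

  size_t rc = iconv (cd, &in_ptr, &in_left, &out_ptr, &out_left);
  if (rc != (size_t) -1)
    /* Flush any pending shift state; harmless for UTF-8 output.  */
    rc = iconv (cd, NULL, NULL, &out_ptr, &out_left);
  iconv_close (cd);

  if (rc == (size_t) -1)
    {
      free (out);
      return NULL;
    }

  *out_ptr = '\0';
  *out_len = (size_t) (out_ptr - out);
  return out;
}

/* Inside a module function, the result could then be passed on as:
     emacs_value s = env->make_string (env, utf8, (ptrdiff_t) n);
   using the unpatched three-argument make_string.  */
```

With this approach an unsupported encoding name shows up as a NULL return
in the module's own code rather than as a Lisp error signalled from
inside the module API call.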




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22815; Package emacs. (Fri, 26 Feb 2016 10:11:02 GMT) Full text and rfc822 format available.

Message #17 received at 22815 <at> debbugs.gnu.org (full text, mbox):

From: ohki <at> gssm.otsuka.tsukuba.ac.jp
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: ohki <at> gssm.otsuka.tsukuba.ac.jp, 22815 <at> debbugs.gnu.org
Subject: Re: bug#22815: 25.0.91; emacs-module.*
Date: Fri, 26 Feb 2016 19:09:57 +0900
Eli Zaretskii writes:
> > From: ohki <at> gssm.otsuka.tsukuba.ac.jp
> > Cc: Atsuo Ohki <ohki <at> gssm.otsuka.tsukuba.ac.jp>, 22815 <at> debbugs.gnu.org
> > Date: Fri, 26 Feb 2016 18:16:05 +0900
> > 
> > > But why cannot the module convert the string to UTF-8 before passing
> > > it to Emacs?
> > 
> >  Because an external module could use a coding system other than UTF-8.
> >  I have used the emacs-module capability to plug in a CANNA server interface
> >  (the CANNA server is a rather outdated kana-kanji converter for Japanese),
> >  and the CANNA server uses euc-jp encoding for communication.
> 
> I was asking why couldn't the plug-in do the conversion, e.g., by
> using libiconv?  Emacs is not the only piece of software that knows
> how to convert from one encoding to another.

 I considered using libiconv once, but Emacs has the conversion
 capability, so why not use it?  As long as a plugin runs as part of
 the Emacs process, using Emacs itself for conversion is simpler than
 using a separate conversion library.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22815; Package emacs. (Fri, 26 Feb 2016 18:27:02 GMT) Full text and rfc822 format available.

Message #20 received at 22815 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: ohki <at> gssm.otsuka.tsukuba.ac.jp
Cc: 22815 <at> debbugs.gnu.org
Subject: Re: bug#22815: 25.0.91; emacs-module.*
Date: Fri, 26 Feb 2016 20:26:07 +0200
> From: ohki <at> gssm.otsuka.tsukuba.ac.jp
> Cc: ohki <at> gssm.otsuka.tsukuba.ac.jp, 22815 <at> debbugs.gnu.org
> Date: Fri, 26 Feb 2016 19:09:57 +0900
> 
> > I was asking why couldn't the plug-in do the conversion, e.g., by
> > using libiconv?  Emacs is not the only piece of software that knows
> > how to convert from one encoding to another.
> 
>  I considered using libiconv once, but Emacs has the conversion
>  capability, so why not use it.

Because it can signal an error, if the encoding you pass is not a
valid coding-system that Emacs recognizes?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22815; Package emacs. (Fri, 26 Feb 2016 23:22:02 GMT) Full text and rfc822 format available.

Message #23 received at 22815 <at> debbugs.gnu.org (full text, mbox):

From: ohki <at> gssm.otsuka.tsukuba.ac.jp
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: ohki <at> gssm.otsuka.tsukuba.ac.jp, 22815 <at> debbugs.gnu.org
Subject: Re: bug#22815: 25.0.91; emacs-module.*
Date: Sat, 27 Feb 2016 08:21:39 +0900
Eli Zaretskii writes:
> > From: ohki <at> gssm.otsuka.tsukuba.ac.jp
> > Cc: ohki <at> gssm.otsuka.tsukuba.ac.jp, 22815 <at> debbugs.gnu.org
> > Date: Fri, 26 Feb 2016 19:09:57 +0900
> > 
> > > I was asking why couldn't the plug-in do the conversion, e.g., by
> > > using libiconv?  Emacs is not the only piece of software that knows
> > > how to convert from one encoding to another.
> > 
> >  I considered using libiconv once, but Emacs has the conversion
> >  capability, so why not use it.
> 
> Because it can signal an error, if the encoding you pass is not a
> valid coding-system that Emacs recognizes?
> 

 Yes, it does!
 In the course of developing my plugin,
 I encountered the `Invalid coding system' message, and Emacs kept working
 (no crash, no hang).




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22815; Package emacs. (Sat, 27 Feb 2016 08:21:01 GMT) Full text and rfc822 format available.

Message #26 received at 22815 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: ohki <at> gssm.otsuka.tsukuba.ac.jp, Daniel Colascione <dancol <at> dancol.org>,
 John Wiegley <johnw <at> gnu.org>
Cc: 22815 <at> debbugs.gnu.org
Subject: Re: bug#22815: 25.0.91; emacs-module.*
Date: Sat, 27 Feb 2016 10:19:50 +0200
> From: ohki <at> gssm.otsuka.tsukuba.ac.jp
> Cc: ohki <at> gssm.otsuka.tsukuba.ac.jp, 22815 <at> debbugs.gnu.org
> Date: Sat, 27 Feb 2016 08:21:39 +0900
> 
> Eli Zaretskii writes:
> > > From: ohki <at> gssm.otsuka.tsukuba.ac.jp
> > > Cc: ohki <at> gssm.otsuka.tsukuba.ac.jp, 22815 <at> debbugs.gnu.org
> > > Date: Fri, 26 Feb 2016 19:09:57 +0900
> > > 
> > > > I was asking why couldn't the plug-in do the conversion, e.g., by
> > > > using libiconv?  Emacs is not the only piece of software that knows
> > > > how to convert from one encoding to another.
> > > 
> > >  I considered using libiconv once, but Emacs has the conversion
> > >  capability, so why not use it.
> > 
> > Because it can signal an error, if the encoding you pass is not a
> > valid coding-system that Emacs recognizes?
> 
>  Yes, it does!
>  In the course of developing my plugin,
>  I encountered the `Invalid coding system' message, and Emacs kept working
>  (no crash, no hang).

It's all too easy to get that, since Emacs coding-systems have names
that are rarely used elsewhere.  And using libiconv is easy enough.

So I'm uneasy about this.  What do others think?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#22815; Package emacs. (Tue, 29 Mar 2016 10:06:02 GMT) Full text and rfc822 format available.

Message #29 received at 22815 <at> debbugs.gnu.org (full text, mbox):

From: Philipp Stephani <p.stephani2 <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>, ohki <at> gssm.otsuka.tsukuba.ac.jp, 
 Daniel Colascione <dancol <at> dancol.org>, John Wiegley <johnw <at> gnu.org>
Cc: 22815 <at> debbugs.gnu.org
Subject: Re: bug#22815: 25.0.91; emacs-module.*
Date: Tue, 29 Mar 2016 10:05:39 +0000
Eli Zaretskii <eliz <at> gnu.org> wrote on Sat, 27 Feb 2016 at 09:21:

> > From: ohki <at> gssm.otsuka.tsukuba.ac.jp
> > Cc: ohki <at> gssm.otsuka.tsukuba.ac.jp, 22815 <at> debbugs.gnu.org
> > Date: Sat, 27 Feb 2016 08:21:39 +0900
> >
> > Eli Zaretskii writes:
> > > > From: ohki <at> gssm.otsuka.tsukuba.ac.jp
> > > > Cc: ohki <at> gssm.otsuka.tsukuba.ac.jp, 22815 <at> debbugs.gnu.org
> > > > Date: Fri, 26 Feb 2016 19:09:57 +0900
> > > >
> > > > > I was asking why couldn't the plug-in do the conversion, e.g., by
> > > > > using libiconv?  Emacs is not the only piece of software that knows
> > > > > how to convert from one encoding to another.
> > > >
> > > >  I considered using libiconv once, but Emacs has the conversion
> > > >  capability, so why not use it.
> > >
> > > Because it can signal an error, if the encoding you pass is not a
> > > valid coding-system that Emacs recognizes?
> >
> >  Yes, it does!
> >  In the course of developing my plugin,
> >  I encountered the `Invalid coding system' message, and Emacs kept working
> >  (no crash, no hang).
>
> It's all too easy to get that, since Emacs coding-systems have names
> that are rarely used elsewhere.  And using libiconv is easy enough.
>
> So I'm uneasy about this.  What do others think?
>
>
>
>
I agree; this adds complexity without significant advantages.
I'd recommend adding a wrapper for make-unibyte-string instead, so that
users can choose to use Emacs functions for decoding and encoding strings.

Changed bug title to '[module] allow different encodings for copy_string_contents and make_string' from '25.0.91; emacs-module.*' Request was from npostavs <at> users.sourceforge.net to control <at> debbugs.gnu.org. (Sun, 02 Apr 2017 05:46:01 GMT) Full text and rfc822 format available.

Added tag(s) wontfix. Request was from npostavs <at> users.sourceforge.net to control <at> debbugs.gnu.org. (Sun, 02 Apr 2017 05:46:01 GMT) Full text and rfc822 format available.

bug closed, send any further explanations to 22815 <at> debbugs.gnu.org and Atsuo Ohki <ohki <at> gssm.otsuka.tsukuba.ac.jp> Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Mon, 29 Jul 2019 18:40:02 GMT) Full text and rfc822 format available.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 27 Aug 2019 11:24:04 GMT) Full text and rfc822 format available.

This bug report was last modified 4 years and 215 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.