GNU bug report logs - #9800
Incomplete truncated file buffers from the /proc filesystem


Package: emacs;

Reported by: Juri Linkov <juri <at> jurta.org>

Date: Wed, 19 Oct 2011 23:03:01 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 9800 in the body.
You can then email your comments to 9800 AT debbugs.gnu.org in the normal way.




Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#9800; Package emacs. (Wed, 19 Oct 2011 23:03:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to Juri Linkov <juri <at> jurta.org>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Wed, 19 Oct 2011 23:03:01 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> jurta.org>
To: bug-gnu-emacs <at> gnu.org
Subject: Incomplete truncated file buffers from the /proc filesystem
Date: Thu, 20 Oct 2011 01:59:42 +0300
Large files from the /proc filesystem are visited incompletely:
their file buffers are truncated at position 65536.
One possible test case to reproduce this is to load enough libraries
with e.g. (imagemagick-register-types) and visit Emacs's maps file
in /proc/$PID/maps.

Andreas said in http://lists.gnu.org/archive/html/emacs-devel/2011-10/msg00782.html
that it's due to this code in `insert-file-contents':

	  /* The file size returned from stat may be zero, but data
	     may be readable nonetheless, for example when this is a
	     file in the /proc filesystem.  */
	  if (end_offset == 0)
	    end_offset = READ_BUF_SIZE;

How could this be fixed?  Should it keep reading while more data can be
read from the file?
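
The mismatch described above can be checked outside Emacs. The following is a minimal sketch, not Emacs code: `compare_sizes` is a hypothetical helper that records the size reported by `fstat` and, separately, the number of bytes that `read` delivers before it returns 0. The 4096-byte chunk size is an arbitrary assumption.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of Emacs): store the size reported by
   fstat in *STAT_SIZE and the number of bytes actually delivered by
   read in *READ_SIZE.  Return 0 on success, -1 on error.  */
int
compare_sizes (const char *path, off_t *stat_size, off_t *read_size)
{
  int fd = open (path, O_RDONLY);
  if (fd < 0)
    return -1;

  struct stat st;
  if (fstat (fd, &st) != 0)
    {
      close (fd);
      return -1;
    }
  *stat_size = st.st_size;

  char buf[4096];               /* Arbitrary chunk size.  */
  off_t total = 0;
  for (;;)
    {
      ssize_t n = read (fd, buf, sizeof buf);
      if (n > 0)
        total += n;
      else if (n == 0)
        break;                  /* Proper EOF.  */
      else if (errno != EINTR)
        {
          close (fd);
          return -1;
        }
    }
  close (fd);
  *read_size = total;
  return 0;
}
```

Called on an ordinary, unchanging file, the two values should agree. Called on a file such as /proc/$PID/maps, the report above suggests the first value would be 0 while the second is not.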




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#9800; Package emacs. (Thu, 20 Oct 2011 08:25:02 GMT) Full text and rfc822 format available.

Message #8 received at 9800 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Juri Linkov <juri <at> jurta.org>
Cc: 9800 <at> debbugs.gnu.org
Subject: Re: bug#9800: Incomplete truncated file buffers from the /proc
	filesystem
Date: Thu, 20 Oct 2011 10:22:43 +0200
> From: Juri Linkov <juri <at> jurta.org>
> Date: Thu, 20 Oct 2011 01:59:42 +0300
> 
> Large files from the /proc filesystem are visited incompletely:
> their file buffers are truncated at position 65536.
> One possible test case to reproduce this is to load enough libraries
> with e.g. (imagemagick-register-types) and visit Emacs's maps file
> in /proc/$PID/maps.
> 
> Andreas said in http://lists.gnu.org/archive/html/emacs-devel/2011-10/msg00782.html
> that it's due to this code in `insert-file-contents':
> 
> 	  /* The file size returned from stat may be zero, but data
> 	     may be readable nonetheless, for example when this is a
> 	     file in the /proc filesystem.  */
> 	  if (end_offset == 0)
> 	    end_offset = READ_BUF_SIZE;
> 
> How could this be fixed?  Should it keep reading while more data can be
> read from the file?

Does lseek work on these files?  If so, we could use something like

   lseek (fd, 0L, SEEK_END)

to find its size.  Or we could treat those files as non-regular, where
we set end_offset to TYPE_MAXIMUM (off_t) -- would that work with
these files?
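
The lseek idea can be tried with a small probe. This is only a sketch under stated assumptions: `size_by_lseek` is a hypothetical helper, not an Emacs function. It returns -1 when the descriptor cannot be positioned at its end, and it restores the original file position otherwise.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return the size of the file open on FD as
   reported by lseek, or -1 if FD cannot be positioned at its end.
   The original file position is restored on success.  */
off_t
size_by_lseek (int fd)
{
  off_t old = lseek (fd, 0, SEEK_CUR);
  if (old < 0)
    return -1;
  off_t end = lseek (fd, 0, SEEK_END);
  if (end < 0)
    return -1;
  if (lseek (fd, old, SEEK_SET) < 0)
    return -1;
  return end;
}
```

A caller would still have to treat a failure, or a result of 0, as "size unknown" and fall back to reading until EOF, so the probe can only ever serve as a hint.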




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#9800; Package emacs. (Thu, 20 Oct 2011 08:47:02 GMT) Full text and rfc822 format available.

Message #11 received at 9800 <at> debbugs.gnu.org (full text, mbox):

From: Andreas Schwab <schwab <at> linux-m68k.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Juri Linkov <juri <at> jurta.org>, 9800 <at> debbugs.gnu.org
Subject: Re: bug#9800: Incomplete truncated file buffers from the /proc
	filesystem
Date: Thu, 20 Oct 2011 10:44:57 +0200
Eli Zaretskii <eliz <at> gnu.org> writes:

> Does lseek work on these files?

No.  The contents are completely dynamic, generated on-the-fly when
reading.

> Or we could treat those files as non-regular,

That is the only option.

Andreas.

-- 
Andreas Schwab, schwab <at> linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#9800; Package emacs. (Mon, 24 Oct 2011 02:55:01 GMT) Full text and rfc822 format available.

Message #14 received at 9800 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: 9800 <at> debbugs.gnu.org
Subject: Re: bug#9800: Incomplete truncated file buffers from the /proc
	filesystem
Date: Sun, 23 Oct 2011 19:53:15 -0700
It strikes me that regular files can grow as you read them, too,
and that Emacs is not doing this properly.  That is, Emacs should
be fixed so that it continues to read from a growing regular file
until a proper EOF is reached (i.e., until read returns 0).

If Emacs is fixed in this way, it will read these /proc files
correctly too.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#9800; Package emacs. (Mon, 24 Oct 2011 21:52:01 GMT) Full text and rfc822 format available.

Message #17 received at 9800 <at> debbugs.gnu.org (full text, mbox):

From: Richard Stallman <rms <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 9800 <at> debbugs.gnu.org
Subject: Re: bug#9800: Incomplete truncated file buffers from the /proc
	filesystem
Date: Mon, 24 Oct 2011 17:50:14 -0400
    It strikes me that regular files can grow as you read them, too,
    and that Emacs is not doing this properly.  That is, Emacs should
    be fixed so that it continues to read from a growing regular file
    until a proper EOF is reached (i.e., until read returns 0).

I think there was a reason for doing it this way.  Perhaps so as to
allocate the space before reading the file.

-- 
Dr Richard Stallman
President, Free Software Foundation
51 Franklin St
Boston MA 02110
USA
www.fsf.org  www.gnu.org
Skype: No way! That's nonfree (freedom-denying) software.
  Use free telephony http://directory.fsf.org/category/tel/




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#9800; Package emacs. (Mon, 24 Oct 2011 22:05:02 GMT) Full text and rfc822 format available.

Message #20 received at 9800 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: rms <at> gnu.org
Cc: 9800 <at> debbugs.gnu.org
Subject: Re: bug#9800: Incomplete truncated file buffers from the /proc
	filesystem
Date: Mon, 24 Oct 2011 15:02:53 -0700
On 10/24/11 14:50, Richard Stallman wrote:
> I think there was a reason for doing it this way.  Perhaps so as to
> allocate the space before reading the file.

Yes, that sounds right.  And in the typical case where the file is not
growing, that allocates space efficiently.  If the file is growing, though,
it's OK to allocate more space after discovering that the initial
allocation was too small.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#9800; Package emacs. (Thu, 03 Nov 2011 20:35:02 GMT) Full text and rfc822 format available.

Message #23 received at 9800 <at> debbugs.gnu.org (full text, mbox):

From: Lars Magne Ingebrigtsen <larsi <at> gnus.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 9800 <at> debbugs.gnu.org
Subject: Re: bug#9800: Incomplete truncated file buffers from the /proc
	filesystem
Date: Thu, 03 Nov 2011 21:32:07 +0100
Paul Eggert <eggert <at> cs.ucla.edu> writes:

> It strikes me that regular files can grow as you read them, too,
> and that Emacs is not doing this properly.  That is, Emacs should
> be fixed so that it continues to read from a growing regular file
> until a proper EOF is reached (i.e., until read returns 0).

Sounds like a good idea, but remember to bail out some time before
reading the infinitely big files to the very end.  :-)

-- 
(domestic pets only, the antidote for overdose, milk.)
  bloggy blog http://lars.ingebrigtsen.no/




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#9800; Package emacs. (Fri, 04 Nov 2011 10:02:02 GMT) Full text and rfc822 format available.

Message #26 received at 9800 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> jurta.org>
To: Lars Magne Ingebrigtsen <larsi <at> gnus.org>
Cc: Paul Eggert <eggert <at> cs.ucla.edu>, 9800 <at> debbugs.gnu.org
Subject: Re: bug#9800: Incomplete truncated file buffers from the /proc
	filesystem
Date: Fri, 04 Nov 2011 11:36:01 +0200
>> It strikes me that regular files can grow as you read them, too,
>> and that Emacs is not doing this properly.  That is, Emacs should
>> be fixed so that it continues to read from a growing regular file
>> until a proper EOF is reached (i.e., until read returns 0).
>
> Sounds like a good idea, but remember to bail out some time before
> reading the infinitely big files to the very end.  :-)

Maybe limit the reading by the value of `large-file-warning-threshold'.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#9800; Package emacs. (Fri, 04 Nov 2011 10:57:02 GMT) Full text and rfc822 format available.

Message #29 received at 9800 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Juri Linkov <juri <at> jurta.org>
Cc: larsi <at> gnus.org, eggert <at> cs.ucla.edu, 9800 <at> debbugs.gnu.org
Subject: Re: bug#9800: Incomplete truncated file buffers from the /proc
	filesystem
Date: Fri, 04 Nov 2011 12:54:22 +0200
> From: Juri Linkov <juri <at> jurta.org>
> Date: Fri, 04 Nov 2011 11:36:01 +0200
> Cc: Paul Eggert <eggert <at> cs.ucla.edu>, 9800 <at> debbugs.gnu.org
> 
> >> It strikes me that regular files can grow as you read them, too,
> >> and that Emacs is not doing this properly.  That is, Emacs should
> >> be fixed so that it continues to read from a growing regular file
> >> until a proper EOF is reached (i.e., until read returns 0).
> >
> > Sounds like a good idea, but remember to bail out some time before
> > reading the infinitely big files to the very end.  :-)
> 
> Maybe limit the reading by the value of `large-file-warning-threshold'.

IMO, that value is ridiculously low for such use.  Maybe multiply it
by some large factor.
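
The "bail out" idea could look roughly like the sketch below. Everything in it is an assumption made for illustration: `read_up_to_limit` is a hypothetical helper, and its `limit` parameter stands in for whatever cap is chosen (for example, some multiple of `large-file-warning-threshold`). The data is discarded so that the example stays focused on the stopping condition.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read from FD until EOF, but give up once more
   than LIMIT bytes have been seen.  The data itself is discarded;
   only the byte count is stored in *TOTAL.  Return 0 at EOF, 1 if
   LIMIT was exceeded, -1 on a read error.  */
int
read_up_to_limit (int fd, size_t limit, size_t *total)
{
  char buf[4096];               /* Arbitrary chunk size.  */
  *total = 0;
  for (;;)
    {
      ssize_t n = read (fd, buf, sizeof buf);
      if (n == 0)
        return 0;               /* Proper EOF within the limit.  */
      if (n < 0)
        {
          if (errno == EINTR)
            continue;           /* Interrupted: retry.  */
          return -1;
        }
      *total += (size_t) n;
      if (*total > limit)
        return 1;               /* Caller decides: warn, ask, or abort.  */
    }
}
```

Returning a distinct status for "limit exceeded" leaves the policy to the caller, which could then ask the user for confirmation instead of silently truncating.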




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#9800; Package emacs. (Mon, 07 Feb 2022 00:11:02 GMT) Full text and rfc822 format available.

Message #32 received at 9800 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Juri Linkov <juri <at> jurta.org>
Cc: 9800 <at> debbugs.gnu.org
Subject: Re: bug#9800: Incomplete truncated file buffers from the /proc
 filesystem
Date: Mon, 07 Feb 2022 01:10:38 +0100
Juri Linkov <juri <at> jurta.org> writes:

> Large files from the /proc filesystem are visited incompletely:
> their file buffers are truncated at position 65536.

It seems like this issue has been exacerbated somewhat since this was
reported.

(with-temp-buffer
  (insert-file-contents "/proc/cpuinfo")
  (buffer-size))
=> 16384

(with-temp-buffer
  (call-process "cat" nil t nil "/proc/cpuinfo")
  (buffer-size))
=> 24626

But perhaps it's dependent on the block size.  (This is on
Debian/bookworm with Emacs 29.)

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#9800; Package emacs. (Mon, 07 Feb 2022 19:51:02 GMT) Full text and rfc822 format available.

Message #35 received at 9800 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> jurta.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 9800 <at> debbugs.gnu.org
Subject: Re: bug#9800: Incomplete truncated file buffers from the /proc
 filesystem
Date: Mon, 07 Feb 2022 21:41:01 +0200
>> Large files from the /proc filesystem are visited incompletely:
>> their file buffers are truncated at position 65536.
>
> It seems like this issue has been exacerbated somewhat since this was
> reported.
>
> (with-temp-buffer
>   (insert-file-contents "/proc/cpuinfo")
>   (buffer-size))
> => 16384
>
> (with-temp-buffer
>   (call-process "cat" nil t nil "/proc/cpuinfo")
>   (buffer-size))
> => 24626
>
> But perhaps it's dependent on the block size.  (This is on
> Debian/bookworm with Emacs 29.)

I confirm that the block size has since been reduced from 65536 to 16384,
so more buffers are truncated.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#9800; Package emacs. (Sun, 12 Feb 2023 07:40:02 GMT) Full text and rfc822 format available.

Message #38 received at 9800 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Andreas Schwab <schwab <at> linux-m68k.org>, Paul Eggert <eggert <at> cs.ucla.edu>
Cc: juri <at> jurta.org, 9800 <at> debbugs.gnu.org
Subject: Re: bug#9800: Incomplete truncated file buffers from the /proc
 filesystem
Date: Sun, 12 Feb 2023 09:38:33 +0200
> From: Andreas Schwab <schwab <at> linux-m68k.org>
> Date: Thu, 20 Oct 2011 10:44:57 +0200
> Cc: Juri Linkov <juri <at> jurta.org>, 9800 <at> debbugs.gnu.org
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > Does lseek work on these files?
> 
> No.  The contents are completely dynamic, generated on-the-fly when
> reading.
> 
> > Or we could treat those files as non-regular,
> 
> That is the only option.

Are all the files in "/proc" of this nature?  If so, we could consider
all of the files in that directory non-regular; if that is all that's
needed to visit /proc/foo files, insert-file-contents already has code
to deal with non-regular files.

Paul, do you see any downsides to such a heuristic?  We could make it a
user option if the heuristic could sometimes backfire, I guess.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#9800; Package emacs. (Sun, 12 Feb 2023 09:26:01 GMT) Full text and rfc822 format available.

Message #41 received at 9800 <at> debbugs.gnu.org (full text, mbox):

From: Ruijie Yu <ruijie <at> netyu.xyz>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: juri <at> jurta.org, Paul Eggert <eggert <at> cs.ucla.edu>,
 Andreas Schwab <schwab <at> linux-m68k.org>, 9800 <at> debbugs.gnu.org
Subject: Re: bug#9800: Incomplete truncated file buffers from the /proc
 filesystem
Date: Sun, 12 Feb 2023 17:24:51 +0800
There is /proc/config.gz, which does report a size, unlike the other files, which report a size of 0.

> On Feb 12, 2023, at 15:41, Eli Zaretskii <eliz <at> gnu.org> wrote:
> 
> 
>> 
>> From: Andreas Schwab <schwab <at> linux-m68k.org>
>> Date: Thu, 20 Oct 2011 10:44:57 +0200
>> Cc: Juri Linkov <juri <at> jurta.org>, 9800 <at> debbugs.gnu.org
>> 
>> Eli Zaretskii <eliz <at> gnu.org> writes:
>> 
>>> Does lseek work on these files?
>> 
>> No.  The contents are completely dynamic, generated on-the-fly when
>> reading.
>> 
>>> Or we could treat those files as non-regular,
>> 
>> That is the only option.
> 
> Are all the files in "/proc" of this nature?  If so, we could consider
> all of the files in that directory non-regular; if that is all that's
> needed to visit /proc/foo files, insert-file-contents already has code
> to deal with non-regular files.
> 
> Paul, do you see any downsides to such a heuristic?  We could make it a
> user option if the heuristic could sometimes backfire, I guess.

Best,


RY




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#9800; Package emacs. (Sun, 12 Feb 2023 10:40:02 GMT) Full text and rfc822 format available.

Message #44 received at 9800 <at> debbugs.gnu.org (full text, mbox):

From: Arsen Arsenović <arsen <at> aarsen.me>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: rms <at> gnu.org, 9800 <at> debbugs.gnu.org
Subject: Re: bug#9800: Incomplete truncated file buffers from the /proc
 filesystem
Date: Sun, 12 Feb 2023 11:21:29 +0100
Hi,

I was just debugging this before I found the bug report.  The diagnosis
is right: st_size is wrong for proc files (and, I'd argue, for regular
files sometimes).  So, I agree with Paul.

Paul Eggert <eggert <at> cs.ucla.edu> writes:

> On 10/24/11 14:50, Richard Stallman wrote:
>> I think there was a reason for doing it this way.  Perhaps so as to
>> allocate the space before reading the file.
>
> Yes, that sounds right.  And in the typical case where the file is not
> growing, that allocates space efficiently.  If the file is growing, though,
> it's OK to allocate more space after discovering that the initial
> allocation was too small.

Right.  The best possible approach is, likely:

  fstat (fd, &st);
  bufsz = max (READ_BUF_SIZE, st.st_size);
  buf = malloc (bufsz);

  int ret = 0, readsz = 0;
  do
    {
      /* Don't count a failed (EINTR) read.  */
      if (ret > 0)
        readsz += ret;
      if (readsz == bufsz && size isn't unreasonable)
        {
          /* value chosen arbitrarily.  */
          bufsz += min (16 * READ_BUF_SIZE, bufsz);
          buf = realloc (buf, bufsz);
        }
      errno = 0;
      ret = read (fd, buf + readsz, bufsz - readsz);
    }
  while (ret > 0 || errno == EINTR);

... or such.  This approach is robust and general, and I suspect it'd
even work for named pipes.

st_size isn't a good enough indicator of size, and it can go out of date
before the time of use.  However, it is no doubt a useful hint in the 99%
case.  Using st_size to figure out a base allocation size and extending
appropriately is a well-known strategy, and it would be appropriate to
do so here.
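
For reference, the strategy described above (use st_size as an allocation hint, then keep reading and growing until read returns 0) can be written as a self-contained function. This is a minimal sketch, not the patch that was installed in Emacs: `slurp_fd` and `INITIAL_SIZE` are hypothetical names, `INITIAL_SIZE` is an assumed stand-in for `READ_BUF_SIZE`, and doubling is an arbitrary growth policy. It also assumes st_size fits in a size_t.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed stand-in for READ_BUF_SIZE.  */
enum { INITIAL_SIZE = 16 * 1024 };

/* Hypothetical helper: read everything available from FD.  On success
   return a malloc'd buffer and store its length in *LEN; on failure
   return NULL.  */
char *
slurp_fd (int fd, size_t *len)
{
  struct stat st;
  size_t bufsz = INITIAL_SIZE;
  /* Use st_size only as a hint.  One spare byte lets a file of exactly
     st_size bytes reach EOF without a reallocation.  */
  if (fstat (fd, &st) == 0 && st.st_size > 0
      && (size_t) st.st_size >= bufsz)
    bufsz = (size_t) st.st_size + 1;

  char *buf = malloc (bufsz);
  if (!buf)
    return NULL;

  size_t used = 0;
  for (;;)
    {
      if (used == bufsz)
        {
          /* The hint was too small: double the buffer.  */
          size_t newsz = bufsz * 2;
          char *p = newsz > bufsz ? realloc (buf, newsz) : NULL;
          if (!p)
            {
              free (buf);
              return NULL;
            }
          buf = p;
          bufsz = newsz;
        }
      ssize_t n = read (fd, buf + used, bufsz - used);
      if (n > 0)
        used += (size_t) n;
      else if (n == 0)
        break;                  /* Proper EOF.  */
      else if (errno != EINTR)
        {
          free (buf);
          return NULL;
        }
    }
  *len = used;
  return buf;
}
```

The sketch has no upper bound on the amount read and blocks for as long as the other end of a pipe stays open, so it does nothing about infinitely large or very slow inputs.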

Thanks in advance, have a great day.
-- 
Arsen Arsenović

Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>:
You have taken responsibility. (Mon, 13 Feb 2023 20:48:01 GMT) Full text and rfc822 format available.

Notification sent to Juri Linkov <juri <at> jurta.org>:
bug acknowledged by developer. (Mon, 13 Feb 2023 20:48:02 GMT) Full text and rfc822 format available.

Message #49 received at 9800-done <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Arsen Arsenović <arsen <at> aarsen.me>
Cc: rms <at> gnu.org, 9800-done <at> debbugs.gnu.org
Subject: Re: bug#9800: Incomplete truncated file buffers from the /proc
 filesystem
Date: Mon, 13 Feb 2023 12:47:33 -0800
On 2023-02-12 02:21, Arsen Arsenović wrote:
> ... or such.  This approach is robust and general, and I suspect it'd
> even work for named pipes.

Although the approach is indeed robust and general, and will work with 
named pipes in some cases, it still has a problem if the other side of 
the named pipe outputs data very slowly: Emacs will still seem to hang 
until you type C-g.

That being said, the approach is an improvement and it fixes the 
original bug report so I installed the attached and am boldly closing 
the bug report (we can reopen it if I'm wrong). The last patch in the 
attached series is the actual fix: the others are minor cleanups of this 
messy area, which I discovered while looking into the fix.

This patch does not address the above-mentioned issue of named pipes, nor 
the issue of inserting very large files: the code should behave roughly 
the same as before in those two areas. These issues can be raised in 
separate bug reports if needed.

PS. I was surprised to see that Emacs master currently has several test 
case failures on GNU/Linux (specifically the latest Fedora and Ubuntu 
releases). I hope these are known and that people are working on them.

1 files did not finish:
  lisp/server-tests.log
4 files contained unexpected results:
  src/lread-tests.log
  lisp/international/mule-tests.log
  lisp/emacs-lisp/map-tests.log
  lisp/emacs-lisp/bytecomp-tests.log
[0001-Improve-insert-file-contents-checking.patch (text/x-patch, attachment)]
[0002-Improve-insert-file-contents-on-non-regular-files.patch (text/x-patch, attachment)]
[0003-Don-t-scan-text-twice-to-guess-coding-system.patch (text/x-patch, attachment)]
[0004-Fix-src-fileio.c-comment.patch (text/x-patch, attachment)]
[0005-Fix-insert-file-contents-on-proc-files.patch (text/x-patch, attachment)]

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 14 Mar 2023 11:24:04 GMT) Full text and rfc822 format available.




GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.