GNU bug report logs - #9800
Incomplete truncated file buffers from the /proc filesystem
Reported by: Juri Linkov <juri <at> jurta.org>
Date: Wed, 19 Oct 2011 23:03:01 UTC
Severity: normal
Done: Paul Eggert <eggert <at> cs.ucla.edu>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 9800 in the body.
You can then email your comments to 9800 AT debbugs.gnu.org in the normal way.
Report forwarded to bug-gnu-emacs <at> gnu.org: bug#9800; Package emacs. (Wed, 19 Oct 2011 23:03:01 GMT)
Acknowledgement sent to Juri Linkov <juri <at> jurta.org>: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Wed, 19 Oct 2011 23:03:01 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Large files from the /proc filesystem are visited incompletely,
their file buffers are truncated at the position 65536.
One possible test case to reproduce this is to load enough libraries
with e.g. (imagemagick-register-types) and visit Emacs's maps file
in /proc/$PID/maps.
Andreas said in http://lists.gnu.org/archive/html/emacs-devel/2011-10/msg00782.html
that it's due to this code in `insert-file-contents':
/* The file size returned from stat may be zero, but data
may be readable nonetheless, for example when this is a
file in the /proc filesystem. */
if (end_offset == 0)
end_offset = READ_BUF_SIZE;
How could this be fixed? Should it keep reading while more data can be
read from the file?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#9800; Package emacs. (Thu, 20 Oct 2011 08:25:02 GMT)
Message #8 received at 9800 <at> debbugs.gnu.org (full text, mbox):
> From: Juri Linkov <juri <at> jurta.org>
> Date: Thu, 20 Oct 2011 01:59:42 +0300
>
> Large files from the /proc filesystem are visited incompletely,
> their file buffers are truncated at the position 65536.
> One possible test case to reproduce this is to load enough libraries
> with e.g. (imagemagick-register-types) and visit Emacs's maps file
> in /proc/$PID/maps.
>
> Andreas said in http://lists.gnu.org/archive/html/emacs-devel/2011-10/msg00782.html
> that it's due to this code in `insert-file-contents':
>
> /* The file size returned from stat may be zero, but data
> may be readable nonetheless, for example when this is a
> file in the /proc filesystem. */
> if (end_offset == 0)
> end_offset = READ_BUF_SIZE;
>
> How could this be fixed? Should it keep reading while more data can be
> read from the file?
Does lseek work on these files? If so, we could use something like
lseek (fd, 0L, SEEK_END)
to find its size. Or we could treat those files as non-regular, where
we set end_offset to TYPE_MAXIMUM (off_t) -- would that work with
these files?
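One way to check this empirically is a small probe that reports both the size from fstat and the offset that lseek returns for SEEK_END. The following is only a sketch, not Emacs code; probe_size is a hypothetical helper name.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not Emacs code: store in *STAT_SIZE the size
   that fstat claims for PATH, and in *SEEK_SIZE the offset returned
   by lseek (fd, 0, SEEK_END) (-1 if seeking fails).
   Return 0 on success, -1 if PATH cannot be opened or statted.  */
static int
probe_size (const char *path, off_t *stat_size, off_t *seek_size)
{
  assert (path && stat_size && seek_size);
  int fd = open (path, O_RDONLY);
  if (fd < 0)
    return -1;
  struct stat st;
  if (fstat (fd, &st) != 0)
    {
      close (fd);
      return -1;
    }
  *stat_size = st.st_size;
  *seek_size = lseek (fd, 0, SEEK_END);
  close (fd);
  return 0;
}
```

Calling it on a path under /proc and on an ordinary file would show whether the two values agree; what it reports for /proc depends on the kernel and on the particular file.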
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#9800; Package emacs. (Thu, 20 Oct 2011 08:47:02 GMT)
Message #11 received at 9800 <at> debbugs.gnu.org (full text, mbox):
Eli Zaretskii <eliz <at> gnu.org> writes:
> Does lseek work on these files?
No. The contents are completely dynamic, generated on-the-fly when
reading.
> Or we could treat those files as non-regular,
That is the only option.
Andreas.
--
Andreas Schwab, schwab <at> linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#9800; Package emacs. (Mon, 24 Oct 2011 02:55:01 GMT)
Message #14 received at 9800 <at> debbugs.gnu.org (full text, mbox):
It strikes me that regular files can grow as you read them, too,
and that Emacs is not doing this properly. That is, Emacs should
be fixed so that it continues to read from a growing regular file
until a proper EOF is reached (i.e., until read returns 0).
If Emacs is fixed in this way, it will read these /proc files
correctly too.
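The idea of reading until a proper EOF can be sketched in a few lines of C. This is a minimal illustration and not the Emacs implementation; count_until_eof is a hypothetical helper that only counts bytes, ignoring whatever size stat reported.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not Emacs code: count the bytes available
   from FD by reading until read returns 0 (end of file).
   Return the byte count, or -1 on a read error other than EINTR.  */
static long long
count_until_eof (int fd)
{
  assert (fd >= 0);
  char chunk[4096];
  long long total = 0;
  for (;;)
    {
      ssize_t n = read (fd, chunk, sizeof chunk);
      if (n > 0)
        total += n;             /* got data; keep going */
      else if (n == 0)
        return total;           /* genuine end of file */
      else if (errno != EINTR)
        return -1;              /* real error */
    }
}
```

Because the loop stops only when read returns 0, it does not matter whether the file grew after it was opened or whether its reported size was 0.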
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#9800; Package emacs. (Mon, 24 Oct 2011 21:52:01 GMT)
Message #17 received at 9800 <at> debbugs.gnu.org (full text, mbox):
> It strikes me that regular files can grow as you read them, too,
> and that Emacs is not doing this properly. That is, Emacs should
> be fixed so that it continues to read from a growing regular file
> until a proper EOF is reached (i.e., until read returns 0).
I think there was a reason for doing it this way. Perhaps so as to
allocate the space before reading the file.
--
Dr Richard Stallman
President, Free Software Foundation
51 Franklin St
Boston MA 02110
USA
www.fsf.org www.gnu.org
Skype: No way! That's nonfree (freedom-denying) software.
Use free telephony http://directory.fsf.org/category/tel/
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#9800; Package emacs. (Mon, 24 Oct 2011 22:05:02 GMT)
Message #20 received at 9800 <at> debbugs.gnu.org (full text, mbox):
On 10/24/11 14:50, Richard Stallman wrote:
> I think there was a reason for doing it this way. Perhaps so as to
> allocate the space before reading the file.
Yes, that sounds right. And in the typical case where the file is not
growing, that allocates space efficiently. If the file is growing, though,
it's OK to allocate more space after discovering that the initial
allocation was too small.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#9800; Package emacs. (Thu, 03 Nov 2011 20:35:02 GMT)
Message #23 received at 9800 <at> debbugs.gnu.org (full text, mbox):
Paul Eggert <eggert <at> cs.ucla.edu> writes:
> It strikes me that regular files can grow as you read them, too,
> and that Emacs is not doing this properly. That is, Emacs should
> be fixed so that it continues to read from a growing regular file
> until a proper EOF is reached (i.e., until read returns 0).
Sounds like a good idea, but remember to bail out some time before
reading the infinitely big files to the very end. :-)
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog http://lars.ingebrigtsen.no/
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#9800; Package emacs. (Fri, 04 Nov 2011 10:02:02 GMT)
Message #26 received at 9800 <at> debbugs.gnu.org (full text, mbox):
>> It strikes me that regular files can grow as you read them, too,
>> and that Emacs is not doing this properly. That is, Emacs should
>> be fixed so that it continues to read from a growing regular file
>> until a proper EOF is reached (i.e., until read returns 0).
>
> Sounds like a good idea, but remember to bail out some time before
> reading the infinitely big files to the very end. :-)
Maybe limit the reading by the value of `large-file-warning-threshold'.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#9800; Package emacs. (Fri, 04 Nov 2011 10:57:02 GMT)
Message #29 received at 9800 <at> debbugs.gnu.org (full text, mbox):
> From: Juri Linkov <juri <at> jurta.org>
> Date: Fri, 04 Nov 2011 11:36:01 +0200
> Cc: Paul Eggert <eggert <at> cs.ucla.edu>, 9800 <at> debbugs.gnu.org
>
> >> It strikes me that regular files can grow as you read them, too,
> >> and that Emacs is not doing this properly. That is, Emacs should
> >> be fixed so that it continues to read from a growing regular file
> >> until a proper EOF is reached (i.e., until read returns 0).
> >
> > Sounds like a good idea, but remember to bail out some time before
> > reading the infinitely big files to the very end. :-)
>
> Maybe limit the reading by the value of `large-file-warning-threshold'.
IMO, that value is ridiculously low for such use. Maybe multiply it
by some large factor.
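A cap of this kind could look roughly like the sketch below. It is only an illustration: read_with_limit is a hypothetical helper, and the limit is an ordinary parameter rather than anything derived from `large-file-warning-threshold'.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not Emacs code: consume data from FD until
   end of file or until LIMIT bytes have been read, whichever comes
   first.  Set *HIT_LIMIT to true if reading stopped because of the
   limit (more data may remain), false if end of file was reached.
   Return the number of bytes consumed, or -1 on a read error.  */
static long long
read_with_limit (int fd, long long limit, bool *hit_limit)
{
  assert (fd >= 0 && limit >= 0 && hit_limit);
  char chunk[4096];
  long long total = 0;
  *hit_limit = false;
  while (total < limit)
    {
      /* Never ask for more than the remaining allowance.  */
      size_t want = sizeof chunk;
      if ((long long) want > limit - total)
        want = (size_t) (limit - total);
      ssize_t n = read (fd, chunk, want);
      if (n > 0)
        total += n;
      else if (n == 0)
        return total;           /* genuine end of file */
      else if (errno != EINTR)
        return -1;              /* real error */
    }
  *hit_limit = true;            /* stopped at the cap */
  return total;
}
```

A caller can use the flag to decide whether to warn the user or ask for confirmation before reading further, which keeps an endless source from being read "to the very end".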
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#9800; Package emacs. (Mon, 07 Feb 2022 00:11:02 GMT)
Message #32 received at 9800 <at> debbugs.gnu.org (full text, mbox):
Juri Linkov <juri <at> jurta.org> writes:
> Large files from the /proc filesystem are visited incompletely,
> their file buffers are truncated at the position 65536.
It seems like this issue has been exacerbated somewhat since this was
reported.
(with-temp-buffer
(insert-file-contents "/proc/cpuinfo")
(buffer-size))
=> 16384
(with-temp-buffer
(call-process "cat" nil t nil "/proc/cpuinfo")
(buffer-size))
=> 24626
But perhaps it's dependent on the block size. (This is on
Debian/bookworm with Emacs 29.)
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#9800; Package emacs. (Mon, 07 Feb 2022 19:51:02 GMT)
Message #35 received at 9800 <at> debbugs.gnu.org (full text, mbox):
>> Large files from the /proc filesystem are visited incompletely,
>> their file buffers are truncated at the position 65536.
>
> It seems like this issue has been exacerbated somewhat since this was
> reported.
>
> (with-temp-buffer
> (insert-file-contents "/proc/cpuinfo")
> (buffer-size))
> => 16384
>
> (with-temp-buffer
> (call-process "cat" nil t nil "/proc/cpuinfo")
> (buffer-size))
> => 24626
>
> But perhaps it's dependent on the block size. (This is on
> Debian/bookworm with Emacs 29.)
I confirm that the block size has now decreased from 65536 to 16384,
so more buffers are truncated.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#9800; Package emacs. (Sun, 12 Feb 2023 07:40:02 GMT)
Message #38 received at 9800 <at> debbugs.gnu.org (full text, mbox):
> From: Andreas Schwab <schwab <at> linux-m68k.org>
> Date: Thu, 20 Oct 2011 10:44:57 +0200
> Cc: Juri Linkov <juri <at> jurta.org>, 9800 <at> debbugs.gnu.org
>
> Eli Zaretskii <eliz <at> gnu.org> writes:
>
> > Does lseek work on these files?
>
> No. The contents are completely dynamic, generated on-the-fly when
> reading.
>
> > Or we could treat those files as non-regular,
>
> That is the only option.
Are all the files in "/proc" of this nature? If so, we could consider
all of the files in that directory non-regular; if that is all that's
needed to visit /proc/foo files, insert-file-contents already has code
to deal with non-regular files.
Paul, do you see any downsides to such a heuristic? We could make it a
user option if the heuristic could sometimes backfire, I guess.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#9800; Package emacs. (Sun, 12 Feb 2023 09:26:01 GMT)
Message #41 received at 9800 <at> debbugs.gnu.org (full text, mbox):
There is /proc/config.gz which does report a size, contrary to other files which report a size of 0.
> On Feb 12, 2023, at 15:41, Eli Zaretskii <eliz <at> gnu.org> wrote:
>
>
>>
>> From: Andreas Schwab <schwab <at> linux-m68k.org>
>> Date: Thu, 20 Oct 2011 10:44:57 +0200
>> Cc: Juri Linkov <juri <at> jurta.org>, 9800 <at> debbugs.gnu.org
>>
>> Eli Zaretskii <eliz <at> gnu.org> writes:
>>
>>> Does lseek work on these files?
>>
>> No. The contents are completely dynamic, generated on-the-fly when
>> reading.
>>
>>> Or we could treat those files as non-regular,
>>
>> That is the only option.
>
> Are all the files in "/proc" of this nature? If so, we could consider
> all of the files in that directory non-regular; if that is all that's
> needed to visit /proc/foo files, insert-file-contents already has code
> to deal with non-regular files.
>
> Paul, do you see any downsides to such a heuristic? We could make it a
> user option if the heuristic could sometimes backfire, I guess.
Best,
RY
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#9800; Package emacs. (Sun, 12 Feb 2023 10:40:02 GMT)
Message #44 received at 9800 <at> debbugs.gnu.org (full text, mbox):
Hi,
I was just debugging this before I found the bug report. The diagnosis
is right: st_size is wrong for proc files (and, I'd argue, for regular
files sometimes). So, I agree with Paul.
Paul Eggert <eggert <at> cs.ucla.edu> writes:
> On 10/24/11 14:50, Richard Stallman wrote:
>> I think there was a reason for doing it this way. Perhaps so as to
>> allocate the space before reading the file.
>
> Yes, that sounds right. And in the typical case where the file is not
> growing, that allocates space efficiently. If the file is growing, though,
> it's OK to allocate more space after discovering that the initial
> allocation was too small.
Right. The best possible approach is, likely:
  fstat (fd, &st);
  bufsz = max (READ_BUF_SIZE, st.st_size);
  buf = malloc (bufsz);
  ssize_t ret = 0;
  size_t readsz = 0;
  do
    {
      /* Count only successful reads; ret is -1 after EINTR.  */
      if (ret > 0)
        readsz += ret;
      if (readsz == bufsz && size isn't unreasonable)
        {
          /* value chosen arbitrarily. */
          bufsz += min (16 * READ_BUF_SIZE, bufsz);
          buf = realloc (buf, bufsz);
        }
      errno = 0;
      ret = read (fd, buf + readsz, bufsz - readsz);
    }
  while (ret > 0 || (ret < 0 && errno == EINTR));
... or such. This approach is robust and general, and I suspect it'd
even work for named pipes.
st_size isn't a good enough indicator of size, and it can go out of date
before the time of use; however, it is no doubt a useful hint in the 99% case.
Using st_size to figure out a base allocation size and extending
appropriately is a well known strategy, and it would be appropriate to
do so here.
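For concreteness, here is a self-contained sketch of that strategy. It is not the patch that was installed: slurp_fd, INITIAL_CHUNK (a stand-in for READ_BUF_SIZE) and the max_size cap are all hypothetical, and the buffer simply doubles instead of using the growth step shown above.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

enum { INITIAL_CHUNK = 16384 };  /* hypothetical stand-in for READ_BUF_SIZE */

/* Hypothetical helper, not Emacs code: read all of FD into a
   malloc'd buffer.  st_size is used only as a hint for the first
   allocation; the loop keeps reading, growing the buffer as needed,
   until read returns 0.  The data must be shorter than MAX_SIZE
   bytes; otherwise fail with errno set to EFBIG.  On success return
   the buffer and store its length in *LEN; on failure return NULL.  */
static char *
slurp_fd (int fd, size_t max_size, size_t *len)
{
  assert (fd >= 0 && max_size > 0 && len);
  size_t bufsz = INITIAL_CHUNK;
  struct stat st;
  /* One spare byte lets the final read return 0 without regrowing.  */
  if (fstat (fd, &st) == 0 && st.st_size >= (off_t) bufsz
      && (uintmax_t) st.st_size < max_size)
    bufsz = (size_t) st.st_size + 1;
  if (bufsz > max_size)
    bufsz = max_size;

  char *buf = malloc (bufsz);
  if (!buf)
    return NULL;

  size_t used = 0;
  for (;;)
    {
      if (used == bufsz)
        {
          /* The hint was too small: grow, unless at the cap.  */
          if (bufsz >= max_size)
            {
              free (buf);
              errno = EFBIG;
              return NULL;
            }
          size_t grown = bufsz <= max_size / 2 ? bufsz * 2 : max_size;
          char *p = realloc (buf, grown);
          if (!p)
            {
              free (buf);
              return NULL;
            }
          buf = p;
          bufsz = grown;
        }
      ssize_t n = read (fd, buf + used, bufsz - used);
      if (n > 0)
        used += (size_t) n;
      else if (n == 0)
        {
          *len = used;          /* genuine end of file */
          return buf;
        }
      else if (errno != EINTR)
        {
          free (buf);
          return NULL;
        }
    }
}
```

When st_size is accurate, the function performs a single allocation and the last read returns 0 immediately; when st_size is 0 or stale, the growth path takes over.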
Thanks in advance, have a great day.
--
Arsen Arsenović
Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>: You have taken responsibility. (Mon, 13 Feb 2023 20:48:01 GMT)
Notification sent to Juri Linkov <juri <at> jurta.org>: bug acknowledged by developer. (Mon, 13 Feb 2023 20:48:02 GMT)
Message #49 received at 9800-done <at> debbugs.gnu.org (full text, mbox):
On 2023-02-12 02:21, Arsen Arsenović wrote:
> ... or such. This approach is robust and general, and I suspect it'd
> even work for named pipes.
Although the approach is indeed robust and general, and will work with
named pipes in some cases, it still has a problem if the other side of the
named pipe outputs data very slowly: Emacs will still seem to hang until
you type C-g.
That being said, the approach is an improvement and it fixes the
original bug report so I installed the attached and am boldly closing
the bug report (we can reopen it if I'm wrong). The last patch in the
attached series is the actual fix: the others are minor cleanups of this
messy area, which I discovered while looking into the fix.
This patch does not address the above-mentioned issue of named pipes, nor
the issue of inserting very large files: the code should behave roughly
the same as before in those two areas. These issues can be raised in
separate bug reports if needed.
PS. I was surprised to see that Emacs master currently has several test
case failures on GNU/Linux (specifically the latest Fedora and Ubuntu
releases). I hope these are known and that people are working on them.
1 files did not finish:
lisp/server-tests.log
4 files contained unexpected results:
src/lread-tests.log
lisp/international/mule-tests.log
lisp/emacs-lisp/map-tests.log
lisp/emacs-lisp/bytecomp-tests.log
[0001-Improve-insert-file-contents-checking.patch (text/x-patch, attachment)]
[0002-Improve-insert-file-contents-on-non-regular-files.patch (text/x-patch, attachment)]
[0003-Don-t-scan-text-twice-to-guess-coding-system.patch (text/x-patch, attachment)]
[0004-Fix-src-fileio.c-comment.patch (text/x-patch, attachment)]
[0005-Fix-insert-file-contents-on-proc-files.patch (text/x-patch, attachment)]
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 14 Mar 2023 11:24:04 GMT)
This bug report was last modified 2 years and 112 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.