GNU bug report logs - #35256
Bug report for -W argument (maximum width) - minor and not dangerous

Previous Next

Package: diffutils;

Reported by: alec <at> unifiedmathematics.com

Date: Sat, 13 Apr 2019 15:33:02 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 35256 in the body.
You can then email your comments to 35256 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-diffutils <at> gnu.org:
bug#35256; Package diffutils. (Sat, 13 Apr 2019 15:33:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to alec <at> unifiedmathematics.com:
New bug report received and forwarded. Copy sent to bug-diffutils <at> gnu.org. (Sat, 13 Apr 2019 15:33:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: alec <at> unifiedmathematics.com
To: bug-diffutils <at> gnu.org
Subject: Bug report for -W argument (maximum width) - minor and not dangerous
Date: Sat, 13 Apr 2019 12:29:45 +0100
[Message part 1 (text/plain, inline)]
Hello there.

I was hoping to view a side-by-side diff of something and, perhaps
unfairly, was hoping for a setting where diff would choose a width
such that there were no truncations and I would use less with no
wrapping to inspect the results.

My first attempt was "-W 0" (a width of 0 has no "legit" meaning
afterall) - error, so I tried -1. This leads to a weird situation
where it seems to just output loads of tabs - while it'll probably
still terminate eventually the behaviour is unreasonable.

To try this yourself run something like:

diff -y ./maps ./task/4974/maps -W -1 

from /proc/XXXX where XXXX is some PID for a program with threads (eg
firefox) and the 4974 is any task that isn't XXXX

ADDENDUM: The -1 isn't important, 9999999999999 also illustrates the
problem - END ADDENDUM

Looking at the code (in the 3.7 tarball, src/diff.c modified on 18th
of December 2018) notice:

Line 284:
  uintmax_t numval;
Line 525:
    case 'W':
      numval = strtoumax (optarg, &numend, 10);
      if (! (0 < numval && numval <= SIZE_MAX) || *numend)
        try_help ("invalid width '%s'", optarg);
      if (width != numval)
        {
          if (width)
        fatal ("conflicting width options");
          width = numval;
        }
      break;

For convenience:
      uintmax_t strtoumax(const char *nptr, char **endptr, int
base);
and it may set errno, my man page doesn't say whether this -1
behaviour is "okay" however it probably is, unsigned afterall, this
means that numval is going to be a really really big value. 

ABUSE POTENTIAL:

Just basic DOS (denial-of-service) stuff, a CPU usage spike comes from
diff itself and it seems to output a lot of tabs (a good 275mib / sec
on my machine) and will probably do so for a good few years before
anything else comes out, a testament to the robustness of diff is that
it did this, and its memory usage didn't start ballooning.

I know diff is used by A LOT of other programs, some of which are
web-accessible (eg mediawiki uses diff - and will by default if it
finds it), many of my projects use it too. It is not a big stretch to
imagine someone has a web-service out there which allows side-by-side
format, and not much of a further leap to assume that someone might
have an input box for width which exposes -W, guarded only by a regex
of the form ^[1-9][0-9]* (which yes, wont allow -1 but will allow
9999999999999)

You could bring a server to its knees pretty quickly using just diff's
CPU usage and a few tabs using this - that's not even considering
whether or not the system hypothesised here doesn't have trouble with
memory from a convenient get_line() function first.

While not really diff's fault or problem, a potential solution
detailed below would fix it and not cause any problems for those with
legit (?) needs for really wide diffs 

SUGGESTIONS:
Humans are limiting here, improvements and the growth of computers
wont really affect the maximum width so putting a limit in place is
reasonable. I make no claim there is a "maximum useful width" so being
able to override will ensure my half-assed musings on such a limit
wont cause any problems in the future. 

I'd go with something like
#define REASONABLE_LIMIT 1000

Add a check that numval is <= get_reasonable_specified_width_limit()
after the existing checks, if not output an error in the form of:

"You probably don't want to do that, see [wherever], if you do specify
--we-have-evolved-cylindical-lenses-now or set the environmental
variable GNU_DIFF_REASONABLE_LIMIT to a new limit, using 0 for none"

Lastly, for what it's worth from a perfect stranger:

I'm very impressed that diff didn't start consuming huge amounts of
memory, and a little saddened that it is impressive! 

Thanks very much for diff and your work on it, you have no idea how
many things it underpins! 

[Message part 2 (text/html, inline)]

Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>:
You have taken responsibility. (Tue, 27 Aug 2019 23:24:02 GMT) Full text and rfc822 format available.

Notification sent to alec <at> unifiedmathematics.com:
bug acknowledged by developer. (Tue, 27 Aug 2019 23:24:02 GMT) Full text and rfc822 format available.

Message #10 received at 35256-done <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: alec <at> unifiedmathematics.com
Cc: 35256-done <at> debbugs.gnu.org
Subject: Re: [bug-diffutils] bug#35256: Bug report for -W argument (maximum
 width) - minor and not dangerous
Date: Tue, 27 Aug 2019 16:23:08 -0700
[Message part 1 (text/plain, inline)]
alec <at> unifiedmathematics.com wrote:

> I know diff is used by A LOT of other programs, some of which are
> web-accessible

I'm afraid that ship sailed a while ago: if you let a remote attacker specify an 
arbitrary option to GNU diff there is lots of other trouble you can get into. 
For example, the -I option lets the attacker specify a regular expression that 
can cause diff to undergo exponential complexity. The general wisdom nowadays is 
to not expose command-line operands to attackers.

As for putting in a limit, the GNU Coding Standards say to not impose arbitrary 
limits. In some cases there are good reasons to impose a limit anyway but this 
one doesn't seem to rise to that level.

You do raise a good point that 'diff' shouldn't treat negative inputs as if they 
were large positive inputs, so I installed the attached patch.

Thanks for reporting the problem; your bug report was a pleasure to read.
[0001-diff-don-t-mistreat-N-in-arg-as-a-large-number.patch (text/x-patch, attachment)]

Information forwarded to bug-diffutils <at> gnu.org:
bug#35256; Package diffutils. (Wed, 28 Aug 2019 00:57:01 GMT) Full text and rfc822 format available.

Message #13 received at 35256 <at> debbugs.gnu.org (full text, mbox):

From: Assaf Gordon <assafgordon <at> gmail.com>
To: 35256 <at> debbugs.gnu.org, alec <at> unifiedmathematics.com
Cc: eggert <at> cs.ucla.edu
Subject: Re: [bug-diffutils] bug#35256: Bug report for -W argument (maximum
 width) - minor and not dangerous
Date: Tue, 27 Aug 2019 18:56:46 -0600
Hello,

Slightly off-topic, but potentially helpful:

On 2019-08-27 5:23 p.m., Paul Eggert wrote:
> alec <at> unifiedmathematics.com wrote:
> 
>> I know diff is used by A LOT of other programs, some of which are
>> web-accessible
> 
> [...] if you let a remote attacker 
> specify an arbitrary option to GNU diff there is lots of other trouble 
> you can get into. 
> [....] The general wisdom nowadays is to not expose command-line 
> operands to attackers.
While generally true, sometimes there's no way around it
(or perhaps it is even the goal).

An easy way to restrict resources is to execute a simple
wrapper shell script that uses 'timeout', 'prlimit' and 'setpriv' for
additional restrictions.

For example:

 timeout 10s \
   setpriv --no-new-privs \
     prlimit --cpu=3 --data=50000000 --nproc=1 \
       diff [ARGS]

will limit the "diff" process to running 10 seconds (of wall time),
consume up to 3 seconds of CPU time,
use up to 50MB of memory,
and limit to a single process (so it can't execute child processes).
The "setpriv" ensures it can't gain new privileges.

"prlimit" has more options (e.g. "--fsize" to limit file sizes
so it won't fill the drive, and "--nofiles" to limit number of open files).

These should work on any modern gnu/linux system
("timeout" is from coreutils, "setpriv" and "prlimit" are from util-linux).

None of the above is perfect,
but they add a quick layer of additional restrictions
(and they don't require additional privileges to use).

To take it a step further, you can use containers and tools such as
"bubblewrap" and "firefail" to isolate a process from the network,
from the filesystem, and even from other processes.


Hope this helps,
 -assaf




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 25 Sep 2019 11:24:06 GMT) Full text and rfc822 format available.

This bug report was last modified 4 years and 212 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.