GNU bug report logs - #14158
performance: Base64 -d is slow

Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.

Package: coreutils; Severity: wishlist; Reported by: Ole Tange <ole@HIDDEN>; Keywords: confirmed; dated Mon, 8 Apr 2013 07:39:02 UTC; Maintainer for coreutils is bug-coreutils@HIDDEN.
Changed bug title to 'performance: Base64 -d is slow' from 'Base64 -d is slow' Request was from Assaf Gordon <assafgordon@HIDDEN> to control <at> debbugs.gnu.org. Full text available.
Added tag(s) confirmed. Request was from Bob Proulx <bob@HIDDEN> to control <at> debbugs.gnu.org. Full text available.
Severity set to 'wishlist' from 'normal' Request was from Bob Proulx <bob@HIDDEN> to control <at> debbugs.gnu.org. Full text available.

Message received at 14158 <at> debbugs.gnu.org:


Received: (at 14158) by debbugs.gnu.org; 8 Apr 2013 20:01:48 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Mon Apr 08 16:01:48 2013
Received: from localhost ([127.0.0.1]:40509 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.72)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1UPIGC-00011i-BP
	for submit <at> debbugs.gnu.org; Mon, 08 Apr 2013 16:01:48 -0400
Received: from joseki.proulx.com ([216.17.153.58]:33928)
	by debbugs.gnu.org with esmtp (Exim 4.72)
	(envelope-from <bob@HIDDEN>) id 1UPIG9-00011Z-RI
	for 14158 <at> debbugs.gnu.org; Mon, 08 Apr 2013 16:01:47 -0400
Received: from hysteria.proulx.com (hysteria.proulx.com [192.168.230.119])
	by joseki.proulx.com (Postfix) with ESMTP id 3AA3F211E3
	for <14158 <at> debbugs.gnu.org>; Mon,  8 Apr 2013 13:58:10 -0600 (MDT)
Received: by hysteria.proulx.com (Postfix, from userid 1000)
	id 132882DC8E; Mon,  8 Apr 2013 13:58:08 -0600 (MDT)
Date: Mon, 8 Apr 2013 13:58:08 -0600
From: Bob Proulx <bob@HIDDEN>
To: 14158 <at> debbugs.gnu.org
Subject: tag and classify
Message-ID: <20130408195808.GA31673@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Spam-Score: -1.6 (-)
X-Debbugs-Envelope-To: 14158
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.13
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <http://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>,
	<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <http://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <http://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>,
	<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Sender: debbugs-submit-bounces <at> debbugs.gnu.org
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
X-Spam-Score: -2.9 (--)

severity 14158 wishlist
tag 14158 + confirmed
thanks




Information forwarded to bug-coreutils@HIDDEN:
bug#14158; Package coreutils. Full text available.

Message received at submit <at> debbugs.gnu.org:


Received: (at submit) by debbugs.gnu.org; 8 Apr 2013 07:38:18 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Mon Apr 08 03:38:18 2013
Received: from localhost ([127.0.0.1]:39376 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.72)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1UP6ef-0006YY-SH
	for submit <at> debbugs.gnu.org; Mon, 08 Apr 2013 03:38:18 -0400
Received: from eggs.gnu.org ([208.118.235.92]:57367)
	by debbugs.gnu.org with esmtp (Exim 4.72)
	(envelope-from <ole.tange@HIDDEN>) id 1UP6Ww-0005uc-Lm
	for submit <at> debbugs.gnu.org; Mon, 08 Apr 2013 03:30:20 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <ole.tange@HIDDEN>) id 1UP6TS-0007Gv-1n
	for submit <at> debbugs.gnu.org; Mon, 08 Apr 2013 03:26:47 -0400
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org
X-Spam-Level: 
X-Spam-Status: No, score=-100.5 required=5.0 tests=BAYES_05,FREEMAIL_FROM,
	T_DKIM_INVALID,USER_IN_WHITELIST autolearn=disabled version=3.3.2
Received: from lists.gnu.org ([208.118.235.17]:60674)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <ole.tange@HIDDEN>) id 1UP6TR-0007Gr-Tl
	for submit <at> debbugs.gnu.org; Mon, 08 Apr 2013 03:26:41 -0400
Received: from eggs.gnu.org ([208.118.235.92]:40389)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <ole.tange@HIDDEN>) id 1UP6TL-000182-0U
	for bug-coreutils@HIDDEN; Mon, 08 Apr 2013 03:26:41 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <ole.tange@HIDDEN>) id 1UP6TC-0007BO-0v
	for bug-coreutils@HIDDEN; Mon, 08 Apr 2013 03:26:34 -0400
Received: from mail-bk0-x229.google.com ([2a00:1450:4008:c01::229]:39111)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <ole.tange@HIDDEN>) id 1UP6TB-0007B0-O2
	for bug-coreutils@HIDDEN; Mon, 08 Apr 2013 03:26:25 -0400
Received: by mail-bk0-f41.google.com with SMTP id i18so2938896bkv.0
	for <bug-coreutils@HIDDEN>; Mon, 08 Apr 2013 00:26:25 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
	h=x-received:mime-version:sender:from:date:x-google-sender-auth
	:message-id:subject:to:content-type;
	bh=wGyYCppYe7QEzRTeqLOEJOyba6ZxtRDtl7OAhCelweA=;
	b=sh8iPBftTFo9SdKo4M6p1YjsvlzHZB+R0qI2XFH3rAgJzRHzhTWHCOTvX+aaVpRAlj
	cDz1dG4n4atCbVKeGxnyX5KQh1mt/S448+QjsR2hxiSc1b6M7PaOYCPUe6m3rdUbdbBT
	1YyEy/E6AK9FW/vaOZUbj+6Q1X0jtv1kp39cs3tDUqN13CD0m1F+Uy2hUAALSKUSBWiR
	DqGHle5nHU2M02ssSqH8bx/S1/HKuQxtjfVK5ZNt6ASt9COfUOvXE1kLw4HHdW3j4rTW
	W/b1iSPF/fVgXlhuepazisKrUWiEYwSN/ny+FDBcQm0kPN7AnA1SvxzSrjUa35VrZSfE
	ArtA==
X-Received: by 10.205.122.80 with SMTP id gf16mr10603083bkc.130.1365405984893; 
	Mon, 08 Apr 2013 00:26:24 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.204.129.16 with HTTP; Mon, 8 Apr 2013 00:26:04 -0700 (PDT)
From: Ole Tange <ole@HIDDEN>
Date: Mon, 8 Apr 2013 09:26:04 +0200
X-Google-Sender-Auth: 44SF3057D1EKxX4IzSbhOazuPHo
Message-ID: <CA+4vN7woWZBJ7aKHnxBADP=WjmMRJMn21sO1msadBwO+M3kWYg@HIDDEN>
Subject: Base64 -d is slow
To: bug-coreutils@HIDDEN
Content-Type: text/plain; charset=ISO-8859-1
X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address
	(bad octet value).
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x
X-Received-From: 208.118.235.17
X-Spam-Score: -3.4 (---)
X-Debbugs-Envelope-To: submit
X-Mailman-Approved-At: Mon, 08 Apr 2013 03:38:16 -0400
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.13
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <http://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>,
	<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <http://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <http://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>,
	<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Sender: debbugs-submit-bounces <at> debbugs.gnu.org
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
X-Spam-Score: -6.1 (------)

I was astonished to learn that:

  perl -MMIME::Base64 -e 'while(read(STDIN,$buf,770*50)){print
decode_base64($buf)}'

is faster than:

  base64 -d

My C-skills are quite limited, but it seems the current implementation
does a lot of computation and not table lookups.

Below are my thoughts on how this could be improved.


/Ole


= Designing a fast Base64 decoder =

By: Ole Tange <tange@HIDDEN>

== Goal ==

We want something that is parallelizable since that can take advantage of:

* Multiple cores/CPUs
* Vector processing (MMX/SSE)
* Processors with pipelines

== Observations ==

In Base64 every 4 bytes represent 3 bytes of decoded information.

Binary input: aaaaaaaa bbbbbbbb cccccccc dddddddd
=>
Base64 index: 111111 222222 333333 444444
=>
Decoded: 11111122 22223333 33444444

Given input byte 1 and 2 we can determine output byte 1.

Given input byte 2 and 3 we can determine output byte 2.

Given input byte 3 and 4 we can determine output byte 3.

A way to make an algorithm parallelizable is by not making
computations based on previous results, but only base the computation
on the input.

== Idea ==

Make a lookup table for the output byte.

By making a 16-bit lookup table for the first 2 bytes, the middle 2
bytes and the last 2 bytes, conversion of 4 byte to 3 bytes is fast
and independent of previous computations.

The lookup table can also deal with the conversion to Base64 index -
thus skipping this step. It can even deal with all the non-conflicting
variations of which char codes for index 62 and 63.

  out_byte1 = first[in_byte1][in_byte2];
  out_byte2 = middle[in_byte2][in_byte3];
  out_byte3 = last[in_byte3][in_byte4];

Each lookup table will at most be 64kbytes = 192kbytes total.

== Implementation ==

Here is the implementation in pseudo code:

byte compute_first_byte(int i, int j) {
  /* Return the decoded value of the first byte, given the first 2
bytes are i and j */
}

byte compute_middle_byte(int i, int j) {
  /* Return the decoded value of the middle byte, given the middle 2
bytes are i and j */
}

byte compute_last_byte(int i, int j) {
  /* Return the decoded value of the last byte, given the last 2 bytes
are i and j */
}

/* Lookup tables */
static byte first[256][256];
static byte middle[256][256];
static byte last[256][256];
static bool already_computed = 0;

void decode(byte* in, int len, byte* out) {
  /* in = pointer to input bytes with no garbage (\n and similar)
     len = length of input (must in 4 byte blocks)
     out = pointer to output buffer of length 3/4*len
  */

  /* Initialize lookup tables */
  if(! already_computed) {
    parallel_for(int i = 0; i < 256; i++) {
      parallel_for(int j = 0; j < 256; j++) {
        first[i][j] = compute_first_byte(i,j);
        middle[i][j] = compute_middle_byte(i,j);
        last[i][j] = compute_last_byte(i,j);
      }
    }
    already_computed = 1;
  }

  /* Lookup byte1+2, byte2+3, byte3+4 */
  parallel_for(out_idx = 0, in_idx = 0; in_idx < len; in_idx += 4,
out_idx += 3) {
    out[out_idx] = first[ in[in_idx] ][ in[in_idx+1] ];
    out[out_idx+1] = middle[ in[in_idx+1] ][ in[in_idx+2] ];
    out[out_idx+2] = last[ in[in_idx+2] ][ in[in_idx+3] ];
  }
}

== Improvement ideas ==

Use SIMD to do multiple 4-byte blocks in parallel - maybe a line at a
time?




Acknowledgement sent to Ole Tange <ole@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-coreutils@HIDDEN. Full text available.
Report forwarded to bug-coreutils@HIDDEN:
bug#14158; Package coreutils. Full text available.
Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.
Last modified: Mon, 25 Nov 2019 12:00:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.