X-Loop: help-debbugs@HIDDEN Subject: bug#45343: factor: missed optimizations Resent-From: Denys Vlasenko <dvlasenk@HIDDEN> Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> Resent-CC: bug-coreutils@HIDDEN Resent-Date: Sun, 20 Dec 2020 18:06:01 +0000 Resent-Message-ID: <handler.45343.B.160848752426818 <at> debbugs.gnu.org> Resent-Sender: help-debbugs@HIDDEN X-GNU-PR-Message: report 45343 X-GNU-PR-Package: coreutils X-GNU-PR-Keywords: To: coreutils@HIDDEN, 45343 <at> debbugs.gnu.org X-Debbugs-Original-To: coreutils@HIDDEN, bug-coreutils@HIDDEN Received: via spool by submit <at> debbugs.gnu.org id=B.160848752426818 (code B ref -1); Sun, 20 Dec 2020 18:06:01 +0000 Received: (at submit) by debbugs.gnu.org; 20 Dec 2020 18:05:24 +0000 Received: from localhost ([127.0.0.1]:45582 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1kr358-0006yT-NL for submit <at> debbugs.gnu.org; Sun, 20 Dec 2020 13:05:24 -0500 Received: from lists.gnu.org ([209.51.188.17]:49834) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <dvlasenk@HIDDEN>) id 1kr1OU-00005t-5X for submit <at> debbugs.gnu.org; Sun, 20 Dec 2020 11:17:14 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:43938) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from <dvlasenk@HIDDEN>) id 1kr1OL-0002dZ-O5 for bug-coreutils@HIDDEN; Sun, 20 Dec 2020 11:17:10 -0500 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:36142) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from <dvlasenk@HIDDEN>) id 1kr1OF-0003Zk-Q6 for bug-coreutils@HIDDEN; Sun, 20 Dec 2020 11:17:03 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1608481014; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type; bh=tTMiRZSj207/JKpyM0F4UTby8BZMItnp6wrjWKWMV0Q=; b=Lh/1E+GTyXhbTUuLLrvC1WPZjhC9JRx08Xbk5+gbrM8D0kDYvz4mienLoJXGPo1BuVzHwz B3IK1tEusIhPQoRDWR1gbSbXiSnE1PRtKVQh5X2KPvvNbnAFLdJTam8PJT+9maXSyLy/eM nvLefPaXqi1pV0hwlzO0rBjV6aWyMI4= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-467-vxPYPNPsONOB7_BDNXcNQg-1; Sun, 20 Dec 2020 11:16:52 -0500 X-MC-Unique: vxPYPNPsONOB7_BDNXcNQg-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 39B1A801AC0; Sun, 20 Dec 2020 16:16:51 +0000 (UTC) Received: from [10.40.192.23] (unknown [10.40.192.23]) by smtp.corp.redhat.com (Postfix) with ESMTP id 6CD095B692; Sun, 20 Dec 2020 16:16:50 +0000 (UTC) From: Denys Vlasenko <dvlasenk@HIDDEN> Message-ID: <9436ddae-b31b-7e1f-30a2-b56746344d85@HIDDEN> Date: Sun, 20 Dec 2020 17:16:49 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.4.0 MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=dvlasenk@HIDDEN X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: multipart/mixed; boundary="------------CF1065E22818613C93660811" Content-Language: en-US Received-SPF: pass client-ip=216.205.24.124; envelope-from=dvlasenk@HIDDEN; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-Spam-Score: -1.4 (-) X-Mailman-Approved-At: Sun, 20 Dec 2020 13:05:20 -0500 X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -2.4 (--) This is a multi-part message in MIME format. --------------CF1065E22818613C93660811 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit factor ---debug output is rather rudimentary. In order to understand the logic, I added more printouts (try attached debug patch) and immediately noticed some obvious snafus: $ src/factor ---debug 340282366920938461286658806734041124249 [using arbitrary-precision arithmetic] 340282366920938461286658806734041124249:[trial division 340282366920938461286658806734041124249] [trial stopped 4999(max)] [is number 34028236692093846128665880673404112424 9 prime?] [no] [pollard-rho (1)] 1:1:2:4:8:16:32:64:128:256:512:1024:2048:4096:8192:16384:32768:65536:131072:262144:524288:1048576:2097152:4194304:8388608:16777216:33554432:67108864:134217728:268435456:536870912 [factor found][rho factor 18446744073709551557] [trial division 18446744073709551556] [factor 2] [factor 2] [factor 11] [factor 137] [factor 547] [trial stopped 4999(max)] [is number 5594472617641 prime?] [trial division 5594472617640] [factor 2] [factor 2] [factor 2] [factor 3] [factor 5] [factor 1427] [factor 2131] [trial stopped 2131] [is number 15331 prime?] [yes] [factor 15331] [yes] ... Look at the ens of the last line. We don't need to check whether 15331 is prime. We know it is: we just stopped trial division because next prime divisor to try is larger than sqrt(number_we_try_to_factor) - therefore number_we_try_to_factor has no more divisors, ergo it's prime. The fix: for (unsigned int i = 1; i <= PRIMES_PTAB_ENTRIES;) { if (! mpz_divisible_ui_p (t, p)) { p += primes_diff[i++]; if (mpz_cmp_ui (t, p * p) < 0) + mp_factor_insert (factors, t); + mpz_set_ui (t, 1); break; } else ... [factor 5594472617641] [factor 18446744073709551557] [trial division 18446744073709551556] [factor 2] [factor 2] [factor 11] [factor 137] [factor 547] [trial stopped 4999(max)] [is number 5594472617641 prime?] [trial division 5594472617640] [factor 2] [factor 2] [factor 2] [factor 3] [factor 5] [factor 1427] [factor 2131] [trial stopped 2131] [is number 15331 prime?] [yes] [factor 15331] [yes] [factor 5594472617641] [factor 18446744073709551557] [p ollard-rho end] 18446744073709551557 18446744073709551557 Second optimization is this: as you see, I deliberately fed it a square (the factorization result is 18446744073709551557^2). It happily did not notice it and spent an awful lot of time in pollard rho: 1:1:2:4:8:16:32:64:128:256:512:1024:2048:4096:8192:16384:32768:65536:131072:262144:524288:1048576:2097152:4194304:8388608:16777216:33554432:67108864:134217728:268435456:536870912 A quick check for square drastically cuts down execution time: /* Use Pollard-rho to compute the prime factors of arbitrary-precision T, and put the results in FACTORS. */ static void +mp_factor_big (mpz_t t, struct mp_factors *factors) +{ + /* the _big() function assumes trial division was already tried */ + devmsg ("[is number prime?] "); + if (mp_prime_p (t)) + mp_factor_insert (factors, t); + else + if (mpz_perfect_square_p (t)) + { + struct mp_factors f2; + devmsg ("[it is square] "); + mpz_sqrt (t, t); + mp_factor_init (&f2); + mp_factor_big (t, &f2); + for (unsigned long int i = 0; i < f2.nfactors; i++) + { + unsigned long int e = f2.e[i]; + while (e--) { // + mp_factor_insert (factors, f2.p[i]); //FIXME: obvious optimization: + mp_factor_insert (factors, f2.p[i]); // need to introduce/use mp_factor_insert_multiplicity(..., e*2); + } + } + mp_factor_clear (&f2); + } + else + mp_factor_using_pollard_rho (t, 1, factors); +} + +static void mp_factor (mpz_t t, struct mp_factors *factors) { mp_factor_init (factors); @@ -2265,11 +2309,7 @@ mp_factor (mpz_t t, struct mp_factors *f . if (mpz_cmp_ui (t, 1) != 0) { - devmsg ("[is number prime?] "); - if (mp_prime_p (t)) - mp_factor_insert (factors, t); - else - mp_factor_using_pollard_rho (t, 1, factors); + mp_factor_big (t, factors); } } } --------------CF1065E22818613C93660811 Content-Type: text/x-patch; name="zz.diff" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="zz.diff" diff -urp coreutils-8.32/src/factor.c coreutils-8.32-TT/src/factor.c --- coreutils-8.32/src/factor.c 2020-01-01 15:13:12.000000000 +0100 +++ coreutils-8.32-TT/src/factor.c 2020-12-20 17:08:33.022081567 +0100 @@ -620,6 +620,8 @@ mp_factor_insert (struct mp_factors *fac unsigned long int *e = factors->e; long i; +gmp_printf ("[factor %Zd] ", prime); + /* Locate position for insert new or increment e. */ for (i = nfactors - 1; i >= 0; i--) { @@ -842,7 +844,8 @@ mp_factor_using_division (mpz_t t, struc mpz_t q; unsigned long int p; - devmsg ("[trial division] "); +gmp_printf ("[trial division %Zd] ", t); +// devmsg ("[trial division] "); mpz_init (q); @@ -861,6 +864,8 @@ mp_factor_using_division (mpz_t t, struc { p += primes_diff[i++]; if (mpz_cmp_ui (t, p * p) < 0) + mp_factor_insert (factors, t); + mpz_set_ui (t, 1); break; } else @@ -1407,6 +1412,7 @@ mp_prime_p (mpz_t n) if (!mp_millerrabin (n, nm1, a, tmp, q, k)) { is_prime = false; +gmp_printf ("[ML1:no] "); goto ret2; } @@ -1414,6 +1420,7 @@ mp_prime_p (mpz_t n) { /* Factor n-1 for Lucas. */ mpz_set (tmp, nm1); +gmp_printf ("[ML does not know, trying Lucas] "); mp_factor (tmp, &factors); } @@ -1438,13 +1445,16 @@ mp_prime_p (mpz_t n) } if (is_prime) +{gmp_printf ("[Lucas:yes] "); goto ret1; +} mpz_add_ui (a, a, primes_diff[r]); /* Establish new base. */ if (!mp_millerrabin (n, nm1, a, tmp, q, k)) { is_prime = false; +gmp_printf ("[ML2:no] "); goto ret1; } } @@ -1692,6 +1702,7 @@ mp_factor_using_pollard_rho (mpz_t n, un { for (;;) { +printf("%llu", k); do { mpz_mul (t, x, x); @@ -1715,6 +1726,7 @@ mp_factor_using_pollard_rho (mpz_t n, un mpz_set (z, x); k = l; l = 2 * l; +devmsg(":"); for (unsigned long long int i = 0; i < k; i++) { mpz_mul (t, x, x); @@ -1725,6 +1737,7 @@ mp_factor_using_pollard_rho (mpz_t n, un } factor_found: +gmp_printf("[factor found]"); do { mpz_mul (t, y, y); @@ -1738,6 +1751,7 @@ mp_factor_using_pollard_rho (mpz_t n, un mpz_divexact (n, n, t); /* divide by t, before t is overwritten */ +gmp_printf("[rho factor %Zd] ", t); if (!mp_prime_p (t)) { devmsg ("[composite factor--restarting pollard-rho] "); @@ -1759,6 +1773,7 @@ mp_factor_using_pollard_rho (mpz_t n, un mpz_mod (y, y, n); } + devmsg ("[pollard-rho end] "); mpz_clears (P, t2, t, z, x, y, NULL); } #endif @@ -2255,6 +2270,35 @@ factor (uintmax_t t1, uintmax_t t0, stru /* Use Pollard-rho to compute the prime factors of arbitrary-precision T, and put the results in FACTORS. */ static void +mp_factor_big (mpz_t t, struct mp_factors *factors) +{ + /* the _big() function assumes trial division was already tried */ + devmsg ("[is number prime?] "); + if (mp_prime_p (t)) + mp_factor_insert (factors, t); + else + if (mpz_perfect_square_p (t)) + { + struct mp_factors f2; + devmsg ("[it is square] "); + mpz_sqrt (t, t); + mp_factor_init (&f2); + mp_factor_big (t, &f2); + for (unsigned long int i = 0; i < f2.nfactors; i++) + { + unsigned long int e = f2.e[i]; + while (e--) { + mp_factor_insert (factors, f2.p[i]); + mp_factor_insert (factors, f2.p[i]); + } + } + mp_factor_clear (&f2); + } + else + mp_factor_using_pollard_rho (t, 1, factors); +} + +static void mp_factor (mpz_t t, struct mp_factors *factors) { mp_factor_init (factors); @@ -2265,11 +2309,7 @@ mp_factor (mpz_t t, struct mp_factors *f if (mpz_cmp_ui (t, 1) != 0) { - devmsg ("[is number prime?] "); - if (mp_prime_p (t)) - mp_factor_insert (factors, t); - else - mp_factor_using_pollard_rho (t, 1, factors); + mp_factor_big (t, factors); } } } @@ -2599,6 +2639,8 @@ do_stdin (void) int main (int argc, char **argv) { + setvbuf(stdout, NULL, _IONBF, 0); + initialize_main (&argc, &argv); set_program_name (argv[0]); setlocale (LC_ALL, ""); --------------CF1065E22818613C93660811--
Content-Disposition: inline Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-Mailer: MIME-tools 5.505 (Entity 5.505) Content-Type: text/plain; charset=utf-8 X-Loop: help-debbugs@HIDDEN From: help-debbugs@HIDDEN (GNU bug Tracking System) To: Denys Vlasenko <dvlasenk@HIDDEN> Subject: bug#45343: Acknowledgement (factor: missed optimizations) Message-ID: <handler.45343.B.160848752426818.ack <at> debbugs.gnu.org> References: <9436ddae-b31b-7e1f-30a2-b56746344d85@HIDDEN> X-Gnu-PR-Message: ack 45343 X-Gnu-PR-Package: coreutils Reply-To: 45343 <at> debbugs.gnu.org Date: Sun, 20 Dec 2020 18:06:01 +0000 Thank you for filing a new bug report with debbugs.gnu.org. This is an automatically generated reply to let you know your message has been received. Your message is being forwarded to the package maintainers and other interested parties for their attention; they will reply in due course. Your message has been sent to the package maintainer(s): bug-coreutils@HIDDEN If you wish to submit further information on this problem, please send it to 45343 <at> debbugs.gnu.org. Please do not send mail to help-debbugs@HIDDEN unless you wish to report a problem with the Bug-tracking system. --=20 45343: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=3D45343 GNU Bug Tracking System Contact help-debbugs@HIDDEN with problems
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997 nCipher Corporation Ltd,
1994-97 Ian Jackson.