GNU bug report logs - #73484
31.0.50; Abolishing etags-regen-file-extensions

Package: emacs; Severity: wishlist; Reported by: Sean Whitton <spwhitton@HIDDEN>; dated Wed, 25 Sep 2024 19:41:01 UTC; Maintainer for emacs is bug-gnu-emacs@HIDDEN.
Severity set to 'wishlist' from 'normal'. Request was from Stefan Kangas <stefankangas@HIDDEN> to control <at> debbugs.gnu.org.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 11 Oct 2024 10:37:59 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Fri Oct 11 06:37:59 2024
Received: from localhost ([127.0.0.1]:33589 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1szD1z-0001kf-95
	for submit <at> debbugs.gnu.org; Fri, 11 Oct 2024 06:37:59 -0400
Received: from eggs.gnu.org ([209.51.188.92]:54344)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <pot@HIDDEN>) id 1szD1w-0001kP-Qt
 for 73484 <at> debbugs.gnu.org; Fri, 11 Oct 2024 06:37:57 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <pot@HIDDEN>)
 id 1szD1e-0002n1-49; Fri, 11 Oct 2024 06:37:38 -0400
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org;
 s=fencepost-gnu-org; h=MIME-Version:References:Subject:In-Reply-To:To:Date:
 From; bh=9qtRSCg8ELNsAkCPm/WGEODlzdtzt/iywHXVk6giWgg=; b=LT4Mlj35bCTEODaRbpmq
 /GBMVE1M/ITW4tdiCK24jqnKzWI/qYNElFL7NXgaqTQGU0Fwes21QHsQFEe+C+5EhP3HsKE/M8JJo
 qGeFWyl/HlP4trjlow/ka2K4WGywB7wZtlhxTyb29KXnzOHW3klqMQ8pHzUVMGn0jqUsnt096tBWv
 od4f3tAESQzhNX2MKSFqOs/8dOqIlPUqHdwInilm9s/CnEk6t4VydeIQv+yhgN042M44r6vHV9Z9y
 odHYWAYxYBImCjtrFWPbUrXhUfmlmXfzsKYuTdv5HqG/IcMhpG1bZiufCMjE5hLPHqIxuSJBhxykE
 Du39q+jSspErMg==;
Message-Id: <8734l3hqmp.fsf@HIDDEN>
From: =?iso-8859-1?Q?Francesco_Potort=EC?= <pot@HIDDEN>
Date: Fri, 11 Oct 2024 12:37:34 +0200
To: Eli Zaretskii <eliz@HIDDEN>
In-Reply-To: <86frp32a90.fsf@HIDDEN> (eliz@HIDDEN)
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
References: <87tteaznog.fsf@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN>
 <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN>
 <86frpe2186.fsf@HIDDEN>
 <8e305b6d-8ca8-4437-990f-183ebc007d18@HIDDEN>
 <865xqa1ggi.fsf@HIDDEN>
 <ca89563f-b0d2-412a-9248-e4beb3ad7b84@HIDDEN>
 <86ttdtzoof.fsf@HIDDEN>
 <8d7dc133-9828-4023-821f-e4403f899f81@HIDDEN>
 <86ttdsxt6x.fsf@HIDDEN>
 <52cb1caa-9e7e-45df-b328-d60948d397f6@HIDDEN>
 <864j5rxca1.fsf@HIDDEN>
 <b59bf102-a9d8-4723-91ac-acc3f8ff3aa8@HIDDEN>
 <87a5fiijy9.fsf@HIDDEN>
 <bd6751c8-5504-48d7-82d7-a3e8849a1910@HIDDEN>
 <86jzelvjh4.fsf@HIDDEN>
 <8b6560a9-e2d6-42ae-ac1d-014700f21804@HIDDEN>
 <86wmiktzez.fsf@HIDDEN>
 <b0192173-3d8a-49da-8792-521b9b486568@HIDDEN>
 <86ldyzucdd.fsf@HIDDEN>
 <021c625b-adc9-4e19-819c-fe929583e503@HIDDEN>
 <86ed4ru41x.fsf@HIDDEN>
 <8d86f23e-fdc3-45a5-b3c8-cd4670813e21@HIDDEN>
 <86ploasq35.fsf@HIDDEN> <3e63f532-c6af-4923-880b-01a32cc667ec@HIDDEN>
 <878quwix4c.fsf@HIDDEN> <86ldyw3467.fsf@HIDDEN>
 <875xq0icqa.fsf@HIDDEN> <86y12w1hjp.fsf@HIDDEN>
 <874j5khw6f.fsf@HIDDEN> <86frp32a90.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Organization: The GNU project
X-fingerprint: 4B02 6187 5C03 D6B1 2E31  7666 09DF 2DC9 BE21 6115
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 73484
Cc: dmitry@HIDDEN, 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -3.3 (---)

>> From: Francesco Potortì <pot@HIDDEN>
>> Date: Thu, 10 Oct 2024 16:25:28 +0200
>> Cc: dmitry@HIDDEN,
>> 	73484 <at> debbugs.gnu.org,
>> 	spwhitton@HIDDEN
>>
>>   for (fdp = fdhead; fdp != NULL; fdp = fdp->next)
>>     {
>>       assert (fdp->infname != NULL);
>>       if (streq (uncompressed_name, fdp->infname))
>> 	goto cleanup;
>>     }
>>
>> This is a simple O(N^2) comparison: over N files the loop body runs sum(i=1..N)(i-1) =~ N^2/2 times, which for ~375k files means ~70G comparisons.  If you can count the number of times streq is called and 70G is a substantial portion of that number, then we have the culprit.  To check, just remove the above test and see if the running time drops.
>
>Dmitry already made this check, and the run time did drop, see
>https://debbugs.gnu.org/cgi/bugreport.cgi?bug=73484#107

Yes, sorry, I am travelling and I had missed that email.

>> In that case, using a hash rather than a comparison would probably make sense.
>
>Right.

If I recall correctly, etags depends on libc only.  If that is really the case, it would be nice to create an ad hoc hash function without relying on additional libraries.

>> Alternatively, rather than managing file names in a single loop, do a first loop on all file names to canonicalise them, but without searching for tags (essentially, remove the call to process_file from process_file_name), then uniquify the list of canonicalised file names, then run process_file on them.
>
>I don't think this is possible because command-line options can be
>interspersed with file names, and each option affects the processing
>of the files whose names follow the option.

It should be possible as I have outlined above.  When the command line is parsed, process_file_name is called on each file name.  It canonicalises the current name, compares it with the previous file names, adds a new node containing the canonicalised name to a linked list and calls process_file on the file name.  It is possible to remove the last step and instead call process_file in a second loop, but I do not know if it is convenient.

The uniquify solution would be nonparametric, if I am not wrong, while the hash solution requires choosing the size of the hash table.
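
To make the two-pass idea concrete, here is a minimal sketch in C, not taken from etags.c.  It assumes a first pass has already filled an array with canonicalised names; the struct pending_file, its position field and the function uniquify are hypothetical names invented for this illustration.  The position records where each name appeared on the command line, so that the surviving names can be put back in their original order (and, presumably, matched with the options in effect at that point) before process_file is run on them.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record built by a first pass over the command line:
   one canonicalised name plus the position where it appeared.  */
struct pending_file
{
  const char *name;	/* canonicalised, uncompressed file name */
  size_t position;	/* index of the name on the command line */
};

/* Order by name; equal names are ordered by position, so the
   earliest occurrence comes first within a run of duplicates.  */
static int
by_name (const void *a, const void *b)
{
  const struct pending_file *x = a, *y = b;
  int c = strcmp (x->name, y->name);
  if (c != 0)
    return c;
  return (x->position > y->position) - (x->position < y->position);
}

/* Order by command-line position only.  */
static int
by_position (const void *a, const void *b)
{
  const struct pending_file *x = a, *y = b;
  return (x->position > y->position) - (x->position < y->position);
}

/* Drop duplicate names in place, keeping the first occurrence of
   each, and restore command-line order.  Return the new count.
   Cost is O(N log N) and no table size has to be chosen.  */
static size_t
uniquify (struct pending_file *files, size_t n)
{
  size_t kept = 0;

  assert (files != NULL || n == 0);
  if (n == 0)
    return 0;

  qsort (files, n, sizeof *files, by_name);
  for (size_t i = 0; i < n; i++)
    if (kept == 0 || strcmp (files[kept - 1].name, files[i].name) != 0)
      files[kept++] = files[i];
  qsort (files, kept, sizeof *files, by_position);
  return kept;
}
```

A second loop would then call process_file on the first `kept` entries.  Whether the per-file options can be carried along this way in the real code is exactly the open question, so take this only as an outline.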

I guess that the hash solution is simpler and equally efficient in the great majority of cases, provided that the size of the hash table is appropriate.  Probably it would be reasonable to start with a 20-bit hash, and to increase that number if in some years doing so looks reasonable.
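
For illustration, a libc-only version of the hash approach could look like the sketch below.  It is not a patch against etags.c: the names seen_node, seen_table, hash_name and seen_before are hypothetical, and the choice of 32-bit FNV-1a folded down to 20 bits is just one simple option among many.  Collisions are handled by chaining, so a table that is too small only costs speed, not correctness.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HASH_BITS 20			/* 2^20 = 1048576 buckets */
#define HASH_SIZE ((size_t) 1 << HASH_BITS)

/* Hypothetical chain node holding one file name already seen.  */
struct seen_node
{
  struct seen_node *next;
  char *name;
};

static struct seen_node *seen_table[HASH_SIZE];

/* 32-bit FNV-1a over the bytes of S, xor-folded to HASH_BITS bits.  */
static uint32_t
hash_name (const char *s)
{
  uint32_t h = 2166136261u;		/* FNV offset basis */
  for (; *s != '\0'; s++)
    {
      h ^= (unsigned char) *s;
      h *= 16777619u;			/* FNV prime */
    }
  return ((h >> HASH_BITS) ^ h) & (uint32_t) (HASH_SIZE - 1);
}

/* Return true if NAME was recorded by an earlier call; otherwise
   record a copy of it and return false.  Only the names that share
   a bucket are compared, instead of every name seen so far.  */
static bool
seen_before (const char *name)
{
  assert (name != NULL);
  struct seen_node **bucket = &seen_table[hash_name (name)];

  for (struct seen_node *p = *bucket; p != NULL; p = p->next)
    if (strcmp (p->name, name) == 0)
      return true;

  size_t len = strlen (name) + 1;
  struct seen_node *node = malloc (sizeof *node);
  char *copy = malloc (len);
  if (node == NULL || copy == NULL)
    abort ();
  memcpy (copy, name, len);
  node->name = copy;
  node->next = *bucket;
  *bucket = node;
  return false;
}
```

In process_file_name the linear scan would then become something like `if (seen_before (uncompressed_name)) goto cleanup;`.  With ~375k names spread over 2^20 buckets the chains should stay very short on average.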




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 10 Oct 2024 16:28:37 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Thu Oct 10 12:28:37 2024
Received: from localhost ([127.0.0.1]:60311 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1syw1l-00036N-6w
	for submit <at> debbugs.gnu.org; Thu, 10 Oct 2024 12:28:37 -0400
Received: from eggs.gnu.org ([209.51.188.92]:60102)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <eliz@HIDDEN>) id 1syw1i-000367-Ek
 for 73484 <at> debbugs.gnu.org; Thu, 10 Oct 2024 12:28:35 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <eliz@HIDDEN>)
 id 1syw1Q-0005MG-6I; Thu, 10 Oct 2024 12:28:16 -0400
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org;
 s=fencepost-gnu-org; h=MIME-version:References:Subject:In-Reply-To:To:From:
 Date; bh=tpG2CqdiUKDgUXI8pHyuSJPhTNJXhvrnQc//MbX3Klo=; b=IfwAqZ+We+aMl+kypxtL
 v1X4DZ/2EkuM5JrAL+hlDyIYpxi8aGmJCkuKBYlWQCGW0mvUayvpKjTdna6zXGmMvO6lIhDO7p81Q
 MbdbXV15NlhmfvtGGaGuhVJ3yBaIugWJyrCjebF+Y5ci1tO+ocARoI5vPBeYyydgUCtfFailW4EO9
 tSpK1iJPm4U/DI+Pk1tb4JXofD/aCKHweEbw6DcxtJdC4qt06q5FlY6YfzuM0ZrX0NUALgS4w+HTg
 KrZ/eQCbqbhCzNMr1IpNPqO8SHhS6ekqYytduJxOcqvX0v+6qBVL1oFWvdM52bLvxErr6DtBLXCV3
 1mMZCuIk1+2PNg==;
Date: Thu, 10 Oct 2024 19:28:11 +0300
Message-Id: <86frp32a90.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Francesco =?iso-8859-1?Q?Potort=EC?= <pot@HIDDEN>
In-Reply-To: <874j5khw6f.fsf@HIDDEN> (message from Francesco
 =?iso-8859-1?Q?Potort=EC?= on Thu, 10 Oct 2024 16:25:28 +0200)
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
References: <87tteaznog.fsf@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN>
 <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN>
 <86frpe2186.fsf@HIDDEN>
 <8e305b6d-8ca8-4437-990f-183ebc007d18@HIDDEN>
 <865xqa1ggi.fsf@HIDDEN>
 <ca89563f-b0d2-412a-9248-e4beb3ad7b84@HIDDEN>
 <86ttdtzoof.fsf@HIDDEN>
 <8d7dc133-9828-4023-821f-e4403f899f81@HIDDEN>
 <86ttdsxt6x.fsf@HIDDEN>
 <52cb1caa-9e7e-45df-b328-d60948d397f6@HIDDEN>
 <864j5rxca1.fsf@HIDDEN>
 <b59bf102-a9d8-4723-91ac-acc3f8ff3aa8@HIDDEN>
 <87a5fiijy9.fsf@HIDDEN>
 <bd6751c8-5504-48d7-82d7-a3e8849a1910@HIDDEN>
 <86jzelvjh4.fsf@HIDDEN>
 <8b6560a9-e2d6-42ae-ac1d-014700f21804@HIDDEN>
 <86wmiktzez.fsf@HIDDEN>
 <b0192173-3d8a-49da-8792-521b9b486568@HIDDEN>
 <86ldyzucdd.fsf@HIDDEN>
 <021c625b-adc9-4e19-819c-fe929583e503@HIDDEN>
 <86ed4ru41x.fsf@HIDDEN>
 <8d86f23e-fdc3-45a5-b3c8-cd4670813e21@HIDDEN>
 <86ploasq35.fsf@HIDDEN> <3e63f532-c6af-4923-880b-01a32cc667ec@HIDDEN>
 <878quwix4c.fsf@HIDDEN> <86ldyw3467.fsf@HIDDEN>
 <875xq0icqa.fsf@HIDDEN> <86y12w1hjp.fsf@HIDDEN>
 <874j5khw6f.fsf@HIDDEN>
MIME-version: 1.0
Content-type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 73484
Cc: dmitry@HIDDEN, 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN
X-Spam-Score: -3.3 (---)

> From: Francesco Potortì <pot@HIDDEN>
> Date: Thu, 10 Oct 2024 16:25:28 +0200
> Cc: dmitry@HIDDEN,
> 	73484 <at> debbugs.gnu.org,
> 	spwhitton@HIDDEN
> 
>   for (fdp = fdhead; fdp != NULL; fdp = fdp->next)
>     {
>       assert (fdp->infname != NULL);
>       if (streq (uncompressed_name, fdp->infname))
> 	goto cleanup;
>     }
> 
> This is a simple O(N^2) comparison: over N files the loop body runs sum(i=1..N)(i-1) =~ N^2/2 times, which for ~375k files means ~70G comparisons.  If you can count the number of times streq is called and 70G is a substantial portion of that number, then we have the culprit.  To check, just remove the above test and see if the running time drops.

Dmitry already made this check, and the run time did drop, see
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=73484#107

> In that case, using a hash rather than a comparison would probably make sense.

Right.

> Alternatively, rather than managing file names in a single loop, do a first loop on all file names to canonicalise them, but without searching for tags (essentially, remove the call to process_file from process_file_name), then uniquify the list of canonicalised file names, then run process_file on them.

I don't think this is possible because command-line options can be
interspersed with file names, and each option affects the processing
of the files whose names follow the option.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 10 Oct 2024 14:28:12 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Thu Oct 10 10:28:12 2024
Received: from localhost ([127.0.0.1]:60167 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1syu9D-00058Q-HP
	for submit <at> debbugs.gnu.org; Thu, 10 Oct 2024 10:28:12 -0400
Received: from eggs.gnu.org ([209.51.188.92]:54268)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <pot@HIDDEN>) id 1syu9B-000588-7O
 for 73484 <at> debbugs.gnu.org; Thu, 10 Oct 2024 10:28:10 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <pot@HIDDEN>)
 id 1syu6m-0007ih-Tu; Thu, 10 Oct 2024 10:25:40 -0400
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org;
 s=fencepost-gnu-org; h=MIME-Version:References:Subject:In-Reply-To:To:Date:
 From; bh=jzpwtLuufN0WlzSpynCAMjPVm7WWIS9CI/wT5NUeOe0=; b=U5WjZGC2ZnqhHVfSmH2Y
 dC2wbNCkb6NvBiaH9qFDcc5WmfbpuBRaRmFLDJ+U0EEdaZfNXsPMObhZXemTDAbunuVUcdfbFUogS
 RYpeV39qTOZ+KGqWHt3rN02VYXj2p93PV7VHl8ud/NRTK2s/s1TGJigtywOSDAQELB8Uh3EwdhvUy
 m8Gk7fOBVnm04DGBDJMwYgeAE9o2u201t06Qg/Z6XilM5mSd06H+3Wuior6mf5D+0s0f8KnHMI84A
 SvXh3H2/6Mj1T8mHL671g2KXLIu+0ZAUxHdkcFHIySFB5pkZPHneM5POf/m+LDGs8FCeE0gDg3niZ
 Jd9es10vJ9Fh3A==;
Message-Id: <874j5khw6f.fsf@HIDDEN>
From: =?iso-8859-1?Q?Francesco_Potort=EC?= <pot@HIDDEN>
Date: Thu, 10 Oct 2024 16:25:28 +0200
To: Eli Zaretskii <eliz@HIDDEN>
In-Reply-To: <86y12w1hjp.fsf@HIDDEN> (eliz@HIDDEN)
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
References: <87tteaznog.fsf@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN>
 <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN>
 <86frpe2186.fsf@HIDDEN>
 <8e305b6d-8ca8-4437-990f-183ebc007d18@HIDDEN>
 <865xqa1ggi.fsf@HIDDEN>
 <ca89563f-b0d2-412a-9248-e4beb3ad7b84@HIDDEN>
 <86ttdtzoof.fsf@HIDDEN>
 <8d7dc133-9828-4023-821f-e4403f899f81@HIDDEN>
 <86ttdsxt6x.fsf@HIDDEN>
 <52cb1caa-9e7e-45df-b328-d60948d397f6@HIDDEN>
 <864j5rxca1.fsf@HIDDEN>
 <b59bf102-a9d8-4723-91ac-acc3f8ff3aa8@HIDDEN>
 <87a5fiijy9.fsf@HIDDEN>
 <bd6751c8-5504-48d7-82d7-a3e8849a1910@HIDDEN>
 <86jzelvjh4.fsf@HIDDEN>
 <8b6560a9-e2d6-42ae-ac1d-014700f21804@HIDDEN>
 <86wmiktzez.fsf@HIDDEN>
 <b0192173-3d8a-49da-8792-521b9b486568@HIDDEN>
 <86ldyzucdd.fsf@HIDDEN>
 <021c625b-adc9-4e19-819c-fe929583e503@HIDDEN>
 <86ed4ru41x.fsf@HIDDEN>
 <8d86f23e-fdc3-45a5-b3c8-cd4670813e21@HIDDEN>
 <86ploasq35.fsf@HIDDEN> <3e63f532-c6af-4923-880b-01a32cc667ec@HIDDEN>
 <878quwix4c.fsf@HIDDEN> <86ldyw3467.fsf@HIDDEN>
 <875xq0icqa.fsf@HIDDEN> <86y12w1hjp.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain
Organization: The GNU project
X-fingerprint: 4B02 6187 5C03 D6B1 2E31  7666 09DF 2DC9 BE21 6115
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 73484
Cc: dmitry@HIDDEN, 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN
X-Spam-Score: -3.3 (---)

>> I had a quick look at the whole code and in fact the only place I can find where you have O(N^2) behaviour seems to be file name comparison, and it still looks so strange to me that this can in fact cause significant delay.
>
>We are using etags on a huge tree: about 375K files.  I think that's
>the reason, because non-linear behaviors are like that: they are
>insignificant with small sets, but huge with larger ones...
>
>Profiles don't lie...

Ok, makes sense.  I must have missed the number of files in your previous explanations, sorry.  The only other place where I found O(N^2) behaviour is when managing #line directives, but you already tried to disable them without much change.  So let's concentrate on file name comparison, which is done in process_file_name at

  for (fdp = fdhead; fdp != NULL; fdp = fdp->next)
    {
      assert (fdp->infname != NULL);
      if (streq (uncompressed_name, fdp->infname))
	goto cleanup;
    }

This is a simple O(N^2) comparison: over N files the loop body runs sum(i=1..N)(i-1) =~ N^2/2 times, which for ~375k files means ~70G comparisons.  If you can count the number of times streq is called and 70G is a substantial portion of that number, then we have the culprit.  To check, just remove the above test and see if the running time drops.
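
As a rough check of that figure, here is a minimal C sketch, not taken from etags.c.  The function expected_comparisons evaluates N(N-1)/2, and counted_streq is a hypothetical counting stand-in for streq that could be swapped in temporarily to measure the real number of calls.  Both names and the global counter streq_calls are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical global counter; not part of etags.c.  */
static unsigned long long streq_calls;

/* Counting stand-in for the streq helper used in the loop above:
   same result as a plain string equality test, but every call is
   tallied so the total can be printed at exit.  */
static bool
counted_streq (const char *s, const char *t)
{
  assert (s != NULL && t != NULL);
  streq_calls++;
  return strcmp (s, t) == 0;
}

/* Worst case for the linear scan: all N names are distinct, so the
   i-th name is compared with the i-1 names seen before it, which
   sums to N*(N-1)/2 comparisons overall.  */
static unsigned long long
expected_comparisons (unsigned long long n)
{
  return n * (n - 1) / 2;
}
```

For N = 375000 the formula gives 70312312500, that is about 70G, assuming all names are distinct so that the scan never exits early.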

In that case, using a hash rather than a comparison would probably make sense.  Alternatively, rather than managing file names in a single loop, do a first loop on all file names to canonicalise them, but without searching for tags (essentially, remove the call to process_file from process_file_name), then uniquify the list of canonicalised file names, then run process_file on them.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 10 Oct 2024 10:18:36 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Thu Oct 10 06:18:36 2024
Received: from localhost ([127.0.0.1]:58757 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1syqFf-0008Tw-RK
	for submit <at> debbugs.gnu.org; Thu, 10 Oct 2024 06:18:36 -0400
Received: from fout-a7-smtp.messagingengine.com ([103.168.172.150]:38959)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <dmitry@HIDDEN>) id 1syqFd-0008Ti-U6
 for 73484 <at> debbugs.gnu.org; Thu, 10 Oct 2024 06:18:35 -0400
Received: from phl-compute-06.internal (phl-compute-06.phl.internal
 [10.202.2.46])
 by mailfout.phl.internal (Postfix) with ESMTP id 8EE4E13805F1;
 Thu, 10 Oct 2024 06:18:17 -0400 (EDT)
Received: from phl-imap-04 ([10.202.2.82])
 by phl-compute-06.internal (MEProxy); Thu, 10 Oct 2024 06:18:17 -0400
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gutov.dev; h=cc
 :cc:content-type:content-type:date:date:from:from:in-reply-to
 :in-reply-to:message-id:mime-version:references:reply-to:subject
 :subject:to:to; s=fm1; t=1728555497; x=1728641897; bh=f8KGo//n+a
 GXlD0U+2AaYpJFZcxPpp0nY4Gj6RFY7qw=; b=U3cFNSgiL+trNptlK6WjX+sDfM
 yPIigvYBywznm+KXugN5seohY+S8RihSKF6vTkco9a50eD9MbTrNdEHhwgjn8RfP
 X7kg+9SixRxlK0RMIlP+jmJbOhXBfgZS2DT63+GA2GByArJHqWPczfwr6toJXGpW
 6/y+nrhBE0pBJvWs+u+7m24TcK7c030MOSSSzvzggo6RP9woL+QiUSW/YBmAEB9u
 8vrMsDfKwXNvCAvimVUGFbS00lIcFd8pehE/84wgo5w+VAYS01CPDyfHLfIzYuYs
 vyOIOw/x/Kb3GCBEEnq8cf35DNVcURaYxRkS2jBBVG3uV4Qe9yFCjYptsCmw==
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=
 messagingengine.com; h=cc:cc:content-type:content-type:date:date
 :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to
 :message-id:mime-version:references:reply-to:subject:subject:to
 :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s=
 fm2; t=1728555497; x=1728641897; bh=f8KGo//n+aGXlD0U+2AaYpJFZcxP
 pp0nY4Gj6RFY7qw=; b=L4Vx30qQhQ3snUNbYJn9Yt/n9vBnvrjoazYiJqRXRd5i
 NmhjWfrufypjmmmjdetyZUOiedbpgrH5z00cniFQjOzpNM4I2eYRFrts4HusIuPe
 /095KiXPqkmLHwRQAadc9byv7PgaAANbePZBAC2QtN+PhTSzkmqGZiYfKQJLHtJW
 dDGXrGt1M6KXQGou4d6Wyy5DJYez8+0EV3I6W0BvN2c/S8TSf8mivCD1D0rqq3sE
 uiQeixCQKwIp0cV4PUTkYwkzsnCaFdJwGQcqn4rxgoOkKhdab1d3i9Cyh4q3QcZ3
 WunzPZjZHvGHpunXK6Yr09x0EyrZABHxL19XL1AOdA==
X-ME-Sender: <xms:6akHZ7P9miuuxGmbgjwkguZWzsrLRfMqQBlei_QwzVUyOV5QztRqfg>
 <xme:6akHZ1_z5jPMWEEZgV8pzq8sSU2_VWku3G4zpl257MPnOQYUknX1ZqnWHtoLJQjvN
 zJXAde13-EvWs0iOng>
X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeeftddrvdefhedgvdekucetufdoteggodetrfdotf
 fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdggtfgfnhhsuhgsshgtrhhisggvpdfu
 rfetoffkrfgpnffqhgenuceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnh
 htshculddquddttddmnecujfgurhepofggfffhvfevkfgjfhfutgesrgdtreerredtjeen
 ucfhrhhomhepfdffmhhithhrhicuifhuthhovhdfuceoughmihhtrhihsehguhhtohhvrd
 guvghvqeenucggtffrrghtthgvrhhnpeekveejuddvlefgueelvefhffekudetgfelvdeu
 leeghfelieeghfevteegfffhffenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmh
 epmhgrihhlfhhrohhmpegumhhithhrhiesghhuthhovhdruggvvhdpnhgspghrtghpthht
 ohepgedpmhhouggvpehsmhhtphhouhhtpdhrtghpthhtohepjeefgeekgeesuggvsggsuh
 hgshdrghhnuhdrohhrghdprhgtphhtthhopegvlhhiiiesghhnuhdrohhrghdprhgtphht
 thhopehpohhtsehgnhhurdhorhhgpdhrtghpthhtohepshhpfihhihhtthhonhesshhpfi
 hhihhtthhonhdrnhgrmhgv
X-ME-Proxy: <xmx:6akHZ6RR8UzzNbmEVPGgMSsWs1KGO_IbtLSE6_WxPIo5mF1jBpXsfg>
 <xmx:6akHZ_uGLFJzgpvOLQx29a4jOn3ogLm2Q-VnRVe55Iy9FjRROZIXiA>
 <xmx:6akHZze_4lPPf2LoJQgoM1e40m5PaXckFHWIRc9dqXMLxXE4uDNA_Q>
 <xmx:6akHZ71w8eh_iMt3l4JCKko-FmsguH9sC0cmjrOJlAs4BI71-KVKpw>
 <xmx:6akHZ_5QHpAvKCody-Zslwl75fe1FDCd1QUYoOo-An5YAqDpe7wr8u_b>
Feedback-ID: i07de48aa:Fastmail
Received: by mailuser.phl.internal (Postfix, from userid 501)
 id 26D982E60084; Thu, 10 Oct 2024 06:18:17 -0400 (EDT)
X-Mailer: MessagingEngine.com Webmail Interface
MIME-Version: 1.0
Date: Thu, 10 Oct 2024 12:17:52 +0200
From: "Dmitry Gutov" <dmitry@HIDDEN>
To: =?UTF-8?Q?Francesco_Potort=C3=AC?= <pot@HIDDEN>
Message-Id: <4f454dfa-1fab-4f97-ac19-0cc6914ca5de@HIDDEN>
In-Reply-To: <878quwix4c.fsf@HIDDEN>
References: <87tteaznog.fsf@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN>
 <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN>
 <86frpe2186.fsf@HIDDEN>
 <8e305b6d-8ca8-4437-990f-183ebc007d18@HIDDEN>
 <865xqa1ggi.fsf@HIDDEN>
 <ca89563f-b0d2-412a-9248-e4beb3ad7b84@HIDDEN>
 <86ttdtzoof.fsf@HIDDEN>
 <8d7dc133-9828-4023-821f-e4403f899f81@HIDDEN>
 <86ttdsxt6x.fsf@HIDDEN>
 <52cb1caa-9e7e-45df-b328-d60948d397f6@HIDDEN>
 <864j5rxca1.fsf@HIDDEN>
 <b59bf102-a9d8-4723-91ac-acc3f8ff3aa8@HIDDEN>
 <87a5fiijy9.fsf@HIDDEN>
 <bd6751c8-5504-48d7-82d7-a3e8849a1910@HIDDEN>
 <86jzelvjh4.fsf@HIDDEN>
 <8b6560a9-e2d6-42ae-ac1d-014700f21804@HIDDEN>
 <86wmiktzez.fsf@HIDDEN>
 <b0192173-3d8a-49da-8792-521b9b486568@HIDDEN>
 <86ldyzucdd.fsf@HIDDEN>
 <021c625b-adc9-4e19-819c-fe929583e503@HIDDEN>
 <86ed4ru41x.fsf@HIDDEN>
 <8d86f23e-fdc3-45a5-b3c8-cd4670813e21@HIDDEN>
 <86ploasq35.fsf@HIDDEN>
 <3e63f532-c6af-4923-880b-01a32cc667ec@HIDDEN>
 <878quwix4c.fsf@HIDDEN>
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
Content-Type: multipart/alternative; boundary=df510841c8144fd398cc60b60b5dcd22
X-Spam-Score: -0.7 (/)
X-Debbugs-Envelope-To: 73484
Cc: Eli Zaretskii <eliz@HIDDEN>, 73484 <at> debbugs.gnu.org,
 spwhitton@HIDDEN
X-Spam-Score: -1.7 (-)

--df510841c8144fd398cc60b60b5dcd22
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable

On Thu, Oct 10, 2024, at 3:07 AM, Francesco Potortì wrote:
> >Here is the nested loop, which if I comment out, makes the parse finish
> >in ~20 seconds, with all the extra files (except *.js), or in 15s when
> >using with new flags.
> >
> >diff --git a/lib-src/etags.c b/lib-src/etags.c
> >index a822a823a90..331e3ffe816 100644
> >--- a/lib-src/etags.c
> >+++ b/lib-src/etags.c
> >@@ -1697,14 +1697,14 @@ process_file_name (char *file, language *lang)
> >        uncompressed_name = file;
> >      }
> >
> >-  /* If the canonicalized uncompressed name
> >-     has already been dealt with, skip it silently. */
> >-  for (fdp = fdhead; fdp != NULL; fdp = fdp->next)
> >-    {
> >-      assert (fdp->infname != NULL);
> >-      if (streq (uncompressed_name, fdp->infname))
> >-	goto cleanup;
> >-    }
> >+  /* /\* If the canonicalized uncompressed name */
> >+  /*    has already been dealt with, skip it silently. *\/ */
> >+  /* for (fdp = fdhead; fdp != NULL; fdp = fdp->next) */
> >+  /*   { */
> >+  /*     assert (fdp->infname != NULL); */
> >+  /*     if (streq (uncompressed_name, fdp->infname)) */
> >+  /* 	goto cleanup; */
> >+  /*   } */
> >
> >    inf = fopen (file, "r" FOPEN_BINARY);
> >    if (inf)
> >
> >This is basically a "uniqueness" operation using linear search, O(N^2).
>
> This is only for dealing with the case when the same file exists in both compressed and uncompressed form, and we are currently hitting the second one.  In that case, we should skip it.  Yes, this is a uniqueness test and yes, it is O(N^2) in the number of file names, but I doubt that this can explain a serious slowdown.
As mentioned in a previous email, I did recompile with that step removed, and the slowdown was gone.

The whole scan went down to ~20 seconds.
--df510841c8144fd398cc60b60b5dcd22--




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 10 Oct 2024 08:36:20 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Thu Oct 10 04:36:20 2024
Received: from localhost ([127.0.0.1]:58634 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1syoeh-0003Ih-NC
	for submit <at> debbugs.gnu.org; Thu, 10 Oct 2024 04:36:20 -0400
Received: from eggs.gnu.org ([209.51.188.92]:52250)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <eliz@HIDDEN>) id 1syoed-0003IS-0A
 for 73484 <at> debbugs.gnu.org; Thu, 10 Oct 2024 04:36:18 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <eliz@HIDDEN>)
 id 1syoeM-00072X-4c; Thu, 10 Oct 2024 04:35:58 -0400
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org;
 s=fencepost-gnu-org; h=MIME-version:References:Subject:In-Reply-To:To:From:
 Date; bh=SkinHbxsp3JwwWq2RxgG0JL/emyGmpfmjBTri2sIk8c=; b=QJw+dgVERQzWPnmfnvS5
 prHtTkplC/dUcQ+AUi7wq4SkLnIFRSkn08cnK5RNcLj5v0vTLHsOC+6qxueD5XYC89/tbhc5wVaq0
 Q7V6tk7ZSz1iW0pBUHecyXPOLPjXwMM06GG7wTNZHFomn4FfuglLqUwKXvJ45EPhnfziCBbBEtC7P
 QfYIcuBafQqYag8gGac2P2cdmi/mMGbeQXxl7p08zTZgyFvvTkA168YWEYq6NKkFfLi6dSQ6/lYzI
 /Fp1iI2ZivwljGFgzGb4tgX/p0cM43Tdqau1ORQrmsgzpVZ4hWldTS4s/xiU9yF0hasdoy5m2bbZi
 RuKbZNOZpvl6lA==;
Date: Thu, 10 Oct 2024 11:35:54 +0300
Message-Id: <86y12w1hjp.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Francesco =?iso-8859-1?Q?Potort=EC?= <pot@HIDDEN>
In-Reply-To: <875xq0icqa.fsf@HIDDEN> (message from Francesco
 =?iso-8859-1?Q?Potort=EC?= on Thu, 10 Oct 2024 10:27:57 +0200)
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
Cc: dmitry@HIDDEN, 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN

> From: Francesco Potortì <pot@HIDDEN>
> Date: Thu, 10 Oct 2024 10:27:57 +0200
> Cc: spwhitton@HIDDEN,
> 	73484 <at> debbugs.gnu.org,
> 	dmitry@HIDDEN
> 
> >That's not what I see in the code.  But it should be easy to count the
> >number of loop iterations in the use case we are talking about
> >(running etags on the geck-dev tree), so we don't need to argue about
> >facts.
> 
> Yes.  If finding a bottleneck is the objective, you should maybe instrument the string comparison functions so that you can count how many times they are called from different places.
> 
> I had a quick look at the whole code and in fact the only place I can find where you have O(N^2) behaviour seems to be file name comparison, and it still looks so strange to me that this can in fact cause significant delay.

We are using etags on a huge tree: about 375K files.  I think that's
the reason, because non-linear behaviors are like that: they are
insignificant with small sets, but huge with larger ones...

Profiles don't lie...




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Message-Id: <875xq0icqa.fsf@HIDDEN>
From: Francesco Potortì <pot@HIDDEN>
Date: Thu, 10 Oct 2024 10:27:57 +0200
To: Eli Zaretskii <eliz@HIDDEN>
In-Reply-To: <86ldyw3467.fsf@HIDDEN> (eliz@HIDDEN)
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
Cc: dmitry@HIDDEN, 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN

>> >This is basically a "uniqueness" operation using linear search, O(N^2).

> Thus, I
>believe the intent is to avoid duplicate tags if the same file was
>encountered twice in some way.

Yes.  Sorry, I spoke from memory and I was inaccurate.

>Note that canonicalize_filename in this case doesn't really do what
>its name seems to imply, e.g., relative file names will generally stay
>relative.

It canonicalises, that is, reduces to a standard common form.  It retains relative vs absolute difference.

>So specifying the same file once as relative and the other
>time as absolute will still process the file more than once.

From memory, I would say so, yes.  I have not checked right now.

>We need
>to use an inode test or equivalent, and probably use realpath or
>equivalent, to make the duplicate test reliable.
>Or maybe having the
>same file processed under different names is okay, since TAGS is for
>helping Emacs find the file, and so using relative names and symlinks
>is okay?

Yes, I think so.  And from memory I think it should be left unchanged.

>> I do not think that it makes sense to build a hash table for file names given on the command line, because the number of comparisons made on those names is generally vastly inferior to the number of comparisons used to search for tags.
>
>That's not what I see in the code.  But it should be easy to count the
>number of loop iterations in the use case we are talking about
>(running etags on the geck-dev tree), so we don't need to argue about
>facts.

Yes.  If finding a bottleneck is the objective, you should maybe instrument the string comparison functions so that you can count how many times they are called from different places.

I had a quick look at the whole code and in fact the only place I can find where you have O(N^2) behaviour seems to be file name comparison, and it still looks so strange to me that this can in fact cause significant delay.

I may certainly have missed something, but if that's really the case, the first thing is to look for code inefficiencies.  If this is really structural, one should first read all filenames, canonicalise and uniquify them, and only then create the tags.
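
The "read all filenames, canonicalise and uniquify them" step could be sketched as below.  This is only an illustration, not code from etags.c: the names cmp_names and uniquify_names are hypothetical, and the sketch assumes the names have already been canonicalized and collected in an array.  Sorting costs O(N log N) comparisons instead of O(N^2), at the price of changing the order in which files are processed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator for an array of C strings.  */
static int
cmp_names (const void *a, const void *b)
{
  return strcmp (*(char *const *) a, *(char *const *) b);
}

/* Sort NAMES (N entries) in place and drop adjacent duplicates.
   Return the number of unique names, which end up at the front of
   the array.  The names are assumed to be canonicalized already.  */
static size_t
uniquify_names (char **names, size_t n)
{
  size_t kept = 0;

  assert (names != NULL || n == 0);
  if (n == 0)
    return 0;
  qsort (names, n, sizeof *names, cmp_names);
  for (size_t i = 1; i < n; i++)
    if (strcmp (names[kept], names[i]) != 0)
      names[++kept] = names[i];
  return kept + 1;
}
```

After the call, only the first entries of the array (as many as the return value) would be passed on to tag creation.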




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Date: Thu, 10 Oct 2024 08:45:06 +0300
Message-Id: <86jzeg340t.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Francesco Potortì <pot@HIDDEN>
In-Reply-To: <877cagivmk.fsf@HIDDEN> (message from Francesco
 Potortì on Thu, 10 Oct 2024 03:39:47 +0200)
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
Cc: dmitry@HIDDEN, 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN

> From: Francesco Potortì <pot@HIDDEN>
> Date: Thu, 10 Oct 2024 03:39:47 +0200
> Cc: 73484 <at> debbugs.gnu.org,
> 	spwhitton@HIDDEN,
> 	Eli Zaretskii <eliz@HIDDEN>
> 
> I have just written:
> >There are two O(N^2) tests in the number of tags in C/C++ files, which depend on the two options "no-line-directive" and "no-duplicates".  Both options can be used to disable those checks, and both are off by default because the checks help produce a saner tags file and have no practical impact in most cases.  Both options are there because, in principle, the checks cause significant slowdown in huge tags files.
> 
> However, --no-line-directive exhibits the O(N^2) behaviour in the number of tags only for languages with the "metafile" property, currently only yacc files.  Unless you have a significant number of yacc files, the impact is O(N) in the number of tag candidates.  And --no-duplicates only matters when creating a ctags file.
> 
> Maybe you could give it a try and check whether --no-line-directive has any impact.

I already did that: the effect is null and void.  Which is not a
surprise, since there are only 3 Yacc files in this tree.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Date: Thu, 10 Oct 2024 08:41:52 +0300
Message-Id: <86ldyw3467.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Francesco Potortì <pot@HIDDEN>
In-Reply-To: <878quwix4c.fsf@HIDDEN> (message from Francesco
 Potortì on Thu, 10 Oct 2024 03:07:31 +0200)
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
Cc: dmitry@HIDDEN, 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN

> From: Francesco Potortì <pot@HIDDEN>
> Date: Thu, 10 Oct 2024 03:07:31 +0200
> Cc: 73484 <at> debbugs.gnu.org,
> 	spwhitton@HIDDEN,
> 	Eli Zaretskii <eliz@HIDDEN>
> 
> >+  /* /\* If the canonicalized uncompressed name */
> >+  /*    has already been dealt with, skip it silently. *\/ */
> >+  /* for (fdp = fdhead; fdp != NULL; fdp = fdp->next) */
> >+  /*   { */
> >+  /*     assert (fdp->infname != NULL); */
> >+  /*     if (streq (uncompressed_name, fdp->infname)) */
> >+  /* 	goto cleanup; */
> >+  /*   } */
> >
> >    inf = fopen (file, "r" FOPEN_BINARY);
> >    if (inf)
> >
> >This is basically a "uniqueness" operation using linear search, O(N^2).
> 
> This is only for dealing with the case when the same file exists in both compressed and uncompressed form, and we are currently hitting the second one.  In that case, we should skip it.  Yes, this is a uniqueness test and yes, it is O(N^2) in the number of file names, but I doubt that this can explain a serious slowdown.

Are you sure this is executed only for compressed files?  Maybe I'm
missing something, but that's not my reading of the code:

  compr = get_compressor_from_suffix (file, &ext);
  if (compr)
    {
      compressed_name = file;
      uncompressed_name = savenstr (file, ext - file);
    }
  else
    {
      compressed_name = NULL;
      uncompressed_name = file;
    }

  /* If the canonicalized uncompressed name
     has already been dealt with, skip it silently. */
  for (fdp = fdhead; fdp != NULL; fdp = fdp->next)
    {
      assert (fdp->infname != NULL);
      if (streq (uncompressed_name, fdp->infname))
	goto cleanup;
    }

As you see, if the file is not compressed by any known method, the
code sets compressed_name to NULL and uncompressed_name to the
canonicalized file.  But the loop doesn't test compressed_name, so it
is executed for all the files, compressed and uncompressed.  Thus, I
believe the intent is to avoid duplicate tags if the same file was
encountered twice in some way.

Note that canonicalize_filename in this case doesn't really do what
its name seems to imply, e.g., relative file names will generally stay
relative.  So specifying the same file once as relative and the other
time as absolute will still process the file more than once.  We need
to use an inode test or equivalent, and probably use realpath or
equivalent, to make the duplicate test reliable.  Or maybe having the
same file processed under different names is okay, since TAGS is for
helping Emacs find the file, and so using relative names and symlinks
is okay?

> >Is there a hash table we could use?
> 
> No, we have a hash table for C tags, and that's all.  It is useful because there are 34 keywords against which most strings in a C/C++ file are compared.  It makes sense to build hash tables for other languages where a similar situation happens.

The hash table we have was built by gperf, and that method can only be
used for fixed sets of strings known in advance.  We need a different
hash table for storing file names.
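
For illustration only, a run-time hash set for file names could look like the sketch below.  Everything in it is an assumption rather than existing etags.c code: the names name_table, hash_name and name_seen_p, the fixed bucket count, and the choice of the 32-bit FNV-1a hash.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NAME_BUCKETS 65536      /* hypothetical size; a power of 2 */

struct name_entry
{
  struct name_entry *next;
  const char *name;             /* not copied; caller keeps it alive */
};

static struct name_entry *name_table[NAME_BUCKETS];

/* 32-bit FNV-1a hash of the string S.  */
static uint32_t
hash_name (const char *s)
{
  uint32_t h = 2166136261u;

  for (; *s; s++)
    {
      h ^= (unsigned char) *s;
      h *= 16777619u;
    }
  return h;
}

/* Record NAME as seen.  Return true if an equal name was already in
   the table, false if NAME has just been added.  */
static bool
name_seen_p (const char *name)
{
  struct name_entry **bucket, *e;

  assert (name != NULL);
  bucket = &name_table[hash_name (name) & (NAME_BUCKETS - 1)];
  for (e = *bucket; e != NULL; e = e->next)
    if (strcmp (e->name, name) == 0)
      return true;
  e = malloc (sizeof *e);
  if (e == NULL)
    abort ();
  e->name = name;
  e->next = *bucket;
  *bucket = e;
  return false;
}
```

With a table like this, the duplicate check would become a single call per file, with the expected cost depending on the bucket chain length rather than on the total number of files seen so far.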

> I do not think that it makes sense to build a hash table for file names given on the command line, because the number of comparisons made on those names is generally vastly inferior to the number of comparisons used to search for tags.

That's not what I see in the code.  But it should be easy to count the
number of loop iterations in the use case we are talking about
(running etags on the geck-dev tree), so we don't need to argue about
facts.
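
Counting the iterations could be done along the lines of this sketch, which mimics the loop quoted above on a stand-in list.  The names fdesc_sketch, counted_streq, seen_or_add and streq_calls are hypothetical; only the shape of the loop follows the quoted code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

struct fdesc_sketch             /* hypothetical stand-in for the file list */
{
  struct fdesc_sketch *next;
  const char *infname;
};

static unsigned long long streq_calls;  /* bumped on every comparison */

static bool
counted_streq (const char *s, const char *t)
{
  streq_calls++;
  return strcmp (s, t) == 0;
}

/* Mimic the duplicate check: scan the list for NAME and push it if it
   is not there.  Return true if NAME was a duplicate.  */
static bool
seen_or_add (struct fdesc_sketch **head, const char *name)
{
  struct fdesc_sketch *fdp;

  for (fdp = *head; fdp != NULL; fdp = fdp->next)
    {
      assert (fdp->infname != NULL);
      if (counted_streq (name, fdp->infname))
        return true;
    }
  fdp = malloc (sizeof *fdp);
  if (fdp == NULL)
    abort ();
  fdp->infname = name;
  fdp->next = *head;
  *head = fdp;
  return false;
}
```

When all N names are distinct, the counter ends at N(N-1)/2.  For the roughly 375K files mentioned elsewhere in this thread that would be about 7.0e10 string comparisons, assuming no duplicates at all.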

> >>   . Some files have their language identified by means other than their
> >>     names or extensions: those are the languages that have
> >>     "interpreters" defined in etags.c
> 
> The interpreter is the token that comes after #!, with the possible exception of "env", in which case the interpreter is the second token after #!.
> 
> There are two O(N^2) tests in the number of tags in C/C++ files, which depend on the two options "no-line-directive" and "no-duplicates".  Both options can be used to disable those checks, and both are off by default because the checks help produce a saner tags file and have no practical impact in most cases.  Both options are there because, in principle, the checks cause significant slowdown in huge tags files.

AFAIU, --no-duplicates is only for ctags, not for etags.  I don't see
how --no-duplicates could be relevant to the loop described above.  Am
I missing something?




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Date: Thu, 10 Oct 2024 08:13:11 +0300
Message-Id: <86o73s35i0.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <ba81b071-d5f4-4133-b5d6-e94684aec84b@HIDDEN> (message from
 Dmitry Gutov on Thu, 10 Oct 2024 01:22:13 +0300)
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
Cc: pot@HIDDEN, 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN

> Date: Thu, 10 Oct 2024 01:22:13 +0300
> Cc: pot@HIDDEN, spwhitton@HIDDEN, 73484 <at> debbugs.gnu.org
> From: Dmitry Gutov <dmitry@HIDDEN>
> 
> On 09/10/2024 22:11, Eli Zaretskii wrote:
> 
> >> This is basically a "uniqueness" operation using linear search, O(N^2).
> > 
> > Yes, this seems to be a protection against the same file name
> > mentioned more than once on the command line..
> 
> Or, maybe more likely, against having symlinks scanned if the symlink 
> target is also in the passed list.

Yes, that, but also any other possible ways of specifying the same
file twice, like having a file both compressed and uncompressed, etc.

> >> Is there a hash table we could use?
> > 
> > Something like that should do, yes.
> 
> Can we use search.h? hcreate/hsearch/etc. IIUC it's not in the C standard, 
> and 
> https://www.gnu.org/savannah-checkouts/gnu/gnulib/manual/html_node/hcreate.html 
> says it's available on certain platforms.

I think we shouldn't: it is not sufficiently portable and Gnulib
doesn't have an implementation for it for those platforms that don't
have it.

We could perhaps use the standard tsearch (although it will be more
expensive).  Alternatively, we could steal the hash table code from
somewhere, for example, from Gawk.
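
A minimal sketch of the tsearch alternative, assuming the POSIX <search.h> interface is available.  The names seen_root, cmp_name_keys and tree_seen_p are hypothetical.  Each lookup walks a binary tree, which is O(log N) when the implementation keeps the tree balanced; that is the sense in which it costs more than a hash table.

```c
#include <assert.h>
#include <search.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

static void *seen_root;         /* root of the tree; NULL means empty */

/* Keys are stored as plain pointers to NUL-terminated strings.  */
static int
cmp_name_keys (const void *a, const void *b)
{
  return strcmp (a, b);
}

/* Return true if a name equal to NAME is already in the tree;
   otherwise insert NAME and return false.  NAME is stored by pointer,
   so it must outlive the tree.  */
static bool
tree_seen_p (const char *name)
{
  assert (name != NULL);
  if (tfind (name, &seen_root, cmp_name_keys) != NULL)
    return true;
  if (tsearch (name, &seen_root, cmp_name_keys) == NULL)
    abort ();                   /* out of memory */
  return false;
}
```

The separate tfind call costs a second tree walk for new names but keeps the "already seen" answer independent of pointer identity; a single tsearch call could be used instead if every name is a distinct allocation.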

> >> Or perhaps we would skip the search when the canonicalized name is the
> >> same as the original one.
> > 
> > That's not the same as the loop above does, I think.
> 
> If we assumed the duplicate check is only necessary for symlinks, and 
> there is on average a small number of them, I think we could avoid using 
> a hash table. But passing the same exact file 2 times would result in 
> duplicate tags.

canonicalize_filename in etags.c does not resolve symlinks, AFAICT, so
the symlink scenario will not be solved by that.  We'd need realpath
or its equivalent, I think?
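
For illustration, comparing two names after symlink resolution could look like the sketch below.  It assumes a POSIX.1-2008 realpath that allocates its result when passed NULL; same_realpath_p is a hypothetical helper, not something in etags.c.

```c
#define _XOPEN_SOURCE 700       /* assumption: needed for realpath in strict modes */
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Return true if A and B resolve to the same absolute, symlink-free
   path.  Return false if either path cannot be resolved.  */
static bool
same_realpath_p (const char *a, const char *b)
{
  char *ra, *rb;
  bool same;

  assert (a != NULL && b != NULL);
  ra = realpath (a, NULL);
  rb = realpath (b, NULL);
  same = ra != NULL && rb != NULL && strcmp (ra, rb) == 0;
  free (ra);
  free (rb);
  return same;
}
```

In a real duplicate check the resolved name would be computed once per file and stored as the lookup key, while the name written to TAGS could stay as the user gave it.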




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Message-Id: <877cagivmk.fsf@HIDDEN>
From: Francesco Potortì <pot@HIDDEN>
Date: Thu, 10 Oct 2024 03:39:47 +0200
To: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <3e63f532-c6af-4923-880b-01a32cc667ec@HIDDEN>
 (dmitry@HIDDEN)
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
Cc: Eli Zaretskii <eliz@HIDDEN>, 73484 <at> debbugs.gnu.org,
 spwhitton@HIDDEN

I have just written:
>There are two O^2 test in the number of tags in C/C++ files which depend on the two options "no-line-directive" and "no-duplicates".  Both options are usable to disable those checks and both are off by default because they help producing a more sane tags file and have no practical impact in most cases.  Both are there because, in principle, they cause significant slowdown in huge tags files.

However, the check controlled by --no-line-directive exhibits the O(N^2) behaviour in the number of tags only for languages with the "metafile" property, currently only yacc files.  Unless you have a significant number of yacc files, the cost is linear, O(N), in the number of tag candidates.  And --no-duplicates only matters when creating a ctags file.

Maybe you could give it a try and check whether --no-line-directive has any impact.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 10 Oct 2024 01:10:04 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Wed Oct 09 21:10:04 2024
Received: from localhost ([127.0.0.1]:58053 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1syhgp-0004qt-Pq
	for submit <at> debbugs.gnu.org; Wed, 09 Oct 2024 21:10:04 -0400
Received: from eggs.gnu.org ([209.51.188.92]:33638)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <pot@HIDDEN>) id 1syhgo-0004qH-4T
 for 73484 <at> debbugs.gnu.org; Wed, 09 Oct 2024 21:10:02 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <pot@HIDDEN>)
 id 1syheQ-0002gb-NC; Wed, 09 Oct 2024 21:07:34 -0400
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org;
 s=fencepost-gnu-org; h=MIME-Version:References:Subject:In-Reply-To:To:Date:
 From; bh=U1IKs2JrWqSEwQScERQawPPv+jzGjHHfi0BkgDZ8WmU=; b=ickUNWASDg/A64A/J8xk
 brYDLi7VEHf97yK1PJm8+HT0QAeZaB6uqQFiIbyMCpvC3vH7bm4Asi3i8BXyE3qYG8o+2pqs6fFH6
 DZhWNd4caOh3CuzdiSmFzYSFnFuwnmLpD3U2561n8g9BxBXNrl9fVlgJaWDBaf4DMUrVG1melS1Ag
 b5nKAk9qqvw9jBQhPVPdRNTbOMofm2M/sPzwfaLFNPLyy0sIqQkxpiXzkN67xayFgj85Qmmduxhgu
 /C5Y/W07BcTn2HW1dpxAUOHXz+OX9TWTGqFHeohbtKqDJMPY4q5AVY3nPpv+r1AuH/gzgOf+3Fm50
 wzemAFFDrCGFVw==;
Message-Id: <878quwix4c.fsf@HIDDEN>
From: =?utf-8?Q?Francesco_Potort=C3=AC?= <pot@HIDDEN>
Date: Thu, 10 Oct 2024 03:07:31 +0200
To: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <3e63f532-c6af-4923-880b-01a32cc667ec@HIDDEN>
 (dmitry@HIDDEN)
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
References: <87tteaznog.fsf@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN>
 <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN>
 <86frpe2186.fsf@HIDDEN>
 <8e305b6d-8ca8-4437-990f-183ebc007d18@HIDDEN>
 <865xqa1ggi.fsf@HIDDEN>
 <ca89563f-b0d2-412a-9248-e4beb3ad7b84@HIDDEN>
 <86ttdtzoof.fsf@HIDDEN>
 <8d7dc133-9828-4023-821f-e4403f899f81@HIDDEN>
 <86ttdsxt6x.fsf@HIDDEN>
 <52cb1caa-9e7e-45df-b328-d60948d397f6@HIDDEN>
 <864j5rxca1.fsf@HIDDEN>
 <b59bf102-a9d8-4723-91ac-acc3f8ff3aa8@HIDDEN>
 <87a5fiijy9.fsf@HIDDEN>
 <bd6751c8-5504-48d7-82d7-a3e8849a1910@HIDDEN>
 <86jzelvjh4.fsf@HIDDEN>
 <8b6560a9-e2d6-42ae-ac1d-014700f21804@HIDDEN>
 <86wmiktzez.fsf@HIDDEN>
 <b0192173-3d8a-49da-8792-521b9b486568@HIDDEN>
 <86ldyzucdd.fsf@HIDDEN>
 <021c625b-adc9-4e19-819c-fe929583e503@HIDDEN>
 <86ed4ru41x.fsf@HIDDEN>
 <8d86f23e-fdc3-45a5-b3c8-cd4670813e21@HIDDEN>
 <86ploasq35.fsf@HIDDEN> <3e63f532-c6af-4923-880b-01a32cc667ec@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain
Content-Transfer-Encoding: quoted-printable
Organization: The GNU project
X-fingerprint: 4B02 6187 5C03 D6B1 2E31  7666 09DF 2DC9 BE21 6115
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 73484
Cc: Eli Zaretskii <eliz@HIDDEN>, 73484 <at> debbugs.gnu.org,
 spwhitton@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -3.3 (---)

>Here is the nested loop; commenting it out makes the parse finish
>in ~20 seconds, with all the extra files (except *.js), or in 15s when
>using the new flags.
>
>diff --git a/lib-src/etags.c b/lib-src/etags.c
>index a822a823a90..331e3ffe816 100644
>--- a/lib-src/etags.c
>+++ b/lib-src/etags.c
>@@ -1697,14 +1697,14 @@ process_file_name (char *file, language *lang)
>        uncompressed_name = file;
>      }
>
>-  /* If the canonicalized uncompressed name
>-     has already been dealt with, skip it silently. */
>-  for (fdp = fdhead; fdp != NULL; fdp = fdp->next)
>-    {
>-      assert (fdp->infname != NULL);
>-      if (streq (uncompressed_name, fdp->infname))
>-	goto cleanup;
>-    }
>+  /* /\* If the canonicalized uncompressed name */
>+  /*    has already been dealt with, skip it silently. *\/ */
>+  /* for (fdp = fdhead; fdp != NULL; fdp = fdp->next) */
>+  /*   { */
>+  /*     assert (fdp->infname != NULL); */
>+  /*     if (streq (uncompressed_name, fdp->infname)) */
>+  /* 	goto cleanup; */
>+  /*   } */
>
>    inf = fopen (file, "r" FOPEN_BINARY);
>    if (inf)
>
>This is basically a "uniqueness" operation using linear search, O(N^2).

This is only for dealing with the case when the same file exists in both
compressed and uncompressed form, and we are currently hitting the second
one.  In that case, we should skip it.  Yes, this is a uniqueness test and
yes, it is O(N^2) in the number of file names, but I doubt that this can
explain a serious slowdown.

>Is there a hash table we could use?

No, we have a hash table for C tags, and that's all.  It is useful because
there are 34 keywords against which most strings in a C/C++ file are
compared.  It makes sense to build hash tables for other languages where a
similar situation happens.

I do not think that it makes sense to build a hash table for file names
given on the command line, because the number of comparisons made on those
names is generally far smaller than the number of comparisons used to
search for tags.

>>   . Some files have their language identified by means other than their
>>     names or extensions: those are the languages that have
>>     "interpreters" defined in etags.c

The interpreter is the token that comes after #!, with the possible
exception of "env", in which case the interpreter is the second token
after #!.

There are two O(N^2) tests in the number of tags in C/C++ files, which
depend on the two options "no-line-directive" and "no-duplicates".  Either
option disables the corresponding check; both options are off by default
because the checks help produce a saner tags file and have no practical
impact in most cases.  The options exist because, in principle, the checks
can cause significant slowdown with huge tags files.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 9 Oct 2024 22:22:38 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Wed Oct 09 18:22:38 2024
Received: from localhost ([127.0.0.1]:57875 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1syf4n-0002vA-N6
	for submit <at> debbugs.gnu.org; Wed, 09 Oct 2024 18:22:38 -0400
Received: from fout-a8-smtp.messagingengine.com ([103.168.172.151]:57695)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <dmitry@HIDDEN>) id 1syf4k-0002ut-0B
 for 73484 <at> debbugs.gnu.org; Wed, 09 Oct 2024 18:22:36 -0400
Received: from phl-compute-04.internal (phl-compute-04.phl.internal
 [10.202.2.44])
 by mailfout.phl.internal (Postfix) with ESMTP id 2D50313801EC;
 Wed,  9 Oct 2024 18:22:17 -0400 (EDT)
Received: from phl-mailfrontend-02 ([10.202.2.163])
 by phl-compute-04.internal (MEProxy); Wed, 09 Oct 2024 18:22:17 -0400
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gutov.dev; h=cc
 :cc:content-transfer-encoding:content-type:content-type:date
 :date:from:from:in-reply-to:in-reply-to:message-id:mime-version
 :references:reply-to:subject:subject:to:to; s=fm1; t=1728512537;
 x=1728598937; bh=UTYe95bCKJAVbWeuv5Pd9jTMVo0fKETobenHyKZ3ePs=; b=
 fLr+DNI0OL/njdGoVNoZS0CnhclIcqUnNQh0hm01Ge6cyQXYbzwnFc8FjlYCweFU
 l3NW3joOc9HxiNaqHvNYM6aPSvoxu1d2+AyQ9guuWWxRhOsy0fwNsUy8gTh+bk31
 ZwFSKtEw9rLS2Pm8GIfpZbcDRGJ5k5pyl/BR0en9Q/OABLaf8GZQ1J1CLWbBY2pJ
 CsAthEHXKcOLtytrT2ETzeyzZd+DFmlPAsUk0hT6kun83UoCxF1maQ9/j2u8Di3q
 QS+Jid4/mFkAZ8+7bq3XFGC1xaxMZ3e0/kMNpHCEaGotYmCVc9RAiw9zH+mTSlpg
 4w8J7AjoJ49zODR5fJaGDw==
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=
 messagingengine.com; h=cc:cc:content-transfer-encoding
 :content-type:content-type:date:date:feedback-id:feedback-id
 :from:from:in-reply-to:in-reply-to:message-id:mime-version
 :references:reply-to:subject:subject:to:to:x-me-proxy:x-me-proxy
 :x-me-sender:x-me-sender:x-sasl-enc; s=fm2; t=1728512537; x=
 1728598937; bh=UTYe95bCKJAVbWeuv5Pd9jTMVo0fKETobenHyKZ3ePs=; b=P
 7Q+CEwnuANYI/wSrn70zyPFc21hgEyQPj4EKj9fWCeiJK/W9iC4As/WN+P+Euxzw
 tYZSfDpIxhEBwPiWJV7Fm44tP8WKob3XcLvkgFifUWypAUew/KWH+DnoR53jswQB
 /jvtdkp+cw9zdG6vtCVl3D95z13z8wp1obKOcM2/Hmhi2M40efq6mkd5A22NQ73t
 SGhi20ignTsi7tIgq6vUbPK1MCs4u1BW10wl7oA4OSeeY2HC03fh0Sv69eMsVqfz
 kUOqoQCHfP3qozwIJlbscpLhI00Y+bj3v169gwKetdNrYcdE8EvX7nXH3A3PRfaA
 lqrvR7CzRq8rHmuYo5Yxw==
X-ME-Sender: <xms:GAIHZ8Fwhu_X0Ix0xUwf-GTFmMtaw5kB9_SPfHSiDKKClP2hk7J5Xw>
 <xme:GAIHZ1WEbV0vuRF3GE2OKz83Vo8VmgxiwTRRZfZMnv_2_OgoPNeAdSC78uu4oVx-E
 b2HbRSJtHcujYp7GSo>
X-ME-Received: <xmr:GAIHZ2IOQtiIc9yTts7et1hct3SmCCn9g-S6fMe4vqcd4HvcdQWuFHlN0Y9D5yXGUYdjz9mfqWSf7Yrpac6kqGapsoCGFg>
X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeeftddrvdefgedgtdelucetufdoteggodetrfdotf
 fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdggtfgfnhhsuhgsshgtrhhisggvpdfu
 rfetoffkrfgpnffqhgenuceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnh
 htshculddquddttddmnecujfgurhepkfffgggfuffvvehfhfgjtgfgsehtjeertddtvdej
 necuhfhrohhmpeffmhhithhrhicuifhuthhovhcuoegumhhithhrhiesghhuthhovhdrug
 gvvheqnecuggftrfgrthhtvghrnhepffeifedvleeukedtgfelieegudfgveekfeejveej
 ffetffeuueeugefhveeiuddvnecuffhomhgrihhnpehgnhhurdhorhhgnecuvehluhhsth
 gvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhepughmihhtrhihsehguhht
 ohhvrdguvghvpdhnsggprhgtphhtthhopeegpdhmohguvgepshhmthhpohhuthdprhgtph
 htthhopegvlhhiiiesghhnuhdrohhrghdprhgtphhtthhopehpohhtsehgnhhurdhorhhg
 pdhrtghpthhtohepshhpfihhihhtthhonhesshhpfihhihhtthhonhdrnhgrmhgvpdhrtg
 hpthhtohepjeefgeekgeesuggvsggsuhhgshdrghhnuhdrohhrgh
X-ME-Proxy: <xmx:GAIHZ-GXR_Dc_aWb-7AunlhVMFknNFh4OOl_LTatW58EKP5fmt72ig>
 <xmx:GAIHZyWVn_5OwFFVOO1Gtudr1BU0k5uOl0_3h7uvyUU5U7BeXJ-cjA>
 <xmx:GAIHZxNb9N40uUYuWfk8d4vOA8ckU9EMuNpAF8k8Bm9dI1ZeAS20XA>
 <xmx:GAIHZ51q_1kQirXiOqtc4aHJGlTTqueZOAe8-8RsHsPAT-UqpK_C2g>
 <xmx:GQIHZwzQxn33vjR2qbbM1f1syUHBM_4XyR-LxE-LHsL0ZTNpzYZd1OcQ>
Feedback-ID: i07de48aa:Fastmail
Received: by mail.messagingengine.com (Postfix) with ESMTPA; Wed,
 9 Oct 2024 18:22:15 -0400 (EDT)
Message-ID: <ba81b071-d5f4-4133-b5d6-e94684aec84b@HIDDEN>
Date: Thu, 10 Oct 2024 01:22:13 +0300
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
To: Eli Zaretskii <eliz@HIDDEN>
References: <87tteaznog.fsf@HIDDEN> <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN>
 <86frpe2186.fsf@HIDDEN>
 <8e305b6d-8ca8-4437-990f-183ebc007d18@HIDDEN>
 <865xqa1ggi.fsf@HIDDEN>
 <ca89563f-b0d2-412a-9248-e4beb3ad7b84@HIDDEN>
 <86ttdtzoof.fsf@HIDDEN>
 <8d7dc133-9828-4023-821f-e4403f899f81@HIDDEN>
 <86ttdsxt6x.fsf@HIDDEN>
 <52cb1caa-9e7e-45df-b328-d60948d397f6@HIDDEN>
 <864j5rxca1.fsf@HIDDEN>
 <b59bf102-a9d8-4723-91ac-acc3f8ff3aa8@HIDDEN>
 <87a5fiijy9.fsf@HIDDEN>
 <bd6751c8-5504-48d7-82d7-a3e8849a1910@HIDDEN>
 <86jzelvjh4.fsf@HIDDEN>
 <8b6560a9-e2d6-42ae-ac1d-014700f21804@HIDDEN>
 <86wmiktzez.fsf@HIDDEN>
 <b0192173-3d8a-49da-8792-521b9b486568@HIDDEN>
 <86ldyzucdd.fsf@HIDDEN>
 <021c625b-adc9-4e19-819c-fe929583e503@HIDDEN>
 <86ed4ru41x.fsf@HIDDEN>
 <8d86f23e-fdc3-45a5-b3c8-cd4670813e21@HIDDEN>
 <86ploasq35.fsf@HIDDEN>
 <3e63f532-c6af-4923-880b-01a32cc667ec@HIDDEN>
 <86v7y12ism.fsf@HIDDEN>
Content-Language: en-US
From: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <86v7y12ism.fsf@HIDDEN>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Score: -0.7 (/)
X-Debbugs-Envelope-To: 73484
Cc: pot@HIDDEN, 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -1.7 (-)

On 09/10/2024 22:11, Eli Zaretskii wrote:

>> This is basically a "uniqueness" operation using linear search, O(N^2).
> 
> Yes, this seems to be a protection against the same file name
> mentioned more than once on the command line.

Or, maybe more likely, against having symlinks scanned if the symlink 
target is also in the passed list.

>> Is there a hash table we could use?
> 
> Something like that should do, yes.

Can we use search.h? hcreate/hsearch/etc. IIUC it's not in the C standard,
and
https://www.gnu.org/savannah-checkouts/gnu/gnulib/manual/html_node/hcreate.html
says it's available on certain platforms.
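
To make the idea concrete, here is a rough sketch of a "seen file names" set built on the POSIX hcreate/hsearch interface from search.h.  The wrapper names (seen_names_init, seen_name, seen_names_free) are hypothetical and nothing like this exists in etags.c; the sketch also assumes an upper bound on the number of names is known in advance, since the table cannot grow.

```c
#include <assert.h>
#include <search.h>
#include <stddef.h>

/* Hypothetical wrapper: create the single process-wide table, sized
   for EXPECTED names.  Return 1 on success, 0 on failure.  */
int
seen_names_init (size_t expected)
{
  return hcreate (expected) != 0;
}

/* Hypothetical wrapper: return 1 if NAME was recorded before, 0 if it
   has just been recorded, -1 if the table is full.  hsearch stores the
   key pointer rather than a copy, so NAME must stay valid for as long
   as the table exists.  */
int
seen_name (char *name)
{
  ENTRY item, *found;

  assert (name != NULL);
  item.key = name;
  item.data = NULL;

  found = hsearch (item, FIND);
  if (found != NULL)
    return 1;
  return hsearch (item, ENTER) != NULL ? 0 : -1;
}

/* Hypothetical wrapper: release the table.  Whether hdestroy also
   frees the keys differs between systems, so check the local
   documentation before relying on either behavior.  */
void
seen_names_free (void)
{
  hdestroy ();
}
```

The linear scan over the list of already processed files would then become a single seen_name call per file.  The fixed size and the single global table are the main drawbacks of this interface, on top of the portability question.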

>> Or perhaps we would skip the search when the canonicalized name is the
>> same as the original one.
> 
> That's not the same as the loop above does, I think.

If we assumed the duplicate check is only necessary for symlinks, and 
there is on average a small number of them, I think we could avoid using 
a hash table. But passing the same exact file 2 times would result in 
duplicate tags.

>> I guess someone might ask for flag "--no-decompress", sometime.
> 
> Yes, but it's also easy to exclude them via 'find'.

Or through etags-regen-ignores.

>>>    . Some files have their language identified by means other than their
>>>      names or extensions: those are the languages that have
>>>      "interpreters" defined in etags.c.  Shell scripts is one such case,
>>>      but not the only one.  So when etags-regen.el passes only files
>>>      with known extensions to etags, it misses those files from TAGS.
>>>      As one example, the file js/src/devtools/rootAnalysis/run_complete
>>>      in the gecko-dev tree is a Perl script, but has no .pl extension.
>>
>> This sounds the same as the "hashbang" files that we mentioned
>> previously. It makes sense for the scan to take longer, of course,
>> proportional to the number of the detected files.
> 
> My point was that if someone wants all the Python files, say,
> submitting only Python extensions to etags might miss some Python
> scripts.

Yes, that's the problem from the first comments of this report: to have 
hashbang files scanned, one can't use a whitelist of extensions. Using a 
blacklist should be fine, though.
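
The difference can be shown with a small sketch in C.  The function names and the extension lists are made up for illustration; the real filtering is done in etags-regen.el, not in C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the extension of FILE (the text after the last '.' of its
   last path component), or NULL if it has none.  */
static const char *
file_extension (const char *file)
{
  const char *base, *dot;

  assert (file != NULL);
  base = strrchr (file, '/');
  base = base ? base + 1 : file;
  dot = strrchr (base, '.');
  return (dot != NULL && dot != base) ? dot + 1 : NULL;
}

/* Return 1 if EXT is one of the strings in the NULL-terminated LIST.  */
static int
in_list (const char *ext, const char *const *list)
{
  if (ext == NULL)
    return 0;
  for (; *list != NULL; list++)
    if (strcmp (ext, *list) == 0)
      return 1;
  return 0;
}

/* Whitelist: scan FILE only if its extension is explicitly allowed.
   A script with no extension is never scanned.  */
int
passes_whitelist (const char *file, const char *const *allowed)
{
  return in_list (file_extension (file), allowed);
}

/* Blacklist: scan FILE unless its extension is explicitly ignored.
   A script with no extension is scanned, so etags can still look at
   its "#!" line.  */
int
passes_blacklist (const char *file, const char *const *ignored)
{
  return !in_list (file_extension (file), ignored);
}
```

With an allowed list such as { "pl", "py", "c" }, the extensionless Perl script js/src/devtools/rootAnalysis/run_complete is dropped by passes_whitelist, while passes_blacklist with an ignored list such as { "png", "gz" } lets it through.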




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 9 Oct 2024 19:11:54 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Wed Oct 09 15:11:54 2024
Received: from localhost ([127.0.0.1]:57673 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1syc6D-0001Ll-Va
	for submit <at> debbugs.gnu.org; Wed, 09 Oct 2024 15:11:54 -0400
Received: from eggs.gnu.org ([209.51.188.92]:43656)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <eliz@HIDDEN>) id 1syc69-0001LT-Cg
 for 73484 <at> debbugs.gnu.org; Wed, 09 Oct 2024 15:11:52 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <eliz@HIDDEN>)
 id 1syc5t-0005cP-70; Wed, 09 Oct 2024 15:11:33 -0400
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org;
 s=fencepost-gnu-org; h=References:Subject:In-Reply-To:To:From:Date:
 mime-version; bh=UsSvf/K03s3LHfhD3yE3HzpdxGhS576VWGjkaMUCFXw=; b=FjQ6nLJ/Mj51
 WPThyDnXsDtLHYvHh9kI0oyLikBVuGDury/TGS7bhd5WzObiGS3H1MWQyka642wln00B2y+PbX6v2
 wtj9CmQad0ciIGFIAZnHOAhox7YzfzermLv2PqzX0CUsZ4606fcOjzWg2t2fYXjU8G0bq1+ZFI5Nl
 0p++ibD3ZgznMnTJ5cZSMEPiz4MTRZlROuf+DRYcPIvKSlS9B5xyofBjMf7OJhLakVZoirw3XdJ9y
 eHjBlaC9zkqV2zPVx/VYG+HyTHBAeeK2xi8Qds+jOatZruyiU2WNbzafkjBP5Wy+iXmJ7svCLRX4i
 UmXBIfswItx8irScs9FEXg==;
Date: Wed, 09 Oct 2024 22:11:21 +0300
Message-Id: <86v7y12ism.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <3e63f532-c6af-4923-880b-01a32cc667ec@HIDDEN> (message from
 Dmitry Gutov on Wed, 9 Oct 2024 21:23:37 +0300)
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
References: <87tteaznog.fsf@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN>
 <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN>
 <86frpe2186.fsf@HIDDEN>
 <8e305b6d-8ca8-4437-990f-183ebc007d18@HIDDEN>
 <865xqa1ggi.fsf@HIDDEN>
 <ca89563f-b0d2-412a-9248-e4beb3ad7b84@HIDDEN>
 <86ttdtzoof.fsf@HIDDEN>
 <8d7dc133-9828-4023-821f-e4403f899f81@HIDDEN>
 <86ttdsxt6x.fsf@HIDDEN>
 <52cb1caa-9e7e-45df-b328-d60948d397f6@HIDDEN>
 <864j5rxca1.fsf@HIDDEN>
 <b59bf102-a9d8-4723-91ac-acc3f8ff3aa8@HIDDEN>
 <87a5fiijy9.fsf@HIDDEN>
 <bd6751c8-5504-48d7-82d7-a3e8849a1910@HIDDEN>
 <86jzelvjh4.fsf@HIDDEN>
 <8b6560a9-e2d6-42ae-ac1d-014700f21804@HIDDEN>
 <86wmiktzez.fsf@HIDDEN>
 <b0192173-3d8a-49da-8792-521b9b486568@HIDDEN>
 <86ldyzucdd.fsf@HIDDEN>
 <021c625b-adc9-4e19-819c-fe929583e503@HIDDEN>
 <86ed4ru41x.fsf@HIDDEN>
 <8d86f23e-fdc3-45a5-b3c8-cd4670813e21@HIDDEN>
 <86ploasq35.fsf@HIDDEN> <3e63f532-c6af-4923-880b-01a32cc667ec@HIDDEN>
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 73484
Cc: pot@HIDDEN, 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -3.3 (---)

> Date: Wed, 9 Oct 2024 21:23:37 +0300
> Cc: pot@HIDDEN, spwhitton@HIDDEN, 73484 <at> debbugs.gnu.org
> From: Dmitry Gutov <dmitry@HIDDEN>
> 
> 'perf' shows me a profile like this:
> 
>    67.31%  etags    libc.so.6          [.] __strcmp_avx2
>    26.29%  etags    etags              [.] process_file_name
>     2.00%  etags    etags              [.] streq
>     0.96%  etags    etags              [.] strcmp@plt
>     0.32%  etags    etags              [.] readline_internal
>     0.11%  etags    etags              [.] HTML_labels
>     0.08%  etags    [kernel.kallsyms]  [k] syscall_return_via_sysret
>     0.07%  etags    [kernel.kallsyms]  [k] kmem_cache_alloc
>     0.06%  etags    [kernel.kallsyms]  [k] entry_SYSRETQ_unsafe_stack
>     0.05%  etags    [kernel.kallsyms]  [k] perf_adjust_freq_unthr_context
>     0.04%  etags    etags              [.] c_strncasecmp
> 
> So... most of the time is spent in string comparison.
> 
> Here is the nested loop; commenting it out makes the parse finish
> in ~20 seconds, with all the extra files (except *.js), or in 15s when
> using the new flags.
> 
> diff --git a/lib-src/etags.c b/lib-src/etags.c
> index a822a823a90..331e3ffe816 100644
> --- a/lib-src/etags.c
> +++ b/lib-src/etags.c
> @@ -1697,14 +1697,14 @@ process_file_name (char *file, language *lang)
>         uncompressed_name = file;
>       }
> 
> -  /* If the canonicalized uncompressed name
> -     has already been dealt with, skip it silently. */
> -  for (fdp = fdhead; fdp != NULL; fdp = fdp->next)
> -    {
> -      assert (fdp->infname != NULL);
> -      if (streq (uncompressed_name, fdp->infname))
> -	goto cleanup;
> -    }
> +  /* /\* If the canonicalized uncompressed name */
> +  /*    has already been dealt with, skip it silently. *\/ */
> +  /* for (fdp = fdhead; fdp != NULL; fdp = fdp->next) */
> +  /*   { */
> +  /*     assert (fdp->infname != NULL); */
> +  /*     if (streq (uncompressed_name, fdp->infname)) */
> +  /* 	goto cleanup; */
> +  /*   } */
> 
>     inf = fopen (file, "r" FOPEN_BINARY);
>     if (inf)
> 
> This is basically a "uniqueness" operation using linear search, O(N^2).

Yes, this seems to be a protection against the same file name
mentioned more than once on the command line.

> Is there a hash table we could use?

Something like that should do, yes.

> Or perhaps we would skip the search when the canonicalized name is the 
> same as the original one.

That's not the same as the loop above does, I think.

> > Two aspects that I found trying to understand the long scan times, and
> > I'd like to mention so they don't become forgotten:
> > 
> >   . If there are compressed files in the directory, etags will
> >     uncompress them before it attempts to identify their language.
> >     There are 20 such files in the gecko-dev tree (removing them from
> >     the list of scanned files had only minor effect on the elapsed
> >     time, but it could be different in other cases, especially if
> >     uncompressing them produces very large files).
> 
> I guess someone might ask for flag "--no-decompress", sometime.

Yes, but it's also easy to exclude them via 'find'.

> >   . Some files have their language identified by means other than their
> >     names or extensions: those are the languages that have
> >     "interpreters" defined in etags.c.  Shell scripts is one such case,
> >     but not the only one.  So when etags-regen.el passes only files
> >     with known extensions to etags, it misses those files from TAGS.
> >     As one example, the file js/src/devtools/rootAnalysis/run_complete
> >     in the gecko-dev tree is a Perl script, but has no .pl extension.
> 
> This sounds the same as the "hashbang" files that we mentioned 
> previously. It makes sense for the scan to take longer, of course, 
> proportional to the number of the detected files.

My point was that if someone wants all the Python files, say,
submitting only Python extensions to etags might miss some Python
scripts.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 9 Oct 2024 18:24:01 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Wed Oct 09 14:24:01 2024
Received: from localhost ([127.0.0.1]:57620 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1sybLs-0006tm-QD
	for submit <at> debbugs.gnu.org; Wed, 09 Oct 2024 14:24:01 -0400
Received: from fhigh-a2-smtp.messagingengine.com ([103.168.172.153]:46223)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <dmitry@HIDDEN>) id 1sybLq-0006tV-Pd
 for 73484 <at> debbugs.gnu.org; Wed, 09 Oct 2024 14:23:59 -0400
Received: from phl-compute-12.internal (phl-compute-12.phl.internal
 [10.202.2.52])
 by mailfhigh.phl.internal (Postfix) with ESMTP id 0985B1140190;
 Wed,  9 Oct 2024 14:23:43 -0400 (EDT)
Received: from phl-mailfrontend-01 ([10.202.2.162])
 by phl-compute-12.internal (MEProxy); Wed, 09 Oct 2024 14:23:43 -0400
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gutov.dev; h=cc
 :cc:content-transfer-encoding:content-type:content-type:date
 :date:from:from:in-reply-to:in-reply-to:message-id:mime-version
 :references:reply-to:subject:subject:to:to; s=fm1; t=1728498223;
 x=1728584623; bh=nTX3T56RS7EDqurXRg+WUK+hNcuI1VFAQmrE30f+ZJo=; b=
 YC7alE8I7bgCRuRuSa+7OXmJw4PNXsv/o+z5bfqPM7kh2O6ksw0tqQb6ayrNL99D
 Li+KIwgefqpdUcVRSZeKl2v+40XsRB/uYVkpnPsUuJK3h3eKjwbc38cpmBtb/jTl
 E/OMSK+T65HjwZruSNBB1P4NyAF0nwxtCn6hwzXqUJeY7DpaIk8yEVumlw11vThc
 WyMI6OIawrhgyd9KCXe7czb8p+71SM/6+hFxDzyKX/0rwuCkZAYsu2pCO0R1kIAT
 /RLVZ9ivwxqylrDuU85HXVVMX7r8iQ0cGV70vkxST4ZTLHYy+hqxsRv4xXyoqCXz
 5wOZkUmbO0e119TV24MlFQ==
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=
 messagingengine.com; h=cc:cc:content-transfer-encoding
 :content-type:content-type:date:date:feedback-id:feedback-id
 :from:from:in-reply-to:in-reply-to:message-id:mime-version
 :references:reply-to:subject:subject:to:to:x-me-proxy:x-me-proxy
 :x-me-sender:x-me-sender:x-sasl-enc; s=fm2; t=1728498223; x=
 1728584623; bh=nTX3T56RS7EDqurXRg+WUK+hNcuI1VFAQmrE30f+ZJo=; b=T
 AyaTYh8MuRxgkoFtyRtHyvLhISYvObuz+lfnwyFujDy+AfHbdbjq9gVanfOcXS9+
 5CGISa+rzDBoJ5OQO9TKyWd2HJuCRyiHINugcJFFYGK2zBtkSTb4+M41TbtQwECV
 btOh6C5hGbcU9weHcVBKB7SQrJCLzeVrzORHEXQIy/gl4G8Zxw1AWg97+C4gBxvl
 hIlCsUU9gIBovpm9lUceJtHpNRWnkSSv6KxLnNw0Ogs8Vq1BlIv+tS5IPYWtxcS8
 HOchgG4JGaPSvFZSiCbWXs8FQ3KWUZqpHRffhJmkQZg3C75lgUmGpF+EqaCfAjoR
 HM+DXLnJvA7a8akGJ+Beg==
X-ME-Sender: <xms:LsoGZ3PrPegbiC5sdhSWWmA6cbADM0SjuDq49gOpkA7H3IkNnd9nBA>
 <xme:LsoGZx-u4JPfHMjVQTlzDypXQJCWz-zVcquIs5sHC3hIUPkkNYCwN3kAxA2SNeQEC
 K-fSLDeV6fy_6SUulg>
X-ME-Received: <xmr:LsoGZ2RVDjE9QHAnTrVUPg97BEL9T-VW9ZNqUBOUSxt_-hSt7DoDUW-0NWVcwKR-idhl4xBjf7i2mg6BfVIDDV7QTQ1cvg>
X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeeftddrvdeffedguddviecutefuodetggdotefrod
 ftvfcurfhrohhfihhlvgemucfhrghsthforghilhdpggftfghnshhusghstghrihgsvgdp
 uffrtefokffrpgfnqfghnecuuegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivg
 hnthhsucdlqddutddtmdenucfjughrpefkffggfgfuvfevfhfhjggtgfesthejredttddv
 jeenucfhrhhomhepffhmihhtrhihucfiuhhtohhvuceoughmihhtrhihsehguhhtohhvrd
 guvghvqeenucggtffrrghtthgvrhhnpeetudeljeegheetgfehgeejkeeuhedvveeikeeu
 fedtvddtveefhfdvveegudejheenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmh
 epmhgrihhlfhhrohhmpegumhhithhrhiesghhuthhovhdruggvvhdpnhgspghrtghpthht
 ohepgedpmhhouggvpehsmhhtphhouhhtpdhrtghpthhtohepvghlihiisehgnhhurdhorh
 hgpdhrtghpthhtohepphhothesghhnuhdrohhrghdprhgtphhtthhopehsphifhhhithht
 ohhnsehsphifhhhithhtohhnrdhnrghmvgdprhgtphhtthhopeejfeegkeegseguvggssg
 hughhsrdhgnhhurdhorhhg
X-ME-Proxy: <xmx:LsoGZ7uf5XgFmaYYIiYqqvO5FvAKlZMz55B4NdNaUBAKj_Y9kBPcVQ>
 <xmx:LsoGZ_fI7m1yeYFNXV80pV5VbGr49hcWZwneLSDrStKpYwjooq5LCg>
 <xmx:LsoGZ30X1E2P8-xO7z4JUmfdqc9_DD7UtYLGuYBQPbv2FGDowRTjgg>
 <xmx:LsoGZ7_fQTA0B2l68QhVQd6088gWT5j4gKNQM-OCWyCKz_F70pmgCA>
 <xmx:L8oGZ24HAm9o1nIGjQYWP9biinj7viOgMUpa7TQjKHSGEldmnLd4VNiQ>
Feedback-ID: i07de48aa:Fastmail
Received: by mail.messagingengine.com (Postfix) with ESMTPA; Wed,
 9 Oct 2024 14:23:41 -0400 (EDT)
Message-ID: <3e63f532-c6af-4923-880b-01a32cc667ec@HIDDEN>
Date: Wed, 9 Oct 2024 21:23:37 +0300
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
To: Eli Zaretskii <eliz@HIDDEN>
References: <87tteaznog.fsf@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN>
 <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN>
 <86frpe2186.fsf@HIDDEN>
 <8e305b6d-8ca8-4437-990f-183ebc007d18@HIDDEN>
 <865xqa1ggi.fsf@HIDDEN>
 <ca89563f-b0d2-412a-9248-e4beb3ad7b84@HIDDEN>
 <86ttdtzoof.fsf@HIDDEN>
 <8d7dc133-9828-4023-821f-e4403f899f81@HIDDEN>
 <86ttdsxt6x.fsf@HIDDEN>
 <52cb1caa-9e7e-45df-b328-d60948d397f6@HIDDEN>
 <864j5rxca1.fsf@HIDDEN>
 <b59bf102-a9d8-4723-91ac-acc3f8ff3aa8@HIDDEN>
 <87a5fiijy9.fsf@HIDDEN>
 <bd6751c8-5504-48d7-82d7-a3e8849a1910@HIDDEN>
 <86jzelvjh4.fsf@HIDDEN>
 <8b6560a9-e2d6-42ae-ac1d-014700f21804@HIDDEN>
 <86wmiktzez.fsf@HIDDEN>
 <b0192173-3d8a-49da-8792-521b9b486568@HIDDEN>
 <86ldyzucdd.fsf@HIDDEN>
 <021c625b-adc9-4e19-819c-fe929583e503@HIDDEN>
 <86ed4ru41x.fsf@HIDDEN>
 <8d86f23e-fdc3-45a5-b3c8-cd4670813e21@HIDDEN>
 <86ploasq35.fsf@HIDDEN>
Content-Language: en-US
From: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <86ploasq35.fsf@HIDDEN>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Score: -0.7 (/)
X-Debbugs-Envelope-To: 73484
Cc: pot@HIDDEN, 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -1.7 (-)

On 08/10/2024 16:04, Eli Zaretskii wrote:

>>>> I think that option will be useful, but for better benchmarks and for
>>>> end usability as well, I think we need the N^2 thing fixed as well.
>>>> Maybe before the rest of the changes.
>>>
>>> If this latter part is a precondition,
>>
>> I think we still could use the new flag, just not switch to it (no
>> extension filtering) by default yet.
> 
> OK, installed on master.  I leave it up to you whether to close the
> bug.

Thank you!

Before closing though, I'd like to look into the performance issue more.

>>> then someone else will have to
>>> work on this.  I have the new option coded and tested (and
>>> documented), but I don't intend to work on redesigning the core etags
>>> algorithms to remove the non-linear behavior, that's a much larger
>>> project which I currently cannot afford, sorry.
>>
>> Do you mind pointing at the places in the code where you already
>> noticed non-linear behavior?
> 
> The while-loop near line 2020, for example.

Thanks. This one must be proportional to the number of files such as 
*.y. There are only 2 in our big repo.

> Another one is the for-loop near line 1420, which deals with writing
> into TAGS the entries of files with no tags.

It's not a nested 'for' loop, though (right?), and it's called from 
'main'. That seems to mean it's just O(N) - also fine.

> There may be others, but those are what I saw.  Perhaps it is a good
> idea to profile etags while it scans the files during those 15 min, to
> see where it spends that much time, because I'm not sure even those
> loops can account for that.  It's possible there's something else at
> work here which we don't yet understand.

'perf' shows me a profile like this:

   67.31%  etags    libc.so.6          [.] __strcmp_avx2
   26.29%  etags    etags              [.] process_file_name
    2.00%  etags    etags              [.] streq
    0.96%  etags    etags              [.] strcmp@plt
    0.32%  etags    etags              [.] readline_internal
    0.11%  etags    etags              [.] HTML_labels
    0.08%  etags    [kernel.kallsyms]  [k] syscall_return_via_sysret
    0.07%  etags    [kernel.kallsyms]  [k] kmem_cache_alloc
    0.06%  etags    [kernel.kallsyms]  [k] entry_SYSRETQ_unsafe_stack
    0.05%  etags    [kernel.kallsyms]  [k] perf_adjust_freq_unthr_context
    0.04%  etags    etags              [.] c_strncasecmp

So... most of the time is spent in string comparison.

Here is the nested loop. Commenting it out makes the parse finish in 
~20 seconds with all the extra files (except *.js), or in 15 s when 
using the new flags.

diff --git a/lib-src/etags.c b/lib-src/etags.c
index a822a823a90..331e3ffe816 100644
--- a/lib-src/etags.c
+++ b/lib-src/etags.c
@@ -1697,14 +1697,14 @@ process_file_name (char *file, language *lang)
        uncompressed_name = file;
      }

-  /* If the canonicalized uncompressed name
-     has already been dealt with, skip it silently. */
-  for (fdp = fdhead; fdp != NULL; fdp = fdp->next)
-    {
-      assert (fdp->infname != NULL);
-      if (streq (uncompressed_name, fdp->infname))
-	goto cleanup;
-    }
+  /* /\* If the canonicalized uncompressed name */
+  /*    has already been dealt with, skip it silently. *\/ */
+  /* for (fdp = fdhead; fdp != NULL; fdp = fdp->next) */
+  /*   { */
+  /*     assert (fdp->infname != NULL); */
+  /*     if (streq (uncompressed_name, fdp->infname)) */
+  /* 	goto cleanup; */
+  /*   } */

    inf = fopen (file, "r" FOPEN_BINARY);
    if (inf)

This is basically a "uniqueness" operation using linear search, O(N^2).

Is there a hash table we could use?

Or perhaps we could skip the search when the canonicalized name is the 
same as the original one.
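
To make the hash table idea concrete, here is a minimal sketch of a string set with open addressing, so that the "already seen?" check is O(1) on average instead of a walk over the whole list. All names here (name_set, hash_name, name_set_init, name_set_add) are hypothetical and are not existing etags.c functions. The sketch stores pointers without copying the strings and simply asserts on allocation failure, which real code would handle differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical set of file names; not part of etags.c.  */
typedef struct
{
  const char **slots;		/* NULL marks an empty slot */
  size_t capacity;		/* always a power of two */
  size_t count;			/* number of stored names */
} name_set;

/* FNV-1a hash of a NUL-terminated string.  */
static uint64_t
hash_name (const char *s)
{
  uint64_t h = UINT64_C (14695981039346656037);
  for (; *s != '\0'; s++)
    {
      h ^= (unsigned char) *s;
      h *= UINT64_C (1099511628211);
    }
  return h;
}

/* Initialize SET with CAPACITY slots; CAPACITY must be a power of two.  */
static void
name_set_init (name_set *set, size_t capacity)
{
  set->slots = calloc (capacity, sizeof *set->slots);
  assert (set->slots != NULL);
  set->capacity = capacity;
  set->count = 0;
}

/* Add NAME to SET unless an equal string is already there.
   Return true if NAME was new.  The pointer is stored, not a copy,
   so NAME must outlive SET.  */
static bool
name_set_add (name_set *set, const char *name)
{
  /* Keep the load factor at or below 1/2 by doubling the table.  */
  if ((set->count + 1) * 2 > set->capacity)
    {
      name_set bigger;
      name_set_init (&bigger, set->capacity * 2);
      for (size_t i = 0; i < set->capacity; i++)
	if (set->slots[i] != NULL)
	  name_set_add (&bigger, set->slots[i]);
      free (set->slots);
      *set = bigger;
    }

  size_t mask = set->capacity - 1;
  size_t i = (size_t) hash_name (name) & mask;
  while (set->slots[i] != NULL)	/* linear probing */
    {
      if (strcmp (set->slots[i], name) == 0)
	return false;		/* already seen */
      i = (i + 1) & mask;
    }
  set->slots[i] = name;
  set->count++;
  return true;
}
```

With something like this, the loop in the diff above could in principle turn into a single call such as "if (!name_set_add (&seen, uncompressed_name)) goto cleanup;", assuming the stored string is one that stays allocated for the whole run. That part is untested against the real code.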

> Two aspects that I found while trying to understand the long scan times,
> and which I'd like to mention so they don't get forgotten:
> 
>   . If there are compressed files in the directory, etags will
>     uncompress them before it attempts to identify their language.
>     There are 20 such files in the gecko-dev tree (removing them from
>     the list of scanned files had only minor effect on the elapsed
>     time, but it could be different in other cases, especially if
>     uncompressing them produces very large files).

I guess someone might ask for a "--no-decompress" flag at some point.

>   . Some files have their language identified by means other than their
>     names or extensions: those are the languages that have
>     "interpreters" defined in etags.c.  Shell scripts are one such case,
>     but not the only one.  So when etags-regen.el passes only files
>     with known extensions to etags, it misses those files from TAGS.
>     As one example, the file js/src/devtools/rootAnalysis/run_complete
>     in the gecko-dev tree is a Perl script, but has no .pl extension.

This sounds the same as the "hashbang" files that we mentioned 
previously. It makes sense for the scan to take longer, of course, 
in proportion to the number of detected files.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 8 Oct 2024 13:05:14 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Tue Oct 08 09:05:14 2024
Received: from localhost ([127.0.0.1]:51627 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1sy9tp-0005aH-LY
	for submit <at> debbugs.gnu.org; Tue, 08 Oct 2024 09:05:14 -0400
Received: from eggs.gnu.org ([209.51.188.92]:52420)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <eliz@HIDDEN>) id 1sy9tk-0005Yh-3Z
 for 73484 <at> debbugs.gnu.org; Tue, 08 Oct 2024 09:05:11 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <eliz@HIDDEN>)
 id 1sy9tS-0004IZ-Ir; Tue, 08 Oct 2024 09:04:52 -0400
Date: Tue, 08 Oct 2024 16:04:46 +0300
Message-Id: <86ploasq35.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <8d86f23e-fdc3-45a5-b3c8-cd4670813e21@HIDDEN> (message from
 Dmitry Gutov on Tue, 8 Oct 2024 01:08:00 +0300)
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
References: <87tteaznog.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN> <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN> <86frpe2186.fsf@HIDDEN>
 <8e305b6d-8ca8-4437-990f-183ebc007d18@HIDDEN> <865xqa1ggi.fsf@HIDDEN>
 <ca89563f-b0d2-412a-9248-e4beb3ad7b84@HIDDEN> <86ttdtzoof.fsf@HIDDEN>
 <8d7dc133-9828-4023-821f-e4403f899f81@HIDDEN> <86ttdsxt6x.fsf@HIDDEN>
 <52cb1caa-9e7e-45df-b328-d60948d397f6@HIDDEN> <864j5rxca1.fsf@HIDDEN>
 <b59bf102-a9d8-4723-91ac-acc3f8ff3aa8@HIDDEN>
 <87a5fiijy9.fsf@HIDDEN>
 <bd6751c8-5504-48d7-82d7-a3e8849a1910@HIDDEN> <86jzelvjh4.fsf@HIDDEN>
 <8b6560a9-e2d6-42ae-ac1d-014700f21804@HIDDEN> <86wmiktzez.fsf@HIDDEN>
 <b0192173-3d8a-49da-8792-521b9b486568@HIDDEN> <86ldyzucdd.fsf@HIDDEN>
 <021c625b-adc9-4e19-819c-fe929583e503@HIDDEN> <86ed4ru41x.fsf@HIDDEN>
 <8d86f23e-fdc3-45a5-b3c8-cd4670813e21@HIDDEN>
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 73484
Cc: pot@HIDDEN, 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN
X-Spam-Score: -3.3 (---)

> Date: Tue, 8 Oct 2024 01:08:00 +0300
> Cc: pot@HIDDEN, spwhitton@HIDDEN, 73484 <at> debbugs.gnu.org
> From: Dmitry Gutov <dmitry@HIDDEN>
> 
> On 07/10/2024 22:05, Eli Zaretskii wrote:
> >> Date: Mon, 7 Oct 2024 20:36:47 +0300
> >> Cc: pot@HIDDEN, spwhitton@HIDDEN, 73484 <at> debbugs.gnu.org
> >> From: Dmitry Gutov <dmitry@HIDDEN>
> >>
> >> On 07/10/2024 19:05, Eli Zaretskii wrote:
> >>
> >>> So what is the conclusion from this?  Are you saying that the long
> >>> scan times in this large tree basically make this new no-fallbacks
> >>> option not very useful, since we still need to carefully include or
> >>> exclude certain files from the scan?  Or should I go ahead and install
> >>> these changes?
> >>
> >> I think that option will be useful, but for better benchmarks and for
> >> end usability as well, I think we need the N^2 thing fixed as well.
> >> Maybe before the rest of the changes.
> > 
> > If this latter part is a precondition,
> 
> I think we still could use the new flag, just not switch to it (no 
> extension filtering) by default yet.

OK, installed on master.  I leave it up to you whether to close the
bug.

> > then someone else will have to
> > work on this.  I have the new option coded and tested (and
> > documented), but I don't intend to work on redesigning the core etags
> > algorithms to remove the non-linear behavior, that's a much larger
> > project which I currently cannot afford, sorry.
> 
> Do you mind pointing at the places in the code where you already saw
> the non-linear behavior?

The while-loop near line 2020, for example.

Another one is the for-loop near line 1420, which deals with writing
into TAGS the entries of files with no tags.

There may be others, but those are what I saw.  Perhaps it is a good
idea to profile etags while it scans the files during those 15 min, to
see where it spends that much time, because I'm not sure even those
loops can account for that.  It's possible there's something else at
work here which we don't yet understand.

Two aspects that I found while trying to understand the long scan times,
and which I'd like to mention so they don't get forgotten:

 . If there are compressed files in the directory, etags will
   uncompress them before it attempts to identify their language.
   There are 20 such files in the gecko-dev tree (removing them from
   the list of scanned files had only minor effect on the elapsed
   time, but it could be different in other cases, especially if
   uncompressing them produces very large files).
 . Some files have their language identified by means other than their
   names or extensions: those are the languages that have
   "interpreters" defined in etags.c.  Shell scripts are one such case,
   but not the only one.  So when etags-regen.el passes only files
   with known extensions to etags, it misses those files from TAGS.
   As one example, the file js/src/devtools/rootAnalysis/run_complete
   in the gecko-dev tree is a Perl script, but has no .pl extension.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 7 Oct 2024 22:08:21 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Mon Oct 07 18:08:21 2024
Received: from localhost ([127.0.0.1]:49195 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1sxvts-00084m-Nl
	for submit <at> debbugs.gnu.org; Mon, 07 Oct 2024 18:08:21 -0400
Received: from fout-a2-smtp.messagingengine.com ([103.168.172.145]:58655)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <dmitry@HIDDEN>) id 1sxvtr-00084M-8S
 for 73484 <at> debbugs.gnu.org; Mon, 07 Oct 2024 18:08:20 -0400
Received: from phl-compute-02.internal (phl-compute-02.phl.internal
 [10.202.2.42])
 by mailfout.phl.internal (Postfix) with ESMTP id 76B6C13805E8;
 Mon,  7 Oct 2024 18:08:05 -0400 (EDT)
Received: from phl-mailfrontend-02 ([10.202.2.163])
 by phl-compute-02.internal (MEProxy); Mon, 07 Oct 2024 18:08:05 -0400
Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon,
 7 Oct 2024 18:08:03 -0400 (EDT)
Message-ID: <8d86f23e-fdc3-45a5-b3c8-cd4670813e21@HIDDEN>
Date: Tue, 8 Oct 2024 01:08:00 +0300
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
To: Eli Zaretskii <eliz@HIDDEN>
References: <87tteaznog.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN> <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN> <86frpe2186.fsf@HIDDEN>
 <8e305b6d-8ca8-4437-990f-183ebc007d18@HIDDEN> <865xqa1ggi.fsf@HIDDEN>
 <ca89563f-b0d2-412a-9248-e4beb3ad7b84@HIDDEN> <86ttdtzoof.fsf@HIDDEN>
 <8d7dc133-9828-4023-821f-e4403f899f81@HIDDEN> <86ttdsxt6x.fsf@HIDDEN>
 <52cb1caa-9e7e-45df-b328-d60948d397f6@HIDDEN> <864j5rxca1.fsf@HIDDEN>
 <b59bf102-a9d8-4723-91ac-acc3f8ff3aa8@HIDDEN>
 <87a5fiijy9.fsf@HIDDEN>
 <bd6751c8-5504-48d7-82d7-a3e8849a1910@HIDDEN> <86jzelvjh4.fsf@HIDDEN>
 <8b6560a9-e2d6-42ae-ac1d-014700f21804@HIDDEN> <86wmiktzez.fsf@HIDDEN>
 <b0192173-3d8a-49da-8792-521b9b486568@HIDDEN> <86ldyzucdd.fsf@HIDDEN>
 <021c625b-adc9-4e19-819c-fe929583e503@HIDDEN> <86ed4ru41x.fsf@HIDDEN>
Content-Language: en-US
From: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <86ed4ru41x.fsf@HIDDEN>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Score: -0.7 (/)
X-Debbugs-Envelope-To: 73484
Cc: pot@HIDDEN, 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN
X-Spam-Score: -1.7 (-)

On 07/10/2024 22:05, Eli Zaretskii wrote:
>> Date: Mon, 7 Oct 2024 20:36:47 +0300
>> Cc: pot@HIDDEN, spwhitton@HIDDEN, 73484 <at> debbugs.gnu.org
>> From: Dmitry Gutov <dmitry@HIDDEN>
>>
>> On 07/10/2024 19:05, Eli Zaretskii wrote:
>>
>>> So what is the conclusion from this?  Are you saying that the long
>>> scan times in this large tree basically make this new no-fallbacks
>>> option not very useful, since we still need to carefully include or
>>> exclude certain files from the scan?  Or should I go ahead and install
>>> these changes?
>>
>> I think that option will be useful, but for better benchmarks and for
>> end usability as well, I think we need the N^2 thing fixed as well.
>> Maybe before the rest of the changes.
> 
> If this latter part is a precondition,

I think we still could use the new flag, just not switch to it (no 
extension filtering) by default yet.

> then someone else will have to
> work on this.  I have the new option coded and tested (and
> documented), but I don't intend to work on redesigning the core etags
> algorithms to remove the non-linear behavior, that's a much larger
> project which I currently cannot afford, sorry.

Do you mind pointing at the places in the code where you already saw 
the non-linear behavior?




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 7 Oct 2024 19:05:58 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Mon Oct 07 15:05:57 2024
Received: from localhost ([127.0.0.1]:48196 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1sxt3N-0006MM-Io
	for submit <at> debbugs.gnu.org; Mon, 07 Oct 2024 15:05:57 -0400
Received: from eggs.gnu.org ([209.51.188.92]:55642)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <eliz@HIDDEN>) id 1sxt3L-0006M7-Eo
 for 73484 <at> debbugs.gnu.org; Mon, 07 Oct 2024 15:05:56 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <eliz@HIDDEN>)
 id 1sxt37-0002f4-Sv; Mon, 07 Oct 2024 15:05:41 -0400
Date: Mon, 07 Oct 2024 22:05:30 +0300
Message-Id: <86ed4ru41x.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <021c625b-adc9-4e19-819c-fe929583e503@HIDDEN> (message from
 Dmitry Gutov on Mon, 7 Oct 2024 20:36:47 +0300)
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
References: <87tteaznog.fsf@HIDDEN>
 <ea10f340-9b46-4199-93fc-274c5e81ace4@HIDDEN>
 <87o74c1ce1.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN>
 <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN>
 <86frpe2186.fsf@HIDDEN>
 <8e305b6d-8ca8-4437-990f-183ebc007d18@HIDDEN>
 <865xqa1ggi.fsf@HIDDEN>
 <ca89563f-b0d2-412a-9248-e4beb3ad7b84@HIDDEN>
 <86ttdtzoof.fsf@HIDDEN>
 <8d7dc133-9828-4023-821f-e4403f899f81@HIDDEN>
 <86ttdsxt6x.fsf@HIDDEN>
 <52cb1caa-9e7e-45df-b328-d60948d397f6@HIDDEN>
 <864j5rxca1.fsf@HIDDEN>
 <b59bf102-a9d8-4723-91ac-acc3f8ff3aa8@HIDDEN>
 <87a5fiijy9.fsf@HIDDEN>
 <bd6751c8-5504-48d7-82d7-a3e8849a1910@HIDDEN>
 <86jzelvjh4.fsf@HIDDEN>
 <8b6560a9-e2d6-42ae-ac1d-014700f21804@HIDDEN>
 <86wmiktzez.fsf@HIDDEN>
 <b0192173-3d8a-49da-8792-521b9b486568@HIDDEN>
 <86ldyzucdd.fsf@HIDDEN> <021c625b-adc9-4e19-819c-fe929583e503@HIDDEN>
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 73484
Cc: pot@HIDDEN, 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN
X-Spam-Score: -3.3 (---)

> Date: Mon, 7 Oct 2024 20:36:47 +0300
> Cc: pot@HIDDEN, spwhitton@HIDDEN, 73484 <at> debbugs.gnu.org
> From: Dmitry Gutov <dmitry@HIDDEN>
> 
> On 07/10/2024 19:05, Eli Zaretskii wrote:
> 
> > So what is the conclusion from this?  Are you saying that the long
> > scan times in this large tree basically make this new no-fallbacks
> > option not very useful, since we still need to carefully include or
> > exclude certain files from the scan?  Or should I go ahead and install
> > these changes?
> 
> I think that option will be useful, but for better benchmarks and for 
> end usability as well, I think we need the N^2 thing fixed as well. 
> Maybe before the rest of the changes.

If this latter part is a precondition, then someone else will have to
work on this.  I have the new option coded and tested (and
documented), but I don't intend to work on redesigning the core etags
algorithms to remove the non-linear behavior, that's a much larger
project which I currently cannot afford, sorry.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 7 Oct 2024 17:37:14 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Mon Oct 07 13:37:14 2024
Received: from localhost ([127.0.0.1]:47790 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1sxrfW-00017F-9m
	for submit <at> debbugs.gnu.org; Mon, 07 Oct 2024 13:37:14 -0400
Received: from fout-a7-smtp.messagingengine.com ([103.168.172.150]:35849)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <dmitry@HIDDEN>) id 1sxrfU-00016z-L4
 for 73484 <at> debbugs.gnu.org; Mon, 07 Oct 2024 13:37:13 -0400
Received: from phl-compute-02.internal (phl-compute-02.phl.internal
 [10.202.2.42])
 by mailfout.phl.internal (Postfix) with ESMTP id 698ED1380245;
 Mon,  7 Oct 2024 13:36:52 -0400 (EDT)
Received: from phl-mailfrontend-02 ([10.202.2.163])
 by phl-compute-02.internal (MEProxy); Mon, 07 Oct 2024 13:36:52 -0400
Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon,
 7 Oct 2024 13:36:50 -0400 (EDT)
Message-ID: <021c625b-adc9-4e19-819c-fe929583e503@HIDDEN>
Date: Mon, 7 Oct 2024 20:36:47 +0300
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
To: Eli Zaretskii <eliz@HIDDEN>
References: <87tteaznog.fsf@HIDDEN>
 <ea10f340-9b46-4199-93fc-274c5e81ace4@HIDDEN>
 <87o74c1ce1.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN>
 <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN>
 <86frpe2186.fsf@HIDDEN>
 <8e305b6d-8ca8-4437-990f-183ebc007d18@HIDDEN>
 <865xqa1ggi.fsf@HIDDEN>
 <ca89563f-b0d2-412a-9248-e4beb3ad7b84@HIDDEN>
 <86ttdtzoof.fsf@HIDDEN>
 <8d7dc133-9828-4023-821f-e4403f899f81@HIDDEN>
 <86ttdsxt6x.fsf@HIDDEN>
 <52cb1caa-9e7e-45df-b328-d60948d397f6@HIDDEN>
 <864j5rxca1.fsf@HIDDEN>
 <b59bf102-a9d8-4723-91ac-acc3f8ff3aa8@HIDDEN>
 <87a5fiijy9.fsf@HIDDEN>
 <bd6751c8-5504-48d7-82d7-a3e8849a1910@HIDDEN>
 <86jzelvjh4.fsf@HIDDEN>
 <8b6560a9-e2d6-42ae-ac1d-014700f21804@HIDDEN>
 <86wmiktzez.fsf@HIDDEN>
 <b0192173-3d8a-49da-8792-521b9b486568@HIDDEN>
 <86ldyzucdd.fsf@HIDDEN>
Content-Language: en-US
From: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <86ldyzucdd.fsf@HIDDEN>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Score: -0.7 (/)
X-Debbugs-Envelope-To: 73484
Cc: pot@HIDDEN, 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN
X-Spam-Score: -1.7 (-)

On 07/10/2024 19:05, Eli Zaretskii wrote:

> So you are comparing the speed of scanning ~60K files with the speed
> of scanning ~375K of files?  I'm not generally surprised that the
> latter takes much longer, only that the slowdown is not proportional
> to the number of scanned files.  But see below.

I forgot one thing: all .js files are actually set to be ignored there. 
And my tree is a little old, so it's 200K files total. Otherwise -- yes.

Note, however, that the time is really not proportional: 30 s vs 15 min 
is a 30x difference.

And I've been assuming that the "other" files would mostly fall in the 
non-recognized category, and that most of them would only have their 
first 2 characters read (then, seeing that those chars are not '#!', 
etags would skip the file).

> Btw, did you exclude the .git/* files from the list submitted to
> etags?

Yes, it's excluded. And the files matching the .gitignore entries are 
excluded as well.

> Here, scanning, with the unmodified etags from Emacs 30, of only those
> files with extensions in etags-regen-file-extensions takes 16.7 sec
> and produces a 80.5MB tags table, whereas scanning all the files with
> the same etags takes almost 16 min and produces 304MB tags table, of
> which more than 200MB are from files whose language is not recognized.

My result in the latter case was only 88 MB. Maybe the many .js files 
make the difference. I put them into the "ignored" category long ago 
because most of them are used for tests, there are a lot of them, and 
some are generated one-long-line files.

> From my testing, it seems like the elapsed time depends non-linearly
> on the length of the list of files submitted to etags.  For example,
> if I break the list of files in two, I get 3 min 20 sec and 1 min 40
> sec, together 5 min.  But if I submit a single list with all the files
> in those two lists, I get 14 min 30 sec.  I guess some internal
> processing etags does depends non-linearly on the number of files it
> scans.  The various loops in etags that scan all of the known files
> and/or the tags it previously found seem to confirm this hypothesis.

Makes sense! It sounds like some N^2 complexity somewhere.
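
As a rough sanity check on the N^2 guess, here is what the two partial timings quoted above would predict for the combined list. This is only a sketch under the assumption that scan time is purely quadratic in the number of files, T(N) = c N^2, which nothing in this thread establishes. T_1 = 200 s and T_2 = 100 s are the quoted 3 min 20 sec and 1 min 40 sec.

```latex
\[
T(N_1 + N_2) = c\,(N_1 + N_2)^2
             = T_1 + T_2 + 2\sqrt{T_1 T_2}
             = 200 + 100 + 2\sqrt{200 \cdot 100}
             \approx 583\ \mathrm{s} \approx 9\ \mathrm{min}\ 43\ \mathrm{s}
\]
```

The measured 14 min 30 sec (870 s) is larger still, so the numbers are consistent with growth that is at least this steep, though a single pair of measurements cannot pin down the exponent.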

> So what is the conclusion from this?  Are you saying that the long
> scan times in this large tree basically make this new no-fallbacks
> option not very useful, since we still need to carefully include or
> exclude certain files from the scan?  Or should I go ahead and install
> these changes?

I think that option will be useful, but for better benchmarks and for 
end usability as well, I think we need the N^2 thing fixed as well. 
Maybe before the rest of the changes.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 7 Oct 2024 16:06:11 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Mon Oct 07 12:06:11 2024
Received: from localhost ([127.0.0.1]:47514 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1sxqFO-0004kh-Hs
	for submit <at> debbugs.gnu.org; Mon, 07 Oct 2024 12:06:11 -0400
Received: from eggs.gnu.org ([209.51.188.92]:57662)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <eliz@HIDDEN>) id 1sxqFL-0004kQ-6u
 for 73484 <at> debbugs.gnu.org; Mon, 07 Oct 2024 12:06:08 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <eliz@HIDDEN>)
 id 1sxqF7-0004wV-Vc; Mon, 07 Oct 2024 12:05:53 -0400
Date: Mon, 07 Oct 2024 19:05:50 +0300
Message-Id: <86ldyzucdd.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <b0192173-3d8a-49da-8792-521b9b486568@HIDDEN> (message from
 Dmitry Gutov on Mon, 7 Oct 2024 10:11:08 +0300)
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
References: <87tteaznog.fsf@HIDDEN>
 <edab570c-b2fa-4162-9383-df5c8aaff251@HIDDEN>
 <8734lrrj4e.fsf@HIDDEN>
 <ea10f340-9b46-4199-93fc-274c5e81ace4@HIDDEN>
 <87o74c1ce1.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN>
 <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN>
 <86frpe2186.fsf@HIDDEN>
 <8e305b6d-8ca8-4437-990f-183ebc007d18@HIDDEN>
 <865xqa1ggi.fsf@HIDDEN>
 <ca89563f-b0d2-412a-9248-e4beb3ad7b84@HIDDEN>
 <86ttdtzoof.fsf@HIDDEN>
 <8d7dc133-9828-4023-821f-e4403f899f81@HIDDEN>
 <86ttdsxt6x.fsf@HIDDEN>
 <52cb1caa-9e7e-45df-b328-d60948d397f6@HIDDEN>
 <864j5rxca1.fsf@HIDDEN>
 <b59bf102-a9d8-4723-91ac-acc3f8ff3aa8@HIDDEN>
 <87a5fiijy9.fsf@HIDDEN>
 <bd6751c8-5504-48d7-82d7-a3e8849a1910@HIDDEN>
 <86jzelvjh4.fsf@HIDDEN>
 <8b6560a9-e2d6-42ae-ac1d-014700f21804@HIDDEN>
 <86wmiktzez.fsf@HIDDEN> <b0192173-3d8a-49da-8792-521b9b486568@HIDDEN>
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 73484
Cc: pot@HIDDEN, 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN
X-Spam-Score: -3.3 (---)

> Date: Mon, 7 Oct 2024 10:11:08 +0300
> Cc: pot@HIDDEN, spwhitton@HIDDEN, 73484 <at> debbugs.gnu.org
> From: Dmitry Gutov <dmitry@HIDDEN>
> 
> > Can you please show the etags command line in each of these two cases
> > that you are comparing?
> 
> Both commands end with a '-' (scanning the list of files passed from stdin).
> 
> >>> And if they don't have extensions, the code you
> >>> removed would have caused etags to scan these files anyway, looking
> >>> for Fortran or C tags.  So how come the change slowed down etags so
> >>> much?  What am I missing?
> >> I think it would also concern "unknown" extensions, right? Like .txt,
> >> .png and so on.
> > I have difficulty reasoning about this without knowing the command
> > lines you used.  E.g., I don't understand why in one case it would
> > scan files with unknown extensions that were not scanned in the other.
> 
> In one case the list is pre-filtered with etags-regen-file-extensions 
> (see 'etags-regen--all-files'); in the other it is not, and all files 
> in the project are passed.

So you are comparing the speed of scanning ~60K files with the speed
of scanning ~375K files?  I'm not generally surprised that the
latter takes much longer, only that the slowdown is not proportional
to the number of scanned files.  But see below.

Btw, did you exclude the .git/* files from the list submitted to
etags?

Here, using the unmodified etags from Emacs 30, scanning only those
files with extensions in etags-regen-file-extensions takes 16.7 sec
and produces an 80.5MB tags table, whereas scanning all the files with
the same etags takes almost 16 min and produces a 304MB tags table, of
which more than 200MB are from files whose language is not recognized.

From my testing, it seems like the elapsed time depends non-linearly
on the length of the list of files submitted to etags.  For example,
if I break the list of files in two, I get 3 min 20 sec and 1 min 40
sec, together 5 min.  But if I submit a single list with all the files
in those two lists, I get 14 min 30 sec.  I guess some internal
processing etags does depends non-linearly on the number of files it
scans.  The various loops in etags that scan all of the known files
and/or the tags it previously found seem to confirm this hypothesis.
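
A rough way to check these timings against a simple model, as a sketch
only: assume the elapsed time follows a pure power law T(n) = c * n^k in
the number of files n, with the same constant c for every file list.
That is a strong assumption (files differ in size and language), and
the function names below are hypothetical.  Under it, the two partial
times determine what the combined run should cost for a given k, and
the observed combined time can be used to estimate k.

```python
def predicted_combined(t1, t2, k):
    """Combined time if T(n) = c * n**k and the two lists are disjoint.

    From t = c * n**k we get n proportional to t**(1/k), so the
    combined list costs (t1**(1/k) + t2**(1/k))**k.
    """
    return (t1 ** (1.0 / k) + t2 ** (1.0 / k)) ** k


def fit_exponent(t1, t2, t_all, lo=1.0, hi=4.0, iters=60):
    """Bisect for the k whose prediction matches the observed t_all.

    predicted_combined grows with k, so bisection on [lo, hi] works
    as long as the answer lies inside that interval.
    """
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if predicted_combined(t1, t2, mid) < t_all:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Timings quoted in this message, converted to seconds.
T_FIRST = 3 * 60 + 20    # 3 min 20 sec
T_SECOND = 1 * 60 + 40   # 1 min 40 sec
T_ALL = 14 * 60 + 30     # 14 min 30 sec

if __name__ == "__main__":
    print("linear prediction   :", predicted_combined(T_FIRST, T_SECOND, 1))
    print("quadratic prediction:", predicted_combined(T_FIRST, T_SECOND, 2))
    print("fitted exponent     :", fit_exponent(T_FIRST, T_SECOND, T_ALL))
```

With these three numbers, a linear model predicts the 5 min sum and a
purely quadratic one still predicts less than the observed 14 min 30
sec, so the figures are at least consistent with superlinear behavior.
Three data points cannot identify the real cause.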

So what is the conclusion from this?  Are you saying that the long
scan times in this large tree basically make this new no-fallbacks
option not very useful, since we still need to carefully include or
exclude certain files from the scan?  Or should I go ahead and install
these changes?




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 7 Oct 2024 07:11:29 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Mon Oct 07 03:11:29 2024
Received: from localhost ([127.0.0.1]:44394 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1sxhtx-0000pf-Ac
	for submit <at> debbugs.gnu.org; Mon, 07 Oct 2024 03:11:29 -0400
Received: from fout-a1-smtp.messagingengine.com ([103.168.172.144]:53051)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <dmitry@HIDDEN>) id 1sxhtu-0000pQ-68
 for 73484 <at> debbugs.gnu.org; Mon, 07 Oct 2024 03:11:27 -0400
Received: from phl-compute-02.internal (phl-compute-02.phl.internal
 [10.202.2.42])
 by mailfout.phl.internal (Postfix) with ESMTP id C3D6D13800F3;
 Mon,  7 Oct 2024 03:11:13 -0400 (EDT)
Received: from phl-mailfrontend-02 ([10.202.2.163])
 by phl-compute-02.internal (MEProxy); Mon, 07 Oct 2024 03:11:13 -0400
Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon,
 7 Oct 2024 03:11:11 -0400 (EDT)
Message-ID: <b0192173-3d8a-49da-8792-521b9b486568@HIDDEN>
Date: Mon, 7 Oct 2024 10:11:08 +0300
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
To: Eli Zaretskii <eliz@HIDDEN>
References: <87tteaznog.fsf@HIDDEN>
 <edab570c-b2fa-4162-9383-df5c8aaff251@HIDDEN>
 <8734lrrj4e.fsf@HIDDEN>
 <ea10f340-9b46-4199-93fc-274c5e81ace4@HIDDEN>
 <87o74c1ce1.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN>
 <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN>
 <86frpe2186.fsf@HIDDEN>
 <8e305b6d-8ca8-4437-990f-183ebc007d18@HIDDEN>
 <865xqa1ggi.fsf@HIDDEN>
 <ca89563f-b0d2-412a-9248-e4beb3ad7b84@HIDDEN>
 <86ttdtzoof.fsf@HIDDEN>
 <8d7dc133-9828-4023-821f-e4403f899f81@HIDDEN>
 <86ttdsxt6x.fsf@HIDDEN>
 <52cb1caa-9e7e-45df-b328-d60948d397f6@HIDDEN>
 <864j5rxca1.fsf@HIDDEN>
 <b59bf102-a9d8-4723-91ac-acc3f8ff3aa8@HIDDEN>
 <87a5fiijy9.fsf@HIDDEN>
 <bd6751c8-5504-48d7-82d7-a3e8849a1910@HIDDEN>
 <86jzelvjh4.fsf@HIDDEN>
 <8b6560a9-e2d6-42ae-ac1d-014700f21804@HIDDEN>
 <86wmiktzez.fsf@HIDDEN>
Content-Language: en-US
From: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <86wmiktzez.fsf@HIDDEN>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Score: -0.7 (/)
X-Debbugs-Envelope-To: 73484
Cc: pot@HIDDEN, 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN
X-Spam-Score: -1.7 (-)

On 07/10/2024 05:33, Eli Zaretskii wrote:
>> Sorry, I have to add a correction: it's about 15 min either way. Seems
>> like the first time I either messed up the start time, or the directory
>> was in "cold" cache, or the etags used was some much older version.
>>
>> So to reiterate: the current etags-regen scans in around 30s, and the
>> simple switch scans the directory in 15 minutes. Retesting the change
>> from previous email, it doesn't really help.
> Can you please show the etags command line in each of these two cases
> that you are comparing?

Both commands end with a '-' (scanning the list of files passed from stdin).

>>> And if they don't have extensions, the code you
>>> removed would have caused etags to scan these files anyway, looking
>>> for Fortran or C tags.  So how come the change slowed down etags so
>>> much?  What am I missing?
>> I think it would also concern "unknown" extensions, right? Like .txt,
>> .png and so on.
> I have difficulty reasoning about this without knowing the command
> lines you used.  E.g., I don't understand why in one case it would
> scan files with unknown extensions that were not scanned in the other.

In one case the list is pre-filtered with etags-regen-file-extensions 
(see 'etags-regen--all-files'); in the other it is not, and all files 
in the project are passed.
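
The difference between the two inputs can be sketched as follows.  This
is only an illustration in Python, not the etags-regen implementation
(which is Emacs Lisp): the extension set, the case-sensitive matching,
and both function names are assumptions made for the example.  The only
parts taken from the thread are that the file list is filtered by
extension in one case and that etags reads the list from stdin via '-'.

```python
import subprocess
from pathlib import PurePosixPath

# Hypothetical extension set for illustration; this is NOT the actual
# default value of etags-regen-file-extensions.
EXTENSIONS = {"c", "h", "cpp", "py", "el", "js"}


def filter_by_extension(files, extensions=EXTENSIONS):
    """Keep only the file names whose extension is in EXTENSIONS.

    Files without an extension (e.g. "README") are dropped.
    """
    kept = []
    for name in files:
        suffix = PurePosixPath(name).suffix.lstrip(".")
        if suffix in extensions:
            kept.append(name)
    return kept


def run_etags(files, tags_file="TAGS"):
    """Feed FILES to etags on stdin; the trailing '-' selects stdin."""
    subprocess.run(
        ["etags", "-o", tags_file, "-"],
        input="".join(name + "\n" for name in files),
        text=True,
        check=True,
    )


if __name__ == "__main__":
    all_files = ["src/main.cpp", "README", "docs/logo.png", "tools/gen.py"]
    print(filter_by_extension(all_files))  # pre-filtered case
    print(all_files)                       # unfiltered case
```

Calling run_etags on the filtered list corresponds to the first case,
and calling it on the full list corresponds to the second.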




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 7 Oct 2024 02:33:43 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Sun Oct 06 22:33:43 2024
Received: from localhost ([127.0.0.1]:43727 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1sxdZ9-0002Hq-6Q
	for submit <at> debbugs.gnu.org; Sun, 06 Oct 2024 22:33:43 -0400
Received: from eggs.gnu.org ([209.51.188.92]:58258)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <eliz@HIDDEN>) id 1sxdZ7-0002HZ-2w
 for 73484 <at> debbugs.gnu.org; Sun, 06 Oct 2024 22:33:42 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <eliz@HIDDEN>)
 id 1sxdYu-0000NP-3g; Sun, 06 Oct 2024 22:33:28 -0400
Date: Mon, 07 Oct 2024 05:33:24 +0300
Message-Id: <86wmiktzez.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <8b6560a9-e2d6-42ae-ac1d-014700f21804@HIDDEN> (message from
 Dmitry Gutov on Sun, 6 Oct 2024 22:14:46 +0300)
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
References: <87tteaznog.fsf@HIDDEN>
 <edab570c-b2fa-4162-9383-df5c8aaff251@HIDDEN>
 <8734lrrj4e.fsf@HIDDEN>
 <ea10f340-9b46-4199-93fc-274c5e81ace4@HIDDEN>
 <87o74c1ce1.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN> <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN> <86frpe2186.fsf@HIDDEN>
 <8e305b6d-8ca8-4437-990f-183ebc007d18@HIDDEN> <865xqa1ggi.fsf@HIDDEN>
 <ca89563f-b0d2-412a-9248-e4beb3ad7b84@HIDDEN> <86ttdtzoof.fsf@HIDDEN>
 <8d7dc133-9828-4023-821f-e4403f899f81@HIDDEN> <86ttdsxt6x.fsf@HIDDEN>
 <52cb1caa-9e7e-45df-b328-d60948d397f6@HIDDEN> <864j5rxca1.fsf@HIDDEN>
 <b59bf102-a9d8-4723-91ac-acc3f8ff3aa8@HIDDEN>
 <87a5fiijy9.fsf@HIDDEN>
 <bd6751c8-5504-48d7-82d7-a3e8849a1910@HIDDEN> <86jzelvjh4.fsf@HIDDEN>
 <8b6560a9-e2d6-42ae-ac1d-014700f21804@HIDDEN>
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 73484
Cc: pot@HIDDEN, 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN
X-Spam-Score: -3.3 (---)

> Date: Sun, 6 Oct 2024 22:14:46 +0300
> Cc: pot@HIDDEN, spwhitton@HIDDEN, 73484 <at> debbugs.gnu.org
> From: Dmitry Gutov <dmitry@HIDDEN>
> 
> On 06/10/2024 09:22, Eli Zaretskii wrote:
> 
> >> Then, the total time increased a lot: from 30 s to 30-40 min.
> > 
> > I don't understand why.  How many files with no extensions are in that
> > tree, and what was the etags command line in both cases?
> 
> Sorry, I have to add a correction: it's about 15 min either way. Seems 
> like the first time I either messed up the start time, or the directory 
> was in "cold" cache, or the etags used was some much older version.
> 
> So to reiterate: the current etags-regen scans in around 30s, and the 
> simple switch scans the directory in 15 minutes. Retesting the change 
> from previous email, it doesn't really help.

Can you please show the etags command line in each of these two cases
that you are comparing?

> > And if they don't have extensions, the code you
> > removed would have caused etags to scan these files anyway, looking
> > for Fortran or C tags.  So how come the change slowed down etags so
> > much?  What am I missing?
> 
> I think it would also concern "unknown" extensions, right? Like .txt, 
> .png and so on.

I have difficulty reasoning about this without knowing the command
lines you used.  E.g., I don't understand why in one case it would
scan files with unknown extensions that were not scanned in the other.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 6 Oct 2024 19:15:05 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Sun Oct 06 15:15:05 2024
Received: from localhost ([127.0.0.1]:42312 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1sxWie-00037S-Qn
	for submit <at> debbugs.gnu.org; Sun, 06 Oct 2024 15:15:05 -0400
Received: from fout-a1-smtp.messagingengine.com ([103.168.172.144]:57063)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <dmitry@HIDDEN>) id 1sxWib-00036i-Qr
 for 73484 <at> debbugs.gnu.org; Sun, 06 Oct 2024 15:15:03 -0400
Received: from phl-compute-01.internal (phl-compute-01.phl.internal
 [10.202.2.41])
 by mailfout.phl.internal (Postfix) with ESMTP id 491A513800ED;
 Sun,  6 Oct 2024 15:14:50 -0400 (EDT)
Received: from phl-mailfrontend-02 ([10.202.2.163])
 by phl-compute-01.internal (MEProxy); Sun, 06 Oct 2024 15:14:50 -0400
Received: by mail.messagingengine.com (Postfix) with ESMTPA; Sun,
 6 Oct 2024 15:14:48 -0400 (EDT)
Message-ID: <8b6560a9-e2d6-42ae-ac1d-014700f21804@HIDDEN>
Date: Sun, 6 Oct 2024 22:14:46 +0300
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
To: Eli Zaretskii <eliz@HIDDEN>
References: <87tteaznog.fsf@HIDDEN>
 <edab570c-b2fa-4162-9383-df5c8aaff251@HIDDEN>
 <8734lrrj4e.fsf@HIDDEN>
 <ea10f340-9b46-4199-93fc-274c5e81ace4@HIDDEN>
 <87o74c1ce1.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN> <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN> <86frpe2186.fsf@HIDDEN>
 <8e305b6d-8ca8-4437-990f-183ebc007d18@HIDDEN> <865xqa1ggi.fsf@HIDDEN>
 <ca89563f-b0d2-412a-9248-e4beb3ad7b84@HIDDEN> <86ttdtzoof.fsf@HIDDEN>
 <8d7dc133-9828-4023-821f-e4403f899f81@HIDDEN> <86ttdsxt6x.fsf@HIDDEN>
 <52cb1caa-9e7e-45df-b328-d60948d397f6@HIDDEN> <864j5rxca1.fsf@HIDDEN>
 <b59bf102-a9d8-4723-91ac-acc3f8ff3aa8@HIDDEN>
 <87a5fiijy9.fsf@HIDDEN>
 <bd6751c8-5504-48d7-82d7-a3e8849a1910@HIDDEN> <86jzelvjh4.fsf@HIDDEN>
Content-Language: en-US
From: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <86jzelvjh4.fsf@HIDDEN>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Score: -0.7 (/)
X-Debbugs-Envelope-To: 73484
Cc: pot@HIDDEN, 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN
X-Spam-Score: -1.7 (-)

On 06/10/2024 09:22, Eli Zaretskii wrote:

>> Then, the total time increased a lot: from 30 s to 30-40 min.
> 
> I don't understand why.  How many files with no extensions are in that
> tree, and what was the etags command line in both cases?

Sorry, I have to add a correction: it's about 15 min either way. Seems 
like the first time I either messed up the start time, or the directory 
was in "cold" cache, or the etags used was some much older version.

So to reiterate: the current etags-regen scans in around 30s, and the 
simple switch scans the directory in 15 minutes. Retesting the change 
from previous email, it doesn't really help.

And the 'find-tag' scan did become slower - i.e. from 400 ms to 1200 ms. 
Not clear about the mechanics (the size of TAGS only went up from 65 to 
88 MB).

>> But parsing HTML files seems to remain the slowest part. There are a lot
>> of them in that project (many test cases), but maybe 3x the number of
>> code files, not 60x their number. And they're pretty small, on average.
>> If somebody wants to test that locally, here's the repository:
>> https://github.com/mozilla/gecko-dev
> 
> If HTML files are what explains the slowdown, then why did this change
> trigger it?  HTML files are supposed to have extensions that tell
> etags they are HTML.

Okay, I've commented out the most obvious suspects (html, asm, makefile) 
- all their entries in 'lang_names' - but the scan still takes too long.

Maybe it's some other file type, which I haven't found yet.

But what I see when monitoring the running scan with 'tail -f TAGS' is 
that the output sometimes stops for around 20 seconds, in the middle of 
outputting the tags of some common code file (like .cpp or .py), and 
then resumes, with files of the same type around this one.

> And if they don't have extensions, the code you
> removed would have caused etags to scan these files anyway, looking
> for Fortran or C tags.  So how come the change slowed down etags so
> much?  What am I missing?

I think it would also concern "unknown" extensions, right? Like .txt, 
.png and so on.

Anyway, the difference is either due to the different set of files (all 
project files, rather than files in the specified list of extensions), 
or due to all file names being printed. Not sure how to verify, yet.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 6 Oct 2024 06:22:46 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Sun Oct 06 02:22:46 2024
Received: from localhost ([127.0.0.1]:39973 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1sxKfF-00034p-Ka
	for submit <at> debbugs.gnu.org; Sun, 06 Oct 2024 02:22:46 -0400
Received: from eggs.gnu.org ([209.51.188.92]:40050)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <eliz@HIDDEN>) id 1sxKfD-00034a-WA
 for 73484 <at> debbugs.gnu.org; Sun, 06 Oct 2024 02:22:44 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <eliz@HIDDEN>)
 id 1sxKf2-0002IB-Rf; Sun, 06 Oct 2024 02:22:32 -0400
Date: Sun, 06 Oct 2024 09:22:31 +0300
Message-Id: <86jzelvjh4.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <bd6751c8-5504-48d7-82d7-a3e8849a1910@HIDDEN> (message from
 Dmitry Gutov on Sun, 6 Oct 2024 03:56:58 +0300)
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
References: <87tteaznog.fsf@HIDDEN>
 <edab570c-b2fa-4162-9383-df5c8aaff251@HIDDEN>
 <8734lrrj4e.fsf@HIDDEN>
 <ea10f340-9b46-4199-93fc-274c5e81ace4@HIDDEN>
 <87o74c1ce1.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN> <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN> <86frpe2186.fsf@HIDDEN>
 <8e305b6d-8ca8-4437-990f-183ebc007d18@HIDDEN> <865xqa1ggi.fsf@HIDDEN>
 <ca89563f-b0d2-412a-9248-e4beb3ad7b84@HIDDEN> <86ttdtzoof.fsf@HIDDEN>
 <8d7dc133-9828-4023-821f-e4403f899f81@HIDDEN> <86ttdsxt6x.fsf@HIDDEN>
 <52cb1caa-9e7e-45df-b328-d60948d397f6@HIDDEN> <864j5rxca1.fsf@HIDDEN>
 <b59bf102-a9d8-4723-91ac-acc3f8ff3aa8@HIDDEN>
 <87a5fiijy9.fsf@HIDDEN>
 <bd6751c8-5504-48d7-82d7-a3e8849a1910@HIDDEN>
MIME-version: 1.0
Content-type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 73484
Cc: pot@HIDDEN, 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN
X-Spam-Score: -3.3 (---)

> Date: Sun, 6 Oct 2024 03:56:58 +0300
> Cc: spwhitton@HIDDEN, 73484 <at> debbugs.gnu.org,
>  Eli Zaretskii <eliz@HIDDEN>
> From: Dmitry Gutov <dmitry@HIDDEN>
> 
> On 05/10/2024 19:38, Francesco Potortì wrote:
> > Eli Zaretskii:
> >>> How hard is it to add to a live TAGS file fake lines which look like
> >>> this:
> >>>
> >>>     ^L
> >>>     foo,0
> >>>
> >>> (with random strings instead of "foo"), and then time some TAGS-using
> >>> commands with and without these additions?
> > 
> > Dmitry Gutov:
> >> Okay, done that.
> >>
> >> 'M-.' takes more or less the same.
> >>
> >> The file size of TAGS increased from 66 MB to 85 MiB.
> >>
> >> Won't measure time to generate now - because the current method and the
> >> "real" one will be different, but note that it's more relevant with
> >> etags-regen-mode because the scan is performed lazily: every time the
> >> user does the first search in a new project.
> > 
> > Removing the Fortran and C/C++ fallbacks just for testing requires recompiling etags.c after removing the code beginning with /* Else try Fortran or C. */.  This would avoid parsing the file (except for detecting the sharp-bang) and would leave the file name in the tags file, without tags.

That would also remove the ability to scan files of no language for
regexps.  So this is not what I intend to do for this feature request,
FWIW.

> Then, the total time increased a lot: from 30 s to 30-40 min.

I don't understand why.  How many files with no extensions are in that
tree, and what was the etags command line in both cases?

> But parsing HTML files seems to remain the slowest part. There are a lot 
> of them in that project (many test cases), but maybe 3x the number of 
> code files, not 60x their number. And they're pretty small, on average. 
> If somebody wants to test that locally, here's the repository: 
> https://github.com/mozilla/gecko-dev

If HTML files are what explains the slowdown, then why did this change
trigger it?  HTML files are supposed to have extensions that tell
etags they are HTML.  And if they don't have extensions, the code you
removed would have caused etags to scan these files anyway, looking
for Fortran or C tags.  So how come the change slowed down etags so
much?  What am I missing?
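
To make the question concrete, here is a minimal sketch of the lookup order
this thread describes: file-name suffix first, then the sharp-bang line, then
the fallback.  It is not the code from etags.c.  The enum, the function name
and the three-entry suffix table are all hypothetical and exist only for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* How a parser would be picked for a file, in the order discussed in
   this thread.  Hypothetical names, not taken from etags.c.  */
enum guess { BY_SUFFIX, BY_INTERPRETER, BY_FALLBACK };

/* Classify FILE_NAME, whose first line is FIRST_LINE.  The suffix
   table is a tiny made-up sample, not etags' real list.  */
static enum guess
classify (const char *file_name, const char *first_line)
{
  static const char *const known[] = { ".html", ".c", ".el" };
  assert (file_name != NULL && first_line != NULL);

  /* Look at the suffix of the last path component only.  */
  const char *base = strrchr (file_name, '/');
  base = base ? base + 1 : file_name;
  const char *dot = strrchr (base, '.');
  if (dot != NULL)
    for (size_t i = 0; i < sizeof known / sizeof known[0]; i++)
      if (strcmp (dot, known[i]) == 0)
        return BY_SUFFIX;

  /* No known suffix: a sharp-bang line names an interpreter.  */
  if (strncmp (first_line, "#!", 2) == 0)
    return BY_INTERPRETER;

  /* Otherwise only the fallback parsers (or nothing) remain.  */
  return BY_FALLBACK;
}
```

In this model a file such as test/a.html never reaches the fallback branch,
so removing the fallback should not change how it is handled.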




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 6 Oct 2024 00:57:17 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Sat Oct 05 20:57:16 2024
Received: from localhost ([127.0.0.1]:39741 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1sxFaG-0002KX-Gp
	for submit <at> debbugs.gnu.org; Sat, 05 Oct 2024 20:57:16 -0400
Received: from fhigh-a3-smtp.messagingengine.com ([103.168.172.154]:44973)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <dmitry@HIDDEN>) id 1sxFaC-0002KI-Nz
 for 73484 <at> debbugs.gnu.org; Sat, 05 Oct 2024 20:57:15 -0400
Received: from phl-compute-04.internal (phl-compute-04.phl.internal
 [10.202.2.44])
 by mailfhigh.phl.internal (Postfix) with ESMTP id 28A9411401A7;
 Sat,  5 Oct 2024 20:57:02 -0400 (EDT)
Received: from phl-mailfrontend-01 ([10.202.2.162])
 by phl-compute-04.internal (MEProxy); Sat, 05 Oct 2024 20:57:02 -0400
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gutov.dev; h=cc
 :cc:content-transfer-encoding:content-type:content-type:date
 :date:from:from:in-reply-to:in-reply-to:message-id:mime-version
 :references:reply-to:subject:subject:to:to; s=fm1; t=1728176222;
 x=1728262622; bh=z8wpDMUh3MbMqIu3TOJF0EtyGAPJkafgE6gcKiJNs7c=; b=
 oMj4a4jaBLFzDPA2go0jeLSCrUmuqfxLM93pJBzXvpNJ5pfpIh+/+8/e6NeUhUEv
 gdLokMR1vj/34ZhbSJEu3j9gW0s7Ts1/cxvdqmgTtaskkK0Q32BTifowCp4S68R3
 zxP1aV4u9xBm2zVlkyUm6ELibkfSMqq/mE9GPeXcONFucpYqYVCoNInxwPYPRsms
 J+JAnyLPSk2Xqq+85Ejb5XfduQ8lkD7WEaHk9svVkrKZoqVikwRKTtBvRYGunFu/
 ddu/+x7snvYBnue3liNOorjIWZ+4jghK+ObmQnQHWDa54crXt2Pl86cmZwwlLCJp
 7bftf3HtQ3186MORb9X4+g==
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=
 messagingengine.com; h=cc:cc:content-transfer-encoding
 :content-type:content-type:date:date:feedback-id:feedback-id
 :from:from:in-reply-to:in-reply-to:message-id:mime-version
 :references:reply-to:subject:subject:to:to:x-me-proxy:x-me-proxy
 :x-me-sender:x-me-sender:x-sasl-enc; s=fm2; t=1728176222; x=
 1728262622; bh=z8wpDMUh3MbMqIu3TOJF0EtyGAPJkafgE6gcKiJNs7c=; b=l
 HypQYrFENnht5tfFmzW0Ian3ZTpJIP77a3cyHmUgifNT4hAoBWgFK4HIb3BanDu/
 H1asfhgUcTW2kMbtqOFIOsrwhCFOMb5+XIiWr/uAh9GgdhwX6B55fIlrKLvWh7JB
 c3RC4m83lZh7YOIcxGSLz/1IPItCMgP/cydgXwjyAkvhCWBiCdS1TTiieLiUiSB4
 b/L4fFN1VI0RQxhcpal3LM05MGYfZuXB48kaCtfm1P92UDJ1AMnZ9q4M3VxZtWrd
 wlaAadEORGuYrItowiFZiWwWu4bJOi+NkbEyq/VKXonFdjOYcynGDNAGNrBm3dn7
 JcE0C+jEUQXB8bo1esxWw==
Feedback-ID: i07de48aa:Fastmail
Received: by mail.messagingengine.com (Postfix) with ESMTPA; Sat,
 5 Oct 2024 20:57:00 -0400 (EDT)
Message-ID: <bd6751c8-5504-48d7-82d7-a3e8849a1910@HIDDEN>
Date: Sun, 6 Oct 2024 03:56:58 +0300
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
To: =?UTF-8?Q?Francesco_Potort=C3=AC?= <pot@HIDDEN>
References: <87tteaznog.fsf@HIDDEN>
 <edab570c-b2fa-4162-9383-df5c8aaff251@HIDDEN>
 <8734lrrj4e.fsf@HIDDEN>
 <ea10f340-9b46-4199-93fc-274c5e81ace4@HIDDEN>
 <87o74c1ce1.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN> <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN> <86frpe2186.fsf@HIDDEN>
 <8e305b6d-8ca8-4437-990f-183ebc007d18@HIDDEN> <865xqa1ggi.fsf@HIDDEN>
 <ca89563f-b0d2-412a-9248-e4beb3ad7b84@HIDDEN> <86ttdtzoof.fsf@HIDDEN>
 <8d7dc133-9828-4023-821f-e4403f899f81@HIDDEN> <86ttdsxt6x.fsf@HIDDEN>
 <52cb1caa-9e7e-45df-b328-d60948d397f6@HIDDEN> <864j5rxca1.fsf@HIDDEN>
 <b59bf102-a9d8-4723-91ac-acc3f8ff3aa8@HIDDEN>
 <87a5fiijy9.fsf@HIDDEN>
Content-Language: en-US
From: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <87a5fiijy9.fsf@HIDDEN>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-Spam-Score: -0.7 (/)
X-Debbugs-Envelope-To: 73484
Cc: Eli Zaretskii <eliz@HIDDEN>, 73484 <at> debbugs.gnu.org,
 spwhitton@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
X-Spam-Score: -1.7 (-)

On 05/10/2024 19:38, Francesco Potortì wrote:
> Eli Zaretskii:
>>> How hard is it to add to a live TAGS file fake lines which look like
>>> this:
>>>
>>>     ^L
>>>     foo,0
>>>
>>> (with random strings instead of "foo"), and then time some TAGS-using
>>> commands with and without these additions?
> 
> Dmitry Gutov:
>> Okay, done that.
>>
>> 'M-.' takes more or less the same.
>>
>> The file size of TAGS increased from 66 MB to 85 MiB.
>>
>> Won't measure time to generate now - because the current method and the
>> "real" one will be different, but note that it's more relevant with
>> etags-regen-mode because the scan is performed lazily: every time the
>> user does the first search in a new project.
> 
> Removing the Fortran and C/C++ fallbacks just for testing requires recompiling etags.c after removing the code beginning with /* Else try Fortran or C. */.  This would avoid parsing the file (except for detecting the sharp-bang) and would leave the file name in the tags file, without tags.

Thank you, this is useful for another kind of test (parsing the same 
project with the list of all enabled file types). The below was also 
needed to avoid a segfault:

diff --git a/lib-src/etags.c b/lib-src/etags.c
index 7f652790261..08c6037b9d7 100644
--- a/lib-src/etags.c
+++ b/lib-src/etags.c
@@ -1830,6 +1830,7 @@ process_file (FILE *fh, char *fn, language *lang)
       curfdp. */
    if (!CTAGS
        && curfdp->usecharno	/* no #line directives in this file */
+      && curfdp->lang
        && !curfdp->lang->metasource)
      {
        node *np, *prev;

Then, the total time increased a lot: from 30 s to 30-40 min. This cuts 
it down in half, if I measured correctly:

diff --git a/lib-src/etags.c b/lib-src/etags.c
index 7f652790261..5c2be2b9574 100644
--- a/lib-src/etags.c
+++ b/lib-src/etags.c
@@ -1902,21 +1903,21 @@ find_entries (FILE *inf)

    /* Else look for sharp-bang as the first two characters. */
    if (parser == NULL
+      && getc (inf) == '#'
+      && getc (inf) == '!'
        && readline_internal (&lb, inf, infilename, false) > 0
-      && lb.len >= 2
-      && lb.buffer[0] == '#'
-      && lb.buffer[1] == '!')
+      )
      {
        char *lp;

        /* Set lp to point at the first char after the last slash in the
           line or, if no slashes, at the first nonblank.  Then set cp to
  	 the first successive blank and terminate the string. */
-      lp = strrchr (lb.buffer+2, '/');
+      lp = strrchr (lb.buffer, '/');
        if (lp != NULL)
  	lp += 1;
        else
-	lp = skip_spaces (lb.buffer + 2);
+	lp = skip_spaces (lb.buffer);
        cp = skip_non_spaces (lp);
       /* If the "interpreter" turns out to be "env", the real interpreter is
  	 the next word.  */

But parsing HTML files seems to remain the slowest part. There are a lot 
of them in that project (many test cases), but maybe 3x the number of 
code files, not 60x their number. And they're pretty small, on average. 
If somebody wants to test that locally, here's the repository: 
https://github.com/mozilla/gecko-dev
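
The idea behind the second patch above (look at two bytes before reading a
whole line) can be shown in isolation.  This is a minimal sketch, not the
etags.c code: the function name is hypothetical, and it assumes a seekable
stream so that the position can be reset for whatever parser runs next.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Return true if the seekable stream INF starts with "#!".
   At most two bytes are read, however long the first line is, and
   the stream is rewound so a later parser sees the whole file.
   Hypothetical helper, not part of etags.c.  */
static bool
starts_with_sharp_bang (FILE *inf)
{
  assert (inf != NULL);
  int c1 = getc (inf);
  /* Only read the second byte when the first one already matches.  */
  int c2 = (c1 == '#') ? getc (inf) : EOF;
  rewind (inf);
  return c1 == '#' && c2 == '!';
}
```

Unlike the patch, this helper does not leave the stream positioned after the
"#!"; a caller that wants the interpreter name would read the first line
itself after a positive result.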




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 5 Oct 2024 20:27:41 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Sat Oct 05 16:27:41 2024
Received: from localhost ([127.0.0.1]:39663 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1sxBNM-0005J4-Ez
	for submit <at> debbugs.gnu.org; Sat, 05 Oct 2024 16:27:40 -0400
Received: from fhigh-a8-smtp.messagingengine.com ([103.168.172.159]:45325)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <dmitry@HIDDEN>) id 1sxBNJ-0005In-Cj
 for 73484 <at> debbugs.gnu.org; Sat, 05 Oct 2024 16:27:39 -0400
Received: from phl-compute-01.internal (phl-compute-01.phl.internal
 [10.202.2.41])
 by mailfhigh.phl.internal (Postfix) with ESMTP id F35FF11400BC;
 Sat,  5 Oct 2024 16:27:26 -0400 (EDT)
Received: from phl-mailfrontend-01 ([10.202.2.162])
 by phl-compute-01.internal (MEProxy); Sat, 05 Oct 2024 16:27:26 -0400
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gutov.dev; h=cc
 :cc:content-transfer-encoding:content-type:content-type:date
 :date:from:from:in-reply-to:in-reply-to:message-id:mime-version
 :references:reply-to:subject:subject:to:to; s=fm1; t=1728160046;
 x=1728246446; bh=s0CmroSJU2742/Yg6Ep23Y0Z7tQzi75093MoAEjrBxI=; b=
 VWKcmruc3H1NsvWxUPmeX28FDTGzXM8ixwsABVp0tH+nQ13QVKORnoRTCdpy7RoJ
 Kvoohq+RijWo6Kv+vThs7QS6c8NX+y2f3q0Ezv1luGXqC98+KKctj2WxLO55urKo
 S1TShoOuqvz/RSDSGNXSxibXWR/gCAX5BsEWS8Cx4+tqnOOEEHLos8fDMBH7fqau
 hlw9OAzr90Z0XiZr+nuE+GGv87WeJptQ4m63zOFpt8xXCp4XEDr4S+KbpmVz3v2I
 JMwuvdadEYOYr8RDYH9pL9MMfitVvVlrDlhW3+7kYr9tTGfwzWu1VLemItPztpvT
 mM+t/ojsL8PvdIsx635fxg==
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=
 messagingengine.com; h=cc:cc:content-transfer-encoding
 :content-type:content-type:date:date:feedback-id:feedback-id
 :from:from:in-reply-to:in-reply-to:message-id:mime-version
 :references:reply-to:subject:subject:to:to:x-me-proxy:x-me-proxy
 :x-me-sender:x-me-sender:x-sasl-enc; s=fm2; t=1728160046; x=
 1728246446; bh=s0CmroSJU2742/Yg6Ep23Y0Z7tQzi75093MoAEjrBxI=; b=E
 DHGYidtjLYwfwmCJfQovR5ToqXUz5kcHRLLhy3fKjlbI+8+6ePFCguDFXaCcL7bg
 KRa1cKEEOTbeBhvLBboEvn+lVfqoVPNTZnJPN/9M+MhF7/KLUUhw+1a/ODU//6Cp
 8EmbQVO5OPCE3L43/oAnxfNE0G2Vm1hPsyZb8qUDDWGeX/gWU7AKVl06AR32qD1I
 TwCsk6GAd/cVtOpuX1/RatvTsNyG2qKyFIxeNypR86vuauSCS4uXcTb8+SJ/p3Sr
 zFfsnwP9smQsuSO9pOkACh1RMaGODImp64xihZxskCiqPoah1i+ad3RS2y+A7K7A
 1Nx08eXQmAB0XEi2zdetA==
Feedback-ID: i07de48aa:Fastmail
Received: by mail.messagingengine.com (Postfix) with ESMTPA; Sat,
 5 Oct 2024 16:27:25 -0400 (EDT)
Message-ID: <dbfeb1ef-ea3a-4855-aeb4-1232aae00b48@HIDDEN>
Date: Sat, 5 Oct 2024 23:27:24 +0300
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
To: Eli Zaretskii <eliz@HIDDEN>
References: <87tteaznog.fsf@HIDDEN>
 <edab570c-b2fa-4162-9383-df5c8aaff251@HIDDEN>
 <8734lrrj4e.fsf@HIDDEN>
 <ea10f340-9b46-4199-93fc-274c5e81ace4@HIDDEN>
 <87o74c1ce1.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN> <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN> <86frpe2186.fsf@HIDDEN>
 <8e305b6d-8ca8-4437-990f-183ebc007d18@HIDDEN> <865xqa1ggi.fsf@HIDDEN>
 <ca89563f-b0d2-412a-9248-e4beb3ad7b84@HIDDEN> <86ttdtzoof.fsf@HIDDEN>
 <8d7dc133-9828-4023-821f-e4403f899f81@HIDDEN> <86ttdsxt6x.fsf@HIDDEN>
 <52cb1caa-9e7e-45df-b328-d60948d397f6@HIDDEN> <864j5rxca1.fsf@HIDDEN>
 <b59bf102-a9d8-4723-91ac-acc3f8ff3aa8@HIDDEN> <86zfnivacz.fsf@HIDDEN>
Content-Language: en-US
From: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <86zfnivacz.fsf@HIDDEN>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 73484
Cc: 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
X-Spam-Score: -1.7 (-)

On 05/10/2024 18:27, Eli Zaretskii wrote:
> Thanks.  What about the time it takes tags-search to show the prompt:
> is that affected in any way?

No, that's still instant, just like project-find-regexp. All the work 
happens after typing the input.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 5 Oct 2024 17:12:42 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Sat Oct 05 13:12:42 2024
Received: from localhost ([127.0.0.1]:39524 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1sx8Kf-00037I-Q9
	for submit <at> debbugs.gnu.org; Sat, 05 Oct 2024 13:12:42 -0400
Received: from eggs.gnu.org ([209.51.188.92]:57686)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <eliz@HIDDEN>) id 1sx8Kc-00036v-Tm
 for 73484 <at> debbugs.gnu.org; Sat, 05 Oct 2024 13:12:40 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <eliz@HIDDEN>)
 id 1sx8KS-0006Fj-EC; Sat, 05 Oct 2024 13:12:28 -0400
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org;
 s=fencepost-gnu-org; h=MIME-version:References:Subject:In-Reply-To:To:From:
 Date; bh=FI06RPe0dZNhY3jXzeu20Vmz5BmMlpN99R9OIEjtJP4=; b=pa96D5azZMNEoQhFI+Ca
 /h1n/kksiI05fJBQnN+003BHAwaywXhey3SypY9zREhgeSsgt01CNYm3z8GOruHtKm2MVVh2blXYF
 31l/MdEBfg1maOlI4nvGjV6xfb4/cEYB4kUoD2rAvel9Xw/dGFky6bVz8mOAkkOWGYPA/i50JgbQM
 fvQ7byg7G7g5087Xf1cAPKjvvpUrinN85o8GShx7quMRzJaMqvhZVm3UJLkGYV/usKrpvcL+tiSLr
 MydfQ+yj69DeICo/VAc+S7S+xZKbXYy5FurEX4sVhGo4H1/jpSVz6TNOPt3E8Fb/qQkRV5GTuyA3Y
 xhrqExabLG+QwA==;
Date: Sat, 05 Oct 2024 20:12:25 +0300
Message-Id: <86y132v5hi.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Francesco =?utf-8?Q?Potort=C3=AC?= <pot@HIDDEN>
In-Reply-To: <87a5fiijy9.fsf@HIDDEN> (message from Francesco
 =?utf-8?Q?Potort=C3=AC?= on Sat, 05 Oct 2024 18:38:22 +0200)
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
References: <87tteaznog.fsf@HIDDEN>
 <edab570c-b2fa-4162-9383-df5c8aaff251@HIDDEN>
 <8734lrrj4e.fsf@HIDDEN>
 <ea10f340-9b46-4199-93fc-274c5e81ace4@HIDDEN>
 <87o74c1ce1.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN> <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN> <86frpe2186.fsf@HIDDEN>
 <8e305b6d-8ca8-4437-990f-183ebc007d18@HIDDEN> <865xqa1ggi.fsf@HIDDEN>
 <ca89563f-b0d2-412a-9248-e4beb3ad7b84@HIDDEN> <86ttdtzoof.fsf@HIDDEN>
 <8d7dc133-9828-4023-821f-e4403f899f81@HIDDEN> <86ttdsxt6x.fsf@HIDDEN>
 <52cb1caa-9e7e-45df-b328-d60948d397f6@HIDDEN> <864j5rxca1.fsf@HIDDEN>
 <b59bf102-a9d8-4723-91ac-acc3f8ff3aa8@HIDDEN>
 <87a5fiijy9.fsf@HIDDEN>
MIME-version: 1.0
Content-type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 73484
Cc: dmitry@HIDDEN, 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
X-Spam-Score: -3.3 (---)

> From: Francesco Potortì <pot@HIDDEN>
> Date: Sat, 05 Oct 2024 18:38:22 +0200
> Cc: spwhitton@HIDDEN,
> 	73484 <at> debbugs.gnu.org,
> 	Eli Zaretskii <eliz@HIDDEN>
> 
> Eli Zaretskii:
> >> How hard is it to add to a live TAGS file fake lines which look like
> >> this:
> >> 
> >>    ^L
> >>    foo,0
> >> 
> >> (with random strings instead of "foo"), and then time some TAGS-using
> >> commands with and without these additions?
> 
> Dmitry Gutov:
> >Okay, done that.
> >
> >'M-.' takes more or less the same.
> >
> >The file size of TAGS increased from 66 MB to 85 MiB.
> >
> >Won't measure time to generate now - because the current method and the 
> >"real" one will be different, but note that it's more relevant with 
> >etags-regen-mode because the scan is performed lazily: every time the 
> >user does the first search in a new project.
> 
> Removing the Fortran and C/C++ fallbacks just for testing requires recompiling etags.c after removing the code beginning with /* Else try Fortran or C. */.  This would avoid parsing the file (except for detecting the sharp-bang) and would leave the file name in the tags file, without tags.

We are not talking about disabling the fallbacks, we are talking about
something else: the impact of having in TAGS names of files where no
tags were found (e.g., because their language was not recognized and
the fallbacks are disabled).




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 5 Oct 2024 16:38:42 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Sat Oct 05 12:38:42 2024
Received: from localhost ([127.0.0.1]:39501 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1sx7nm-0001KX-Bt
	for submit <at> debbugs.gnu.org; Sat, 05 Oct 2024 12:38:42 -0400
Received: from eggs.gnu.org ([209.51.188.92]:34788)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <pot@HIDDEN>) id 1sx7nj-0001KH-MP
 for 73484 <at> debbugs.gnu.org; Sat, 05 Oct 2024 12:38:40 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <pot@HIDDEN>)
 id 1sx7nY-0008EX-9O; Sat, 05 Oct 2024 12:38:28 -0400
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org;
 s=fencepost-gnu-org; h=MIME-Version:References:Subject:In-Reply-To:To:Date:
 From; bh=f4e4jcLQEFGllG+h5LkSH7oZQznXQ9PVQqS9gP4U7eQ=; b=CbeBtqAhheL6KaGZ9sZU
 REpnHhfB/Ib2V++RHWxTb1mVkK1s6TUDBFKGzubQzhSJ+BuXxSzoTuPTpv/E0Mn7RB645/azbv4V6
 NlQSihYgnxMbeRO8dzN24t41zDNb063cuo/lxAFOBgHax3PEsCoBxUdTldfpfxbEvI5sQXFo9YijN
 t+0As14WgBUcUeovSLR0QXs8/JmRIibRqpCq5YDqCPF2oYnLYq1PyWpCLHOg5YLtiu0U07Vp/hn7h
 UcTAUHFAQ6TmZOLhN1PmtGmBm1julu3ErpxXRK9u73WuugVel8kUeaEVDIjqtgonbwaUlAO10Ra/C
 zfIQWoJYGNIy/g==;
Message-Id: <87a5fiijy9.fsf@HIDDEN>
From: =?utf-8?Q?Francesco_Potort=C3=AC?= <pot@HIDDEN>
Date: Sat, 05 Oct 2024 18:38:22 +0200
To: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <b59bf102-a9d8-4723-91ac-acc3f8ff3aa8@HIDDEN>
 (dmitry@HIDDEN)
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
References: <87tteaznog.fsf@HIDDEN>
 <edab570c-b2fa-4162-9383-df5c8aaff251@HIDDEN>
 <8734lrrj4e.fsf@HIDDEN>
 <ea10f340-9b46-4199-93fc-274c5e81ace4@HIDDEN>
 <87o74c1ce1.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN> <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN> <86frpe2186.fsf@HIDDEN>
 <8e305b6d-8ca8-4437-990f-183ebc007d18@HIDDEN> <865xqa1ggi.fsf@HIDDEN>
 <ca89563f-b0d2-412a-9248-e4beb3ad7b84@HIDDEN> <86ttdtzoof.fsf@HIDDEN>
 <8d7dc133-9828-4023-821f-e4403f899f81@HIDDEN> <86ttdsxt6x.fsf@HIDDEN>
 <52cb1caa-9e7e-45df-b328-d60948d397f6@HIDDEN> <864j5rxca1.fsf@HIDDEN>
 <b59bf102-a9d8-4723-91ac-acc3f8ff3aa8@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain
Content-Transfer-Encoding: quoted-printable
Organization: The GNU project
X-fingerprint: 4B02 6187 5C03 D6B1 2E31  7666 09DF 2DC9 BE21 6115
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 73484
Cc: Eli Zaretskii <eliz@HIDDEN>, 73484 <at> debbugs.gnu.org,
 spwhitton@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
X-Spam-Score: -3.3 (---)

Eli Zaretskii:
>> How hard is it to add to a live TAGS file fake lines which look like
>> this:
>>
>>    ^L
>>    foo,0
>>
>> (with random strings instead of "foo"), and then time some TAGS-using
>> commands with and without these additions?

Dmitry Gutov:
>Okay, done that.
>
>'M-.' takes more or less the same.
>
>The file size of TAGS increased from 66 MB to 85 MiB.
>
>Won't measure time to generate now - because the current method and the
>"real" one will be different, but note that it's more relevant with
>etags-regen-mode because the scan is performed lazily: every time the
>user does the first search in a new project.

Removing the Fortran and C/C++ fallbacks just for testing requires recompiling etags.c after removing the code beginning with /* Else try Fortran or C. */.  This would avoid parsing the file (except for detecting the sharp-bang) and would leave the file name in the tags file, without tags.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 5 Oct 2024 15:29:34 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Sat Oct 05 11:29:34 2024
Received: from localhost ([127.0.0.1]:39399 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1sx6ir-0005yO-OJ
	for submit <at> debbugs.gnu.org; Sat, 05 Oct 2024 11:29:34 -0400
Received: from eggs.gnu.org ([209.51.188.92]:57624)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <eliz@HIDDEN>) id 1sx6ip-0005yA-UX
 for 73484 <at> debbugs.gnu.org; Sat, 05 Oct 2024 11:29:32 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <eliz@HIDDEN>)
 id 1sx6gZ-0002WJ-6F; Sat, 05 Oct 2024 11:27:11 -0400
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org;
 s=fencepost-gnu-org; h=References:Subject:In-Reply-To:To:From:Date:
 mime-version; bh=MYlgUniuHvvLKySZH6ha6Bd6sPxWUkHrhIv9DZR6PR4=; b=gbvmaUP9voe4
 MwzSDBPE2mzkjYYtIqXKrG2luGCi7gAxZxTEc7D/+IKRMOKhVhrntCym40Ss3Qbq75k/sNc0FZvym
 VPQ+yisKWmbGsTx91MmZhNQ2AM7RjnY4MYe074eFw7bOPYW2u9VTUVAh0UvPNY8bSlPo8pPHvgi+V
 CXZd49EEd0bCXQVoLDSG7UYCaWE0TWtdZUl35arva1JlFg2kSY2U4CSjwT2df2xrRkMFdiAKvfru3
 +qpikE8xm4Id5W5+9bHvTJ0gZTdM7xg6SOCSTwIk/2HpqhxQme7aLDnvH1kSCI5NQv5E9mUzlWbom
 kZG4FgSRpYIhJ7J++Tyo+A==;
Date: Sat, 05 Oct 2024 18:27:08 +0300
Message-Id: <86zfnivacz.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <b59bf102-a9d8-4723-91ac-acc3f8ff3aa8@HIDDEN> (message from
 Dmitry Gutov on Sat, 5 Oct 2024 17:29:44 +0300)
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
References: <87tteaznog.fsf@HIDDEN>
 <edab570c-b2fa-4162-9383-df5c8aaff251@HIDDEN>
 <8734lrrj4e.fsf@HIDDEN>
 <ea10f340-9b46-4199-93fc-274c5e81ace4@HIDDEN>
 <87o74c1ce1.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN> <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN> <86frpe2186.fsf@HIDDEN>
 <8e305b6d-8ca8-4437-990f-183ebc007d18@HIDDEN> <865xqa1ggi.fsf@HIDDEN>
 <ca89563f-b0d2-412a-9248-e4beb3ad7b84@HIDDEN> <86ttdtzoof.fsf@HIDDEN>
 <8d7dc133-9828-4023-821f-e4403f899f81@HIDDEN> <86ttdsxt6x.fsf@HIDDEN>
 <52cb1caa-9e7e-45df-b328-d60948d397f6@HIDDEN> <864j5rxca1.fsf@HIDDEN> 
 <b59bf102-a9d8-4723-91ac-acc3f8ff3aa8@HIDDEN>
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 73484
Cc: 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
X-Spam-Score: -3.3 (---)

> Date: Sat, 5 Oct 2024 17:29:44 +0300
> Cc: spwhitton@HIDDEN, 73484 <at> debbugs.gnu.org
> From: Dmitry Gutov <dmitry@HIDDEN>
> 
> On 05/10/2024 10:02, Eli Zaretskii wrote:
> 
> > How hard is it to add to a live TAGS file fake lines which look like
> > this:
> > 
> >    ^L
> >    foo,0
> > 
> > (with random strings instead of "foo"), and then time some TAGS-using
> > commands with and without these additions?
> 
> Okay, done that.
> 
> 'M-.' takes more or less the same.
> 
> The file size of TAGS increased from 66 MB to 85 MiB.
> 
> Won't measure time to generate now - because the current method and the 
> "real" one will be different, but note that it's more relevant with 
> etags-regen-mode because the scan is performed lazily: every time the 
> user does the first search in a new project.

Thanks.  What about the time it takes tags-search to show the prompt:
is that affected in any way?
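
For anyone who wants to repeat the experiment quoted above, here is a minimal
sketch of a helper that appends such empty file sections to an open TAGS
stream.  Each section is a form feed (^L) line followed by "NAME,0".  The
function name and the "fake-N" names are made up for this sketch; the
suggestion above was to use random strings.

```c
#include <assert.h>
#include <stdio.h>

/* Append COUNT empty file sections to the TAGS stream.
   Each section is a form feed line followed by "NAME,0", that is,
   a file name with no tags after it.  The "fake-N" names are
   placeholders invented for this sketch.  Returns the number of
   sections written, or -1 on a write error.  */
static int
append_empty_sections (FILE *tags, int count)
{
  assert (tags != NULL && count >= 0);
  for (int i = 0; i < count; i++)
    if (fprintf (tags, "\f\nfake-%d,0\n", i) < 0)
      return -1;
  return count;
}
```

A caller would open a copy of the TAGS file in append mode, call the function
with the desired number of extra entries, close the file, and then time the
TAGS-using commands against both the original and the padded copy.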




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Message-ID: <b59bf102-a9d8-4723-91ac-acc3f8ff3aa8@HIDDEN>
Date: Sat, 5 Oct 2024 17:29:44 +0300
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
To: Eli Zaretskii <eliz@HIDDEN>
References: <87tteaznog.fsf@HIDDEN>
 <edab570c-b2fa-4162-9383-df5c8aaff251@HIDDEN>
 <8734lrrj4e.fsf@HIDDEN>
 <ea10f340-9b46-4199-93fc-274c5e81ace4@HIDDEN>
 <87o74c1ce1.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN> <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN> <86frpe2186.fsf@HIDDEN>
 <8e305b6d-8ca8-4437-990f-183ebc007d18@HIDDEN> <865xqa1ggi.fsf@HIDDEN>
 <ca89563f-b0d2-412a-9248-e4beb3ad7b84@HIDDEN> <86ttdtzoof.fsf@HIDDEN>
 <8d7dc133-9828-4023-821f-e4403f899f81@HIDDEN> <86ttdsxt6x.fsf@HIDDEN>
 <52cb1caa-9e7e-45df-b328-d60948d397f6@HIDDEN> <864j5rxca1.fsf@HIDDEN>
Content-Language: en-US
From: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <864j5rxca1.fsf@HIDDEN>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN

On 05/10/2024 10:02, Eli Zaretskii wrote:

> Like I said: in huge trees this might matter.

We do want to support them, right? Or at least make the project size
cutoff (where it remains practical to use Emacs) as high as feasible.

> But in any case, I don't understand the significance of the timings
> you show: we are discussing the increase in processing time which will
> be caused by adding files with no tags, which produce a single line in
> TAGS.

If there are an order of magnitude more "other" files, and an average
source file contains only a few definitions, this can make a difference.

> Therefore the interesting figures are time differences in
> processing some commands with and without those additional lines.  Are
> the times you show above related to any of that?

The time to generate is relevant. The time to visit the tags table gets 
non-trivial too, and it can increase.

>> If someone were to provide a patch for etags with new functionality
>> (disabling fallbacks, at least), I could benchmark and come back with
>> numbers. And if experimental flags are available, with numbers for those
>> as well.
> 
> How hard is it to add to a live TAGS file fake lines which look like
> this:
> 
>    ^L
>    foo,0
> 
> (with random strings instead of "foo"), and then time some TAGS-using
> commands with and without these additions?

Okay, done that.

'M-.' takes more or less the same time.

The file size of TAGS increased from 66 MB to 85 MiB.

I won't measure the time to generate now, because the current method and
the "real" one will be different, but note that it's more relevant with
etags-regen-mode because the scan is performed lazily: every time the
user does the first search in a new project.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Date: Sat, 05 Oct 2024 10:02:46 +0300
Message-Id: <864j5rxca1.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <52cb1caa-9e7e-45df-b328-d60948d397f6@HIDDEN> (message from
 Dmitry Gutov on Sat, 5 Oct 2024 02:01:14 +0300)
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
References: <87tteaznog.fsf@HIDDEN>
 <edab570c-b2fa-4162-9383-df5c8aaff251@HIDDEN>
 <8734lrrj4e.fsf@HIDDEN>
 <ea10f340-9b46-4199-93fc-274c5e81ace4@HIDDEN>
 <87o74c1ce1.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN> <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN> <86frpe2186.fsf@HIDDEN>
 <8e305b6d-8ca8-4437-990f-183ebc007d18@HIDDEN> <865xqa1ggi.fsf@HIDDEN>
 <ca89563f-b0d2-412a-9248-e4beb3ad7b84@HIDDEN> <86ttdtzoof.fsf@HIDDEN>
 <8d7dc133-9828-4023-821f-e4403f899f81@HIDDEN> <86ttdsxt6x.fsf@HIDDEN>
 <52cb1caa-9e7e-45df-b328-d60948d397f6@HIDDEN>
Cc: 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN

> Date: Sat, 5 Oct 2024 02:01:14 +0300
> Cc: spwhitton@HIDDEN, 73484 <at> debbugs.gnu.org
> From: Dmitry Gutov <dmitry@HIDDEN>
> 
> On 04/10/2024 09:45, Eli Zaretskii wrote:
> 
> > So once again, I think this is a premature optimization.  The downside
> > of a larger TAGS will only have tangible effects in huge trees.
> 
> FWIW, TAGS for gecko-dev (Mozilla's repository which I have here for 
> testing) takes ~30 seconds to generate and ~400ms to find a definition 
> for the set of files to scan that I currently have set up. Both timings 
> seem quite impactful for user experience. I imagine some Emacs users 
> work at Mozilla, though that's only a guess.

Like I said: in huge trees this might matter.

But in any case, I don't understand the significance of the timings
you show: we are discussing the increase in processing time which will
be caused by adding files with no tags, which produce a single line in
TAGS.  Therefore the interesting figures are time differences in
processing some commands with and without those additional lines.  Are
the times you show above related to any of that?

> If someone were to provide a patch for etags with new functionality 
> (disabling fallbacks, at least), I could benchmark and come back with 
> numbers. And if experimental flags are available, with numbers for those 
> as well.

How hard is it to add to a live TAGS file fake lines which look like
this:

  ^L
  foo,0

(with random strings instead of "foo"), and then time some TAGS-using
commands with and without these additions?
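
For illustration, here is a minimal Python sketch of how such a padded
copy of a TAGS file could be produced.  It is not what was actually run
in this thread.  The function names, the 12-character random names, and
the fixed seed are assumptions made for the example; the only part taken
from the message above is the shape of each fake section (a form-feed
line followed by "NAME,0").

```python
import random
import string

FORM_FEED = "\x0c"  # the ^L character that starts each file section in TAGS


def fake_sections(count, seed=0):
    """Return COUNT tag-less file sections of the form "^L\\nNAME,0\\n".

    NAME is a random lowercase string (hypothetical choice for this sketch);
    the ",0" says that zero bytes of tag data follow for that file.
    """
    rng = random.Random(seed)  # seeded so that runs are repeatable
    parts = []
    for _ in range(count):
        name = "".join(rng.choices(string.ascii_lowercase, k=12))
        parts.append(f"{FORM_FEED}\n{name},0\n")
    return "".join(parts)


def pad_tags_file(src, dst, count, seed=0):
    """Copy the TAGS file SRC to DST and append COUNT fake sections."""
    with open(src, "rb") as f:
        original = f.read()
    # Make sure the appended sections start on a fresh line.
    if original and not original.endswith(b"\n"):
        original += b"\n"
    with open(dst, "wb") as f:
        f.write(original)
        f.write(fake_sections(count, seed).encode("ascii"))
```

Something like pad_tags_file("TAGS", "TAGS.padded", 100000) would then
give a second file to visit, so the same TAGS-using command can be timed
against both; the count of 100000 is arbitrary.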

> >> I would hope that project-find-regexp works well enough for that. Or
> >> 'M-x project-search' for the fans of the classic interface.
> > 
> > Maybe, but we do still want to keep tags-search, so the existence of
> > other commands doesn't invalidate my argument above.
> 
> In my mind, tags-search is for files that are code-related. Actual users 
> might differ, though.

The fact that we pass *.texi files to etags should already tell you
that this mental model is incomplete.  The fact that etags supports
HTML, TeX, and PostScript files (in addition to Texinfo) is further
evidence to that effect.  And that's even before we consider the
regexp feature, which could be used to tag anything in any kind of
file.

I agree that these use cases are relatively rare, but that doesn't
make them invalid or even unimportant.

> > If we want a separate optional behavior that prevents files with no
> > tags from being mentioned in TAGS, I'd argue that such an option
> > should affect all the scanned files, not just those whose language
> > could not be determined from their names.
> 
> I don't have a strong opinion here, just that it would depart from my 
> mental model mentioned above, of having all code-related files listed. 
> For example by missing some newly added .c file where no function 
> definitions have been added yet; 'M-x tags-search' would skip it.

This matches my impression that this option (which skips files with no
tags) should rarely if ever be used.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Message-ID: <52cb1caa-9e7e-45df-b328-d60948d397f6@HIDDEN>
Date: Sat, 5 Oct 2024 02:01:14 +0300
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
To: Eli Zaretskii <eliz@HIDDEN>
References: <87tteaznog.fsf@HIDDEN>
 <edab570c-b2fa-4162-9383-df5c8aaff251@HIDDEN>
 <8734lrrj4e.fsf@HIDDEN>
 <ea10f340-9b46-4199-93fc-274c5e81ace4@HIDDEN>
 <87o74c1ce1.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN> <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN> <86frpe2186.fsf@HIDDEN>
 <8e305b6d-8ca8-4437-990f-183ebc007d18@HIDDEN> <865xqa1ggi.fsf@HIDDEN>
 <ca89563f-b0d2-412a-9248-e4beb3ad7b84@HIDDEN> <86ttdtzoof.fsf@HIDDEN>
 <8d7dc133-9828-4023-821f-e4403f899f81@HIDDEN> <86ttdsxt6x.fsf@HIDDEN>
Content-Language: en-US
From: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <86ttdsxt6x.fsf@HIDDEN>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN

On 04/10/2024 09:45, Eli Zaretskii wrote:

> They will need to choose only if they want improvements.  To have the
> same behavior, with the same downsides as before, they need not change
> anything.  IOW, the change I propose does no harm to those projects.

We did talk about changing the default of etags-regen-file-extensions to 
t. I suppose it's debatable.

> And if shebang detection is desired, the choice is quite obvious, if
> you ask me: submit all the files.  The downside is making TAGS larger
> and having more file names in it, which I think is a very small
> downside, if at all, compared to advantages.
> 
> So once again, I think this is a premature optimization.  The downside
> of a larger TAGS will only have tangible effects in huge trees.

FWIW, TAGS for gecko-dev (Mozilla's repository which I have here for 
testing) takes ~30 seconds to generate and ~400ms to find a definition 
for the set of files to scan that I currently have set up. Both timings 
seem quite impactful for user experience. I imagine some Emacs users 
work at Mozilla, though that's only a guess.

If someone were to provide a patch for etags with new functionality 
(disabling fallbacks, at least), I could benchmark and come back with 
numbers. And if experimental flags are available, with numbers for those 
as well.

>>> The fact that in the scenario you describe above 2K more files will
>>> appear in tags-search is, from my POV, an argument _for_ including
>>> them, not against: we have no reason to assume that users don't want
>>> to search those files for some regexp, because regexps specified in
>>> tags-search don't necessarily have anything to do with the identifiers
>>> we tag.  A valid case in point is to look up all references to some
>>> file when the file is deleted, or references to some version when the
>>> version is updated: we definitely want files like README and INSTALL
>>> to be included in the search.
>>
>> I would hope that project-find-regexp works well enough for that. Or
>> 'M-x project-search' for the fans of the classic interface.
> 
> Maybe, but we do still want to keep tags-search, so the existence of
> other commands doesn't invalidate my argument above.

In my mind, tags-search is for files that are code-related. Actual users 
might differ, though.

>> README and INSTALL are not currently included in TAGS. You seem to be
>> making a case that all files in our dev repository should be included,
>> but for some reason the current build rules are very different?
> 
> I'm not talking specifically about Emacs, because README and INSTALL
> are typically present in many packages.  In our case, we don't pass
> them to etags for historical reasons (we have admin/*.el stuff to help
> us modify the version string in all the files that reference it, for
> example), but it is quite plausible that if we had this option back
> then, we could have used etags to help.  For example, one downside of
> what we have in admin.el is that the list of files to edit when we
> bump the version is maintained by hand, which is error-prone: we just
> had an instance of this when exec/configure.ac was added and we forgot
> to update admin.el accordingly.  Using etags would have allowed us to
> avoid such problems.

Some other aspects of having more false positives would come up as a 
result, probably. But it might be worth testing.

> If we want a separate optional behavior that prevents files with no
> tags from being mentioned in TAGS, I'd argue that such an option
> should affect all the scanned files, not just those whose language
> could not be determined from their names.

I don't have a strong opinion here, just that it would depart from my 
mental model mentioned above, of having all code-related files listed. 
For example by missing some newly added .c file where no function 
definitions have been added yet; 'M-x tags-search' would skip it.

If that makes sense to you, okay.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Date: Fri, 04 Oct 2024 09:45:10 +0300
Message-Id: <86ttdsxt6x.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <8d7dc133-9828-4023-821f-e4403f899f81@HIDDEN> (message from
 Dmitry Gutov on Fri, 4 Oct 2024 04:25:15 +0300)
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
References: <87tteaznog.fsf@HIDDEN>
 <edab570c-b2fa-4162-9383-df5c8aaff251@HIDDEN>
 <8734lrrj4e.fsf@HIDDEN>
 <ea10f340-9b46-4199-93fc-274c5e81ace4@HIDDEN>
 <87o74c1ce1.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN> <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN> <86frpe2186.fsf@HIDDEN>
 <8e305b6d-8ca8-4437-990f-183ebc007d18@HIDDEN> <865xqa1ggi.fsf@HIDDEN>
 <ca89563f-b0d2-412a-9248-e4beb3ad7b84@HIDDEN> <86ttdtzoof.fsf@HIDDEN>
 <8d7dc133-9828-4023-821f-e4403f899f81@HIDDEN>
Cc: 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN

> Date: Fri, 4 Oct 2024 04:25:15 +0300
> Cc: spwhitton@HIDDEN, 73484 <at> debbugs.gnu.org
> From: Dmitry Gutov <dmitry@HIDDEN>
> 
> >> Previously, when building a TAGS file manually, a developer in such a
> >> project specified a list of file globs by hand. One that would be
> >> limited to .[ch] files, and maybe .y as well, but not all the files in
> >> the directory.
> > 
> > If they definitely do NOT want the other files to be present in TAGS,
> > they can keep using those globs.  Nothing will change in that case.
> 
> a) They would have to produce the same list of file extensions that we 
> are using now, and they will need to find out which variable to 
> customize, to set to that list.
> 
> b) They won't get the shebang detection capability, unless we add a new 
> option where they will have to enumerate all their shebang-enabled file 
> names as well.
> 
> So it seems like they would have to choose between the one and the 
> other, with the end behavior that I'm describing not being supported
> by any combination of user options.

They will need to choose only if they want improvements.  To have the
same behavior, with the same downsides as before, they need not change
anything.  IOW, the change I propose does no harm to those projects.

And if shebang detection is desired, the choice is quite obvious, if
you ask me: submit all the files.  The downside is making TAGS larger
and having more file names in it, which I think is a very small
downside, if at all, compared to advantages.

So once again, I think this is a premature optimization.  The downside
of a larger TAGS will only have tangible effects in huge trees.
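
To make the two alternatives concrete, here is a hypothetical Python
sketch of the selection step only.  It is not how etags or
etags-regen-mode chooses files.  The extension set is the ".[ch] and
maybe .y" example from earlier in the thread, and checking the first two
bytes for "#!" is an assumed, simplified stand-in for shebang detection.

```python
from pathlib import Path

# Example extension list from the thread (".[ch] files, and maybe .y").
EXTENSIONS = {".c", ".h", ".y"}


def has_shebang(path):
    """Return True if PATH starts with "#!" (simplified shebang check)."""
    try:
        with open(path, "rb") as f:
            return f.read(2) == b"#!"
    except OSError:
        return False


def select_files(root, extensions=EXTENSIONS, detect_shebang=True):
    """Pick files under ROOT by extension, optionally also by shebang.

    With detect_shebang=False this models a fixed extension list; with
    detect_shebang=True it models the combined behavior (extension list
    plus scripts that have no recognized extension).
    """
    selected = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        if path.suffix in extensions:
            selected.append(path)
        elif detect_shebang and has_shebang(path):
            selected.append(path)
    return selected
```

In this sketch a README without a shebang is skipped either way, which
is the point of disagreement: submitting all files would list it in TAGS,
while the filtered selection would not.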

> > The fact that in the scenario you describe above 2K more files will
> > appear in tags-search is, from my POV, an argument _for_ including
> > them, not against: we have no reason to assume that users don't want
> > to search those files for some regexp, because regexps specified in
> > tags-search don't necessarily have anything to do with the identifiers
> > we tag.  A valid case in point is to look up all references to some
> > file when the file is deleted, or references to some version when the
> > version is updated: we definitely want files like README and INSTALL
> > to be included in the search.
> 
> I would hope that project-find-regexp works well enough for that. Or 
> 'M-x project-search' for the fans of the classic interface.

Maybe, but we do still want to keep tags-search, so the existence of
other commands doesn't invalidate my argument above.

> README and INSTALL are not currently included in TAGS. You seem to be 
> making a case that all files in our dev repository should be included, 
> but for some reason the current build rules are very different?

I'm not talking specifically about Emacs, because README and INSTALL
are typically present in many packages.  In our case, we don't pass
them to etags for historical reasons (we have admin/*.el stuff to help
us modify the version string in all the files that reference it, for
example), but it is quite plausible that if we had this option back
then, we could have used etags to help.  For example, one downside of
what we have in admin.el is that the list of files to edit when we
bump the version is maintained by hand, which is error-prone: we just
had an instance of this when exec/configure.ac was added and we forgot
to update admin.el accordingly.  Using etags would have allowed us to
avoid such problems.

If we want a separate optional behavior that prevents files with no
tags from being mentioned in TAGS, I'd argue that such an option
should affect all the scanned files, not just those whose language
could not be determined from their names.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


X-ME-Sender: <xms:_0P_Zi9bMzCBUyOr5Z4v5MWY3tAfkh-HLR5-ZSD3cGLU70zq2RjK5g>
 <xme:_0P_ZispFMP93-sYUgcM6tWhuygtAy9Ks-8B8gUoSap7pyeBHAunjAIeRyHRM6YrY
 NbpYKa-ZWfk6EbEtrQ>
X-ME-Received: <xmr:_0P_ZoCBlmFqyLFVZAkSMPONyTQ9CVU8AWakvzg0lwL0nyzKa4-albDQ4Ls4miptUws>
X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeeftddrvddvvddggeekucetufdoteggodetrfdotf
 fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdggtfgfnhhsuhgsshgtrhhisggvpdfu
 rfetoffkrfgpnffqhgenuceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnh
 htshculddquddttddmnecujfgurhepkfffgggfuffvvehfhfgjtgfgsehtjeertddtvdej
 necuhfhrohhmpeffmhhithhrhicuifhuthhovhcuoegumhhithhrhiesghhuthhovhdrug
 gvvheqnecuggftrfgrthhtvghrnhepteduleejgeehtefgheegjeekueehvdevieekueef
 tddvtdevfefhvdevgedujeehnecuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpe
 hmrghilhhfrhhomhepughmihhtrhihsehguhhtohhvrdguvghvpdhnsggprhgtphhtthho
 peefpdhmohguvgepshhmthhpohhuthdprhgtphhtthhopegvlhhiiiesghhnuhdrohhrgh
 dprhgtphhtthhopehsphifhhhithhtohhnsehsphifhhhithhtohhnrdhnrghmvgdprhgt
 phhtthhopeejfeegkeegseguvggssghughhsrdhgnhhurdhorhhg
X-ME-Proxy: <xmx:_0P_ZqePUWyocmVDmZvAadbfq7nbOx78j-mFwl2AEn4qLCe537M8eQ>
 <xmx:_0P_ZnP4AeIm07AdlVq1qR2bqY3tfzsQo9vmZAdw4basoH-tRH4MDw>
 <xmx:_0P_Zkm3-ER-fvGluUklPl0vIOfbo-Qcfsu2K8WqyM3M184_zbIbmA>
 <xmx:_0P_Zpsxe4q0tRdLm1TEKkeA9nK7tRTCmb95hQ29xcaQr4BebzEr5Q>
 <xmx:AET_ZgpdGxhJW1J5XjtgD1MqAnivcdmS1GPDULwE9rvprcvD2BRjEr7k>
Feedback-ID: i07de48aa:Fastmail
Received: by mail.messagingengine.com (Postfix) with ESMTPA; Thu,
 3 Oct 2024 21:25:18 -0400 (EDT)
Message-ID: <8d7dc133-9828-4023-821f-e4403f899f81@HIDDEN>
Date: Fri, 4 Oct 2024 04:25:15 +0300
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
To: Eli Zaretskii <eliz@HIDDEN>
References: <87tteaznog.fsf@HIDDEN>
 <edab570c-b2fa-4162-9383-df5c8aaff251@HIDDEN>
 <8734lrrj4e.fsf@HIDDEN>
 <ea10f340-9b46-4199-93fc-274c5e81ace4@HIDDEN>
 <87o74c1ce1.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN> <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN> <86frpe2186.fsf@HIDDEN>
 <8e305b6d-8ca8-4437-990f-183ebc007d18@HIDDEN> <865xqa1ggi.fsf@HIDDEN>
 <ca89563f-b0d2-412a-9248-e4beb3ad7b84@HIDDEN> <86ttdtzoof.fsf@HIDDEN>
Content-Language: en-US
From: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <86ttdtzoof.fsf@HIDDEN>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Score: -0.7 (/)
X-Debbugs-Envelope-To: 73484
Cc: 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN

On 03/10/2024 09:27, Eli Zaretskii wrote:

>> But here's how I'm looking at it:
>>
>> Imagine a straightforward C project, one that has .c files, .h, maybe
>> .y, and also a bunch of docs, build artefacts (some of them checked in),
>> and maybe other data files as well. Also README, ChangeLog, Makefile,
>> config.bat, some .txt files, many other files without extensions, etc.
>>
>> Previously, when building a TAGS file manually, a developer in such a
>> project specified a list of file globs by hand. One that would be
>> limited to .[ch] files, and maybe .y as well, but not all the files in
>> the directory.
> 
> If they definitely do NOT want the other files to be present in TAGS,
> they can keep using those globs.  Nothing will change in that case.

a) They would have to produce the same list of file extensions that we 
are using now, and they will need to find out which variable to 
customize, to set to that list.

b) They won't get the shebang detection capability, unless we add a new 
option where they will have to enumerate all their shebang-enabled file 
names as well.

So it seems like they would have to choose between the one and the 
other, with the end behavior that I'm describing not being supported 
by any combination of user options.
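
To make that combined behavior concrete, here is a minimal sketch in Python. It is not code from etags or etags-regen: the function name `wanted_for_tags`, the sample extension set, and the rule that shebang detection applies only to files without an extension are all assumptions made for illustration.

```python
import os

# Hypothetical whitelist for illustration only; the real default value
# of etags-regen-file-extensions is longer and is not reproduced here.
EXTENSIONS = {"c", "h", "y", "el"}


def wanted_for_tags(path, first_line, extensions=EXTENSIONS):
    """Return True if PATH should be handed to the tags generator.

    A file qualifies when its extension is whitelisted, or when it has
    no extension at all but its first line is a shebang ("#!...").
    """
    ext = os.path.splitext(path)[1].lstrip(".")
    if ext:
        # Files with an extension are judged by the whitelist alone.
        return ext in extensions
    # Extensionless files are kept only if they look like scripts.
    return first_line.startswith("#!")
```

Under these assumptions a README or a config.bat is left out, while an extensionless script that starts with "#!/bin/sh" is still picked up.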

>> To use Emacs itself as an example, the 'tags' target in our own Makefile
>> only includes .[hc], .m, .cc, .el and (surprising to me) .texi files.
>> But not any of the others. The number of such files is ~3K, if I'm
>> counting correctly.
>>
>> The total number of all non-ignored files in our repo is ~5K. That's 2K
>> more files that would be present in the 'M-x tags-search' or 'M-x
>> list-tags' outputs, if an Emacs developer simply switches to using
>> etags-regen-mode, and etags-regen-mode drops the file extensions
>> whitelist, and etags keeps all passed files' names in its output.
> 
> OTOH, if a file with a known extension has no taggable symbols, you
> still get its file name in TAGS.  So omitting files whose language we
> could not recognize would be an incompatible change in behavior.

Incompatible change in etags' behavior, but likely a more compatible 
change in the behavior of the default Emacs.

For etags, though, we could add an opt-in flag.

> The fact that in the scenario you describe above 2K more files will
> appear in tags-search is, from my POV, an argument _for_ including
> them, not against: we have no reason to assume that users don't want
> to search those files for some regexp, because regexps specified in
> tags-search don't necessarily have anything to do with the identifiers
> we tag.  A valid case in point is to look up all references to some
> file when the file is deleted, or references to some version when the
> version is updated: we definitely want files like README and INSTALL
> to be included in the search.

I would hope that project-find-regexp works well enough for that. Or 
'M-x project-search' for the fans of the classic interface.

README and INSTALL are not currently included in TAGS. You seem to be 
making a case that all files in our dev repository should be included, 
but for some reason the current build rules are very different?




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 3 Oct 2024 06:27:51 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Thu Oct 03 02:27:51 2024
Received: from localhost ([127.0.0.1]:59765 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1swFJW-0006vw-U3
	for submit <at> debbugs.gnu.org; Thu, 03 Oct 2024 02:27:51 -0400
Received: from eggs.gnu.org ([209.51.188.92]:41260)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <eliz@HIDDEN>) id 1swFJU-0006vQ-JX
 for 73484 <at> debbugs.gnu.org; Thu, 03 Oct 2024 02:27:49 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <eliz@HIDDEN>)
 id 1swFJN-0001vk-89; Thu, 03 Oct 2024 02:27:41 -0400
Date: Thu, 03 Oct 2024 09:27:28 +0300
Message-Id: <86ttdtzoof.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <ca89563f-b0d2-412a-9248-e4beb3ad7b84@HIDDEN> (message from
 Dmitry Gutov on Thu, 3 Oct 2024 01:03:14 +0300)
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
References: <87tteaznog.fsf@HIDDEN>
 <edab570c-b2fa-4162-9383-df5c8aaff251@HIDDEN>
 <8734lrrj4e.fsf@HIDDEN>
 <ea10f340-9b46-4199-93fc-274c5e81ace4@HIDDEN>
 <87o74c1ce1.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN> <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN> <86frpe2186.fsf@HIDDEN>
 <8e305b6d-8ca8-4437-990f-183ebc007d18@HIDDEN> <865xqa1ggi.fsf@HIDDEN>
 <ca89563f-b0d2-412a-9248-e4beb3ad7b84@HIDDEN>
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 73484
Cc: 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN

> Date: Thu, 3 Oct 2024 01:03:14 +0300
> Cc: spwhitton@HIDDEN, 73484 <at> debbugs.gnu.org
> From: Dmitry Gutov <dmitry@HIDDEN>
> 
> >> ...but if there are no matches I'd prefer the files to be skipped. The
> >> files detected as type 'none' anyway.
> > 
> > I don't like this, and I think this is misguided.  I also don't see
> > any special problem with having lines that name files in TAGS, it
> > isn't like the size of TAGS will grow significantly or its processing
> > will be significantly slower.  IOW, this sounds like a clear case of
> > premature optimization.
> 
> I could do some experiments, if you post preliminary support of that 
> flag, with "empty" files in TAGS and without.

OK.

> But here's how I'm looking at it:
> 
> Imagine a straightforward C project, one that has .c files, .h, maybe 
> .y, and also a bunch of docs, build artefacts (some of them checked in), 
> and maybe other data files as well. Also README, ChangeLog, Makefile, 
> config.bat, some .txt files, many other files without extensions, etc.
> 
> Previously, when building a TAGS file manually, a developer in such a 
> project specified a list of file globs by hand. One that would be 
> limited to .[ch] files, and maybe .y as well, but not all the files in 
> the directory.

If they definitely do NOT want the other files to be present in TAGS,
they can keep using those globs.  Nothing will change in that case.

> To use Emacs itself as an example, the 'tags' target in our own Makefile 
> only includes .[hc], .m, .cc, .el and (surprising to me) .texi files. 
> But not any of the others. The number of such files is ~3K, if I'm 
> counting correctly.
> 
> The total number of all non-ignored files in our repo is ~5K. That's 2K 
> more files that would be present in the 'M-x tags-search' or 'M-x 
> list-tags' outputs, if an Emacs developer simply switches to using 
> etags-regen-mode, and etags-regen-mode drops the file extensions 
> whitelist, and etags keeps all passed files' names in its output.

OTOH, if a file with a known extension has no taggable symbols, you
still get its file name in TAGS.  So omitting files whose language we
could not recognize would be an incompatible change in behavior.
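
As a rough illustration of what "its file name in TAGS" means: in the etags output format, each scanned file gets a section that starts with a form feed on its own line, followed by a "FILENAME,SIZE" header and then zero or more tag lines. The Python sketch below lists the files named in such text together with the number of tag lines under each. The function name `files_in_tags` and the sample data are hand-written assumptions, not output of a real etags run.

```python
def files_in_tags(tags_text):
    """Map each file named in TAGS_TEXT to the number of tag lines under it."""
    result = {}
    # Every file section begins with a form feed followed by a newline.
    for section in tags_text.split("\x0c\n")[1:]:
        lines = [line for line in section.split("\n") if line]
        # The header is "FILENAME,SIZE"; split on the last comma so that
        # commas inside the file name are preserved.
        name, _size = lines[0].rsplit(",", 1)
        result[name] = len(lines) - 1
    return result


# Hand-written sample: one file with a single tag, one file with none.
SAMPLE = (
    "\x0c\nfoo.c,14\n"
    "int main(\x7f1,0\n"
    "\x0c\nREADME,0\n"
)
```

With this sample, README still shows up as an entry with zero tags, which is the kind of entry that tags-search can use even though no identifier was tagged.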

The fact that in the scenario you describe above 2K more files will
appear in tags-search is, from my POV, an argument _for_ including
them, not against: we have no reason to assume that users don't want
to search those files for some regexp, because regexps specified in
tags-search don't necessarily have anything to do with the identifiers
we tag.  A valid case in point is to look up all references to some
file when the file is deleted, or references to some version when the
version is updated: we definitely want files like README and INSTALL
to be included in the search.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 2 Oct 2024 22:03:26 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Wed Oct 02 18:03:26 2024
Received: from localhost ([127.0.0.1]:59429 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1sw7RO-0004FB-6x
	for submit <at> debbugs.gnu.org; Wed, 02 Oct 2024 18:03:26 -0400
Received: from fhigh-a6-smtp.messagingengine.com ([103.168.172.157]:33449)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <dmitry@HIDDEN>) id 1sw7RN-0004Ex-3E
 for 73484 <at> debbugs.gnu.org; Wed, 02 Oct 2024 18:03:25 -0400
Received: from phl-compute-10.internal (phl-compute-10.phl.internal
 [10.202.2.50])
 by mailfhigh.phl.internal (Postfix) with ESMTP id EDAC81140162;
 Wed,  2 Oct 2024 18:03:18 -0400 (EDT)
Received: from phl-mailfrontend-02 ([10.202.2.163])
 by phl-compute-10.internal (MEProxy); Wed, 02 Oct 2024 18:03:18 -0400
Received: by mail.messagingengine.com (Postfix) with ESMTPA; Wed,
 2 Oct 2024 18:03:17 -0400 (EDT)
Message-ID: <ca89563f-b0d2-412a-9248-e4beb3ad7b84@HIDDEN>
Date: Thu, 3 Oct 2024 01:03:14 +0300
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
To: Eli Zaretskii <eliz@HIDDEN>
References: <87tteaznog.fsf@HIDDEN>
 <edab570c-b2fa-4162-9383-df5c8aaff251@HIDDEN>
 <8734lrrj4e.fsf@HIDDEN>
 <ea10f340-9b46-4199-93fc-274c5e81ace4@HIDDEN>
 <87o74c1ce1.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN> <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN> <86frpe2186.fsf@HIDDEN>
 <8e305b6d-8ca8-4437-990f-183ebc007d18@HIDDEN> <865xqa1ggi.fsf@HIDDEN>
Content-Language: en-US
From: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <865xqa1ggi.fsf@HIDDEN>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Score: -0.7 (/)
X-Debbugs-Envelope-To: 73484
Cc: 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN

On 02/10/2024 21:56, Eli Zaretskii wrote:
>> Date: Wed, 2 Oct 2024 21:00:58 +0300
>> Cc: spwhitton@HIDDEN, 73484 <at> debbugs.gnu.org
>> From: Dmitry Gutov <dmitry@HIDDEN>
>>
>> On 02/10/2024 14:28, Eli Zaretskii wrote:
>>>> Date: Tue, 1 Oct 2024 02:19:17 +0300
>>>> Cc: spwhitton@HIDDEN, 73484 <at> debbugs.gnu.org
>>>> From: Dmitry Gutov <dmitry@HIDDEN>
>>>>
>>>> Just do nothing.
>>>
>>> Doing nothing means the file's name will not appear at all in TAGS.  I
>>> don't think that's TRT, since every file submitted to etags should be
>>> mentioned in TAGS for the benefit of tags-search and similar features.
>>
>> Hmm, maybe another flag, then?
>>
>> Including many unrelated files would just bloat the tags file for little
>> reason. And unlike manual generation, it's not like the user asked for
>> all of them to be included.
> 
> What do we tell users of tags-search and its ilk?

We can consider how most of such users' indexes look. See below.

>>> So I currently tend to modify etags such that if no language was
>>> detected by the file's name/extension, and this new no-fallbacks
>>> option was specified, etags will behave as if given --language=none
>>> (which also means that if any regexps were specified, they will be
>>> processed correctly for such files).
>>
>> Any regexps for "all" files, right?
> 
> The rules for regexps don't change: each regexp applies to the files
> that follow it on the command line.

This seems okay.

>> ...but if there are no matches I'd prefer the files to be skipped. The
>> files detected as type 'none' anyway.
> 
> I don't like this, and I think this is misguided.  I also don't see
> any special problem with having lines that name files in TAGS, it
> isn't like the size of TAGS will grow significantly or its processing
> will be significantly slower.  IOW, this sounds like a clear case of
> premature optimization.

I could do some experiments, if you post preliminary support of that 
flag, with "empty" files in TAGS and without.

But here's how I'm looking at it:

Imagine a straightforward C project, one that has .c files, .h, maybe 
.y, and also a bunch of docs, build artefacts (some of them checked in), 
and maybe other data files as well. Also README, ChangeLog, Makefile, 
config.bat, some .txt files, many other files without extensions, etc.

Previously, when building a TAGS file manually, a developer in such a 
project specified a list of file globs by hand. One that would be 
limited to .[ch] files, and maybe .y as well, but not all the files in 
the directory.

To use Emacs itself as an example, the 'tags' target in our own Makefile 
only includes .[hc], .m, .cc, .el and (surprising to me) .texi files. 
But not any of the others. The number of such files is ~3K, if I'm 
counting correctly.

The total number of all non-ignored files in our repo is ~5K. That's 2K 
more files that would be present in the 'M-x tags-search' or 'M-x 
list-tags' outputs, if an Emacs developer simply switches to using 
etags-regen-mode, and etags-regen-mode drops the file extensions 
whitelist, and etags keeps all passed files' names in its output.
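
The comparison above can be reproduced with a small sketch along these lines. This is only an illustration in Python: the function name `split_by_extension` is made up, and the extension set is just the one listed above for the 'tags' target, not a copy of the actual Makefile rules.

```python
import os

# Extensions mentioned above for the 'tags' target in the Emacs Makefile.
TAGS_TARGET_EXTENSIONS = {".h", ".c", ".m", ".cc", ".el", ".texi"}


def split_by_extension(paths, extensions=TAGS_TARGET_EXTENSIONS):
    """Split PATHS into (matched, others) according to their extension."""
    matched = []
    others = []
    for path in paths:
        if os.path.splitext(path)[1] in extensions:
            matched.append(path)
        else:
            others.append(path)
    return matched, others
```

Feeding it the list of non-ignored files in a checkout (for example, the lines printed by `git ls-files`) gives the two counts being compared: files the glob-based target would tag, and the extra files that would appear once the whitelist is dropped.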




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 2 Oct 2024 18:56:57 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Wed Oct 02 14:56:57 2024
Received: from localhost ([127.0.0.1]:59229 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1sw4Wu-0001tI-SO
	for submit <at> debbugs.gnu.org; Wed, 02 Oct 2024 14:56:57 -0400
Received: from eggs.gnu.org ([209.51.188.92]:35948)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <eliz@HIDDEN>) id 1sw4Wt-0001sz-G2
 for 73484 <at> debbugs.gnu.org; Wed, 02 Oct 2024 14:56:56 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <eliz@HIDDEN>)
 id 1sw4Wn-0001VR-2e; Wed, 02 Oct 2024 14:56:49 -0400
Date: Wed, 02 Oct 2024 21:56:45 +0300
Message-Id: <865xqa1ggi.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <8e305b6d-8ca8-4437-990f-183ebc007d18@HIDDEN> (message from
 Dmitry Gutov on Wed, 2 Oct 2024 21:00:58 +0300)
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
References: <87tteaznog.fsf@HIDDEN>
 <edab570c-b2fa-4162-9383-df5c8aaff251@HIDDEN>
 <8734lrrj4e.fsf@HIDDEN>
 <ea10f340-9b46-4199-93fc-274c5e81ace4@HIDDEN>
 <87o74c1ce1.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN> <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN> <86frpe2186.fsf@HIDDEN>
 <8e305b6d-8ca8-4437-990f-183ebc007d18@HIDDEN>
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 73484
Cc: 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN

> Date: Wed, 2 Oct 2024 21:00:58 +0300
> Cc: spwhitton@HIDDEN, 73484 <at> debbugs.gnu.org
> From: Dmitry Gutov <dmitry@HIDDEN>
> 
> On 02/10/2024 14:28, Eli Zaretskii wrote:
> >> Date: Tue, 1 Oct 2024 02:19:17 +0300
> >> Cc: spwhitton@HIDDEN, 73484 <at> debbugs.gnu.org
> >> From: Dmitry Gutov <dmitry@HIDDEN>
> >>
> >> Just do nothing.
> > 
> > Doing nothing means the file's name will not appear at all in TAGS.  I
> > don't think that's TRT, since every file submitted to etags should be
> > mentioned in TAGS for the benefit of tags-search and similar features.
> 
> Hmm, maybe another flag, then?
> 
> Including many unrelated files would just bloat the tags file for little 
> reason. And unlike manual generation, it's not like the user asked for 
> all of them to be included.

What do we tell users of tags-search and its ilk?

> > So I currently tend to modify etags such that if no language was
> > detected by the file's name/extension, and this new no-fallbacks
> > option was specified, etags will behave as if given --language=none
> > (which also means that if any regexps were specified, they will be
> > processed correctly for such files).
> 
> Any regexps for "all" files, right?

The rules for regexps don't change: each regexp applies to the files
that follow it on the command line.
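
A simplified model of that ordering rule, written in Python purely for illustration (this is not how etags is implemented, and the function name `regexps_per_file` is made up). It assumes that `--regex=` options accumulate and that `--no-regex` clears them for the files that follow, and it ignores every other etags option, including language-specific regexps.

```python
def regexps_per_file(args):
    """Return a list of (file, regexps in effect) pairs for ARGS."""
    active = []   # regexps seen so far that are still in effect
    result = []
    for arg in args:
        if arg.startswith("--regex="):
            # A regexp affects only the files named after it.
            active.append(arg[len("--regex="):])
        elif arg == "--no-regex":
            # Stop regexp matching for the files that follow.
            active = []
        else:
            # Anything else is treated as a file name in this model.
            result.append((arg, list(active)))
    return result
```

In this model, a file named before the first `--regex=` option gets no regexps at all, which is why the position of the option relative to the file names matters.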

> ...but if there are no matches I'd prefer the files to be skipped. The 
> files detected as type 'none' anyway.

I don't like this, and I think this is misguided.  I also don't see
any special problem with having lines that name files in TAGS, it
isn't like the size of TAGS will grow significantly or its processing
will be significantly slower.  IOW, this sounds like a clear case of
premature optimization.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 2 Oct 2024 18:01:11 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Wed Oct 02 14:01:11 2024
Received: from localhost ([127.0.0.1]:59169 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1sw3ex-0007If-F1
	for submit <at> debbugs.gnu.org; Wed, 02 Oct 2024 14:01:11 -0400
Received: from fhigh-a3-smtp.messagingengine.com ([103.168.172.154]:46683)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <dmitry@HIDDEN>) id 1sw3ew-0007IT-2v
 for 73484 <at> debbugs.gnu.org; Wed, 02 Oct 2024 14:01:10 -0400
Received: from phl-compute-10.internal (phl-compute-10.phl.internal
 [10.202.2.50])
 by mailfhigh.phl.internal (Postfix) with ESMTP id EB4ED11401CD;
 Wed,  2 Oct 2024 14:01:03 -0400 (EDT)
Received: from phl-mailfrontend-02 ([10.202.2.163])
 by phl-compute-10.internal (MEProxy); Wed, 02 Oct 2024 14:01:03 -0400
Received: by mail.messagingengine.com (Postfix) with ESMTPA; Wed,
 2 Oct 2024 14:01:01 -0400 (EDT)
Message-ID: <8e305b6d-8ca8-4437-990f-183ebc007d18@HIDDEN>
Date: Wed, 2 Oct 2024 21:00:58 +0300
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
To: Eli Zaretskii <eliz@HIDDEN>
References: <87tteaznog.fsf@HIDDEN>
 <edab570c-b2fa-4162-9383-df5c8aaff251@HIDDEN>
 <8734lrrj4e.fsf@HIDDEN>
 <ea10f340-9b46-4199-93fc-274c5e81ace4@HIDDEN>
 <87o74c1ce1.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN> <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN> <86frpe2186.fsf@HIDDEN>
Content-Language: en-US
From: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <86frpe2186.fsf@HIDDEN>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Score: -0.7 (/)
X-Debbugs-Envelope-To: 73484
Cc: 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -1.7 (-)

On 02/10/2024 14:28, Eli Zaretskii wrote:
>> Date: Tue, 1 Oct 2024 02:19:17 +0300
>> Cc: spwhitton@HIDDEN, 73484 <at> debbugs.gnu.org
>> From: Dmitry Gutov <dmitry@HIDDEN>
>>
>> On 29/09/2024 11:25, Eli Zaretskii wrote:
>>> I understand that we need to disable the Fortran and C fallbacks to
>>> avoid false positives, but what do we want to do if the fallbacks are
>>> disabled and no suitable language parser is found using the file name?
>>> Just skip the file and do nothing? emit a warning? something else?
>>
>> Just do nothing.
> 
> Doing nothing means the file's name will not appear at all in TAGS.  I
> don't think that's TRT, since every file submitted to etags should be
> mentioned in TAGS for the benefit of tags-search and similar features.

Hmm, maybe another flag, then?

Including many unrelated files would just bloat the tags file for little 
reason. And unlike manual generation, it's not like the user asked for 
all of them to be included.

> So I currently tend to modify etags such that if no language was
> detected by the file's name/extension, and this new no-fallbacks
> option was specified, etags will behave as if given --language=none
> (which also means that if any regexps were specified, they will be
> processed correctly for such files).

Any regexps for "all" files, right? For our etags-regen configuration in 
the Emacs repo, for example, we add 2 regexps, but for specific file 
types only.

If regexps are configured for 'none', and they match something, 
certainly the file should be in the index.

> If no regexps were specified or
> none matched, this means only the file's name will appear in TAGS, and
> that's all.

...but if there are no matches, I'd prefer the files to be skipped. These 
are files detected as type 'none' anyway.
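
The difference between the two positions above (always mention the file in TAGS, or skip it when nothing matched) can be sketched as a small filter. This is only an illustrative Python model, not etags code: the function name `files_to_index`, the `keep_empty` parameter, and the sample file names are hypothetical.

```python
def files_to_index(tags_by_file, keep_empty):
    """Return the file names that would get a section in TAGS.

    tags_by_file maps a file name to the list of tags found in it
    (an empty list means neither a parser nor a regexp matched).
    keep_empty=True models "every submitted file is mentioned";
    keep_empty=False models "skip files with no matches".
    """
    return [name for name, tags in tags_by_file.items() if tags or keep_empty]


# Hypothetical scan result: one file with a tag, one with none.
scan = {"etags.c": ["main"], "README": []}
print(files_to_index(scan, keep_empty=True))
print(files_to_index(scan, keep_empty=False))
```

With `keep_empty=False` the tags file stays smaller, at the cost of tags-search and similar features not seeing the files that were skipped.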




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 2 Oct 2024 11:28:24 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Wed Oct 02 07:28:24 2024
Received: from localhost ([127.0.0.1]:56750 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1svxWp-0001xP-W6
	for submit <at> debbugs.gnu.org; Wed, 02 Oct 2024 07:28:24 -0400
Received: from eggs.gnu.org ([209.51.188.92]:46986)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <eliz@HIDDEN>) id 1svxWn-0001xA-5y
 for 73484 <at> debbugs.gnu.org; Wed, 02 Oct 2024 07:28:22 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <eliz@HIDDEN>)
 id 1svxWg-0008Fc-LO; Wed, 02 Oct 2024 07:28:14 -0400
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org;
 s=fencepost-gnu-org; h=References:Subject:In-Reply-To:To:From:Date:
 mime-version; bh=CYQUSp+kw46nngEaLqeT5oB2wEWsVbDxF9kR5OyPwXU=; b=m5MtSYU1oAM8
 3DlUTPHnsNCteagb6Y5+zhxwJDDrR8/xENcj7rDr+Wj404vue96DBw1bYfbIO9Ty91MXXRJzi4v+v
 qD+jhb2T75Qge/ALtjw87h0TR44zfk4xk3UMjnhesELjm6WZpjT5eVkT1NIbtk3XAR+PxLoyomNIt
 Ah7ZfQ26GdHbV7TyPy/TBhC9CHloPoU5aQ1IagCsT3ECfM5GS9UOgwmcYrR1gweBbYeyLHN6dEuUw
 zjFysSAHhQlJXOjGrOs6bZ80Im537TDFFerJ12FG/cikZsaMlnfokG42/7Y15/fkndLrg9cg1Ya0d
 NFGgvwjmFbpqLbiLuqUF5Q==;
Date: Wed, 02 Oct 2024 14:28:09 +0300
Message-Id: <86frpe2186.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN> (message from
 Dmitry Gutov on Tue, 1 Oct 2024 02:19:17 +0300)
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
References: <87tteaznog.fsf@HIDDEN>
 <edab570c-b2fa-4162-9383-df5c8aaff251@HIDDEN>
 <8734lrrj4e.fsf@HIDDEN>
 <ea10f340-9b46-4199-93fc-274c5e81ace4@HIDDEN>
 <87o74c1ce1.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN> <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN>
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 73484
Cc: 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -3.3 (---)

> Date: Tue, 1 Oct 2024 02:19:17 +0300
> Cc: spwhitton@HIDDEN, 73484 <at> debbugs.gnu.org
> From: Dmitry Gutov <dmitry@HIDDEN>
> 
> On 29/09/2024 11:25, Eli Zaretskii wrote:
> > I understand that we need to disable the Fortran and C fallbacks to
> > avoid false positives, but what do we want to do if the fallbacks are
> > disabled and no suitable language parser is found using the file name?
> > Just skip the file and do nothing? emit a warning? something else?
> 
> Just do nothing.

Doing nothing means the file's name will not appear at all in TAGS.  I
don't think that's TRT, since every file submitted to etags should be
mentioned in TAGS for the benefit of tags-search and similar features.

So I currently tend to modify etags such that if no language was
detected by the file's name/extension, and this new no-fallbacks
option was specified, etags will behave as if given --language=none
(which also means that if any regexps were specified, they will be
processed correctly for such files).  If no regexps were specified or
none matched, this means only the file's name will appear in TAGS, and
that's all.

If the above is not a good plan for some reason, feel free to holler.
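
As a rough illustration of the plan above, the following Python sketch models what a file would contribute to TAGS when no language was detected and the no-fallbacks option is in effect. It is not the etags implementation: the names `effective_language` and `tags_section`, the `no_fallbacks` parameter, and the "fallback" label are assumptions made for the example. The section layout (a form feed line, then the file name and the byte size of the tag lines that follow) is the one implied by the "foo,0" entry quoted elsewhere in this thread.

```python
FORM_FEED = "\x0c"  # the literal ^L that starts each file's section in TAGS


def effective_language(detected, no_fallbacks):
    """Pick the language to use for one file.

    detected is the language found from the file's name/extension,
    or None if nothing matched.
    """
    if detected is not None:
        return detected
    # Assumption: with the new option, behave as if --language=none was given;
    # without it, the existing fallback chain applies.
    return "none" if no_fallbacks else "fallback"


def tags_section(file_name, tag_lines):
    """Render one file's section; with no tag lines only the name appears."""
    body = "".join(line + "\n" for line in tag_lines)
    size = len(body.encode("utf-8"))
    return f"{FORM_FEED}\n{file_name},{size}\n{body}"


# A file with no detected language and no matching regexps.
print(effective_language(None, no_fallbacks=True))
print(repr(tags_section("foo", [])))
```

Under these assumptions, a file for which no regexp matched still yields a section consisting only of its name, which is what keeps it visible to tags-search.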




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 1 Oct 2024 22:01:34 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Tue Oct 01 18:01:34 2024
Received: from localhost ([127.0.0.1]:53938 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1svkw2-00055J-2K
	for submit <at> debbugs.gnu.org; Tue, 01 Oct 2024 18:01:34 -0400
Received: from fhigh-a7-smtp.messagingengine.com ([103.168.172.158]:42023)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <dmitry@HIDDEN>) id 1svkvz-00055A-Qv
 for 73484 <at> debbugs.gnu.org; Tue, 01 Oct 2024 18:01:32 -0400
Received: from phl-compute-01.internal (phl-compute-01.phl.internal
 [10.202.2.41])
 by mailfhigh.phl.internal (Postfix) with ESMTP id 017AA1140521;
 Tue,  1 Oct 2024 18:01:27 -0400 (EDT)
Received: from phl-mailfrontend-02 ([10.202.2.163])
 by phl-compute-01.internal (MEProxy); Tue, 01 Oct 2024 18:01:27 -0400
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gutov.dev; h=cc
 :cc:content-transfer-encoding:content-type:content-type:date
 :date:from:from:in-reply-to:in-reply-to:message-id:mime-version
 :references:reply-to:subject:subject:to:to; s=fm1; t=1727820086;
 x=1727906486; bh=QhGe4Rc7rkIpT/PtT6JUOyKK/zxDU2uhNWfKxEQZYnM=; b=
 SsvKF2rvHO9keQl0cnBQGbKPsAuYrk1LWAi5mk7jls7akRMo8y4+n7TEmxCO0gwq
 bmETVByz0nRnmAUFeoP0m6vZ/Q/e75Ad4n0iGVbhDNKZUnKoXqJOabajymqNptjZ
 Awnqno96hCnvDqU6fqs4BuJquArStyyhBD/OL2+8Ruy6MBeCq5HQ80s+lh2awG9V
 i/wBjr4Myhc5VKD576LnQiHsockfKe6hBQoiEUgKEdn3w6N5He+nlhH0NKKMNNAD
 uiTe999wGYyO/xKH+K/vA/1+uzIMc/NmP+U5+b4aLmTi/y4B63ycqaaVAcwnEWKd
 Tkwag9KFc+cmmgOmTR54uQ==
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=
 messagingengine.com; h=cc:cc:content-transfer-encoding
 :content-type:content-type:date:date:feedback-id:feedback-id
 :from:from:in-reply-to:in-reply-to:message-id:mime-version
 :references:reply-to:subject:subject:to:to:x-me-proxy:x-me-proxy
 :x-me-sender:x-me-sender:x-sasl-enc; s=fm2; t=1727820086; x=
 1727906486; bh=QhGe4Rc7rkIpT/PtT6JUOyKK/zxDU2uhNWfKxEQZYnM=; b=L
 nZVQxmTlep68JbJFAFzLvlVcodq5URLNzWjT140HTa4YzGZsTqrjsXOycJIHSG1Q
 fsf+evA1yW/y5d7DyOhhXZXwFnjnmXJOMJjW34GYz9xk74OU/P7BCov2vt13a2iu
 KO67Qcoe6aMylka2G6DO7JlWCkBCN7n6Xf2SlHDejqIobipfDtSYbl/QntwceFti
 BrPQrDZD5YTCFj3nJdo6PNUEPv3l8UGwkM19EaVtriB2NS3xmOs4PltYPq9xgejC
 VkhCb3n2XCSh6za9HjMTqh2xJk4QMjoWMT08T0Pj1pJUoGu1Tr+qAAEvBAS+2HqB
 BLeAbmOOd4GeYPZNz677Q==
X-ME-Sender: <xms:NnH8ZjTyuql3O5BaA3PmEPeowynvwSD_Rj7OvXqGpyQmtdh-3vti7g>
 <xme:NnH8Zkz0GeJGs9ZWi2Q4HFRnK3NZkI0NJtLQ67LpLvONpf8_WsABuuXaEXZeBKM6D
 c6K3b53ohuTzzO9mj4>
X-ME-Received: <xmr:NnH8Zo2k-au_9BE_rwcKoZ5Y2lzTOr3vbe7HdY1tbH3U_aZt9Sobp9nQvtLC7PrYCkg>
X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeeftddrvddukedgtdeiucetufdoteggodetrfdotf
 fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdggtfgfnhhsuhgsshgtrhhisggvpdfu
 rfetoffkrfgpnffqhgenuceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnh
 htshculddquddttddmnecujfgurhepkfffgggfuffvvehfhfgjtgfgsehtjeertddtvdej
 necuhfhrohhmpeffmhhithhrhicuifhuthhovhcuoegumhhithhrhiesghhuthhovhdrug
 gvvheqnecuggftrfgrthhtvghrnhepffeifedvleeukedtgfelieegudfgveekfeejveej
 ffetffeuueeugefhveeiuddvnecuffhomhgrihhnpehgnhhurdhorhhgnecuvehluhhsth
 gvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhepughmihhtrhihsehguhht
 ohhvrdguvghvpdhnsggprhgtphhtthhopeefpdhmohguvgepshhmthhpohhuthdprhgtph
 htthhopegvlhhiiiesghhnuhdrohhrghdprhgtphhtthhopehsphifhhhithhtohhnsehs
 phifhhhithhtohhnrdhnrghmvgdprhgtphhtthhopeejfeegkeegseguvggssghughhsrd
 hgnhhurdhorhhg
X-ME-Proxy: <xmx:NnH8ZjAqAcBcbXaB2lv34F0EcyJHKdS7HXnxXvD4cB8Gg5o3oZvQJg>
 <xmx:NnH8Zsja6YwIMHHwV4LsyHwz7rZvSGl5fWcHvTgA9lfcAUWLj9Gq_Q>
 <xmx:NnH8ZnpzaLvDzGMxg9clHMccCn1lH5CuSvdV0oHbpuPZPxNDBRV4jw>
 <xmx:NnH8ZnhYTlZdjXdmBKdN6BQDdP3EStNCJpuelRS97KxlA6rD3HOvMw>
 <xmx:NnH8ZlsOemRYwQ3aTaqvrxVzpkBpEMMa53jeoLxeSYLIB_2JLSKRi0Mv>
Feedback-ID: i07de48aa:Fastmail
Received: by mail.messagingengine.com (Postfix) with ESMTPA; Tue,
 1 Oct 2024 18:01:25 -0400 (EDT)
Message-ID: <e1da4d38-d9aa-4b41-b5c3-4869e051c46a@HIDDEN>
Date: Wed, 2 Oct 2024 01:01:23 +0300
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
To: Eli Zaretskii <eliz@HIDDEN>
References: <87tteaznog.fsf@HIDDEN>
 <edab570c-b2fa-4162-9383-df5c8aaff251@HIDDEN>
 <8734lrrj4e.fsf@HIDDEN>
 <ea10f340-9b46-4199-93fc-274c5e81ace4@HIDDEN>
 <87o74c1ce1.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN> <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN> <86a5fn3m23.fsf@HIDDEN>
Content-Language: en-US
From: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <86a5fn3m23.fsf@HIDDEN>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Score: -0.7 (/)
X-Debbugs-Envelope-To: 73484
Cc: 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -1.7 (-)

On 01/10/2024 18:00, Eli Zaretskii wrote:

>> Just do nothing. We'd really want to delegate language detection to
>> etags rather than doing it inside Elisp - the latter is slower and
>> ultimately more limited. But for that etags needs to have a reliable
>> detection logic, one without too many false positives (and IME false
>> positives here are worse than false negatives, because scanning too much
>> can often mean both wrong tags and long scans, and a completion table
>> that gets too large because of bogus tags).
> 
> I'm not sure I understand: if you worry about performance, then
> disabling fallbacks will not eliminate all of the cases where etags
> scans the entire file or at least some of its portions.

etags's scanning should still be faster than doing it in Lisp, or than the 
subsequent calls to tags-completion-table or etags--xref-find-definitions.

Further, the last function searches through the tags file repeatedly, 
so it's important to keep the etags scanner's accuracy high: no 
incorrectly recognized files and no wrong index entries.

> Can you explain to me again what exactly is the problem with the
> fallbacks in the context of etags-regen?

We've talked about this before; here's my previous reply: 
https://lists.gnu.org/archive/html/emacs-devel/2018-01/msg00387.html

I don't have the same experiment at hand, but the past me seems to be 
saying that scanning files incorrectly can also make the whole scan take 
considerably longer. It also makes the resulting file bigger, which in 
turn makes parsing it from Emacs slower, and so on.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 1 Oct 2024 15:00:50 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Tue Oct 01 11:00:50 2024
Received: from localhost ([127.0.0.1]:51870 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1sveMr-0003qV-QA
	for submit <at> debbugs.gnu.org; Tue, 01 Oct 2024 11:00:50 -0400
Received: from eggs.gnu.org ([209.51.188.92]:57436)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <eliz@HIDDEN>) id 1sveMp-0003qN-Br
 for 73484 <at> debbugs.gnu.org; Tue, 01 Oct 2024 11:00:48 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <eliz@HIDDEN>)
 id 1sveMk-0008EW-C4; Tue, 01 Oct 2024 11:00:42 -0400
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org;
 s=fencepost-gnu-org; h=References:Subject:In-Reply-To:To:From:Date:
 mime-version; bh=83Jatd5Rh1ITBQPALx7Be7YPUQDyTXen3iQGibrHfSM=; b=olSgDSEvCIiF
 Vx8P2W5BXGnjwr3nG9XKp0HSunTVuas2UyyiepnImmSLcwB8Y9YUCLruwYA7LcrqhFp8TrovWvYRJ
 spf+H6wC5lyf32jAZFQ+tACSlrztFzMzB94hSwhDm9XsiIK8PiIneVZpvIAs24tgjzZUMZvagqz4T
 52nCHTshCnbsK27r8WiP7Ox0hqEaB4JbUziuwWGNJmvhXirZTO8OR7ODf95gZwGQv+VWww89giMkl
 K25elBLi00BWr7SUTmIEeHpfqgZjxVCcNNwDcbJgrWBsUjsJ43E5WbpXWezcpIruw3CD1DMhSA20V
 kozrtAkIIacL09CRCheI7w==;
Date: Tue, 01 Oct 2024 18:00:36 +0300
Message-Id: <86a5fn3m23.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN> (message from
 Dmitry Gutov on Tue, 1 Oct 2024 02:19:17 +0300)
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
References: <87tteaznog.fsf@HIDDEN>
 <edab570c-b2fa-4162-9383-df5c8aaff251@HIDDEN>
 <8734lrrj4e.fsf@HIDDEN>
 <ea10f340-9b46-4199-93fc-274c5e81ace4@HIDDEN>
 <87o74c1ce1.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN> <86ttdy50ja.fsf@HIDDEN>
 <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN>
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 73484
Cc: 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -3.3 (---)

> Date: Tue, 1 Oct 2024 02:19:17 +0300
> Cc: spwhitton@HIDDEN, 73484 <at> debbugs.gnu.org
> From: Dmitry Gutov <dmitry@HIDDEN>
> 
> On 29/09/2024 11:25, Eli Zaretskii wrote:
> > I understand that we need to disable the Fortran and C fallbacks to
> > avoid false positives, but what do we want to do if the fallbacks are
> > disabled and no suitable language parser is found using the file name?
> > Just skip the file and do nothing? emit a warning? something else?
> 
> Just do nothing. We'd really want to delegate language detection to 
> etags rather than doing it inside Elisp - the latter is slower and 
> ultimately more limited. But for that etags needs to have a reliable 
> detection logic, one without too many false positives (and IME false 
> positives here are worse than false negatives, because scanning too much 
> can often mean both wrong tags and long scans, and a completion table 
> that gets too large because of bogus tags).

I'm not sure I understand: if you worry about performance, then
disabling fallbacks will not eliminate all of the cases where etags
scans the entire file or at least some of its portions.

Can you explain to me again what exactly is the problem with the
fallbacks in the context of etags-regen?




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 30 Sep 2024 23:20:00 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Mon Sep 30 19:20:00 2024
Received: from localhost ([127.0.0.1]:47707 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1svPgO-00047m-Iu
	for submit <at> debbugs.gnu.org; Mon, 30 Sep 2024 19:20:00 -0400
Received: from fout-a3-smtp.messagingengine.com ([103.168.172.146]:35451)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <dmitry@HIDDEN>) id 1svPgM-00047e-Qs
 for 73484 <at> debbugs.gnu.org; Mon, 30 Sep 2024 19:19:59 -0400
Received: from phl-compute-02.internal (phl-compute-02.phl.internal
 [10.202.2.42])
 by mailfout.phl.internal (Postfix) with ESMTP id 0AFD01380A84;
 Mon, 30 Sep 2024 19:19:20 -0400 (EDT)
Received: from phl-mailfrontend-01 ([10.202.2.162])
 by phl-compute-02.internal (MEProxy); Mon, 30 Sep 2024 19:19:20 -0400
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gutov.dev; h=cc
 :cc:content-transfer-encoding:content-type:content-type:date
 :date:from:from:in-reply-to:in-reply-to:message-id:mime-version
 :references:reply-to:subject:subject:to:to; s=fm1; t=1727738360;
 x=1727824760; bh=o5I3DCXNilKMYIc2ls4kUUhjqJ9mEBepaNbXpZ2sPII=; b=
 F34xK9CF9mPkQGrRVxzWmum/JYGgM690c21H8iiPtUFGRfD7puZGkZWf2q3ZUekk
 7RQ9694/yqmOlOI+pxTsWfmX7XgLUDHYkzkg9kzYQUQb0CW4Un59tRj8ToaOlEzD
 gz52eOmuKOmEWUpX+eMtuiPtr7E7rNYNtg7aMl75eP9cVIfWs1hAeSpdxzdxdxvx
 JRQe885PCMyrCVg5G5j1++DlJOzfErOYUeKIgy2q+CmNbabQAMFDc1OxavojIemw
 3VdKoAVokegYtEW33OQZ2hdTdInmLTk+dWaYoAorPjedPuUAXSgVmzqopJuTlSaq
 rQD5JTZpdzXYrY7T9BHq0Q==
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=
 messagingengine.com; h=cc:cc:content-transfer-encoding
 :content-type:content-type:date:date:feedback-id:feedback-id
 :from:from:in-reply-to:in-reply-to:message-id:mime-version
 :references:reply-to:subject:subject:to:to:x-me-proxy:x-me-proxy
 :x-me-sender:x-me-sender:x-sasl-enc; s=fm2; t=1727738360; x=
 1727824760; bh=o5I3DCXNilKMYIc2ls4kUUhjqJ9mEBepaNbXpZ2sPII=; b=N
 QmAmRUfyMejczQ+gZniXjhIwEGJeWXsbff33OOVcvl4zxmJizW1Z7E7IIBZ4ehp+
 ItdFMvrDIbd4cBY2MpkUIeGb0w9b0qyU0aRS+0InyCjV3cvRpH9vJ23UY0lBM+5o
 5QyGIf6Mq9JfDjaNP/9vMMsJN3Th4S5AEs3ACGDRTwMIjGAw/EBLeUOJ7e0CoKPY
 UftDze5rEFhY97FAnmOyzdB77A0yfrxt/ropcfSjtc7cEoX7uyWnhhFqeCcvGpAS
 866VMUJusl80XoU4xINMPabSqhbly45GJhPVrAOak/dK7HX3eg7sk0K0A9LMgHA5
 U+bT/Af7KrXBX6AKq1GIA==
X-ME-Sender: <xms:9zH7Zhn7etm7Q2dJ7GEC4VsbIOPl9hYa7HdEZwxuj0Qk2uvcGwb_hQ>
 <xme:9zH7Zs1794eR9r6431rRUZxGSxGHBuEU4wcBHBBtdyi0VuJoz_sbz290qUprTb18d
 udcUcDR9p3wyWEs2-8>
X-ME-Received: <xmr:9zH7Zno89huO8HGuDGGgkWLTe2AoFAznVGUSpW6cM7XV9VEvjAZXu_HcPC7DnJaGdWzG>
X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeeftddrvdduiedgvddtucetufdoteggodetrfdotf
 fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdggtfgfnhhsuhgsshgtrhhisggvpdfu
 rfetoffkrfgpnffqhgenuceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnh
 htshculddquddttddmnecujfgurhepkfffgggfuffvvehfhfgjtgfgsehtjeertddtvdej
 necuhfhrohhmpeffmhhithhrhicuifhuthhovhcuoegumhhithhrhiesghhuthhovhdrug
 gvvheqnecuggftrfgrthhtvghrnhepteduleejgeehtefgheegjeekueehvdevieekueef
 tddvtdevfefhvdevgedujeehnecuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpe
 hmrghilhhfrhhomhepughmihhtrhihsehguhhtohhvrdguvghvpdhnsggprhgtphhtthho
 peefpdhmohguvgepshhmthhpohhuthdprhgtphhtthhopegvlhhiiiesghhnuhdrohhrgh
 dprhgtphhtthhopehsphifhhhithhtohhnsehsphifhhhithhtohhnrdhnrghmvgdprhgt
 phhtthhopeejfeegkeegseguvggssghughhsrdhgnhhurdhorhhg
X-ME-Proxy: <xmx:9zH7ZhnIiPNUu1C2Of1gqQP1emVfpdl4f04IYjSPJa9o0JkwUKsMig>
 <xmx:9zH7Zv1sU_ZL3r7FCKBfTOTFVLrWubyplGfLNiKdq36TGnEMVI8rdQ>
 <xmx:9zH7ZgtSkzUtiWybSVrS3oHpRgereZOXw0YzeSgDk6JxMC5-pLmQRA>
 <xmx:9zH7ZjViMcp2cK0wKtOouCBttPXrv_jI4tnnM5uTspzwoeCkdKLCng>
 <xmx:-DH7ZgzOMf4lK1oPc1SyYZdKwoDE2TawgwIRSXvMJb_xWcWYn6p68B2n>
Feedback-ID: i07de48aa:Fastmail
Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon,
 30 Sep 2024 19:19:18 -0400 (EDT)
Message-ID: <75fe4289-da41-454d-ba92-22a92ea7002f@HIDDEN>
Date: Tue, 1 Oct 2024 02:19:17 +0300
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
To: Eli Zaretskii <eliz@HIDDEN>
References: <87tteaznog.fsf@HIDDEN>
 <edab570c-b2fa-4162-9383-df5c8aaff251@HIDDEN>
 <8734lrrj4e.fsf@HIDDEN>
 <ea10f340-9b46-4199-93fc-274c5e81ace4@HIDDEN>
 <87o74c1ce1.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN> <86ttdy50ja.fsf@HIDDEN>
Content-Language: en-US
From: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <86ttdy50ja.fsf@HIDDEN>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Score: -0.7 (/)
X-Debbugs-Envelope-To: 73484
Cc: 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -1.7 (-)

On 29/09/2024 11:25, Eli Zaretskii wrote:
> I understand that we need to disable the Fortran and C fallbacks to
> avoid false positives, but what do we want to do if the fallbacks are
> disabled and no suitable language parser is found using the file name?
> Just skip the file and do nothing? emit a warning? something else?

Just do nothing. We'd really want to delegate language detection to 
etags rather than doing it inside Elisp - the latter is slower and 
ultimately more limited. But for that etags needs to have a reliable 
detection logic, one without too many false positives (and IME false 
positives here are worse than false negatives, because scanning too much 
can often mean both wrong tags and long scans, and a completion table 
that gets too large because of bogus tags).

For shebangs in particular, however, see Francesco's very good 
explanation. And detecting shebangs in Lisp would not be practical -- 
too slow.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 29 Sep 2024 17:16:40 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Sun Sep 29 13:16:40 2024
Received: from localhost ([127.0.0.1]:41321 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1suxXE-0007GS-8x
	for submit <at> debbugs.gnu.org; Sun, 29 Sep 2024 13:16:40 -0400
Received: from eggs.gnu.org ([209.51.188.92]:48278)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <pot@HIDDEN>) id 1suxXC-0007GD-M7
 for 73484 <at> debbugs.gnu.org; Sun, 29 Sep 2024 13:16:39 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <pot@HIDDEN>)
 id 1suxWa-0003nm-Vp; Sun, 29 Sep 2024 13:16:00 -0400
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org;
 s=fencepost-gnu-org; h=MIME-Version:References:Subject:In-Reply-To:To:Date:
 From; bh=ppPWi3uiZI/ZmyBfN4GmtHeDRgwjTZIwuAJhmFV1yyg=; b=TiLXiK8hLtwQcneaiUX5
 Cm5ufFKQJ4BRnjftiSGeZVh8WpzdENIhvhCBL8JPehCaQewdpCRnB06QnKZcKA+YuOook/YsPLEAB
 /HDcpc3CMlk4edjqgDJ9WbFE6rz4/XhSQJra24wasoDPlpZPuaTtlIH39D7zSUWTmt/rAhemG3iCg
 0VBDzSiuPR2W3r7PhS3iDLDnsuhncekposbaGLgZBpldhkCAcuNW9swZM8ap3lsJROl1XZgvIJF6v
 dco0phcIGUOwI/5O4Hkqwkw/pVDuTQiFaW7rgBmCCskCBwvhGwGwlxkkVtiwzlIdpydLicraQBS4n
 +6lpZFUT+ndHIw==;
Message-Id: <87ttdyfkj6.fsf@HIDDEN>
From: =?utf-8?Q?Francesco_Potort=C3=AC?= <pot@HIDDEN>
Date: Sun, 29 Sep 2024 19:15:57 +0200
To: Eli Zaretskii <eliz@HIDDEN>
In-Reply-To: <86o7464tkc.fsf@HIDDEN> (eliz@HIDDEN)
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
References: <87tteaznog.fsf@HIDDEN>
 <edab570c-b2fa-4162-9383-df5c8aaff251@HIDDEN>
 <8734lrrj4e.fsf@HIDDEN>
 <ea10f340-9b46-4199-93fc-274c5e81ace4@HIDDEN>
 <87o74c1ce1.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN> <86ttdy50ja.fsf@HIDDEN>
 <86o7464tkc.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain
Organization: The GNU project
X-fingerprint: 4B02 6187 5C03 D6B1 2E31  7666 09DF 2DC9 BE21 6115
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 73484
Cc: dmitry@HIDDEN, 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -1.0 (-)

Eli Zaretskii:
>> I understand that we need to disable the Fortran and C fallbacks to
>> avoid false positives, but what do we want to do if the fallbacks are
>> disabled and no suitable language parser is found using the file name?
>> Just skip the file and do nothing? emit a warning? something else?

Eli Zaretskii:
>Wait a minute... we already have "--language=none", which means only
>do regexp processing, if any.  If no regexps were specified, 'none'
>produces a single entry for a file, stating just its name, like this:
>
>  ^L
>  foo,0
>
>where ^L is a literal \f character.  Is the intent here to prevent
>even that from being written to TAGS?  If not, then we don't need any
>new command-line option; instead, etags-regen could simply pass the
>"--language=none" option before each file with no extension, and be
>done, no?
>
>Or maybe this is "the missing link" between this and the shebang
>processing?

If you set language=none for files whose extension is unknown to Etags, then you give up on shebang processing.  If you do not set language=none and Etags does not recognise any shebang, it defaults to Fortran.  If it does not find any Fortran tags, it defaults to C/C++.  When default processing happens on a file which is neither Fortran nor C/C++, it usually generates no tags, but may occasionally generate fake tags.

AFAIU, the problem is that there are use cases when you have to feed Etags with files that should generate no tags, yet the occasional fake tags are not tolerable.
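
The detection order described above can be sketched roughly as follows. This is an illustrative Python model only, not the etags C source: the function names, the heavily abbreviated extension and interpreter tables, and the `no_fallbacks` parameter (standing in for the option under discussion in this thread) are all assumptions made for the example.

```python
import os

# Hypothetical, heavily abbreviated tables; the real program knows many more.
EXTENSION_LANGUAGES = {".c": "c", ".el": "lisp", ".py": "python", ".f": "fortran"}
INTERPRETER_LANGUAGES = {"perl": "perl", "python": "python", "ruby": "ruby"}


def language_from_shebang(first_line):
    """Return a language for a '#!' line, or None if it is not recognised."""
    if not first_line.startswith("#!"):
        return None
    words = first_line[2:].split()
    if not words:
        return None
    return INTERPRETER_LANGUAGES.get(os.path.basename(words[0]))


def parsers_to_try(file_name, first_line, no_fallbacks=False):
    """List the parsers to run on a file, in order."""
    ext = os.path.splitext(file_name)[1]
    if ext in EXTENSION_LANGUAGES:
        return [EXTENSION_LANGUAGES[ext]]
    lang = language_from_shebang(first_line)
    if lang is not None:
        return [lang]
    if no_fallbacks:
        # Assumption: regexp-only processing, as with --language=none.
        return ["none"]
    # Default processing: Fortran first, then C/C++ if no Fortran tags appear.
    return ["fortran", "c"]


print(parsers_to_try("script", "#!/usr/bin/perl -w"))
print(parsers_to_try("README", "Some plain text"))
print(parsers_to_try("README", "Some plain text", no_fallbacks=True))
```

The last two calls show the case at issue: a file that is neither Fortran nor C/C++ still goes through both default parsers unless the fallbacks are switched off, and forcing `none` up front instead would skip the shebang check entirely.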




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs. Full text available.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 29 Sep 2024 10:57:05 +0000
Date: Sun, 29 Sep 2024 13:56:19 +0300
Message-Id: <86o7464tkc.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: dmitry@HIDDEN
In-Reply-To: <86ttdy50ja.fsf@HIDDEN> (message from Eli Zaretskii on Sun, 29
 Sep 2024 11:25:45 +0300)
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
References: <87tteaznog.fsf@HIDDEN>
 <edab570c-b2fa-4162-9383-df5c8aaff251@HIDDEN>
 <8734lrrj4e.fsf@HIDDEN>
 <ea10f340-9b46-4199-93fc-274c5e81ace4@HIDDEN>
 <87o74c1ce1.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN> <86ttdy50ja.fsf@HIDDEN>
Cc: 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN

> Cc: 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN
> Date: Sun, 29 Sep 2024 11:25:45 +0300
> From: Eli Zaretskii <eliz@HIDDEN>
> 
> I understand that we need to disable the Fortran and C fallbacks to
> avoid false positives, but what do we want to do if the fallbacks are
> disabled and no suitable language parser is found using the file name?
> Just skip the file and do nothing? emit a warning? something else?

Wait a minute... we already have "--language=none", which means only
do regexp processing, if any.  If no regexps were specified, 'none'
produces a single entry for a file, stating just its name, like this:

  ^L
  foo,0

where ^L is a literal \f character.  Is the intent here to prevent
even that from being written to TAGS?  If not, then we don't need any
new command-line option; instead, etags-regen could simply pass the
"--language=none" option before each file with no extension, and be
done, no?
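
A rough sketch of that idea follows, assuming the caller builds the etags command line itself. The helper names are hypothetical. `--language=none` and `--language=auto` are existing etags options that apply to the files following them. The extension check is deliberately naive: it ignores files that etags recognises by their full name, such as Makefile.

```python
import os


def etags_args(files):
    """Build an etags argument list, switching to --language=none
    for files that have no extension and back to auto otherwise."""
    args = ["etags"]
    current = "auto"  # etags starts out guessing the language
    for name in files:
        wanted = "auto" if os.path.splitext(name)[1] else "none"
        if wanted != current:
            args.append(f"--language={wanted}")
            current = wanted
        args.append(name)
    return args


def empty_tags_section(name):
    """The TAGS section described above for a file with no tags:
    a form feed line, then the file name and a tag-data size of 0."""
    return f"\x0c\n{name},0\n"
```

The option is emitted only when the mode changes, which has the same effect as repeating it before every extensionless file.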

Or maybe this is "the missing link" between this and the shebang
processing?




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 29 Sep 2024 08:28:39 +0000
Date: Sun, 29 Sep 2024 11:25:45 +0300
Message-Id: <86ttdy50ja.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN> (message from
 Dmitry Gutov on Thu, 26 Sep 2024 01:30:55 +0300)
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
References: <87tteaznog.fsf@HIDDEN>
 <edab570c-b2fa-4162-9383-df5c8aaff251@HIDDEN>
 <8734lrrj4e.fsf@HIDDEN>
 <ea10f340-9b46-4199-93fc-274c5e81ace4@HIDDEN>
 <87o74c1ce1.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN>
Cc: 73484 <at> debbugs.gnu.org, spwhitton@HIDDEN

> Date: Thu, 26 Sep 2024 01:30:55 +0300
> From: Dmitry Gutov <dmitry@HIDDEN>
> 
> > We want to replace etags-regen-file-extensions with enabling etags's
> > hashbang detection support.  That requires disabling its Fortran
> > fallback.
> 
> Thanks, a fuller plan would look something like this:
> 
> - Implement the --no-fortran-fallback flag in etags. Or an environment 
> variable, or etc. Use it conditionally in etags-regen-mode.
> - Revisit the default lists of extensions that etags recognizes, keeping 
> in mind the recent thread where we talked about this - e.g. *.a seems out 
> of place for ASM (someone more familiar with assembly dialects, please 
> feel free to correct me).
> - Add new possible value t to etags-regen-file-extensions, and switch 
> the default to it.

I understand that we need to disable the Fortran and C fallbacks to
avoid false positives, but what do we want to do if the fallbacks are
disabled and no suitable language parser is found using the file name?
Just skip the file and do nothing? emit a warning? something else?

I also don't understand why enabling etags' shebang detection
requires disabling the Fortran and C fallbacks: etags looks for a
shebang _before_ it falls back to Fortran and C, so what am I missing?




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 26 Sep 2024 12:18:57 +0000
Message-ID: <a0b9b824-2ad1-4378-a627-7f952049a470@HIDDEN>
Date: Thu, 26 Sep 2024 15:18:17 +0300
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
To: Francesco Potortì <pot@HIDDEN>
References: <87tteaznog.fsf@HIDDEN>
 <edab570c-b2fa-4162-9383-df5c8aaff251@HIDDEN>
 <8734lrrj4e.fsf@HIDDEN>
 <ea10f340-9b46-4199-93fc-274c5e81ace4@HIDDEN>
 <87o74c1ce1.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN>
 <87setmkgh6.fsf@HIDDEN>
Content-Language: en-US
From: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <87setmkgh6.fsf@HIDDEN>
Cc: 73484 <at> debbugs.gnu.org, Sean Whitton <spwhitton@HIDDEN>

On 26/09/2024 10:43, Francesco Potortì wrote:
>> - Implement the --no-fortran-fallback flag in etags. Or an environment
>> variable, or etc. Use it conditionally in etags-regen-mode.
> If your purpose is to avoid Etags creating false tags on files whose language it cannot detect, you need to disable all fallbacks, rather than just Fortran.

Yeah, sorry, I guess the next fallback is C?

We'll want to disable both, so the flag would be --no-fallbacks, I guess.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 26 Sep 2024 08:07:09 +0000
Message-Id: <87setmkgh6.fsf@HIDDEN>
From: Francesco Potortì <pot@HIDDEN>
Date: Thu, 26 Sep 2024 09:43:17 +0200
To: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN>
 (dmitry@HIDDEN)
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
References: <87tteaznog.fsf@HIDDEN>
 <edab570c-b2fa-4162-9383-df5c8aaff251@HIDDEN>
 <8734lrrj4e.fsf@HIDDEN>
 <ea10f340-9b46-4199-93fc-274c5e81ace4@HIDDEN>
 <87o74c1ce1.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
 <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN>
Organization: The GNU project
Cc: 73484 <at> debbugs.gnu.org, Sean Whitton <spwhitton@HIDDEN>

>- Implement the --no-fortran-fallback flag in etags. Or an environment
>variable, or etc. Use it conditionally in etags-regen-mode.

If your purpose is to avoid Etags creating false tags on files whose language it cannot detect, you need to disable all fallbacks, rather than just Fortran.

Sorry if I got lost and missed something.

-- 
fp




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs.

Message received at 73484 <at> debbugs.gnu.org:


Received: (at 73484) by debbugs.gnu.org; 25 Sep 2024 22:31:37 +0000
Message-ID: <37e4b3cd-6363-4f55-9921-92a1182679dc@HIDDEN>
Date: Thu, 26 Sep 2024 01:30:55 +0300
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: bug#73484: 31.0.50; Abolishing etags-regen-file-extensions
To: Sean Whitton <spwhitton@HIDDEN>, 73484 <at> debbugs.gnu.org
References: <87tteaznog.fsf@HIDDEN>
 <edab570c-b2fa-4162-9383-df5c8aaff251@HIDDEN>
 <8734lrrj4e.fsf@HIDDEN>
 <ea10f340-9b46-4199-93fc-274c5e81ace4@HIDDEN>
 <87o74c1ce1.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
 <87jzezzg87.fsf_-_@HIDDEN>
Content-Language: en-US
From: Dmitry Gutov <dmitry@HIDDEN>
In-Reply-To: <87jzezzg87.fsf_-_@HIDDEN>

Hi!

On 25/09/2024 22:27, Sean Whitton wrote:

> On Wed 25 Sep 2024 at 02:41pm +03, Dmitry Gutov wrote:
> 
>> On 25/09/2024 09:21, Sean Whitton wrote:
>>>> We would probably also discuss etags' auto-detection and its list of default
>>>> extensions, during the next release's development.
>>> Okay, cool!  Should we have a bug to track this?
> 
> We want to replace etags-regen-file-extensions with enabling etags's
> hashbang detection support.  That requires disabling its Fortran
> fallback.

Thanks, a fuller plan would look something like this:

- Implement the --no-fortran-fallback flag in etags. Or an environment 
variable, or etc. Use it conditionally in etags-regen-mode.
- Revisit the default lists of extensions that etags recognizes, keeping 
in mind the recent thread where we talked about this - e.g. *.a seems out 
of place for ASM (someone more familiar with assembly dialects, please 
feel free to correct me).
- Add new possible value t to etags-regen-file-extensions, and switch 
the default to it.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs.

Message received at submit <at> debbugs.gnu.org:


Received: (at submit) by debbugs.gnu.org; 25 Sep 2024 19:40:15 +0000
From: Sean Whitton <spwhitton@HIDDEN>
To: bug-gnu-emacs@HIDDEN
Subject: 31.0.50; Abolishing etags-regen-file-extensions
In-Reply-To: <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN> (Dmitry Gutov's
 message of "Wed, 25 Sep 2024 14:41:37 +0300")
References: <87tteaznog.fsf@HIDDEN>
 <edab570c-b2fa-4162-9383-df5c8aaff251@HIDDEN>
 <8734lrrj4e.fsf@HIDDEN>
 <ea10f340-9b46-4199-93fc-274c5e81ace4@HIDDEN>
 <87o74c1ce1.fsf@HIDDEN>
 <b8001a72-8fc9-4e4e-a2d7-5da94a92f250@HIDDEN>
Date: Wed, 25 Sep 2024 20:27:20 +0100
Message-ID: <87jzezzg87.fsf_-_@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13)

Hello,

On Wed 25 Sep 2024 at 02:41pm +03, Dmitry Gutov wrote:

> On 25/09/2024 09:21, Sean Whitton wrote:
>>> We would probably also discuss etags' auto-detection and its list of default
>>> extensions, during the next release's development.
>> Okay, cool!  Should we have a bug to track this?

We want to replace etags-regen-file-extensions with enabling etags's
hashbang detection support.  That requires disabling its Fortran
fallback.

-- 
Sean Whitton




Acknowledgement sent to Sean Whitton <spwhitton@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs@HIDDEN.
Report forwarded to bug-gnu-emacs@HIDDEN:
bug#73484; Package emacs.
Last modified: Sun, 12 Jan 2025 05:45:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.