GNU bug report logs - #20365
24.5; all-completions returns duplicates for Info-read-node-name-1

Previous Next

Package: emacs;

Reported by: Oleh Krehel <ohwoeowho <at> gmail.com>

Date: Sat, 18 Apr 2015 16:17:02 UTC

Severity: minor

Found in version 24.5

Fixed in version 29.1

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 20365 in the body.
You can then email your comments to 20365 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox

Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#20365; Package emacs. (Sat, 18 Apr 2015 16:17:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Oleh Krehel <ohwoeowho <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sat, 18 Apr 2015 16:17:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Oleh Krehel <ohwoeowho <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 24.5; all-completions returns duplicates for Info-read-node-name-1
Date: Sat, 18 Apr 2015 18:11:14 +0200

To reproduce, eval this in the top *info* directory:

(setq x (all-completions "(" 'Info-read-node-name-1))

x will contain many duplicates for each node, like "org" "org" "org"
"org" "org.info.gz" "org" "org.info.gz" "org" "org.info.gz".

Finally, if one of the elements of `all-completions' is passed, it still
doesn't work. I'm guessing that it expects "(org)" instead of "org", but
then why not offer these on the completion list?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20365; Package emacs. (Sat, 18 Apr 2015 17:42:02 GMT) Full text and rfc822 format available.

Message #8 received at 20365 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> IRO.UMontreal.CA>
To: Oleh Krehel <ohwoeowho <at> gmail.com>
Cc: 20365 <at> debbugs.gnu.org
Subject: Re: bug#20365: 24.5;
 all-completions returns duplicates for Info-read-node-name-1
Date: Sat, 18 Apr 2015 13:41:17 -0400

Just a first comment: it's not considered incorrect for
`all-completions' to return a list with duplicate entries in it.
More specifically, it's considered the completion UI's job to remove
those duplicates.

>   (setq x (all-completions "(" 'Info-read-node-name-1))
> x will contain many duplicates for each node, like "org" "org" "org"
> "org" "org.info.gz" "org" "org.info.gz" "org" "org.info.gz".

Maybe the way these entries are generated could be reviewed to try and
reduce the number of duplicates.  And we could call `delete-dups' on the
result: while a completion-table shouldn't need to go out of its way to
reduce the number of duplicates (since the UI is supposed to handle it
anyway), it's probably good to avoid having such expected large number of
duplicates, indeed.

> Finally, if one of the elements of `all-completions' is passed, it still
> doesn't work.

What does "is passed" mean here?  What does "doesn't work" mean here?

> I'm guessing that it expects "(org)" instead of "org", but
> then why not offer these on the completion list?

What is "it"?


        Stefan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20365; Package emacs. (Sat, 18 Apr 2015 17:50:01 GMT) Full text and rfc822 format available.

Message #11 received at 20365 <at> debbugs.gnu.org (full text, mbox):

From: Oleh Krehel <ohwoeowho <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 20365 <at> debbugs.gnu.org
Subject: Re: bug#20365: 24.5;
 all-completions returns duplicates for Info-read-node-name-1
Date: Sat, 18 Apr 2015 19:49:44 +0200

On Sat, Apr 18, 2015 at 7:41 PM, Stefan Monnier
<monnier <at> iro.umontreal.ca> wrote:
> Just a first comment: it's not considered incorrect for
> `all-completions' to return a list with duplicate entries in it.
> More specifically, it's considered the completion UI's job to remove
> those duplicates.
>
>>   (setq x (all-completions "(" 'Info-read-node-name-1))
>> x will contain many duplicates for each node, like "org" "org" "org"
>> "org" "org.info.gz" "org" "org.info.gz" "org" "org.info.gz".
>
> Maybe the way these entries are generated could be reviewed to try and
> reduce the number of duplicates.  And we could call `delete-dups' on the
> result: while a completion-table shouldn't need to go out of its way to
> reduce the number of duplicates (since the UI is supposed to handle it
> anyway), it's probably good to avoid having such expected large number of
> duplicates, indeed.
>
>> Finally, if one of the elements of `all-completions' is passed, it still
>> doesn't work.
>
> What does "is passed" mean here?  What does "doesn't work" mean here?
>
>> I'm guessing that it expects "(org)" instead of "org", but
>> then why not offer these on the completion list?
>
> What is "it"?

What I think is happening there is that Info calls
`completing-read-function' and ultimately gives it (through
all-completions) a list like '("org" "emacs"...).  And then it expects
the answer to be in the form "(org)" or "(emacs)". Which I think is
strange. I've pushed recently this hack to `ivy-read', which makes
it work for Info:

((eq collection 'Info-read-node-name-1)
 (if (equal Info-current-file "dir")
     (setq collection
           (mapcar (lambda (x) (format "(%s)" x))
                   (cl-delete-duplicates
                    (all-completions "(" collection predicate)
                    :test 'equal)))
   (setq collection (all-completions "" collection predicate))))

This is quite ugly to go to such lengths to deal with just one
completion case.

Oleh

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20365; Package emacs. (Sat, 18 Apr 2015 23:41:02 GMT) Full text and rfc822 format available.

Message #14 received at 20365 <at> debbugs.gnu.org (full text, mbox):

From: Drew Adams <drew.adams <at> oracle.com>
To: Stefan Monnier <monnier <at> IRO.UMontreal.CA>, Oleh Krehel
 <ohwoeowho <at> gmail.com>
Cc: 20365 <at> debbugs.gnu.org
Subject: RE: bug#20365: 24.5; all-completions returns duplicates for
 Info-read-node-name-1
Date: Sat, 18 Apr 2015 16:40:03 -0700 (PDT)

> Just a first comment: it's not considered incorrect for
> `all-completions' to return a list with duplicate entries in it.
> More specifically, it's considered the completion UI's job to remove
> those duplicates.

Yes, absolutely.

> > (setq x (all-completions "(" 'Info-read-node-name-1))
> > x will contain many duplicates for each node, like "org" "org"
> > "org" "org" "org.info.gz" "org" "org.info.gz" "org" "org.info.gz".
> 
> Maybe the way these entries are generated could be reviewed to try
> and reduce the number of duplicates.  And we could call `delete-dups' on
> the result: while a completion-table shouldn't need to go out of its way
> to reduce the number of duplicates (since the UI is supposed to handle
> it anyway), it's probably good to avoid having such expected large
> number of duplicates, indeed.

Just to be sure -

I hope you mean to do this only in `Info-read-node-name-1' or
in the code that calls `completing-read' (or in a "completion UI").
I hope you don't mean to do any of that in `all-completions'.
I'm guessing that this is what you mean.

If my guess is correct, then you need not read any further,
and thanks for confirming the guess.

---

In case not, let me disagree that such things should be done in
`all-completions' and similar functions or in `completing-read'.
They should not delete duplicates.

In my context, for instance, the completion candidates are often
alist elements where the cdrs matter.  Or they are strings that
are `string=' but have different properties.  The cdrs or the
string properties matter to the code that calls `all-completions'
or `completing-read'.  The calling code reconstructs the alist
element from the chosen candidate(s) (it provides access to the
associated cdr information).

It is important in such contexts to allow "duplicates", as
defined from the point of view of comparing only the cars with
`string=' (which is what `all-completions' does).  Just
because two completion candidate strings are equal does not
mean that the alist elements of which they are the cars are
equal.

It should be entirely up to the caller to delete duplicates.
It is trivial for it to do so, and logical that this be the
place where that is done.  Only the caller (or the "completion
UI") knows whether duplicates make sense in the current context.

`all-completions' & compagnie are building blocks (and coded
in C, no less, so not tweakable in Lisp).  It is not their
job to either (a) filter out duplicates or (b) try not to
produce duplicates.  (a) is the job of their callers, which
are the consumers of the candidates.  (b) is the job of the
COLLECTION function or other producer of the candidates.

So please try to tackle any problem of duplicates in this
particular use case by fiddling with `Info-read-node-name-1',
and not by introducing a change in `all-completions' etc.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20365; Package emacs. (Sun, 19 Apr 2015 01:51:02 GMT) Full text and rfc822 format available.

Message #17 received at 20365 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> IRO.UMontreal.CA>
To: Drew Adams <drew.adams <at> oracle.com>
Cc: Oleh Krehel <ohwoeowho <at> gmail.com>, 20365 <at> debbugs.gnu.org
Subject: Re: bug#20365: 24.5;
 all-completions returns duplicates for Info-read-node-name-1
Date: Sat, 18 Apr 2015 21:50:44 -0400

> I hope you mean to do this only in `Info-read-node-name-1' or
> in the code that calls `completing-read' (or in a "completion UI").
> I hope you don't mean to do any of that in `all-completions'.
> I'm guessing that this is what you mean.

Yes.


        Stefan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20365; Package emacs. (Sun, 19 Apr 2015 02:05:01 GMT) Full text and rfc822 format available.

Message #20 received at 20365 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> IRO.UMontreal.CA>
To: Oleh Krehel <ohwoeowho <at> gmail.com>
Cc: 20365 <at> debbugs.gnu.org
Subject: Re: bug#20365: 24.5;
 all-completions returns duplicates for Info-read-node-name-1
Date: Sat, 18 Apr 2015 22:04:54 -0400

> What I think is happening there is that Info calls
> `completing-read-function' and ultimately gives it (through
> all-completions) a list like '("org" "emacs"...).  And then it expects
> the answer to be in the form "(org)" or "(emacs)".  Which I think is
> strange.

IIUC the problem is as follows: the completion-table used by "g" in
Info-mode lets you complete node names in the current file or lets you
complete file names (which are placed within parens), but the completion
table itself does not add those parens [ A long standing missing feature
here is that it doesn't know how to complete a node name after the
"(<file>)", even though it's a very much valid input to provide.  ]

Your ivy-mode generally wants to have a list of strings as candidates
and wants those strings to be valid return values, whereas the way
completion tables are defined there is no guarantee that you can get
that kind of info from the completion table.

We could change the Info-mode completion table so as to include the
closing paren in the output of `all-completions' (and probably include
the opening paren as well, in that case).  Note also that on my system
"g (ema TAB" includes things like "emacs23/" where "(emacs23/)" is not
a valid element (since emacs-23 is a directorym which is complete in
steps, just as in C-x C-f).


        Stefan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20365; Package emacs. (Sun, 19 Apr 2015 11:54:02 GMT) Full text and rfc822 format available.

Message #23 received at 20365 <at> debbugs.gnu.org (full text, mbox):

From: Oleh Krehel <ohwoeowho <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 20365 <at> debbugs.gnu.org
Subject: Re: bug#20365: 24.5;
 all-completions returns duplicates for Info-read-node-name-1
Date: Sun, 19 Apr 2015 13:53:14 +0200

> We could change the Info-mode completion table so as to include the
> closing paren in the output of `all-completions' (and probably include
> the opening paren as well, in that case).  Note also that on my system
> "g (ema TAB" includes things like "emacs23/" where "(emacs23/)" is not
> a valid element (since emacs-23 is a directorym which is complete in
> steps, just as in C-x C-f).

That would be fine. It just makes sense for any element of what
`all-completions' returns to be a valid answer.

About the duplicate entries, I think it should be the responsibility
of the caller to remove the duplicates.  Here's my line of thought: a
completion function is expected to have an O(N) complexity, where N is
the amount of candidates. Removing duplicates is O(N^2) at worst, and
O(NlogN) at best.  So the completion function should not attempt to
remove the duplicates. It's doesn't affect the performance when I do
it for 1000 candidates, but when it's 20k (`describe-function') it can
have an impact.

The point is that the collection passed to the completion function can
be very large, and all but O(N) algorithms should be avoided.  On the
other hand, the caller knows exactly the type of data that it's
passing and may be able to remove the duplicates in an efficient way.

Oleh

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20365; Package emacs. (Sun, 19 Apr 2015 14:45:03 GMT) Full text and rfc822 format available.

Message #26 received at 20365 <at> debbugs.gnu.org (full text, mbox):

From: Drew Adams <drew.adams <at> oracle.com>
To: Oleh Krehel <ohwoeowho <at> gmail.com>, Stefan Monnier
 <monnier <at> iro.umontreal.ca>
Cc: 20365 <at> debbugs.gnu.org
Subject: RE: bug#20365: 24.5; all-completions returns duplicates for
 Info-read-node-name-1
Date: Sun, 19 Apr 2015 07:43:43 -0700 (PDT)

> It just makes sense for any element of what
> `all-completions' returns to be a valid answer.

Interesting point of view.  Sounds familiar... ;-)

http://debbugs.gnu.org/cgi/bugreport.cgi?bug=1085

> About the duplicate entries, I think it should be the responsibility
> of the caller to remove the duplicates.  Here's my line of thought:
> a completion function is expected to have an O(N) complexity, where N
> is the amount of candidates. Removing duplicates is O(N^2) at worst,
> and O(NlogN) at best.  So the completion function should not attempt to
> remove the duplicates. It's doesn't affect the performance when I do
> it for 1000 candidates, but when it's 20k (`describe-function') it
> can have an impact.
> 
> The point is that the collection passed to the completion function
> can be very large, and all but O(N) algorithms should be avoided.  On
> the other hand, the caller knows exactly the type of data that it's
> passing and may be able to remove the duplicates in an efficient
> way.

FWIW -

I agree that the calling code should control duplicate removal.
(Definitely, `all-completions' should not do that.)

But it can make sense to allow the calling program to optionally
"reach inside `completing-read'" to do its duplicate removal.

In Icicles, the caller can cause `completing-read' itself to remove
duplicates by binding global variable `icicle-transform-function'.

Because `completing-read' does some processing (e.g. sorting)
after computation of the candidates, this filter promotion into
`completing-read' can save some time.  It can also give users
dynamic control over the candidates to be matched (the completion
domain).

The value of variable `icicle-transform-function' is a function
used to transform the list of completion candidates.  Users can
toggle such transforming on/off using `C-$' during completion.

The most common use, by far, of `icicle-transform-function' is
to bind it to a remove-duplicates function.  Other uses include
switching to a particular subset of candidates, against which
user input is then matched.

For example, using `C-$' with `icicle-apropos-value' toggles
filtering the domain of initial candidates according to the
prefix argument, as follows:

 * no prefix arg: only user options (+ values)
 *           < 0: only commands (+ definitions)
 *           > 0: only faces (+ plists)
 *           = 0: only options (+ values), commands (+ defs),
                  faces (+ plists)

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20365; Package emacs. (Sun, 19 Apr 2015 16:45:01 GMT) Full text and rfc822 format available.

Message #29 received at 20365 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Oleh Krehel <ohwoeowho <at> gmail.com>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 20365 <at> debbugs.gnu.org
Subject: Re: bug#20365: 24.5; all-completions returns duplicates for
 Info-read-node-name-1
Date: Sun, 19 Apr 2015 19:44:04 +0300

On 04/19/2015 02:53 PM, Oleh Krehel wrote:

> About the duplicate entries, I think it should be the responsibility
> of the caller to remove the duplicates.

That could have been a decent argument if we're discussing a new API, 
and not an already widely-used one.

> Here's my line of thought: a
> completion function is expected to have an O(N) complexity, where N is
> the amount of candidates. Removing duplicates is O(N^2) at worst, and
> O(NlogN) at best.

O(NlogN) is closer to the truth:

First, you copy - O(N), then sort - O(NlogN), then call 
`delete-consecutive-dups' (linear time).

> So the completion function should not attempt to
> remove the duplicates. It's doesn't affect the performance when I do
> it for 1000 candidates, but when it's 20k (`describe-function') it can
> have an impact.

Even on a decently-sized collection (38K), this takes only 80ms:

(delete-consecutive-dups (sort (all-completions "" obarray) #'string<))

That's not terrible.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20365; Package emacs. (Sun, 19 Apr 2015 17:01:02 GMT) Full text and rfc822 format available.

Message #32 received at 20365 <at> debbugs.gnu.org (full text, mbox):

From: Oleh Krehel <ohwoeowho <at> gmail.com>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>, 20365 <at> debbugs.gnu.org
Subject: Re: bug#20365: 24.5;
 all-completions returns duplicates for Info-read-node-name-1
Date: Sun, 19 Apr 2015 19:00:48 +0200

On Sun, Apr 19, 2015 at 6:44 PM, Dmitry Gutov <dgutov <at> yandex.ru> wrote:

> That could have been a decent argument if we're discussing a new API, and
> not an already widely-used one.

No problem: no completion package will complain if you don't pass it duplicates.
Then it's just a matter of fixing the offening callers of completion one by one.

>> Here's my line of thought: a
>> completion function is expected to have an O(N) complexity, where N is
>> the amount of candidates. Removing duplicates is O(N^2) at worst, and
>> O(NlogN) at best.
>
>
> O(NlogN) is closer to the truth:
>
> First, you copy - O(N), then sort - O(NlogN), then call
> `delete-consecutive-dups' (linear time).
>
>> So the completion function should not attempt to
>> remove the duplicates. It's doesn't affect the performance when I do
>> it for 1000 candidates, but when it's 20k (`describe-function') it can
>> have an impact.
>
>
> Even on a decently-sized collection (38K), this takes only 80ms:
>
> (delete-consecutive-dups (sort (all-completions "" obarray) #'string<))
>
> That's not terrible.

Actually, 0.08 is a lot when you consider that I would call this after
each key press (in case when the collection is a function, not for the
static list).  The sluggishness would be perceptible even for a
relatively slow typist. And this is only the overhead, there's also
actual computing to be done. There's no reason not to avoid this
overhead cost.

Oleh

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20365; Package emacs. (Sun, 19 Apr 2015 17:13:02 GMT) Full text and rfc822 format available.

Message #35 received at 20365 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Oleh Krehel <ohwoeowho <at> gmail.com>
Cc: 20365 <at> debbugs.gnu.org
Subject: Re: bug#20365: 24.5; all-completions returns duplicates for
 Info-read-node-name-1
Date: Sun, 19 Apr 2015 20:12:40 +0300

On 04/19/2015 08:00 PM, Oleh Krehel wrote:

> No problem: no completion package will complain if you don't pass it duplicates.
> Then it's just a matter of fixing the offening callers of completion one by one.

Good luck with that.

> Actually, 0.08 is a lot when you consider that I would call this after
> each key press (in case when the collection is a function, not for the
> static list).  The sluggishness would be perceptible even for a
> relatively slow typist. And this is only the overhead, there's also
> actual computing to be done. There's no reason not to avoid this
> overhead cost.

It's only 80ms as long as the list is unfiltered. Type one or two chars 
- and the already small delay disappears into nothing.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20365; Package emacs. (Mon, 20 Apr 2015 02:30:04 GMT) Full text and rfc822 format available.

Message #38 received at 20365 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> IRO.UMontreal.CA>
To: Oleh Krehel <ohwoeowho <at> gmail.com>
Cc: 20365 <at> debbugs.gnu.org
Subject: Re: bug#20365: 24.5;
 all-completions returns duplicates for Info-read-node-name-1
Date: Sun, 19 Apr 2015 22:29:00 -0400

> That would be fine. It just makes sense for any element of what
> `all-completions' returns to be a valid answer.

What's a valid answer?

all-completions will happily return "share/" when completing on
"/usr/s", even if you're looking for a .tex file rather than a directory
and even if there's no "share/" in the current directory either.

Yet, we don't want all-completions to return elements of the form
"/home/monnier/private/package/emacs/trunk/lisp/" since that would
result in a lot of redundancy that then needs to be found&removed.

So, no, all-completions just doesn't always return "valid answers".
Instead, it returns chunks of text that can added to some prefix (which
you can find via `completion-boundaries'), the result of which should be
a prefix of a valid answer.

> About the duplicate entries, I think it should be the responsibility
> of the caller to remove the duplicates.

That's the case currently.  The completion-table is called and the
caller is the UI, and currently it's the UI's responsibility to remove
the duplicates.

> Here's my line of thought: a completion function is expected to have
> an O(N) complexity, where N is the amount of candidates.  Removing
> duplicates is O(N^2) at worst, and O(NlogN) at best.

Actually, with a hash-table it's pretty much down to O(N).


        Stefan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20365; Package emacs. (Mon, 20 Apr 2015 08:39:02 GMT) Full text and rfc822 format available.

Message #41 received at 20365 <at> debbugs.gnu.org (full text, mbox):

From: Oleh Krehel <ohwoeowho <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 20365 <at> debbugs.gnu.org
Subject: Re: bug#20365: 24.5;
 all-completions returns duplicates for Info-read-node-name-1
Date: Mon, 20 Apr 2015 10:38:14 +0200

On Mon, Apr 20, 2015 at 4:29 AM, Stefan Monnier
<monnier <at> iro.umontreal.ca> wrote:
>> That would be fine. It just makes sense for any element of what
>> `all-completions' returns to be a valid answer.
>
> What's a valid answer?

A valid answer is the one that doesn't lead to an error if
`completing-read-function' returns it.
Also, in my experience, having less callbacks leads to easier debugging.
I've looked at the callbacks in `incomplete-mode': the completion
function gets called like 3 times for a single input.

> all-completions will happily return "share/" when completing on
> "/usr/s", even if you're looking for a .tex file rather than a directory
> and even if there's no "share/" in the current directory either.

This would be good if there was a "share/" in the current directory.
I'm fine with the concept of `all-completions' being able to return
"infinite" candidates, but it would be nice if it obeyed some rules.
For instance, ivy-mode can handle the file names being "infinte",
because of the concept of directories being non-terminal completion
nodes. This concept can be adapted to all compltetion types with
"infinite" candidates. And there would be zero confusion:
`all-completions' would immediately return a list of strings, some of
them terminal nodes, some of them "directories".  Then it only remains
to provide a generic `file-directory-p' and we're done.

> Yet, we don't want all-completions to return elements of the form
> "/home/monnier/private/package/emacs/trunk/lisp/" since that would
> result in a lot of redundancy that then needs to be found&removed.

With a concept of "active directory" and "non-terminal" node this can be done.
"active directory" would be a generic `default-directory', and
"non-terminal" would
be a generic predicate `file-directory-p'.

> So, no, all-completions just doesn't always return "valid answers".
> Instead, it returns chunks of text that can added to some prefix (which
> you can find via `completion-boundaries'), the result of which should be
> a prefix of a valid answer.

We could have the cake and eat it too with my approach described above.

>> About the duplicate entries, I think it should be the responsibility
>> of the caller to remove the duplicates.
>
> That's the case currently.  The completion-table is called and the
> caller is the UI, and currently it's the UI's responsibility to remove
> the duplicates.

So Info returning duplicates is a bug that should be fixed?

>> Here's my line of thought: a completion function is expected to have
>> an O(N) complexity, where N is the amount of candidates.  Removing
>> duplicates is O(N^2) at worst, and O(NlogN) at best.
>
> Actually, with a hash-table it's pretty much down to O(N).

Yeah, but we're not using that. And having no assumptions on the data,
the hashing function would be the most basic one.

Oleh

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20365; Package emacs. (Mon, 20 Apr 2015 12:31:02 GMT) Full text and rfc822 format available.

Message #44 received at 20365 <at> debbugs.gnu.org (full text, mbox):

From: Oleh Krehel <ohwoeowho <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 20365 <at> debbugs.gnu.org
Subject: Re: bug#20365: 24.5;
 all-completions returns duplicates for Info-read-node-name-1
Date: Mon, 20 Apr 2015 14:30:27 +0200

Please check the "fix-info-dups" brach with a fix. This patch removes
the duplicates and generally simplifies the completion of Info files.

Oleh

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20365; Package emacs. (Mon, 20 Apr 2015 14:40:03 GMT) Full text and rfc822 format available.

Message #47 received at 20365 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Oleh Krehel <ohwoeowho <at> gmail.com>
Cc: 20365 <at> debbugs.gnu.org
Subject: Re: bug#20365: 24.5;
 all-completions returns duplicates for Info-read-node-name-1
Date: Mon, 20 Apr 2015 10:38:26 -0400

> nodes. This concept can be adapted to all compltetion types with
> "infinite" candidates. And there would be zero confusion:
> `all-completions' would immediately return a list of strings, some of
> them terminal nodes, some of them "directories".  Then it only remains
> to provide a generic `file-directory-p' and we're done.

Yes, that's also what I was thinking.  Basically, if all-completions
returns something which requires further input, then this something
should include the "terminating char" which makes completion-boundaries
change (which is how to detect that your candidate list is out-of-date
because the input has "moved to another directory").

>> That's the case currently.  The completion-table is called and the
>> caller is the UI, and currently it's the UI's responsibility to remove
>> the duplicates.
> So Info returning duplicates is a bug that should be fixed?

Our UI already does remove duplicates (not in info.el, of course, since
our UI is in minibuffer.el).

>>> Here's my line of thought: a completion function is expected to have
>>> an O(N) complexity, where N is the amount of candidates.  Removing
>>> duplicates is O(N^2) at worst, and O(NlogN) at best.
>> Actually, with a hash-table it's pretty much down to O(N).
> Yeah, but we're not using that.

Not sure who's "we", here.  But the point is that if the performance of
delete-dups becomes a problem, it can be improved.

> And having no assumptions on the data, the hashing function would be
> the most basic one.

I don't think that should make much difference: (make-hash-table :test #'equal)
should work just fine.


        Stefan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20365; Package emacs. (Mon, 20 Apr 2015 14:54:02 GMT) Full text and rfc822 format available.

Message #50 received at 20365 <at> debbugs.gnu.org (full text, mbox):

From: Oleh Krehel <ohwoeowho <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 20365 <at> debbugs.gnu.org
Subject: Re: bug#20365: 24.5;
 all-completions returns duplicates for Info-read-node-name-1
Date: Mon, 20 Apr 2015 16:52:57 +0200

On Mon, Apr 20, 2015 at 4:38 PM, Stefan Monnier
<monnier <at> iro.umontreal.ca> wrote:
>> nodes. This concept can be adapted to all compltetion types with
>> "infinite" candidates. And there would be zero confusion:
>> `all-completions' would immediately return a list of strings, some of
>> them terminal nodes, some of them "directories".  Then it only remains
>> to provide a generic `file-directory-p' and we're done.
>
> Yes, that's also what I was thinking.  Basically, if all-completions
> returns something which requires further input, then this something
> should include the "terminating char" which makes completion-boundaries
> change (which is how to detect that your candidate list is out-of-date
> because the input has "moved to another directory").

I'm not fully familiar with the concept of completion-boundaries, but I
have a feeling that I won't like it, specifically the multiple callback
passes for a response to a single completion input change.

Is the concept of completion-boundaries irreplaceable, could it maybe be
replaced by a notion of navigating a tree (just like navigating a file
system, with terminal and non-terminal nodes).  Which Emacs features use
the completion-boundaries concept?

>>> That's the case currently.  The completion-table is called and the
>>> caller is the UI, and currently it's the UI's responsibility to remove
>>> the duplicates.
>> So Info returning duplicates is a bug that should be fixed?
>
> Our UI already does remove duplicates (not in info.el, of course, since
> our UI is in minibuffer.el).

I've added a fix to Info in any case. Can you please look at it in the
fix-info-dups branch?

>>>> Here's my line of thought: a completion function is expected to have
>>>> an O(N) complexity, where N is the amount of candidates.  Removing
>>>> duplicates is O(N^2) at worst, and O(NlogN) at best.
>>> Actually, with a hash-table it's pretty much down to O(N).
>> Yeah, but we're not using that.
>
> Not sure who's "we", here.  But the point is that if the performance of
> delete-dups becomes a problem, it can be improved.

We are the happy Emacs users.

>> And having no assumptions on the data, the hashing function would be
>> the most basic one.
>
> I don't think that should make much difference: (make-hash-table :test #'equal)
> should work just fine.

OK, that's good to keep in mind. But even better is to avoid placing the
overhead on the UI, be it minibuffer.el or ivy.el or whatever.

Oleh

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20365; Package emacs. (Mon, 20 Apr 2015 19:16:03 GMT) Full text and rfc822 format available.

Message #53 received at 20365 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Oleh Krehel <ohwoeowho <at> gmail.com>
Cc: 20365 <at> debbugs.gnu.org
Subject: Re: bug#20365: 24.5;
 all-completions returns duplicates for Info-read-node-name-1
Date: Mon, 20 Apr 2015 15:14:55 -0400

> I'm not fully familiar with the concept of completion-boundaries, but I

I know, but if you intend to support a completion UI that works with
existing completion-tables, you're going to have to become at least
somewhat acquainted to it.

> Is the concept of completion-boundaries irreplaceable, could it maybe be
> replaced by a notion of navigating a tree (just like navigating a file
> system, with terminal and non-terminal nodes).  Which Emacs features use
> the completion-boundaries concept?

All the "special" examples of completion I showed (plus file name completion).

> OK, that's good to keep in mind. But even better is to avoid placing the
> overhead on the UI, be it minibuffer.el or ivy.el or whatever.

Many completion tables are made by combining other completion tables.
Doing the delete-dups at every step is going to be more code and more
runtime work than doing it once at the end in the UI.  So the UI will
have to do it.  Completion-tables can also do it if they want, but it
should not be necessary for correctness.


        Stefan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20365; Package emacs. (Sun, 17 Apr 2022 10:52:01 GMT) Full text and rfc822 format available.

Message #56 received at 20365 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Oleh Krehel <ohwoeowho <at> gmail.com>
Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>, 20365 <at> debbugs.gnu.org
Subject: Re: bug#20365: 24.5; all-completions returns duplicates for
 Info-read-node-name-1
Date: Sun, 17 Apr 2022 12:51:17 +0200

Oleh Krehel <ohwoeowho <at> gmail.com> writes:

> Please check the "fix-info-dups" brach with a fix. This patch removes
> the duplicates and generally simplifies the completion of Info files.

I've now applied that patch to Emacs 29 (with some changes), and I think
that fixes the reported problem here, so I'm closing the bug report.
(The discussion then went on to more general issues of duplicates in
completions.)

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

bug marked as fixed in version 29.1, send any further explanations to 20365 <at> debbugs.gnu.org and Oleh Krehel <ohwoeowho <at> gmail.com> Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Sun, 17 Apr 2022 10:52:02 GMT) Full text and rfc822 format available.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 15 May 2022 11:24:04 GMT) Full text and rfc822 format available.

This bug report was last modified 3 years and 59 days ago.

Previous Next

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.

GNU bug report logs - #20365 24.5; all-completions returns duplicates for Info-read-node-name-1

GNU bug report logs - #20365
24.5; all-completions returns duplicates for Info-read-node-name-1