Groups | Search | Server Info | Login | Register


Groups > comp.lang.lisp > #60712

Re: Faster remove-duplicates with sorted list.

From tpeplt <tpeplt@gmail.com>
Newsgroups comp.lang.lisp
Subject Re: Faster remove-duplicates with sorted list.
Date 2025-09-07 15:55 -0400
Organization A noiseless patient Spider
Message-ID <87wm6akpfq.fsf@gmail.com> (permalink)
References <109kdki$3q7rv$1@dont-email.me>

Show all headers | View raw


"B. Pym" <Nobody447095@here-nor-there.org> writes:

> "Pierre R. Mai" wrote:
>
>> > I believe that remove-duplicates has to assume that the list is not
>> > sorted. Therefore it has to compare each element, element by element
>> > ( ~N^2 ). Whereas a side by side goes like 2*N.
>> >
>> > The only problem I have is excessive consing which significantly slows
>> > down the algorithm.
>> 
>> (defun uniquify-sorted-list (list &key (key #'identity) (test #'eql))
>>   (loop for element in list
>>         for element-key = (funcall key element)
>>         for last-element-key = (load-time-value (gensym))
>>         then element-key
>>         unless (funcall test element-key last-element-key)
>>         collect element))
>
> Testing under SBCL:
>
> (uniquify-sorted-list '(a b b c d d e f f f g))
>   ===>
> (A)
>
> (uniquify-sorted-list '(2 2 3 4 4 4 5 5 6 7))
>   ===>
> (2)
>
> (uniquify-sorted-list '(2 2 3 4 4 4 5 5 6 7) :test #'=)
>   ===>
> debugger invoked on a SIMPLE-TYPE-ERROR in thread
> #<THREAD "main thread" RUNNING {23EAC0D1}>:
>   Argument Y is not a NUMBER: #:G5
>
>
> It seems that except in the first pass through the loop
> "last-element-key" is always equal to "element-key".
>
> Again we have proof of the unusability of Common Lisp and, in
> particular, Loop.
>
> Again we have proof of the mindless arrogance of fans of
> Common Lisp.  That complex chunk of code was not tested even
> once by its author.
>

The mistake in the LOOP solution is akin to using DO* when
it should use DO.  Each time ‘last-element-key’ is updated,
it uses the current value of ‘element-key’ when it should
use the previous value of ‘element-key’.  This is fixed by
changing:

  "FOR last-element-key ..."

to:

  "AND last-element-key..."

See:
  http://www.ai.mit.edu/projects/iiip/doc/CommonLISP/HyperSpec/Body/sec_6-1-2-1.html

```
(defun uniquify-sorted-list (list &key (key #'identity) (test #'eql))
  "Create a list of items from sorted LIST but removing all duplicate
items.  (If LIST is not sorted, then duplicates in the list may remain
in the result list.)"
  (loop
    for element in list
    for element-key = (funcall key element)
    and last-element-key = nil then element-key
    for result = nil
      then (funcall test element-key last-element-key)
    unless result
      collect element))
```

(uniquify-sorted-list '(2 2 3 4 4 4 5 5 6 7) :test #'=)
;;=> (2 3 4 5 6 7)

Or, use LOOP’s destructuring and stepping by tails of a
list:

```
(defun uniquify-sorted-list (list &key (key #'identity) (test #'eql))
  "Create a list of items from sorted LIST but removing all duplicate
items.  (If LIST is not sorted, then duplicates in the list may remain
in the result list.)"
  (loop
    for (one two) on list
    if (null two)
      collect one
    else
      unless (funcall test (funcall key one) (funcall key two))
        collect one))
```

(uniquify-sorted-list '(5 5 6 7 11.2 11.2 33.4 44.5 44.5 44.5) :test #'=)
;;=> (5 6 7 11.2 33.4 44.5)

See:
  http://www.ai.mit.edu/projects/iiip/doc/CommonLISP/HyperSpec/Body/sec_6-1-1-7.html
and
  http://www.ai.mit.edu/projects/iiip/doc/CommonLISP/HyperSpec/Body/sec_6-1-2-1-3.html

-- 
The lyf so short, the craft so long to lerne.
- Geoffrey Chaucer, The Parliament of Birds.

Back to comp.lang.lisp | Previous | NextPrevious in thread | Find similar


Thread

Re: Faster remove-duplicates with sorted list. "B. Pym" <Nobody447095@here-nor-there.org> - 2025-09-07 16:59 +0000
  Re: Faster remove-duplicates with sorted list. tpeplt <tpeplt@gmail.com> - 2025-09-07 15:55 -0400

csiph-web