Path: csiph.com!eternal-september.org!feeder.eternal-september.org!.POSTED!not-for-mail From: tpeplt Newsgroups: comp.lang.lisp Subject: Re: Faster remove-duplicates with sorted list. Date: Sun, 07 Sep 2025 15:55:37 -0400 Organization: A noiseless patient Spider Lines: 108 Message-ID: <87wm6akpfq.fsf@gmail.com> References: <109kdki$3q7rv$1@dont-email.me> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit Injection-Date: Sun, 07 Sep 2025 19:55:41 +0000 (UTC) Injection-Info: dont-email.me; posting-host="8e8075693a007523daa6d5d6846eb97d"; logging-data="4095405"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/wyX9To7tD45jPAv/BGPgnVq+YVPrt/Hs=" User-Agent: Gnus/5.13 (Gnus v5.13) Cancel-Lock: sha1:hDJnLr38+a8VYrSI9aXRu36+F3o= sha1:WopRrdwvgVwF2d0NTMy9m7Wl4go= Xref: csiph.com comp.lang.lisp:60712 "B. Pym" writes: > "Pierre R. Mai" wrote: > >> > I believe that remove-duplicates has to assume that the list is not >> > sorted. Therefore it has to compare each element, element by element >> > ( ~N^2 ). Whereas a side by side goes like 2*N. >> > >> > The only problem I have is excessive consing which significantly slows >> > down the algorithm. >> >> (defun uniquify-sorted-list (list &key (key #'identity) (test #'eql)) >> (loop for element in list >> for element-key = (funcall key element) >> for last-element-key = (load-time-value (gensym)) >> then element-key >> unless (funcall test element-key last-element-key) >> collect element)) > > Testing under SBCL: > > (uniquify-sorted-list '(a b b c d d e f f f g)) > ===> > (A) > > (uniquify-sorted-list '(2 2 3 4 4 4 5 5 6 7)) > ===> > (2) > > (uniquify-sorted-list '(2 2 3 4 4 4 5 5 6 7) :test #'=) > ===> > debugger invoked on a SIMPLE-TYPE-ERROR in thread > #: > Argument Y is not a NUMBER: #:G5 > > > It seems that except in the first pass through the loop > "last-element-key" is always equal to "element-key". > > Again we have proof of the unusability of Common Lisp and, in > particular, Loop. > > Again we have proof of the mindless arrogance of fans of > Common Lisp. That complex chunk of code was not tested even > once by its author. > The mistake in the LOOP solution is akin to using DO* when it should use DO. Each time ‘last-element-key’ is updated, it uses the current value of ‘element-key’ when it should use the previous value of ‘element-key’. This is fixed by changing: "FOR last-element-key ..." to: "AND last-element-key..." See: http://www.ai.mit.edu/projects/iiip/doc/CommonLISP/HyperSpec/Body/sec_6-1-2-1.html ``` (defun uniquify-sorted-list (list &key (key #'identity) (test #'eql)) "Create a list of items from sorted LIST but removing all duplicate items. (If LIST is not sorted, then duplicates in the list may remain in the result list.)" (loop for element in list for element-key = (funcall key element) and last-element-key = nil then element-key for result = nil then (funcall test element-key last-element-key) unless result collect element)) ``` (uniquify-sorted-list '(2 2 3 4 4 4 5 5 6 7) :test #'=) ;;=> (2 3 4 5 6 7) Or, use LOOP’s destructuring and stepping by tails of a list: ``` (defun uniquify-sorted-list (list &key (key #'identity) (test #'eql)) "Create a list of items from sorted LIST but removing all duplicate items. (If LIST is not sorted, then duplicates in the list may remain in the result list.)" (loop for (one two) on list if (null two) collect one else unless (funcall test (funcall key one) (funcall key two)) collect one)) ``` (uniquify-sorted-list '(5 5 6 7 11.2 11.2 33.4 44.5 44.5 44.5) :test #'=) ;;=> (5 6 7 11.2 33.4 44.5) See: http://www.ai.mit.edu/projects/iiip/doc/CommonLISP/HyperSpec/Body/sec_6-1-1-7.html and http://www.ai.mit.edu/projects/iiip/doc/CommonLISP/HyperSpec/Body/sec_6-1-2-1-3.html -- The lyf so short, the craft so long to lerne. - Geoffrey Chaucer, The Parliament of Birds.