Path: csiph.com!eternal-september.org!feeder.eternal-september.org!nntp.eternal-september.org!.POSTED!not-for-mail
From: Tim Rentsch
Newsgroups: comp.lang.c
Subject: Re: sorting Was: Isn't that beauty ? (no it's not)
Date: Sun, 26 Apr 2026 07:29:02 -0700
Organization: A noiseless patient Spider
Lines: 125
Message-ID: <86340h3idt.fsf@linuxsc.com>
References: <10otm7r$1ntrg$1@raubtier-asyl.eternal-september.org> <863416xid5.fsf@linuxsc.com> <10r94t2$or8$1@reader1.panix.com> <861pgkaje2.fsf@linuxsc.com> <10rf8rr$j50$1@reader1.panix.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Date: Sun, 26 Apr 2026 14:29:04 +0000 (UTC)
Injection-Info: dont-email.me; posting-host="ba9581de696c0982039c62980e84b25b"; logging-data="1723352"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18C6YJHJ3kKCS2i5xYTiNiOZ3dA4JjRBFM="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:UlIqJgAdS7s89qgHaUPabR/0jQY= sha1:NTcrAKi/L7SUMS96Jqc5AqCiRdE=
Xref: csiph.com comp.lang.c:397992
cross@spitfire.i.gajendra.net (Dan Cross) writes:
> In article <861pgkaje2.fsf@linuxsc.com>,
> Tim Rentsch wrote:
>
>> cross@spitfire.i.gajendra.net (Dan Cross) writes:
>>
>>> In article <863416xid5.fsf@linuxsc.com>,
>>> Tim Rentsch wrote:
>>>
>>>> antispam@fricas.org (Waldek Hebisch) writes:
>>>>
>>>>> Tim Rentsch wrote:
>>>>>
>>>>>> Michael S writes:
>>>>>>
>>>>>>> On Mon, 06 Apr 2026 15:13:32 -0700
>>>>>>> Tim Rentsch wrote:
>>>>>>>
>>>>>>>> Obviously what words (or lines) appear can affect the character
>>>>>>>> counts, but that still doesn't change BigO. By the way you don't
>>>>>>>> say whether you are sorting words or lines.
>>>>>>>
>>>>>>> This sub-thread is about sorting lines with average length of few
>>>>>>> dozens characters, i.e. many times longer than log2(N). That was
>>>>>>> stated at one of earlier posts.
>>>>>>
>>>>>> That has nothing to do with BigO, which is about asymptotic
>>>>>> behavior as N goes to infinity.
>>>>>
>>>>> Honest Big(O) varies length of the key with N. In practical range
>>>>> key length may be constant, but fixing length gives unrealistic
>>>>> problem for Big(O) analysis: without varying key length there are
>>>>> finitely many keys and sorting is equivalent to counting how many
>>>>> times each key appears in the input.
>>>>
>>>> There's an important clarification to make here. There are two
>>>> independent parameters: N, the number of records to be sorted (a
>>>> record being a character string that is either a word or a line),
>>>> and the (maximum) length of any record, which in the discussion is
>>>> bounded above by a constant.
>>>>
>>>> What is being asked about is the behavior as a function of N as N
>>>> increases without bound. Of course, theoretically, as the number of
>>>> records increases without bound, eventually the character strings
>>>> being sorted will have to have duplicates. But long before that
>>>> happens the index variable N will run out of bits. This property is
>>>> well understood in theoretical computer science, not just in terms
>>>> of how much time is used but how much storage is needed. In theory
>>>> log N bits are needed just to hold the index pointers. It is
>>>> customary though, outside of purely theoretical discussions, to
>>>> ignore that and treat the size of an index or pointer variable as
>>>> constant. In purely theoretical terms no sorting algorithm is
>>>> O(N*log(N)), because just incrementing a pointer takes more than
>>>> O(1) operations. Surely the discussions in Knuth's books take such
>>>> things into consideration.
>>>
>>> If by "Knuth's books" you're referring to TAOCP, then he does
>>> not seem to give it too much attention. [...]
>>
>> In most of the chapter on Sorting, TAOCP uses the number of
>> comparisons as the basis of comparison. But not everywhere
>> in the chapter.
>
> It seems like I pointed out a few places where he acknowledges
> a more complex picture. Are there other places to which you are
> referring?
>
>> My statement was not meant to be limited to the discussion of
>> Sorting.
>
> What do you think I was referring to, exactly? I was responding
> to your comments about Knuth's books, specifically, and the
> quoted text above, which seems concerned solely with sorting.
>
> As I mentioned, Dasgupta et al _do_ mention that analysis of
> algorithms is more complex than most treatments, because of
> precisely the idea that as things grow, seemingly constant
> operations are no longer constant. As I mentioned, they did
> this within the context of Fibonacci numbers, not sorting, but
> the point stands. Since, as you say, your statement was not
> meant to be limited to discussions of sorting, then it seems to
> be supporting what you are saying.
>
>>>> On the practical side, which almost
>>>> always covers discussions that take place in usenet newsgroups,
>>>> these minor theoretical issues are ignored. Any actul computer in
>>>> the physical universe will never have occasion to process more than
>>>> 2**512 records, due to the limitation of the number of elementary
>>>> particles in the universe, so a 512-bit address (or index value)
>>>> always suffices.
>>>>
>>>> So yes, in theory, the considerations around processing an enormous
>>>> number of values are relevant. In the practical context of the
>>>> discussion underway here, they aren't.
>>>
>>> Indeed. As Rob Pike once put it, "Fancy algorithms are slow
>>> when $n$ is small, and $n$ is usually small. Fancy algorithms
>>> have big constants. Until you know that $n$ is frequently going
>>> to get big, don't get fancy."
>>
>> Whether the Rob Pike advisory is applicable or not is beside the
>> point.
>
> On the contrary; I mentioned it because it supports your thesis.
>
>> My comment was about fancy mathematics, not fancy
>> algorithms. My statement is just as applicable to Tim Sort (one
>> of the fancier sorting algorithms) as it is to Bubble Sort.
>
> There's nothing particularly fancy about it, but that aside, I'm
> honestly not sure what exactly I said that you are (apparently?)
> disagreeing with.
>
> I was responding with a specific statement about Knuth's books,
> a reference to another book in support of your statement, and
> yet another reference to something that Pike had written that,
> again, supports your point.
I read through your comments. My sense is there are still some
misunderstandings (probably on both sides) but I think it's
probably better just to leave it at that. Also the discussion
has wandered kind of far afield for comp.lang.c, so in the
interest of courtesy I am stopping here, except to add thank you
for your comments.