Path: csiph.com!eternal-september.org!feeder.eternal-september.org!nntp.eternal-september.org!.POSTED!not-for-mail
From: Tim Rentsch
Newsgroups: comp.lang.c
Subject: Re: sorting Was: Isn't that beauty ? (no it's not)
Date: Tue, 07 Apr 2026 20:04:38 -0700
Organization: A noiseless patient Spider
Lines: 56
Message-ID: <863416xid5.fsf@linuxsc.com>
References: <10otm7r$1ntrg$1@raubtier-asyl.eternal-september.org> <10phfh8$u3oe$1@dont-email.me> <20260319214046.000004b5@yahoo.com> <10pi29u$15114$1@dont-email.me> <10pih35$17mc2$5@dont-email.me> <20260320125824.00002759@yahoo.com> <10pjdll$1h0eh$6@dont-email.me> <10pjedq$1if62$1@dont-email.me> <20260320150840.000045db@yahoo.com> <10pjj56$1k50i$1@dont-email.me> <20260322020323.00006a16@yahoo.com> <864ilnzqib.fsf@linuxsc.com> <20260407022213.00001ae7@yahoo.com> <86mrzfxvv5.fsf@linuxsc.com> <10r35bp$29m9s$1@paganini.bofh.team>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Date: Wed, 08 Apr 2026 03:04:43 +0000 (UTC)
Injection-Info: dont-email.me; posting-host="078da6b6fdfa8b5cb04ea822fecda5b3"; logging-data="3458103"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/VTfU/n0n5JzXc562JOL5kIqR210elvB4="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:BSK+PIpSgfFQvg4/s0n2vnJeTxc= sha1:if4i9TSquncpOsDSGzJR/y5/UH8=
Xref: csiph.com comp.lang.c:397417
antispam@fricas.org (Waldek Hebisch) writes:
> Tim Rentsch wrote:
>
>> Michael S writes:
>>
>>> On Mon, 06 Apr 2026 15:13:32 -0700
>>> Tim Rentsch wrote:
>>>
>>>> Obviously what words (or lines) appear can affect the character
>>>> counts, but that still doesn't change BigO. By the way you don't
>>>> say whether you are sorting words or lines.
>>>
>>> This sub-thread is about sorting lines with average length of few
>>> dozens characters, i.e. many times longer than log2(N). That was
>>> stated at one of earlier posts.
>>
>> That has nothing to do with BigO, which is about asymptotic
>> behavior as N goes to infinity.
>
> Honest Big(O) varies length of the key with N. In practical range
> key length may be constant, but fixing length gives unrealistic
> problem for Big(O) analysis: without varying key length there are
> finitely many keys and sorting is equivalent to counting how many
> times each key appears in the input.
There's an important clarification to make here. There are two
independent parameters: N, the number of records to be sorted (a
record being a character string that is either a word or a line),
and the (maximum) length of any record, which in the discussion is
bounded above by a constant.
What is being asked about is the behavior as a function of N as N
increases without bound. Of course, theoretically, as the number of
records increases without bound, eventually the character strings
being sorted will have to have duplicates. But long before that
happens the index variable N will run out of bits. This property is
well understood in theoretical computer science, not just in terms
of how much time is used but how much storage is needed. In theory
log N bits are needed just to hold the index pointers. It is
customary though, outside of purely theoretical discussions, to
ignore that and treat the size of an index or pointer variable as
constant. In purely theoretical terms no sorting algorithm is
O(N*log(N)), because just incrementing a pointer takes more than
O(1) operations. Surely the discussions in Knuth's books take such
things into consideration. On the practical side, which almost
always covers discussions that take place in usenet newsgroups,
these minor theoretical issues are ignored. Any actul computer in
the physical universe will never have occasion to process more than
2**512 records, due to the limitation of the number of elementary
particles in the universe, so a 512-bit address (or index value)
always suffices.
So yes, in theory, the considerations around processing an enormous
number of values are relevant. In the practical context of the
discussion underway here, they aren't.