Groups | Search | Server Info | Login | Register


Groups > comp.lang.c > #397953

Re: sorting Was: Isn't that beauty ? (no it's not)

From Tim Rentsch <tr.17687@z991.linuxsc.com>
Newsgroups comp.lang.c
Subject Re: sorting Was: Isn't that beauty ? (no it's not)
Date 2026-04-25 15:47 -0700
Organization A noiseless patient Spider
Message-ID <86fr4i3ben.fsf@linuxsc.com> (permalink)
References <10otm7r$1ntrg$1@raubtier-asyl.eternal-september.org> <863416xid5.fsf@linuxsc.com> <10r9d06$32585$1@paganini.bofh.team> <86fr508dwq.fsf@linuxsc.com> <10rjkin$otp$1@reader1.panix.com>

Show all headers | View raw


cross@spitfire.i.gajendra.net (Dan Cross) writes:

> In article <86fr508dwq.fsf@linuxsc.com>,
> Tim Rentsch  <tr.17687@z991.linuxsc.com> wrote:
>
>> antispam@fricas.org (Waldek Hebisch) writes:
>>
>>> Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
>>>
>>>> antispam@fricas.org (Waldek Hebisch) writes:
>>>>
>>>>> Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
>>>>>
>>>>>> Michael S <already5chosen@yahoo.com> writes:
>>>>>>
>>>>>>> On Mon, 06 Apr 2026 15:13:32 -0700
>>>>>>> Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
>>>>>>>
>>>>>>>> Obviously what words (or lines) appear can affect the character
>>>>>>>> counts, but that still doesn't change BigO.  By the way you don't
>>>>>>>> say whether you are sorting words or lines.
>>>>>>>
>>>>>>> This sub-thread is about sorting lines with average length of few
>>>>>>> dozens characters, i.e. many times longer than log2(N).  That was
>>>>>>> stated at one of earlier posts.
>>>>>>
>>>>>> That has nothing to do with BigO, which is about asymptotic
>>>>>> behavior as N goes to infinity.
>>>>>
>>>>> Honest Big(O) varies length of the key with N.  In practical range
>>>>> key length may be constant, but fixing length gives unrealistic
>>>>> problem for Big(O) analysis:  without varying key length there are
>>>>> finitely many keys and sorting is equivalent to counting how many
>>>>> times each key appears in the input.
>>>>
>>>> There's an important clarification to make here.  There are two
>>>> independent parameters:  N, the number of records to be sorted (a
>>>> record being a character string that is either a word or a line),
>>>> and the (maximum) length of any record, which in the discussion is
>>>> bounded above by a constant.
>>>
>>> There is one choice, made often to simplify problem.  But there
>>> is quite universal choice:  N is total size of input data.
>>
>> No, N is usually the number of records to be sorted, not the size of
>> the input.  Also, in the discussion I was responding to, there were
>> clearly two distinct axes relevant to the discussion.
>>
>>>> What is being asked about is the behavior as a function of N as N
>>>> increases without bound.  Of course, theoretically, as the number of
>>>> records increases without bound, eventually the character strings
>>>> being sorted will have to have duplicates.  But long before that
>>>> happens the index variable N will run out of bits.  This property is
>>>> well understood in theoretical computer science, not just in terms
>>>> of how much time is used but how much storage is needed.  In theory
>>>> log N bits are needed just to hold the index pointers.  It is
>>>> customary though, outside of purely theoretical discussions, to
>>>> ignore that and treat the size of an index or pointer variable as
>>>> constant.  In purely theoretical terms no sorting algorithm is
>>>> O(N*log(N)), because just incrementing a pointer takes more than
>>>> O(1) operations.  Surely the discussions in Knuth's books take such
>>>> things into consideration.  On the practical side, which almost
>>>> always covers discussions that take place in usenet newsgroups,
>>>> these minor theoretical issues are ignored.  Any actul computer in
>>>> the physical universe will never have occasion to process more than
>>>> 2**512 records, due to the limitation of the number of elementary
>>>> particles in the universe, so a 512-bit address (or index value)
>>>> always suffices.
>>>>
>>>> So yes, in theory, the considerations around processing an enormous
>>>> number of values are relevant.  In the practical context of the
>>>> discussion underway here, they aren't.
>>>
>>> It was you who insisted on asymptitic complexity, rejecting
>>> practical amendments...
>>
>> This statement isn't right.  BigO was already part of the discussion
>> when I joined in.  Also, it is customary in discussion of sorting
>> algorithms to use the metric of number of comparisons done, without
>> regard to the size of the variables needed to hold the indices of
>> the records being sorted.
>
> Hmm, this response echoes almost exactly what I wrote in
> <10ranaa$ihf$1@reader1.panix.com> on April 10th, also in
> response to Waldek.  Did you see it?

Probably, but I don't remember it specifically.

>> See Knuth chapter 5, on Sorting, in volume 3 of TAOCP.
>
> As for TAOCP, as much as I respect and admire Knuth, I don't
> think "The Art of Computer Programming" is a good reference for
> working programmers.  At 391 pages long, Chapter Five occupies
> more than half of an ~750 page book, the examples are all in
> assembly language for his notional MIX computer, and the
> analysis he presents presumes a mathematics background that,
> frankly, most programmers neither have nor need.

I am not recommending TAOCP as a textbook.  It's a well-known work
and respected for its theoretical treatments.  Also it happens to be
what I consulted before posting, mainly because it was handy.

> "See chapter 5" is thus not useful as a reference, and rather
> comes across as an admonishment.

No admonishment was intented.  I mentioned it only as a supporting
work for my previous statement.

Back to comp.lang.c | Previous | NextPrevious in thread | Next in thread | Find similar


Thread

Re: sorting Was: Isn't that beauty ? (no it's not) antispam@fricas.org (Waldek Hebisch) - 2026-04-09 23:33 +0000
  Re: sorting Was: Isn't that beauty ? (no it's not) cross@spitfire.i.gajendra.net (Dan Cross) - 2026-04-10 11:35 +0000
  Re: sorting Was: Isn't that beauty ? (no it's not) Tim Rentsch <tr.17687@z991.linuxsc.com> - 2026-04-12 07:13 -0700
    Re: sorting Was: Isn't that beauty ? (no it's not) cross@spitfire.i.gajendra.net (Dan Cross) - 2026-04-13 20:44 +0000
      Re: sorting Was: Isn't that beauty ? (no it's not) Tim Rentsch <tr.17687@z991.linuxsc.com> - 2026-04-25 15:47 -0700
        Re: sorting Was: Isn't that beauty ? (no it's not) cross@spitfire.i.gajendra.net (Dan Cross) - 2026-04-27 02:04 +0000
          Re: sorting Was: Isn't that beauty ? (no it's not) Tim Rentsch <tr.17687@z991.linuxsc.com> - 2026-04-26 22:27 -0700
            Re: sorting Was: Isn't that beauty ? (no it's not) cross@spitfire.i.gajendra.net (Dan Cross) - 2026-04-27 14:41 +0000

csiph-web