Groups | Search | Server Info | Login | Register
Groups > comp.lang.c > #397992
| From | Tim Rentsch <tr.17687@z991.linuxsc.com> |
|---|---|
| Newsgroups | comp.lang.c |
| Subject | Re: sorting Was: Isn't that beauty ? (no it's not) |
| Date | 2026-04-26 07:29 -0700 |
| Organization | A noiseless patient Spider |
| Message-ID | <86340h3idt.fsf@linuxsc.com> (permalink) |
| References | <10otm7r$1ntrg$1@raubtier-asyl.eternal-september.org> <863416xid5.fsf@linuxsc.com> <10r94t2$or8$1@reader1.panix.com> <861pgkaje2.fsf@linuxsc.com> <10rf8rr$j50$1@reader1.panix.com> |
cross@spitfire.i.gajendra.net (Dan Cross) writes: > In article <861pgkaje2.fsf@linuxsc.com>, > Tim Rentsch <tr.17687@z991.linuxsc.com> wrote: > >> cross@spitfire.i.gajendra.net (Dan Cross) writes: >> >>> In article <863416xid5.fsf@linuxsc.com>, >>> Tim Rentsch <tr.17687@z991.linuxsc.com> wrote: >>> >>>> antispam@fricas.org (Waldek Hebisch) writes: >>>> >>>>> Tim Rentsch <tr.17687@z991.linuxsc.com> wrote: >>>>> >>>>>> Michael S <already5chosen@yahoo.com> writes: >>>>>> >>>>>>> On Mon, 06 Apr 2026 15:13:32 -0700 >>>>>>> Tim Rentsch <tr.17687@z991.linuxsc.com> wrote: >>>>>>> >>>>>>>> Obviously what words (or lines) appear can affect the character >>>>>>>> counts, but that still doesn't change BigO. By the way you don't >>>>>>>> say whether you are sorting words or lines. >>>>>>> >>>>>>> This sub-thread is about sorting lines with average length of few >>>>>>> dozens characters, i.e. many times longer than log2(N). That was >>>>>>> stated at one of earlier posts. >>>>>> >>>>>> That has nothing to do with BigO, which is about asymptotic >>>>>> behavior as N goes to infinity. >>>>> >>>>> Honest Big(O) varies length of the key with N. In practical range >>>>> key length may be constant, but fixing length gives unrealistic >>>>> problem for Big(O) analysis: without varying key length there are >>>>> finitely many keys and sorting is equivalent to counting how many >>>>> times each key appears in the input. >>>> >>>> There's an important clarification to make here. There are two >>>> independent parameters: N, the number of records to be sorted (a >>>> record being a character string that is either a word or a line), >>>> and the (maximum) length of any record, which in the discussion is >>>> bounded above by a constant. >>>> >>>> What is being asked about is the behavior as a function of N as N >>>> increases without bound. Of course, theoretically, as the number of >>>> records increases without bound, eventually the character strings >>>> being sorted will have to have duplicates. But long before that >>>> happens the index variable N will run out of bits. This property is >>>> well understood in theoretical computer science, not just in terms >>>> of how much time is used but how much storage is needed. In theory >>>> log N bits are needed just to hold the index pointers. It is >>>> customary though, outside of purely theoretical discussions, to >>>> ignore that and treat the size of an index or pointer variable as >>>> constant. In purely theoretical terms no sorting algorithm is >>>> O(N*log(N)), because just incrementing a pointer takes more than >>>> O(1) operations. Surely the discussions in Knuth's books take such >>>> things into consideration. >>> >>> If by "Knuth's books" you're referring to TAOCP, then he does >>> not seem to give it too much attention. [...] >> >> In most of the chapter on Sorting, TAOCP uses the number of >> comparisons as the basis of comparison. But not everywhere >> in the chapter. > > It seems like I pointed out a few places where he acknowledges > a more complex picture. Are there other places to which you are > referring? > >> My statement was not meant to be limited to the discussion of >> Sorting. > > What do you think I was referring to, exactly? I was responding > to your comments about Knuth's books, specifically, and the > quoted text above, which seems concerned solely with sorting. > > As I mentioned, Dasgupta et al _do_ mention that analysis of > algorithms is more complex than most treatments, because of > precisely the idea that as things grow, seemingly constant > operations are no longer constant. As I mentioned, they did > this within the context of Fibonacci numbers, not sorting, but > the point stands. Since, as you say, your statement was not > meant to be limited to discussions of sorting, then it seems to > be supporting what you are saying. > >>>> On the practical side, which almost >>>> always covers discussions that take place in usenet newsgroups, >>>> these minor theoretical issues are ignored. Any actul computer in >>>> the physical universe will never have occasion to process more than >>>> 2**512 records, due to the limitation of the number of elementary >>>> particles in the universe, so a 512-bit address (or index value) >>>> always suffices. >>>> >>>> So yes, in theory, the considerations around processing an enormous >>>> number of values are relevant. In the practical context of the >>>> discussion underway here, they aren't. >>> >>> Indeed. As Rob Pike once put it, "Fancy algorithms are slow >>> when $n$ is small, and $n$ is usually small. Fancy algorithms >>> have big constants. Until you know that $n$ is frequently going >>> to get big, don't get fancy." >> >> Whether the Rob Pike advisory is applicable or not is beside the >> point. > > On the contrary; I mentioned it because it supports your thesis. > >> My comment was about fancy mathematics, not fancy >> algorithms. My statement is just as applicable to Tim Sort (one >> of the fancier sorting algorithms) as it is to Bubble Sort. > > There's nothing particularly fancy about it, but that aside, I'm > honestly not sure what exactly I said that you are (apparently?) > disagreeing with. > > I was responding with a specific statement about Knuth's books, > a reference to another book in support of your statement, and > yet another reference to something that Pike had written that, > again, supports your point. I read through your comments. My sense is there are still some misunderstandings (probably on both sides) but I think it's probably better just to leave it at that. Also the discussion has wandered kind of far afield for comp.lang.c, so in the interest of courtesy I am stopping here, except to add thank you for your comments.
Back to comp.lang.c | Previous | Next — Previous in thread | Find similar
Re: sorting Was: Isn't that beauty ? (no it's not) Tim Rentsch <tr.17687@z991.linuxsc.com> - 2026-04-11 21:32 -0700
Re: sorting Was: Isn't that beauty ? (no it's not) cross@spitfire.i.gajendra.net (Dan Cross) - 2026-04-12 04:59 +0000
Re: sorting Was: Isn't that beauty ? (no it's not) Tim Rentsch <tr.17687@z991.linuxsc.com> - 2026-04-26 07:29 -0700
csiph-web