


realloc() - frequency, conditions, or experiences about relocation?

Started by: Janis Papanagnou <janis_papanagnou+ng@hotmail.com>
First post: 2024-06-17 08:08 +0200
Last post: 2024-06-25 16:16 +0200
Articles: 20 on this page of 100 (27 participants)



Contents

  realloc() - frequency, conditions, or experiences about relocation? Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-06-17 08:08 +0200
    Re: realloc() - frequency, conditions, or experiences about relocation? "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-06-16 23:34 -0700
    Re: realloc() - frequency, conditions, or experiences about relocation? Ben Bacarisse <ben@bsb.me.uk> - 2024-06-17 10:18 +0100
      Re: realloc() - frequency, conditions, or experiences about relocation? Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2024-06-17 10:31 +0100
        Re: realloc() - frequency, conditions, or experiences about relocation? Ben Bacarisse <ben@bsb.me.uk> - 2024-06-17 10:55 +0100
          Re: realloc() - frequency, conditions, or experiences about relocation? Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2024-06-17 13:45 +0100
            Re: realloc() - frequency, conditions, or experiences about relocation? Ben Bacarisse <ben@bsb.me.uk> - 2024-06-17 15:33 +0100
              Re: realloc() - frequency, conditions, or experiences about relocation? Anton Shepelev <anton.txt@g{oogle}mail.com> - 2024-06-17 18:10 +0300
                Re: realloc() - frequency, conditions, or experiences about relocation? Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-06-18 00:09 -0700
                  Re: realloc() - frequency, conditions, or experiences about relocation? Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2024-06-18 11:19 +0100
                    Re: realloc() - frequency, conditions, or experiences about relocation? Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2024-06-18 11:46 +0100
                      Re: realloc() - frequency, conditions, or experiences about relocation? Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-29 00:14 +0000
                        Re: realloc() - frequency, conditions, or experiences about relocation? Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2024-07-02 10:18 +0100
                          Re: realloc() - frequency, conditions, or experiences about relocation? Ben Bacarisse <ben@bsb.me.uk> - 2024-07-02 16:39 +0100
                          Re: realloc() - frequency, conditions, or experiences about relocation? Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-07-03 23:48 +0000
                    Re: realloc() - frequency, conditions, or experiences about relocation? Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-06-24 09:32 -0700
                      Re: realloc() - frequency, conditions, or experiences about relocation? David Brown <david.brown@hesbynett.no> - 2024-06-24 19:19 +0200
                  Indefinite pronouns [was:Re: realloc() - frequency, conditions, or experiences about relocation?] Anton Shepelev <anton.txt@gmail.moc> - 2024-06-20 02:51 +0300
                    Re: Indefinite pronouns vallor <vallor@cultnix.org> - 2024-06-20 00:34 +0000
                      Re: Indefinite pronouns David Brown <david.brown@hesbynett.no> - 2024-06-21 12:59 +0200
                        Re: Indefinite pronouns Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-21 10:31 -0700
                    Re: Indefinite pronouns [was:Re: realloc() - frequency, conditions, or experiences about relocation?] gazelle@shell.xmission.com (Kenny McCormack) - 2024-06-20 12:10 +0000
                      Re: Indefinite pronouns [was: Re: <something technical>] Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-06-20 15:04 +0200
                    Re: Indefinite pronouns [was:Re: realloc() - frequency, conditions, or experiences about relocation?] Cri-Cri <cri@cri.cri.invalid> - 2024-06-21 00:55 +0000
                    Re: Indefinite pronouns [was:Re: realloc() - frequency, conditions, or experiences about relocation?] Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-06-20 23:23 -0700
              Re: realloc() - frequency, conditions, or experiences about relocation? Richard Harnden <richard.nospam@gmail.invalid> - 2024-06-17 16:15 +0100
                Re: realloc() - frequency, conditions, or experiences about relocation? "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-06-17 13:05 -0700
              Re: realloc() - frequency, conditions, or experiences about relocation? Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2024-06-17 16:36 +0100
            Re: realloc() - frequency, conditions, or experiences about relocation? Anton Shepelev <anton.txt@g{oogle}mail.com> - 2024-06-17 18:02 +0300
              Re: realloc() - frequency, conditions, or experiences about relocation? "David Jones" <dajhawk18xx@@nowhere.com> - 2024-06-18 17:59 +0000
              Re: realloc() - frequency, conditions, or experiences about relocation? davidd02@tpg.com.au (David Duffy) - 2024-06-19 06:48 +0000
                Re: realloc() - frequency, conditions, or experiences about relocation? Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2024-06-19 10:12 +0100
                  Re: realloc() - frequency, conditions, or experiences about relocation? Ben Bacarisse <ben@bsb.me.uk> - 2024-06-19 16:36 +0100
                    Re: realloc() - frequency, conditions, or experiences about relocation? David Brown <david.brown@hesbynett.no> - 2024-06-19 19:41 +0200
                      Re: realloc() - frequency, conditions, or experiences about relocation? Ben Bacarisse <ben@bsb.me.uk> - 2024-06-19 22:24 +0100
                        Re: realloc() - frequency, conditions, or experiences about relocation? David Brown <david.brown@hesbynett.no> - 2024-06-20 13:22 +0200
                  Re: realloc() - frequency, conditions, or experiences about relocation? Anton Shepelev <anton.txt@gmail.moc> - 2024-06-20 01:53 +0300
                    Re: realloc() - frequency, conditions, or experiences about relocation? Anton Shepelev <anton.txt@g{oogle}mail.com> - 2024-07-08 19:34 +0300
                Re: realloc() - frequency, conditions, or experiences about relocation? Anton Shepelev <anton.txt@g{oogle}mail.com> - 2024-06-19 15:20 +0300
              Re: realloc() - frequency, conditions, or experiences about relocation? Rich Ulrich <rich.ulrich@comcast.net> - 2024-07-02 00:51 -0400
                Re: realloc() - frequency, conditions, or experiences about relocation? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-07-01 22:10 -0700
                  Re: realloc() - frequency, conditions, or experiences about relocation? Rich Ulrich <rich.ulrich@comcast.net> - 2024-07-02 11:45 -0400
                    Re: realloc() - frequency, conditions, or experiences about relocation? Anton Shepelev <anton.txt@g{oogle}mail.com> - 2024-07-08 20:01 +0300
                      Re: realloc() - frequency, conditions, or experiences about relocation? Rich Ulrich <rich.ulrich@comcast.net> - 2024-07-21 19:40 -0400
                        Re: realloc() - frequency, conditions, or experiences about relocation? Anton Shepelev <anton.txt@g{oogle}mail.com> - 2024-07-23 16:47 +0300
                Re: realloc() - frequency, conditions, or experiences about relocation? Paul <nospam@needed.invalid> - 2024-07-02 03:02 -0400
                  Re: realloc() - frequency, conditions, or experiences about relocation? Rich Ulrich <rich.ulrich@comcast.net> - 2024-07-02 11:52 -0400
                    Re: realloc() - frequency, conditions, or experiences about relocation? Rich Ulrich <rich.ulrich@comcast.net> - 2024-07-02 11:58 -0400
                      Re: realloc() - frequency, conditions, or experiences about relocation? Paul <nospam@needed.invalid> - 2024-07-02 15:09 -0400
                        Re: realloc() - frequency, conditions, or experiences about relocation? James Kuyper <jameskuyper@alumni.caltech.edu> - 2024-07-02 16:54 -0400
                        Re: realloc() - frequency, conditions, or experiences about relocation? James Kuyper <jameskuyper@alumni.caltech.edu> - 2024-07-02 16:58 -0400
            Re: realloc() - frequency, conditions, or experiences about relocation? scott@slp53.sl.home (Scott Lurndal) - 2024-06-17 16:58 +0000
              Re: realloc() - frequency, conditions, or experiences about relocation? "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-06-17 13:12 -0700
            Re: realloc() - frequency, conditions, or experiences about relocation? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-17 16:39 -0700
              Re: realloc() - frequency, conditions, or experiences about relocation? Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2024-06-18 06:12 +0100
              Re: realloc() - frequency, conditions, or experiences about relocation? Bonita Montero <Bonita.Montero@gmail.com> - 2024-06-18 09:56 +0200
        Re: realloc() - frequency, conditions, or experiences about relocation? David Brown <david.brown@hesbynett.no> - 2024-06-17 20:11 +0200
    Re: realloc() - frequency, conditions, or experiences about relocation? Bonita Montero <Bonita.Montero@gmail.com> - 2024-06-17 11:22 +0200
      Re: realloc() - frequency, conditions, or experiences about relocation? Vir Campestris <vir.campestris@invalid.invalid> - 2024-06-20 21:08 +0100
        Re: realloc() - frequency, conditions, or experiences about relocation? Bonita Montero <Bonita.Montero@gmail.com> - 2024-06-21 21:12 +0200
          Re: realloc() - frequency, conditions, or experiences about relocation? Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-24 08:40 +0000
            Re: realloc() - frequency, conditions, or experiences about relocation? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-24 02:55 -0700
              Re: realloc() - frequency, conditions, or experiences about relocation? David Brown <david.brown@hesbynett.no> - 2024-06-24 13:40 +0200
                Re: realloc() - frequency, conditions, or experiences about relocation? Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2024-06-24 18:50 +0100
                  Re: realloc() - frequency, conditions, or experiences about relocation? scott@slp53.sl.home (Scott Lurndal) - 2024-06-24 18:20 +0000
                    Re: realloc() - frequency, conditions, or experiences about relocation? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-24 14:28 -0700
                      Re: realloc() - frequency, conditions, or experiences about relocation? Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2024-06-25 00:22 +0100
                      Re: realloc() - frequency, conditions, or experiences about relocation? "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-06-24 18:31 -0700
                  Re: realloc() - frequency, conditions, or experiences about relocation? Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-25 07:06 +0000
                    Re: realloc() - frequency, conditions, or experiences about relocation? Bonita Montero <Bonita.Montero@gmail.com> - 2024-06-25 10:38 +0200
                      Re: realloc() - frequency, conditions, or experiences about relocation? Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-26 00:51 +0000
                Re: realloc() - frequency, conditions, or experiences about relocation? "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-06-24 12:33 -0700
                  Re: realloc() - frequency, conditions, or experiences about relocation? Bonita Montero <Bonita.Montero@gmail.com> - 2024-06-24 21:48 +0200
                Re: realloc() - frequency, conditions, or experiences about relocation? Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-24 22:59 +0000
              Re: realloc() - frequency, conditions, or experiences about relocation? Bonita Montero <Bonita.Montero@gmail.com> - 2024-06-24 16:19 +0200
              Re: realloc() - frequency, conditions, or experiences about relocation? Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-25 07:02 +0000
                Re: realloc() - frequency, conditions, or experiences about relocation? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-25 03:05 -0700
                  Re: realloc() - frequency, conditions, or experiences about relocation? Richard Damon <richard@damon-family.org> - 2024-06-25 07:21 -0400
                  Re: realloc() - frequency, conditions, or experiences about relocation? Phil Carmody <pc+usenet@asdf.org> - 2024-06-28 11:01 +0300
                    Re: realloc() - frequency, conditions, or experiences about relocation? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-28 04:04 -0700
                  Re: realloc() - frequency, conditions, or experiences about relocation? James Kuyper <jameskuyper@alumni.caltech.edu> - 2024-06-28 06:37 -0400
                    Re: realloc() - frequency, conditions, or experiences about relocation? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-28 04:05 -0700
                Re: realloc() - frequency, conditions, or experiences about relocation? James Kuyper <jameskuyper@alumni.caltech.edu> - 2024-06-28 06:36 -0400
            Re: realloc() - frequency, conditions, or experiences about relocation? Bonita Montero <Bonita.Montero@gmail.com> - 2024-06-24 16:16 +0200
              Down the hall, past the water cooler, third door on the left... (Was: realloc() - frequency, conditions, or experiences about) relocation? gazelle@shell.xmission.com (Kenny McCormack) - 2024-06-24 15:44 +0000
                Re: Down the hall, past the water cooler, third door on the left... (Was: realloc() - frequency, conditions, or experiences about) relocation? Bonita Montero <Bonita.Montero@gmail.com> - 2024-06-24 20:16 +0200
    Re: realloc() - frequency, conditions, or experiences about relocation? David Brown <david.brown@hesbynett.no> - 2024-06-17 14:15 +0200
      Re: realloc() - frequency, conditions, or experiences about relocation? Bonita Montero <Bonita.Montero@gmail.com> - 2024-06-17 14:18 +0200
    Re: realloc() - frequency, conditions, or experiences about relocation? Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-06-17 15:21 +0200
    Re: realloc() - frequency, conditions, or experiences about relocation? scott@slp53.sl.home (Scott Lurndal) - 2024-06-17 16:50 +0000
      Re: realloc() - frequency, conditions, or experiences about relocation? Michael S <already5chosen@yahoo.com> - 2024-06-17 20:20 +0300
        Re: realloc() - frequency, conditions, or experiences about relocation? scott@slp53.sl.home (Scott Lurndal) - 2024-06-17 19:02 +0000
    Re: realloc() - frequency, conditions, or experiences about relocation? Rosario19 <Ros@invalid.invalid> - 2024-06-18 11:50 +0200
    Re: realloc() - frequency, conditions, or experiences about relocation? Bonita Montero <Bonita.Montero@gmail.com> - 2024-06-25 10:48 +0200
      Re: realloc() - frequency, conditions, or experiences about relocation? Vir Campestris <vir.campestris@invalid.invalid> - 2024-06-25 11:55 +0100
        Re: realloc() - frequency, conditions, or experiences about relocation? Bonita Montero <Bonita.Montero@gmail.com> - 2024-06-25 13:28 +0200
          Re: realloc() - frequency, conditions, or experiences about relocation? Vir Campestris <vir.campestris@invalid.invalid> - 2024-06-26 12:15 +0100
            Re: realloc() - frequency, conditions, or experiences about relocation? Bonita Montero <Bonita.Montero@gmail.com> - 2024-06-26 15:01 +0200
      Re: realloc() - frequency, conditions, or experiences about relocation? DFS <nospam@dfs.com> - 2024-06-25 09:56 -0400
        Re: realloc() - frequency, conditions, or experiences about relocation? Bonita Montero <Bonita.Montero@gmail.com> - 2024-06-25 16:16 +0200



#386325 — Re: Indefinite pronouns

From: Keith Thompson <Keith.S.Thompson+u@gmail.com>
Date: 2024-06-21 10:31 -0700
Subject: Re: Indefinite pronouns
Message-ID: <878qyyw6c5.fsf@nosuchdomain.example.com>
In reply to: #386310
David Brown <david.brown@hesbynett.no> writes:
> On 20/06/2024 02:34, vallor wrote:
>> On Thu, 20 Jun 2024 02:51:56 +0300, Anton Shepelev <anton.txt@gmail.moc>
>> wrote in <20240620025156.2ae9300726603b4cb3631547@gmail.moc>:
>> 
>>> Cross-posting to alt.english.usage .
>> I've set the followup-to: same
>
> I've put it back to comp.lang.c.  It's off-topic for that group,

Please don't do that.

[...]

-- 
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */



#386264 — Re: Indefinite pronouns [was:Re: realloc() - frequency, conditions, or experiences about relocation?]

From: gazelle@shell.xmission.com (Kenny McCormack)
Date: 2024-06-20 12:10 +0000
Subject: Re: Indefinite pronouns [was:Re: realloc() - frequency, conditions, or experiences about relocation?]
Message-ID: <v5168d$2fiq3$1@news.xmission.com>
In reply to: #386254
>> > I think it is a modern English idiom, which I dislike as
>> > well.  StackOverflow is full of questions starting like:
>> > "How do you do this?" and "How do I do that?"  They are
>> > informal ways of the more literary "How does one do
>> > this?"  or "What is the way to do that?"
>>
>> I have a different take here.  First the "your" of "your
>> strategy" reads as a definite pronoun, meaning it refers
>> specifically to Ben and not to some unknown other party.
>
>And I am /sure/ it is intended in the general (indefinite)
>sense, as is the `you' in Malcolm's two following sentences:

This sub-thread is certainly interesting, but it ultimately smacks of
people looking for ways to feel insulted.  Victimhood complex, and all that.

But, it makes me think that the problem is the basic paradigm of newsgroup
(i.e., online forum) communication being thought of as personalized.  I.e.,
as in direct person-to-person communication - as if it was being spoken in
a real room with real people (face-to-face).  The fact is, it is not.  It
would be better if we didn't think of it that way.  Rather, it should be
thought of as communication between the speaker and an anonymous void.
I.e., I'm not talking to you - I am talking to the anonymous void.
Everybody is.

Sort of like in the (US) House of Representatives - members are not ever
supposed to be talking to each other.  Rather, they are always speaking to
the void.

Like I am doing now.

This is also why it is good (And, yes, I know this goes against the CW) to
drop attributions, as I have done here.  Keep it anonymous.

-- 

First of all, I do not appreciate your playing stupid here at all.

	- Thomas 'PointedEars' Lahn -



#386266 — Re: Indefinite pronouns [was: Re: <something technical>]

From: Janis Papanagnou <janis_papanagnou+ng@hotmail.com>
Date: 2024-06-20 15:04 +0200
Subject: Re: Indefinite pronouns [was: Re: <something technical>]
Message-ID: <v519c2$2j4vg$1@dont-email.me>
In reply to: #386264
On 20.06.2024 14:10, Kenny McCormack wrote:
> 
> This sub-thread is certainly interesting, but it ultimately smacks of
> people looking for ways to feel insulted.  Victimhood complex, and all that.
> 
> But, it makes me think that the problem is the basic paradigm of newsgroup
> (i.e., online forum) communication being thought of as personalized.  I.e.,
> as in direct person-to-person communication - as if it was being spoken in
> a real room with real people (face-to-face).  The fact is, it is not.  It
> would be better if we didn't think of it that way.  Rather, it should be
> thought of as communication between the speaker and an anonymous void.
> I.e., I'm not talking to you - I am talking to the anonymous void.
> Everybody is.
> 
> Sort of like in the (US) House of Representatives - members are not ever
> supposed to be talking to each other.  Rather, they are always speaking to
> the void.
> 
> Like I am doing now.
> 
> This is also why it is good (And, yes, I know this goes against the CW) to
> drop attributions, as I have done here.  Keep it anonymous.

Part 1

This is hard to achieve given that the technical NG infrastructure and
functions "logically" connect the articles; it's only a little burden
to identify (if not already obvious) the addressee.

I think it would be better to try to stay on the issue as opposed to
reply (as so often done) ad hominem (where arguments don't seem to
exist or don't help anymore). This is of course yet more difficult to
achieve and will in practice [also] not work in Usenet (I'm sure).

Language can be used or interpreted in personal or impersonal forms.
Some communication forms - and more so their semantical contents! -
are (beyond the "you" vs. "one" dichotomy) inherently [set up to be]
personal.

-- Anonymous
:-)

Part 2

That all said, I think it's important to know who said/posted what.
It allows one to associate personal context/background information when
replying. You can also be more assured about the quality of contents
(to the good or bad) or even ignore certain posts. It saves time and
protects one's health and mental sanity.[*]

There are of course also posts (or threads) that just exchange opinions,
and "we" know everyone [typically] has an opinion (and often even in
cases where they are put up against facts). Some folks are known to
post a lot, respond to every thread that appears, contributing facts
(sometimes) but also opinions (or personal offenses); this may be a
nuisance (or just ignored, unless "anonymously" posted).

So far my opinion on this non-technical meta-topic subthread.

Janis
(Darn, I disclosed my identity!)

[*] Wasn't that the inherent problem of all those "social media"
platforms? - Where anonymous posts - and some say that anonymity
does negatively contribute to the language and contents of such
disturbing posts - lead to barbarian communication conditions.
(I know that only from hearsay but it seems common perception.)



#386299 — Re: Indefinite pronouns [was:Re: realloc() - frequency, conditions, or experiences about relocation?]

From: Cri-Cri <cri@cri.cri.invalid>
Date: 2024-06-21 00:55 +0000
Subject: Re: Indefinite pronouns [was:Re: realloc() - frequency, conditions, or experiences about relocation?]
Message-ID: <Nb4dO.139759$2RJ6.5368@fx05.ams4>
In reply to: #386254
On Thu, 20 Jun 2024 02:51:56 +0300, Anton Shepelev wrote:

>> But "one" can also be used as a first person definite pronoun
>> (referring to the speaker), which an online reference tells me is
>> chiefly British English.

We have the same construct in Swedish: 'en', as in "en kanske skulle kunna 
ta en banan", meaning "one could perhaps have a banana." Referring to 
oneself from an outside perspective, in, for instance, a situation where 
there used to be several items on offer to guests, but now there are only 
bananas and some dry sponge cake left. IOW, of two less desirable items, 
one could accept a banana.

-- 
Cri-Cri



#386303 — Re: Indefinite pronouns [was:Re: realloc() - frequency, conditions, or experiences about relocation?]

From: Tim Rentsch <tr.17687@z991.linuxsc.com>
Date: 2024-06-20 23:23 -0700
Subject: Re: Indefinite pronouns [was:Re: realloc() - frequency, conditions, or experiences about relocation?]
Message-ID: <86plsahl0e.fsf@linuxsc.com>
In reply to: #386254
Anton Shepelev <anton.txt@gmail.moc> writes:

> Cross-posting to alt.english.usage .
>
> Tim Rentsch to Anton Shepelev:
>
>>> I think it is a modern English idiom, which I dislike as
>>> well.  StackOverflow is full of questions starting like:
>>> "How do you do this?" and "How do I do that?"  They are
>>> informal ways of the more literary "How does one do
>>> this?"  or "What is the way to do that?"
>>
>> I have a different take here.  First the "your" of "your
>> strategy" reads as a definite pronoun, meaning it refers
>> specifically to Ben and not to some unknown other party.
>
> And I am /sure/ it is intended in the general (indefinite)
> sense,

I don't know why you think that.  I don't know any native
English speakers who would read it as other than referring
to the person being responded to (who was Ben in this case).
Responding to Malcolm's statement, Ben said "It's odd to
call it mine", so it seems Ben also read it as a definite
pronoun, referring to himself. 

As for the rest of your comments, you're reaching bad
conclusions because you're not looking in the right places.
To investigate the meaning and usage of words and phrases,
the best first place to look is always a dictionary.  In
the process of writing my earlier response, I consulted
roughly half a dozen different online dictionaries, reading
definitions for 'one', 'you', 'they', 'indefinite pronoun',
'definite pronoun', and probably some other terms.  I also
looked at non-dictionary sources if they looked relevant,
but my starting point was dictionaries.  Oh, I also looked
up both 'idiom' and 'idiomatic' (which even though they are
related don't mean the same thing).

Incidentally, on the three pages you gave links for,
all of them had at least one example that used "you"
as an indefinite pronoun.

One point I want to clear up.  A couple of times you
characterize the indefinite pronoun usage of "you"
as "modern".  It is not in any sense modern.  I am
a native speaker of English, speaking and reading
the language for well over 60 years.  Furthermore I
was raised by a grammarian.  In all of that time and
experience there was never any hint that "you" as an
indefinite pronoun was new or unusual or considered
substandard or slang or unacceptable.  It is simply
standard usage, and has been for longer than my
lifetime.

>> But "one" can also be used as a first person definite
>> pronoun (referring to the speaker), which an online
>> reference tells me is chiefly British English.
>
> I had no idea it could, nor does Wikipedia.  Can you share
> an example of a definite first-person `one'?

(A) Pick a good search engine (I use duckduckgo.com).
(B) Search for the two words  one definition.
(C) Read the entries for every online dictionary that
is found, or at least the top five or six.

You should find a couple of examples.  It was by going
through this process myself that I discovered "one"
is sometimes used as a first person definite pronoun.



#386103

From: Richard Harnden <richard.nospam@gmail.invalid>
Date: 2024-06-17 16:15 +0100
Message-ID: <v4pjuf$n98h$1@dont-email.me>
In reply to: #386099
On 17/06/2024 15:33, Ben Bacarisse wrote:
> Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
> 
>> On 17/06/2024 10:55, Ben Bacarisse wrote:
>>> Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
>>>
>>>> On 17/06/2024 10:18, Ben Bacarisse wrote:
>>>>> Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
>>>>>
>>>>>> In a recent thread realloc() was a substantial part of the discussion.
>>>>>> "Occasionally" the increased data storage will be relocated along
>>>>>> with the previously stored data. On huge data sets that might be a
>>>>>> performance factor. Is there any experience or are there any concrete
>>>>>> factors about the conditions when this relocation happens? - I could
>>>>>> imagine that it's no issue as long as you're in some kB buffer range,
>>>>>> but if, say, we're using realloc() to substantially increase buffers
>>>>>> often it might be an issue to consider. It would be good to get some
>>>>>> feeling about that internal.
>>>>> There is obviously a cost, but there is (usually) no alternative if
>>>>> contiguous storage is required.  In practice, the cost is usually
>>>>> moderate and can be very effectively managed by using an exponential
>>>>> allocation scheme: at every reallocation multiply the storage space by
>>>>> some factor greater than 1 (I often use 3/2, but doubling is often used
>>>>> as well).  This results in O(log(N)) rather than O(N) allocations as in
>>>>> your code that added a constant to the size.  Of course, some storage is
>>>>> wasted (that /might/ be retrieved by a final realloc down to the final
>>>>> size) but that's rarely significant.
>>>>>
>>>> So can we work it out?
>>> What is "it"?
>>>
>>>> Let's assume for the moment that the allocations have a semi-normal
>>>> distribution,
>>> What allocations?  The allocations I talked about don't have that
>>> distribution.
>>>
>>>> with negative values disallowed. Now ignoring the first few
>>>> values, if we have allocated, say, 1K, we ought to be able to predict the
>>>> value by integrating the distribution from 1k to infinity and taking the
>>>> mean.
>>> I have no idea what you are talking about.  What "value" are you looking
>>> to calculate?
>>>
>> We have a continuously growing buffer, and we want the best strategy for
>> reallocations as the stream of characters comes at us. So, given we know how
>> many characters have arrived, can we predict how many will arrive, and
>> therefore ask for the best amount when we reallocate, so that we neither
>> make too many reallocations (reallocate on every byte received) nor ask for
>> too much (demand SIZE_MAX memory when the first byte is received)?
> 
> Obviously not, or we'd use the prediction.  Your question was probably
> rhetorical, but it didn't read that way.
> 
>> Your strategy for avoiding these extremes is exponential growth.
> 
> It's odd to call it mine.  It's very widely known and used.  "The one I
> mentioned" might be a less confusing description.
> 
>> You
>> allocate a small amount for the first few bytes. Then you use exponential
>> growth, with a factor of either 2 or 1.5. My question is whether or not we
>> can be cuter. And of course we need to know the statistical distribution of
>> the input files. And I'm assuming a semi-normal distribution, ignoring the
>> files with small values, which we will allocate enough for anyway.
>>
>> And so we integrate the distribution between the point we are at and
>> infinity. Then we take the mean. And that gives us a best estimate of how
>> many bytes are to come, and therefore how much to grow the buffer by.
> 
> I would be surprised if that were worth the effort at run time.  A
> static analysis of "typical" input sizes might be interesting as that
> could be used to get an estimate of good factors to use, but anything
> more complicated than maybe a few factors (e.g. doubling up to 1MB then
> 3/2 thereafter) is likely to be too messy to be useful.
> 
> Also, the cost of reallocations is not constant.  Larger ones are
> usually more costly than small ones, so if one were going to a lot of
> effort to make run-time guesses, that cost should be factored in as
> well.
> 
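
The O(log(N)) versus O(N) point quoted above can be checked by simply counting growth steps. A rough sketch, assuming a 1 KiB starting size and a 1 MiB target (both picked only for illustration; the function names are hypothetical and not from the thread):

```c
/* Steps needed to grow from `start` to at least `target` bytes when each
   reallocation adds a constant `step` bytes (step must be > 0). */
unsigned long count_additive(unsigned long long start,
                             unsigned long long step,
                             unsigned long long target)
{
    unsigned long n = 0;
    for (unsigned long long cap = start; cap < target; cap += step)
        n++;
    return n;
}

/* Same, but each reallocation multiplies the capacity by num/den
   (num > den > 0), e.g. 2/1 for doubling or 3/2.  Overflow of cap * num
   is not handled; the values here are assumed to be small. */
unsigned long count_geometric(unsigned long long start,
                              unsigned long long num,
                              unsigned long long den,
                              unsigned long long target)
{
    unsigned long n = 0;
    unsigned long long cap = start;
    while (cap < target) {
        unsigned long long next = cap * num / den;
        if (next <= cap)        /* guarantee progress for tiny capacities */
            next = cap + 1;
        cap = next;
        n++;
    }
    return n;
}
```

With these assumed sizes, count_additive(1024, 1024, 1048576) gives 1023 steps, while count_geometric gives 10 steps for a factor of 2 and 18 for 3/2. This counts reallocations only; it says nothing about how often realloc() actually has to move the block.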

I usually keep track:

struct
{
     size_t used;
     size_t allocated;
     void *data;
};

Then, if used + new_size is more than what's already been allocated then 
a realloc will be required.

Start with an initial allocated size that's 'reasonable' - the happy path 
will never need any reallocs.

Otherwise multiply by some factor.  Typically I just double it.
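
A minimal sketch of that bookkeeping, assuming a 64-byte initial size and hypothetical names (`struct buf`, `buf_append`, `buf_free`), none of which come from the post:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Same three fields as the struct above; the tag name is hypothetical. */
struct buf {
    size_t used;       /* bytes currently holding data */
    size_t allocated;  /* bytes obtained from realloc */
    void  *data;
};

enum { BUF_INITIAL = 64 };  /* assumed "reasonable" starting size */

/* Append n bytes from src.  Returns 0 on success, -1 on failure,
   in which case the buffer is left unchanged. */
int buf_append(struct buf *b, const void *src, size_t n)
{
    if (n > SIZE_MAX - b->used)
        return -1;                        /* used + n would overflow */
    size_t need = b->used + n;

    if (need > b->allocated) {
        size_t cap = b->allocated ? b->allocated : BUF_INITIAL;
        while (cap < need) {
            if (cap > SIZE_MAX / 2) {     /* doubling would overflow */
                cap = need;
                break;
            }
            cap *= 2;                     /* the doubling step */
        }
        void *p = realloc(b->data, cap);  /* old block stays valid on failure */
        if (p == NULL)
            return -1;
        b->data = p;
        b->allocated = cap;
    }
    if (n > 0)
        memcpy((char *)b->data + b->used, src, n);
    b->used = need;
    return 0;
}

/* Release the storage and reset the bookkeeping. */
void buf_free(struct buf *b)
{
    free(b->data);
    b->data = NULL;
    b->used = b->allocated = 0;
}
```

Start from a zeroed struct, e.g. `struct buf b = {0, 0, NULL};`, and call buf_append for each chunk. Assigning the result of realloc to a temporary rather than straight to `b->data` avoids losing the old block if the call fails.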




#386118

From: "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>
Date: 2024-06-17 13:05 -0700
Message-ID: <v4q4u4$t6bd$1@dont-email.me>
In reply to: #386103
On 6/17/2024 8:15 AM, Richard Harnden wrote:
> On 17/06/2024 15:33, Ben Bacarisse wrote:
>> Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
>>
>>> On 17/06/2024 10:55, Ben Bacarisse wrote:
>>>> Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
>>>>
>>>>> On 17/06/2024 10:18, Ben Bacarisse wrote:
>>>>>> Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
>>>>>>
>>>>>>> In a recent thread realloc() was a substantial part of the discussion.
>>>>>>> "Occasionally" the increased data storage will be relocated along
>>>>>>> with the previously stored data. On huge data sets that might be a
>>>>>>> performance factor. Is there any experience or are there any concrete
>>>>>>> factors about the conditions when this relocation happens? - I could
>>>>>>> imagine that it's no issue as long as you're in some kB buffer range,
>>>>>>> but if, say, we're using realloc() to substantially increase buffers
>>>>>>> often it might be an issue to consider. It would be good to get some
>>>>>>> feeling about that internal.
>>>>>> There is obviously a cost, but there is (usually) no alternative if
>>>>>> contiguous storage is required.  In practice, the cost is usually
>>>>>> moderate and can be very effectively managed by using an exponential
>>>>>> allocation scheme: at every reallocation multiply the storage space by
>>>>>> some factor greater than 1 (I often use 3/2, but doubling is often used
>>>>>> as well).  This results in O(log(N)) rather than O(N) allocations as in
>>>>>> your code that added a constant to the size.  Of course, some storage is
>>>>>> wasted (that /might/ be retrieved by a final realloc down to the final
>>>>>> size) but that's rarely significant.
>>>>>>
>>>>> So can we work it out?
>>>> What is "it"?
>>>>
>>>>> Let's assume for the moment that the allocations have a semi-normal
>>>>> distribution,
>>>> What allocations?  The allocations I talked about don't have that
>>>> distribution.
>>>>
>>>>> with negative values disallowed. Now ignoring the first few
>>>>> values, if we have allocated, say, 1K, we ought to be able to 
>>>>> predict the
>>>>> value by integrating the distribution from 1k to infinity and 
>>>>> taking the
>>>>> mean.
>>>> I have no idea what you are talking about.  What "value" are you 
>>>> looking
>>>> to calculate?
>>>>
>>> We have a continuously growing buffer, and we want the best strategy for
>>> reallocations as the stream of characters comes at us. So, given we 
>>> now how
>>> many characters have arrived, can we predict how many will arrive, and
>>> therefore ask for the best amount when we reallocate, so that we neither
>>> make too many reallocation (reallocate on every byte received) or ask 
>>> for
>>> too much (demand SIZE_MAX memory when the first byte is received).?
>>
>> Obviously not, or we'd use the prediction.  You question was probably
>> rhetorical, but it didn't read that way.
>>
>>> Your strategy for avoiding these extremes is exponential growth.
>>
>> It's odd to call it mine.  It's very widely know and used.  "The one I
>> mentioned" might be less confusing description.
>>
>>> You
>>> allocate a small amount for the first few bytes. Then you use 
>>> exponential
>>> growth, with a factor of ether 2 or 1.5. My question is whether or 
>>> not we
>>> can be cuter. And of course we need to know the statistical 
>>> distribution of
>>> the input files. And I'm assuming a semi-normal distribution, 
>>> ignoring the
>>> files with small values, which we will allocate enough for anyway.
>>>
>>> And so we integrate the distribution between the point we are at and
>>> infinity. Then we tkae the mean. And that gives us a best estimate of 
>>> how
>>> many bytes are to come, and therefore how much to grow the buffer by.
>>
>> I would be surprised if that were worth the effort at run time.  A
>> static analysis of "typical" input sizes might be interesting as that
>> could be used to get an estimate of good factors to use, but anything
>> more complicated than maybe a few factors (e.g. doubling up to 1MB then
>> 3/2 thereafter) is likely to be too messy to useful.
>>
>> Also, the cost of reallocations is not constant.  Larger ones are
>> usually more costly than small ones, so if one were going to a lot of
>> effort to make run-time guesses, that cost should be factored in as
>> well.
>>
> 
> I usually keep track:
> 
> struct
> {
>      size_t used;
>      size_t allocated;
>      void *data;
> };
> 
> Then, if used + new_size is more than what's already been allocated then 
> a realloc will be required.

Fwiw, I remember way back using an n-ary tree of regions to accomplish 
it. The trigger for creating a new region was very similar to your 
logic. It performed pretty well and did not use realloc. Indeed it was 
for a special use case. I remember having region partitions that would 
link to other regions. Actually, it kind of reminded me of a strange 
version of ropes:

https://en.wikipedia.org/wiki/Rope_(data_structure)

Fwiw, here is my old C code for a region:

https://pastebin.com/raw/f37a23918
(raw text, no ads)

Iirc, I first mentioned it in:

https://groups.google.com/g/comp.lang.c/c/7oaJFWKVCTw/m/sSWYU9BUS_QJ




> 
> Start with an initial allocated size that's 'resonable' - the happy path 
> will never need any reallocs.
> 
> Otherwise multiply by some factor.  Typicall I just double it.
> 
> 
> 



#386105

From: Malcolm McLean <malcolm.arthur.mclean@gmail.com>
Date: 2024-06-17 16:36 +0100
Message-ID: <v4pl6q$nn9o$1@dont-email.me>
In reply to: #386099
On 17/06/2024 15:33, Ben Bacarisse wrote:
> Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
> 
>> On 17/06/2024 10:55, Ben Bacarisse wrote:
>>> Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
>>>
>>>> On 17/06/2024 10:18, Ben Bacarisse wrote:
>>>>> Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
>>>>>
>>>>>> In a recent thread realloc() was a substantial part of the discussion.
>>>>>> "Occasionally" the increased data storage will be relocated along
>>>>>> with the previously stored data. On huge data sets that might be a
>>>>>> performance factor. Is there any experience or are there any concrete
>>>>>> factors about the conditions when this relocation happens? - I could
>>>>>> imagine that it's no issue as long as you're in some kB buffer range,
>>>>>> but if, say, we're using realloc() to substantially increase buffers
>>>>>> often it might be an issue to consider. It would be good to get some
>>>>>> feeling about that internal.
>>>>> There is obviously a cost, but there is (usually) no alternative if
>>>>> contiguous storage is required.  In practice, the cost is usually
>>>>> moderate and can be very effectively managed by using an exponential
>>>>> allocation scheme: at every reallocation multiply the storage space by
>>>>> some factor greater than 1 (I often use 3/2, but doubling is often used
>>>>> as well).  This results in O(log(N)) rather than O(N) allocations as in
>>>>> your code that added a constant to the size.  Of course, some storage is
>>>>> wasted (that /might/ be retrieved by a final realloc down to the final
>>>>> size) but that's rarely significant.
>>>>>
>>>> So can we work it out?
>>> What is "it"?
>>>
>>>> Let's assume for the moment that the allocations have a semi-normal
>>>> distribution,
>>> What allocations?  The allocations I talked about don't have that
>>> distribution.
>>>
>>>> with negative values disallowed. Now ignoring the first few
>>>> values, if we have allocated, say, 1K, we ought to be able to predict the
>>>> value by integrating the distribution from 1k to infinity and taking the
>>>> mean.
>>> I have no idea what you are talking about.  What "value" are you looking
>>> to calculate?
>>>
>> We have a continuously growing buffer, and we want the best strategy for
>> reallocations as the stream of characters comes at us. So, given we now how
>> many characters have arrived, can we predict how many will arrive, and
>> therefore ask for the best amount when we reallocate, so that we neither
>> make too many reallocation (reallocate on every byte received) or ask for
>> too much (demand SIZE_MAX memory when the first byte is received).?
> 
> Obviously not, or we'd use the prediction.  You question was probably
> rhetorical, but it didn't read that way.
> 
>> Your strategy for avoiding these extremes is exponential growth.
> 
> It's odd to call it mine.  It's very widely know and used.  "The one I
> mentioned" might be less confusing description.
> 
>> You
>> allocate a small amount for the first few bytes. Then you use exponential
>> growth, with a factor of ether 2 or 1.5. My question is whether or not we
>> can be cuter. And of course we need to know the statistical distribution of
>> the input files. And I'm assuming a semi-normal distribution, ignoring the
>> files with small values, which we will allocate enough for anyway.
>>
>> And so we integrate the distribution between the point we are at and
>> infinity. Then we tkae the mean. And that gives us a best estimate of how
>> many bytes are to come, and therefore how much to grow the buffer by.
> 
> I would be surprised if that were worth the effort at run time.  A
> static analysis of "typical" input sizes might be interesting as that
> could be used to get an estimate of good factors to use, but anything
> more complicated than maybe a few factors (e.g. doubling up to 1MB then
> 3/2 thereafter) is likely to be too messy to useful.
>
There's virtually no run-time effort, unless you ask the caller to pass in a 
customised distribution, which you analyse on the fly, which would be 
quite a bit of work.
All the work is done beforehand. We need a statistical distribution of 
the file sizes of the files we are interested in. So, probably, text 
files on personal computers. Then we'll exclude the small files, say 
under 1kb which will have an odd distribution for various reasons, and 
which we are not interested in as we can easily afford 1k as an initial 
buffer.

And we're probably looking at a semi-normal, maybe log-normal 
distribution. There's no reason to suspect it would be anything odd. And 
with the normal distribution there is no closed form integral, but 
tables of integrals are published.

So we convert 1K to a Z-score, integrate from that to infinity, halve 
the result, and that gives us an estimate of the most likely file size - 
having established that the file is over 1k, half will be below and half 
above this size. So that's the next amount to realloc. Say, for the sake 
of argument, 4K. Then we do the same thing, starting from 4k, and 
working out the most likely file size, given that the file is over 4K. 
Now the distribution tends to flatten out towards the tail, so if the best 
guess, given at least 1K, was 4K, the best guess given 4k won't be 8K. It 
will be 10k, maybe 12k. Do the same again for 12k. And we'll get a 
series of numbers like this.

1k, 4k, 10k, 20k, 50k, 120k, 500k, 2MB, 8MB ...

and so on, rapidly increasing to SIZE_MAX. And then at runtime we just 
hardcode those in, it's a lookup table with not too many entries.
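
A rough sketch of such a lookup table, assuming the illustrative series above is taken at face value (it is not derived from real file statistics). The names growth_table and next_capacity are hypothetical, and falling back to doubling past the last entry is an assumption, since the series is only said to continue towards SIZE_MAX.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sizes copied from the series in the text; illustrative, not measured. */
static const size_t growth_table[] = {
    (size_t)1   << 10,   /* 1k   */
    (size_t)4   << 10,   /* 4k   */
    (size_t)10  << 10,   /* 10k  */
    (size_t)20  << 10,   /* 20k  */
    (size_t)50  << 10,   /* 50k  */
    (size_t)120 << 10,   /* 120k */
    (size_t)500 << 10,   /* 500k */
    (size_t)2   << 20,   /* 2MB  */
    (size_t)8   << 20    /* 8MB  */
};

/* Return the first table entry that can hold `needed` bytes.
   Past the end of the table, fall back to doubling (assumption). */
static size_t next_capacity(size_t needed)
{
    size_t n = sizeof growth_table / sizeof growth_table[0];
    assert(n > 0);

    for (size_t i = 0; i < n; i++)
        if (growth_table[i] >= needed)
            return growth_table[i];

    size_t cap = growth_table[n - 1];
    while (cap < needed) {
        if (cap > SIZE_MAX / 2)
            return SIZE_MAX;     /* cannot double any further */
        cap *= 2;
    }
    return cap;
}
```

The value returned would be passed to realloc whenever the buffer overflows its current table entry.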

Because we've chosen the mean, half the time you will reallocate. You 
can easily fine tune the strategy by choosing a proportion other than 
0.5, depending on whether saving memory or reducing allocations is more 
important to you.

And the hard part is getting some real statistics to work on.
>
> Also, the cost of reallocations is not constant.  Larger ones are
> usually more costly than small ones, so if one were going to a lot of
> effort to make run-time guesses, that cost should be factored in as
> well.
> 
Unfortunately yes. Real optimisation problems can be almost impossible 
for reasons like this. If the cost of reallocations isn't constant, 
you've got to put in correcting factors, and then what was a fairly 
simple procedure becomes difficult.

-- 
Check out my hobby project.
http://malcolmmclean.github.io/babyxrc



#386101

From: Anton Shepelev <anton.txt@g{oogle}mail.com>
Date: 2024-06-17 18:02 +0300
Message-ID: <20240617180249.96dfaafa89392827aa162434@g{oogle}mail.com>
In reply to: #386084
[cross-posted to: sci.stat.math]

Malcolm McLean:

> We have a continuously growing buffer, and we want the
> best strategy for reallocations as the stream of
> characters comes at us. So, given we now how many
> characters have arrived, can we predict how many will
> arrive,

Do you mean in the next bunch, or in total (till the end of
the buffer's lifetime)?

> and therefore ask for the best amount when we reallocate,
> so that we neither make too many reallocation (reallocate
> on every byte received) or ask for too much (demand
> SIZE_MAX memory when the first byte is received).?
>
> Your strategy for avoiding these extremes is exponential
> growth. You allocate a small amount for the first few
> bytes. Then you use exponential growth, with a factor of
> ether 2 or 1.5.

This strategy ensures a constant ratio between the amount of
reallocated data and the length of the buffer by making
reallocations less frequent as the buffer grows.
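
A small simulation can make that ratio concrete. This is only a sketch: bytes_copied is a hypothetical helper, it appends one byte at a time, and it pessimistically assumes that every reallocation moves the whole block, which real allocators often avoid.

```c
#include <assert.h>
#include <stddef.h>

/* Count the bytes copied while appending n bytes one at a time,
   assuming every reallocation moves the block.
   The growth factor is num/den (e.g. 2/1 or 3/2). */
static size_t bytes_copied(size_t n, size_t initial_cap, size_t num, size_t den)
{
    assert(initial_cap > 0 && den > 0 && num > den);

    size_t cap = initial_cap, used = 0, copied = 0;
    while (used < n) {
        if (used == cap) {
            size_t grown = cap / den * num;  /* may round down for small caps */
            if (grown <= cap)
                grown = cap + 1;             /* always make progress */
            copied += used;                  /* old contents move to the new block */
            cap = grown;
        }
        used++;
    }
    return copied;
}
```

With doubling from 16 bytes, appending 1,000,000 bytes copies 1,048,560 bytes in total under these assumptions, so the copied amount stays within a small constant multiple of the final length however large the buffer gets.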

> And so we integrate the distribution between the point we
> are at and infinity. Then we tkae the mean. And that gives
> us a best estimate of how many bytes are to come, and
> therefore how much to grow the buffer by.

You have an a priori distribution of the buffer size (can be
tracked on-the-fly, if unknown beforehand) and a partially
filled buffer.  The task is to calculate the a posteriori
distribution of /that/ buffer's final size, and then to
allocate the predicted value based on a good percentile.

How about using a percentile instead of the mean, e.g. if
the current size corresponds to percentile p, you allocate a
capacity corresponding to percentile 1-(1-p)/k , where k>1
denotes the balance between space and time efficiency.  For
example, if the required size corresponds to the 60th
percentile and k=2, you allocate a capacity sufficient to
hold 100-(100-60)/2=80% of buffers.
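
The percentile rule itself is a one-liner; a sketch follows, with next_percentile as a hypothetical name and percentiles expressed as fractions in [0, 1]. Turning the resulting percentile back into a byte count still needs the quantile function of whatever distribution is assumed, which is not shown here.

```c
#include <assert.h>

/* Given the percentile p (as a fraction 0..1) of the size currently
   required, return the percentile to allocate for: 1 - (1 - p)/k. */
static double next_percentile(double p, double k)
{
    assert(p >= 0.0 && p <= 1.0 && k > 1.0);
    return 1.0 - (1.0 - p) / k;
}
```

With p = 0.60 and k = 2 this gives roughly 0.80, matching the example above; each step closes a fraction (k-1)/k of the remaining gap to the 100th percentile.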

-- 
()  ascii ribbon campaign -- against html e-mail
/\  www.asciiribbon.org   -- against proprietary attachments



#386190

From: "David Jones" <dajhawk18xx@@nowhere.com>
Date: 2024-06-18 17:59 +0000
Message-ID: <v4shu2$1ff4q$1@dont-email.me>
In reply to: #386101
Anton Shepelev wrote:

> [cross-posted to: ci.stat.math]
> 
> Malcolm McLean:
> 
> > We have a continuously growing buffer, and we want the
> > best strategy for reallocations as the stream of
> > characters comes at us. So, given we now how many
> > characters have arrived, can we predict how many will
> > arrive,
> 
> Do you mean in the next bunch, or in total (till the end of
> the buffer's lifetime)?
> 
> > and therefore ask for the best amount when we reallocate,
> > so that we neither make too many reallocation (reallocate
> > on every byte received) or ask for too much (demand
> > SIZE_MAX memory when the first byte is received).?
> > 
> > Your strategy for avoiding these extremes is exponential
> > growth. You allocate a small amount for the first few
> > bytes. Then you use exponential growth, with a factor of
> > ether 2 or 1.5.
> 
> This strategy ensures a constant ratio between the amount of
> reallocated data to the length of the buffer by making
> reallocations less frequent as the buffer grows.
> 
> > And so we integrate the distribution between the point we
> > are at and infinity. Then we tkae the mean. And that gives
> > us a best estimate of how many bytes are to come, and
> > therefore how much to grow the buffer by.
> 
> You have an apriori distribution of the buffer size (can be
> tracked on-the-fly, if unknown beforehand) and a partially
> filled buffer.  The task is to calculate the a-posteriori
> distribution of that buffer's final size, and then to
> allocate the predicted value based on a good percentile.
> 
> How about using a percentile instead of the mean, e.g. if
> the current size corresponds to percentile p, you allocate a
> capacity corresponding to percentile 1-(1-p)/k , where k>1
> denotes the balance between space and time efficency.  For
> example, if the 60th percentile of the buffer is required
> and k=2, you allocate a capacity sufficient to hold
> 100-(100-60)/2=80% of buffers.

Based on essentially no background to this question, not much can be
said. However, if one starts from the suggestion above to use the mean
of some distribution (or later some percentile), one notes that the
"mean" is just the minimiser of a quadratic cost function, so an
improvement would be to base the choice on some more realistic cost
function, chosen for the actual application. Given that the scenario
apparently involves a sequence of such decisions, the obvious extension
of the cost-based approach would be to employ some form of dynamic
programming. Of course, this might not be appealing, in which case one
might choose the theoretically-simple approach of tuning a policy based
on good stochastic simulations of the situation.
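
For reference, the remark about the mean can be written out. This is standard decision theory rather than anything specific to this thread; X is the final buffer size, c the capacity chosen, Q the quantile function of X, and the per-byte costs a (for allocating too little) and b (for allocating too much) are assumed symbols introduced only for this illustration.

```latex
% Quadratic cost: the minimiser is the mean.
\[
  \operatorname*{arg\,min}_{c}\; \mathbb{E}\bigl[(X-c)^2\bigr] \;=\; \mathbb{E}[X]
\]
% Asymmetric linear cost: the minimiser is a quantile.
\[
  \operatorname*{arg\,min}_{c}\;
  \mathbb{E}\bigl[\,a\,(X-c)^{+} + b\,(c-X)^{+}\bigr]
  \;=\; Q\!\left(\frac{a}{a+b}\right)
\]
```

With a = b the second rule picks the median, so choosing a percentile instead of the mean amounts to choosing a different cost function.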



#386219

From: davidd02@tpg.com.au (David Duffy)
Date: 2024-06-19 06:48 +0000
Message-ID: <v4tuvf$1qto5$1@dont-email.me>
In reply to: #386101
In sci.stat.math Anton Shepelev <anton.txt@g{oogle}mail.com> wrote:
> [cross-posted to: ci.stat.math]
> 
> Malcolm McLean:
> 
>> We have a continuously growing buffer, and we want the
>> best strategy for reallocations as the stream of
>> characters comes at us. So, given we now how many
>> characters have arrived, can we predict how many will
>> arrive,
> 
> Do you mean in the next bunch, or in total (till the end of
> the buffer's lifetime)?
> 
Isn't this a halting problem?  Aren't the more important data: 
how much memory the user is allowed to allocate, the properties of
the current system's memory allocation algorithm, when your stream
will have to go to disc or other slow large volume storage, how
the stream can be compressed on the fly (the latter might well give
strong predictions for future storage requirements based on what 
has been read to date)?



#386226

From: Malcolm McLean <malcolm.arthur.mclean@gmail.com>
Date: 2024-06-19 10:12 +0100
Message-ID: <v4u7dm$1t2pu$1@dont-email.me>
In reply to: #386219
On 19/06/2024 07:48, David Duffy wrote:
> In sci.stat.math Anton Shepelev <anton.txt@g{oogle}mail.com> wrote:
>> [cross-posted to: ci.stat.math]
>>
>> Malcolm McLean:
>>
>>> We have a continuously growing buffer, and we want the
>>> best strategy for reallocations as the stream of
>>> characters comes at us. So, given we now how many
>>> characters have arrived, can we predict how many will
>>> arrive,
>>
>> Do you mean in the next bunch, or in total (till the end of
>> the buffer's lifetime)?
>>
> Isn't this a halting problem?  Aren't the more important data:
> how much memory the user is allowed to allocate, the properties of
> the current system's memory allocation algorithm, when your stream
> will have to go to disc or other slow large volume storage, how
> the stream can be compressed on the fly (the latter might well give
> strong predictions for future storage requirements based on what
> has been read to date).
> 
> 
No. We have to have some knowledge. And what we probably know is that the 
input is a file stored on someone's personal computer. And someone has 
published on the statistical distribution of such files. And they have a 
log-normal distribution with a mean and a median which he gives. So with 
that information, we can work out, given that a file is at least N 
characters, what is the probability that an allocation of any size will 
contain the whole file, and how many bytes, on average, will be wasted.
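
Written out, under the assumption that file size X is continuous with CDF F and density f (for a log-normal with hypothetical parameters mu and sigma, F is the normal CDF applied to the log of the size), the two quantities for an allocation of c >= N bytes would be:

```latex
% Log-normal CDF (mu, sigma are assumed parameters).
\[
  F(x) \;=\; \Phi\!\left(\frac{\ln x - \mu}{\sigma}\right)
\]
% Probability that c bytes hold the whole file, given at least N bytes seen.
\[
  P(X \le c \mid X \ge N) \;=\; \frac{F(c) - F(N)}{1 - F(N)}
\]
% Average number of bytes wasted when the file does fit.
\[
  \mathbb{E}\bigl[\,c - X \mid N \le X \le c\,\bigr]
  \;=\; \frac{1}{F(c) - F(N)} \int_{N}^{c} (c - x)\, f(x)\, dx
\]
```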

Statistical analysis can't tell us what an allocation will cost versus the 
cost of hogging memory we don't need, however. It can't tell us what to 
do. Just put us in the picture.
-- 
Check out my hobby project.
http://malcolmmclean.github.io/babyxrc



#386241

From: Ben Bacarisse <ben@bsb.me.uk>
Date: 2024-06-19 16:36 +0100
Message-ID: <875xu5t066.fsf@bsb.me.uk>
In reply to: #386226
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

> No. We have to have some knowledge. And what we probaby know is that the
> input is a file stored on someone's personal computer.  And someone has
> published on the statistical distribution of such files

That's not the case that matters (to me at least).  If the input is a
file, we have a much better way of "guessing" the size than guessing and
growing -- just ask for the size.  Sure, we might need to make
adjustments if the file is changing, but there is always a better
measure than any statistical analysis.

To some extent this seems like a solution in search of a problem.
Growing the buffer exponentially is simple and effective.

-- 
Ben.



#386244

From: David Brown <david.brown@hesbynett.no>
Date: 2024-06-19 19:41 +0200
Message-ID: <v4v58t$230rh$1@dont-email.me>
In reply to: #386241
On 19/06/2024 17:36, Ben Bacarisse wrote:
> Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
> 
>> No. We have to have some knowledge. And what we probaby know is that the
>> input is a file stored on someone's personal computer.  And someone has
>> published on the statistical distribution of such files
> 
> That's not the case that matters (to me at least).  If the input is a
> file, we have a much better way of "guessing" the size than guessing and
> growing -- just ask for the size.  Sure, we might need to make
> adjustments if the file is changing, but there is always a better
> measure than any statistical analysis.
> 
> To some extent this seems like a solution in search of a problem.

It seems more like a solution that doesn't exist in search of a problem 
with absurdly unrealistic requirements.  And even if Malcolm's solution 
existed, and the problem existed, it /still/ wouldn't work - knowing the 
distribution of file sizes tells us nothing about the size of any given 
file.

> Growing the buffer exponentially is simple and effective.
> 

Yes, that's the general way to handle buffers when you don't know what 
size they should be.

A better solution for this sort of program is usually, as you say, 
asking the OS for the file size (there is no standard library function 
for getting the file size, but it's not hard to do for any realistic 
target OS).  And then for big files, prefer mmap to reading the file 
into a buffer.

It's only really for unsized "files" such as piped input that you have 
no way of getting the size, and then exponential growth is the way to 
go.  Personally, I'd start with a big size (perhaps 10 MB) that is 
bigger than you are likely to need in practice, but small enough that it 
is negligible on even vaguely modern computers. Then the realloc code is 
unlikely to be used (but it can still be there for completeness).
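
As a sketch of the "ask the OS" route on a POSIX system (other targets need their own call): stream_size is a hypothetical helper built on fstat() and fileno(), and it reports -1 for anything that is not a regular file, which is the case where the growing buffer is still needed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Return the size in bytes of a regular file, or -1 if the stream has
   no meaningful size (pipe, terminal, ...) or fstat() fails. POSIX only. */
static long long stream_size(FILE *fp)
{
    assert(fp != NULL);

    struct stat st;
    if (fstat(fileno(fp), &st) != 0)
        return -1;
    if (!S_ISREG(st.st_mode))
        return -1;               /* unsized input: grow the buffer instead */
    return (long long)st.st_size;
}
```

A caller could allocate stream_size(fp) bytes up front when the result is non-negative and keep the exponential-growth path for the -1 case, or for a file that grows while it is being read.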




#386251

From: Ben Bacarisse <ben@bsb.me.uk>
Date: 2024-06-19 22:24 +0100
Message-ID: <87o77wsk18.fsf@bsb.me.uk>
In reply to: #386244
David Brown <david.brown@hesbynett.no> writes:

> On 19/06/2024 17:36, Ben Bacarisse wrote:
>> Growing the buffer exponentially is simple and effective.
>
> Yes, that's the general way to handle buffers when you don't know what size
> they should be.
>
> A better solutions for this sort of program is usually, as you say, asking
> the OS for the file size (there is no standard library function for getting
> the file size, but it's not hard to do for any realistic target OS).  And
> then for big files, prefer mmap to reading the file into a buffer.
>
> It's only really for unsized "files" such as piped input that you have no
> way of getting the size, and then exponential growth is the way to go.
> Personally, I'd start with a big size (perhaps 10 MB) that is bigger than
> you are likely to need in practice, but small enough that it is negligible
> on even vaguely modern computers. Then the realloc code is unlikely to be
> used (but it can still be there for completeness).

There are other uses that have nothing to do with files.  I have a small
dynamic array library (just a couple of function) that I use for all
sorts of things.  I can read a file or parse tokens or input a line just
by adding characters.  Because of its rather general use, I don't start
with a large buffer (though the initial size can be set).

-- 
Ben.



#386262

From: David Brown <david.brown@hesbynett.no>
Date: 2024-06-20 13:22 +0200
Message-ID: <v513dn$2i1c3$2@dont-email.me>
In reply to: #386251
On 19/06/2024 23:24, Ben Bacarisse wrote:
> David Brown <david.brown@hesbynett.no> writes:
> 
>> On 19/06/2024 17:36, Ben Bacarisse wrote:
>>> Growing the buffer exponentially is simple and effective.
>>
>> Yes, that's the general way to handle buffers when you don't know what size
>> they should be.
>>
>> A better solutions for this sort of program is usually, as you say, asking
>> the OS for the file size (there is no standard library function for getting
>> the file size, but it's not hard to do for any realistic target OS).  And
>> then for big files, prefer mmap to reading the file into a buffer.
>>
>> It's only really for unsized "files" such as piped input that you have no
>> way of getting the size, and then exponential growth is the way to go.
>> Personally, I'd start with a big size (perhaps 10 MB) that is bigger than
>> you are likely to need in practice, but small enough that it is negligible
>> on even vaguely modern computers. Then the realloc code is unlikely to be
>> used (but it can still be there for completeness).
> 
> There are other uses that have nothing to do with files.  

Of course.  This comment was for the specific purposes being discussed 
here.  For other uses, there can be many other structures and algorithms 
that fit better.  Exponentially increasing the size when needed is a 
good general-purpose method.

> I have a small
> dynamic array library (just a couple of function) that I use for all
> sorts of things.  I can read a file or parse tokens or input a line just
> by adding characters.  Because of its rather general use, I don't start
> with a large buffer (though the initial size can be set).
> 



#386253

From: Anton Shepelev <anton.txt@gmail.moc>
Date: 2024-06-20 01:53 +0300
Message-ID: <20240620015347.9bdcc4df03ab63d096375450@gmail.moc>
In reply to: #386226
Malcolm McLean:

> We have to have some knowledge. And what we probaby know
> is that the input is a file stored on someone's personal
> computer. And someone has published on the statistical
> distribution of such files And they have a log-normal
> distribution with a mean and a median which he gives. So
> with that informaton, we can work out, given that a file
> is at least N characters, what is the prbablity that an
> allocation of any size will contain the whole file, and
> how many bytes, on average will be wasted.

Observe that the standard algorithm of exponential growth is
memoryless and self-similar in that it does not depend on
context, or the history of previous reallocations.  These
properties belong to (or even identify?) the exponential
distribution.  We can therefore assume that the exponential-
growth strategy is ideal for exponentially distributed
buffer sizes, and under that assumption determine the
relation between the CDF values (p) corresponding to
consecutive re-allocations:

   p  = 1-e^(-L*x)            ,
   p0 = 1-e^(-L*x0)           ,
   p1 = 1-e^(-L*x1)           ,
   x1 = k*x0 (by our strategy), =>
   p1 = 1-(1-p0)^k            .

which does not depend on the distribution and lets us
generalise this approach for any distribution:

              x1 = Q( 1 - ( 1 - CDF(x0) )^k )

where:

   x0    : the required size
   x1    : the new recommended capacity
   Q(p)  : the p-Quantile of the given distribution
   CDF(x): the CDF of the given distribution
   k>1   : balance between speed and space efficiency
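
A sketch of the general rule in C, with the distribution supplied as a pair of function pointers. Everything named here is hypothetical: recommended_capacity, the dist_fn type, and the uniform [0, 20] example distribution. As a simplification, k is restricted to an integer so the power can be computed without <math.h>.

```c
#include <assert.h>

typedef double (*dist_fn)(double);

/* x1 = Q(1 - (1 - CDF(x0))^k); k is an integer >= 2 here only so
   that the power is a simple loop (simplifying assumption). */
static double recommended_capacity(double x0, unsigned k,
                                   dist_fn cdf, dist_fn quantile)
{
    assert(k >= 2 && cdf != 0 && quantile != 0);

    double tail = 1.0 - cdf(x0);  /* probability the final size exceeds x0 */
    double t = 1.0;
    for (unsigned i = 0; i < k; i++)
        t *= tail;
    return quantile(1.0 - t);
}

/* Hypothetical example distribution: sizes uniform on [0, 20]. */
static double uniform20_cdf(double x)
{
    return x <= 0.0 ? 0.0 : x >= 20.0 ? 1.0 : x / 20.0;
}

static double uniform20_quantile(double p)
{
    return 20.0 * p;
}
```

For the uniform example, a required size of 2 with k = 2 gives a recommended capacity of about 3.8, and the recommendation approaches the upper bound of 20 as the required size grows.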

-- 
()  ascii ribbon campaign -- against html e-mail
/\  www.asciiribbon.org   -- against proprietary attachments



#386891

From: Anton Shepelev <anton.txt@g{oogle}mail.com>
Date: 2024-07-08 19:34 +0300
Message-ID: <20240708193456.a1ebe2d0872239c525120d84@g{oogle}mail.com>
In reply to: #386253
I had plumb forgot about this solution of mine:

>    p0 = 1-e^(-L*x0)           ,
>    p1 = 1-e^(-L*x1)           ,
>    x1 = k*x0 (by our strategy), =>
>    p1 = 1-(1-p0)^k            .
>
> which does not depend on the distribution and lets us
> generalise this approach for any distribution:
>
>               x1 = Q( 1 - ( 1 - CDF(x0) )^k )
> where:
>
>    x0    : the required size
>    x1    : the new recommended capacity
>    Q(p)  : the p-Quantile of the given distribution
>    CDF(x): the CDF of the given distribution
>    k>1   : balance between speed and space efficiency

Let us test it with the exponential distribution, for which:

  Q  (p) = -Ln( 1 - p )/L
  CDF(x) =  1 - e^(-Lx)

Substituting these into the equation for x1:

  x1 = Q ( 1 - ( 1 - ( 1 - e^(-Lx0)  ) )^k ) =
       Q ( 1 - ( e^(-Lx0)              )^k ) =
       Q ( 1 -   e^(-kLx0)                 ) =
      -Ln( e^(-kLx0) )/L                     = k*x0 (QED)

That is, my solution is a/the generalisation of the
exponential growth strategy.

-- 
()  ascii ribbon campaign -- against html e-mail
/\  www.asciiribbon.org   -- against proprietary attachments



#386237

From: Anton Shepelev <anton.txt@g{oogle}mail.com>
Date: 2024-06-19 15:20 +0300
Message-ID: <20240619152000.2738defeceb1df7203151c64@g{oogle}mail.com>
In reply to: #386219
Malcolm McLean writes that, given the log-normal distribution
of file sizes with known parameters,

> we can work out, given that a file is at least N
> characters, what is the prbablity that an allocation of
> any size will contain the whole file, and how many bytes,
> on average will be wasted.

This is why I thought statisticians might help him: Malcolm
wants to find the a posteriori distribution of the size of a
file, after it has been found to exceed N bytes.  Am I right
that if we take the remaining (x>N) part of the density
function and re-normalise it, we shall obtain the desired
distribution?

My proposition was as follows:

  1.  Find quantile q0 corresponding to the buffer size
      currently requested.

  2.  Calculate new quantile q1 = 1-(1-q0)/k, where k>1 is
      an adjustable parameter, and use its corresponding
      value as the new allocation size.

For example, assuming for simplicity a uniform [0,20]
distribution of file sizes and k=2, a sequence of allocations
may look like this:

                requested allocated
                 2        20-(20- 2)/2 = 11
                12        20-(20-12)/2 = 16
                18        20-(20-18)/2 = 19
-- 
()  ascii ribbon campaign -- against html e-mail
/\  www.asciiribbon.org   -- against proprietary attachments



#386694

From: Rich Ulrich <rich.ulrich@comcast.net>
Date: 2024-07-02 00:51 -0400
Message-ID: <s1178jp7fjh4rgtsafa78dp6j05ejidau2@4ax.com>
In reply to: #386101
On Mon, 17 Jun 2024 18:02:49 +0300, Anton Shepelev
<anton.txt@g{oogle}mail.com> wrote:

>[cross-posted to: ci.stat.math]
>
Anton, 

The post being responded to was originally to comp.lang.c
which I don't subscribe to. 

I have a question that I suppose reflects on my news source, 
GigaNews, or else on my reader, Forte Agent. 

Was this thread something posted 15 or 20 years ago? 

I tried to call up the original post by clicking on the Message
ID when looking at headers; nothing comes up when Agent goes
online to look.  The header shows multiple earlier messages; 
none of them come up for me. 

My clicking on Message ID works elsewhere. The logical and 
simple explanation is that this is a thread old enough that
GigaNews does not have it. 

I suppose that someone else might be able to tell me, if their 
supplier goes back further or if GigaNews is somehow failing
to show me something that is recent. 

-- 
Rich Ulrich 


