Groups > comp.lang.c > #386054 > unrolled thread
| Started by | Janis Papanagnou <janis_papanagnou+ng@hotmail.com> |
|---|---|
| First post | 2024-06-17 08:08 +0200 |
| Last post | 2024-06-25 16:16 +0200 |
| Articles | 20 on this page of 100 — 27 participants |
realloc() - frequency, conditions, or experiences about relocation? Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-06-17 08:08 +0200
Re: realloc() - frequency, conditions, or experiences about relocation? "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-06-16 23:34 -0700
Re: realloc() - frequency, conditions, or experiences about relocation? Ben Bacarisse <ben@bsb.me.uk> - 2024-06-17 10:18 +0100
Re: realloc() - frequency, conditions, or experiences about relocation? Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2024-06-17 10:31 +0100
Re: realloc() - frequency, conditions, or experiences about relocation? Ben Bacarisse <ben@bsb.me.uk> - 2024-06-17 10:55 +0100
Re: realloc() - frequency, conditions, or experiences about relocation? Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2024-06-17 13:45 +0100
Re: realloc() - frequency, conditions, or experiences about relocation? Ben Bacarisse <ben@bsb.me.uk> - 2024-06-17 15:33 +0100
Re: realloc() - frequency, conditions, or experiences about relocation? Anton Shepelev <anton.txt@g{oogle}mail.com> - 2024-06-17 18:10 +0300
Re: realloc() - frequency, conditions, or experiences about relocation? Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-06-18 00:09 -0700
Re: realloc() - frequency, conditions, or experiences about relocation? Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2024-06-18 11:19 +0100
Re: realloc() - frequency, conditions, or experiences about relocation? Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2024-06-18 11:46 +0100
Re: realloc() - frequency, conditions, or experiences about relocation? Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-29 00:14 +0000
Re: realloc() - frequency, conditions, or experiences about relocation? Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2024-07-02 10:18 +0100
Re: realloc() - frequency, conditions, or experiences about relocation? Ben Bacarisse <ben@bsb.me.uk> - 2024-07-02 16:39 +0100
Re: realloc() - frequency, conditions, or experiences about relocation? Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-07-03 23:48 +0000
Re: realloc() - frequency, conditions, or experiences about relocation? Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-06-24 09:32 -0700
Re: realloc() - frequency, conditions, or experiences about relocation? David Brown <david.brown@hesbynett.no> - 2024-06-24 19:19 +0200
Indefinite pronouns [was:Re: realloc() - frequency, conditions, or experiences about relocation?] Anton Shepelev <anton.txt@gmail.moc> - 2024-06-20 02:51 +0300
Re: Indefinite pronouns vallor <vallor@cultnix.org> - 2024-06-20 00:34 +0000
Re: Indefinite pronouns David Brown <david.brown@hesbynett.no> - 2024-06-21 12:59 +0200
Re: Indefinite pronouns Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-21 10:31 -0700
Re: Indefinite pronouns [was:Re: realloc() - frequency, conditions, or experiences about relocation?] gazelle@shell.xmission.com (Kenny McCormack) - 2024-06-20 12:10 +0000
Re: Indefinite pronouns [was: Re: <something technical>] Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-06-20 15:04 +0200
Re: Indefinite pronouns [was:Re: realloc() - frequency, conditions, or experiences about relocation?] Cri-Cri <cri@cri.cri.invalid> - 2024-06-21 00:55 +0000
Re: Indefinite pronouns [was:Re: realloc() - frequency, conditions, or experiences about relocation?] Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-06-20 23:23 -0700
Re: realloc() - frequency, conditions, or experiences about relocation? Richard Harnden <richard.nospam@gmail.invalid> - 2024-06-17 16:15 +0100
Re: realloc() - frequency, conditions, or experiences about relocation? "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-06-17 13:05 -0700
Re: realloc() - frequency, conditions, or experiences about relocation? Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2024-06-17 16:36 +0100
Re: realloc() - frequency, conditions, or experiences about relocation? Anton Shepelev <anton.txt@g{oogle}mail.com> - 2024-06-17 18:02 +0300
Re: realloc() - frequency, conditions, or experiences about relocation? "David Jones" <dajhawk18xx@@nowhere.com> - 2024-06-18 17:59 +0000
Re: realloc() - frequency, conditions, or experiences about relocation? davidd02@tpg.com.au (David Duffy) - 2024-06-19 06:48 +0000
Re: realloc() - frequency, conditions, or experiences about relocation? Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2024-06-19 10:12 +0100
Re: realloc() - frequency, conditions, or experiences about relocation? Ben Bacarisse <ben@bsb.me.uk> - 2024-06-19 16:36 +0100
Re: realloc() - frequency, conditions, or experiences about relocation? David Brown <david.brown@hesbynett.no> - 2024-06-19 19:41 +0200
Re: realloc() - frequency, conditions, or experiences about relocation? Ben Bacarisse <ben@bsb.me.uk> - 2024-06-19 22:24 +0100
Re: realloc() - frequency, conditions, or experiences about relocation? David Brown <david.brown@hesbynett.no> - 2024-06-20 13:22 +0200
Re: realloc() - frequency, conditions, or experiences about relocation? Anton Shepelev <anton.txt@gmail.moc> - 2024-06-20 01:53 +0300
Re: realloc() - frequency, conditions, or experiences about relocation? Anton Shepelev <anton.txt@g{oogle}mail.com> - 2024-07-08 19:34 +0300
Re: realloc() - frequency, conditions, or experiences about relocation? Anton Shepelev <anton.txt@g{oogle}mail.com> - 2024-06-19 15:20 +0300
Re: realloc() - frequency, conditions, or experiences about relocation? Rich Ulrich <rich.ulrich@comcast.net> - 2024-07-02 00:51 -0400
Re: realloc() - frequency, conditions, or experiences about relocation? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-07-01 22:10 -0700
Re: realloc() - frequency, conditions, or experiences about relocation? Rich Ulrich <rich.ulrich@comcast.net> - 2024-07-02 11:45 -0400
Re: realloc() - frequency, conditions, or experiences about relocation? Anton Shepelev <anton.txt@g{oogle}mail.com> - 2024-07-08 20:01 +0300
Re: realloc() - frequency, conditions, or experiences about relocation? Rich Ulrich <rich.ulrich@comcast.net> - 2024-07-21 19:40 -0400
Re: realloc() - frequency, conditions, or experiences about relocation? Anton Shepelev <anton.txt@g{oogle}mail.com> - 2024-07-23 16:47 +0300
Re: realloc() - frequency, conditions, or experiences about relocation? Paul <nospam@needed.invalid> - 2024-07-02 03:02 -0400
Re: realloc() - frequency, conditions, or experiences about relocation? Rich Ulrich <rich.ulrich@comcast.net> - 2024-07-02 11:52 -0400
Re: realloc() - frequency, conditions, or experiences about relocation? Rich Ulrich <rich.ulrich@comcast.net> - 2024-07-02 11:58 -0400
Re: realloc() - frequency, conditions, or experiences about relocation? Paul <nospam@needed.invalid> - 2024-07-02 15:09 -0400
Re: realloc() - frequency, conditions, or experiences about relocation? James Kuyper <jameskuyper@alumni.caltech.edu> - 2024-07-02 16:54 -0400
Re: realloc() - frequency, conditions, or experiences about relocation? James Kuyper <jameskuyper@alumni.caltech.edu> - 2024-07-02 16:58 -0400
Re: realloc() - frequency, conditions, or experiences about relocation? scott@slp53.sl.home (Scott Lurndal) - 2024-06-17 16:58 +0000
Re: realloc() - frequency, conditions, or experiences about relocation? "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-06-17 13:12 -0700
Re: realloc() - frequency, conditions, or experiences about relocation? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-17 16:39 -0700
Re: realloc() - frequency, conditions, or experiences about relocation? Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2024-06-18 06:12 +0100
Re: realloc() - frequency, conditions, or experiences about relocation? Bonita Montero <Bonita.Montero@gmail.com> - 2024-06-18 09:56 +0200
Re: realloc() - frequency, conditions, or experiences about relocation? David Brown <david.brown@hesbynett.no> - 2024-06-17 20:11 +0200
Re: realloc() - frequency, conditions, or experiences about relocation? Bonita Montero <Bonita.Montero@gmail.com> - 2024-06-17 11:22 +0200
Re: realloc() - frequency, conditions, or experiences about relocation? Vir Campestris <vir.campestris@invalid.invalid> - 2024-06-20 21:08 +0100
Re: realloc() - frequency, conditions, or experiences about relocation? Bonita Montero <Bonita.Montero@gmail.com> - 2024-06-21 21:12 +0200
Re: realloc() - frequency, conditions, or experiences about relocation? Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-24 08:40 +0000
Re: realloc() - frequency, conditions, or experiences about relocation? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-24 02:55 -0700
Re: realloc() - frequency, conditions, or experiences about relocation? David Brown <david.brown@hesbynett.no> - 2024-06-24 13:40 +0200
Re: realloc() - frequency, conditions, or experiences about relocation? Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2024-06-24 18:50 +0100
Re: realloc() - frequency, conditions, or experiences about relocation? scott@slp53.sl.home (Scott Lurndal) - 2024-06-24 18:20 +0000
Re: realloc() - frequency, conditions, or experiences about relocation? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-24 14:28 -0700
Re: realloc() - frequency, conditions, or experiences about relocation? Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2024-06-25 00:22 +0100
Re: realloc() - frequency, conditions, or experiences about relocation? "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-06-24 18:31 -0700
Re: realloc() - frequency, conditions, or experiences about relocation? Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-25 07:06 +0000
Re: realloc() - frequency, conditions, or experiences about relocation? Bonita Montero <Bonita.Montero@gmail.com> - 2024-06-25 10:38 +0200
Re: realloc() - frequency, conditions, or experiences about relocation? Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-26 00:51 +0000
Re: realloc() - frequency, conditions, or experiences about relocation? "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-06-24 12:33 -0700
Re: realloc() - frequency, conditions, or experiences about relocation? Bonita Montero <Bonita.Montero@gmail.com> - 2024-06-24 21:48 +0200
Re: realloc() - frequency, conditions, or experiences about relocation? Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-24 22:59 +0000
Re: realloc() - frequency, conditions, or experiences about relocation? Bonita Montero <Bonita.Montero@gmail.com> - 2024-06-24 16:19 +0200
Re: realloc() - frequency, conditions, or experiences about relocation? Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-25 07:02 +0000
Re: realloc() - frequency, conditions, or experiences about relocation? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-25 03:05 -0700
Re: realloc() - frequency, conditions, or experiences about relocation? Richard Damon <richard@damon-family.org> - 2024-06-25 07:21 -0400
Re: realloc() - frequency, conditions, or experiences about relocation? Phil Carmody <pc+usenet@asdf.org> - 2024-06-28 11:01 +0300
Re: realloc() - frequency, conditions, or experiences about relocation? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-28 04:04 -0700
Re: realloc() - frequency, conditions, or experiences about relocation? James Kuyper <jameskuyper@alumni.caltech.edu> - 2024-06-28 06:37 -0400
Re: realloc() - frequency, conditions, or experiences about relocation? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-28 04:05 -0700
Re: realloc() - frequency, conditions, or experiences about relocation? James Kuyper <jameskuyper@alumni.caltech.edu> - 2024-06-28 06:36 -0400
Re: realloc() - frequency, conditions, or experiences about relocation? Bonita Montero <Bonita.Montero@gmail.com> - 2024-06-24 16:16 +0200
Down the hall, past the water cooler, third door on the left... (Was: realloc() - frequency, conditions, or experiences about) relocation? gazelle@shell.xmission.com (Kenny McCormack) - 2024-06-24 15:44 +0000
Re: Down the hall, past the water cooler, third door on the left... (Was: realloc() - frequency, conditions, or experiences about) relocation? Bonita Montero <Bonita.Montero@gmail.com> - 2024-06-24 20:16 +0200
Re: realloc() - frequency, conditions, or experiences about relocation? David Brown <david.brown@hesbynett.no> - 2024-06-17 14:15 +0200
Re: realloc() - frequency, conditions, or experiences about relocation? Bonita Montero <Bonita.Montero@gmail.com> - 2024-06-17 14:18 +0200
Re: realloc() - frequency, conditions, or experiences about relocation? Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-06-17 15:21 +0200
Re: realloc() - frequency, conditions, or experiences about relocation? scott@slp53.sl.home (Scott Lurndal) - 2024-06-17 16:50 +0000
Re: realloc() - frequency, conditions, or experiences about relocation? Michael S <already5chosen@yahoo.com> - 2024-06-17 20:20 +0300
Re: realloc() - frequency, conditions, or experiences about relocation? scott@slp53.sl.home (Scott Lurndal) - 2024-06-17 19:02 +0000
Re: realloc() - frequency, conditions, or experiences about relocation? Rosario19 <Ros@invalid.invalid> - 2024-06-18 11:50 +0200
Re: realloc() - frequency, conditions, or experiences about relocation? Bonita Montero <Bonita.Montero@gmail.com> - 2024-06-25 10:48 +0200
Re: realloc() - frequency, conditions, or experiences about relocation? Vir Campestris <vir.campestris@invalid.invalid> - 2024-06-25 11:55 +0100
Re: realloc() - frequency, conditions, or experiences about relocation? Bonita Montero <Bonita.Montero@gmail.com> - 2024-06-25 13:28 +0200
Re: realloc() - frequency, conditions, or experiences about relocation? Vir Campestris <vir.campestris@invalid.invalid> - 2024-06-26 12:15 +0100
Re: realloc() - frequency, conditions, or experiences about relocation? Bonita Montero <Bonita.Montero@gmail.com> - 2024-06-26 15:01 +0200
Re: realloc() - frequency, conditions, or experiences about relocation? DFS <nospam@dfs.com> - 2024-06-25 09:56 -0400
Re: realloc() - frequency, conditions, or experiences about relocation? Bonita Montero <Bonita.Montero@gmail.com> - 2024-06-25 16:16 +0200
| From | Keith Thompson <Keith.S.Thompson+u@gmail.com> |
|---|---|
| Date | 2024-06-21 10:31 -0700 |
| Subject | Re: Indefinite pronouns |
| Message-ID | <878qyyw6c5.fsf@nosuchdomain.example.com> |
| In reply to | #386310 |
David Brown <david.brown@hesbynett.no> writes:
> On 20/06/2024 02:34, vallor wrote:
>> On Thu, 20 Jun 2024 02:51:56 +0300, Anton Shepelev <anton.txt@gmail.moc>
>> wrote in <20240620025156.2ae9300726603b4cb3631547@gmail.moc>:
>>
>>> Cross-posting to alt.english.usage .
>> I've set the followup-to: same
>
> I've put it back to comp.lang.c. It's off-topic for that group,
Please don't do that.
[...]
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
| From | gazelle@shell.xmission.com (Kenny McCormack) |
|---|---|
| Date | 2024-06-20 12:10 +0000 |
| Subject | Re: Indefinite pronouns [was:Re: realloc() - frequency, conditions, or experiences about relocation?] |
| Message-ID | <v5168d$2fiq3$1@news.xmission.com> |
| In reply to | #386254 |
>> > I think it is a modern English idiom, which I dislike as
>> > well. StackOverflow is full of questions starting like:
>> > "How do you do this?" and "How do I do that?" They are
>> > informal ways of the more literary "How does one do
>> > this?" or "What is the way to do that?"
>>
>> I have a different take here. First the "your" of "your
>> strategy" reads as a definite pronoun, meaning it refers
>> specifically to Ben and not to some unknown other party.
>
>And I am /sure/ it is intended in the general (indefinite)
>sense, as is the `you' in Malcolm's two following sentences:

This sub-thread is certainly interesting, but it ultimately smacks of
people looking for ways to feel insulted. Victimhood complex, and all that.

But, it makes me think that the problem is the basic paradigm of newsgroup
(i.e., online forum) communication being thought of as personalized. I.e.,
as in direct person-to-person communication - as if it was being spoken in
a real room with real people (face-to-face). The fact is, it is not. It
would be better if we didn't think of it that way. Rather, it should be
thought of as communication between the speaker and an anonymous void.
I.e., I'm not talking to you - I am talking to the anonymous void.
Everybody is.

Sort of like in the (US) House of Representatives - members are not ever
supposed to be talking to each other. Rather, they are always speaking to
the void.

Like I am doing now.

This is also why it is good (And, yes, I know this goes against the CW) to
drop attributions, as I have done here. Keep it anonymous.

--
First of all, I do not appreciate your playing stupid here at all.
- Thomas 'PointedEars' Lahn -
| From | Janis Papanagnou <janis_papanagnou+ng@hotmail.com> |
|---|---|
| Date | 2024-06-20 15:04 +0200 |
| Subject | Re: Indefinite pronouns [was: Re: <something technical>] |
| Message-ID | <v519c2$2j4vg$1@dont-email.me> |
| In reply to | #386264 |
On 20.06.2024 14:10, Kenny McCormack wrote:
>
> This sub-thread is certainly interesting, but it ultimately smacks of
> people looking for ways to feel insulted. Victimhood complex, and all that.
>
> But, it makes me think that the problem is the basic paradigm of newsgroup
> (i.e., online forum) communication being thought of as personalized. I.e.,
> as in direct person-to-person communication - as if it was being spoken in
> a real room with real people (face-to-face). The fact is, it is not. It
> would be better if we didn't think of it that way. Rather, it should be
> thought of as communication between the speaker and an anonymous void.
> I.e., I'm not talking to you - I am talking to the anonymous void.
> Everybody is.
>
> Sort of like in the (US) House of Representatives - members are not ever
> supposed to be talking to each other. Rather, they are always speaking to
> the void.
>
> Like I am doing now.
>
> This is also why it is good (And, yes, I know this goes against the CW) to
> drop attributions, as I have done here. Keep it anonymous.

Part 1

This is hard to achieve given that the technical NG infrastructure
and functions "logically" connect the articles; it's only a little
burden to identify (if not already obvious) the addressee.

I think it would be better to try to stay on the issue as opposed
to reply (as so often done) ad hominem (where arguments don't seem
to exist or don't help anymore). This is of course yet more difficult
to achieve and will in practice [also] not work in Usenet (I'm sure).

Language can be used or interpreted in personal or impersonal forms.
Some communication forms - and more so their semantical contents! -
are (beyond the "you" vs. "one" dichotomy) inherently [set up to be]
personal.

-- Anonymous :-)

Part 2

That all said. I think it's important to know who said/posted what.
It allows to associate personal context/background information when
replying. You can also be more assured about the quality of contents
(to the good or bad) or even ignore certain posts. It saves time and
protects one's health and mental sanity.[*]

There's of course also posts (or threads) that just exchange opinions,
and "we" know everyone [typically] has an opinion (and often even in
cases where they are put up against facts).

Some folks are known to post a lot, respond to every thread that
appears, contributing facts (sometimes) but also opinions (or personal
offenses); this may be a nuisance (or just ignored, unless
"anonymously" posted).

So far my opinion on this non-technical meta-topic subthread.

Janis (Darn, I disclosed my identity!)

[*] Wasn't that the inherent problem of all those "social media"
platforms? - Where anonymous posts - and some say that anonymity
does negatively contribute to the language and contents of such
disturbing posts - lead to barbarian communication conditions.
(I know that only from hearsay but it seems common perception.)
| From | Cri-Cri <cri@cri.cri.invalid> |
|---|---|
| Date | 2024-06-21 00:55 +0000 |
| Subject | Re: Indefinite pronouns [was:Re: realloc() - frequency, conditions, or experiences about relocation?] |
| Message-ID | <Nb4dO.139759$2RJ6.5368@fx05.ams4> |
| In reply to | #386254 |
On Thu, 20 Jun 2024 02:51:56 +0300, Anton Shepelev wrote:

>> But "one" can also be used as a first person definite pronoun
>> (referring to the speaker), which an online reference tells me is
>> chiefly British English.

We have the same construct in Swedish: 'en', as in "en kanske skulle
kunna ta en banan", meaning "one could perhaps have a banana."

Referring to oneself from an outside perspective, in, for instance, a
situation where there used to be several items on offer to guests, but
now there are only bananas and some dry sponge cake left. IOW, of two
lesser desirable items, one could accept a banana.

--
Cri-Cri
| From | Tim Rentsch <tr.17687@z991.linuxsc.com> |
|---|---|
| Date | 2024-06-20 23:23 -0700 |
| Subject | Re: Indefinite pronouns [was:Re: realloc() - frequency, conditions, or experiences about relocation?] |
| Message-ID | <86plsahl0e.fsf@linuxsc.com> |
| In reply to | #386254 |
Anton Shepelev <anton.txt@gmail.moc> writes:

> Cross-posting to alt.english.usage .
>
> Tim Rentsch to Anton Shepelev:
>
>>> I think it is a modern English idiom, which I dislike as
>>> well. StackOverflow is full of questions starting like:
>>> "How do you do this?" and "How do I do that?" They are
>>> informal ways of the more literary "How does one do
>>> this?" or "What is the way to do that?"
>>
>> I have a different take here. First the "your" of "your
>> strategy" reads as a definite pronoun, meaning it refers
>> specifically to Ben and not to some unknown other party.
>
> And I am /sure/ it is intended in the general (indefinite)
> sense,

I don't know why you think that. I don't know any native English
speakers who would read it as other than referring to the person
being responded to (who was Ben in this case). Responding to
Malcolm's statement, Ben said "It's odd to call it mine", so it
seems Ben also read it as a definite pronoun, referring to himself.

As for the rest of your comments, you're reaching bad conclusions
because you're not looking in the right places. To investigate
the meaning and usage of words and phrases, the best first place
to look is always a dictionary. In the process of writing my
earlier response, I consulted roughly half a dozen different
online dictionaries, reading definitions for 'one', 'you', 'they',
'indefinite pronoun', 'definite pronoun', and probably some other
terms. I also looked at non-dictionary sources if they looked
relevant, but my starting point was dictionaries. Oh, I also
looked up both 'idiom' and 'idiomatic' (which even though they
are related don't mean the same thing).

Incidentally, on the three pages you gave links for, all of them
had at least one example that used "you" as an indefinite pronoun.

One point I want to clear up. A couple of times you characterize
the indefinite pronoun usage of "you" as "modern". It is not in
any sense modern. I am a native speaker of English, speaking and
reading the language for well over 60 years. Furthermore I was
raised by a grammarian. In all of that time and experience there
was never any hint that "you" as an indefinite pronoun was new or
unusual or considered substandard or slang or unacceptable. It is
simply standard usage, and has been for longer than my lifetime.

>> But "one" can also be used as a first person definite
>> pronoun (referring to the speaker), which an online
>> reference tells me is chiefly British English.
>
> I had no idea it could, nor does Wikipedia. Can you share
> an example of a definite first-person `one'?

(A) Pick a good search engine (I use duckduckgo.com).
(B) Search for the two words one definition.
(C) Read the entries for every online dictionary that is found,
    or at least the top five or six.

You should find a couple of examples. It was by going through
this process myself that I discovered "one" is sometimes used
as a first person definite pronoun.
| From | Richard Harnden <richard.nospam@gmail.invalid> |
|---|---|
| Date | 2024-06-17 16:15 +0100 |
| Message-ID | <v4pjuf$n98h$1@dont-email.me> |
| In reply to | #386099 |
On 17/06/2024 15:33, Ben Bacarisse wrote:
> Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
>
>> On 17/06/2024 10:55, Ben Bacarisse wrote:
>>> Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
>>>
>>>> On 17/06/2024 10:18, Ben Bacarisse wrote:
>>>>> Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
>>>>>
>>>>>> In a recent thread realloc() was a substantial part of the discussion.
>>>>>> "Occasionally" the increased data storage will be relocated along
>>>>>> with the previously stored data. On huge data sets that might be a
>>>>>> performance factor. Is there any experience or are there any concrete
>>>>>> factors about the conditions when this relocation happens? - I could
>>>>>> imagine that it's no issue as long as you're in some kB buffer range,
>>>>>> but if, say, we're using realloc() to substantially increase buffers
>>>>>> often it might be an issue to consider. It would be good to get some
>>>>>> feeling about that internal.
>>>>> There is obviously a cost, but there is (usually) no alternative if
>>>>> contiguous storage is required. In practice, the cost is usually
>>>>> moderate and can be very effectively managed by using an exponential
>>>>> allocation scheme: at every reallocation multiply the storage space by
>>>>> some factor greater than 1 (I often use 3/2, but doubling is often used
>>>>> as well). This results in O(log(N)) rather than O(N) allocations as in
>>>>> your code that added a constant to the size. Of course, some storage is
>>>>> wasted (that /might/ be retrieved by a final realloc down to the final
>>>>> size) but that's rarely significant.
>>>>>
>>>> So can we work it out?
>>> What is "it"?
>>>
>>>> Let's assume for the moment that the allocations have a semi-normal
>>>> distribution,
>>> What allocations? The allocations I talked about don't have that
>>> distribution.
>>>
>>>> with negative values disallowed. Now ignoring the first few
>>>> values, if we have allocated, say, 1K, we ought to be able to predict the
>>>> value by integrating the distribution from 1k to infinity and taking the
>>>> mean.
>>> I have no idea what you are talking about. What "value" are you looking
>>> to calculate?
>>>
>> We have a continuously growing buffer, and we want the best strategy for
>> reallocations as the stream of characters comes at us. So, given we know how
>> many characters have arrived, can we predict how many will arrive, and
>> therefore ask for the best amount when we reallocate, so that we neither
>> make too many reallocations (reallocate on every byte received) nor ask for
>> too much (demand SIZE_MAX memory when the first byte is received)?
>
> Obviously not, or we'd use the prediction. Your question was probably
> rhetorical, but it didn't read that way.
>
>> Your strategy for avoiding these extremes is exponential growth.
>
> It's odd to call it mine. It's very widely known and used. "The one I
> mentioned" might be a less confusing description.
>
>> You
>> allocate a small amount for the first few bytes. Then you use exponential
>> growth, with a factor of either 2 or 1.5. My question is whether or not we
>> can be cuter. And of course we need to know the statistical distribution of
>> the input files. And I'm assuming a semi-normal distribution, ignoring the
>> files with small values, which we will allocate enough for anyway.
>>
>> And so we integrate the distribution between the point we are at and
>> infinity. Then we take the mean. And that gives us a best estimate of how
>> many bytes are to come, and therefore how much to grow the buffer by.
>
> I would be surprised if that were worth the effort at run time. A
> static analysis of "typical" input sizes might be interesting as that
> could be used to get an estimate of good factors to use, but anything
> more complicated than maybe a few factors (e.g. doubling up to 1MB then
> 3/2 thereafter) is likely to be too messy to be useful.
>
> Also, the cost of reallocations is not constant. Larger ones are
> usually more costly than small ones, so if one were going to a lot of
> effort to make run-time guesses, that cost should be factored in as
> well.
>
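The O(log(N)) versus O(N) point in the quoted text can be checked by counting how often the capacity has to change under each policy. What follows is only a sketch: the function names, the 4096-byte fixed step and the exact 1 MiB switch-over are assumptions made for illustration, and every counted step is merely a potential relocation, since realloc() may or may not move the block.

```c
#include <assert.h>
#include <stddef.h>

#define STEP_BYTES ((size_t)4096)      /* assumed fixed increment */
#define ONE_MIB    ((size_t)1 << 20)

/* Fixed increment: the capacity grows by the same amount every time. */
size_t next_constant(size_t cap) { return cap + STEP_BYTES; }

/* Geometric growth by 3/2; the +1 keeps very small capacities moving. */
size_t next_three_halves(size_t cap) { return cap + cap / 2 + 1; }

/* Doubling below 1 MiB, 3/2 from there on; expects cap >= 1. */
size_t next_hybrid(size_t cap)
{
    return cap < ONE_MIB ? cap * 2 : cap + cap / 2;
}

/* Count the capacity changes needed to get from start to at least target.
   Overflow near SIZE_MAX is deliberately ignored in this sketch. */
unsigned long count_steps(size_t start, size_t target, size_t (*next)(size_t))
{
    assert(next != NULL && start >= 1);
    unsigned long steps = 0;
    for (size_t cap = start; cap < target; cap = next(cap))
        steps++;
    return steps;
}
```

Growing from 4 KiB to 64 MiB this way takes 16383 steps with the fixed 4 KiB increment and 19 with the hybrid policy; plain 3/2 growth lands at roughly two dozen.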
I usually keep track:
struct
{
    size_t used;
    size_t allocated;
    void *data;
};
Then, if used + new_size is more than what's already been allocated then
a realloc will be required.
Start with an initial allocated size that's 'reasonable' - the happy path
will never need any reallocs.
Otherwise multiply by some factor. Typically I just double it.
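A minimal sketch of that bookkeeping, for illustration only. The three fields are the ones listed above; the tag name `buffer`, the function names, the 256-byte starting size and the return-code convention are all assumptions of this sketch, not part of the original post.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define BUFFER_INITIAL ((size_t)256)   /* assumed "reasonable" starting size */

struct buffer {
    size_t used;        /* bytes currently stored */
    size_t allocated;   /* bytes available at data */
    void  *data;
};

/* Make room for new_size more bytes. Returns 0 on success and -1 on
   failure; on failure the buffer is left exactly as it was. */
int buffer_reserve(struct buffer *b, size_t new_size)
{
    assert(b != NULL);
    if (new_size > SIZE_MAX - b->used)
        return -1;                      /* used + new_size would overflow */
    size_t needed = b->used + new_size;
    if (needed <= b->allocated)
        return 0;                       /* happy path: no realloc needed */

    size_t cap = b->allocated ? b->allocated : BUFFER_INITIAL;
    while (cap < needed) {
        if (cap > SIZE_MAX / 2) {       /* doubling would overflow */
            cap = needed;
            break;
        }
        cap *= 2;
    }

    void *p = realloc(b->data, cap);    /* old pointer stays valid on failure */
    if (p == NULL)
        return -1;
    b->data = p;
    b->allocated = cap;
    return 0;
}

/* Append n bytes from src, growing the buffer when required. */
int buffer_append(struct buffer *b, const void *src, size_t n)
{
    if (n == 0)
        return 0;
    if (buffer_reserve(b, n) != 0)
        return -1;
    memcpy((char *)b->data + b->used, src, n);
    b->used += n;
    return 0;
}

/* Release the storage and reset the bookkeeping. */
void buffer_free(struct buffer *b)
{
    free(b->data);
    b->data = NULL;
    b->used = b->allocated = 0;
}
```

A zero-initialised `struct buffer` is a valid empty buffer here, because realloc() with a null pointer behaves like malloc(). Assigning the result of realloc() to a temporary first avoids losing the old block when the call fails.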
| From | "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> |
|---|---|
| Date | 2024-06-17 13:05 -0700 |
| Message-ID | <v4q4u4$t6bd$1@dont-email.me> |
| In reply to | #386103 |
On 6/17/2024 8:15 AM, Richard Harnden wrote:
> On 17/06/2024 15:33, Ben Bacarisse wrote:
>> Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
>>
>>> On 17/06/2024 10:55, Ben Bacarisse wrote:
>>>> Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
>>>>
>>>>> On 17/06/2024 10:18, Ben Bacarisse wrote:
>>>>>> Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
>>>>>>
>>>>>>> In a recent thread realloc() was a substantial part of the
>>>>>>> discussion.
>>>>>>> "Occasionally" the increased data storage will be relocated along
>>>>>>> with the previously stored data. On huge data sets that might be a
>>>>>>> performance factor. Is there any experience or are there any
>>>>>>> concrete
>>>>>>> factors about the conditions when this relocation happens? - I could
>>>>>>> imagine that it's no issue as long as you're in some kB buffer
>>>>>>> range,
>>>>>>> but if, say, we're using realloc() to substantially increase buffers
>>>>>>> often it might be an issue to consider. It would be good to get some
>>>>>>> feeling about that internal.
>>>>>> There is obviously a cost, but there is (usually) no alternative if
>>>>>> contiguous storage is required. In practice, the cost is usually
>>>>>> moderate and can be very effectively managed by using an exponential
>>>>>> allocation scheme: at every reallocation multiply the storage
>>>>>> space by
>>>>>> some factor greater than 1 (I often use 3/2, but doubling is often
>>>>>> used
>>>>>> as well). This results in O(log(N)) rather than O(N) allocations
>>>>>> as in
>>>>>> your code that added a constant to the size. Of course, some
>>>>>> storage is
>>>>>> wasted (that /might/ be retrieved by a final realloc down to the
>>>>>> final
>>>>>> size) but that's rarely significant.
>>>>>>
>>>>> So can we work it out?
>>>> What is "it"?
>>>>
>>>>> Let's assume for the moment that the allocations have a semi-normal
>>>>> distribution,
>>>> What allocations? The allocations I talked about don't have that
>>>> distribution.
>>>>
>>>>> with negative values disallowed. Now ignoring the first few
>>>>> values, if we have allocated, say, 1K, we ought to be able to
>>>>> predict the
>>>>> value by integrating the distribution from 1k to infinity and
>>>>> taking the
>>>>> mean.
>>>> I have no idea what you are talking about. What "value" are you
>>>> looking
>>>> to calculate?
>>>>
>>> We have a continuously growing buffer, and we want the best strategy for
>>> reallocations as the stream of characters comes at us. So, given we
>>> know how many characters have arrived, can we predict how many will
>>> arrive, and therefore ask for the best amount when we reallocate, so
>>> that we neither make too many reallocations (reallocate on every byte
>>> received) nor ask for too much (demand SIZE_MAX memory when the first
>>> byte is received)?
>>
>> Obviously not, or we'd use the prediction. Your question was probably
>> rhetorical, but it didn't read that way.
>>
>>> Your strategy for avoiding these extremes is exponential growth.
>>
>> It's odd to call it mine. It's very widely known and used. "The one I
>> mentioned" might be a less confusing description.
>>
>>> You allocate a small amount for the first few bytes. Then you use
>>> exponential growth, with a factor of either 2 or 1.5. My question is
>>> whether or not we can be cuter. And of course we need to know the
>>> statistical distribution of the input files. And I'm assuming a
>>> semi-normal distribution, ignoring the files with small values, which
>>> we will allocate enough for anyway.
>>>
>>> And so we integrate the distribution between the point we are at and
>>> infinity. Then we take the mean. And that gives us a best estimate of
>>> how many bytes are to come, and therefore how much to grow the buffer by.
>>
>> I would be surprised if that were worth the effort at run time. A
>> static analysis of "typical" input sizes might be interesting as that
>> could be used to get an estimate of good factors to use, but anything
>> more complicated than maybe a few factors (e.g. doubling up to 1MB then
>> 3/2 thereafter) is likely to be too messy to be useful.
>>
>> Also, the cost of reallocations is not constant. Larger ones are
>> usually more costly than small ones, so if one were going to a lot of
>> effort to make run-time guesses, that cost should be factored in as
>> well.
>>
>
> I usually keep track:
>
> struct
> {
>     size_t used;
>     size_t allocated;
>     void *data;
> };
>
> Then, if used + new_size is more than what's already been allocated then
> a realloc will be required.
Fwiw, I remember way back using an n-ary tree of regions to accomplish
it. The trigger for creating a new region was very similar to your
logic. It performed pretty well and did not use realloc. Indeed it was
for a special use case. I remember having region partitions that would
link to other regions. Actually, it kind of reminded me of a strange
version of ropes:
https://en.wikipedia.org/wiki/Rope_(data_structure)
Fwiw, here is my old C code for a region:
https://pastebin.com/raw/f37a23918
(raw text, no ads)
Iirc, I first mentioned it in:
https://groups.google.com/g/comp.lang.c/c/7oaJFWKVCTw/m/sSWYU9BUS_QJ
>
> Start with an initial allocated size that's 'reasonable' - the happy path
> will never need any reallocs.
>
> Otherwise multiply by some factor. Typically I just double it.
>
>
>
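For illustration, the bookkeeping quoted above (used / allocated / data, doubling when the buffer is full) could be sketched roughly as follows. The names growbuf and growbuf_append and the 64-byte starting size are assumptions made for this example; they do not come from any poster's code.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names; the quoted struct was anonymous. */
struct growbuf {
    size_t used;       /* bytes currently stored   */
    size_t allocated;  /* bytes currently reserved */
    void *data;
};

/* Append n bytes, doubling the capacity until the data fits.
 * Returns 0 on success, -1 on size overflow or allocation failure;
 * on failure the buffer is left unchanged. */
static int growbuf_append(struct growbuf *b, const void *src, size_t n)
{
    if (n > SIZE_MAX - b->used)
        return -1;                        /* used + n would overflow */

    size_t needed = b->used + n;
    if (needed > b->allocated) {
        /* 64 is an assumed 'reasonable' initial size. */
        size_t cap = b->allocated ? b->allocated : 64;
        while (cap < needed) {
            if (cap > SIZE_MAX / 2) {     /* doubling would overflow */
                cap = needed;
                break;
            }
            cap *= 2;
        }
        void *p = realloc(b->data, cap);  /* may or may not relocate */
        if (p == NULL)
            return -1;                    /* old block is still valid */
        b->data = p;
        b->allocated = cap;
    }
    if (n != 0)
        memcpy((char *)b->data + b->used, src, n);
    b->used = needed;
    return 0;
}
```

Starting from a zero-initialised struct, the first append allocates and later appends only call realloc when used + n exceeds allocated, which is the condition described in the quoted text.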
| From | Malcolm McLean <malcolm.arthur.mclean@gmail.com> |
|---|---|
| Date | 2024-06-17 16:36 +0100 |
| Message-ID | <v4pl6q$nn9o$1@dont-email.me> |
| In reply to | #386099 |
On 17/06/2024 15:33, Ben Bacarisse wrote: > Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes: > >> On 17/06/2024 10:55, Ben Bacarisse wrote: >>> Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes: >>> >>>> On 17/06/2024 10:18, Ben Bacarisse wrote: >>>>> Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes: >>>>> >>>>>> In a recent thread realloc() was a substantial part of the discussion. >>>>>> "Occasionally" the increased data storage will be relocated along >>>>>> with the previously stored data. On huge data sets that might be a >>>>>> performance factor. Is there any experience or are there any concrete >>>>>> factors about the conditions when this relocation happens? - I could >>>>>> imagine that it's no issue as long as you're in some kB buffer range, >>>>>> but if, say, we're using realloc() to substantially increase buffers >>>>>> often it might be an issue to consider. It would be good to get some >>>>>> feeling about that internal. >>>>> There is obviously a cost, but there is (usually) no alternative if >>>>> contiguous storage is required. In practice, the cost is usually >>>>> moderate and can be very effectively managed by using an exponential >>>>> allocation scheme: at every reallocation multiply the storage space by >>>>> some factor greater than 1 (I often use 3/2, but doubling is often used >>>>> as well). This results in O(log(N)) rather than O(N) allocations as in >>>>> your code that added a constant to the size. Of course, some storage is >>>>> wasted (that /might/ be retrieved by a final realloc down to the final >>>>> size) but that's rarely significant. >>>>> >>>> So can we work it out? >>> What is "it"? >>> >>>> Let's assume for the moment that the allocations have a semi-normal >>>> distribution, >>> What allocations? The allocations I talked about don't have that >>> distribution. >>> >>>> with negative values disallowed. 
Now ignoring the first few >>>> values, if we have allocated, say, 1K, we ought to be able to predict the >>>> value by integrating the distribution from 1k to infinity and taking the >>>> mean. >>> I have no idea what you are talking about. What "value" are you looking >>> to calculate? >>> >> We have a continuously growing buffer, and we want the best strategy for >> reallocations as the stream of characters comes at us. So, given we now how >> many characters have arrived, can we predict how many will arrive, and >> therefore ask for the best amount when we reallocate, so that we neither >> make too many reallocation (reallocate on every byte received) or ask for >> too much (demand SIZE_MAX memory when the first byte is received).? > > Obviously not, or we'd use the prediction. You question was probably > rhetorical, but it didn't read that way. > >> Your strategy for avoiding these extremes is exponential growth. > > It's odd to call it mine. It's very widely know and used. "The one I > mentioned" might be less confusing description. > >> You >> allocate a small amount for the first few bytes. Then you use exponential >> growth, with a factor of ether 2 or 1.5. My question is whether or not we >> can be cuter. And of course we need to know the statistical distribution of >> the input files. And I'm assuming a semi-normal distribution, ignoring the >> files with small values, which we will allocate enough for anyway. >> >> And so we integrate the distribution between the point we are at and >> infinity. Then we tkae the mean. And that gives us a best estimate of how >> many bytes are to come, and therefore how much to grow the buffer by. > > I would be surprised if that were worth the effort at run time. A > static analysis of "typical" input sizes might be interesting as that > could be used to get an estimate of good factors to use, but anything > more complicated than maybe a few factors (e.g. 
doubling up to 1MB then > 3/2 thereafter) is likely to be too messy to be useful. >

There's virtually no run-time effort, unless you ask the caller to pass
in a customised distribution, which you analyse on the fly, which would
be quite a bit of work. All the work is done beforehand.

We need a statistical distribution of the file sizes of the files we
are interested in. So, probably, text files on personal computers. Then
we'll exclude the small files, say under 1kb, which will have an odd
distribution for various reasons, and which we are not interested in as
we can easily afford 1k as an initial buffer. And we're probably looking
at a semi-normal, maybe log-normal distribution. There's no reason to
suspect it would be anything odd.

And with the normal distribution there is no closed form integral, but
tables of integrals are published. So we convert 1K to a Z-score,
integrate from that to infinity, halve the result, and that gives us an
estimate of the most likely file size - having established that the
file is over 1k, half will be below and half above this size. So that's
the next amount to realloc. Say, for the sake of argument, 4K.

Then we do the same thing, starting from 4k, and working out the most
likely file size, given that the file is over 4K. Now the distribution
tends to flatten out towards the tail, so if best guess, given at least
1K, was 4K, best guess given 4k won't be 8K. It will be 10k, maybe 12k.
Do the same again for 12k. And we'll get a series of numbers like this.

1k, 4k, 10k, 20k, 50k, 120k, 500k, 2MB, 8MB ...

and so on, rapidly increasing to SIZE_MAX. And then at runtime we just
hardcode those in, it's a lookup table with not too many entries.

Because we've chosen the mean, half the time you will reallocate. You
can easily fine tune the strategy by choosing a proportion other than
0.5, depending on whether saving memory or reducing allocations is more
important to you.

And the hard part is getting some real statistics to work on.
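As a rough sketch of the hardcoded lookup table described above: the table entries below are just the illustrative figures from the post (1k, 4k, 10k, ...), not values derived from real statistics. The name next_capacity and the fallback to doubling past the last entry are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative capacities only: the example series from the post,
 * not figures computed from a measured file-size distribution. */
static const size_t capacity_table[] = {
    (size_t)1 << 10,     /*   1k */
    (size_t)4 << 10,     /*   4k */
    (size_t)10 << 10,    /*  10k */
    (size_t)20 << 10,    /*  20k */
    (size_t)50 << 10,    /*  50k */
    (size_t)120 << 10,   /* 120k */
    (size_t)500 << 10,   /* 500k */
    (size_t)2 << 20,     /*  2MB */
    (size_t)8 << 20      /*  8MB */
};

/* Hypothetical helper: smallest table entry that holds `needed`
 * bytes. Beyond the table it falls back to doubling the last entry,
 * which is an assumption of this sketch. */
static size_t next_capacity(size_t needed)
{
    size_t n = sizeof capacity_table / sizeof capacity_table[0];
    for (size_t i = 0; i < n; i++)
        if (capacity_table[i] >= needed)
            return capacity_table[i];

    size_t cap = capacity_table[n - 1];
    while (cap < needed) {
        if (cap > SIZE_MAX / 2)
            return needed;               /* doubling would overflow */
        cap *= 2;
    }
    return cap;
}
```

At run time the only cost is a short linear scan, which matches the claim that all the statistical work is done beforehand.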
>
> Also, the cost of reallocations is not constant. Larger ones are
> usually more costly than small ones, so if one were going to a lot of
> effort to make run-time guesses, that cost should be factored in as
> well.
>
Unfortunately yes. Real optimisation problems can be almost impossible
for reasons like this. If the cost of reallocations isn't constant,
you've got to put in correcting factors, and then what was a fairly
simple procedure becomes difficult.
--
Check out my hobby project.
http://malcolmmclean.github.io/babyxrc
| From | Anton Shepelev <anton.txt@g{oogle}mail.com> |
|---|---|
| Date | 2024-06-17 18:02 +0300 |
| Message-ID | <20240617180249.96dfaafa89392827aa162434@g{oogle}mail.com> |
| In reply to | #386084 |
[cross-posted to: ci.stat.math]

Malcolm McLean:

> We have a continuously growing buffer, and we want the
> best strategy for reallocations as the stream of
> characters comes at us. So, given we know how many
> characters have arrived, can we predict how many will
> arrive,

Do you mean in the next bunch, or in total (till the end of
the buffer's lifetime)?

> and therefore ask for the best amount when we reallocate,
> so that we neither make too many reallocations (reallocate
> on every byte received) nor ask for too much (demand
> SIZE_MAX memory when the first byte is received)?
>
> Your strategy for avoiding these extremes is exponential
> growth. You allocate a small amount for the first few
> bytes. Then you use exponential growth, with a factor of
> either 2 or 1.5.

This strategy ensures a constant ratio between the amount of
reallocated data and the length of the buffer by making
reallocations less frequent as the buffer grows.

> And so we integrate the distribution between the point we
> are at and infinity. Then we take the mean. And that gives
> us a best estimate of how many bytes are to come, and
> therefore how much to grow the buffer by.

You have an a-priori distribution of the buffer size (can be
tracked on-the-fly, if unknown beforehand) and a partially
filled buffer. The task is to calculate the a-posteriori
distribution of /that/ buffer's final size, and then to
allocate the predicted value based on a good percentile.

How about using a percentile instead of the mean, e.g. if
the current size corresponds to percentile p, you allocate a
capacity corresponding to percentile 1-(1-p)/k , where k>1
denotes the balance between space and time efficiency. For
example, if the 60th percentile of the buffer is required
and k=2, you allocate a capacity sufficient to hold
100-(100-60)/2=80% of buffers.
--
() ascii ribbon campaign -- against html e-mail
/\ www.asciiribbon.org -- against proprietary attachments
| From | "David Jones" <dajhawk18xx@@nowhere.com> |
|---|---|
| Date | 2024-06-18 17:59 +0000 |
| Message-ID | <v4shu2$1ff4q$1@dont-email.me> |
| In reply to | #386101 |
Anton Shepelev wrote: > [cross-posted to: ci.stat.math] > > Malcolm McLean: > > > We have a continuously growing buffer, and we want the > > best strategy for reallocations as the stream of > > characters comes at us. So, given we now how many > > characters have arrived, can we predict how many will > > arrive, > > Do you mean in the next bunch, or in total (till the end of > the buffer's lifetime)? > > > and therefore ask for the best amount when we reallocate, > > so that we neither make too many reallocation (reallocate > > on every byte received) or ask for too much (demand > > SIZE_MAX memory when the first byte is received).? > > > > Your strategy for avoiding these extremes is exponential > > growth. You allocate a small amount for the first few > > bytes. Then you use exponential growth, with a factor of > > ether 2 or 1.5. > > This strategy ensures a constant ratio between the amount of > reallocated data to the length of the buffer by making > reallocations less frequent as the buffer grows. > > > And so we integrate the distribution between the point we > > are at and infinity. Then we tkae the mean. And that gives > > us a best estimate of how many bytes are to come, and > > therefore how much to grow the buffer by. > > You have an apriori distribution of the buffer size (can be > tracked on-the-fly, if unknown beforehand) and a partially > filled buffer. The task is to calculate the a-posteriori > distribution of that buffer's final size, and then to > allocate the predicted value based on a good percentile. > > How about using a percentile instead of the mean, e.g. if > the current size corresponds to percentile p, you allocate a > capacity corresponding to percentile 1-(1-p)/k , where k>1 > denotes the balance between space and time efficency. For > example, if the 60th percentile of the buffer is required > and k=2, you allocate a capacity sufficient to hold > 100-(100-60)/2=80% of buffers. 
Based on essentially no background to this question, not much can be
said. However, if one starts from the suggestion above to use the mean
of some distribution (or later some percentile), one notes that the
"mean" is just the minimiser of a quadratic cost function ... so an
improvement would be to base the choice on some more realistic cost
function, chosen for the actual application. Given that the scenario
apparently involves a sequence of such decisions, the obvious extension
of the cost-based approach would be to employ some form of dynamic
programming. Of course, this might not be appealing, in which case one
might choose the theoretically-simple approach of tuning a policy based
on good stochastic simulations of the situation.
| From | davidd02@tpg.com.au (David Duffy) |
|---|---|
| Date | 2024-06-19 06:48 +0000 |
| Message-ID | <v4tuvf$1qto5$1@dont-email.me> |
| In reply to | #386101 |
In sci.stat.math Anton Shepelev <anton.txt@g{oogle}mail.com> wrote:
> [cross-posted to: ci.stat.math]
>
> Malcolm McLean:
>
>> We have a continuously growing buffer, and we want the
>> best strategy for reallocations as the stream of
>> characters comes at us. So, given we now how many
>> characters have arrived, can we predict how many will
>> arrive,
>
> Do you mean in the next bunch, or in total (till the end of
> the buffer's lifetime)?
>
Isn't this a halting problem? Aren't the more important data:
how much memory the user is allowed to allocate, the properties of
the current system's memory allocation algorithm, when your stream
will have to go to disc or other slow large volume storage, how
the stream can be compressed on the fly (the latter might well give
strong predictions for future storage requirements based on what
has been read to date)?
| From | Malcolm McLean <malcolm.arthur.mclean@gmail.com> |
|---|---|
| Date | 2024-06-19 10:12 +0100 |
| Message-ID | <v4u7dm$1t2pu$1@dont-email.me> |
| In reply to | #386219 |
On 19/06/2024 07:48, David Duffy wrote:
> In sci.stat.math Anton Shepelev <anton.txt@g{oogle}mail.com> wrote:
>> [cross-posted to: ci.stat.math]
>>
>> Malcolm McLean:
>>
>>> We have a continuously growing buffer, and we want the
>>> best strategy for reallocations as the stream of
>>> characters comes at us. So, given we now how many
>>> characters have arrived, can we predict how many will
>>> arrive,
>>
>> Do you mean in the next bunch, or in total (till the end of
>> the buffer's lifetime)?
>>
> Isn't this a halting problem? Aren't the more important data:
> how much memory the user is allowed to allocate, the properties of
> the current system's memory allocation algorithm, when your stream
> will have to go to disc or other slow large volume storage, how
> the stream can be compressed on the fly (the latter might well give
> strong predictions for future storage requirements based on what
> has been read to date).
>
>
No. We have to have some knowledge. And what we probably know is that the
input is a file stored on someone's personal computer. And someone has
published on the statistical distribution of such files. And they have a
log-normal distribution with a mean and a median which he gives. So with
that information, we can work out, given that a file is at least N
characters, what is the probability that an allocation of any size will
contain the whole file, and how many bytes, on average, will be wasted.
Statistical analysis can't tell us what an allocation will cost versus the
cost of hogging memory we don't need, however. It can't tell us what to
do. Just put us in the picture.
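The two quantities mentioned above can be written down as a sketch. The notation is assumed for illustration: X is the file size with density f and CDF F, N is the number of characters already seen, c >= N is a candidate allocation, and the log-normal parameters mu and sigma would come from whatever published statistics are used.

```latex
% Probability that an allocation of c bytes holds the whole file,
% given that the file is known to be at least N bytes long:
P(X \le c \mid X \ge N) = \frac{F(c) - F(N)}{1 - F(N)}, \qquad c \ge N

% Average number of bytes wasted when the file does fit:
E[\, c - X \mid N \le X \le c \,]
  = c - \frac{\int_N^c x\, f(x)\, dx}{F(c) - F(N)}

% For a log-normal distribution with parameters \mu and \sigma,
% where \Phi is the standard normal CDF:
F(x) = \Phi\!\left( \frac{\ln x - \mu}{\sigma} \right)
```

As the post says, these only describe the situation; choosing c still requires a cost for reallocating versus a cost for holding unused memory.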
--
Check out my hobby project.
http://malcolmmclean.github.io/babyxrc
| From | Ben Bacarisse <ben@bsb.me.uk> |
|---|---|
| Date | 2024-06-19 16:36 +0100 |
| Message-ID | <875xu5t066.fsf@bsb.me.uk> |
| In reply to | #386226 |
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

> No. We have to have some knowledge. And what we probably know is that the
> input is a file stored on someone's personal computer. And someone has
> published on the statistical distribution of such files

That's not the case that matters (to me at least). If the input is a
file, we have a much better way of "guessing" the size than guessing and
growing -- just ask for the size. Sure, we might need to make
adjustments if the file is changing, but there is always a better
measure than any statistical analysis.

To some extent this seems like a solution in search of a problem.
Growing the buffer exponentially is simple and effective.

--
Ben.
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2024-06-19 19:41 +0200 |
| Message-ID | <v4v58t$230rh$1@dont-email.me> |
| In reply to | #386241 |
On 19/06/2024 17:36, Ben Bacarisse wrote:
> Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
>
>> No. We have to have some knowledge. And what we probably know is that the
>> input is a file stored on someone's personal computer. And someone has
>> published on the statistical distribution of such files
>
> That's not the case that matters (to me at least). If the input is a
> file, we have a much better way of "guessing" the size than guessing and
> growing -- just ask for the size. Sure, we might need to make
> adjustments if the file is changing, but there is always a better
> measure than any statistical analysis.
>
> To some extent this seems like a solution in search of a problem.

It seems more like a solution that doesn't exist in search of a problem
with absurdly unrealistic requirements. And even if Malcolm's solution
existed, and the problem existed, it /still/ wouldn't work - knowing the
distribution of file sizes tells us nothing about the size of any given
file.

> Growing the buffer exponentially is simple and effective.
>

Yes, that's the general way to handle buffers when you don't know what
size they should be.

A better solution for this sort of program is usually, as you say,
asking the OS for the file size (there is no standard library function
for getting the file size, but it's not hard to do for any realistic
target OS). And then for big files, prefer mmap to reading the file
into a buffer.

It's only really for unsized "files" such as piped input that you have
no way of getting the size, and then exponential growth is the way to
go. Personally, I'd start with a big size (perhaps 10 MB) that is
bigger than you are likely to need in practice, but small enough that it
is negligible on even vaguely modern computers. Then the realloc code
is unlikely to be used (but it can still be there for completeness).
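As one possible illustration of "asking the OS for the file size", here is a minimal sketch that assumes a POSIX system and uses stat(). The helper name regular_file_size is made up for this example; on non-POSIX targets a different call would be needed.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper (POSIX only): size in bytes of a regular file,
 * or -1 if the path cannot be examined or is not a regular file
 * (pipe, terminal, device, ...). In the -1 case the caller would
 * fall back to growing the buffer as data arrives. */
static long long regular_file_size(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;                 /* no such file, no permission, ... */
    if (!S_ISREG(st.st_mode))
        return -1;                 /* size is not meaningful here */
    return (long long)st.st_size;
}
```

The result is best treated as an initial capacity rather than a guarantee, since the file may change between the query and the read; the realloc path stays in place for that case.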
| From | Ben Bacarisse <ben@bsb.me.uk> |
|---|---|
| Date | 2024-06-19 22:24 +0100 |
| Message-ID | <87o77wsk18.fsf@bsb.me.uk> |
| In reply to | #386244 |
David Brown <david.brown@hesbynett.no> writes:

> On 19/06/2024 17:36, Ben Bacarisse wrote:
>> Growing the buffer exponentially is simple and effective.
>
> Yes, that's the general way to handle buffers when you don't know what size
> they should be.
>
> A better solution for this sort of program is usually, as you say, asking
> the OS for the file size (there is no standard library function for getting
> the file size, but it's not hard to do for any realistic target OS). And
> then for big files, prefer mmap to reading the file into a buffer.
>
> It's only really for unsized "files" such as piped input that you have no
> way of getting the size, and then exponential growth is the way to go.
> Personally, I'd start with a big size (perhaps 10 MB) that is bigger than
> you are likely to need in practice, but small enough that it is negligible
> on even vaguely modern computers. Then the realloc code is unlikely to be
> used (but it can still be there for completeness).

There are other uses that have nothing to do with files. I have a small
dynamic array library (just a couple of functions) that I use for all
sorts of things. I can read a file or parse tokens or input a line just
by adding characters. Because of its rather general use, I don't start
with a large buffer (though the initial size can be set).

--
Ben.
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2024-06-20 13:22 +0200 |
| Message-ID | <v513dn$2i1c3$2@dont-email.me> |
| In reply to | #386251 |
On 19/06/2024 23:24, Ben Bacarisse wrote:
> David Brown <david.brown@hesbynett.no> writes:
>
>> On 19/06/2024 17:36, Ben Bacarisse wrote:
>>> Growing the buffer exponentially is simple and effective.
>>
>> Yes, that's the general way to handle buffers when you don't know what size
>> they should be.
>>
>> A better solution for this sort of program is usually, as you say, asking
>> the OS for the file size (there is no standard library function for getting
>> the file size, but it's not hard to do for any realistic target OS). And
>> then for big files, prefer mmap to reading the file into a buffer.
>>
>> It's only really for unsized "files" such as piped input that you have no
>> way of getting the size, and then exponential growth is the way to go.
>> Personally, I'd start with a big size (perhaps 10 MB) that is bigger than
>> you are likely to need in practice, but small enough that it is negligible
>> on even vaguely modern computers. Then the realloc code is unlikely to be
>> used (but it can still be there for completeness).
>
> There are other uses that have nothing to do with files.

Of course. This comment was for the specific purposes being discussed
here. For other uses, there can be many other structures and algorithms
that fit better. Exponentially increasing the size when needed is a
good general-purpose method.

> I have a small
> dynamic array library (just a couple of functions) that I use for all
> sorts of things. I can read a file or parse tokens or input a line just
> by adding characters. Because of its rather general use, I don't start
> with a large buffer (though the initial size can be set).
>
| From | Anton Shepelev <anton.txt@gmail.moc> |
|---|---|
| Date | 2024-06-20 01:53 +0300 |
| Message-ID | <20240620015347.9bdcc4df03ab63d096375450@gmail.moc> |
| In reply to | #386226 |
Malcolm McLean:
> We have to have some knowledge. And what we probably know
> is that the input is a file stored on someone's personal
> computer. And someone has published on the statistical
> distribution of such files. And they have a log-normal
> distribution with a mean and a median which he gives. So
> with that information, we can work out, given that a file
> is at least N characters, what is the probability that an
> allocation of any size will contain the whole file, and
> how many bytes, on average, will be wasted.
Observe that the standard algorithm of exponential growth is
memoryless and self-similar in that it does not depend on
context, or the history of previous reallocations. These
properties belong to (or even identify?) the exponential
distribution. We can therefore assume that the
exponential-growth strategy is ideal for exponentially
distributed buffer sizes, and under that assumption determine
the relation between the CDF values (p) corresponding to
consecutive re-allocations:
   p  = 1-e^(-L*x) ,
   p0 = 1-e^(-L*x0) ,
   p1 = 1-e^(-L*x1) ,
   x1 = k*x0 (by our strategy), =>
   p1 = 1-(1-p0)^k .
which does not depend on the distribution and lets us
generalise this approach for any distribution:
x1 = Q( 1 - ( 1 - CDF(x0) )^k )
where:
x0 : the required size
x1 : the new recommended capacity
Q(p) : the p-Quantile of the given distribution
CDF(x): the CDF of the given distribution
k>1 : balance between speed and space efficiency
--
() ascii ribbon campaign -- against html e-mail
/\ www.asciiribbon.org -- against proprietary attachments
| From | Anton Shepelev <anton.txt@g{oogle}mail.com> |
|---|---|
| Date | 2024-07-08 19:34 +0300 |
| Message-ID | <20240708193456.a1ebe2d0872239c525120d84@g{oogle}mail.com> |
| In reply to | #386253 |
I had plumb forgot about this solution of mine:
> p0 = 1-e^(-L*x0) ,
> p1 = 1-e^(-L*x1) ,
> x1 = k*x0 (by our strategy), =>
> p1 = 1-(1-p0)^k .
>
> which does not depend on the distribution and lets us
> generalise this approach for any distribution:
>
> x1 = Q( 1 - ( 1 - CDF(x0) )^k )
> where:
>
> x0 : the required size
> x1 : the new recommended capacity
> Q(p) : the p-Quantile of the given distribution
> CDF(x): the CDF of the given distribution
> k>1 : balance between speed and space efficiency
Let us test it with the exponential distribution, for which:
Q (p) = -Ln( 1 - p )/L
CDF(x) = 1 - e^(-Lx)
Substituting these into the equation for x1:
x1 = Q ( 1 - ( 1 - ( 1 - e^(-Lx0) ) )^k ) =
Q ( 1 - ( e^(-Lx0) )^k ) =
Q ( 1 - e^(-kLx0) ) =
-Ln( e^(-kLx0) )/L = k*x0 (QED)
That is, my solution is a/the generalisation of the
exponential growth strategy.
--
() ascii ribbon campaign -- against html e-mail
/\ www.asciiribbon.org -- against proprietary attachments
| From | Anton Shepelev <anton.txt@g{oogle}mail.com> |
|---|---|
| Date | 2024-06-19 15:20 +0300 |
| Message-ID | <20240619152000.2738defeceb1df7203151c64@g{oogle}mail.com> |
| In reply to | #386219 |
Malcolm McLean writes that, given the log-normal distribution
of file sizes with known parameters,
> we can work out, given that a file is at least N
> characters, what is the probability that an allocation of
> any size will contain the whole file, and how many bytes,
> on average, will be wasted.
This is why I thought statisticians might help him: Malcolm
wants to find the a-posteriori distribution of the size of a
file, after it has been found to exceed N bytes. Am I right
that if we take the remaining (N>20) part of the density
function and re-normalise it, we shall obtain the desired
distribution?
My proposition was as follows:
1. Find quantile q0 corresponding to the buffer size
currently requested.
2. Calculate new quantile q1 = 1-(1-q0)/k, where k>1 is
an adjustable parameter, and use its corresponding
value as the new allocation size.
For example, assuming for simplicity a uniform [0,20]
distribution of file sizes and k=2, a sequence of allocations
may look like this:
   requested   allocated
       2       20-(20- 2)/2 = 11
      12       20-(20-12)/2 = 16
      18       20-(20-18)/2 = 19
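The two steps above, specialised to the uniform [0, max] example, can be sketched as below. The function name next_allocation_uniform is hypothetical; for any other distribution the two marked lines would be replaced by that distribution's CDF and quantile function.

```c
/* Hypothetical helper for a uniform [0, max] size distribution.
 * Step 1: find the quantile q0 of the requested size.
 * Step 2: move to q1 = 1 - (1 - q0)/k and convert back to a size. */
static double next_allocation_uniform(double requested, double max, double k)
{
    double q0 = requested / max;        /* CDF of uniform [0, max]      */
    double q1 = 1.0 - (1.0 - q0) / k;   /* new target quantile, k > 1   */
    return q1 * max;                    /* quantile of uniform [0, max] */
}
```

With max = 20 and k = 2, requests of 2, 12 and 18 come out as approximately 11, 16 and 19, as in the table.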
--
() ascii ribbon campaign -- against html e-mail
/\ www.asciiribbon.org -- against proprietary attachments
| From | Rich Ulrich <rich.ulrich@comcast.net> |
|---|---|
| Date | 2024-07-02 00:51 -0400 |
| Message-ID | <s1178jp7fjh4rgtsafa78dp6j05ejidau2@4ax.com> |
| In reply to | #386101 |
On Mon, 17 Jun 2024 18:02:49 +0300, Anton Shepelev
<anton.txt@g{oogle}mail.com> wrote:
>[cross-posted to: ci.stat.math]
>
Anton,
The post being responded to was originally to comp.lang.c
which I don't subscribe to.
I have a question that I suppose reflects on my news source,
GigaNews, or else on my reader, Forte Agent.
Was this thread something posted 15 or 20 years ago?
I tried to call up the original post by clicking on the Message
ID when looking at headers; nothing comes up when Agent goes
online to look. The header shows multiple earlier messages;
none of them come up for me.
My clicking on Message ID works elsewhere. The logical and
simple explanation is that this is a thread old enough that
GigaNews does not have it.
I suppose that someone else might be able to tell me, if their
supplier goes back further or if GigaNews is somehow failing
to show me something that is recent.
--
Rich Ulrich