| Started by | Philipp Klaus Krause <pkk@spth.de> |
|---|---|
| First post | 2020-05-11 13:30 +0200 |
| Last post | 2020-06-28 06:32 -0700 |
| Articles | 16 on this page of 76 — 16 participants |
How many wide characters may mbstowcs store? Philipp Klaus Krause <pkk@spth.de> - 2020-05-11 13:30 +0200
Re: How many wide characters may mbstowcs store? Manfred <noname@add.invalid> - 2020-05-11 13:55 +0200
Re: How many wide characters may mbstowcs store? Philipp Klaus Krause <pkk@spth.de> - 2020-05-11 14:01 +0200
Re: How many wide characters may mbstowcs store? James Kuyper <jameskuyper@alumni.caltech.edu> - 2020-05-11 09:08 -0400
Re: How many wide characters may mbstowcs store? Philipp Klaus Krause <pkk@spth.de> - 2020-05-11 15:19 +0200
Re: How many wide characters may mbstowcs store? Manfred <noname@add.invalid> - 2020-05-11 18:32 +0200
Re: How many wide characters may mbstowcs store? Philipp Klaus Krause <pkk@spth.de> - 2020-05-11 18:59 +0200
Re: How many wide characters may mbstowcs store? scott@slp53.sl.home (Scott Lurndal) - 2020-05-11 17:42 +0000
Re: How many wide characters may mbstowcs store? Philipp Klaus Krause <pkk@spth.de> - 2020-05-11 20:30 +0200
Re: How many wide characters may mbstowcs store? James Kuyper <jameskuyper@alumni.caltech.edu> - 2020-05-12 00:29 -0400
Re: How many wide characters may mbstowcs store? Manfred <noname@add.invalid> - 2020-05-12 14:41 +0200
Re: How many wide characters may mbstowcs store? Bonita Montero <Bonita.Montero@gmail.com> - 2020-05-12 16:19 +0200
Re: How many wide characters may mbstowcs store? Ben Bacarisse <ben.usenet@bsb.me.uk> - 2020-05-11 13:03 +0100
Re: How many wide characters may mbstowcs store? Philipp Klaus Krause <pkk@spth.de> - 2020-05-11 14:07 +0200
Re: How many wide characters may mbstowcs store? Philipp Klaus Krause <pkk@spth.de> - 2020-05-11 15:20 +0200
Re: How many wide characters may mbstowcs store? Bonita Montero <Bonita.Montero@gmail.com> - 2020-05-11 16:31 +0200
Re: How many wide characters may mbstowcs store? Barry Schwarz <schwarzb@delq.com> - 2020-05-11 10:06 -0700
Re: How many wide characters may mbstowcs store? Philipp Klaus Krause <pkk@spth.de> - 2020-05-11 15:58 +0200
Re: How many wide characters may mbstowcs store? James Kuyper <jameskuyper@alumni.caltech.edu> - 2020-05-11 10:24 -0400
Re: How many wide characters may mbstowcs store? Philipp Klaus Krause <pkk@spth.de> - 2020-05-11 16:52 +0200
Re: How many wide characters may mbstowcs store? Manfred <noname@add.invalid> - 2020-05-11 18:55 +0200
Re: How many wide characters may mbstowcs store? richard@cogsci.ed.ac.uk (Richard Tobin) - 2020-05-11 15:51 +0000
Re: How many wide characters may mbstowcs store? Philipp Klaus Krause <pkk@spth.de> - 2020-05-11 19:01 +0200
Re: How many wide characters may mbstowcs store? scott@slp53.sl.home (Scott Lurndal) - 2020-05-11 17:33 +0000
Re: How many wide characters may mbstowcs store? Philipp Klaus Krause <pkk@spth.de> - 2020-05-11 20:57 +0200
Re: How many wide characters may mbstowcs store? scott@slp53.sl.home (Scott Lurndal) - 2020-05-11 19:17 +0000
Re: How many wide characters may mbstowcs store? richard@cogsci.ed.ac.uk (Richard Tobin) - 2020-05-11 19:41 +0000
Re: How many wide characters may mbstowcs store? scott@slp53.sl.home (Scott Lurndal) - 2020-05-11 20:01 +0000
Re: How many wide characters may mbstowcs store? scott@slp53.sl.home (Scott Lurndal) - 2020-05-11 19:19 +0000
Re: How many wide characters may mbstowcs store? Florian Weimer <fw@deneb.enyo.de> - 2020-05-11 20:24 +0200
Re: How many wide characters may mbstowcs store? Philipp Klaus Krause <pkk@spth.de> - 2020-05-11 20:59 +0200
Re: How many wide characters may mbstowcs store? scott@slp53.sl.home (Scott Lurndal) - 2020-05-11 19:17 +0000
Re: How many wide characters may mbstowcs store? Philipp Klaus Krause <pkk@spth.de> - 2020-05-11 21:24 +0200
Re: How many wide characters may mbstowcs store? Florian Weimer <fw@deneb.enyo.de> - 2020-05-11 22:30 +0200
Re: How many wide characters may mbstowcs store? Bonita Montero <Bonita.Montero@gmail.com> - 2020-05-11 16:44 +0200
Re: How many wide characters may mbstowcs store? Philipp Klaus Krause <pkk@spth.de> - 2020-05-11 16:54 +0200
Re: How many wide characters may mbstowcs store? Bonita Montero <Bonita.Montero@gmail.com> - 2020-05-11 16:57 +0200
Re: How many wide characters may mbstowcs store? Philipp Klaus Krause <pkk@spth.de> - 2020-05-11 17:07 +0200
Re: How many wide characters may mbstowcs store? Bonita Montero <Bonita.Montero@gmail.com> - 2020-05-11 17:08 +0200
Re: How many wide characters may mbstowcs store? James Kuyper <jameskuyper@alumni.caltech.edu> - 2020-05-11 11:25 -0400
Re: How many wide characters may mbstowcs store? Tim Rentsch <tr.17687@z991.linuxsc.com> - 2020-05-11 09:06 -0700
Re: How many wide characters may mbstowcs store? Philipp Klaus Krause <pkk@spth.de> - 2020-05-11 19:05 +0200
Re: How many wide characters may mbstowcs store? Philipp Klaus Krause <pkk@spth.de> - 2020-05-11 19:19 +0200
Re: How many wide characters may mbstowcs store? Tim Rentsch <tr.17687@z991.linuxsc.com> - 2020-05-23 07:51 -0700
Re: How many wide characters may mbstowcs store? Philipp Klaus Krause <pkk@spth.de> - 2020-05-23 20:27 +0200
Re: How many wide characters may mbstowcs store? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2020-05-23 14:25 -0700
Re: How many wide characters may mbstowcs store? Tim Rentsch <tr.17687@z991.linuxsc.com> - 2020-05-26 07:09 -0700
Re: How many wide characters may mbstowcs store? Tim Rentsch <tr.17687@z991.linuxsc.com> - 2020-05-26 07:14 -0700
Re: How many wide characters may mbstowcs store? Spiros Bousbouras <spibou@gmail.com> - 2020-05-26 16:00 +0000
Re: How many wide characters may mbstowcs store? Tim Rentsch <tr.17687@z991.linuxsc.com> - 2020-05-29 21:23 -0700
Re: How many wide characters may mbstowcs store? Spiros Bousbouras <spibou@gmail.com> - 2020-05-30 20:08 +0000
Re: How many wide characters may mbstowcs store? Tim Rentsch <tr.17687@z991.linuxsc.com> - 2020-06-03 08:46 -0700
Re: How many wide characters may mbstowcs store? James Kuyper <jameskuyper@alumni.caltech.edu> - 2020-06-03 10:18 -0700
Re: How many wide characters may mbstowcs store? Tim Rentsch <tr.17687@z991.linuxsc.com> - 2020-06-23 05:35 -0700
Re: How many wide characters may mbstowcs store? James Kuyper <jameskuyper@alumni.caltech.edu> - 2020-06-26 06:32 -0700
Re: How many wide characters may mbstowcs store? Tim Rentsch <tr.17687@z991.linuxsc.com> - 2020-09-02 09:19 -0700
Re: How many wide characters may mbstowcs store? James Kuyper <jameskuyper@alumni.caltech.edu> - 2020-09-02 19:50 -0700
Re: How many wide characters may mbstowcs store? raltbos@xs4all.nl (Richard Bos) - 2020-06-04 20:34 +0000
Re: How many wide characters may mbstowcs store? scott@slp53.sl.home (Scott Lurndal) - 2020-05-11 17:39 +0000
Re: How many wide characters may mbstowcs store? Autist <autist69@gmail.com> - 2020-05-11 19:42 +0200
Re: How many wide characters may mbstowcs store? Philipp Klaus Krause <pkk@spth.de> - 2020-05-11 20:28 +0200
Re: How many wide characters may mbstowcs store? scott@slp53.sl.home (Scott Lurndal) - 2020-05-11 18:37 +0000
Re: How many wide characters may mbstowcs store? Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2020-05-11 11:50 -0700
Re: How many wide characters may mbstowcs store? Ben Bacarisse <ben.usenet@bsb.me.uk> - 2020-05-12 20:02 +0100
Re: How many wide characters may mbstowcs store? Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2020-05-12 13:12 -0700
Re: How many wide characters may mbstowcs store? richard@cogsci.ed.ac.uk (Richard Tobin) - 2020-05-11 19:56 +0000
Re: How many wide characters may mbstowcs store? Tim Rentsch <tr.17687@z991.linuxsc.com> - 2020-05-24 16:49 -0700
Re: How many wide characters may mbstowcs store? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2020-05-11 14:19 -0700
Re: How many wide characters may mbstowcs store? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2020-05-11 14:22 -0700
Re: How many wide characters may mbstowcs store? Philipp Klaus Krause <pkk@spth.de> - 2020-05-12 09:17 +0200
Re: How many wide characters may mbstowcs store? raltbos@xs4all.nl (Richard Bos) - 2020-05-24 16:12 +0000
Re: How many wide characters may mbstowcs store? Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2020-05-24 15:10 -0700
Re: How many wide characters may mbstowcs store? James Kuyper <jameskuyper@alumni.caltech.edu> - 2020-05-24 22:58 -0400
Re: How many wide characters may mbstowcs store? richard@cogsci.ed.ac.uk (Richard Tobin) - 2020-05-11 20:07 +0000
Re: How many wide characters may mbstowcs store? Andrey Tarasevich <andreytarasevich@hotmail.com> - 2020-06-25 21:42 -0700
Re: How many wide characters may mbstowcs store? Tim Rentsch <tr.17687@z991.linuxsc.com> - 2020-06-28 06:32 -0700
| From | Philipp Klaus Krause <pkk@spth.de> |
|---|---|
| Date | 2020-05-11 20:28 +0200 |
| Message-ID | <r9c5ha$db2$1@solani.org> |
| In reply to | #152220 |
On 11.05.20 at 19:39, Scott Lurndal wrote:
>
> It's not a bug. The phrase
>
> "No characters that follow a null byte (which is converted into
> a wide-character code with value 0) shall be examined or converted."
>
> is there to ensure that no bytes beyond the null are _read_ from the source
> string (thus ensuring that no page fault, for example, occurs because a byte
> beyond the nul is on the next (unallocated) page). It has no bearing on
> whether the function is allowed to write element 'n-1' of the destination operand
> which is allowed explicitly by the standard regardless of the length of
> the input string.

I can see how you could assume that would be allowed implicitly, but
explicitly?

Then, by your reasoning, would strcpy() be allowed to write arbitrary
amounts of data to the destination?
| From | scott@slp53.sl.home (Scott Lurndal) |
|---|---|
| Date | 2020-05-11 18:37 +0000 |
| Message-ID | <7YguG.179866$2U3.164806@fx04.iad> |
| In reply to | #152225 |
Philipp Klaus Krause <pkk@spth.de> writes:
>On 11.05.20 at 19:39, Scott Lurndal wrote:
>>
>> It's not a bug. The phrase
>>
>> "No characters that follow a null byte (which is converted into
>> a wide-character code with value 0) shall be examined or converted."
>>
>> is there to ensure that no bytes beyond the null are _read_ from the source
>> string (thus ensuring that no page fault, for example, occurs because a byte
>> beyond the nul is on the next (unallocated) page). It has no bearing on
>> whether the function is allowed to write element 'n-1' of the destination operand
>> which is allowed explicitly by the standard regardless of the length of
>> the input string.
>
>I can see how you could assume that would be allowed implicitly, but
>explicitly?

The purpose of the standard is to provide a contract between the
application and the implementation.

The requirements in the standard describe what the implementation is allowed
to do. Explicitly. It cannot write beyond the 'n-1'th element of the
destination, and cannot read beyond the nul-byte in the source.

>Then, by your reasoning, would strcpy() be allowed to write arbitrary
>amounts of data to the destination?

Arbitrary in the sense that it can continue to store bytes into the
destination until it processes a null byte, yes. It's not analogous
to the interfaces you're discussing, since strcpy's destination buffer
isn't explicitly bounded by the API. Consider strncpy, for example, where the
implementation must pad the destination with nul-bytes up to
the 'n-1'th element of the destination buffer if a nul-byte is
encountered in the source string before the destination buffer
is exhausted. This is a much better analogy to the behavior
of the bounded wide-string functions.
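To make the strncpy comparison concrete, here is a minimal sketch of the padding behavior described above. The helper name `count_nul_after_strncpy` and the `'#'` sentinel are made up for illustration; only `strncpy` and `memset` come from the standard library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the thread): fills dst[0..n-1] with a
   sentinel, copies src with strncpy, and counts how many of the n
   destination bytes are null afterwards. */
static size_t count_nul_after_strncpy(char *dst, const char *src, size_t n)
{
    assert(dst != NULL && src != NULL);

    memset(dst, '#', n);   /* sentinel fill so the padding is visible */
    strncpy(dst, src, n);  /* copies src, then pads with '\0' up to n bytes */

    size_t nuls = 0;
    for (size_t i = 0; i < n; i++) {
        if (dst[i] == '\0') {
            nuls++;
        }
    }
    return nuls;
}
```

With an 8-byte buffer and the source "ab", every byte from index 2 through index 7 ends up null, so strncpy really does write all n elements. When the source is at least n bytes long, no null byte is stored at all.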
| From | Malcolm McLean <malcolm.arthur.mclean@gmail.com> |
|---|---|
| Date | 2020-05-11 11:50 -0700 |
| Message-ID | <0464d036-5ce1-4a13-b92b-a4ff66aa1af4@googlegroups.com> |
| In reply to | #152227 |
On Monday, 11 May 2020 19:38:04 UTC+1, Scott Lurndal wrote:
> Philipp Klaus Krause <pkk@spth.de> writes:
> >On 11.05.20 at 19:39, Scott Lurndal wrote:
> >>
> >> It's not a bug. The phrase
> >>
> >> "No characters that follow a null byte (which is converted into
> >> a wide-character code with value 0) shall be examined or converted."
> >>
> >> is there to ensure that no bytes beyond the null are _read_ from the source
> >> string (thus ensuring that no page fault, for example, occurs because a byte
> >> beyond the nul is on the next (unallocated) page). It has no bearing on
> >> whether the function is allowed to write element 'n-1' of the destination operand
> >> which is allowed explicitly by the standard regardless of the length of
> >> the input string.
> >
> >I can see how you could assume that would be allowed implicitly, but
> >explicitly?
>
> The purpose of the standard is to provide a contract between the
> application and the implementation.
>
> The requirements in the standard describe what the implementation is allowed
> to do. Explicitly. It cannot write beyond the 'n-1'th element of the
> destination, and cannot read beyond the nul-byte in the source.
>
> >Then, by your reasoning, would strcpy() be allowed to write arbitrary
> >amounts of data to the destination?
>
> Arbitrary in the sense that it can continue to store bytes into the
> destination until it processes a null byte, yes. It's not analogous
> to the interfaces you're discussing, since strcpy's destination buffer
> isn't explicitly bounded by the API. Consider strncpy, for example, where the
> implementation must pad the destination with nul-bytes up to
> the 'n-1'th element of the destination buffer if a nul-byte is
> encountered in the source string before the destination buffer
> is exhausted. This is a much better analogy to the behavior
> of the bounded wide-string functions.
>
Note that if we know that the buffers are correctly aligned and the size
is a multiple of the natural word size, probably 8 bytes, we can implement
the function in such a way as to always read and write multiples of 8 bytes,
and construct the write in registers. As memory access is usually the
rate-limiting operation, this might well be faster than trying to
detect the exact output buffer end.
| From | Ben Bacarisse <ben.usenet@bsb.me.uk> |
|---|---|
| Date | 2020-05-12 20:02 +0100 |
| Message-ID | <87ftc5dlif.fsf@bsb.me.uk> |
| In reply to | #152229 |
Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

> On Monday, 11 May 2020 19:38:04 UTC+1, Scott Lurndal wrote:
<cut>
>> The purpose of the standard is to provide a contract between the
>> application and the implementation.
>>
>> The requirements in the standard describe what the implementation is allowed
>> to do. Explicitly. It cannot write beyond the 'n-1'th element of the
>> destination, and cannot read beyond the nul-byte in the source.

If the implementation writes beyond any converted wide null it must
pretend that it didn't, because the return result counts the number of
modified locations. As a result, it can only write the value that was
there before (that's technically a modification in C terms, but it's one
the implementation can lie about).

<cut>
> Note that if we know that the buffers are correctly aligned and the size
> is a multiple of the natural word size, probably 8 bytes, we can implement
> the function in such a way as to always read and write multiples of 8 bytes,
> and construct the write in registers. As memory access is usually the
> rate-limiting operation, this might well be faster than trying to
> detect the exact output buffer end.

Is there a way that can be useful when the "extra" entries -- those
after any converted wide null -- must remain unchanged? Seems unlikely,
but it's not my area of expertise.

--
Ben.
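Whether an implementation disturbs the "extra" entries can be probed with a sentinel test along these lines. This is only a sketch: the names `touched_after_null` and `SENTINEL` are invented here, and a result of zero on one implementation says nothing about what the Standard requires or what other implementations do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <wchar.h>

#define SENTINEL L'#'  /* arbitrary marker value, chosen for illustration */

/* Hypothetical probe: fills dst[0..n-1] with a sentinel, converts src with
   mbstowcs, and counts the elements after the terminating null wide
   character that no longer hold the sentinel. Returns (size_t)-1 if the
   conversion failed or no terminating null fit into n elements. */
static size_t touched_after_null(wchar_t *dst, const char *src, size_t n)
{
    assert(dst != NULL && src != NULL);

    for (size_t i = 0; i < n; i++) {
        dst[i] = SENTINEL;
    }

    size_t len = mbstowcs(dst, src, n);
    if (len == (size_t)-1 || len >= n) {
        return (size_t)-1;
    }

    size_t touched = 0;
    for (size_t i = len + 1; i < n; i++) {
        if (dst[i] != SENTINEL) {
            touched++;
        }
    }
    return touched;
}
```

The probe keeps n equal to the real size of the destination array, so it stays within defined behavior under either reading of the Standard.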
| From | Malcolm McLean <malcolm.arthur.mclean@gmail.com> |
|---|---|
| Date | 2020-05-12 13:12 -0700 |
| Message-ID | <fdcc2e17-60d3-426c-8883-6324ef0882b8@googlegroups.com> |
| In reply to | #152263 |
On Tuesday, 12 May 2020 20:03:00 UTC+1, Ben Bacarisse wrote:
> Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
>
> > On Monday, 11 May 2020 19:38:04 UTC+1, Scott Lurndal wrote:
> <cut>
> >> The purpose of the standard is to provide a contract between the
> >> application and the implementation.
> >>
> >> The requirements in the standard describe what the implementation is allowed
> >> to do. Explicitly. It cannot write beyond the 'n-1'th element of the
> >> destination, and cannot read beyond the nul-byte in the source.
>
> If the implementation writes beyond any converted wide null it must
> pretend that it didn't, because the return result counts the number of
> modified locations. As a result, it can only write the value that was
> there before (that's technically a modification in C terms, but it's one
> the implementation can lie about).
>
> <cut>
> > Note that if we know that the buffers are correctly aligned and the size
> > is a multiple of the natural word size, probably 8 bytes, we can implement
> > the function in such a way as to always read and write multiples of 8 bytes,
> > and construct the write in registers. As memory access is usually the
> > rate-limiting operation, this might well be faster than trying to
> > detect the exact output buffer end.
>
> Is there a way that can be useful when the "extra" entries -- those
> after any converted wide null -- must remain unchanged? Seems unlikely,
> but it's not my area of expertise.
>
Yes. That additional restriction makes life very messy, because you've
got to have special logic to handle the end case. But you can still do
your reads and writes in 64 bits. However, you need an extra read of
the destination buffer at data end if your 16-bit output is not a multiple
of four.
The idea is that you read and write 64 bits at a time, and keep the
intermediate information in registers.

I haven't actually tried to implement this, much less test it for speed.
However, it's maybe a worthwhile little test project.
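The end-case logic described above might look roughly like the following sketch. It is not taken from any real library, and the helper name `store_u16_by_words` is invented. It assumes the destination has room for the count rounded up to a multiple of four 16-bit units, and it uses `memcpy` for the 64-bit loads and stores so that alignment and aliasing rules are respected. It does not measure speed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: stores count 16-bit units from src into dst using
   only 64-bit loads and stores on the destination. Assumes dst has room
   for count rounded up to a multiple of 4 units. Units past count keep
   their previous values thanks to the read-merge-write on the last word. */
static void store_u16_by_words(uint16_t *dst, const uint16_t *src, size_t count)
{
    assert(dst != NULL && src != NULL);

    size_t full = count / 4;  /* number of complete 64-bit words */
    size_t rest = count % 4;  /* leftover 16-bit units in the last word */
    uint64_t word;

    for (size_t w = 0; w < full; w++) {
        memcpy(&word, src + 4 * w, sizeof word);
        memcpy(dst + 4 * w, &word, sizeof word);
    }

    if (rest != 0) {
        uint16_t lanes[4];

        /* The extra read of the destination at the end of the data. */
        memcpy(&word, dst + 4 * full, sizeof word);
        memcpy(lanes, &word, sizeof lanes);

        /* Overlay only the units that belong to the output. */
        for (size_t i = 0; i < rest; i++) {
            lanes[i] = src[4 * full + i];
        }

        memcpy(&word, lanes, sizeof word);
        memcpy(dst + 4 * full, &word, sizeof word);
    }
}
```

Note that the final store rewrites the trailing units with the values they already had, which is the "modification the implementation can lie about" mentioned earlier in the thread.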
| From | richard@cogsci.ed.ac.uk (Richard Tobin) |
|---|---|
| Date | 2020-05-11 19:56 +0000 |
| Message-ID | <r9cam6$29n8$2@macpro.inf.ed.ac.uk> |
| In reply to | #152220 |
In article <n5guG.283633$Xk.216585@fx46.iad>,
Scott Lurndal <slp53@pacbell.net> wrote:

>is there to ensure that no bytes beyond the null are _read_ from the source
>string (thus ensuring that no page fault, for example, occurs because a byte
>beyond the nul is on the next (unallocated) page). It has no bearing on
>whether the function is allowed to write element 'n-1' of the
>destination operand
>which is allowed explicitly by the standard regardless of the length of
>the input string.

What? It can overwrite the contents of the destination array beyond
the length required for converting the input? Where does it explicitly
say that?

-- Richard
| From | Tim Rentsch <tr.17687@z991.linuxsc.com> |
|---|---|
| Date | 2020-05-24 16:49 -0700 |
| Message-ID | <86ftbouc4k.fsf@linuxsc.com> |
| In reply to | #152220 |
scott@slp53.sl.home (Scott Lurndal) writes:

> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>
>> Philipp Klaus Krause <pkk@spth.de> writes:
>>
>>> On 11.05.20 at 16:44, Bonita Montero wrote:
>>>
>>>> There's a POSIX-extension that if you pass nullptr for s, you get
>>>> the size of the buffer needed for s. Maybe this will help you.
>>>> Otherwise: multibyte-characters are usually UTF-8-characters and
>>>> it should be easy to find code to convert these characters into
>>>> wide-characters; but it should be also easy to write this yourself
>>>> in 20min.
>>>
>>> At the moment I want to figure out what to do about the problem. File a
>>> bug against GCC in Ubuntu? File a defect report / clarification request
>>> with WG14?
>>
>> File a gcc bug report. The gnu/gcc folks have misunderstood the
>> standard, and they are shooting their users in the foot. Your
>> support/regression tests deserve thanks, and have provided a
>> public service.
>
> It's not a bug. The phrase
>
> "No characters that follow a null byte (which is converted into
> a wide-character code with value 0) shall be examined or converted."
>
> is there to ensure that no bytes beyond the null are _read_ from the
> source string (thus ensuring that no page fault, for example, occurs
> because a byte beyond the nul is on the next (unallocated) page).
> It has no bearing on whether the function is allowed to write
> element 'n-1' of the destination operand which is allowed explicitly
> by the standard regardless of the length of the input string.

I don't agree with your interpretation. First, you are misquoting
the description given in 7.22.8.1 p2. Second, the statements that
"[mbstowcs] stores not more than n wide characters into the array"
and that "No multibyte characters that follow a null character
[...] will be examined or converted" do not constitute explicit
permission to do anything. Just the opposite: they give an
explicit restriction NOT to do something. Third, the interpretation
you suggest is not consistent with all other string library
functions in the Standard: they all don't do anything past the
final character explicitly processed, unless there is some sort
of explicit statement like "under such and such circumstances the
contents of the array is indeterminate." There is no such
explicit statement here.

By the way, the function I was talking about is wcstombs, not
mbstowcs.

> An application that provides an 'n' parameter larger than the
> allocated space of the destination buffer is _BROKEN_.

If you want to say it's not a good programming practice, I have
no problem with that. But all the evidence I have found supports
the conclusion that the behavior gcc exhibits here does not
conform to what the Standard is meant to require.
| From | Keith Thompson <Keith.S.Thompson+u@gmail.com> |
|---|---|
| Date | 2020-05-11 14:19 -0700 |
| Message-ID | <877dxitbjo.fsf@nosuchdomain.example.com> |
| In reply to | #152201 |
Philipp Klaus Krause <pkk@spth.de> writes:
> On 11.05.20 at 16:44, Bonita Montero wrote:
>> There's a POSIX-extension that if you pass nullptr for s, you get
>> the size of the buffer needed for s. Maybe this will help you.
>> Otherwise: multibyte-characters are usually UTF-8-characters and
>> it should be easy to find code to convert these characters into
>> wide-characters; but it should be also easy to write this yourself
>> in 20min.
>
> At the moment I want to figure out what to do about the problem. File a
> bug against GCC in Ubuntu? File a defect report / clarification request
> with WG14?
Why would you file a bug report against gcc? wcstombs is implemented by
the library, not by the compiler.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */
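The POSIX extension mentioned in the quoted text above can be used to size the output buffer before converting. The following is a minimal sketch, assuming a POSIX system where `wcstombs` with a null destination returns the required length; ISO C itself does not define that call. The helper name `wcs_to_mbs_alloc` is invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: converts a wide string into a freshly allocated
   multibyte string in the current locale. Relies on the POSIX extension
   that wcstombs(NULL, src, 0) returns the number of bytes needed, not
   counting the terminating null byte. Returns NULL on failure. The caller
   frees the result. */
static char *wcs_to_mbs_alloc(const wchar_t *src)
{
    assert(src != NULL);

    size_t len = wcstombs(NULL, src, 0);  /* size query, nothing stored */
    if (len == (size_t)-1) {
        return NULL;                      /* unconvertible wide character */
    }

    char *out = malloc(len + 1);
    if (out == NULL) {
        return NULL;
    }

    /* n matches the allocation exactly, so the terminator always fits. */
    if (wcstombs(out, src, len + 1) == (size_t)-1) {
        free(out);
        return NULL;
    }
    return out;
}
```

Passing an n that matches the allocated size sidesteps the question discussed in this thread, since the function is then never given permission to touch memory the caller does not own.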
| From | Keith Thompson <Keith.S.Thompson+u@gmail.com> |
|---|---|
| Date | 2020-05-11 14:22 -0700 |
| Message-ID | <873686tbei.fsf@nosuchdomain.example.com> |
| In reply to | #152241 |
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
> Philipp Klaus Krause <pkk@spth.de> writes:
>> On 11.05.20 at 16:44, Bonita Montero wrote:
>>> There's a POSIX-extension that if you pass nullptr for s, you get
>>> the size of the buffer needed for s. Maybe this will help you.
>>> Otherwise: multibyte-characters are usually UTF-8-characters and
>>> it should be easy to find code to convert these characters into
>>> wide-characters; but it should be also easy to write this yourself
>>> in 20min.
>>
>> At the moment I want to figure out what to do about the problem. File a
>> bug against GCC in Ubuntu? File a defect report / clarification request
>> with WG14?
>
> Why would you file a bug report against gcc? wcstombs is implemented by
> the library, not by the compiler.
(Unless it's a result of incorrect optimization by gcc.)
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */
| From | Philipp Klaus Krause <pkk@spth.de> |
|---|---|
| Date | 2020-05-12 09:17 +0200 |
| Message-ID | <r9dihj$d6r$1@solani.org> |
| In reply to | #152242 |
On 11.05.20 at 23:22, Keith Thompson wrote:
> Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
>> Philipp Klaus Krause <pkk@spth.de> writes:
>>> On 11.05.20 at 16:44, Bonita Montero wrote:
>>>> There's a POSIX-extension that if you pass nullptr for s, you get
>>>> the size of the buffer needed for s. Maybe this will help you.
>>>> Otherwise: multibyte-characters are usually UTF-8-characters and
>>>> it should be easy to find code to convert these characters into
>>>> wide-characters; but it should be also easy to write this yourself
>>>> in 20min.
>>>
>>> At the moment I want to figure out what to do about the problem. File a
>>> bug against GCC in Ubuntu? File a defect report / clarification request
>>> with WG14?
>>
>> Why would you file a bug report against gcc? wcstombs is implemented by
>> the library, not by the compiler.
>
> (Unless it's a result of incorrect optimization by gcc.)
>

I had assumed it to be a compiler issue since I was able to observe the
problem with GCC, but not LLVM on Ubuntu.

And indeed the problem is apparently in the gcc package for Ubuntu:
Their patch to the upstream Debian gcc package predefines
_FORTIFY_SOURCE to 2, which makes glibc non-compliant.
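One way to see whether a given build is affected is to check the macro at compile time. This is a small sketch with invented helper names (`fortify_level`, `fortify_description`); it only reports what the preprocessor saw and assumes the macro, if defined, expands to an integer, as it does with `-D_FORTIFY_SOURCE=2`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the value of _FORTIFY_SOURCE seen by the
   preprocessor for this translation unit, or 0 if it is not defined. */
static int fortify_level(void)
{
#ifdef _FORTIFY_SOURCE
    return _FORTIFY_SOURCE;
#else
    return 0;
#endif
}

/* Hypothetical helper: short human-readable summary of the level. */
static const char *fortify_description(void)
{
    int level = fortify_level();
    assert(level >= 0);
    return level > 0 ? "fortified library wrappers requested"
                     : "no fortification requested";
}
```

If the level turns out to be nonzero unexpectedly, compiling with gcc's `-U_FORTIFY_SOURCE` option should undefine the predefined macro, though whether that is appropriate depends on the project.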
| From | raltbos@xs4all.nl (Richard Bos) |
|---|---|
| Date | 2020-05-24 16:12 +0000 |
| Message-ID | <5eca9c8f.21221265@news.xs4all.nl> |
| In reply to | #152241 |
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

> Philipp Klaus Krause <pkk@spth.de> writes:
> > On 11.05.20 at 16:44, Bonita Montero wrote:
> >> There's a POSIX-extension that if you pass nullptr for s, you get
> >> the size of the buffer needed for s. Maybe this will help you.
> >> Otherwise: multibyte-characters are usually UTF-8-characters and
> >> it should be easy to find code to convert these characters into
> >> wide-characters; but it should be also easy to write this yourself
> >> in 20min.
> >
> > At the moment I want to figure out what to do about the problem. File a
> > bug against GCC in Ubuntu? File a defect report / clarification request
> > with WG14?
>
> Why would you file a bug report against gcc? wcstombs is implemented by
> the library, not by the compiler.

*Yawm*

Same thing, more or less same team.

(Imagine someone pulling that excuse against Microsoft C? Or Python, or
Forth, or even Basic?)

Pull up your panties already.

Richard
| From | Keith Thompson <Keith.S.Thompson+u@gmail.com> |
|---|---|
| Date | 2020-05-24 15:10 -0700 |
| Message-ID | <87pnatq91c.fsf@nosuchdomain.example.com> |
| In reply to | #152452 |
raltbos@xs4all.nl (Richard Bos) writes:
> Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
[...]
>> Why would you file a bug report against gcc? wcstombs is implemented by
>> the library, not by the compiler.
>
> *Yawm*
>
> Same thing, more or less same team.
Different projects, different teams, different bug reporting systems.
If you file a bug report against gcc for a problem in glibc, you're
just wasting time. They might respond and tell you where to file it.
They *might* redirect it for you, but I wouldn't count on that.
However, as I acknowledged in a followup, this particular problem
may be an issue with gcc's optimizations, which of course implies
that filing a bug report against gcc would be appropriate.
> (Imagine someone pulling that excuse against Microsoft C? Or Python, or
> Forth, or even Basic?)
Imagine paying attention to whether a compiler and runtime library
share a bug reporting system or not.
gcc is often used with libraries other than glibc, and glibc is
often used with compilers other than gcc. This is less true of
the systems you mention (though Microsoft's C library is commonly
used with compilers other than Microsoft's). I haven't looked into
Microsoft's bug reporting system(s). If I wanted to report a bug
in their C implementation, I would do so first.
> Pull up your panties already.
Be less rude.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */
| From | James Kuyper <jameskuyper@alumni.caltech.edu> |
|---|---|
| Date | 2020-05-24 22:58 -0400 |
| Message-ID | <rafc89$5hk$1@dont-email.me> |
| In reply to | #152452 |
On 5/24/20 12:12 PM, Richard Bos wrote:
> Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
...
>> Why would you file a bug report against gcc? wcstombs is implemented by
>> the library, not by the compiler.
>
> *Yawm*
>
> Same thing, more or less same team.

Not really - they're very different teams. A bug report filed with the
wrong team stands a good chance of being ignored, or at least, dismissed
with instructions to file it in the right location.

> (Imagine someone pulling that excuse against Microsoft C? Or Python, or
> Forth, or even Basic?)

How many of those are compilers that are routinely used with a standard
library provided by a different vendor? How many of those have standard
library implementations that are routinely used with a compiler provided
by a different vendor?
| From | richard@cogsci.ed.ac.uk (Richard Tobin) |
|---|---|
| Date | 2020-05-11 20:07 +0000 |
| Message-ID | <r9cb9v$2aea$1@macpro.inf.ed.ac.uk> |
| In reply to | #152174 |
In article <r9bd16$nbt$1@solani.org>,
Philipp Klaus Krause <pkk@spth.de> wrote:

>"size_t mbstowcs(wchar_t * restrict pwcs, const char *restrict s, size_t n);
>
>The mbstowcs function converts a sequence of multibyte characters that
>begins in the initial shift state from the array pointed to by s into a
>sequence of corresponding wide characters and stores not more than n
>wide characters into the array pointed to by pwcs. No multibyte
>characters that follow a null character (which is converted into a null
>wide character) will be examined or converted. Each multibyte character
>is converted as if by a call to the mbtowc function, except that the
>conversion state of the mbtowc function is not affected.
>
>No more than n elements will be modified in the array pointed to by
>pwcs. If copying takes place between objects that overlap, the behavior
>is undefined."

This description seems to be full of holes. It doesn't even say that
the characters it writes into the destination must be the ones it
converted from the source, unlike the much better description of
wcstombs.

-- Richard
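One consequence of the quoted wording that is not in dispute is the truncation case: when the source needs n or more wide characters, exactly n are stored, no terminating null wide character is written, and element n stays untouched. A minimal sketch of a caller-side check follows; the helper name `convert_checked` is invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: converts src into at most n wide characters in dst.
   Returns 1 if the result is null-terminated, 0 if it was truncated (no
   terminating null wide character stored), and -1 if src contained an
   invalid multibyte sequence. */
static int convert_checked(wchar_t *dst, const char *src, size_t n)
{
    assert(dst != NULL && src != NULL);

    size_t len = mbstowcs(dst, src, n);
    if (len == (size_t)-1) {
        return -1;
    }
    /* A return value equal to n means all n slots hold converted
       characters, so there was no room left for the terminator. */
    return len < n ? 1 : 0;
}
```

Callers that need a terminated string either check for the truncated result or pass n as one less than the array size and store the terminator themselves.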
| From | Andrey Tarasevich <andreytarasevich@hotmail.com> |
|---|---|
| Date | 2020-06-25 21:42 -0700 |
| Message-ID | <rd3uc8$sqv$1@dont-email.me> |
| In reply to | #152174 |
On 5/11/2020 4:30 AM, Philipp Klaus Krause wrote:
>
> For wcstombs, the wording seems clear to state that it will stop at a
> terminating 0, but for the mbstowcs it seems unclear to me.
>

The issue is fairly similar to the one described here

https://trust-in-soft.com/blog/2015/12/21/memcmp-requires-pointers-to-fully-valid-buffers/

The question is whether `memcmp` is allowed to read beyond the first
differing byte, while still within the specified buffer size.

--
Best regards,
Andrey Tarasevich
| From | Tim Rentsch <tr.17687@z991.linuxsc.com> |
|---|---|
| Date | 2020-06-28 06:32 -0700 |
| Message-ID | <864kqviae5.fsf@linuxsc.com> |
| In reply to | #152900 |
Andrey Tarasevich <andreytarasevich@hotmail.com> writes:

> On 5/11/2020 4:30 AM, Philipp Klaus Krause wrote:
>
>> For wcstombs, the wording seems clear to state that it will stop at a
>> terminating 0, but for the mbstowcs it seems unclear to me.
>
> The issue is fairly similar to the one described here
>
> https://trust-in-soft.com/blog/2015/12/21/memcmp-requires-pointers-to-fully-valid-buffers/
>
> The question is whether `memcmp` is allowed to read beyond the first
> differing byte, while still within the specified buffer size.

The question is similar. The answer isn't.