| Started by | Kaz Kylheku <864-117-4973@kylheku.com> |
|---|---|
| First post | 2023-09-03 17:59 +0000 |
| Last post | 2023-09-03 19:30 +0100 |
| Articles | 4 — 3 participants |
Can we lie to memchr? Kaz Kylheku <864-117-4973@kylheku.com> - 2023-09-03 17:59 +0000
Re: Can we lie to memchr? Tim Rentsch <tr.17687@z991.linuxsc.com> - 2023-09-03 11:22 -0700
Re: Can we lie to memchr? Kaz Kylheku <864-117-4973@kylheku.com> - 2023-09-03 18:58 +0000
Re: Can we lie to memchr? Ben Bacarisse <ben.usenet@bsb.me.uk> - 2023-09-03 19:30 +0100
| From | Kaz Kylheku <864-117-4973@kylheku.com> |
|---|---|
| Date | 2023-09-03 17:59 +0000 |
| Subject | Can we lie to memchr? |
| Message-ID | <20230903104255.310@kylheku.com> |
You would think that memchr can be used to test whether a string is
longer than N without traversing it. For instance, we can take a
gigabyte-long character string and efficiently test whether it is
shorter than 10 characters:

    memchr(gigastr, 0, 10) == 0

If no null is found within the first 10 bytes (the expression is true),
then its length is 10 or more.

But suppose a 7 byte string is passed (length 6).

That *object* is smaller than n; it does not have an "initial sequence
of n characters" for memchr to search.

ISO C doesn't say that bytes of the initial sequence which are
beyond the sought-after value shall not be accessed by memchr.

For instance, for shits and giggles, memchr could perform a
right-to-left scan, and report the most recently found, hence
leftmost, occurrence of the value.

Or it could assume it can load an 8 byte word from the start of the
object (even if unaligned), since that lies within 10. Yet that 8 could
extend into an unmapped page.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca
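For comparison, here is a minimal sketch of a conservative version of the same test that does not depend on how memchr treats bytes past the terminator. The helper name `length_at_least` is made up for illustration and does not come from the post; the loop reads strictly left to right and stops at the first null byte, so it never touches memory beyond the end of the string.

```c
#include <stddef.h>

/* Hypothetical helper (name is an assumption, not from the thread):
   returns 1 if the string s has length n or more, 0 otherwise.
   Bytes are read in increasing order and the scan stops at the
   terminating null, so at most n bytes are examined and nothing
   past the terminator is accessed. */
static int length_at_least(const char *s, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (s[i] == '\0')
            return 0;   /* terminator found before n bytes: too short */
    }
    return 1;           /* first n bytes are all non-null */
}
```

With this helper, a 7 byte object holding a 6 character string is handled safely for n = 10, at the cost of giving up whatever word-at-a-time tricks a library memchr might use.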
| From | Tim Rentsch <tr.17687@z991.linuxsc.com> |
|---|---|
| Date | 2023-09-03 11:22 -0700 |
| Message-ID | <86r0nfqfql.fsf@linuxsc.com> |
| In reply to | #173802 |
Kaz Kylheku <864-117-4973@kylheku.com> writes:

> You would think that memchr can be used to test whether a string is
> longer than N without traversing it. For instance we can take a
> gigabyte-long character string and efficiently test whether it is
> shorter than 10 characters:
>
> memchr(gigastr, 0, 10) == 0
>
> if no null is found within the first 10 bytes, then its length
> is 10 or more.
>
> But suppose a 7 byte string is passed (length 6).
>
> That *object* is smaller than n; it does not have an "initial sequence
> of n characters" for memchr to search.
>
> ISO C doesn't say that bytes of the initial sequence which are
> beyond the sought-after value shall not be accessed by memchr.

[...]

It does, and has for more than 10 years.
| From | Kaz Kylheku <864-117-4973@kylheku.com> |
|---|---|
| Date | 2023-09-03 18:58 +0000 |
| Message-ID | <20230903115634.349@kylheku.com> |
| In reply to | #173803 |
On 2023-09-03, Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
> Kaz Kylheku <864-117-4973@kylheku.com> writes:
>> ISO C doesn't say that bytes of the initial sequence which are
>> beyond the sought-after value shall not be accessed by memchr.
> [...]
>
> It does, and has for more than 10 years.

Thanks, Tim, and also Ben. I looked in the wrong tab of the PDF reader,
where I have C99 open!
| From | Ben Bacarisse <ben.usenet@bsb.me.uk> |
|---|---|
| Date | 2023-09-03 19:30 +0100 |
| Message-ID | <87fs3vw1n7.fsf@bsb.me.uk> |
| In reply to | #173802 |
Kaz Kylheku <864-117-4973@kylheku.com> writes:

> You would think that memchr can be used to test whether a string is
> longer than N without traversing it. For instance we can take a
> gigabyte-long character string and efficiently test whether it is
> shorter than 10 characters:
>
> memchr(gigastr, 0, 10) == 0
>
> if no null is found within the first 10 bytes, then its length
> is 10 or more.
>
> But suppose a 7 byte string is passed (length 6).
>
> That *object* is smaller than n; it does not have an "initial sequence
> of n characters" for memchr to search.
>
> ISO C doesn't say that bytes of the initial sequence which are
> beyond the sought-after value shall not be accessed by memchr.

Well, it does say that "The implementation shall behave as if it reads
the characters sequentially and stops as soon as a matching character is
found."

> For instance, for shits and giggles, memchr could perform a
> right-to-left scan, and report the most recently found, hence
> leftmost, occurrence of the value.
>
> Or it could assume it can load an 8 byte word from the start of the object
> (even if unaligned), since that lies within 10. Yet that 8 could extend
> into an unmapped page.

Only if the behaviour is consistent with the above quote, so anything
going wrong as a result of looking beyond the first occurrence is, I
think, ruled out.

--
Ben.
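To make the conclusion of the thread concrete, here is a small sketch of the bounded length test written directly on top of memchr. It assumes an implementation that honours the sentence Ben quotes (which, as far as I know, appears in the C11 description of memchr and, per the thread, is absent from C99). The helper name `length_at_least_memchr` is hypothetical and not taken from any post.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (name is an assumption, not from the thread):
   returns 1 if the string s has length n or more, 0 otherwise.
   Assumes memchr behaves as if it reads the characters sequentially
   and stops at the first match, so passing an n larger than the
   object is acceptable as long as a null byte occurs within it. */
static int length_at_least_memchr(const char *s, size_t n)
{
    /* memchr returns NULL when no null byte occurs in the first n
       bytes, which means the string is at least n characters long. */
    return memchr(s, 0, n) == NULL;
}
```

Under that assumption, calling this with a 6 character string and n = 10 is the case the original post worried about; on a pre-C11 implementation the guarantee is not spelled out, so a hand-written byte loop would be the cautious choice there.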