Groups > comp.lang.c > #83783 > unrolled thread
| Started by | boon <root@localhost> |
|---|---|
| First post | 2016-03-13 17:13 +0100 |
| Last post | 2016-03-15 22:32 +0100 |
| Articles | 19 on this page of 39 — 14 participants |
strtok() implementation boon <root@localhost> - 2016-03-13 17:13 +0100
Re: strtok() implementation Malcolm McLean <malcolm.mclean5@btinternet.com> - 2016-03-13 10:04 -0700
Re: strtok() implementation boon <root@localhost> - 2016-03-13 18:51 +0100
Re: strtok() implementation Eric Sosman <esosman@comcast-dot-net.invalid> - 2016-03-13 13:38 -0400
Re: strtok() implementation boon <root@localhost> - 2016-03-13 19:05 +0100
Re: strtok() implementation Keith Thompson <kst-u@mib.org> - 2016-03-13 13:50 -0700
Re: strtok() implementation boon <root@localhost> - 2016-03-13 23:10 +0100
Re: strtok() implementation boon <root@localhost> - 2016-03-13 21:06 +0100
Re: strtok() implementation Eric Sosman <esosman@comcast-dot-net.invalid> - 2016-03-13 16:26 -0400
Re: strtok() implementation boon <root@localhost> - 2016-03-13 22:52 +0100
Re: strtok() implementation boon <root@localhost> - 2016-03-13 23:25 +0100
Re: strtok() implementation Ian Collins <ian-news@hotmail.com> - 2016-03-14 15:26 +1300
Re: strtok() implementation boon <root@localhost.localdomain> - 2016-03-14 12:44 +0100
Re: strtok() implementation Tim Rentsch <txr@alumni.caltech.edu> - 2016-03-17 08:23 -0700
Re: strtok() implementation boon <root@localhost> - 2016-03-18 21:09 +0100
Re: strtok() implementation Tim Rentsch <txr@alumni.caltech.edu> - 2016-03-19 14:21 -0700
Re: strtok() implementation Randy Howard <rhoward.mx@EverybodyUsesIt.com> - 2016-03-19 16:25 -0500
Re: strtok() implementation boon <fred900rbc@gmail.com> - 2016-03-24 13:05 -0700
Re: strtok() implementation Tim Rentsch <txr@alumni.caltech.edu> - 2016-03-30 09:13 -0700
Re: strtok() implementation Randy Howard <rhoward.mx@EverybodyUsesIt.com> - 2016-03-30 14:44 -0500
Re: strtok() implementation boon <root@127.10.10.1> - 2016-03-31 10:24 +0200
Re: strtok() implementation Tim Rentsch <txr@alumni.caltech.edu> - 2016-04-05 12:23 -0700
Re: strtok() implementation Ian Collins <ian-news@hotmail.com> - 2016-03-14 15:31 +1300
Re: strtok() implementation boon <root@localhost> - 2016-03-14 20:13 +0100
Re: strtok() implementation Ian Collins <ian-news@hotmail.com> - 2016-03-15 09:48 +1300
Re: strtok() implementation Malcolm McLean <malcolm.mclean5@btinternet.com> - 2016-03-14 14:05 -0700
Re: strtok() implementation Ian Collins <ian-news@hotmail.com> - 2016-03-15 10:09 +1300
Re: strtok() implementation Richard Heathfield <rjh@cpax.org.uk> - 2016-03-14 22:02 +0000
Re: strtok() implementation Gareth Owen <gwowen@gmail.com> - 2016-03-14 22:16 +0000
Re: strtok() implementation Keith Thompson <kst-u@mib.org> - 2016-03-14 14:50 -0700
Re: strtok() implementation raltbos@xs4all.nl (Richard Bos) - 2016-03-14 22:06 +0000
Re: strtok() implementation boon <root@localhost> - 2016-03-15 22:14 +0100
Re: strtok() implementation BartC <bc@freeuk.com> - 2016-03-15 21:23 +0000
Re: strtok() implementation raltbos@xs4all.nl (Richard Bos) - 2016-03-17 12:27 +0000
Re: strtok() implementation boon <root@localhost> - 2016-03-15 22:04 +0100
Re: strtok() implementation Eric Sosman <esosman@comcast-dot-net.invalid> - 2016-03-15 18:18 -0400
Re: strtok() implementation boon <root@localhost> - 2016-03-18 21:19 +0100
Re: strtok() implementation boon <root@localhost> - 2016-03-15 22:08 +0100
Re: strtok() implementation boon <root@localhost> - 2016-03-15 22:32 +0100
| From | boon <root@127.10.10.1> |
|---|---|
| Date | 2016-03-31 10:24 +0200 |
| Message-ID | <ndimro$qg2$1@adenine.netfront.net> |
| In reply to | #84373 |
On 03/19/2016 10:21 PM, Tim Rentsch wrote:
> boon <root@localhost> writes:
>
>> On 03/17/2016 04:23 PM, Tim Rentsch wrote:
>>> boon <root@localhost.localdomain> writes:
>>>
>>>> On 03/14/2016 03:26 AM, Ian Collins wrote:
>>>>> On 03/14/16 11:25, boon wrote:
>>>>>> On 03/13/2016 10:52 PM, boon wrote:
>>>>>>> On 03/13/2016 09:26 PM, Eric Sosman wrote:
>>>>>>>> On 3/13/2016 4:06 PM, boon wrote:
>>>>>>>>> On 03/13/2016 05:13 PM, boon wrote:
>>
>> [...]
>>
>>> If you imagine there are available functions 'skip_over' to skip
>>> over a sequence of characters in a given set, and 'skip_to' to
>>> skip to the first occurrence of any character in a given set (or
>>> to a terminating null, whichever comes first), then strtok() may
>>> be written as follows, without ever testing the saved pointer
>>> for NULL (because that never happens):
>>
>> I see. As the reset value for saved pointer is null-terminated string,
>> there is no need to test if it is NULL, indeed.
>>
>> But you test its first element *saved for null character (if
>> (*saved))... isn't this the same logic but with different value for
>> string 'saved' pointer points to?
>
> It isn't. The test of *saved is a test for what's happening with
> the input, not (necessarily) a test for the previous value of
> 'saved'. If the argument 'input' is non-null, convince yourself
> that, just before the if() test, the value of '*saved' may be 0,
> or the values of '*result' and '*saved' may both be zero,
> depending on what 'input' points to. Note that these values may
> arise regardless of the previous value of 'saved' when 'input'
> is non-null.
>
>> I note that your implementation is safer than mine, as only non-NULL
>> pointers are used.
>>
>> I like your trick and I will remember it.
>
> I like it because, for me, it makes it easier to reason about how
> the function works.
>
>>> char *
>>> my_strtok( char *input, const char *delimiters ){
>>> static char *saved = "";
>>> char *result = skip_over( input ? input : saved, delimiters );
>>> saved = skip_to( result, delimiters );
>>>
>>> if( *saved ) return *saved++ = 0, result;
>>>
>>> return saved = "", *result ? result : NULL;
>>> }
>>>
>>> [...] I hope you find this alternate approach of interest.
>>
>> Nice implementation with recursive functions. Thank you.
>
> I'm glad you like it. Note that gcc with optimization level
> greater than -O1 will turn the recursive calls into loops.
>
>> Here is a new implementation exploiting your trick :
>>
>> char *my_strtok(char *str, const char *delim)
>> {
>> char *ret;
>> static char *saveptr = "";
>>
>> if (str)
>> saveptr = str;
>>
>> ret = saveptr += strspn(saveptr, delim);
>> saveptr += strcspn(saveptr, delim);
>>
>> if (*saveptr) return *saveptr++ = '\0', ret;
>>
>> return saveptr = "", *ret ? ret : NULL;
>> }
>
> I would be inclined to write a version like this using a
> conditional-expression assignment rather than an if() (and also
> with different variable names, but I'm using your names):
>
> char *my_strtok(char *str, const char *delim)
> {
> static char *saveptr = "";
> char *ret = str ? str : saveptr;
>
> ret = ret + strspn(ret, delim);
> saveptr = ret + strcspn(ret, delim);
>
> if (*saveptr) return *saveptr++ = '\0', ret;
>
> return saveptr = "", *ret ? ret : NULL;
> }
>
> I find this writing easier to follow than the last one.
>
I agree with you about using the conditional-expression assignment to
initialize the 'ret' variable.
But I do not feel at ease with using the comma operator within a return
statement. ;)
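For readers who share that unease, here is a minimal sketch of the same approach with the stores to the saved pointer written as ordinary statements. The helpers 'skip_over' and 'skip_to' are only described in the quoted text, not shown on this page, so the versions below are assumptions built on strspn() and strcspn() rather than the recursive functions mentioned above; 'check_my_strtok' is a hypothetical self-check added for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed helper: advance past leading characters that are in 'set'. */
static char *skip_over(char *s, const char *set)
{
    return s + strspn(s, set);
}

/* Assumed helper: advance to the first character that is in 'set', or to
   the terminating null character, whichever comes first. */
static char *skip_to(char *s, const char *set)
{
    return s + strcspn(s, set);
}

/* Same shape as the quoted my_strtok, without comma expressions. */
char *my_strtok(char *input, const char *delimiters)
{
    static char *saved = "";   /* never NULL, so no NULL test is needed */
    char *result = skip_over(input ? input : saved, delimiters);

    saved = skip_to(result, delimiters);
    if (*saved) {
        *saved++ = '\0';       /* terminate the token, resume after it */
        return result;
    }
    saved = "";                /* input exhausted: reset to the empty string */
    return *result ? result : NULL;
}

/* Hypothetical self-check: tokenizes a writable buffer. */
static void check_my_strtok(void)
{
    char buf[] = "  one two  ";

    assert(strcmp(my_strtok(buf, " "), "one") == 0);
    assert(strcmp(my_strtok(NULL, " "), "two") == 0);
    assert(my_strtok(NULL, " ") == NULL);
}
```

The string literal assigned to 'saved' is never written to: when 'saved' points at "", both helpers return it unchanged and the '*saved' test fails before any store happens.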
| From | Tim Rentsch <txr@alumni.caltech.edu> |
|---|---|
| Date | 2016-04-05 12:23 -0700 |
| Message-ID | <kfn1t6j5082.fsf@x-alumni2.alumni.caltech.edu> |
| In reply to | #85460 |
boon <root@127.10.10.1> writes:
> On 03/19/2016 10:21 PM, Tim Rentsch wrote:
[...]
>>
>> I would be inclined to write a version like this using a
>> conditional-expression assignment rather than an if() (and also
>> with different variable names, but I'm using your names):
>>
>> char *my_strtok(char *str, const char *delim)
>> {
>> static char *saveptr = "";
>> char *ret = str ? str : saveptr;
>>
>> ret = ret + strspn(ret, delim);
>> saveptr = ret + strcspn(ret, delim);
>>
>> if (*saveptr) return *saveptr++ = '\0', ret;
>>
>> return saveptr = "", *ret ? ret : NULL;
>> }
>>
>> I find this writing easier to follow than the last one.
>
> I agree with you about the conditional-expression assignment to
> initialize 'ret' variable.
>
> But I do not feel at ease with using comma operator within a return
> statement. ;)
I understand your reaction. Let me offer a perspective that the
usage shown is acceptable in this case.
For functions that have an output parameter (eg, 'int *out'),
there is a school of thought that storing those values should be
expressed in 'return' statements along with the function's
result value:
return *out = n, p;
Doing this makes it easier to see that all outputs of a
function have been given appropriate values.
In the code above, the value stored in 'saveptr' is part of the
"output state" of the function. It isn't state that is directly
visible to the caller, but it is important to the caller that the
state-saving be done. Indeed, that state-saving is part of the
specification of the function here, modeled as it is after
strtok(). Since it is part of the function's "output state", it
makes sense to include the setting of that state in the return
expression, along with the regular function return value.
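As a small illustration of the output-parameter idiom described above, the sketch below returns a pointer and stores a count through 'out' in the same return statement. The function 'find_char' and its self-check are hypothetical and not taken from the thread; only the 'return *out = n, p;' form comes from the post.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: look for 'c' in 's'. Returns a pointer to the first
   match, or NULL when there is none. '*out' receives the number of
   characters examined before stopping (the match index, or the string
   length when there is no match). */
const char *find_char(const char *s, char c, size_t *out)
{
    size_t n = 0;
    const char *p = NULL;

    for (; s[n] != '\0'; n++) {
        if (s[n] == c) {
            p = s + n;
            break;
        }
    }
    return *out = n, p;   /* store the output value, then yield the result */
}

/* Hypothetical self-check. */
static void check_find_char(void)
{
    const char *s = "hello";
    size_t n = 0;

    assert(find_char(s, 'l', &n) == s + 2 && n == 2);
    assert(find_char(s, 'z', &n) == NULL && n == 5);
}
```

With a single return statement that mentions both 'out' and 'p', it is easy to confirm at a glance that no path leaves the output parameter unset.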
| From | Ian Collins <ian-news@hotmail.com> |
|---|---|
| Date | 2016-03-14 15:31 +1300 |
| Message-ID | <dkmm42F53ugU2@mid.individual.net> |
| In reply to | #83783 |
On 03/14/16 05:13, boon wrote:
> Hello,
>
> I am writing strtok() implementation, just for the fun and to improve my
> C coding style and skills.
>
> Here is my solution.
<snip>
Here's an alternative, avoiding any library functions...
#include <stddef.h>
#include <stdbool.h>

static bool
charIn( char c, const char* delim )
{
    while( *delim )
    {
        if( *delim++ == c )
        {
            return true;
        }
    }

    return false;
}

static bool
charNotIn( char c, const char* delim )
{
    while( *delim && *delim != c )
    {
        ++delim;
    }

    return *delim == '\0';
}

char*
my_strtok( char* restrict str, const char* restrict delim )
{
    static char* last = NULL;

    if( !delim ) return NULL;
    if( !str && !last ) return NULL;

    if( !str )
    {
        str = last;
    }
    else
    {
        last = NULL;
    }

    while( *str && charIn( *str, delim) )
    {
        ++str;
    }

    if( *str )
    {
        char* start = str;

        while( *start && charNotIn(*start, delim) )
        {
            ++start;
        }

        if( *start )
        {
            *start++ = '\0';
        }

        last = start;

        return str;
    }

    return NULL;
}
--
Ian Collins
| From | boon <root@localhost> |
|---|---|
| Date | 2016-03-14 20:13 +0100 |
| Message-ID | <56e70d11$0$4562$426a74cc@news.free.fr> |
| In reply to | #83840 |
On 03/14/2016 03:31 AM, Ian Collins wrote:
> On 03/14/16 05:13, boon wrote:
>> Hello,
>>
>> I am writing strtok() implementation, just for the fun and to improve my
>> C coding style and skills.
>>
>> Here is my solution.
>
> <snip>
>
> Here's an alternative, avoiding any library functions...
>
> #include <stddef.h>
> #include <stdbool.h>
>
> static bool
> charIn( char c, const char* delim )
> {
> while( *delim )
> {
> if( *delim++ == c )
> {
> return true;
> }
> }
>
> return false;
> }
>
> static bool
> charNotIn( char c, const char* delim )
> {
> while( *delim && *delim != c )
> {
> ++delim;
> }
>
> return *delim == '\0';
> }
>
> char*
> my_strtok( char* restrict str, const char* restrict delim )
> {
> static char* last = NULL;
>
> if( !delim ) return NULL;
> if( !str && !last ) return NULL;
>
> if( !str )
> {
> str = last;
> }
> else
> {
> last = NULL;
> }
>
> while( *str && charIn( *str, delim) )
> {
> ++str;
> }
>
> if( *str )
> {
> char* start = str;
>
> while( *start && charNotIn(*start, delim) )
> {
> ++start;
> }
>
> if( *start )
> {
> *start++ = '\0';
> }
>
> last = start;
>
> return str;
> }
>
> return NULL;
> }
>
Thank you Ian. I noticed you have not used local variables (except the
ones used as formal parameters and the static 'last' variable that saves
the "parsing context"). I guess this implementation has a chance of being
faster than mine.
Furthermore, you added a check on the 'delim' parameter. This is
something I have missed again.
I will try to be more attentive in my next exercises.
Regards.
| From | Ian Collins <ian-news@hotmail.com> |
|---|---|
| Date | 2016-03-15 09:48 +1300 |
| Message-ID | <dkomcrFn02uU2@mid.individual.net> |
| In reply to | #83911 |
On 03/15/16 08:13, boon wrote:
> On 03/14/2016 03:31 AM, Ian Collins wrote:
>> On 03/14/16 05:13, boon wrote:
>>> Hello,
>>>
>>> I am writing strtok() implementation, just for the fun and to improve my
>>> C coding style and skills.
>>>
>>> Here is my solution.
>>
>> <snip>
>>
>> Here's an alternative, avoiding any library functions...
<snip>
> Thank you Ian. I noticed you have not used local variables (excepted the
> ones used as formal parameters and static 'last' variable to save the
> "parsing context"). I guess this implementation have chances to be
> faster than mines.
Gaining speed wasn't the intent, more of a case of style. I often reuse
parameter values in cases such as this. If there is any potential
performance boost it would come from having the equivalent of "strspn"
in-line.
> Furthermore you added a check on 'delim' parameter. This is something I
> have missed again.
There's a gap in the standard there, it does not specify the behaviour
when the delimiter is null. My standard library's strtok shares a crash
with your version if I run that particular test :)
--
Ian Collins
| From | Malcolm McLean <malcolm.mclean5@btinternet.com> |
|---|---|
| Date | 2016-03-14 14:05 -0700 |
| Message-ID | <b62598b0-dadc-48c2-aca0-0632ef0a6151@googlegroups.com> |
| In reply to | #83920 |
On Monday, March 14, 2016 at 8:48:38 PM UTC, Ian Collins wrote:
>
> There's a gap in the standard there, it does not specify the behaviour
> when the delimiter is null. My standard library's strtok shares a crash
> with your version if I run that particular test :)
>
I'd just match the whole string.
In the olden days an empty string and the null character pointer
were the same.
| From | Ian Collins <ian-news@hotmail.com> |
|---|---|
| Date | 2016-03-15 10:09 +1300 |
| Message-ID | <dkonksFn02uU3@mid.individual.net> |
| In reply to | #83924 |
On 03/15/16 10:05, Malcolm McLean wrote:
> On Monday, March 14, 2016 at 8:48:38 PM UTC, Ian Collins wrote:
>>
>> There's a gap in the standard there, it does not specify the behaviour
>> when the delimiter is null. My standard library's strtok shares a crash
>> with your version if I run that particular test :)
>>
> I'd just match the whole string.
> In the olden days an empty string and the null character pointer
> were the same.
Were they?
--
Ian Collins
| From | Richard Heathfield <rjh@cpax.org.uk> |
|---|---|
| Date | 2016-03-14 22:02 +0000 |
| Message-ID | <nc7c7p$juc$1@dont-email.me> |
| In reply to | #83926 |
On 14/03/16 21:09, Ian Collins wrote:
> On 03/15/16 10:05, Malcolm McLean wrote:
>> On Monday, March 14, 2016 at 8:48:38 PM UTC, Ian Collins wrote:
>>>
>>> There's a gap in the standard there, it does not specify the behaviour
>>> when the delimiter is null. My standard library's strtok shares a crash
>>> with your version if I run that particular test :)
>>>
>> I'd just match the whole string.
>> In the olden days an empty string and the null character pointer
>> were the same.
>
> Were they?
Nope. YHBM.
--
Richard Heathfield
Email: rjh at cpax dot org dot uk
"Usenet is a strange place" - dmr 29 July 1999
Sig line 4 vacant - apply within
| From | Gareth Owen <gwowen@gmail.com> |
|---|---|
| Date | 2016-03-14 22:16 +0000 |
| Message-ID | <8737rsaeni.fsf@gmail.com> |
| In reply to | #83938 |
Richard Heathfield <rjh@cpax.org.uk> writes:
> Nope. YHBM.
:)
| From | Keith Thompson <kst-u@mib.org> |
|---|---|
| Date | 2016-03-14 14:50 -0700 |
| Message-ID | <lnfuvsk9va.fsf@kst-u.example.com> |
| In reply to | #83924 |
Malcolm McLean <malcolm.mclean5@btinternet.com> writes:
> On Monday, March 14, 2016 at 8:48:38 PM UTC, Ian Collins wrote:
>>
>> There's a gap in the standard there, it does not specify the behaviour
>> when the delimiter is null. My standard library's strtok shares a crash
>> with your version if I run that particular test :)
>>
> I'd just match the whole string.
> In the olden days an empty string and the null character pointer
> were the same.
Nope.
On some systems (such as, IIRC, SunOS on 68k), a null pointer pointed to
address 0, and that location happened to be readable and contain a 0
byte. As a result, passing a null pointer to a string function often
had the same effect as passing a valid pointer to an empty string, and
some code (probably inadvertently) depended on that. A lot of latent
bugs were detected when the system was changed to protect memory page
zero.
K&R1 page 97 says:
C guarantees that no pointer that validly points at data will
contain zero ...
Therefore a null (zero) pointer does not point to valid data. An empty
string is valid data, consisting of a single null character.
--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
| From | raltbos@xs4all.nl (Richard Bos) |
|---|---|
| Date | 2016-03-14 22:06 +0000 |
| Message-ID | <56e735ca.1726078@news.xs4all.nl> |
| In reply to | #83924 |
Malcolm McLean <malcolm.mclean5@btinternet.com> wrote:
> On Monday, March 14, 2016 at 8:48:38 PM UTC, Ian Collins wrote:
> >
> > There's a gap in the standard there, it does not specify the behaviour
> > when the delimiter is null. My standard library's strtok shares a crash
> > with your version if I run that particular test :)
> >
> I'd just match the whole string.
> In the olden days an empty string and the null character pointer
> were the same.
Must've been _very_ olden days... even on the Speccy that's bollocks.
Richard
| From | boon <root@localhost> |
|---|---|
| Date | 2016-03-15 22:14 +0100 |
| Message-ID | <56e87af1$0$27816$426a34cc@news.free.fr> |
| In reply to | #83940 |
On 03/14/2016 11:06 PM, Richard Bos wrote:
> Malcolm McLean <malcolm.mclean5@btinternet.com> wrote:
>
>> On Monday, March 14, 2016 at 8:48:38 PM UTC, Ian Collins wrote:
>>>
>>> There's a gap in the standard there, it does not specify the behaviour
>>> when the delimiter is null. My standard library's strtok shares a crash
>>> with your version if I run that particular test :)
>>>
>> I'd just match the whole string.
>> In the olden days an empty string and the null character pointer
>> were the same.
>
> Must've been _very_ olden days... even on the Speccy that's bollocks.
I have made a search... did you mean the ZX Spectrum architecture
(discontinued in 1992)?
> Richard
>
| From | BartC <bc@freeuk.com> |
|---|---|
| Date | 2016-03-15 21:23 +0000 |
| Message-ID | <nc9ub9$d48$1@dont-email.me> |
| In reply to | #84054 |
On 15/03/2016 21:14, boon wrote:
> On 03/14/2016 11:06 PM, Richard Bos wrote:
>> Malcolm McLean <malcolm.mclean5@btinternet.com> wrote:
>>
>>> On Monday, March 14, 2016 at 8:48:38 PM UTC, Ian Collins wrote:
>>>>
>>>> There's a gap in the standard there, it does not specify the behaviour
>>>> when the delimiter is null. My standard library's strtok shares a crash
>>>> with your version if I run that particular test :)
>>>>
>>> I'd just match the whole string.
>>> In the olden days an empty string and the null character pointer
>>> were the same.
>>
>> Must've been _very_ olden days... even on the Speccy that's bollocks.
>
> have made a search... did you mean ZX Spectrum architecture (discontinued
> in 1992)?
The ZX used the Z80 chip. On that, a null pointer would point to address
0x0000, which is also where the program counter starts executing code.
So it would be unlikely to contain 0, necessary for *NULL to yield a
zero byte just like passing "".
Unless there is an inexplicable NOP at the start, but C can hardly rely
on that. (But being a Sinclair product, this wouldn't be surprising.)
--
Bartc
| From | raltbos@xs4all.nl (Richard Bos) |
|---|---|
| Date | 2016-03-17 12:27 +0000 |
| Message-ID | <56eaa29c.3853484@news.xs4all.nl> |
| In reply to | #84054 |
boon <root@localhost> wrote:
> On 03/14/2016 11:06 PM, Richard Bos wrote:
> > Malcolm McLean <malcolm.mclean5@btinternet.com> wrote:
> >
> >> On Monday, March 14, 2016 at 8:48:38 PM UTC, Ian Collins wrote:
> >>>
> >>> There's a gap in the standard there, it does not specify the behaviour
> >>> when the delimiter is null. My standard library's strtok shares a crash
> >>> with your version if I run that particular test :)
> >>>
> >> I'd just match the whole string.
> >> In the olden days an empty string and the null character pointer
> >> were the same.
> >
> > Must've been _very_ olden days... even on the Speccy that's bollocks.
>
> have made a search... did you mean ZX Spectrum architecture (discontinued
> in 1992)?
Yup, and more relevantly, started in 1982. First C compiler I can find,
1984. I've just tried it, and it uses all-bytes-zero as a null pointer
and duly prints the byte-code of the first interrupt call if you try to
print the string _at_ null. (It also correctly prints nothing if you
print an actual null string, of course.)
Richard
| From | boon <root@localhost> |
|---|---|
| Date | 2016-03-15 22:04 +0100 |
| Message-ID | <56e8788b$0$665$426a74cc@news.free.fr> |
| In reply to | #83920 |
On 03/14/2016 09:48 PM, Ian Collins wrote:
> On 03/15/16 08:13, boon wrote:
>> On 03/14/2016 03:31 AM, Ian Collins wrote:
>>> On 03/14/16 05:13, boon wrote:
[...]
>> Thank you Ian. I noticed you have not used local variables (excepted the
>> ones used as formal parameters and static 'last' variable to save the
>> "parsing context"). I guess this implementation have chances to be
>> faster than mines.
>
> Gaining speed wasn't the intent, more of a case of style. I often reuse
> parameter values in cases such as this. If there is any potential
> performance boost it would come from having the equivalent of "strspn"
> in-line.
Is your style the GNU style (the default style of the indent tool)?
>> Furthermore you added a check on 'delim' parameter. This is something I
>> have missed again.
>
> There's a gap in the standard there, it does not specify the behaviour
> when the delimiter is null. My standard library's strtok shares a crash
> with your version if I run that particular test :)
>
Of course, but I am quite sure that strtok(3) does not crash in such
a case.
| From | Eric Sosman <esosman@comcast-dot-net.invalid> |
|---|---|
| Date | 2016-03-15 18:18 -0400 |
| Message-ID | <nca1hr$p3l$1@dont-email.me> |
| In reply to | #84052 |
On 3/15/2016 5:04 PM, boon wrote:
> On 03/14/2016 09:48 PM, Ian Collins wrote:
>> [...]
>> There's a gap in the standard there, it does not specify the behaviour
>> when the delimiter is null. My standard library's strtok shares a crash
>> with your version if I run that particular test :)
>
> Of course, but I am quite sure that strtok() (3) does not crash in such
> a case.
The C Standard leaves the behavior "undefined," meaning that
(1) different implementations can behave differently, (2) they
need not behave sensibly, and (3) they need not even document
how they'll behave. The "gap" Ian refers to isn't an oversight,
but rather a refusal to place requirements on the consequences of
a programming error: Feed a function invalid arguments, and all
bets are off.
Reference: C Standard, section 7.1.4, paragraph 1.
--
esosman@comcast-dot-net.invalid
"Don't be afraid of work. Make work afraid of you." -- TLM
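Since the library is free to do anything with a null delimiter pointer, a caller who wants predictable behaviour has to check before calling. The following is a minimal sketch under that assumption; 'checked_strtok' and its self-check are hypothetical names, and returning NULL for a null 'delim' is this sketch's own choice, not something the C Standard requires.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: reject a null delimiter pointer up front so that
   strtok() is never called with that invalid argument. It does not guard
   against the other misuse, a first call with str == NULL. */
char *checked_strtok(char *str, const char *delim)
{
    if (delim == NULL)
        return NULL;
    return strtok(str, delim);
}

/* Hypothetical self-check. */
static void check_checked_strtok(void)
{
    char buf[] = "a b";

    assert(checked_strtok(buf, NULL) == NULL);           /* refused, no call made */
    assert(strcmp(checked_strtok(buf, " "), "a") == 0);
    assert(strcmp(checked_strtok(NULL, " "), "b") == 0);
    assert(checked_strtok(NULL, " ") == NULL);
}
```

The wrapper only moves the check to a place where its result is defined; it does not make the underlying strtok() any more tolerant.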
| From | boon <root@localhost> |
|---|---|
| Date | 2016-03-18 21:19 +0100 |
| Message-ID | <56ec627d$0$19746$426a74cc@news.free.fr> |
| In reply to | #84059 |
On 03/15/2016 11:18 PM, Eric Sosman wrote:
> On 3/15/2016 5:04 PM, boon wrote:
>> On 03/14/2016 09:48 PM, Ian Collins wrote:
>>> [...]
>>> There's a gap in the standard there, it does not specify the behaviour
>>> when the delimiter is null. My standard library's strtok shares a crash
>>> with your version if I run that particular test :)
>>
>> Of course, but I am quite sure that strtok() (3) does not crash in such
>> a case.
>
> The C Standard leaves the behavior "undefined," meaning that
> (1) different implementations can behave differently, (2) they
> need not behave sensibly, and (3) they need not even document
> how they'll behave. The "gap" Ian refers to isn't an oversight,
> but rather a refusal to place requirements on the consequences of
> a programming error: Feed a function invalid arguments, and all
> bets are off.
>
> Reference: C Standard, section 7.1.4, paragraph 1.
>
Ok. Thank you for this clarification.
The strtok in Ian's standard library is not buggy; its behavior is simply
undefined when a NULL pointer is passed as the delimiter parameter.
| From | boon <root@localhost> |
|---|---|
| Date | 2016-03-15 22:08 +0100 |
| Message-ID | <56e87999$0$3054$426a74cc@news.free.fr> |
| In reply to | #83920 |
On 03/14/2016 09:48 PM, Ian Collins wrote:
> On 03/15/16 08:13, boon wrote:
>> On 03/14/2016 03:31 AM, Ian Collins wrote:
>>> On 03/14/16 05:13, boon wrote:
[...]
> There's a gap in the standard there, it does not specify the behaviour
> when the delimiter is null. My standard library's strtok shares a crash
> with your version if I run that particular test :)
>
Well, I read too fast... which C library do you use that has a strtok()
implementation as buggy as mine? ;)
| From | boon <root@localhost> |
|---|---|
| Date | 2016-03-15 22:32 +0100 |
| Message-ID | <56e87f32$0$3328$426a74cc@news.free.fr> |
| In reply to | #83783 |
On 03/13/2016 05:13 PM, boon wrote:
> Hello,
>
> I am writing strtok() implementation, just for the fun and to improve my
> C coding style and skills.
>
> Here is my solution.
>
> char *my_strtok(char *str, const char *delim)
> {
> char *ret, *s, *min_s;
> const char *p;
> static char *saveptr, *end;
>
> if (str) {
> end = str + strlen(str);
> saveptr = str;
> }
>
> if (*saveptr == '\0') {
> saveptr = end = NULL;
> return NULL;
> }
>
> ret = saveptr;
> min_s = end;
>
> for (p = delim; *p != '\0'; p++) {
> s = strchr(saveptr, *p);
> if (s && s < min_s)
> min_s = s;
> }
>
> if (min_s < end) {
> *min_s++ = '\0';
> saveptr = min_s;
> } else {
> saveptr = end;
> }
>
> return ret;
> }
>
> I have often difficulties in variable naming.
> Please do not hesitate to criticize this implementation.
>
> Regards.
In addition, here is an strtok_r() implementation:
char *my_strtok(char *str, const char *delim, char **saveptr)
{
    char *ret;

    if (!delim || !saveptr)
        return NULL;
    if (str)
        *saveptr = str;
    /* skip leading delimiters */
    *saveptr += strspn(*saveptr, delim);
    if (**saveptr == '\0')
        return NULL;
    ret = *saveptr;
    /* move to the end of the token */
    *saveptr += strcspn(*saveptr, delim);
    /* end of string: leave *saveptr on the terminator so that the next
       call returns NULL (test the character, not the pointer) */
    if (**saveptr == '\0')
        return ret;
    *(*saveptr)++ = '\0';
    return ret;
}
It seems to be correct, but I have not written any unit tests. I should
have. I have found a nice C unit-testing library (libcmocka) which seems
interesting. It uses the ld(1) --wrap option to implement wrapper
functions for the symbols to be mocked. It may be useful for my next C
exercises. But mocking is not required here for such small functions
with no interaction with the outside world.
https://lwn.net/Articles/558106/
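For a function this small, plain assert() is enough to get started before reaching for a test library. The sketch below is one possible set of checks, not a cmocka example: the name 'my_strtok_r', the test function names, and the test cases are all assumptions made for illustration, and the tokenizer is restated in a compact form so that the block stands alone.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reentrant tokenizer in the style of strtok_r(); the name is this
   sketch's own. The first call must pass a non-NULL 'str'. */
char *my_strtok_r(char *str, const char *delim, char **saveptr)
{
    char *ret;

    if (!delim || !saveptr)
        return NULL;
    if (str)
        *saveptr = str;
    ret = *saveptr + strspn(*saveptr, delim);   /* skip leading delimiters */
    *saveptr = ret + strcspn(ret, delim);       /* find the end of the token */
    if (*ret == '\0')
        return NULL;                            /* nothing left to return */
    if (**saveptr != '\0')
        *(*saveptr)++ = '\0';                   /* terminate, resume after it */
    return ret;
}

static void test_basic(void)
{
    char buf[] = "a,b;;c";
    char *save;

    assert(strcmp(my_strtok_r(buf, ",;", &save), "a") == 0);
    assert(strcmp(my_strtok_r(NULL, ",;", &save), "b") == 0);
    assert(strcmp(my_strtok_r(NULL, ",;", &save), "c") == 0);
    assert(my_strtok_r(NULL, ",;", &save) == NULL);
    assert(my_strtok_r(NULL, ",;", &save) == NULL);   /* stays exhausted */
}

static void test_only_delimiters(void)
{
    char buf[] = ";;;";
    char *save;

    assert(my_strtok_r(buf, ";", &save) == NULL);
}

/* Two strings tokenized in alternation, which strtok() cannot do. */
static void test_interleaved(void)
{
    char a[] = "x y", b[] = "1 2";
    char *sa, *sb;

    assert(strcmp(my_strtok_r(a, " ", &sa), "x") == 0);
    assert(strcmp(my_strtok_r(b, " ", &sb), "1") == 0);
    assert(strcmp(my_strtok_r(NULL, " ", &sa), "y") == 0);
    assert(strcmp(my_strtok_r(NULL, " ", &sb), "2") == 0);
}

static void test_null_arguments(void)
{
    char buf[] = "a";
    char *save;

    assert(my_strtok_r(buf, NULL, &save) == NULL);
    assert(my_strtok_r(buf, " ", NULL) == NULL);
}

void run_tests(void)
{
    test_basic();
    test_only_delimiters();
    test_interleaved();
    test_null_arguments();
}
```

Calling run_tests() from a small main() is enough to exercise it; the "stays exhausted" case in test_basic is the one that catches a tokenizer that walks past the terminator after the last token.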