Groups > comp.lang.c > #123231 > unrolled thread

NULL as the empty string

Started by: jacobnavia <jacob@jacob.remcomp.fr>
First post: 2017-11-21 23:52 +0100
Last post: 2017-12-15 09:18 -0800
Articles: 11 on this page of 91 (21 participants)


Contents

  NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-11-21 23:52 +0100
    Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-11-21 15:16 -0800
      Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-11-22 00:38 +0100
        Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-11-21 16:02 -0800
          Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-11-22 01:13 +0100
            Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-11-21 16:52 -0800
        Re: NULL as the empty string Robert Wessel <robertwessel2@yahoo.com> - 2017-11-21 18:09 -0600
        Re: NULL as the empty string Siri Cruise <chine.bleu@yahoo.com> - 2017-11-21 16:34 -0800
        Re: NULL as the empty string David Brown <david.brown@hesbynett.no> - 2017-11-22 12:12 +0100
      Re: NULL as the empty string supercat@casperkitty.com - 2017-11-21 15:57 -0800
        Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-11-22 01:06 +0100
          Re: NULL as the empty string supercat@casperkitty.com - 2017-11-22 15:42 -0800
            Re: NULL as the empty string Melzzzzz <Melzzzzz@zzzzz.com> - 2017-11-22 23:49 +0000
              Re: NULL as the empty string supercat@casperkitty.com - 2017-11-22 15:56 -0800
                Re: NULL as the empty string Melzzzzz <Melzzzzz@zzzzz.com> - 2017-11-23 00:06 +0000
                  Re: NULL as the empty string supercat@casperkitty.com - 2017-11-23 17:31 -0800
                    Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-11-24 09:42 +0100
                      Re: NULL as the empty string supercat@casperkitty.com - 2017-11-24 13:47 -0800
      Re: NULL as the empty string Jorgen Grahn <grahn+nntp@snipabacken.se> - 2017-11-22 06:46 +0000
      Re: NULL as the empty string John Bode <jfbode1029@gmail.com> - 2017-12-08 10:27 -0800
        Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-12-08 11:11 -0800
          Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-12-08 21:39 +0100
            Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-12-08 13:03 -0800
              Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-12-08 22:50 +0100
                Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-12-08 15:19 -0800
                  Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-12-09 00:35 +0100
                    Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-12-08 16:05 -0800
                      Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-12-09 01:22 +0100
                        Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-12-08 17:39 -0800
                        Re: NULL as the empty string John Bode <jfbode1029@gmail.com> - 2017-12-11 12:22 -0800
                      Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-12-09 01:29 +0100
                        Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-12-08 17:47 -0800
                          Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-12-09 07:05 +0100
                            Re: NULL as the empty string David Brown <david.brown@hesbynett.no> - 2017-12-09 18:37 +0100
                            Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-12-09 11:53 -0800
                              Re: NULL as the empty string supercat@casperkitty.com - 2017-12-12 10:49 -0800
                                Re: NULL as the empty string Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2017-12-12 13:39 -0800
                                  Re: NULL as the empty string supercat@casperkitty.com - 2017-12-12 16:05 -0800
                                    Re: NULL as the empty string Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2017-12-13 03:43 -0800
                                      Re: NULL as the empty string supercat@casperkitty.com - 2017-12-13 08:45 -0800
                                      Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-12-13 09:12 -0800
                                        Re: NULL as the empty string Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2017-12-13 13:27 -0800
                                          Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-12-13 14:02 -0800
                                            Re: NULL as the empty string asetofsymbols@gmail.com - 2017-12-13 14:58 -0800
                                              Re: NULL as the empty string asetofsymbols@gmail.com - 2017-12-13 15:11 -0800
                                            Re: NULL as the empty string Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2017-12-14 03:49 -0800
                                              Re: NULL as the empty string mark.bluemel@gmail.com - 2017-12-14 04:05 -0800
                                              Re: NULL as the empty string David Brown <david.brown@hesbynett.no> - 2017-12-14 13:09 +0100
                                                Re: NULL as the empty string Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2017-12-14 05:02 -0800
                                                  Re: NULL as the empty string David Brown <david.brown@hesbynett.no> - 2017-12-14 14:54 +0100
                                                Re: NULL as the empty string supercat@casperkitty.com - 2017-12-14 07:38 -0800
                                                  Re: NULL as the empty string Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2017-12-14 09:50 -0800
                                              Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-12-14 09:20 -0800
                                                Re: NULL as the empty string supercat@casperkitty.com - 2017-12-14 09:53 -0800
                                                  Re: NULL as the empty string Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2017-12-14 12:57 -0800
                                        Re: NULL as the empty string herrmannsfeldt@gmail.com - 2017-12-14 17:22 -0800
                                          Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-12-14 17:26 -0800
        Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-12-08 21:23 +0100
          Re: NULL as the empty string Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2017-12-08 13:41 -0800
            Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-12-08 22:54 +0100
    Re: NULL as the empty string supercat@casperkitty.com - 2017-11-21 15:17 -0800
      Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-11-22 00:26 +0100
        Re: NULL as the empty string supercat@casperkitty.com - 2017-11-21 16:03 -0800
    Re: NULL as the empty string "Pascal J. Bourguignon" <pjb@informatimago.com> - 2017-11-22 00:27 +0100
      Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-11-22 00:42 +0100
        Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-11-21 16:05 -0800
          Re: NULL as the empty string herrmannsfeldt@gmail.com - 2017-12-06 22:33 -0800
            Re: NULL as the empty string supercat@casperkitty.com - 2017-12-07 12:04 -0800
              Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-12-07 23:20 +0100
                Re: NULL as the empty string supercat@casperkitty.com - 2017-12-07 15:04 -0800
    Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-11-21 15:28 -0800
    Re: NULL as the empty string Thiago Adams <thiago.adams@gmail.com> - 2017-11-21 16:04 -0800
    Re: NULL as the empty string Siri Cruise <chine.bleu@yahoo.com> - 2017-11-21 16:25 -0800
      Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-11-22 01:34 +0100
    Re: NULL as the empty string bartc <bc@freeuk.com> - 2017-11-22 00:36 +0000
    Re: NULL as the empty string Öö Tiib <ootiib@hot.ee> - 2017-11-21 23:07 -0800
    NULL as the empty string asetofsymbols@gmail.com - 2017-11-23 22:23 -0800
    Re: NULL as the empty string Geoff <geoff@invalid.invalid> - 2017-12-09 09:05 -0800
      Re: NULL as the empty string Lew Pitcher <lew.pitcher@digitalfreehold.ca> - 2017-12-09 12:40 -0500
    Re: NULL as the empty string gordonb.yj0bc@burditt.org (Gordon Burditt) - 2017-12-09 13:50 -0600
    Re: NULL as the empty string Ian Collins <ian-news@hotmail.com> - 2017-12-10 08:59 +1300
      Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-12-09 12:22 -0800
        Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-12-11 01:42 +0100
          Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-12-10 19:20 -0800
            Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-12-11 18:56 +0100
              Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-12-11 11:19 -0800
          Re: NULL as the empty string supercat@casperkitty.com - 2017-12-15 09:29 -0800
            Re: NULL as the empty string Thiago Adams <thiago.adams@gmail.com> - 2018-01-05 08:28 -0800
              Re: NULL as the empty string supercat@casperkitty.com - 2018-01-05 09:37 -0800
                Re: NULL as the empty string Thiago Adams <thiago.adams@gmail.com> - 2018-01-05 17:08 -0800
      Re: NULL as the empty string supercat@casperkitty.com - 2017-12-15 09:18 -0800


#124071

From: Ian Collins <ian-news@hotmail.com>
Date: 2017-12-10 08:59 +1300
Message-ID: <f92tlaFp91dU8@mid.individual.net>
In reply to: #123231

On 11/22/2017 11:52 AM, jacobnavia wrote:
> What would happen if we decide to give meaning to NULL?
> 
> Assume that in all places of the string library (strlen, strcmp, etc) a
> NULL was the equivalent of the string "\0"
> 
> strlen would return zero, etc. This way, we would save terabytes of "\0"
> strings since an empty string would not require any storage, since it is
> empty of course!

How would you distinguish between an error condition, returning null and 
a normal return of an empty string?
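
To make the question concrete, here is a minimal sketch of an API that depends on the distinction. The function `lookup_value` and its table are hypothetical names made up for illustration; nothing like them appears in the thread. A null return means "key not found", while "" means "found, but empty".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical key/value table, for illustration only. */
struct entry { const char *key; const char *value; };

static const struct entry table[] = {
    { "host",    "example" },
    { "comment", ""        },   /* present, but empty */
};

/* Returns the value stored for key, or a null pointer if key is absent.
   Callers tell "absent" from "empty" by comparing the result to NULL. */
const char *lookup_value(const char *key)
{
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcmp(table[i].key, key) == 0)
            return table[i].value;
    }
    return NULL;
}
```

A caller that checks the result against NULL keeps working under either model. A caller that passes the result straight to a null-tolerant strlen() would see length 0 in both cases and lose the "absent" signal.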

-- 
Ian.


#124072

From: Keith Thompson <kst-u@mib.org>
Date: 2017-12-09 12:22 -0800
Message-ID: <lny3mbr6rv.fsf@kst-u.example.com>
In reply to: #124071

Ian Collins <ian-news@hotmail.com> writes:
> On 11/22/2017 11:52 AM, jacobnavia wrote:
>> What would happen if we decide to give meaning to NULL?
>> 
>> Assume that in all places of the string library (strlen, strcmp, etc) a
>> NULL was the equivalent of the string "\0"
>> 
>> strlen would return zero, etc. This way, we would save terabytes of "\0"
>> strings since an empty string would not require any storage, since it is
>> empty of course!
>
> How would you distinguish between an error condition, returning null and 
> a normal return of an empty string?

Well, you could still compare the result for equality to NULL.

But if the standard were changed to treat a null pointer as if it
were a pointer to an empty string, then non-standard functions
would likely return a null pointer to denote an empty string.
For example, strdup() (which is POSIX, not ISO C) might return
a null pointer given a null pointer argument -- or even given a
non-null pointer to an empty string.

A related problem with this change is that new code would be written
with the assumption that strlen(NULL)==0.  Such code would break
when compiled with many existing implementations that conform to
C11 or earlier.  It might be worthwhile *if* there were enough
advantages to the change, but IMHO there are not.

-- 
Keith Thompson (The_Other_Keith) kst-u@mib.org  <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something.  This is something.  Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"


#124118

From: jacobnavia <jacob@jacob.remcomp.fr>
Date: 2017-12-11 01:42 +0100
Message-ID: <p0kk96$h7s$1@dont-email.me>
In reply to: #124072

Le 09/12/2017 à 21:22, Keith Thompson a écrit :
> But if the standard were changed to treat a null pointer as if it
> were a pointer to an empty string, then non-standard functions
> would likely return a null pointer to denote an empty string.
> For example, strdup() (which is POSIX, not ISO C) might return
> a null pointer given a null pointer argument -- or even given a
> non-null pointer to an empty string.

Truth table:

strdup(str) --> NULL or a malloced string.
strdup(NULL) --> NULL.

Everything would stay the same.

Unless you WANT to crash within strdup and not when the NULL pointer 
result is dereferenced. In that case (as in the others)

char *TrappingStrdup(char *str)
{
	if (*str == 0)
		;
	return strdup(str);
}

/* Defined after the function so the call above still reaches the real strdup */
#define strdup TrappingStrdup


#124120

From: Keith Thompson <kst-u@mib.org>
Date: 2017-12-10 19:20 -0800
Message-ID: <lnbmj6q7an.fsf@kst-u.example.com>
In reply to: #124118

jacobnavia <jacob@jacob.remcomp.fr> writes:
> Le 09/12/2017 à 21:22, Keith Thompson a écrit :
>> But if the standard were changed to treat a null pointer as if it
>> were a pointer to an empty string, then non-standard functions
>> would likely return a null pointer to denote an empty string.
>> For example, strdup() (which is POSIX, not ISO C) might return
>> a null pointer given a null pointer argument -- or even given a
>> non-null pointer to an empty string.
>
> Truth table:
>
> strdup(str) --> NULL or a malloced string.
> strdup(NULL) --> NULL.
>
> Everything would stay the same.

(That's not a "truth table".  The entries in a truth table are true or
false.)

Would strdup(NULL) returning a malloced empty string be non-conforming?

> Unless you WANT to crash within strdup and not when the NULL pointer 
> result is dereferenced. In that case (as in the others)
>
> #define strdup TrappingStrdup
>
> char *TrappingStrdup(char *str)
> {
> 	if (*str == 0)
> 		;
> 	return strdup(str);
> }

Or I can just use strdup() as it's currently implemented (though *str
has undefined behavior and is not guaranteed to trap).  For that matter,
since evaluating *str has undefined behavior, a compiler could assume
that str is non-null thereafter.

But that raises an interesting question.  In C as it's currently
defined, if s is a pointer to an empty string:

    const char *s = "";

then *s and s[0] both yield '\0', and more generally, if s is a pointer
to a string then s[strlen(s)] == '\0'.  With your proposed change, that
guarantee would be broken.  s could be (treated as) a pointer to an
empty string, but if s==NULL then *s and s[0] would have undefined
behavior (and would likely trap).

If I want to traverse a string, I can write:

    for (ptr = s; *ptr != '\0'; ptr++) {
        do_something_with(*ptr);
    }

and it will work if ptr points to a string.  Under your proposal,
ptr==NULL would make it a valid pointer to a string, but the above
code would break.

Your change would not affect existing programs that (correctly) assume a
null pointer doesn't point to a string, but such code could be broken if
it's called by new code that relies on your new definitions.
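
As a rough sketch of what the traversal would have to look like if a null pointer counted as a pointer to an empty string: the function name `visit_chars` is hypothetical, and counting characters stands in for do_something_with(). The null-is-empty convention here is the proposal's, not current standard C.

```c
#include <stddef.h>

/* Visits every character of s and returns how many were visited.
   A null s is treated as an empty string; that is the proposal's
   convention and an assumption of this sketch. */
size_t visit_chars(const char *s)
{
    size_t visited = 0;
    if (s == NULL)              /* extra guard every traversal would need */
        return 0;
    for (const char *ptr = s; *ptr != '\0'; ptr++)
        visited++;              /* stand-in for do_something_with(*ptr) */
    return visited;
}
```

Existing loops lack the guard, which is why they would break when handed a null pointer by code written to the new model.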

-- 
Keith Thompson (The_Other_Keith) kst-u@mib.org  <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something.  This is something.  Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"


#124150

From: jacobnavia <jacob@jacob.remcomp.fr>
Date: 2017-12-11 18:56 +0100
Message-ID: <p0mgs5$bbv$1@dont-email.me>
In reply to: #124120

Le 11/12/2017 à 04:20, Keith Thompson a écrit :
> jacobnavia <jacob@jacob.remcomp.fr> writes:
>> Le 09/12/2017 à 21:22, Keith Thompson a écrit :
>>> But if the standard were changed to treat a null pointer as if it
>>> were a pointer to an empty string, then non-standard functions
>>> would likely return a null pointer to denote an empty string.
>>> For example, strdup() (which is POSIX, not ISO C) might return
>>> a null pointer given a null pointer argument -- or even given a
>>> non-null pointer to an empty string.
>>
>> Truth table:
>>
>> strdup(str) --> NULL or a malloced string.
>> strdup(NULL) --> NULL.
>>
>> Everything would stay the same.
> 
> (That's not a "truth table".  The entries in a truth table are true or
> false.)
> 

For binary logic yes. For use as an exhaustive description of several 
alternatives no.

> Would strdup(NULL) returning a malloced empty string be non-conforming?
> 

Yes. No storage has been given to duplicate, so strdup shouldn't allocate
any space.

>> Unless you WANT to crash within strdup and not when the NULL pointer
>> result is dereferenced. In that case (as in the others)
>>
>> #define strdup TrappingStrdup
>>
>> char *TrappingStrdup(char *str)
>> {
>> 	if (*str == 0)
>> 		;
>> 	return strdup(str);
>> }
> 
> Or I can just use strdup() as it's currently implemented (though *str
> has undefined behavior and is not guaranteed to trap).  For that matter,
> since evaluating *str has undefined behavior, a compiler could assume
> that str is non-null thereafter.
> 

That's a correct assumption. If "str" is NULL it will trap right there.

> But that raises an interesting question.  In C as it's currently
> defined, if s is a pointer to an empty string:
> 
>      const char *s = "";
> 
> then *s and s[0] both yield '\0', and more generally, if s is a pointer
> to a string then s[strlen(s)] == '\0'.  With your proposed change, that
> guarantee would be broken.  s could be (treated as) a pointer to an
> empty string, but if s==NULL then *s and s[0] would have undefined
> behavior (and would likely trap).
> 

Yes. Note that NULL would be a *representation* of the empty string, not 
an empty string. This is a bogus issue since none of the str* functions 
would return NULL when given actual strings. The whole behavior of the 
string functions would NOT change. They would have a defined behavior 
for NULL arguments, that's all.
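
A minimal sketch of the behavior described here, written as user-level wrappers rather than as changes to the library itself. The names `or_empty`, `nstrlen` and `nstrcmp` are hypothetical; the standard functions are left untouched and are only ever called with non-null arguments.

```c
#include <stddef.h>
#include <string.h>

/* Map a null pointer to a real empty string; pass anything else through. */
static const char *or_empty(const char *s)
{
    return s != NULL ? s : "";
}

/* strlen with a defined result (0) for a null argument. */
size_t nstrlen(const char *s)
{
    return strlen(or_empty(s));
}

/* strcmp in which a null argument compares equal to "". */
int nstrcmp(const char *a, const char *b)
{
    return strcmp(or_empty(a), or_empty(b));
}
```

With these, nstrlen(NULL) is 0 and nstrcmp(NULL, "") is 0 on any conforming implementation, without relying on what strlen(NULL) happens to do.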


> If I want to traverse a string, I can write:
> 
>      for (ptr = s; *ptr != '\0'; ptr++) {
>          do_something_with(*ptr);
>      }
> 
> and it will work if ptr points to a string.  Under your proposal,
> ptr==NULL would make it a valid pointer to a string, but the above
> code would break.
> 

Yes, you can't "traverse" the representation of an empty string. Note 
that if "s" is NULL your code will crash now anyway.


> Your change would not affect existing programs that (correctly) assume a
> null pointer doesn't point to a string, but such code could be broken if
> it's called by new code that relies on your new definitions.
> 

Not at all. Nothing forces you to use NULL as a representation of an
empty string. And if you rely on strlen(NULL) crashing, you rely on
undefined behavior.

The main advantage I see in this would be that the crashes wouldn't be 
somewhere in the C library but in your code. Most C libraries do not 
ship with debugging information, and a crash somewhere in a shared 
object/dll is worse than a crash in your code.


#124170

From: Keith Thompson <kst-u@mib.org>
Date: 2017-12-11 11:19 -0800
Message-ID: <lnvahdoyw7.fsf@kst-u.example.com>
In reply to: #124150

jacobnavia <jacob@jacob.remcomp.fr> writes:
> Le 11/12/2017 à 04:20, Keith Thompson a écrit :
>> jacobnavia <jacob@jacob.remcomp.fr> writes:
>>> Le 09/12/2017 à 21:22, Keith Thompson a écrit :
>>>> But if the standard were changed to treat a null pointer as if it
>>>> were a pointer to an empty string, then non-standard functions
>>>> would likely return a null pointer to denote an empty string.
>>>> For example, strdup() (which is POSIX, not ISO C) might return
>>>> a null pointer given a null pointer argument -- or even given a
>>>> non-null pointer to an empty string.
>>>
>>> Truth table:
>>>
>>> strdup(str) --> NULL or a malloced string.
>>> strdup(NULL) --> NULL.
>>>
>>> Everything would stay the same.
>> 
>> (That's not a "truth table".  The entries in a truth table are true or
>> false.)
>
> For binary logic yes. For use as an exhaustive description of several 
> alternatives no.

This is a minor point, but I stand by my statement that what you
presented above is not a "truth table".  If you want to keep calling it
that I can't stop you.

>> Would strdup(NULL) returning a malloced empty string be non-conforming?
>
> Yes. No storage has been given to duplicate, so strdup shouldn't allocate
> any space

You've said that a null pointer (which you continue to insist on calling
"NULL", which is the name of a macro) is a pointer to an empty string.
If strdup(NULL) returning a non-null pointer to a malloced empty string
is going to be non-conforming, there would have to be specific wording
to that effect.  Note that this would affect the POSIX standard; the
side effects of your proposed change are not limited to ISO C.

>>> Unless you WANT to crash within strdup and not when the NULL pointer
>>> result is dereferenced. In that case (as in the others)
>>>
>>> #define strdup TrappingStrdup
>>>
>>> char *TrappingStrdup(char *str)
>>> {
>>> 	if (*str == 0)
>>> 		;
>>> 	return strdup(str);
>>> }
>> 
>> Or I can just use strdup() as it's currently implemented (though *str
>> has undefined behavior and is not guaranteed to trap).  For that matter,
>> since evaluating *str has undefined behavior, a compiler could assume
>> that str is non-null thereafter.
>
> That's a correct assumption. If "str" is NULL it will trap right there.

Who says it will trap?  As we've discussed here recently (in fact I
think it's what triggered this discussion), there have been systems on
which a null pointer points to memory address 0, and that address
contained a readable zero byte.  If str==NULL, then (*str == 0) on such
a system could yield 1 (true).  This is, as you know, the nature of
undefined behavior.
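
If the goal is a guaranteed stop on a null argument, one portable option is to test the pointer rather than dereference it. This is only a sketch; `checked_dup` is a hypothetical name, and it uses malloc directly so that it does not depend on strdup() being available.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Duplicate s into malloc'ed storage; terminate the program if s is null.
   Unlike evaluating *s, the comparison s == NULL is always well defined. */
char *checked_dup(const char *s)
{
    if (s == NULL) {
        fputs("checked_dup: null argument\n", stderr);
        abort();
    }
    size_t n = strlen(s) + 1;       /* include the terminating null character */
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;                    /* null here means allocation failed */
}
```

Because the check is an ordinary comparison, it behaves the same on systems where address 0 is readable.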

>> But that raises an interesting question.  In C as it's currently
>> defined, if s is a pointer to an empty string:
>> 
>>      const char *s = "";
>> 
>> then *s and s[0] both yield '\0', and more generally, if s is a pointer
>> to a string then s[strlen(s)] == '\0'.  With your proposed change, that
>> guarantee would be broken.  s could be (treated as) a pointer to an
>> empty string, but if s==NULL then *s and s[0] would have undefined
>> behavior (and would likely trap).
>
> Yes. Note that NULL would be a *representation* of the empty string, not 
> an empty string.

A null pointer would be treated as a pointer to an empty string.  Of
course a pointer isn't a string.

>                  This is a bogus issue since none of the str* functions 
> would return NULL when given actual strings. The whole behavior of the 
> string functions would NOT change. They would have a defined behavior 
> for NULL arguments, that's all.

I don't believe there are any standard library functions that return a
pointer to a string that didn't already exist before the function was
called.  strdup() was excluded from the C standard because it does that.

The defined behavior of the standard string functions would not change.
I'm talking about string functions not defined by the standard.

>> If I want to traverse a string, I can write:
>> 
>>      for (ptr = s; *ptr != '\0'; ptr++) {
>>          do_something_with(*ptr);
>>      }
>> 
>> and it will work if ptr points to a string.  Under your proposal,
>> ptr==NULL would make it a valid pointer to a string, but the above
>> code would break.
>
> Yes, you can't "traverse" the representation of an empty string. Note 
> that if "s" is NULL your code will crash now anyway.

An empty string, in C as it's currently defined, is a sequence
consisting of a single null character, and I certainly can traverse it.

You're not just proposing a change to the implementation of the C
standard library.  You're proposing a change in the definition of
"pointer to a string" that will require updating existing code.

Say I provide this function in a library:

    size_t sum_str(const char *s) {
        size_t sum = 0;
        while (*s != '\0') {
            sum += (unsigned char)*s;
            s ++;
        }
        return sum;
    }

Currently, sum_str(NULL) has undefined behavior.  Let's assume
your proposal is adopted in a future C standard.  Programmers will
reasonably assume that a null pointer is a valid pointer to an
empty string -- but sum_str(NULL) will still have undefined behavior.

If all such functions aren't updated (and they won't be), then we'll
have two different models, one in which a null pointer is not a valid
pointer to a string, and one in which it's a valid pointer to an empty
string.  Mixing code using those two different models will inevitably
lead to bugs that wouldn't have occurred otherwise.
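
To illustrate the boundary between the two models, here is a sketch of the adapter a caller would have to put in front of an unmodified library function. `sum_str` is the function from above; `sum_str_nullable` is a hypothetical name for the wrapper.

```c
#include <stddef.h>

/* The library function from the post, unchanged: s must point to a string. */
size_t sum_str(const char *s)
{
    size_t sum = 0;
    while (*s != '\0') {
        sum += (unsigned char)*s;
        s++;
    }
    return sum;
}

/* Hypothetical adapter for callers that use the null-is-empty model. */
size_t sum_str_nullable(const char *s)
{
    return s != NULL ? sum_str(s) : 0;
}
```

Every call that crosses from "null is empty" code into "null is invalid" code needs a guard like this, and a forgotten one is exactly the kind of bug described above.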

[...]

If you want to update lcc-win's library implementation so it checks for
null pointers and treats them as pointers to empty strings, I have no
particular objection.  I do oppose making such a change to the C
standard.

-- 
Keith Thompson (The_Other_Keith) kst-u@mib.org  <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something.  This is something.  Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"


#124360

From: supercat@casperkitty.com
Date: 2017-12-15 09:29 -0800
Message-ID: <95500bf2-ad4e-4ad4-9dd6-41c0f0807f05@googlegroups.com>
In reply to: #124118

On Sunday, December 10, 2017 at 6:42:22 PM UTC-6, jacobnavia wrote:
> strdup(str) --> NULL or a malloced string.
> strdup(NULL) --> NULL.

Better would be to either have a "mode" parameter to control corner-case
behaviors (including running out of memory) or have separate functions for
different usage cases.  Some code would need a function that returns a
pointer to a single-byte allocation holding a zero when given any kind of
empty string, and some would benefit from a function that returns null
when given any kind of empty string.

If one didn't have to support existing client code which expects to call
"free" on things returned from strdup, btw, an even better approach may
be to do something like:

    const char _Static_empty_string[] = ""; // Exported symbol

    char *strnacpy(char const *dat, size_t len)
    {
      // Empty results share one static object instead of allocating
      if (!len) return (char *)_Static_empty_string;
      char *ret = malloc(len+1);
      if (!ret)
      {
        raise(...out of memory);
        exit(EXIT_FAILURE);
      }
      memcpy(ret, dat, len);
      ret[len]=0;
      return ret;
    }

    void strfree(char *p)
    {
      // The shared empty string was never allocated, so never free it
      if (p != _Static_empty_string)
        free(p);
    }

    char *strdup(char const *st)
    {
      return strnacpy(st, strlen(st));
    }

The cost of comparisons to _Static_empty_string may be slightly greater
than that of comparisons to null.  However, implementations that can
guarantee it works could be allowed to define _Static_empty_string as
((char const*)0); that would provide the advantages of using NULL on
platforms which offer such a guarantee, while remaining compatible with
those that don't.  The only problem (and it's a big one) is that it would
require client code to use strfree() rather than free().
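
A compilable sketch of the same idea, under a few stated assumptions: the names `empty_text`, `text_copy_n`, `text_dup` and `text_free` are hypothetical (chosen to avoid reserved `str...` identifiers), and allocation failure is reported by returning a null pointer instead of raising and exiting.

```c
#include <stdlib.h>
#include <string.h>

/* Shared, never-freed empty string. It is writable storage so that a
   plain char * may point at it, but callers must not modify it. */
static char empty_text[1] = "";

/* Copy len bytes from dat into a new string. Empty results share
   empty_text instead of allocating. Returns a null pointer if the
   allocation fails (an assumption of this sketch). */
char *text_copy_n(const char *dat, size_t len)
{
    if (len == 0)
        return empty_text;
    char *ret = malloc(len + 1);
    if (ret == NULL)
        return NULL;
    memcpy(ret, dat, len);
    ret[len] = '\0';
    return ret;
}

char *text_dup(const char *st)
{
    return text_copy_n(st, strlen(st));
}

/* Must be used instead of free() on anything returned above. */
void text_free(char *p)
{
    if (p != empty_text)
        free(p);
}
```

Every empty result is the same pointer, so no storage is spent on empty strings, yet dereferencing the result is still valid. The price is the one already noted: passing the shared pointer to plain free() would be undefined behavior.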


#125107

From: Thiago Adams <thiago.adams@gmail.com>
Date: 2018-01-05 08:28 -0800
Message-ID: <b4553592-89a5-4d22-8183-9148b595378e@googlegroups.com>
In reply to: #124360

On Friday, December 15, 2017 at 3:30:02 PM UTC-2, supe...@casperkitty.com wrote:
> On Sunday, December 10, 2017 at 6:42:22 PM UTC-6, jacobnavia wrote:
> > strdup(str) --> NULL or a malloced string.
> > strdup(NULL) --> NULL.
> 
> Better would be to either have a "mode" parameter to control corner-case
> behaviors (including running out of memory) or have separate functions for
> different usage cases.  Some code would need a function that returns a
> pointer to a single-byte allocation holding a zero when given any kind of
> empty string, and some would benefit from a function that returns null
> when given any kind of empty string.
> 
> If one didn't have to support existing client code which expects to call
> "free" on things returned from strdup, btw, an even better approach may
> be to do something like:
> 
>     const char _Static_empty_string[] = ""; // Exported symbol
> 
>     char *strnacpy(char const *dat, size_t len)
>     {
>       if (!len) return (char *)_Static_empty_string;
>       char *ret = malloc(len+1);
>       if (!ret)
>       {
>         raise(...out of memory);
>         exit(EXIT_FAILURE);
>       }
>       memcpy(ret, dat, len);
>       ret[len]=0;
>       return ret;
>     }
> 
>     void strfree(char *p)
>     {
>       if (p != _Static_empty_string)
>         free(p);
>     }
> 
>     char *strdup(char const *st)
>     {
>       return strnacpy(st, strlen(st));
>     }
> 
> The cost of comparisons to _Static_empty_string may be slightly greater
> than that of comparisons to null.  However, implementations that can
> guarantee it works could be allowed to define _Static_empty_string as
> ((char const*)0); that would provide the advantages of using NULL on
> platforms which offer such a guarantee, while remaining compatible with
> those that don't.  The only problem (and it's a big one) is that it would
> require client code to use strfree() rather than free().

My suggestion (not only for char* but for any pointer) is to have
a type modifier that changes the type to "may be null" or
"not null".
I wish the default were "not null", but that would break
compatibility. Maybe we could have compiler flags to choose
the default mode.

How does it work? Assume the default is "not null":

void strlen(char *psz); //psz not null by default

void strlen2(char * _null psz); //psz can be null

int main()
{
  char *psz1 = "a";
  strlen(psz1); //ok
  strlen2(psz1); //ok

  char * _null psz2 = NULL;
  strlen(psz2); //error
  strlen2(psz2); //ok

  char * _null  psz3 = NULL;
  strlen(psz3); //error
  strlen2(psz3); //ok

  ...
  if (psz3)
  {
    strlen(psz3); //ok
  }

}

When the static analysis cannot determine if the 
"maybe null" is not null then the programmer can do 
a cast.



char * _null psz3 = NULL;
  ...
  ...
  if (a && b && c && !d && g())
  {
    //I know that this psz3 is not null here..
    strlen((char*) psz3); //ok
  }

This makes the code safe and fast.
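
The `_null` modifier above is a proposal, not existing C. As a rough approximation available today, GCC-compatible compilers accept a `nonnull` function attribute; it annotates parameters rather than types, and it only lets the compiler warn about obvious cases. The macro `ARGS_NONNULL` and the function names below are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#if defined(__GNUC__)            /* GCC-compatible compilers */
#define ARGS_NONNULL __attribute__((nonnull))
#else
#define ARGS_NONNULL             /* no checking on other compilers */
#endif

/* All pointer arguments are declared non-null; compilers that support
   the attribute can warn when a null constant is passed. */
size_t len_required(const char *psz) ARGS_NONNULL;

size_t len_required(const char *psz)
{
    return strlen(psz);
}

/* The argument may be null; after the check it is safe to pass on. */
size_t len_optional(const char *psz)
{
    return psz != NULL ? len_required(psz) : 0;
}
```

This plays the role of the `if (psz3)` narrowing in the example above, except that the programmer writes the check and the compiler does not track it through the type system.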


#125117

From: supercat@casperkitty.com
Date: 2018-01-05 09:37 -0800
Message-ID: <7d0366d5-c83e-469a-a2b7-a63274835480@googlegroups.com>
In reply to: #125107

On Friday, January 5, 2018 at 10:28:49 AM UTC-6, Thiago Adams wrote:
> My suggestion (not only for char* but for any pointer) is to have
> a type modifier that changes the type to "may be null" or
> "not null".
> I wish the default were "not null", but that would break
> compatibility. Maybe we could have compiler flags to choose
> the default mode.

Even in strongly-typed languages, arrays of non-nullable types pose a
problem.  In many cases, arrays will need to be populated non-sequentially,
and proving that all elements of an array get written may require solving
the Halting Problem.  If there isn't any meaningful value that could be
used to pre-populate an array of pointers, filling it with null is apt
to be better than having code fill in some other meaningless value.

The biggest problem with null is that processor instruction sets don't
include forms of base-plus-displacement addressing modes that trap when
the base is zero (whether or not the displacement is), nor do they include
'compute pointer displacement' instructions which trap when the base is
zero and the displacement *isn't*.  The biggest reason that null pointers
lead to data corruption is that while some platforms may be able to cheaply
and reliably trap `*p = 123;` when `p` is null (and implementations on such
platforms will often do so), reliably trapping a sequence like
`p += 123456; ... *p = 123;` is generally much more expensive (and so many
implementations don't).

Note that specifying that adding or subtracting zero to/from a null pointer
yields a null pointer, and that subtracting two null pointers from each
other yields zero, will allow a number of constructs to be expressed more
cleanly.  C++ defines such behaviors and I think they would be useful in C
as well.  They would be unlikely to impose extra cost on most platforms,
except when using implementation options to trap null-pointer arithmetic.
In those cases, there would be a slight code-size cost, but the extra code
could be kept out of the execution path for the pointer-not-null case; with
slight extra work, it could even be kept far enough away from the execution
path that caching efficiency would be unaffected.
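
As a rough illustration of the constructs this would simplify, here is a sketch of helpers that give the zero-offset and equal-pointer cases a defined result in today's C. The names ptr_add, ptr_diff and count_byte are hypothetical. With the proposed rule the helpers would be unnecessary and the loop could use `buf + len` directly.

```c
#include <assert.h>
#include <stddef.h>

/* base + n, returning base unchanged when n is zero, even if base is null. */
const char *ptr_add(const char *base, size_t n)
{
    if (n == 0)
        return base;
    assert(base != NULL);   /* null plus a nonzero offset is a bug */
    return base + n;
}

/* end - begin, returning zero when both are equal, even if both are null. */
ptrdiff_t ptr_diff(const char *end, const char *begin)
{
    if (end == begin)
        return 0;
    assert(end != NULL && begin != NULL);
    return end - begin;
}

/* Count occurrences of c in [buf, buf + len).
   The pair (NULL, 0) is accepted as a valid empty range. */
size_t count_byte(const char *buf, size_t len, char c)
{
    const char *end = ptr_add(buf, len);
    size_t count = 0;

    for (const char *p = buf; p != end; ++p)
        if (*p == c)
            ++count;
    return count;
}
```

A caller can then pass an empty range as (NULL, 0) without a special case at every call site.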

BTW, another approach that might be useful would be for the Standard to
specify a pointer constant with the following properties:

1. It may be a null pointer on implementations where a null pointer would
   satisfy the remaining constraints.

2. Adding or subtracting zero to/from it would yield the same value, and
   subtracting it from itself would yield zero.

3. It may be used as an argument to functions like memcpy, memmove, fread,
   fwrite, etc. when the size argument is zero.

4. One byte may be read from this location and yield zero; writes to this
   location will result in UB.

5. When passed to free() or realloc() it will behave like a null pointer
   (this would of course be a gimme if the constant is, in fact, a null
   pointer).

6. Functions like malloc(), calloc(), and realloc() would be allowed to return
   this pointer value when asked to allocate a zero-size region.

User code could easily produce a pointer constant which satisfies all of the
properties except for #5 and #6, but using one "dummy object" consistently
throughout a program would seem cleaner than having every compilation unit
define its own separate object for that purpose.

Note that on implementations where null pointers would intrinsically have
all of the above properties, the added cost of adding the library constant
would be essentially zero.  On any implementation where the build system
would know the address of a constant zero byte, the cost would be limited
to an extra comparison in each of free() and realloc() to support #5 above.
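
A minimal sketch of the user-code version, assuming one shared zero byte and thin wrappers around the allocator. Every name here (empty_object, EMPTY_PTR, empty_malloc, empty_free, empty_realloc) is hypothetical. The object itself covers properties 1 through 4; properties 5 and 6 can only be approximated by routing calls through the wrappers, since the real free() and realloc() know nothing about the constant.

```c
#include <stdlib.h>

/* One read-only zero byte for the whole program. A shared header would
   declare it as: extern const char empty_object; */
const char empty_object = 0;

/* Reading one byte through this pointer yields zero; writing through it
   is undefined behavior because the object is const. */
#define EMPTY_PTR ((void *)&empty_object)

/* Property 6: a zero-size request may return the shared object. */
void *empty_malloc(size_t size)
{
    return size == 0 ? EMPTY_PTR : malloc(size);
}

/* Property 5: the shared object is treated like a null pointer. */
void empty_free(void *p)
{
    if (p != EMPTY_PTR)     /* the one extra comparison */
        free(p);
}

void *empty_realloc(void *p, size_t size)
{
    if (p == EMPTY_PTR)
        p = NULL;           /* realloc(NULL, n) behaves like malloc(n) */
    if (size == 0) {
        free(p);
        return EMPTY_PTR;
    }
    return realloc(p, size);
}
```

The comparison in empty_free() and empty_realloc() is the only per-call cost, which matches the estimate above.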



#125152

From: Thiago Adams <thiago.adams@gmail.com>
Date: 2018-01-05 17:08 -0800
Message-ID: <f70b66a4-9800-4a5e-ae37-0a3453a67c59@googlegroups.com>
In reply to: #125117
On Friday, January 5, 2018 at 3:37:50 PM UTC-2, supe...@casperkitty.com wrote:
> On Friday, January 5, 2018 at 10:28:49 AM UTC-6, Thiago Adams wrote:
> > My suggestion (not only for char* but any pointer) is to have
> > a type modifier that changes the type to "maybe null" and
> > "not null".
> > I wish the default was "not null", but this breaks 
> > compatibility. Maybe we could have compiler flags to choose
> > the default mode.
> 
> Even in strongly-typed languages, arrays of non-nullable types pose a
> problem.  In many cases, arrays will need to be populated non-sequentially,
> and proving that all elements of an array get written may require solving
> the Halting Problem.  If there isn't any meaningful value that could be
> used to pre-populate an array of pointers, filling it with null is apt
> to be better than having code fill in some other meaningless value.


We can create an array of "can be null" pointers, fill it sequentially
or not, and then at the end just cast it to an array of "cannot be null"
pointers.

We don't need to worry about static analysis for complicated code.


//array (which can be null) of pointers to T (which can be null)
T * maybe_null * maybe_null pArray = malloc(sizeof(T*) * N);

if (pArray)
{
...fill...
}

//returns an array (which can be null) of pointers to T that cannot be null
return (T** maybe_null) pArray;
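
In plain C, where the qualifier does not exist, the same pattern could be sketched with a runtime check at the point where the cast would go. The function build_table, its parameters, and the out-of-order fill are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Build a table of n pointers into storage[0..n-1]. Returns NULL only if
   the allocation fails; on success every element is non-null. */
int **build_table(int *storage, size_t n)
{
    int **table = malloc(n * sizeof *table);
    if (table == NULL)
        return NULL;

    /* "Can be null" phase: start every slot as a null pointer. */
    for (size_t i = 0; i < n; i++)
        table[i] = NULL;

    /* Fill out of order: odd slots first, then even slots. */
    for (size_t i = 1; i < n; i += 2)
        table[i] = &storage[i];
    for (size_t i = 0; i < n; i += 2)
        table[i] = &storage[i];

    /* This is where the cast to "cannot be null" would go. Here the claim
       is checked at run time in debug builds instead of being trusted. */
    for (size_t i = 0; i < n; i++)
        assert(table[i] != NULL);

    return table;
}
```

The check costs one pass over the array, and it can be compiled out with NDEBUG once the fill logic is trusted.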



#124359

From: supercat@casperkitty.com
Date: 2017-12-15 09:18 -0800
Message-ID: <bda6685e-08f7-465a-8f43-d6faf7cd4dd0@googlegroups.com>
In reply to: #124071
On Saturday, December 9, 2017 at 1:59:49 PM UTC-6, Ian Collins wrote:
> On 11/22/2017 11:52 AM, jacobnavia wrote:
> > What would happen if we decide to give meaning to NULL?
> > 
> > Assume that in all places of the string library (strlen, strcmp, etc) a
> > NULL was the equivalent of the string "\0"
> > 
> > strlen would return zero, etc. This way, we would save terabytes of "\0"
> > strings since an empty string would not require any storage, since it is
> > empty of course!
> 
> How would you distinguish between an error condition, returning null and 
> a normal return of an empty string?

By using or wrapping memory allocation routines such that they are
guaranteed never to return unsuccessfully.  If a program won't be
able to do anything useful if it can't get all the memory it needs,
wrapping memory-allocation routines so that they'll either succeed,
raise a signal, or force an abnormal exit, will eliminate the need
for a lot of client-side error-checking code and will make it much
easier to ensure that the program will behave predictably in all
out-of-memory scenarios.
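
A minimal sketch of such a wrapper, assuming that aborting is an acceptable response to running out of memory. The names xmalloc, copy_string and length_of are hypothetical. Because xmalloc() never returns on failure, a null return from copy_string() can only mean "empty string".

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Allocate size bytes or terminate the program; never returns NULL. */
void *xmalloc(size_t size)
{
    void *p = malloc(size != 0 ? size : 1);
    if (p == NULL) {
        fputs("fatal: out of memory\n", stderr);
        abort();            /* raises SIGABRT: an abnormal exit */
    }
    return p;
}

/* Return a heap copy of s. NULL means "empty string", never "failure". */
char *copy_string(const char *s)
{
    if (s == NULL || s[0] == '\0')
        return NULL;

    size_t n = strlen(s) + 1;
    char *copy = xmalloc(n);
    memcpy(copy, s, n);
    return copy;
}

/* strlen() under the thread's convention that NULL is the empty string. */
size_t length_of(const char *s)
{
    return s == NULL ? 0 : strlen(s);
}
```

Callers need no error branch after copy_string(); the only failure mode is handled in one place, inside xmalloc().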
