

Groups > comp.lang.c > #123231 > unrolled thread

NULL as the empty string

Started by: jacobnavia <jacob@jacob.remcomp.fr>
First post: 2017-11-21 23:52 +0100
Last post: 2017-12-15 09:18 -0800
Articles: 20 on this page of 91, 21 participants



Contents

  NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-11-21 23:52 +0100
    Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-11-21 15:16 -0800
      Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-11-22 00:38 +0100
        Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-11-21 16:02 -0800
          Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-11-22 01:13 +0100
            Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-11-21 16:52 -0800
        Re: NULL as the empty string Robert Wessel <robertwessel2@yahoo.com> - 2017-11-21 18:09 -0600
        Re: NULL as the empty string Siri Cruise <chine.bleu@yahoo.com> - 2017-11-21 16:34 -0800
        Re: NULL as the empty string David Brown <david.brown@hesbynett.no> - 2017-11-22 12:12 +0100
      Re: NULL as the empty string supercat@casperkitty.com - 2017-11-21 15:57 -0800
        Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-11-22 01:06 +0100
          Re: NULL as the empty string supercat@casperkitty.com - 2017-11-22 15:42 -0800
            Re: NULL as the empty string Melzzzzz <Melzzzzz@zzzzz.com> - 2017-11-22 23:49 +0000
              Re: NULL as the empty string supercat@casperkitty.com - 2017-11-22 15:56 -0800
                Re: NULL as the empty string Melzzzzz <Melzzzzz@zzzzz.com> - 2017-11-23 00:06 +0000
                  Re: NULL as the empty string supercat@casperkitty.com - 2017-11-23 17:31 -0800
                    Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-11-24 09:42 +0100
                      Re: NULL as the empty string supercat@casperkitty.com - 2017-11-24 13:47 -0800
      Re: NULL as the empty string Jorgen Grahn <grahn+nntp@snipabacken.se> - 2017-11-22 06:46 +0000
      Re: NULL as the empty string John Bode <jfbode1029@gmail.com> - 2017-12-08 10:27 -0800
        Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-12-08 11:11 -0800
          Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-12-08 21:39 +0100
            Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-12-08 13:03 -0800
              Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-12-08 22:50 +0100
                Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-12-08 15:19 -0800
                  Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-12-09 00:35 +0100
                    Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-12-08 16:05 -0800
                      Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-12-09 01:22 +0100
                        Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-12-08 17:39 -0800
                        Re: NULL as the empty string John Bode <jfbode1029@gmail.com> - 2017-12-11 12:22 -0800
                      Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-12-09 01:29 +0100
                        Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-12-08 17:47 -0800
                          Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-12-09 07:05 +0100
                            Re: NULL as the empty string David Brown <david.brown@hesbynett.no> - 2017-12-09 18:37 +0100
                            Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-12-09 11:53 -0800
                              Re: NULL as the empty string supercat@casperkitty.com - 2017-12-12 10:49 -0800
                                Re: NULL as the empty string Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2017-12-12 13:39 -0800
                                  Re: NULL as the empty string supercat@casperkitty.com - 2017-12-12 16:05 -0800
                                    Re: NULL as the empty string Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2017-12-13 03:43 -0800
                                      Re: NULL as the empty string supercat@casperkitty.com - 2017-12-13 08:45 -0800
                                      Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-12-13 09:12 -0800
                                        Re: NULL as the empty string Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2017-12-13 13:27 -0800
                                          Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-12-13 14:02 -0800
                                            Re: NULL as the empty string asetofsymbols@gmail.com - 2017-12-13 14:58 -0800
                                              Re: NULL as the empty string asetofsymbols@gmail.com - 2017-12-13 15:11 -0800
                                            Re: NULL as the empty string Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2017-12-14 03:49 -0800
                                              Re: NULL as the empty string mark.bluemel@gmail.com - 2017-12-14 04:05 -0800
                                              Re: NULL as the empty string David Brown <david.brown@hesbynett.no> - 2017-12-14 13:09 +0100
                                                Re: NULL as the empty string Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2017-12-14 05:02 -0800
                                                  Re: NULL as the empty string David Brown <david.brown@hesbynett.no> - 2017-12-14 14:54 +0100
                                                Re: NULL as the empty string supercat@casperkitty.com - 2017-12-14 07:38 -0800
                                                  Re: NULL as the empty string Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2017-12-14 09:50 -0800
                                              Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-12-14 09:20 -0800
                                                Re: NULL as the empty string supercat@casperkitty.com - 2017-12-14 09:53 -0800
                                                  Re: NULL as the empty string Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2017-12-14 12:57 -0800
                                        Re: NULL as the empty string herrmannsfeldt@gmail.com - 2017-12-14 17:22 -0800
                                          Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-12-14 17:26 -0800
        Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-12-08 21:23 +0100
          Re: NULL as the empty string Malcolm McLean <malcolm.arthur.mclean@gmail.com> - 2017-12-08 13:41 -0800
            Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-12-08 22:54 +0100
    Re: NULL as the empty string supercat@casperkitty.com - 2017-11-21 15:17 -0800
      Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-11-22 00:26 +0100
        Re: NULL as the empty string supercat@casperkitty.com - 2017-11-21 16:03 -0800
    Re: NULL as the empty string "Pascal J. Bourguignon" <pjb@informatimago.com> - 2017-11-22 00:27 +0100
      Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-11-22 00:42 +0100
        Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-11-21 16:05 -0800
          Re: NULL as the empty string herrmannsfeldt@gmail.com - 2017-12-06 22:33 -0800
            Re: NULL as the empty string supercat@casperkitty.com - 2017-12-07 12:04 -0800
              Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-12-07 23:20 +0100
                Re: NULL as the empty string supercat@casperkitty.com - 2017-12-07 15:04 -0800
    Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-11-21 15:28 -0800
    Re: NULL as the empty string Thiago Adams <thiago.adams@gmail.com> - 2017-11-21 16:04 -0800
    Re: NULL as the empty string Siri Cruise <chine.bleu@yahoo.com> - 2017-11-21 16:25 -0800
      Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-11-22 01:34 +0100
    Re: NULL as the empty string bartc <bc@freeuk.com> - 2017-11-22 00:36 +0000
    Re: NULL as the empty string Öö Tiib <ootiib@hot.ee> - 2017-11-21 23:07 -0800
    NULL as the empty string asetofsymbols@gmail.com - 2017-11-23 22:23 -0800
    Re: NULL as the empty string Geoff <geoff@invalid.invalid> - 2017-12-09 09:05 -0800
      Re: NULL as the empty string Lew Pitcher <lew.pitcher@digitalfreehold.ca> - 2017-12-09 12:40 -0500
    Re: NULL as the empty string gordonb.yj0bc@burditt.org (Gordon Burditt) - 2017-12-09 13:50 -0600
    Re: NULL as the empty string Ian Collins <ian-news@hotmail.com> - 2017-12-10 08:59 +1300
      Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-12-09 12:22 -0800
        Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-12-11 01:42 +0100
          Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-12-10 19:20 -0800
            Re: NULL as the empty string jacobnavia <jacob@jacob.remcomp.fr> - 2017-12-11 18:56 +0100
              Re: NULL as the empty string Keith Thompson <kst-u@mib.org> - 2017-12-11 11:19 -0800
          Re: NULL as the empty string supercat@casperkitty.com - 2017-12-15 09:29 -0800
            Re: NULL as the empty string Thiago Adams <thiago.adams@gmail.com> - 2018-01-05 08:28 -0800
              Re: NULL as the empty string supercat@casperkitty.com - 2018-01-05 09:37 -0800
                Re: NULL as the empty string Thiago Adams <thiago.adams@gmail.com> - 2018-01-05 17:08 -0800
      Re: NULL as the empty string supercat@casperkitty.com - 2017-12-15 09:18 -0800



#124020

From: Keith Thompson <kst-u@mib.org>
Date: 2017-12-08 11:11 -0800
Message-ID: <ln7etxt4ox.fsf@kst-u.example.com>
In reply to: #124016
John Bode <jfbode1029@gmail.com> writes:
> On Tuesday, November 21, 2017 at 5:16:29 PM UTC-6, Keith Thompson wrote:
>> jacobnavia <jacob@jacob.remcomp.fr> writes:
>> > What would happen if we decide to give meaning to NULL?
>> 
>> NULL (more precisely a null pointer) has a meaning.  It's a pointer
>> value that doesn't point to anything.
>
> I think I get what Jacob is getting at - instead of NULL being a macro
> that expands to a 0, have it be a special keyword with special,
> context-dependent semantics.  That is, a call against strlen or strcmp
> with NULL would be interpreted differently by the compiler, which
> would emit code that immediately returned a 0 or false result, without
> having to actually evaluate anything, or store an actual empty string
> or pointer to an empty string.

That wasn't the impression I got at all.  I thought jacob was advocating
giving a null pointer value a special meaning, so that either
    strlen(NULL)
or
    strlen(0)
or
    char *ptr = 0;
    strlen(ptr)
would return 0 (rather than having undefined behavior, as it currently
does).  I didn't think he was talking about giving the NULL macro a
special meaning distinct from being a null pointer constant.

jacob, would you care to comment?

[...]

-- 
Keith Thompson (The_Other_Keith) kst-u@mib.org  <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something.  This is something.  Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"



#124023

From: jacobnavia <jacob@jacob.remcomp.fr>
Date: 2017-12-08 21:39 +0100
Message-ID: <p0etae$tqd$1@dont-email.me>
In reply to: #124020
On 08/12/2017 at 20:11, Keith Thompson wrote:
> John Bode <jfbode1029@gmail.com> writes:
>> On Tuesday, November 21, 2017 at 5:16:29 PM UTC-6, Keith Thompson wrote:
>>> jacobnavia <jacob@jacob.remcomp.fr> writes:
>>>> What would happen if we decide to give meaning to NULL?
>>>
>>> NULL (more precisely a null pointer) has a meaning.  It's a pointer
>>> value that doesn't point to anything.
>>
>> I think I get what Jacob is getting at - instead of NULL being a macro
>> that expands to a 0, have it be a special keyword with special,
>> context-dependent semantics.  That is, a call against strlen or strcmp
>> with NULL would be interpreted differently by the compiler, which
>> would emit code that immediately returned a 0 or false result, without
>> having to actually evaluate anything, or store an actual empty string
>> or pointer to an empty string.
> 
> That wasn't the impression I got at all.  I thought jacob was advocating
> giving a null pointer value a special meaning, so that either
>      strlen(NULL)
> or
>      strlen(0)
> or
>      char *ptr = 0;
>      strlen(ptr)
> would return 0 (rather than having undefined behavior, as it currently
> does).  I didn't think he was talking about giving the NULL macro a
> special meaning distinct from being a null pointer constant.
> 
> jacob, would you care to comment?
> 
> [...]
> 

NULL is an empty string, it has no data and needs no space to represent 
it. Instead of crashing at the sight of a NULL pointer, strlen returns 
zero since there is nothing there.
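
The behavior described here can be approximated today with a wrapper around the library function. This is only a minimal sketch: the name `strlen0` is hypothetical and is not part of the standard library or, as far as the thread says, of lcc-win.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: treats a null pointer like a pointer to "". */
static size_t strlen0(const char *s)
{
    return s ? strlen(s) : 0;
}
```

Calling `strlen0` with a null pointer yields 0, and any other argument is passed through to `strlen` unchanged.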

The concept of nothing (zero) is fairly recent in maths, several 
centuries only. I just thought that the best way to represent nothing in 
a software system is zero, obviously (also known as NULL in this part 
of the world where "C" lives).

This is obviously a very space efficient encoding, but the idea is about 
the principle of an empty object, not so much about some space gains.

Then I digress: are all empty objects equal?

RT * fn(int *a, struct AirBnB *x);

Should fn(NULL,NULL) (that returns maybe NULL or fakes some result) be 
allowed?

We have set up things in most cases so that an empty object triggers an 
exception. That is unnecessary in many cases, and in the case of strcmp 
it should be banned... isn't it?

Yes, if it is an error and you want the machine to trap that. You write 
a cover function:

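#include <string.h>

/* Cover function that restores the trap: reading *src typically faults on a null pointer. */
size_t TrappingStrlen(const char *src)
{
	if (*src == 0) return 0;	/* deliberate dereference of src */
	return strlen(src);
}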
and then

#define strlen TrappingStrlen
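
A variant of the cover-function idea can rely on `assert`, so the trap is present during development and disappears when `NDEBUG` is defined. This is a sketch under assumptions: the name `checked_strlen` is hypothetical, and the release-mode fallback to 0 follows the proposal in this thread, not the C standard.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cover function: aborts on a null pointer in debug builds;
   with NDEBUG defined the assert vanishes and a null pointer yields 0. */
static size_t checked_strlen(const char *src)
{
    assert(src != NULL);
    return src ? strlen(src) : 0;
}
```

The sketch uses a distinct name on purpose. As far as I can tell from the standard's rules on reserved identifiers, a program that defines `strlen` as a macro after including <string.h> has undefined behavior, so calling the cover function directly is the safer route.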



#124024

From: Keith Thompson <kst-u@mib.org>
Date: 2017-12-08 13:03 -0800
Message-ID: <lnzi6trkxz.fsf@kst-u.example.com>
In reply to: #124023
jacobnavia <jacob@jacob.remcomp.fr> writes:
> On 08/12/2017 at 20:11, Keith Thompson wrote:
>> John Bode <jfbode1029@gmail.com> writes:
>>> On Tuesday, November 21, 2017 at 5:16:29 PM UTC-6, Keith Thompson wrote:
>>>> jacobnavia <jacob@jacob.remcomp.fr> writes:
>>>>> What would happen if we decide to give meaning to NULL?
>>>>
>>>> NULL (more precisely a null pointer) has a meaning.  It's a pointer
>>>> value that doesn't point to anything.
>>>
>>> I think I get what Jacob is getting at - instead of NULL being a macro
>>> that expands to a 0, have it be a special keyword with special,
>>> context-dependent semantics.  That is, a call against strlen or strcmp
>>> with NULL would be interpreted differently by the compiler, which
>>> would emit code that immediately returned a 0 or false result, without
>>> having to actually evaluate anything, or store an actual empty string
>>> or pointer to an empty string.
>> 
>> That wasn't the impression I got at all.  I thought jacob was advocating
>> giving a null pointer value a special meaning, so that either
>>      strlen(NULL)
>> or
>>      strlen(0)
>> or
>>      char *ptr = 0;
>>      strlen(ptr)
>> would return 0 (rather than having undefined behavior, as it currently
>> does).  I didn't think he was talking about giving the NULL macro a
>> special meaning distinct from being a null pointer constant.
>> 
>> jacob, would you care to comment?
>> 
>> [...]
>
> NULL is an empty string, it has no data and needs no space to represent 
> it. Instead of crashing at the sight of a NULL pointer, strlen returns 
> zero since there is nothing there.

To be clear, this is what you're proposing, not what the C standard
currently says.

> The concept of nothing (zero) is fairly recent in maths, several 
> centuries only. I just thought that the best way to represent nothing in 
> a software system is zero, obviously (also known as NULL in this part 
> of the world where "C" lives).
[SNIP]

None of this answers the question I was trying to ask.

John Bode suggests that you were proposing making the macro NULL
mean something other than just a null pointer constant.  I assumed
that you were using the term NULL as a shorthand for a null pointer
value, though on re-reading your previous post I'm not entirely
clear on that point.

You propose that strlen(NULL) should return 0 (rather than having
undefined behavior as it currently does).  Does your proposal apply
only to code that uses the NULL macro, or would strlen() return
0 for any null pointer argument?  Would strlen(0), which passes a
null pointer value to strlen *without* referring to the NULL macro,
be required to return 0 under your proposal?
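
The reason the question matters is that the callee only ever sees a pointer value, never the spelling used at the call site. A minimal sketch, with the hypothetical helper names `is_null` and `all_spellings_null`:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical probe: reports whether the argument is a null pointer. */
static int is_null(const char *p)
{
    return p == NULL;
}

/* The three spellings from the thread all pass the same null pointer value. */
static int all_spellings_null(void)
{
    char *ptr = 0;
    return is_null(NULL) && is_null(0) && is_null(ptr);
}
```

With a prototype in scope, the constant `0` is converted to a null pointer of the parameter type, so a run-time check inside the function cannot distinguish it from `NULL`.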

[...]




#124026

From: jacobnavia <jacob@jacob.remcomp.fr>
Date: 2017-12-08 22:50 +0100
Message-ID: <p0f1eb$sg1$1@dont-email.me>
In reply to: #124024
On 08/12/2017 at 22:03, Keith Thompson wrote:
> You propose that strlen(NULL) should return 0 (rather than having
> undefined behavior as it currently does).  Does your proposal apply
> only to code that uses the NULL macro, or would strlen() return
> 0 for any null pointer argument?  Would strlen(0), which passes a
> null pointer value to strlen *without* referring to the NULL macro,
> be required to return 0 under your proposal?

Trying to make a distinction between using a macro and not would break 
the whole preprocessor.

Obviously this must be checked at run time. I do not see any use of calling

strlen(NULL);

in the code.

That could be simplified to zero by the compiler as an optimization.

I am proposing

The undefined behavior of calling strlen with a NULL pointer value is 
defined as zero. NULL is the empty string. If you want to crash for 
debugging reasons write a cover function and a #define during 
development. Once the program is ready, you do NOT want to crash. At 
least not in strlen.

The same for strcat(string,NULL); That does nothing.

strcpy(NULL, str), where str is not NULL should crash, since the rest of 
the code relies on a fresh copy. strcpy(str,NULL) should set the first 
byte of str to zero.
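
The strcpy rule can be sketched as a wrapper. The name `strcpy0` is hypothetical, and the sketch leaves the null-destination case alone, since the post wants that case to keep failing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: a null source is copied as the empty string.
   A null destination is not checked and is left to fail, as the post suggests. */
static char *strcpy0(char *dst, const char *src)
{
    if (src == NULL) {
        dst[0] = '\0';
        return dst;
    }
    return strcpy(dst, src);
}
```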

Etc. We could define other contexts also, where a NULL pointer conveys 
information, for instance when you open

FILE *f = fopen(NULL,"w+");

should give a temporary file that is erased by fclose.

For instance.
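
For comparison, the standard library already provides `tmpfile()`, which opens a temporary file in "wb+" mode and removes it when it is closed or the program terminates normally. A minimal sketch of a round trip through such a file; the helper name `roundtrip_tmpfile` is hypothetical:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes msg to a temporary file and reads it back into out.
   Returns 0 on success, -1 on failure. */
static int roundtrip_tmpfile(const char *msg, char *out, size_t outsz)
{
    FILE *f = tmpfile();            /* opened in "wb+" mode */
    if (f == NULL)
        return -1;
    if (fputs(msg, f) == EOF) {
        fclose(f);
        return -1;
    }
    rewind(f);
    char *r = fgets(out, (int)outsz, f);
    fclose(f);                      /* the file is removed here */
    return r ? 0 : -1;
}
```

Whether a null file name passed to `fopen` would add anything beyond this is a design question for the proposal; the sketch only shows what is available without it.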



#124030

From: Keith Thompson <kst-u@mib.org>
Date: 2017-12-08 15:19 -0800
Message-ID: <lnvahgst7s.fsf@kst-u.example.com>
In reply to: #124026
jacobnavia <jacob@jacob.remcomp.fr> writes:
> On 08/12/2017 at 22:03, Keith Thompson wrote:
>> You propose that strlen(NULL) should return 0 (rather than having
>> undefined behavior as it currently does).  Does your proposal apply
>> only to code that uses the NULL macro, or would strlen() return
>> 0 for any null pointer argument?  Would strlen(0), which passes a
>> null pointer value to strlen *without* referring to the NULL macro,
>> be required to return 0 under your proposal?
>
> Trying to make a distinction between using a macro and not would break 
> the whole preprocessor.
>
> Obviously this must be checked at run time. I do not see any use of calling
>
> strlen(NULL);
>
> in the code.

I'm fairly sure that answers my question, thanks.  You're talking
about null pointer values, not about the NULL macro.

I suggest using the phrase "null pointer" rather than "NULL".
Using the latter could imply (as John Bode thought) that you're
talking specifically about the macro rather than about the value
that it represents.

> That could be simplified to zero by the compiler as an optimization.
>
> I am proposing
>
> The undefined behavior of calling strlen with a NULL pointer value is 
> defined as zero. NULL is the empty string.
[...]

What you propose is treating a null pointer as *a pointer to* the
empty string, not as the empty string itself.  (Unless you're also
proposing to change the standard's definition of the word "string".)

Since the cases we're talking about have undefined behavior, a
conforming implementation can behave as you propose.  Does lcc-win
do this?  If not, are you planning to change it so it does?




#124031

From: jacobnavia <jacob@jacob.remcomp.fr>
Date: 2017-12-09 00:35 +0100
Message-ID: <p0f7ki$6t5$1@dont-email.me>
In reply to: #124030
On 09/12/2017 at 00:19, Keith Thompson wrote:
> Since the cases we're talking about have undefined behavior, a
> conforming implementation can behave as you propose.  Does lcc-win
> do this?  If not, are you planning to change it so it does?

Maybe, depends on the pros/cons I see in this discussion. Since any use 
of NULL in strings is undefined behavior now, I have a free hand to do 
anything. I proposed the following truth table:

strlen(str)

The same behavior. For non NULL pointers nothing changes.

strlen(NULL) --> 0.

For strcat:

strcat(str,str) --> the same
strcat(str,NULL) -> No-Op. Returns str
strcat(NULL, str); -> No-Op, returns NULL.

Now, it would be needed to change the definition of a string. A string 
is either a pointer to an array of zero terminated chars, or NULL, 
meaning the empty string.

There are then two representations for the empty string:

NULL or "\0"

The difference is that the second can be expanded if storage is 
available, as it is now:

char string[23];

string[0] = 0;

strcat(string,"hello");

This is OK, but NULL can't be expanded and remains the empty string and 
strcat returns NULL.
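
The strcat truth table can be written down as a wrapper. This is a sketch of the proposal only, with the hypothetical name `strcat0`; the real `strcat` has undefined behavior for either null argument.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper implementing the proposed table:
   strcat0(str, NULL) -> no-op, returns str
   strcat0(NULL, str) -> no-op, returns NULL
   otherwise          -> same as strcat */
static char *strcat0(char *dst, const char *src)
{
    if (dst == NULL || src == NULL)
        return dst;
    return strcat(dst, src);
}
```

One consequence worth noting: a caller who passes a null destination gets a null pointer back and silently loses the appended text, so the result still has to be checked.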



#124032

From: Keith Thompson <kst-u@mib.org>
Date: 2017-12-08 16:05 -0800
Message-ID: <lnr2s4sr4c.fsf@kst-u.example.com>
In reply to: #124031
jacobnavia <jacob@jacob.remcomp.fr> writes:
> On 09/12/2017 at 00:19, Keith Thompson wrote:
>> Since the cases we're talking about have undefined behavior, a
>> conforming implementation can behave as you propose.  Does lcc-win
>> do this?  If not, are you planning to change it so it does?
>
> Maybe, depends on the pros/cons I see in this discussion. Since any use 
> of NULL in strings is undefined behavior now, I have a free hand to do 
> anything. I proposed the following truth table:
>
> strlen(str)
>
> The same behavior. For non NULL pointers nothing changes.
>
> strlen(NULL) --> 0.
>
> For strcat:
>
> strcat(str,str) --> the same
> strcat(str,NULL) -> No-Op. Returns str
> strcat(NULL, str); -> No-Op, returns NULL.
>
> Now, it would be needed to change the definition of a string. A string 
> is either a pointer to an array of zero terminated chars, or NULL, 
> meaning the empty string.

The current definitions (N1570 7.1.1p1) are:

    A *string* is a contiguous sequence of characters terminated by and
    including the first null character.
    [...]
    A *pointer to a string* is a pointer to its initial (lowest
    addressed) character.

So a string is not a pointer, and a pointer cannot be a string.
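
The distinction in these two definitions can be seen in a couple of declarations. A minimal illustration; the identifiers `greeting` and `greeting_ptr` are made up for the example:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The string: three characters, 'h', 'i' and the terminating '\0'. */
static char greeting[] = "hi";

/* A pointer to the string: the address of its initial character. */
static char *greeting_ptr = greeting;
```

Here `sizeof greeting` counts the terminating null character that the definition includes, while `strlen(greeting_ptr)` does not count it.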

Are you proposing to change the definition so that there's no
distinction between a "string" and a "pointer to a string"?  What then
would we call the sequence of characters that's currently defined as a
"string"?  If such a change were to be made to the standard, it would
require changing the wording for every function that acts on strings.

> There are then two representations for the empty string:
>
> NULL or "\0"

You keep referring to NULL (the name of a macro) when you mean the null
pointer.  That's exactly the point that caused the confusion in the
first place.

The way I would describe your proposal is that a null pointer is to be
treated as if it were a pointer to an empty string.  (I'm not sure
whether it would make sense to say that a null pointer actually points
to an empty string.  If so, I'm having difficulty coming up with a
revised definition of "string", since a null pointer does not physically
point to anything.  It might do so on a system where memory address 0
contains a readable 0 byte, but it would be impractical for the
standard to require that.)

[...]




#124033

From: jacobnavia <jacob@jacob.remcomp.fr>
Date: 2017-12-09 01:22 +0100
Message-ID: <p0faco$mh8$1@dont-email.me>
In reply to: #124032
On 09/12/2017 at 01:05, Keith Thompson wrote:
>    A *string* is a contiguous sequence of characters terminated by and
>      including the first null character.

A string is a contiguous sequence of characters terminated by and 
including the first null character.

A *pointer to a string* is a pointer to its initial (lowest
     addressed) character or NULL, meaning the empty string.

NULL has a pointer type but it doesn't point to anything, so it is used 
to represent the empty string.

With these definitions we can establish truth tables for each of the 
string functions in a similar way that math functions have defined 
behaviors for NANs and infs



#124035

From: Keith Thompson <kst-u@mib.org>
Date: 2017-12-08 17:39 -0800
Message-ID: <lnh8t0smrg.fsf@kst-u.example.com>
In reply to: #124033
jacobnavia <jacob@jacob.remcomp.fr> writes:
> On 09/12/2017 at 01:05, Keith Thompson wrote:
>>    A *string* is a contiguous sequence of characters terminated by and
>>      including the first null character.
>
> A string is a contiguous sequence of characters terminated by and
> including the first null character.
>
> A *pointer to a string* is a pointer to its initial (lowest
>     addressed) character or NULL, meaning the empty string.
>
> NULL has a pointer type but it doesn't point to anything, so it is
> used to represent the empty string.

The NULL macro, when evaluated as an expression, does not necessarily
have pointer type.  It could have integer type.
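
Which type the macro has on a given implementation can be inspected with a C11 generic selection. A sketch only: the helper name `null_macro_type` is hypothetical, and the list of candidate types is an assumption about what implementations commonly use, with anything else falling into the default branch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Names the type the NULL macro has on this implementation (C11 _Generic). */
static const char *null_macro_type(void)
{
    return _Generic((NULL),
                    int:       "int",
                    long:      "long",
                    long long: "long long",
                    void *:    "void *",
                    default:   "other");
}
```

The result differs between implementations, which is the point being made: code should not assume that `NULL` has pointer type when it is used where no conversion takes place, such as in a variadic argument list.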

> With these definitions we can establish truth tables for each of the
> string functions in a similar way that math functions have defined
> behaviors for NANs and infs

Let me ask you again, *please* stop saying "NULL" when you mean "a null
pointer value".  They're two different things.  Every occurrence (I
think) of NULL that you've used in this discussion would more accurately
be written as "a null pointer".  The macro NULL defined in <stddef.h>
and several other headers is not relevant.

Your use of "NULL" has already caused at least one person to
misunderstand what you're talking about.




#124173

From: John Bode <jfbode1029@gmail.com>
Date: 2017-12-11 12:22 -0800
Message-ID: <fb4e9b80-2fc3-46c4-bc6e-cc5716d74d49@googlegroups.com>
In reply to: #124033
On Friday, December 8, 2017 at 6:22:57 PM UTC-6, jacobnavia wrote:
> On 09/12/2017 at 01:05, Keith Thompson wrote:
> >    A *string* is a contiguous sequence of characters terminated by and
> >      including the first null character.
> 
> A string is a contiguous sequence of characters terminated by and 
> including the first null character.
> 
> A *pointer to a string* is a pointer to its initial (lowest
>      addressed) character or NULL, meaning the empty string.
> 
> NULL has a pointer type but it doesn't point to anything, so it is used 
> to represent the empty string.
> 

It can also be used to represent an empty container (stack, list, queue, tree, etc.), or an 
empty array of any type, etc.  Not *just* an empty string.  

NULL is a well-defined *invalid* pointer value, guaranteed to compare unequal to any
valid pointer.  It is used to represent a pointer that doesn't point to anything meaningful.
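
The empty-container use is the familiar one from linked lists, where a null head pointer stands for a list with no nodes. A minimal sketch with a hypothetical `struct node` and `list_length`:

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int value;
    struct node *next;
};

/* Counts the nodes; a null head is the empty list, so the loop body never runs. */
static size_t list_length(const struct node *head)
{
    size_t n = 0;
    for (; head != NULL; head = head->next)
        n++;
    return n;
}
```

In this design the function itself decides that a null pointer means "empty"; nothing in the language assigns that meaning to the pointer value.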

> With these definitions we can establish truth tables for each of the 
> string functions in a similar way that math functions have defined 
> behaviors for NANs and infs

Could do something similar for memcmp and memcpy, I guess.



#124034

From: jacobnavia <jacob@jacob.remcomp.fr>
Date: 2017-12-09 01:29 +0100
Message-ID: <p0fap8$oin$1@dont-email.me>
In reply to: #124032
On 09/12/2017 at 01:05, Keith Thompson wrote:
> Are you proposing to change the definition so that there's no
> distinction between a "string" and a "pointer to a string"?

Of course not. A string is a (maybe empty) sequence of chars finished by 
zero. NULL is a representation for the empty string, as NANs are 
representations for computing failure, or inf is a representation of 
floating point overflow.

These representations are treated specially by the software. The same 
with NULL.

The practical side of this is that string functions will never crash 
again when passed NULL. It is always better to NOT crash in strlen.

And nobody can say:

"I rely on my program crashing when a NULL pointer is used"

Undefined behavior can be defined at any time. If you want the old 
behavior you make a cover function and #define it as strlen.



#124036

From: Keith Thompson <kst-u@mib.org>
Date: 2017-12-08 17:47 -0800
Message-ID: <ln7etwsmd7.fsf@kst-u.example.com>
In reply to: #124034
jacobnavia <jacob@jacob.remcomp.fr> writes:
> On 09/12/2017 at 01:05, Keith Thompson wrote:
>> Are you proposing to change the definition so that there's no
>> distinction between a "string" and a "pointer to a string"?
>
> Of course not.

That was the implication of what you wrote earlier.  I'm glad to see
that wasn't your intent.

[...]

>                A string is a (maybe empty) sequence of chars finished
> by zero. NULL is a representation for the empty string, as NANs are
> representations for computing failure, or inf is a representation of
> floating point overflow.

Would it be more convenient to treat a floating-point NaN as equivalent
to 0.0?  Is it always better for sqrt() not to crash?
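
The point behind these questions can be made concrete: a NaN is useful because it stays distinguishable from a legitimate 0.0. A minimal sketch, with the hypothetical helper `mean`, which returns NaN when there is no data to average:

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical helper: arithmetic mean, or NaN when there is no data. */
static double mean(const double *v, size_t n)
{
    if (n == 0)
        return NAN;          /* "no answer", distinct from an answer of 0.0 */
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum / (double)n;
}
```

A caller can test the result with `isnan` and react, and the NaN also propagates through later arithmetic instead of quietly becoming a plausible number. Treating a null pointer as an empty string gives up the analogous distinction.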

[...]




#124038

From: jacobnavia <jacob@jacob.remcomp.fr>
Date: 2017-12-09 07:05 +0100
Message-ID: <p0fuek$65u$1@dont-email.me>
In reply to: #124036
On 09/12/2017 at 02:47, Keith Thompson wrote:
> Would it be more convenient to treat a floating-point NaN as equivalent
> to 0.0?  

The behavior of sqrt is defined for all inputs, as is the behavior of 
pow. The behavior of strlen is not defined for NULL inputs.

For instance pow(1,x) should always return 1, even if x is a NAN, and 
yes, in this case it is treated as zero

> Is it always better for sqrt() not to crash?

It will never crash unless you enable the floating point exceptions.

What is your point here? We are talking about strings, not floating 
point numbers. I just made a comparison between a representation of 
something that is a special value (NAN or inf) and the representation of 
the empty string, that could be the NULL pointer.



#124064

From: David Brown <david.brown@hesbynett.no>
Date: 2017-12-09 18:37 +0100
Message-ID: <p0h71i$6ko$1@dont-email.me>
In reply to: #124038
On 09/12/17 07:05, jacobnavia wrote:
> Le 09/12/2017 à 02:47, Keith Thompson a écrit :
>> Would it be more convenient to treat a floating-point NaN as equivalent
>> to 0.0? 
> 
> The behavior of sqrt is defined for all inputs, as it is the behavior of 
> pow. The behavior of strlen is not defined for NULL inputs.
> 

Defining strlen(NULL) to be 0 would still mean strlen is not defined for 
all inputs.  It would merely define it for one more input than before.

strlen is not defined if the argument points to something other than a 
string.  It could well crash if the argument pointed at memory that is 
not accessible, or mapped to some hardware device, or where there are no 
terminating nulls in the memory.

Functions that take pointers are always going to be a bit different from 
ones that take arithmetic types, and they will always be valid and defined 
for only a subset of possible arguments.

(I agree that 0 would be a reasonable result for strlen(NULL).  Just 
don't imagine it will make strlen "safe" and "uncrashable".)

> For instance pow(1,x) should return always 1, even if x is a NAN, and 
> yes, in this case it is treated as zero
> 
>>> Is it always better for sqrt() not to crash?
> 
> It will never crash unless you enable the floating point exceptions.
> 
> What is your point here? We are talking about strings, not floating 
> point numbers. I just made a comparison between a representation of 
> something that is a special value (NAN or inf) and the representation of 
> the empty string, that could be the NULL pointer.
> 



#124070

From: Keith Thompson <kst-u@mib.org>
Date: 2017-12-09 11:53 -0800
Message-ID: <ln374jsmof.fsf@kst-u.example.com>
In reply to: #124038
jacobnavia <jacob@jacob.remcomp.fr> writes:
> Le 09/12/2017 à 02:47, Keith Thompson a écrit :
>> Would it be more convenient to treat a floating-point NaN as equivalent
>> to 0.0?  
[...]
> What is your point here? We are talking about strings, not floating 
> point numbers. I just made a comparison between a representation of 
> something that is a special value (NAN or inf) and the representation of 
> the empty string, that could be the NULL pointer.

My point is that in floating-point we have a clear and useful
distinction between a "null" value, 0.0, and a representation that
isn't a numeric value, NaN.

For char*, we have a clear distinction between a pointer to an
empty string and a pointer that doesn't point to a string at all.
They're logically quite distinct, and in my opinion it's a very
useful distinction.

I concede that it's not a very compelling point.  Still, I don't
think treating a null pointer as if it were a pointer to an empty
string is a good idea.  It's not a *horrible* idea, and if it were
added to a future standard I'd accept it with a little grumbling.
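
A minimal sketch of why the distinction can be useful, assuming a hypothetical settings table: a lookup can report "present but empty" with a pointer to "" and "not present at all" with a null pointer, much as getenv returns a null pointer for a variable that is not set. All names here are illustrative only.

```c
#include <stddef.h>
#include <string.h>

struct setting { const char *key; const char *value; };

/* Hypothetical table: "title" is present but empty; "author" is absent. */
static const struct setting settings[] = {
    { "name",  "demo" },
    { "title", ""     },
};

/* Returns the value stored for key, or NULL when the key is not in the table. */
static const char *lookup(const char *key)
{
    for (size_t i = 0; i < sizeof settings / sizeof settings[0]; i++) {
        if (strcmp(settings[i].key, key) == 0)
            return settings[i].value;
    }
    return NULL;
}
```

If a null pointer were treated as an empty string everywhere, a caller could no longer tell lookup("title") from lookup("author") without an extra out-of-band flag.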

-- 
Keith Thompson (The_Other_Keith) kst-u@mib.org  <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something.  This is something.  Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"



#124227

From: supercat@casperkitty.com
Date: 2017-12-12 10:49 -0800
Message-ID: <200dbc56-510f-48ac-b9b4-64cb421934d3@googlegroups.com>
In reply to: #124070
On Saturday, December 9, 2017 at 1:53:35 PM UTC-6, Keith Thompson wrote:
> My point is that in floating-point we have a clear and useful
> distinction between a "null" value, 0.0, and a representation that
> isn't a numeric value, NaN.

In some cases, one may want to compute e.g. the sum of all numbers in a list
which aren't NaN.  In such cases, an operator which means "add X to Y if X
isn't NaN, or do nothing if it is" may be useful.  That doesn't mean that NaN
is just another representation for zero; it does mean, however, that it would
be a representation for something that should behave as an additive identity
in some contexts.
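
A minimal sketch of such an "add X to Y if X isn't NaN" operation applied over a list. The function name sum_ignoring_nan is illustrative only.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical helper: adds up values[0..n-1], skipping every NaN so that
   a NaN behaves like an additive identity in this one context. */
static double sum_ignoring_nan(const double *values, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (!isnan(values[i]))
            total += values[i];
    }
    return total;
}
```

An ordinary running sum over the same data would instead become NaN as soon as one NaN was added.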

A fairly common pattern is for code to receive a pointer to a string, compute
its length N, and then treat it as an array of N characters (keeping track of
strings' lengths is in almost every way better than relying on a terminating
zero byte, except that there's no other nice way to get the length of a string
literal used within an expression).  If an implementation guarantees that
adding 0 to a null pointer will simply yield a null pointer, and guarantees
that functions like memcpy(any,any,0) will behave as no-ops regardless of
the pointers passed in, then a function which behaves like strlen() except
when the argument is null (in which case it returns zero with no side-effect)
may eliminate the need for explicit null checking in user-code.
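
A minimal sketch of that pattern, with illustrative names (strlen_or_zero, append_counted). It assumes only standard C, where passing a null pointer to memcpy is undefined even for a size of 0, so the copy is skipped explicitly rather than relying on the implementation guarantee described above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical strlen-like function: a null pointer counts as length 0. */
static size_t strlen_or_zero(const char *s)
{
    return s != NULL ? strlen(s) : 0;
}

/* Copies src (which may be null) to dst + used and returns the new length.
   The caller must guarantee that dst has room for the result. */
static size_t append_counted(char *dst, size_t used, const char *src)
{
    size_t n = strlen_or_zero(src);
    if (n != 0)                 /* never hand a null pointer to memcpy */
        memcpy(dst + used, src, n);
    return used + n;
}
```

On an implementation that did make memcpy(any,any,0) a no-op, the `if (n != 0)` test could be dropped and the caller would need no null check at all.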

I would favor giving such a function a different name from strlen, and
having a standard macro define its existence, so that code could exploit
it on platforms where it exists, but be compatible with those where it
does not.  If someone is reading such code and encounters the function,
they'd have to look it up once, but would then know the meaning of that
function name and the defined corner cases every time they see it in any
program.  By contrast, if someone is reading code and finds a call to
fredStrlen(), they'd not only have to look at the definition of fredStrlen()
in that program to find out how it works, but they'd have to examine that
function in *every* program they examine, since it may be that e.g. one
version stops looking after 256 characters but another is suitable for
strings of arbitrary length.



#124240

From: Malcolm McLean <malcolm.arthur.mclean@gmail.com>
Date: 2017-12-12 13:39 -0800
Message-ID: <780aa089-aa70-4232-9838-499bae843bf1@googlegroups.com>
In reply to: #124227
On Tuesday, December 12, 2017 at 6:49:30 PM UTC, supe...@casperkitty.com wrote:
> On Saturday, December 9, 2017 at 1:53:35 PM UTC-6, Keith Thompson wrote:
> > My point is that in floating-point we have a clear and useful
> > distinction between a "null" value, 0.0, and a representation that
> > isn't a numeric value, NaN.
> 
> In some cases, one may want to compute e.g. the sum of all numbers 
> in a list which aren't NaN.  In such cases, an operator which means 
> "add X to Y if X isn't NaN, or do nothing if it is" may be useful.  
> That doesn't mean that NaN is just another representation for zero; 
> it does mean, however, that it would be a representation for something
> that should behave as an additive identity in some contexts.
> 
If we're taking the mean, then if a value is zero we increment N,
if a value is NaN then most likely we treat it as missing and
increment neither total nor N. So you need a test for NaN, and
you need the behaviour explicitly documented in your "mean" function.
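
A minimal sketch of such a "mean" function, with an illustrative name. It assumes NaN marks a missing value, and it returns NaN when no values remain, which is one possible choice that would need to be documented.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical helper: mean of the non-NaN elements of values[0..n-1].
   A zero is a real observation and counts toward N; a NaN does not. */
static double mean_ignoring_nan(const double *values, size_t n)
{
    double total = 0.0;
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (isnan(values[i]))
            continue;           /* missing: touch neither total nor count */
        total += values[i];
        count++;
    }
    return count != 0 ? total / (double)count : NAN;
}
```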



#124253

From: supercat@casperkitty.com
Date: 2017-12-12 16:05 -0800
Message-ID: <d7b422a0-6791-4b1a-98fd-342b9687a3a2@googlegroups.com>
In reply to: #124240
On Tuesday, December 12, 2017 at 3:39:30 PM UTC-6, Malcolm McLean wrote:
> On Tuesday, December 12, 2017 at 6:49:30 PM UTC, supe...@casperkitty.com wrote:
> > On Saturday, December 9, 2017 at 1:53:35 PM UTC-6, Keith Thompson wrote:
> > > My point is that in floating-point we have a clear and useful
> > > distinction between a "null" value, 0.0, and a representation that
> > > isn't a numeric value, NaN.
> > 
> > In some cases, one may want to compute e.g. the sum of all numbers 
> > in a list which aren't NaN.  In such cases, an operator which means 
> > "add X to Y if X isn't NaN, or do nothing if it is" may be useful.  
> > That doesn't mean that NaN is just another representation for zero; 
> > it does mean, however, that it would be a representation for something
> > that should behave as an additive identity in some contexts.
> > 
> If we're taking the mean, then if a value is zero we increment N,
> if a value is NaN then most likely we treat it as missing and
> increment neither total nor N. So you need a test for NaN, and
> you need the behaviour explicitly documented in your "mean" function.

I said "sum", rather than "mean".  By way of analogy, if one has a function
to build up a string by concatenating a variety of components, many of
which will be required in some cases but not others, being able to simply
build up a string with all the parts that are specified may be more
convenient than having to explicitly check whether each part is specified
before trying to add it.  To be sure, I think the set of string functions
in the standard library is generally sufficiently feeble that the lack of
an strlen-ish function that behaves cleanly on a null pointer is really
the least of its problems, but there are places I would use such a function
if it existed.
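
A minimal sketch of that string-building case, assuming unspecified components are passed as null pointers. The name join_optional is illustrative only.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: concatenates parts[0..count-1] into a newly
   allocated string, skipping null entries as "not specified".
   Returns NULL if allocation fails; the caller frees the result. */
static char *join_optional(const char *const *parts, size_t count)
{
    size_t total = 0;
    for (size_t i = 0; i < count; i++)
        total += parts[i] ? strlen(parts[i]) : 0;

    char *out = malloc(total + 1);
    if (out == NULL)
        return NULL;

    size_t used = 0;
    for (size_t i = 0; i < count; i++) {
        if (parts[i] == NULL)
            continue;
        size_t n = strlen(parts[i]);
        memcpy(out + used, parts[i], n);
        used += n;
    }
    out[used] = '\0';
    return out;
}
```

The null checks live in one place, so callers can pass every component unconditionally.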



#124277

From: Malcolm McLean <malcolm.arthur.mclean@gmail.com>
Date: 2017-12-13 03:43 -0800
Message-ID: <1be1dc23-3a06-440e-b9a5-6a90aba5bd3b@googlegroups.com>
In reply to: #124253
On Wednesday, December 13, 2017 at 12:05:27 AM UTC, supe...@casperkitty.com wrote:
> On Tuesday, December 12, 2017 at 3:39:30 PM UTC-6, Malcolm McLean wrote:
> > On Tuesday, December 12, 2017 at 6:49:30 PM UTC, supe...@casperkitty.com wrote:
> > > On Saturday, December 9, 2017 at 1:53:35 PM UTC-6, Keith Thompson wrote:
> > > > My point is that in floating-point we have a clear and useful
> > > > distinction between a "null" value, 0.0, and a representation that
> > > > isn't a numeric value, NaN.
> > > 
> > > In some cases, one may want to compute e.g. the sum of all numbers 
> > > in a list which aren't NaN.  In such cases, an operator which means 
> > > "add X to Y if X isn't NaN, or do nothing if it is" may be useful.  
> > > That doesn't mean that NaN is just another representation for zero; 
> > > it does mean, however, that it would be a representation for something
> > > that should behave as an additive identity in some contexts.
> > > 
> > If we're taking the mean, then if a value is zero we increment N,
> > if a value is NaN then most likely we treat it as missing and
> > increment neither total nor N. So you need a test for NaN, and
> > you need the behaviour explicitly documented in your "mean" function.
> 
> I said "sum", rather than "mean".  By way of analogy, if one has a function
> to build up a string by concatenating a variety of components, many of
> which will be required in some cases but not others, being able to simply
> build up a string with all the parts that are specified may be more
> convenient than having to explicitly check whether each part is specified
> before trying to add it.  To be sure, I think the set of string functions
> in the standard library is generally sufficiently feeble that the lack of
> an strlen-ish function that behaves cleanly on a null pointer is really
> the least of its problems, but there are places I would use such a function
> if it existed.
>
Assuming that NaN represents missing values, taking the mean makes sense.
Simply ignore those values. Taking the sum is a slipperier concept and
you don't want it handled at hardware level or in the language definition.
It belongs in the high-level code written by statistical people, who
may not know much about optimising compilers.



#124289

From: supercat@casperkitty.com
Date: 2017-12-13 08:45 -0800
Message-ID: <6b71ac9c-c256-487f-8bbc-5054f33878be@googlegroups.com>
In reply to: #124277
On Wednesday, December 13, 2017 at 5:43:29 AM UTC-6, Malcolm McLean wrote:
> Assuming that NaN represents missing values, taking the mean makes sense.
> Simply ignore those values. Taking the sum is a slipperier concept and
> you don't want it handled at hardware level or in the language definition.
> It belongs in the high-level code written by statistical people, who
> may not know much about optimising compilers.

Having an "add value if defined, or else ignore it" operation which code can
request when it wants such semantics doesn't seem like a slippery concept.
There would be some room for judgment as to how the cost of adding it to a
language compares with the cost of having every piece of code that needs such
functionality implement its own version.  Some languages include such a
construct (e.g. SQL's "sum" operation works that way) while others don't.
