Newsgroup	comp.lang.c

Interval Comparisons

Started by	Lawrence D'Oliveiro <ldo@nz.invalid>
First post	2024-06-04 07:14 +0000
Last post	2024-06-07 10:42 +0000
Articles	20 on this page of 46 — 11 participants

  Interval Comparisons Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-04 07:14 +0000
    Re: Interval Comparisons David Brown <david.brown@hesbynett.no> - 2024-06-04 10:58 +0200
      Re: Interval Comparisons Mikko <mikko.levanto@iki.fi> - 2024-06-04 12:13 +0300
        Re: Interval Comparisons David Brown <david.brown@hesbynett.no> - 2024-06-04 13:02 +0200
          Re: Interval Comparisons bart <bc@freeuk.com> - 2024-06-04 12:23 +0100
            Re: Interval Comparisons David Brown <david.brown@hesbynett.no> - 2024-06-04 15:24 +0200
              Re: Interval Comparisons bart <bc@freeuk.com> - 2024-06-04 15:16 +0100
                Re: Interval Comparisons David Brown <david.brown@hesbynett.no> - 2024-06-04 17:40 +0200
              Re: Interval Comparisons scott@slp53.sl.home (Scott Lurndal) - 2024-06-04 15:27 +0000
                Re: Interval Comparisons bart <bc@freeuk.com> - 2024-06-04 16:58 +0100
                  Re: Interval Comparisons Michael S <already5chosen@yahoo.com> - 2024-06-04 19:25 +0300
                    Re: Interval Comparisons bart <bc@freeuk.com> - 2024-06-04 17:54 +0100
            Re: Interval Comparisons Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-05 03:29 +0000
          Re: Interval Comparisons Mikko <mikko.levanto@iki.fi> - 2024-06-04 16:11 +0300
        Re: Interval Comparisons Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-06-04 15:42 +0200
        Re: Interval Comparisons Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-04 14:04 -0700
    Re: Interval Comparisons bart <bc@freeuk.com> - 2024-06-04 11:39 +0100
    Re: Interval Comparisons Thiago Adams <thiago.adams@gmail.com> - 2024-06-04 08:32 -0300
      Re: Interval Comparisons Bonita Montero <Bonita.Montero@gmail.com> - 2024-06-04 13:37 +0200
      Re: Interval Comparisons Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-04 15:29 -0700
    Re: Interval Comparisons Blue-Maned_Hawk <bluemanedhawk@invalid.invalid> - 2024-06-04 11:41 +0000
      Re: Interval Comparisons Michael S <already5chosen@yahoo.com> - 2024-06-04 15:17 +0300
      Re: Interval Comparisons Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-04 23:12 +0000
        Re: Interval Comparisons bart <bc@freeuk.com> - 2024-06-05 00:22 +0100
          Re: Interval Comparisons Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-05 01:30 +0000
            Re: Interval Comparisons bart <bc@freeuk.com> - 2024-06-06 19:48 +0100
              Re: Interval Comparisons Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-06 22:54 +0000
                Re: Interval Comparisons bart <bc@freeuk.com> - 2024-06-07 01:52 +0100
                  Re: Interval Comparisons Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-07 02:17 +0000
                    Re: Interval Comparisons Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-06 20:53 -0700
                      Re: Interval Comparisons Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-07 04:25 +0000
                      Re: Interval Comparisons David Brown <david.brown@hesbynett.no> - 2024-06-07 11:22 +0200
                        Re: Interval Comparisons Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-07 02:55 -0700
                          Re: Interval Comparisons David Brown <david.brown@hesbynett.no> - 2024-06-07 13:04 +0200
                            Re: Interval Comparisons Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-07 11:57 -0700
                              Re: Interval Comparisons David Brown <david.brown@hesbynett.no> - 2024-06-08 17:42 +0200
                        Re: Interval Comparisons bart <bc@freeuk.com> - 2024-06-07 11:28 +0100
                          Re: Interval Comparisons Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-07 10:45 +0000
                            Re: Interval Comparisons Michael S <already5chosen@yahoo.com> - 2024-06-07 14:51 +0300
                          Re: Interval Comparisons David Brown <david.brown@hesbynett.no> - 2024-06-07 13:17 +0200
                            Re: Interval Comparisons bart <bc@freeuk.com> - 2024-06-07 13:20 +0100
                              Re: Interval Comparisons David Brown <david.brown@hesbynett.no> - 2024-06-09 13:26 +0200
                                Re: Interval Comparisons bart <bc@freeuk.com> - 2024-06-10 16:33 +0100
                                  Re: Interval Comparisons David Brown <david.brown@hesbynett.no> - 2024-06-10 17:56 +0200
                            Re: Interval Comparisons scott@slp53.sl.home (Scott Lurndal) - 2024-06-07 14:00 +0000
                        Re: Interval Comparisons Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-07 10:42 +0000

#385515 — Interval Comparisons

From	Lawrence D'Oliveiro <ldo@nz.invalid>
Date	2024-06-04 07:14 +0000
Subject	Interval Comparisons
Message-ID	<v3merq$b1uj$1@dont-email.me>

Would it break backward compatibility for C to add a feature like this 
from Python? Namely, the ability to check if a value lies in an interval:

    def valid_char(c) :
        "is integer c the code for a valid Unicode character." \
        " This excludes surrogates."
        return \
            (
                0 <= c <= 0x10FFFF
            and
                not (0xD800 <= c < 0xE000)
            )
    #end valid_char

#385526

From	David Brown <david.brown@hesbynett.no>
Date	2024-06-04 10:58 +0200
Message-ID	<v3ml0d$bpds$5@dont-email.me>
In reply to	#385515

On 04/06/2024 09:14, Lawrence D'Oliveiro wrote:
> Would it break backward compatibility for C to add a feature like this
> from Python? Namely, the ability to check if a value lies in an interval:
> 
>      def valid_char(c) :
>          "is integer c the code for a valid Unicode character." \
>          " This excludes surrogates."
>          return \
>              (
>                  0 <= c <= 0x10FFFF
>              and
>                  not (0xD800 <= c < 0xE000)
>              )
>      #end valid_char

Do you mean, could C treat "a <= x <= b" as "(a <= x) && (x <= b)" 
without breaking existing code?  The answer is no, C treats it as the 
expression "(a <= x) <= b".  So you would be changing the meaning of 
existing C code.  I think it's fair to say there is likely to be very 
little existing correct and working C code that relies on the current 
interpretation of such expressions, but the possibility is enough to 
rule out such a change ever happening in C.  (And it would also 
complicate the grammar a fair bit.)

<https://c-faq.com/expr/transitivity.html>
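
A small C sketch of the parse described above, for illustration only. The function names are made up for this example: the first returns what C computes today for "a <= x <= b", the second what the Python-style reading would give.

```c
#include <assert.h>
#include <stdbool.h>

/* What C computes today for `a <= x <= b`: the left comparison
   yields 0 or 1, and that int is then compared with b. */
bool chained_c(int a, int x, int b)
{
    return (a <= x) <= b;
}

/* What the Python-style reading would compute instead. */
bool chained_python(int a, int x, int b)
{
    return (a <= x) && (x <= b);
}

/* Two inputs on which the readings disagree. */
void check_chained(void)
{
    assert(chained_c(0, 500, 100));        /* (0 <= 500) is 1, and 1 <= 100 */
    assert(!chained_python(0, 500, 100));  /* 500 is not in [0, 100] */
    assert(chained_c(10, 5, 0));           /* (10 <= 5) is 0, and 0 <= 0 */
    assert(!chained_python(10, 5, 0));     /* 5 is not in [10, 0] */
}
```

Calling check_chained() from main runs the assertions. Any existing code that behaves like chained_c would silently change meaning under the new rule.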

#385527

From	Mikko <mikko.levanto@iki.fi>
Date	2024-06-04 12:13 +0300
Message-ID	<v3mlrb$c7d5$1@dont-email.me>
In reply to	#385526

On 2024-06-04 08:58:53 +0000, David Brown said:

> On 04/06/2024 09:14, Lawrence D'Oliveiro wrote:
>> Would it break backward compatibility for C to add a feature like this
>> from Python? Namely, the ability to check if a value lies in an interval:
>> 
>>     def valid_char(c) :
>>         "is integer c the code for a valid Unicode character." \
>>         " This excludes surrogates."
>>         return \
>>             (
>>                 0 <= c <= 0x10FFFF
>>             and
>>                 not (0xD800 <= c < 0xE000)
>>             )
>>     #end valid_char
> 
> Do you mean, could C treat "a <= x <= b" as "(a <= x) && (x <= b)" 
> without breaking existing code?  The answer is no, C treats it as the 
> expression "(a <= x) <= b".  So you would be changing the meaning of 
> existing C code.  I think it's fair to say there is likely to be very 
> little existing correct and working C code that relies on the current 
> interpretation of such expressions, but the possibility is enough to 
> rule out such a change ever happening in C.  (And it would also 
> complicate the grammar a fair bit.)
> 
> 
> <https://c-faq.com/expr/transitivity.html>

That does not prevent doing the same with a different syntax.
The main difference is that in the current C syntax that cannot be
said without mentioning c twice. In the example program C would
require that c is mentioned four times but the shown Python code
only needs it mentioned twice. An ideal syntax would only mention
it once, perhaps

  return c in 0 .. 0xD7FF, 0xE000 .. 0x10FFFF ;

or

  return c : [0 .. 0xD800), [0xE000 .. 0x10FFFF] ;

or something like that, preferably so that no new reserved word is
needed.

-- 
Mikko

#385529

From	David Brown <david.brown@hesbynett.no>
Date	2024-06-04 13:02 +0200
Message-ID	<v3ms7b$d5sq$1@dont-email.me>
In reply to	#385527

On 04/06/2024 11:13, Mikko wrote:
> On 2024-06-04 08:58:53 +0000, David Brown said:
> 
>> On 04/06/2024 09:14, Lawrence D'Oliveiro wrote:
>>> Would it break backward compatibility for C to add a feature like this
>>> from Python? Namely, the ability to check if a value lies in an 
>>> interval:
>>>
>>>     def valid_char(c) :
>>>         "is integer c the code for a valid Unicode character." \
>>>         " This excludes surrogates."
>>>         return \
>>>             (
>>>                 0 <= c <= 0x10FFFF
>>>             and
>>>                 not (0xD800 <= c < 0xE000)
>>>             )
>>>     #end valid_char
>>
>> Do you mean, could C treat "a <= x <= b" as "(a <= x) && (x <= b)" 
>> without breaking existing code?  The answer is no, C treats it as the 
>> expression "(a <= x) <= b".  So you would be changing the meaning of 
>> existing C code.  I think it's fair to say there is likely to be very 
>> little existing correct and working C code that relies on the current 
>> interpretation of such expressions, but the possibility is enough to 
>> rule out such a change ever happening in C.  (And it would also 
>> complicate the grammar a fair bit.)
>>
>>
>> <https://c-faq.com/expr/transitivity.html>
> 
> That does not prevent doing the same with a different syntax.
> The main difference is that in the current C syntax that cannot be
> said without mentioning c twice. In the example program C would
> require that c is mentioned four times but the shown Python code
> only needs it mentioned twice. An ideal syntax would only mention
> it once, perhaps
> 
>   return c in 0 .. 0xD7FF, 0xE000 .. 0x10FFFF ;
> 
> or
> 
>   return c : [0 .. 0xD800), [0xE000 .. 0x10FFFF] ;
> 
> or something like that, preferably so that no new reserved word is
> needed.
> 

Sure, you can always add new things to a language if they would 
previously have been syntax errors or constraint errors.  But is there a 
use for it?

It is fine if you have a language that has good support for lists, sets, 
ranges, and other higher-level features - then an "in" keyword is a 
great idea.  But C is not such a language, and that kind of feature 
would be well outside the scope of the language.

It would be easy enough to write a macro "in_range(a, x, b)" that would 
do the job.  It is even easier, and more productive, that you simply 
write the "valid_char" function and use it, if that's what you need.
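
A minimal sketch of both suggestions, assuming the (a, x, b) argument order mentioned above. The macro body and the choice of long for the parameter are assumptions made for this example, not something specified in the thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical macro using the (a, x, b) argument order from the post.
   Caveat: x appears twice in the expansion, so an argument with side
   effects may be evaluated twice. */
#define in_range(a, x, b) ((a) <= (x) && (x) <= (b))

/* C counterpart of the Python valid_char from the first post: true if c
   is a Unicode code point other than a surrogate. long is used because
   it is at least 32 bits wide and can also represent negative inputs. */
bool valid_char(long c)
{
    return in_range(0, c, 0x10FFFF) && !in_range(0xD800, c, 0xDFFF);
}

void check_valid_char(void)
{
    assert(valid_char(0x41));
    assert(valid_char(0xD7FF) && valid_char(0xE000));
    assert(!valid_char(0xD800) && !valid_char(0xDFFF));
    assert(valid_char(0x10FFFF) && !valid_char(0x110000));
    assert(!valid_char(-1));
}
```

Calling check_valid_char() from main exercises the boundaries of both intervals.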

#385531

From	bart <bc@freeuk.com>
Date	2024-06-04 12:23 +0100
Message-ID	<v3mtf2$ct28$2@dont-email.me>
In reply to	#385529

On 04/06/2024 12:02, David Brown wrote:
> On 04/06/2024 11:13, Mikko wrote:
>> On 2024-06-04 08:58:53 +0000, David Brown said:
>>
>>> On 04/06/2024 09:14, Lawrence D'Oliveiro wrote:
>>>> Would it break backward compatibility for C to add a feature like this
>>>> from Python? Namely, the ability to check if a value lies in an 
>>>> interval:
>>>>
>>>>     def valid_char(c) :
>>>>         "is integer c the code for a valid Unicode character." \
>>>>         " This excludes surrogates."
>>>>         return \
>>>>             (
>>>>                 0 <= c <= 0x10FFFF
>>>>             and
>>>>                 not (0xD800 <= c < 0xE000)
>>>>             )
>>>>     #end valid_char
>>>
>>> Do you mean, could C treat "a <= x <= b" as "(a <= x) && (x <= b)" 
>>> without breaking existing code?  The answer is no, C treats it as the 
>>> expression "(a <= x) <= b".  So you would be changing the meaning of 
>>> existing C code.  I think it's fair to say there is likely to be very 
>>> little existing correct and working C code that relies on the current 
>>> interpretation of such expressions, but the possibility is enough to 
>>> rule out such a change ever happening in C.  (And it would also 
>>> complicate the grammar a fair bit.)
>>>
>>>
>>> <https://c-faq.com/expr/transitivity.html>
>>
>> That does not prevent doing the same with a different syntax.
>> The main difference is that in the current C syntax that cannot be
>> said without mentioning c twice. In the example program C would
>> require that c is mentioned four times but the shown Python code
>> only needs it mentioned twice. An ideal syntax would only mention
>> it once, perhaps
>>
>>   return c in 0 .. 0xD7FF, 0xE000 .. 0x10FFFF ;
>>
>> or
>>
>>   return c : [0 .. 0xD800), [0xE000 .. 0x10FFFF] ;
>>
>> or something like that, preferably so that no new reserved word is
>> needed.
>>
> 
> Sure, you can always add new things to a language if they would 
> previously have been syntax errors or constraint errors.  But is there a 
> use for it?
> 
> It is fine if you have a language that has good support for lists, sets, 
> ranges, and other higher-level features - then an "in" keyword is a 
> great idea.  But C is not such a language, and that kind of feature 
> would be well outside the scope of the language.

I disagree. I have a script language where 'in' works with all sorts of 
data types, and where ranges like a..b and sets like [a..b, c, d, e] are 
actual types.

Yet I also introduced 'in' into my systems language, even though it is 
very restricted:

     if a in b..c then
     if a in [b, c, d] then

This is limited to integer types. The set construct here doesn't allow 
ranges (it could have done). Neither the range nor the set is a datatype - 
it is just syntax. (I can't do range r := 1..10.)

It is incredibly useful:

    if c in [' ', '\t', '\n'] then ... # whitespace
    if b in 0..255 then
    if b in u8.bounds then             # alternative

Not to forget:

    if x = y = 0 then                  # both x and y are zero

It doesn't need the full spec of the higher level language.

> It would be easy enough to write a macro "in_range(a, x, b)" that would 
> do the job.  It is even easier, and more productive, that you simply 
> write the "valid_char" function and use it, if that's what you need.

Yes it would be easier - to provide an ugly, half-assed solution that 
everyone will write a different way (I would use (x, a, b) for example),
and which can go wrong as soon as someone writes (a, x(), b).

That's the problem with the macro scheme, it stops the language properly 
evolving.
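
The (a, x(), b) hazard can be sketched in C as follows. IN_RANGE, in_range and next_value are hypothetical names invented for this illustration; next_value simply stands in for any call with side effects.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical macro: x appears twice in the expansion. */
#define IN_RANGE(a, x, b) ((a) <= (x) && (x) <= (b))

/* Function alternative: each argument is evaluated exactly once. */
bool in_range(long a, long x, long b)
{
    return a <= x && x <= b;
}

/* Stand-in for a call with side effects: returns 10, 20, 30, ... */
int calls = 0;
long next_value(void)
{
    return 10L * ++calls;
}

void check_double_evaluation(void)
{
    calls = 0;
    /* The macro compares 0 <= 10, then calls again and compares 20 <= 15. */
    bool via_macro = IN_RANGE(0, next_value(), 15);
    assert(!via_macro && calls == 2);

    calls = 0;
    /* The function sees the single value 10, which lies inside [0, 15]. */
    bool via_function = in_range(0, next_value(), 15);
    assert(via_function && calls == 1);
}
```

The function version avoids the double call but fixes the operand type to long, which is one reason such helpers tend to be written differently in every code base.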

#385539

From	David Brown <david.brown@hesbynett.no>
Date	2024-06-04 15:24 +0200
Message-ID	<v3n4is$emdc$1@dont-email.me>
In reply to	#385531

On 04/06/2024 13:23, bart wrote:
> On 04/06/2024 12:02, David Brown wrote:
>> On 04/06/2024 11:13, Mikko wrote:
>>> On 2024-06-04 08:58:53 +0000, David Brown said:
>>>
>>>> On 04/06/2024 09:14, Lawrence D'Oliveiro wrote:
>>>>> Would it break backward compatibility for C to add a feature like this
>>>>> from Python? Namely, the ability to check if a value lies in an 
>>>>> interval:
>>>>>
>>>>>     def valid_char(c) :
>>>>>         "is integer c the code for a valid Unicode character." \
>>>>>         " This excludes surrogates."
>>>>>         return \
>>>>>             (
>>>>>                 0 <= c <= 0x10FFFF
>>>>>             and
>>>>>                 not (0xD800 <= c < 0xE000)
>>>>>             )
>>>>>     #end valid_char
>>>>
>>>> Do you mean, could C treat "a <= x <= b" as "(a <= x) && (x <= b)" 
>>>> without breaking existing code?  The answer is no, C treats it as 
>>>> the expression "(a <= x) <= b".  So you would be changing the 
>>>> meaning of existing C code.  I think it's fair to say there is 
>>>> likely to be very little existing correct and working C code that 
>>>> relies on the current interpretation of such expressions, but the 
>>>> possibility is enough to rule out such a change ever happening in 
>>>> C.  (And it would also complicate the grammar a fair bit.)
>>>>
>>>>
>>>> <https://c-faq.com/expr/transitivity.html>
>>>
>>> That does not prevent doing the same with a different syntax.
>>> The main difference is that in the current C syntax that cannot be
>>> said without mentioning c twice. In the example program C would
>>> require that c is mentioned four times but the shown Python code
>>> only needs it mentioned twice. An ideal syntax would only mention
>>> it once, perhaps
>>>
>>>   return c in 0 .. 0xD7FF, 0xE000 .. 0x10FFFF ;
>>>
>>> or
>>>
>>>   return c : [0 .. 0xD800), [0xE000 .. 0x10FFFF] ;
>>>
>>> or something like that, preferably so that no new reserved word is
>>> needed.
>>>
>>
>> Sure, you can always add new things to a language if they would 
>> previously have been syntax errors or constraint errors.  But is there 
>> a use for it?
>>
>> It is fine if you have a language that has good support for lists, 
>> sets, ranges, and other higher-level features - then an "in" keyword 
>> is a great idea.  But C is not such a language, and that kind of 
>> feature would be well outside the scope of the language.
> 
> I disagree. I have a script language where 'in' works with all sorts of 
> data types, and where ranges like a..b and sets like [a..b, c, d, e] are 
> actual types.

C is not a script language.

> 
> Yet I also introduced 'in' into my systems language, even though it is 
> very restricted:
> 
>      if a in b..c then
>      if a in [b, c, d] then
> 
> This is limited to integer types. The set construct here doesn't allow 
> ranges (it could have done). Neither the range nor the set is a datatype - 
> it is just syntax. (I can't do range r := 1..10.)

Adding such a feature to your own personal language, for your own 
personal use, is easy enough (relative to the rest of the work involved 
in designing your own personal language and making tools for it, which 
is of course no small feat).  Adding it to C with its standards, 
existing code, toolchains, additional tools, developers, etc., is a 
whole different kettle of fish.

I don't think it would be practical to add it to C in a way that is 
simple and restricted enough to be suitable for C, while also being 
useful enough to make it worth the effort.

Remember, when you add these things to your own language, you have your 
own needs in mind and can ignore everything else, all corner cases, and 
all complications.  Putting a feature in C means making decisions like 
figuring out what type the expression "b..c" has, whether the various 
bits and pieces have to be constants or if they can be variables, how 
the operator precedences work, how to treat floating point numbers or 
mixes of different types, and countless other factors.  If a language 
already has the concepts, rules and grammar for ranges or lists, adding 
an "in" operator is natural - if not, then it's a huge amount of extra 
junk pulled into the language and syntax for a very minor gain.

I don't disagree that it could be useful, and I'm sure I'd use it if it 
existed in C, I just disagree that it makes sense in C.

> 
> It is incredibly useful:
> 
>     if c in [' ', '\t', '\n'] then ... # whitespace
>     if b in 0..255 then
>     if b in u8.bounds then             # alternative
> 
> Not to forget:
> 
>     if x = y = 0 then                  # both x and y are zero
> 
> It doesn't need the full spec of the higher level language.
> 
>> It would be easy enough to write a macro "in_range(a, x, b)" that 
>> would do the job.  It is even easier, and more productive, that you 
>> simply write the "valid_char" function and use it, if that's what you 
>> need.
> 
> Yes it would be easier - to provide an ugly, half-assed solution that 

You and I are British - the term is "half-arsed" :-)

> everyone will write a different way (I would use (x, a, b) for example),
> and which can go wrong as soon as someone writes (a, x(), b).
> 
> That's the problem with the macro scheme, it stops the language properly 
> evolving.
> 

If it were considered useful enough, it could be standardised in the C 
library.  If it is not useful enough to standardise in the library, it 
is certainly not useful enough to put in the language itself.

In practice, while I would put something like this in a new language, I 
don't think it is important enough to try to add to C.  When you need to 
do a lot of checks, you'd put them within a function (or macro if you 
prefer), such as "isspace()".

#385544

From	bart <bc@freeuk.com>
Date	2024-06-04 15:16 +0100
Message-ID	<v3n7ko$evip$1@dont-email.me>
In reply to	#385539

On 04/06/2024 14:24, David Brown wrote:
> On 04/06/2024 13:23, bart wrote:
>> On 04/06/2024 12:02, David Brown wrote:

>>> It is fine if you have a language that has good support for lists, 
>>> sets, ranges, and other higher-level features - then an "in" keyword 
>>> is a great idea.  But C is not such a language, and that kind of 
>>> feature would be well outside the scope of the language.
>>
>> I disagree. I have a script language where 'in' works with all sorts 
>> of data types, and where ranges like a..b and sets like [a..b, c, d, 
>> e] are actual types.
> 
> C is not a script language.
> 
>>
>> Yet I also introduced 'in' into my systems language, even though it is 
>> very restricted:
>>
>>      if a in b..c then
>>      if a in [b, c, d] then
>>
>> This is limited to integer types. The set construct here doesn't allow 
>> ranges (it could have done). Neither the range nor the set is a datatype - 
>> it is just syntax. (I can't do range r := 1..10.)
> 
> Adding such a feature to your own personal language, for your own 
> personal use, is easy enough (relative to the rest of the work involved 
> in designing your own personal language and making tools for it, which 
> is of course no small feat).  Adding it to C with its standards, 
> existing code, toolchains, additional tools, developers, etc., is a 
> whole different kettle of fish.

I was responding to your comment:

"and that kind of feature would be well outside the scope of the language."

I think it can suit that level of language if you avoid being too ambitious.

I agree it is not practical to apply to C at this point, not without 
making it ugly or unwieldy enough that people might as well use existing 
solutions.

(Such a feature also aids simpler non-optimising compilers. Take these 
examples that all do the same thing:

     if a <= f() and f() <= c then fi

     if a <= f() <= c then fi

     if f() in a..c then fi

If the two f() calls in the first example were considered common 
subexpressions, I don't have the means in my compiler to detect that 
and evaluate them just once.

In the other two examples, the language lets you express that directly.

Even for a simpler 'b in a..c' example, it is easier to generate more 
efficient code, and do that more efficiently too than building something 
up only to tear it down again.)


>>> It would be easy enough to write a macro "in_range(a, x, b)" that 
>>> would do the job.  It is even easier, and more productive, that you 
>>> simply write the "valid_char" function and use it, if that's what you 
>>> need.
>>
>> Yes it would be easier - to provide an ugly, half-assed solution that 
> 
> You and I are British - the term is "half-arsed" :-)

I'm catering for a wider readership.

(Actually I'm not quite considered British enough to be allowed in the 
upcoming election.)

#385546

From	David Brown <david.brown@hesbynett.no>
Date	2024-06-04 17:40 +0200
Message-ID	<v3nch6$frn1$1@dont-email.me>
In reply to	#385544

On 04/06/2024 16:16, bart wrote:
> On 04/06/2024 14:24, David Brown wrote:
>> On 04/06/2024 13:23, bart wrote:
>>> On 04/06/2024 12:02, David Brown wrote:
> 
>>>> It is fine if you have a language that has good support for lists, 
>>>> sets, ranges, and other higher-level features - then an "in" keyword 
>>>> is a great idea.  But C is not such a language, and that kind of 
>>>> feature would be well outside the scope of the language.
>>>
>>> I disagree. I have a script language where 'in' works with all sorts 
>>> of data types, and where ranges like a..b and sets like [a..b, c, d, 
>>> e] are actual types.
>>
>> C is not a script language.
>>
>>>
>>> Yet I also introduced 'in' into my systems language, even though it 
>>> is very restricted:
>>>
>>>      if a in b..c then
>>>      if a in [b, c, d] then
>>>
>>> This is limited to integer types. The set construct here doesn't 
>>> allow ranges (it could have done). Neither the range nor the set is a 
>>> datatype - it is just syntax. (I can't do range r := 1..10.)
>>
>> Adding such a feature to your own personal language, for your own 
>> personal use, is easy enough (relative to the rest of the work 
>> involved in designing your own personal language and making tools for 
>> it, which is of course no small feat).  Adding it to C with its 
>> standards, existing code, toolchains, additional tools, developers, 
>> etc., is a whole different kettle of fish.
> 
> I was responding to your comment:
> 
> "and that kind of feature would be well outside the scope of the language."
> 
> I think it can suit that level of language if you avoid being too 
> ambitious.
> 

It might be that we would agree on that if we worked hard enough to find 
a common definition for "that level of language".  But I think that 
would be a lot of time and effort for little purpose.  I do agree that 
with enough limitation in the scope of the feature, it is less 
unreasonable for a low-level language.  But I think I would want to 
limit the scope until there is little point in the "in" operator - or I 
would want to go the other direction and define something like Pascal's 
sets with many more operators and uses.

> I agree it is not practical to apply to C at this point, not without 
> making it ugly or unwieldy enough that people might as well use existing 
> solutions.

Yes.

> 
> (Such a feature also aids simpler non-optimising compilers. Take these 
> examples that all do the same thing:
> 
>      if a <= f() and f() <= c then fi
> 
>      if a <= f() <= c then fi
> 
>      if f() in a..c then fi
> 
> If the two f() calls in the first example were considered common 
> subexpressions, I don't have the means in my compiler to detect that 
> and evaluate them just once.
> 

I see your point, but I rate the design and use of a language as /much/ 
more important than the ease of implementation.  I realise the balance 
is a bit different when the user is the implementer.

> In the other two examples, the language lets you express that directly.
> 
> Even for a simpler 'b in a..c' example, it is easier to generate more 
> efficient code, and do that more efficiently too than building something 
> up only to tear it down again.)
> 
> 
>>>> It would be easy enough to write a macro "in_range(a, x, b)" that 
>>>> would do the job.  It is even easier, and more productive, that you 
>>>> simply write the "valid_char" function and use it, if that's what 
>>>> you need.
>>>
>>> Yes it would be easier - to provide an ugly, half-assed solution that 
>>
>> You and I are British - the term is "half-arsed" :-)
> 
> I'm catering for a wider readership.

We can educate them!

> 
> (Actually I'm not quite considered British enough to be allowed in the 
> upcoming election.)
> 

I can't vote either, but that's because I don't live in the UK.  And 
given the state of UK politics these days, I'm happy to be out of it. 
For quite a while, the Scottish Parliament were looking like the adults 
in the room, but they've managed to mess things up for themselves too.

#385545

From	scott@slp53.sl.home (Scott Lurndal)
Date	2024-06-04 15:27 +0000
Message-ID	<DnG7O.8145$_US6.7552@fx44.iad>
In reply to	#385539

David Brown <david.brown@hesbynett.no> writes:
>On 04/06/2024 13:23, bart wrote:

>> It is incredibly useful:
>> 
>>     if c in [' ', '\t', '\n'] then ... # whitespace

if (strpbrk(c, " \t\n") != NULL) it_is_whitespace.

>
>If it were considered useful enough, it could be standardised in the C 
>library.  If it is not useful enough to standardise in the library, it 
>is certainly not useful enough to put in the language itself.

indeed.

#385547

From	bart <bc@freeuk.com>
Date	2024-06-04 16:58 +0100
Message-ID	<v3ndji$fv12$1@dont-email.me>
In reply to	#385545

On 04/06/2024 16:27, Scott Lurndal wrote:
> David Brown <david.brown@hesbynett.no> writes:
>> On 04/06/2024 13:23, bart wrote:
> 
>>> It is incredibly useful:
>>>
>>>      if c in [' ', '\t', '\n'] then ... # whitespace
> 
> if (strpbrk(c, " \t\n") != NULL) it_is_whitespace.

That doesn't do the same thing. In my example, c is a character, not a 
string.

To achieve the same thing using strpbrk requires code like this:

     char c[2];

     c[0]=rand()&255;            // Create a string
     c[1]=0;

     if (strpbrk(c, " \t\n") != NULL) puts("whitespace");

If I compile this with gcc -O3, then the checking part is this:

     lea rcx, 46[rsp]
     mov BYTE PTR 47[rsp], 0
     lea rdx, .LC0[rip]
     mov BYTE PTR 46[rsp], al
     call    strpbrk             // CALL TO LIBRARY FUNCTION
     test    rax, rax
     je  .L2
     lea rcx, .LC1[rip]
     call    puts

I don't know what it gets up to inside strpbrk. If I write this in my 
language:

     if c in [9,10,32] then
         puts("whitespace")
     fi

The generated code is this (using alternate register names, D0 = rax):

     mov   D0, D3      # (could have tested D3 (= c) directly.)
     cmp   D0, 9
     jz    L4
     cmp   D0, 10
     jz    L4
     cmp   D0, 32
     jnz   L3
L4:
     lea   D10, [L5]
     call  puts*
L3:

Anyway, the construct is not limited to character codes that can be 
contained within a string. It works for 64-bit values which can include 
0. And it could be extended to other scalar types like floats and pointers.
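
For comparison, a sketch of how standard C can already express a small-set test on a full-width integer with a switch, with no string involved. The name is_whitespace_code is hypothetical, and whether a particular compiler turns this into the compare-and-branch sequence shown above is not guaranteed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Membership test on a 64-bit value. Unlike the string-based
   approaches, the value is not limited to a byte, and a set
   containing 0 would work the same way. */
bool is_whitespace_code(int64_t c)
{
    switch (c) {
    case 9:   /* '\t' in ASCII */
    case 10:  /* '\n' in ASCII */
    case 32:  /* ' ' in ASCII */
        return true;
    default:
        return false;
    }
}

void check_is_whitespace_code(void)
{
    assert(is_whitespace_code(9));
    assert(is_whitespace_code(10));
    assert(is_whitespace_code(32));
    assert(!is_whitespace_code(0));
    assert(!is_whitespace_code('a'));
    /* The high bits take part in the comparison, so this is not 32. */
    assert(!is_whitespace_code(INT64_C(0x100000020)));
}
```

This still names the tested value only once, but it needs a function or statement context rather than fitting inside an expression.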

#385548

From	Michael S <already5chosen@yahoo.com>
Date	2024-06-04 19:25 +0300
Message-ID	<20240604192547.00003fbd@yahoo.com>
In reply to	#385547

On Tue, 4 Jun 2024 16:58:43 +0100
bart <bc@freeuk.com> wrote:

> On 04/06/2024 16:27, Scott Lurndal wrote:
> > David Brown <david.brown@hesbynett.no> writes:  
> >> On 04/06/2024 13:23, bart wrote:  
> >   
> >>> It is incredibly useful:
> >>>
> >>>      if c in [' ', '\t', '\n'] then ... # whitespace  
> > 
> > if (strpbrk(c, " \t\n") != NULL) it_is_whitespace.  
> 
> That doesn't do the same thing. In my example, c is a character, not
> a string.
>


Would that be better?
if (memchr(" \t\n", c, 3) != NULL)
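
Wrapped in a function, the memchr test might look like the sketch below. The function name is an assumption for this example; memchr itself is the standard <string.h> function.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True if c is a space, a tab or a newline. */
bool is_space_tab_newline(char c)
{
    /* The length 3 stops before the literal's terminating null,
       so c == '\0' is not reported as a match. */
    return memchr(" \t\n", c, 3) != NULL;
}

void check_is_space_tab_newline(void)
{
    assert(is_space_tab_newline(' '));
    assert(is_space_tab_newline('\t'));
    assert(is_space_tab_newline('\n'));
    assert(!is_space_tab_newline('a'));
    assert(!is_space_tab_newline('\0'));
}
```

Because memchr converts its second argument to unsigned char, this form only distinguishes byte values.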

#385549

From	bart <bc@freeuk.com>
Date	2024-06-04 17:54 +0100
Message-ID	<v3ngs9$gksq$1@dont-email.me>
In reply to	#385548

On 04/06/2024 17:25, Michael S wrote:
> On Tue, 4 Jun 2024 16:58:43 +0100
> bart <bc@freeuk.com> wrote:
> 
>> On 04/06/2024 16:27, Scott Lurndal wrote:
>>> David Brown <david.brown@hesbynett.no> writes:
>>>> On 04/06/2024 13:23, bart wrote:
>>>    
>>>>> It is incredibly useful:
>>>>>
>>>>>       if c in [' ', '\t', '\n'] then ... # whitespace
>>>
>>> if (strpbrk(c, " \t\n") != NULL) it_is_whitespace.
>>
>> That doesn't do the same thing. In my example, c is a character, not
>> a string.
>>
> 
> 
> Would that be better?
> if (memchr(" \t\n", c, 3) != NULL)
> 

It's a better match. But gcc -O3 still calls the library function in 
versions up to 12.x. Later versions are smart enough to generate code 
similar to what my non-optimising compiler produces for the built-in feature.

It is also still limited to byte values.

My approach is to fix the language, which is easier than expending 
orders of magnitude more effort on elaborate tools and ultra-smart compilers.

#385571

From	Lawrence D'Oliveiro <ldo@nz.invalid>
Date	2024-06-05 03:29 +0000
Message-ID	<v3om2e$qb6k$2@dont-email.me>
In reply to	#385531

On Tue, 4 Jun 2024 12:23:15 +0100, bart wrote:

> That's the problem with the macro scheme, it stops the language properly
> evolving.

The problem is the way C does macros. Other languages with powerful macros 
(*cough* Lisp *cough*) aren’t stopped from evolving; quite the opposite.

#385538

From	Mikko <mikko.levanto@iki.fi>
Date	2024-06-04 16:11 +0300
Message-ID	<v3n3q0$ei7c$1@dont-email.me>
In reply to	#385529

On 2024-06-04 11:02:03 +0000, David Brown said:

> On 04/06/2024 11:13, Mikko wrote:
>> On 2024-06-04 08:58:53 +0000, David Brown said:
>> 
>>> On 04/06/2024 09:14, Lawrence D'Oliveiro wrote:
>>>> Would it break backward compatibility for C to add a feature like this
>>>> from Python? Namely, the ability to check if a value lies in an interval:
>>>> 
>>>> def valid_char(c) :
>>>> "is integer c the code for a valid Unicode character." \
>>>> " This excludes surrogates."
>>>> return \
>>>> (
>>>> 0 <= c <= 0x10FFFF
>>>> and
>>>> not (0xD800 <= c < 0xE000)
>>>> )
>>>> #end valid_char
>>> 
>>> Do you mean, could C treat "a <= x <= b" as "(a <= x) && (x <= b)" 
>>> without breaking existing code?  The answer is no, C treats it as the 
>>> expression "(a <= x) <= b".  So you would be changing the meaning of 
>>> existing C code.  I think it's fair to say there is likely to be very 
>>> little existing correct and working C code that relies on the current 
>>> interpretation of such expressions, but the possibility is enough to 
>>> rule out such a change ever happening in C.  (And it would also 
>>> complicate the grammar a fair bit.)
>>> 
>>> 
>>> <https://c-faq.com/expr/transitivity.html>
>> 
>> That does not prevent doing the same with a different syntax.
>> The main difference is that in the current C syntax that cannot be
>> said without mentioning c twice. In the example program C would
>> require that c is mentioned four times but the shown Python code
>> only needs it mentioned twice. An ideal syntax would only mention
>> it once, perhaps
>> 
>>  return c in 0 .. 0xD7FF, 0xE000 .. 0x10FFFF ;
>> 
>> or
>> 
>>  return c : [0 .. 0xD800), [0xE000 .. 0x10FFFF] ;
>> 
>> or something like that, preferably so that no new reserved word is
>> needed.
>> 
> 
> Sure, you can always add new things to a language if they would 
> previously have been syntax errors or constraint errors.  But is there 
> a use for it?

I don't see any need. That c must be mentioned twice for each interval is
not a problem. If there is a complex expression in place of c it can be
computed and stored to a variable before comparison to an interval.

> It is fine if you have a language that has good support for lists, 
> sets, ranges, and other higher-level features - then an "in" keyword is 
> a great idea.  But C is not such a language, and that kind of feature 
> would be well outside the scope of the language.

Or, if one for some reason does it in C anyway, one should have or make
a library of the essential functions, including membership tests.

> It would be easy enough to write a macro "in_range(a, x, b)" that would 
> do the job.  It is even easier, and more productive, that you simply 
> write the "valid_char" function and use it, if that's what you need.

Indeed.
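
A minimal sketch of what that could look like in C, assuming the code point arrives as a long so that negative inputs can be rejected. in_range is written here as an inline function rather than a macro (my choice, not something specified upthread) so that its middle argument is evaluated exactly once. Both names are only illustrative.

```c
#include <stdbool.h>

/* Hypothetical helper: closed-interval test lo <= x <= hi.
   Being a function, it evaluates x exactly once at the call site. */
static inline bool in_range(long lo, long x, long hi)
{
    return lo <= x && x <= hi;
}

/* Is c the code for a valid Unicode character, excluding surrogates? */
static bool valid_char(long c)
{
    return in_range(0, c, 0xD7FF) || in_range(0xE000, c, 0x10FFFF);
}
```

Written this way, c is still named twice in valid_char, but each interval reads as a single unit.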

-- 
Mikko

#385543

From	Janis Papanagnou <janis_papanagnou+ng@hotmail.com>
Date	2024-06-04 15:42 +0200
Message-ID	<v3n5jo$esag$1@dont-email.me>
In reply to	#385527

On 04.06.2024 11:13, Mikko wrote:
> On 2024-06-04 08:58:53 +0000, David Brown said:
>> On 04/06/2024 09:14, Lawrence D'Oliveiro wrote:
>>> Would it break backward compatibility for C to add a feature like this
>>> from Python? Namely, the ability to check if a value lies in an
>>> interval:
>>>
>>> def valid_char(c) :
>>> "is integer c the code for a valid Unicode character." \
>>> " This excludes surrogates."
>>> return \
>>> (
>>> 0 <= c <= 0x10FFFF

While nice to have it's just syntactic sugar.

>>> and
>>> not (0xD800 <= c < 0xE000)
>>> )
>>> #end valid_char
>>
>> Do you mean, could C treat "a <= x <= b" as "(a <= x) && (x <= b)"
>> without breaking existing code?  The answer is no, C treats it as the
>> expression "(a <= x) <= b".  So you would be changing the meaning of
>> existing C code.  I think it's fair to say there is likely to be very
>> little existing correct and working C code that relies on the current
>> interpretation of such expressions, but the possibility is enough to
>> rule out such a change ever happening in C.  (And it would also
>> complicate the grammar a fair bit.)
>>
>>
>> <https://c-faq.com/expr/transitivity.html>
> 
> That does not prevent doing the same with a different syntax.
> The main difference is that in the current C syntax that cannot be
> said without mentioning c twice. In the example program C would
> require that c is mentioned four times but the shown Python code
> only needs it mentioned twice. An ideal syntax would only mention
> it once, perhaps
> 
>  return c in 0 .. 0xD7FF, 0xE000 .. 0x10FFFF ;

Introducing a new keyword 'in' would also break a lot of code, even
more code than the syntactic change ( . <= . <= . ) mentioned above
in the OP, don't you think?

> 
> or
> 
>  return c : [0 .. 0xD800), [0xE000 .. 0x10FFFF] ;
> 
> or something like that, preferably so that no new reserved word is
> needed.

Not worth the hassle, IMO.

Janis

#385559

From	Keith Thompson <Keith.S.Thompson+u@gmail.com>
Date	2024-06-04 14:04 -0700
Message-ID	<878qzk1kts.fsf@nosuchdomain.example.com>
In reply to	#385527

Mikko <mikko.levanto@iki.fi> writes:
> On 2024-06-04 08:58:53 +0000, David Brown said:
>> On 04/06/2024 09:14, Lawrence D'Oliveiro wrote:
>>> Would it break backward compatibility for C to add a feature like this
>>> from Python? Namely, the ability to check if a value lies in an interval:
>>> def valid_char(c) :
>>> "is integer c the code for a valid Unicode character." \
>>> " This excludes surrogates."
>>> return \
>>> (
>>> 0 <= c <= 0x10FFFF
>>> and
>>> not (0xD800 <= c < 0xE000)
>>> )
>>> #end valid_char
>> Do you mean, could C treat "a <= x <= b" as "(a <= x) && (x <= b)" 
>> without breaking existing code?  The answer is no, C treats it as
>> the expression "(a <= x) <= b".  So you would be changing the
>> meaning of existing C code.  I think it's fair to say there is
>> likely to be very little existing correct and working C code that
>> relies on the current interpretation of such expressions, but the
>> possibility is enough to rule out such a change ever happening in C.
>> (And it would also complicate the grammar a fair bit.)
>> 
>> <https://c-faq.com/expr/transitivity.html>
>
> That does not prevent doing the same with a different syntax.
> The main difference is that in the current C syntax that cannot be
> said without mentioning c twice. In the example program C would
> require that c is mentioned four times but the shown Python code
> only needs it mentioned twice. An ideal syntax would only mention
> it once, perhaps
>
>  return c in 0 .. 0xD7FF, 0xE000 .. 0x10FFFF ;
>
> or
>
>  return c : [0 .. 0xD800), [0xE000 .. 0x10FFFF] ;
>
> or something like that, preferably so that no new reserved word is
> needed.

Relatedly, gcc has case ranges as an extension, and there's a proposal
to add them to C2Y (Y=6?):
<https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3269.htm>

The gcc feature uses the existing "..." token rather than "..".  I'm not
sure whether using ".." would have caused problems beyond the need to
introduce a new token.

One minor issue, whether the feature uses ".." or "...", is that "1...2"
is a valid preprocessing number (and not a valid literal) so
`c in 1...2` would result in a syntax error.  You just need to add
spaces: `c in 1 ... 2` (which I'd argue is a good idea anyway).
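
As a sketch of how the existing gcc extension handles the example from this thread (the function name valid_char_case is made up here, and this relies on the GNU case-range extension, which GCC and Clang accept but which is not standard C as of C23):

```c
#include <stdbool.h>

/* Relies on the GNU "case low ... high" extension; not standard C. */
static bool valid_char_case(long c)
{
    switch (c) {
    /* The spaces around "..." matter: "0...0xD7FF" would be lexed
       as a single (invalid) preprocessing number. */
    case 0 ... 0xD7FF:
    case 0xE000 ... 0x10FFFF:
        return true;
    default:
        return false;
    }
}
```

Both bounds of a case range are inclusive, which is why the first range ends at 0xD7FF rather than 0xD800.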

-- 
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */

#385528

From	bart <bc@freeuk.com>
Date	2024-06-04 11:39 +0100
Message-ID	<v3mqtc$ct28$1@dont-email.me>
In reply to	#385515

On 04/06/2024 08:14, Lawrence D'Oliveiro wrote:
> Would it break backward compatibility for C to add a feature like this
> from Python? Namely, the ability to check if a value lies in an interval:
> 
>      def valid_char(c) :
>          "is integer c the code for a valid Unicode character." \
>          " This excludes surrogates."
>          return \
>              (
>                  0 <= c <= 0x10FFFF
>              and
>                  not (0xD800 <= c < 0xE000)
>              )
>      #end valid_char

Yes it would break compatibility. The first '0 <= c' yields a 0 or 1 value.

But Python can also do it as `c in range(0, 0x10FFFF+1)`.

That could conceivably be added; the main obstacle would be introducing 
that new `in` keyword, and a better solution than `range` would probably be chosen.

The chances of it actually happening are infinitesimal, and I'd be long 
dead before it became widely available.

This is the upside of devising your own language; I daily use these forms:

      a <= b <= c
      b in a .. c

in my systems language. The only stipulation with the first form is that 
if there are any angle brackets, then they all point the same way, 
otherwise the result is too confusing.

The language also needs to ensure that middle terms are evaluated only once.

If I ever want to have the C meaning of 'a <= b <= c' (say I'm porting 
some code), then it can be written like this to break it up:

     (a <= b) <= c
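
To make the difference concrete in C itself, here is a small sketch contrasting the two meanings. The function names are made up for illustration, and the bounds are passed as parameters only to keep the comparison from being folded away at compile time.

```c
#include <stdbool.h>

/* What C computes today for "a <= x <= b": the first comparison
   yields 0 or 1, and that 0/1 result is then compared with b. */
static bool c_meaning(long a, long x, long b)
{
    return (a <= x) <= b;
}

/* What the chained (Python-style) form would mean instead. */
static bool chained_meaning(long a, long x, long b)
{
    return a <= x && x <= b;
}
```

With a = 0 and b = 0x10FFFF, c_meaning is true for every x, including 0x110000 and -1, because 0 and 1 are both <= 0x10FFFF, whereas chained_meaning rejects those values.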

#385533

From	Thiago Adams <thiago.adams@gmail.com>
Date	2024-06-04 08:32 -0300
Message-ID	<v3mu14$dhe9$1@dont-email.me>
In reply to	#385515

On 04/06/2024 04:14, Lawrence D'Oliveiro wrote:
> Would it break backward compatibility for C to add a feature like this
> from Python? Namely, the ability to check if a value lies in an interval:
> 
>      def valid_char(c) :
>          "is integer c the code for a valid Unicode character." \
>          " This excludes surrogates."
>          return \
>              (
>                  0 <= c <= 0x10FFFF
>              and
>                  not (0xD800 <= c < 0xE000)
>              )
>      #end valid_char

See Chaining Comparisons
https://www.open-std.org/JTC1/SC22/WG21/docs/papers/2018/p0893r0.html


https://medium.com/@barryrevzin/chaining-comparisons-seeking-information-from-the-audience-abec909a1366


I don't know what the current status of this proposal is.

#385535

From	Bonita Montero <Bonita.Montero@gmail.com>
Date	2024-06-04 13:37 +0200
Message-ID	<v3muaj$dl2t$1@raubtier-asyl.eternal-september.org>
In reply to	#385533

Am 04.06.2024 um 13:32 schrieb Thiago Adams:

> https://medium.com/@barryrevzin/chaining-comparisons-seeking-information-from-the-audience-abec909a1366
> I don't know what the current status of this proposal is.

This is for C++, and usually you'd want to write the chained
comparison explicitly and not as a fold-expression.

#385562

From	Keith Thompson <Keith.S.Thompson+u@gmail.com>
Date	2024-06-04 15:29 -0700
Message-ID	<871q5c1gwe.fsf@nosuchdomain.example.com>
In reply to	#385533

Thiago Adams <thiago.adams@gmail.com> writes:
> On 04/06/2024 04:14, Lawrence D'Oliveiro wrote:
>> Would it break backward compatibility for C to add a feature like this
>> from Python? Namely, the ability to check if a value lies in an interval:
>>      def valid_char(c) :
>>          "is integer c the code for a valid Unicode character." \
>>          " This excludes surrogates."
>>          return \
>>              (
>>                  0 <= c <= 0x10FFFF
>>              and
>>                  not (0xD800 <= c < 0xE000)
>>              )
>>      #end valid_char
>
> See Chaining Comparisons
> https://www.open-std.org/JTC1/SC22/WG21/docs/papers/2018/p0893r0.html
>
> https://medium.com/@barryrevzin/chaining-comparisons-seeking-information-from-the-audience-abec909a1366
>
> I don't know what the current status of this proposal is.

That's a proposal for C++.

One interesting piece of information is that the authors did some research
on existing code:

"""
To that end, we created a clang-tidy check for all uses of chained
comparison operators, ran it on many open source code bases, and
solicited help from the C++ community to run it on their own. The check
itself casts an intentionally wide net, matching any instance of a @ b @
c for any of the six comparison operators, regardless of the types of
these underlying expressions.

Overall, what we found was:

- Zero instances of chained arithmetic comparisons that are correct
  today. That is, intentionally using the current standard behavior.
- Four instances of currently-erroneous arithmetic chaining, of the
  assert(0 <= ratio <= 1.0); variety. These are bugs that compile today
  but don’t do what the programmer intended, but with this proposal would
  change in meaning to become correct.
- Many instances of using successive comparison operators in DSLs that
  overloaded these operators to give meaning unrelated to comparisons.
"""

I presume they searched only C++ code, but I'd expect similar results
for C.

As indicated above, such a change would quietly break any existing
code that uses something like `a < b < c` that's intended to mean
`(a < b) < c`, but it would quietly *fix* any code that uses `a < b < c`
under the incorrect assumption that the comparisons are chained.
(Though the latter code will not have been tested under the new semantics.)

-- 
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
