Groups > comp.lang.c > #385515 > unrolled thread
| Started by | Lawrence D'Oliveiro <ldo@nz.invalid> |
|---|---|
| First post | 2024-06-04 07:14 +0000 |
| Last post | 2024-06-07 10:42 +0000 |
| Articles | 20 on this page of 46 — 11 participants |
Interval Comparisons Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-04 07:14 +0000
Re: Interval Comparisons David Brown <david.brown@hesbynett.no> - 2024-06-04 10:58 +0200
Re: Interval Comparisons Mikko <mikko.levanto@iki.fi> - 2024-06-04 12:13 +0300
Re: Interval Comparisons David Brown <david.brown@hesbynett.no> - 2024-06-04 13:02 +0200
Re: Interval Comparisons bart <bc@freeuk.com> - 2024-06-04 12:23 +0100
Re: Interval Comparisons David Brown <david.brown@hesbynett.no> - 2024-06-04 15:24 +0200
Re: Interval Comparisons bart <bc@freeuk.com> - 2024-06-04 15:16 +0100
Re: Interval Comparisons David Brown <david.brown@hesbynett.no> - 2024-06-04 17:40 +0200
Re: Interval Comparisons scott@slp53.sl.home (Scott Lurndal) - 2024-06-04 15:27 +0000
Re: Interval Comparisons bart <bc@freeuk.com> - 2024-06-04 16:58 +0100
Re: Interval Comparisons Michael S <already5chosen@yahoo.com> - 2024-06-04 19:25 +0300
Re: Interval Comparisons bart <bc@freeuk.com> - 2024-06-04 17:54 +0100
Re: Interval Comparisons Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-05 03:29 +0000
Re: Interval Comparisons Mikko <mikko.levanto@iki.fi> - 2024-06-04 16:11 +0300
Re: Interval Comparisons Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2024-06-04 15:42 +0200
Re: Interval Comparisons Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-04 14:04 -0700
Re: Interval Comparisons bart <bc@freeuk.com> - 2024-06-04 11:39 +0100
Re: Interval Comparisons Thiago Adams <thiago.adams@gmail.com> - 2024-06-04 08:32 -0300
Re: Interval Comparisons Bonita Montero <Bonita.Montero@gmail.com> - 2024-06-04 13:37 +0200
Re: Interval Comparisons Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-04 15:29 -0700
Re: Interval Comparisons Blue-Maned_Hawk <bluemanedhawk@invalid.invalid> - 2024-06-04 11:41 +0000
Re: Interval Comparisons Michael S <already5chosen@yahoo.com> - 2024-06-04 15:17 +0300
Re: Interval Comparisons Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-04 23:12 +0000
Re: Interval Comparisons bart <bc@freeuk.com> - 2024-06-05 00:22 +0100
Re: Interval Comparisons Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-05 01:30 +0000
Re: Interval Comparisons bart <bc@freeuk.com> - 2024-06-06 19:48 +0100
Re: Interval Comparisons Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-06 22:54 +0000
Re: Interval Comparisons bart <bc@freeuk.com> - 2024-06-07 01:52 +0100
Re: Interval Comparisons Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-07 02:17 +0000
Re: Interval Comparisons Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-06 20:53 -0700
Re: Interval Comparisons Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-07 04:25 +0000
Re: Interval Comparisons David Brown <david.brown@hesbynett.no> - 2024-06-07 11:22 +0200
Re: Interval Comparisons Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-07 02:55 -0700
Re: Interval Comparisons David Brown <david.brown@hesbynett.no> - 2024-06-07 13:04 +0200
Re: Interval Comparisons Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-07 11:57 -0700
Re: Interval Comparisons David Brown <david.brown@hesbynett.no> - 2024-06-08 17:42 +0200
Re: Interval Comparisons bart <bc@freeuk.com> - 2024-06-07 11:28 +0100
Re: Interval Comparisons Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-07 10:45 +0000
Re: Interval Comparisons Michael S <already5chosen@yahoo.com> - 2024-06-07 14:51 +0300
Re: Interval Comparisons David Brown <david.brown@hesbynett.no> - 2024-06-07 13:17 +0200
Re: Interval Comparisons bart <bc@freeuk.com> - 2024-06-07 13:20 +0100
Re: Interval Comparisons David Brown <david.brown@hesbynett.no> - 2024-06-09 13:26 +0200
Re: Interval Comparisons bart <bc@freeuk.com> - 2024-06-10 16:33 +0100
Re: Interval Comparisons David Brown <david.brown@hesbynett.no> - 2024-06-10 17:56 +0200
Re: Interval Comparisons scott@slp53.sl.home (Scott Lurndal) - 2024-06-07 14:00 +0000
Re: Interval Comparisons Lawrence D'Oliveiro <ldo@nz.invalid> - 2024-06-07 10:42 +0000
| From | Lawrence D'Oliveiro <ldo@nz.invalid> |
|---|---|
| Date | 2024-06-04 07:14 +0000 |
| Subject | Interval Comparisons |
| Message-ID | <v3merq$b1uj$1@dont-email.me> |
Would it break backward compatibility for C to add a feature like this
from Python? Namely, the ability to check if a value lies in an interval:

    def valid_char(c) :
        "is integer c the code for a valid Unicode character." \
        " This excludes surrogates."
        return \
            (
                0 <= c <= 0x10FFFF
            and
                not (0xD800 <= c < 0xE000)
            )
    #end valid_char
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2024-06-04 10:58 +0200 |
| Message-ID | <v3ml0d$bpds$5@dont-email.me> |
| In reply to | #385515 |
On 04/06/2024 09:14, Lawrence D'Oliveiro wrote:
> Would it break backward compatibility for C to add a feature like this
> from Python? Namely, the ability to check if a value lies in an interval:
>
>     def valid_char(c) :
>         "is integer c the code for a valid Unicode character." \
>         " This excludes surrogates."
>         return \
>             (
>                 0 <= c <= 0x10FFFF
>             and
>                 not (0xD800 <= c < 0xE000)
>             )
>     #end valid_char

Do you mean, could C treat "a <= x <= b" as "(a <= x) && (x <= b)" without breaking existing code? The answer is no, C treats it as the expression "(a <= x) <= b". So you would be changing the meaning of existing C code.

I think it's fair to say there is likely to be very little existing correct and working C code that relies on the current interpretation of such expressions, but the possibility is enough to rule out such a change ever happening in C. (And it would also complicate the grammar a fair bit.)

<https://c-faq.com/expr/transitivity.html>
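
As a minimal sketch of the point above: a relational expression in C yields an int that is 0 or 1, so the chained form ends up comparing that 0 or 1 with the upper bound. The function names `chained` and `conjoined` are made up for this illustration; the first spells out the parse C already applies to "a <= x <= b", the second is the reading the original post asks for.

```c
#include <stdbool.h>

/* Hypothetical helper: the meaning C gives to "a <= x <= b" today.
   (a <= x) evaluates to 0 or 1, and that int is then compared with b.
   The parentheses are written out only to make the parse visible. */
bool chained(int a, int x, int b)
{
    return (a <= x) <= b;
}

/* Hypothetical helper: the Python-style reading of the same text. */
bool conjoined(int a, int x, int b)
{
    return (a <= x) && (x <= b);
}
```

With a = 0, x = 50, b = 10 the two disagree: `chained` is true because 1 <= 10, while `conjoined` is false because 50 is not in [0, 10]. Any existing program that depends on the first result would change behaviour under the second.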
| From | Mikko <mikko.levanto@iki.fi> |
|---|---|
| Date | 2024-06-04 12:13 +0300 |
| Message-ID | <v3mlrb$c7d5$1@dont-email.me> |
| In reply to | #385526 |
On 2024-06-04 08:58:53 +0000, David Brown said:

> On 04/06/2024 09:14, Lawrence D'Oliveiro wrote:
>> Would it break backward compatibility for C to add a feature like this
>> from Python? Namely, the ability to check if a value lies in an interval:
>>
>>     def valid_char(c) :
>>         "is integer c the code for a valid Unicode character." \
>>         " This excludes surrogates."
>>         return \
>>             (
>>                 0 <= c <= 0x10FFFF
>>             and
>>                 not (0xD800 <= c < 0xE000)
>>             )
>>     #end valid_char
>
> Do you mean, could C treat "a <= x <= b" as "(a <= x) && (x <= b)"
> without breaking existing code? The answer is no, C treats it as the
> expression "(a <= x) <= b". So you would be changing the meaning of
> existing C code. I think it's fair to say there is likely to be very
> little existing correct and working C code that relies on the current
> interpretation of such expressions, but the possibility is enough to
> rule out such a change ever happening in C. (And it would also
> complicate the grammar a fair bit.)
>
>
> <https://c-faq.com/expr/transitivity.html>

That does not prevent doing the same with a different syntax. The main difference is that in the current C syntax that cannot be said without mentioning c twice. In the example program C would require that c is mentioned four times but the shown Python code only needs it mentioned twice. An ideal syntax would only mention it once, perhaps

    return c in 0 .. 0xD7FF, 0xE000 .. 0x10FFFF ;

or

    return c : [0 .. 0xD800), [0xE000 .. 0x10FFFF] ;

or something like that, preferably so that no new reserved word is needed.

--
Mikko
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2024-06-04 13:02 +0200 |
| Message-ID | <v3ms7b$d5sq$1@dont-email.me> |
| In reply to | #385527 |
On 04/06/2024 11:13, Mikko wrote:
> On 2024-06-04 08:58:53 +0000, David Brown said:
>
>> On 04/06/2024 09:14, Lawrence D'Oliveiro wrote:
>>> Would it break backward compatibility for C to add a feature like this
>>> from Python? Namely, the ability to check if a value lies in an
>>> interval:
>>>
>>>     def valid_char(c) :
>>>         "is integer c the code for a valid Unicode character." \
>>>         " This excludes surrogates."
>>>         return \
>>>             (
>>>                 0 <= c <= 0x10FFFF
>>>             and
>>>                 not (0xD800 <= c < 0xE000)
>>>             )
>>>     #end valid_char
>>
>> Do you mean, could C treat "a <= x <= b" as "(a <= x) && (x <= b)"
>> without breaking existing code? The answer is no, C treats it as the
>> expression "(a <= x) <= b". So you would be changing the meaning of
>> existing C code. I think it's fair to say there is likely to be very
>> little existing correct and working C code that relies on the current
>> interpretation of such expressions, but the possibility is enough to
>> rule out such a change ever happening in C. (And it would also
>> complicate the grammar a fair bit.)
>>
>>
>> <https://c-faq.com/expr/transitivity.html>
>
> That does not prevent doing the same with a different syntax.
> The main difference is that in the current C syntax that cannot be
> said without mentioning c twice. In the example program C would
> require that c is mentioned four times but the shown Python code
> only needs it mentioned twice. An ideal syntax would only mention
> it once, perhaps
>
>     return c in 0 .. 0xD7FF, 0xE000 .. 0x10FFFF ;
>
> or
>
>     return c : [0 .. 0xD800), [0xE000 .. 0x10FFFF] ;
>
> or something like that, preferably so that no new reserved word is
> needed.
>

Sure, you can always add new things to a language if they would previously have been syntax errors or constraint errors. But is there a use for it?

It is fine if you have a language that has good support for lists, sets, ranges, and other higher-level features - then an "in" keyword is a great idea. But C is not such a language, and that kind of feature would be well outside the scope of the language.

It would be easy enough to write a macro "in_range(a, x, b)" that would do the job. It is even easier, and more productive, to simply write the "valid_char" function and use it, if that's what you need.
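
A minimal sketch of that suggestion in today's C, assuming code points are passed as `long` (which is at least 32 bits wide). The name `in_range` and its (a, x, b) argument order follow the post; it is written here as an inline function rather than a macro, which is a choice made for this sketch and not something the post prescribes.

```c
#include <stdbool.h>

/* Hypothetical helper: closed-interval test lo <= x <= hi. */
static inline bool in_range(long lo, long x, long hi)
{
    return lo <= x && x <= hi;
}

/* C counterpart of the Python valid_char from the original post:
   a valid Unicode code point that is not a surrogate.
   The half-open Python test 0xD800 <= c < 0xE000 becomes the
   closed interval [0xD800, 0xDFFF]. */
bool valid_char(long c)
{
    return in_range(0, c, 0x10FFFF) && !in_range(0xD800, c, 0xDFFF);
}
```

Callers then write `valid_char(c)` and mention c only once at the point of use.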
| From | bart <bc@freeuk.com> |
|---|---|
| Date | 2024-06-04 12:23 +0100 |
| Message-ID | <v3mtf2$ct28$2@dont-email.me> |
| In reply to | #385529 |
On 04/06/2024 12:02, David Brown wrote:
> On 04/06/2024 11:13, Mikko wrote:
>> On 2024-06-04 08:58:53 +0000, David Brown said:
>>
>>> On 04/06/2024 09:14, Lawrence D'Oliveiro wrote:
>>>> Would it break backward compatibility for C to add a feature like this
>>>> from Python? Namely, the ability to check if a value lies in an
>>>> interval:
>>>>
>>>>     def valid_char(c) :
>>>>         "is integer c the code for a valid Unicode character." \
>>>>         " This excludes surrogates."
>>>>         return \
>>>>             (
>>>>                 0 <= c <= 0x10FFFF
>>>>             and
>>>>                 not (0xD800 <= c < 0xE000)
>>>>             )
>>>>     #end valid_char
>>>
>>> Do you mean, could C treat "a <= x <= b" as "(a <= x) && (x <= b)"
>>> without breaking existing code? The answer is no, C treats it as the
>>> expression "(a <= x) <= b". So you would be changing the meaning of
>>> existing C code. I think it's fair to say there is likely to be very
>>> little existing correct and working C code that relies on the current
>>> interpretation of such expressions, but the possibility is enough to
>>> rule out such a change ever happening in C. (And it would also
>>> complicate the grammar a fair bit.)
>>>
>>>
>>> <https://c-faq.com/expr/transitivity.html>
>>
>> That does not prevent doing the same with a different syntax.
>> The main difference is that in the current C syntax that cannot be
>> said without mentioning c twice. In the example program C would
>> require that c is mentioned four times but the shown Python code
>> only needs it mentioned twice. An ideal syntax would only mention
>> it once, perhaps
>>
>>     return c in 0 .. 0xD7FF, 0xE000 .. 0x10FFFF ;
>>
>> or
>>
>>     return c : [0 .. 0xD800), [0xE000 .. 0x10FFFF] ;
>>
>> or something like that, preferably so that no new reserved word is
>> needed.
>>
>
> Sure, you can always add new things to a language if they would
> previously have been syntax errors or constraint errors. But is there a
> use for it?
>
> It is fine if you have a language that has good support for lists, sets,
> ranges, and other higher-level features - then an "in" keyword is a
> great idea. But C is not such a language, and that kind of feature
> would be well outside the scope of the language.

I disagree. I have a script language where 'in' works with all sorts of
data types, and where ranges like a..b and sets like [a..b, c, d, e] are
actual types.

Yet I also introduced 'in' into my systems language, even though it is
very restricted:

    if a in b..c then
    if a in [b, c, d] then

This is limited to integer types. The set construct here doesn't allow
ranges (it could have done). Neither the range nor the set is a datatype -
it is just syntax. (I can't do range r := 1..10.)

It is incredibly useful:

    if c in [' ', '\t', '\n'] then ...     # whitespace
    if b in 0..255 then
    if b in u8.bounds then                 # alternative

Not to forget:

    if x = y = 0 then                      # both x and y are zero

It doesn't need the full spec of the higher level language.

> It would be easy enough to write a macro "in_range(a, x, b)" that would
> do the job. It is even easier, and more productive, that you simply
> write the "valid_char" function and use it, if that's what you need.
Yes it would be easier - to provide an ugly, half-assed solution that
everyone will write a different way (I would use (x, a, b) for example),
and which can go wrong as soon as someone writes (a, x(), b).
That's the problem with the macro scheme, it stops the language properly
evolving.
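
The (a, x(), b) hazard mentioned above can be sketched as follows. All names here (`IN_RANGE`, `in_range`, `next_value`, `calls`) are hypothetical and exist only for this illustration. The macro pastes its middle argument into the expansion twice, so an argument with side effects may run twice; a function evaluates it once at the call.

```c
#include <stdbool.h>

/* Hypothetical macro version: x appears twice in the expansion. */
#define IN_RANGE(a, x, b) ((a) <= (x) && (x) <= (b))

/* Hypothetical function version: each argument is evaluated once. */
static inline bool in_range(int a, int x, int b)
{
    return a <= x && x <= b;
}

int calls = 0;            /* counts how often next_value() has run */

/* Stand-in for an x() with a side effect: returns 1, 2, 3, ... */
int next_value(void)
{
    return ++calls;
}
```

Starting from `calls == 0`, `IN_RANGE(1, next_value(), 10)` calls `next_value()` twice (comparing 1 against the lower bound and 2 against the upper bound), whereas `in_range(1, next_value(), 10)` calls it once.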
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2024-06-04 15:24 +0200 |
| Message-ID | <v3n4is$emdc$1@dont-email.me> |
| In reply to | #385531 |
On 04/06/2024 13:23, bart wrote:
> On 04/06/2024 12:02, David Brown wrote:
>> On 04/06/2024 11:13, Mikko wrote:
>>> On 2024-06-04 08:58:53 +0000, David Brown said:
>>>
>>>> On 04/06/2024 09:14, Lawrence D'Oliveiro wrote:
>>>>> Would it break backward compatibility for C to add a feature like this
>>>>> from Python? Namely, the ability to check if a value lies in an
>>>>> interval:
>>>>>
>>>>>     def valid_char(c) :
>>>>>         "is integer c the code for a valid Unicode character." \
>>>>>         " This excludes surrogates."
>>>>>         return \
>>>>>             (
>>>>>                 0 <= c <= 0x10FFFF
>>>>>             and
>>>>>                 not (0xD800 <= c < 0xE000)
>>>>>             )
>>>>>     #end valid_char
>>>>
>>>> Do you mean, could C treat "a <= x <= b" as "(a <= x) && (x <= b)"
>>>> without breaking existing code? The answer is no, C treats it as
>>>> the expression "(a <= x) <= b". So you would be changing the
>>>> meaning of existing C code. I think it's fair to say there is
>>>> likely to be very little existing correct and working C code that
>>>> relies on the current interpretation of such expressions, but the
>>>> possibility is enough to rule out such a change ever happening in
>>>> C. (And it would also complicate the grammar a fair bit.)
>>>>
>>>>
>>>> <https://c-faq.com/expr/transitivity.html>
>>>
>>> That does not prevent doing the same with a different syntax.
>>> The main difference is that in the current C syntax that cannot be
>>> said without mentioning c twice. In the example program C would
>>> require that c is mentioned four times but the shown Python code
>>> only needs it mentioned twice. An ideal syntax would only mention
>>> it once, perhaps
>>>
>>>     return c in 0 .. 0xD7FF, 0xE000 .. 0x10FFFF ;
>>>
>>> or
>>>
>>>     return c : [0 .. 0xD800), [0xE000 .. 0x10FFFF] ;
>>>
>>> or something like that, preferably so that no new reserved word is
>>> needed.
>>>
>>
>> Sure, you can always add new things to a language if they would
>> previously have been syntax errors or constraint errors. But is there
>> a use for it?
>>
>> It is fine if you have a language that has good support for lists,
>> sets, ranges, and other higher-level features - then an "in" keyword
>> is a great idea. But C is not such a language, and that kind of
>> feature would be well outside the scope of the language.
>
> I disagree. I have a script language where 'in' works with all sorts of
> data types, and where ranges like a..b and sets like [a..b, c, d, e] are
> actual types.

C is not a script language.

>
> Yet I also introduced 'in' into my systems language, even though it is
> very restricted:
>
>     if a in b..c then
>     if a in [b, c, d] then
>
> This is limited to integer types. The set construct here doesn't allow
> ranges (it could have done). Neither the range nor the set is a datatype -
> it is just syntax. (I can't do range r := 1..10.)

Adding such a feature to your own personal language, for your own personal use, is easy enough (relative to the rest of the work involved in designing your own personal language and making tools for it, which is of course no small feat). Adding it to C with its standards, existing code, toolchains, additional tools, developers, etc., is a whole different kettle of fish.

I don't think it would be practical to add it to C in a way that is simple and restricted enough to be suitable for C, while also being useful enough to make it worth the effort.

Remember, when you add these things to your own language, you have your own needs in mind and can ignore everything else, all corner cases, and all complications. Putting a feature in C means making decisions like figuring out what type the expression "b..c" has, whether the various bits and pieces have to be constants or if they can be variables, how the operator precedences work, how to treat floating point numbers or mixes of different types, and countless other factors.

If a language already has the concepts, rules and grammar for ranges or lists, adding an "in" operator is natural - if not, then it's a huge amount of extra junk pulled into the language and syntax for a very minor gain.

I don't disagree that it could be useful, and I'm sure I'd use it if it existed in C, I just disagree that it makes sense in C.

>
> It is incredibly useful:
>
>     if c in [' ', '\t', '\n'] then ...     # whitespace
>     if b in 0..255 then
>     if b in u8.bounds then                 # alternative
>
> Not to forget:
>
>     if x = y = 0 then                      # both x and y are zero
>
> It doesn't need the full spec of the higher level language.
>
>> It would be easy enough to write a macro "in_range(a, x, b)" that
>> would do the job. It is even easier, and more productive, that you
>> simply write the "valid_char" function and use it, if that's what you
>> need.
>
> Yes it would be easier - to provide an ugly, half-assed solution that

You and I are British - the term is "half-arsed" :-)

> everyone will write a different way (I would use (x, a, b) for example),
> and which can go wrong as soon as someone writes (a, x(), b).
>
> That's the problem with the macro scheme, it stops the language properly
> evolving.
>

If it were considered useful enough, it could be standardised in the C library. If it is not useful enough to standardise in the library, it is certainly not useful enough to put in the language itself.

In practice, while I would put something like this in a new language, I don't think it is important enough to try to add to C. When you need to do a lot of checks, you'd put them within a function (or macro if you prefer), such as "isspace()".
| From | bart <bc@freeuk.com> |
|---|---|
| Date | 2024-06-04 15:16 +0100 |
| Message-ID | <v3n7ko$evip$1@dont-email.me> |
| In reply to | #385539 |
On 04/06/2024 14:24, David Brown wrote:
> On 04/06/2024 13:23, bart wrote:
>> On 04/06/2024 12:02, David Brown wrote:
>>> It is fine if you have a language that has good support for lists,
>>> sets, ranges, and other higher-level features - then an "in" keyword
>>> is a great idea. But C is not such a language, and that kind of
>>> feature would be well outside the scope of the language.
>>
>> I disagree. I have a script language where 'in' works with all sorts
>> of data types, and where ranges like a..b and sets like [a..b, c, d,
>> e] are actual types.
>
> C is not a script language.
>
>>
>> Yet I also introduced 'in' into my systems language, even though it is
>> very restricted:
>>
>> if a in b..c then
>> if a in [b, c, d] then
>>
>> This is limited to integer types. The set construct here doesn't allow
>> ranges (it could have done). Neither the range nor the set is a datatype -
>> it is just syntax. (I can't do range r := 1..10.)
>
> Adding such a feature to your own personal language, for your own
> personal use, is easy enough (relative to the rest of the work involved
> in designing your own personal language and making tools for it, which
> is of course no small feat). Adding it to C with its standards,
> existing code, toolchains, additional tools, developers, etc., is a
> whole different kettle of fish.

I was responding to your comment:

  "and that kind of feature would be well outside the scope of the language."

I think it can suit that level of language if you avoid being too ambitious.

I agree it is not practical to apply to C at this point, not without
making it ugly or unwieldy enough that people might as well use existing
solutions.

(Such a feature also aids simpler non-optimising compilers. Take these
examples that all do the same thing:

    if a <= f() and f() <= c then fi

    if a <= f() <= c then fi

    if f() in a..c then fi

If the two f() calls in the first example were considered common
subexpressions, I don't have the means in my compiler to detect that
and evaluate them just once.

In the other two examples, the language lets you express that directly.

Even for a simpler 'b in a..c' example, it is easier to generate more
efficient code, and do that more efficiently too than building something
up only to tear it down again.)

>>> It would be easy enough to write a macro "in_range(a, x, b)" that
>>> would do the job. It is even easier, and more productive, that you
>>> simply write the "valid_char" function and use it, if that's what you
>>> need.
>>
>> Yes it would be easier - to provide an ugly, half-assed solution that
>
> You and I are British - the term is "half-arsed" :-)
I'm catering for a wider readership.
(Actually I'm not quite considered British enough to be allowed in the
upcoming election.)
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2024-06-04 17:40 +0200 |
| Message-ID | <v3nch6$frn1$1@dont-email.me> |
| In reply to | #385544 |
On 04/06/2024 16:16, bart wrote:
> On 04/06/2024 14:24, David Brown wrote:
>> On 04/06/2024 13:23, bart wrote:
>>> On 04/06/2024 12:02, David Brown wrote:
>
>>>> It is fine if you have a language that has good support for lists,
>>>> sets, ranges, and other higher-level features - then an "in" keyword
>>>> is a great idea. But C is not such a language, and that kind of
>>>> feature would be well outside the scope of the language.
>>>
>>> I disagree. I have a script language where 'in' works with all sorts
>>> of data types, and where ranges like a..b and sets like [a..b, c, d,
>>> e] are actual types.
>>
>> C is not a script language.
>>
>>>
>>> Yet I also introduced 'in' into my systems language, even though it
>>> is very restricted:
>>>
>>>     if a in b..c then
>>>     if a in [b, c, d] then
>>>
>>> This is limited to integer types. The set construct here doesn't
>>> allow ranges (it could have done). Neither the range nor the set is a
>>> datatype - it is just syntax. (I can't do range r := 1..10.)
>>
>> Adding such a feature to your own personal language, for your own
>> personal use, is easy enough (relative to the rest of the work
>> involved in designing your own personal language and making tools for
>> it, which is of course no small feat). Adding it to C with its
>> standards, existing code, toolchains, additional tools, developers,
>> etc., is a whole different kettle of fish.
>
> I was responding to your comment:
>
> "and that kind of feature would be well outside the scope of the language."
>
> I think it can suit that level of language if you avoid being too
> ambitious.
>

It might be that we would agree on that if we worked hard enough to find a common definition for "that level of language". But I think that would be a lot of time and effort for little purpose.

I do agree that with enough limitation in the scope of the feature, it is less unreasonable for a low-level language. But I think I would want to limit the scope until there is little point in the "in" operator - or I would want to go the other direction and define something like Pascal's sets with many more operators and uses.

> I agree it is not practical to apply to C at this point, not without
> making it ugly or unwieldy enough that people might as well use existing
> solutions.

Yes.

>
> (Such a feature also aids simpler non-optimising compilers. Take these
> examples that all do the same thing:
>
>     if a <= f() and f() <= c then fi
>
>     if a <= f() <= c then fi
>
>     if f() in a..c then fi
>
> If the two f() calls in the first example were considered common
> subexpressions, I don't have the means in my compiler to detect that
> and evaluate them just once.
>

I see your point, but I rate the design and use of a language as /much/ more important than the ease of implementation. I realise the balance is a bit different when the user is the implementer.

> In the other two examples, the language lets you express that directly.
>
> Even for a simpler 'b in a..c' example, it is easier to generate more
> efficient code, and do that more efficiently too than building something
> up only to tear it down again.)
>
>
>>>> It would be easy enough to write a macro "in_range(a, x, b)" that
>>>> would do the job. It is even easier, and more productive, that you
>>>> simply write the "valid_char" function and use it, if that's what
>>>> you need.
>>>
>>> Yes it would be easier - to provide an ugly, half-assed solution that
>>
>> You and I are British - the term is "half-arsed" :-)
>
> I'm catering for a wider readership.

We can educate them!

>
> (Actually I'm not quite considered British enough to be allowed in the
> upcoming election.)
>

I can't vote either, but that's because I don't live in the UK. And given the state of UK politics these days, I'm happy to be out of it. For quite a while, the Scottish Parliament were looking like the adults in the room, but they've managed to mess things up for themselves too.
| From | scott@slp53.sl.home (Scott Lurndal) |
|---|---|
| Date | 2024-06-04 15:27 +0000 |
| Message-ID | <DnG7O.8145$_US6.7552@fx44.iad> |
| In reply to | #385539 |
David Brown <david.brown@hesbynett.no> writes:
>On 04/06/2024 13:23, bart wrote:

>> It is incredibly useful:
>>
>>     if c in [' ', '\t', '\n'] then ...     # whitespace

    if (strpbrk(c, " \t\n") != NULL) it_is_whitespace.

>
>If it were considered useful enough, it could be standardised in the C
>library. If it is not useful enough to standardise in the library, it
>is certainly not useful enough to put in the language itself.

indeed.
| From | bart <bc@freeuk.com> |
|---|---|
| Date | 2024-06-04 16:58 +0100 |
| Message-ID | <v3ndji$fv12$1@dont-email.me> |
| In reply to | #385545 |
On 04/06/2024 16:27, Scott Lurndal wrote:
> David Brown <david.brown@hesbynett.no> writes:
>> On 04/06/2024 13:23, bart wrote:
>
>>> It is incredibly useful:
>>>
>>> if c in [' ', '\t', '\n'] then ... # whitespace
>
> if (strpbrk(c, " \t\n") != NULL) it_is_whitespace.

That doesn't do the same thing. In my example, c is a character, not a
string.

To achieve the same thing using strpbrk requires code like this:

    char c[2];
    c[0]=rand()&255;     // Create a string
    c[1]=0;

    if (strpbrk(c, " \t\n") != NULL) puts("whitespace");

If I compile this with gcc -O3, then the checking part is this:

    lea rcx, 46[rsp]
    mov BYTE PTR 47[rsp], 0
    lea rdx, .LC0[rip]
    mov BYTE PTR 46[rsp], al
    call strpbrk         // CALL TO LIBRARY FUNCTION
    test rax, rax
    je .L2
    lea rcx, .LC1[rip]
    call puts

I don't know what it gets up to inside strpbrk. If I write this in my
language:

    if c in [9,10,32] then
        puts("whitespace")
    fi

The generated code is this (using alternate register names, D0 = rax):

    mov D0, D3           # (could have tested D3 (= c) directly.)
    cmp D0, 9
    jz L4
    cmp D0, 10
    jz L4
    cmp D0, 32
    jnz L3
    L4:
    lea D10, [L5]
    call puts*
    L3:

Anyway, the construct is not limited to character codes that can be
contained within a string. It works for 64-bit values which can include
0. And it could be extended to other scalar types like floats and pointers.
| From | Michael S <already5chosen@yahoo.com> |
|---|---|
| Date | 2024-06-04 19:25 +0300 |
| Message-ID | <20240604192547.00003fbd@yahoo.com> |
| In reply to | #385547 |
On Tue, 4 Jun 2024 16:58:43 +0100
bart <bc@freeuk.com> wrote:
> On 04/06/2024 16:27, Scott Lurndal wrote:
> > David Brown <david.brown@hesbynett.no> writes:
> >> On 04/06/2024 13:23, bart wrote:
> >
> >>> It is incredibly useful:
> >>>
> >>> if c in [' ', '\t', '\n'] then ... # whitespace
> >
> > if (strpbrk(c, " \t\n") != NULL) it_is_whitespace.
>
> That doesn't do the same thing. In my example, c is a character, not
> a string.
>

Would that be better?

    if (memchr(" \t\n", c, 3) != NULL)
| From | bart <bc@freeuk.com> |
|---|---|
| Date | 2024-06-04 17:54 +0100 |
| Message-ID | <v3ngs9$gksq$1@dont-email.me> |
| In reply to | #385548 |
On 04/06/2024 17:25, Michael S wrote:
> On Tue, 4 Jun 2024 16:58:43 +0100
> bart <bc@freeuk.com> wrote:
>
>> On 04/06/2024 16:27, Scott Lurndal wrote:
>>> David Brown <david.brown@hesbynett.no> writes:
>>>> On 04/06/2024 13:23, bart wrote:
>>>
>>>>> It is incredibly useful:
>>>>>
>>>>> if c in [' ', '\t', '\n'] then ... # whitespace
>>>
>>> if (strpbrk(c, " \t\n") != NULL) it_is_whitespace.
>>
>> That doesn't do the same thing. In my example, c is a character, not
>> a string.
>>
>
>
> Would that be better?
>     if (memchr(" \t\n", c, 3) != NULL)
>

It's a better match. But with gcc -O3, it still calls the library function
up to version 12.x. After that, it's smart enough to generate similar
code to my non-optimising compiler using the built-in feature.

It is also still limited to byte values.

My approach is to fix the language, which is easier than expending
orders of magnitude more effort on elaborate tools and ultra-smart compilers.
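
The "limited to byte values" point can be sketched as follows. The C library specifies that memchr converts its int argument to unsigned char before searching, so values that differ only above the low byte cannot be told apart. The wrapper name `in_ws_memchr` is hypothetical, and the 0x120 example assumes 8-bit bytes and an ASCII-compatible character set.

```c
#include <stdbool.h>
#include <string.h>

/* The memchr test from the post above, wrapped in a hypothetical helper.
   The length 3 covers ' ', '\t' and '\n' but not the literal's
   terminating '\0', so c == 0 is not reported as a member. */
bool in_ws_memchr(int c)
{
    return memchr(" \t\n", c, 3) != NULL;
}
```

Under the stated assumptions, `in_ws_memchr(0x120)` is true, because 0x120 converted to unsigned char is 0x20, the space character. A caller that may see values outside 0..255 would need a range check first.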
| From | Lawrence D'Oliveiro <ldo@nz.invalid> |
|---|---|
| Date | 2024-06-05 03:29 +0000 |
| Message-ID | <v3om2e$qb6k$2@dont-email.me> |
| In reply to | #385531 |
On Tue, 4 Jun 2024 12:23:15 +0100, bart wrote:

> That's the problem with the macro scheme, it stops the language properly
> evolving.

The problem is the way C does macros. Other languages with powerful macros (*cough* Lisp *cough*) aren’t stopped from evolving; quite the opposite.
| From | Mikko <mikko.levanto@iki.fi> |
|---|---|
| Date | 2024-06-04 16:11 +0300 |
| Message-ID | <v3n3q0$ei7c$1@dont-email.me> |
| In reply to | #385529 |
On 2024-06-04 11:02:03 +0000, David Brown said:

> On 04/06/2024 11:13, Mikko wrote:
>> On 2024-06-04 08:58:53 +0000, David Brown said:
>>
>>> On 04/06/2024 09:14, Lawrence D'Oliveiro wrote:
>>>> Would it break backward compatibility for C to add a feature like this
>>>> from Python? Namely, the ability to check if a value lies in an interval:
>>>>
>>>> def valid_char(c) :
>>>>     "is integer c the code for a valid Unicode character." \
>>>>     " This excludes surrogates."
>>>>     return \
>>>>         (
>>>>                 0 <= c <= 0x10FFFF
>>>>             and
>>>>                 not (0xD800 <= c < 0xE000)
>>>>         )
>>>> #end valid_char
>>>
>>> Do you mean, could C treat "a <= x <= b" as "(a <= x) && (x <= b)"
>>> without breaking existing code? The answer is no, C treats it as the
>>> expression "(a <= x) <= b". So you would be changing the meaning of
>>> existing C code. I think it's fair to say there is likely to be very
>>> little existing correct and working C code that relies on the current
>>> interpretation of such expressions, but the possibility is enough to
>>> rule out such a change ever happening in C. (And it would also
>>> complicate the grammar a fair bit.)
>>>
>>>
>>> <https://c-faq.com/expr/transitivity.html>
>>
>> That does not prevent doing the same with a different syntax.
>> The main difference is that in the current C syntax that cannot be
>> said without mentioning c twice. In the example program C would
>> require that c is mentioned four times but the shown Python code
>> only needs it mentioned twice. An ideal syntax would only mention
>> it once, perhaps
>>
>> return c in 0 .. 0xD7FF, 0xE000 .. 0x10FFFF ;
>>
>> or
>>
>> return c : [0 .. 0xD800), [0xE000 .. 0x10FFFF] ;
>>
>> or something like that, preferably so that no new reserved word is
>> needed.
>>
>
> Sure, you can always add new things to a language if they would
> previously have been syntax errors or constraint errors. But is there
> a use for it? I don't see any need.

That c must be mentioned twice for each interval is not a problem.
If there is a complex expression in place of c it can be computed
and stored to a variable before comparison to an interval.

> It is fine if you have a language that has good support for lists,
> sets, ranges, and other higher-level features - then an "in" keyword is
> a great idea. But C is not such a language, and that kind of feature
> would be well outside the scope of the language.

Or, if one for some reason does it in C anyway, one should have or
make a library of the essential functions, including membership tests.

> It would be easy enough to write a macro "in_range(a, x, b)" that would
> do the job. It is even easier, and more productive, that you simply
> write the "valid_char" function and use it, if that's what you need.

Indeed.

--
Mikko
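For illustration, the two options David Brown mentions could be sketched in C roughly as follows. The names `in_range` and `valid_char` come from the thread; the signatures, the choice of `static inline` functions instead of a macro, and the `uint32_t` parameter type are assumptions made for this sketch, not anything a poster wrote.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when lo <= x <= hi (closed interval).
   Written as a function, not a macro, so x is evaluated only once. */
static inline bool in_range(uint32_t lo, uint32_t x, uint32_t hi)
{
    return lo <= x && x <= hi;
}

/* C counterpart of the Python valid_char from the original post:
   a Unicode code point (at most 0x10FFFF) that is not a surrogate.
   The lower bound 0 is implied by the unsigned parameter type. */
static inline bool valid_char(uint32_t c)
{
    return c <= 0x10FFFF && !in_range(0xD800, c, 0xDFFF);
}
```

The Python original uses the half-open test `0xD800 <= c < 0xE000`; the closed interval ending at `0xDFFF` used here covers the same integers.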
| From | Janis Papanagnou <janis_papanagnou+ng@hotmail.com> |
|---|---|
| Date | 2024-06-04 15:42 +0200 |
| Message-ID | <v3n5jo$esag$1@dont-email.me> |
| In reply to | #385527 |
On 04.06.2024 11:13, Mikko wrote:
> On 2024-06-04 08:58:53 +0000, David Brown said:
>> On 04/06/2024 09:14, Lawrence D'Oliveiro wrote:
>>> Would it break backward compatibility for C to add a feature like this
>>> from Python? Namely, the ability to check if a value lies in an
>>> interval:
>>>
>>> def valid_char(c) :
>>>     "is integer c the code for a valid Unicode character." \
>>>     " This excludes surrogates."
>>>     return \
>>>         (
>>>                 0 <= c <= 0x10FFFF

While nice to have, it's just syntactic sugar.

>>>             and
>>>                 not (0xD800 <= c < 0xE000)
>>>         )
>>> #end valid_char
>>
>> Do you mean, could C treat "a <= x <= b" as "(a <= x) && (x <= b)"
>> without breaking existing code? The answer is no, C treats it as the
>> expression "(a <= x) <= b". So you would be changing the meaning of
>> existing C code. I think it's fair to say there is likely to be very
>> little existing correct and working C code that relies on the current
>> interpretation of such expressions, but the possibility is enough to
>> rule out such a change ever happening in C. (And it would also
>> complicate the grammar a fair bit.)
>>
>>
>> <https://c-faq.com/expr/transitivity.html>
>
> That does not prevent doing the same with a different syntax.
> The main difference is that in the current C syntax that cannot be
> said without mentioning c twice. In the example program C would
> require that c is mentioned four times but the shown Python code
> only needs it mentioned twice. An ideal syntax would only mention
> it once, perhaps
>
> return c in 0 .. 0xD7FF, 0xE000 .. 0x10FFFF ;

Introducing a new keyword 'in' would also break a lot of code, even
more code than the syntactic change ( . <= . <= . ) mentioned above
in the OP, don't you think?

>
> or
>
> return c : [0 .. 0xD800), [0xE000 .. 0x10FFFF] ;
>
> or something like that, preferably so that no new reserved word is
> needed.

Not worth the hassle, IMO.

Janis
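The concern about a new `in` keyword can be made concrete: `in` is an ordinary identifier in C today, and it is a common name for input buffers and streams. The function below is a made-up example (the name `copy_upper` and its signature are hypothetical), but code of this shape would stop compiling if `in` became a reserved word.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical example: copies n bytes from in to out, upper-casing
   each one. 'in' and 'out' are plain identifiers in current C. */
void copy_upper(const char *in, char *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = (char)toupper((unsigned char)in[i]);
    }
}
```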
| From | Keith Thompson <Keith.S.Thompson+u@gmail.com> |
|---|---|
| Date | 2024-06-04 14:04 -0700 |
| Message-ID | <878qzk1kts.fsf@nosuchdomain.example.com> |
| In reply to | #385527 |
Mikko <mikko.levanto@iki.fi> writes:
> On 2024-06-04 08:58:53 +0000, David Brown said:
>> On 04/06/2024 09:14, Lawrence D'Oliveiro wrote:
>>> Would it break backward compatibility for C to add a feature like this
>>> from Python? Namely, the ability to check if a value lies in an interval:
>>> def valid_char(c) :
>>>     "is integer c the code for a valid Unicode character." \
>>>     " This excludes surrogates."
>>>     return \
>>>         (
>>>                 0 <= c <= 0x10FFFF
>>>             and
>>>                 not (0xD800 <= c < 0xE000)
>>>         )
>>> #end valid_char
>> Do you mean, could C treat "a <= x <= b" as "(a <= x) && (x <= b)"
>> without breaking existing code? The answer is no, C treats it as
>> the expression "(a <= x) <= b". So you would be changing the
>> meaning of existing C code. I think it's fair to say there is
>> likely to be very little existing correct and working C code that
>> relies on the current interpretation of such expressions, but the
>> possibility is enough to rule out such a change ever happening in C.
>> (And it would also complicate the grammar a fair bit.)
>>
>> <https://c-faq.com/expr/transitivity.html>
>
> That does not prevent doing the same with a different syntax.
> The main difference is that in the current C syntax that cannot be
> said without mentioning c twice. In the example program C would
> require that c is mentioned four times but the shown Python code
> only needs it mentioned twice. An ideal syntax would only mention
> it once, perhaps
>
> return c in 0 .. 0xD7FF, 0xE000 .. 0x10FFFF ;
>
> or
>
> return c : [0 .. 0xD800), [0xE000 .. 0x10FFFF] ;
>
> or something like that, preferably so that no new reserved word is
> needed.
Relatedly, gcc has case ranges as an extension, and there's a proposal
to add them to C2Y (Y=6?):
<https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3269.htm>
The gcc feature uses the existing "..." token rather than "..". I'm not
sure whether using ".." would have caused problems beyond the need to
introduce a new token.
One minor issue, whether the feature uses ".." or "...", is that "1...2"
is a valid preprocessing number (and not a valid literal) so
`c in 1...2` would result in a syntax error. You just need to add
spaces: `c in 1 ... 2` (which I'd argue is a good idea anyway).
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
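As a rough sketch of what the gcc extension looks like in practice, here is the thread's `valid_char` check written with case ranges. It relies on the GNU case-range extension (gcc; clang also accepts it), so it was not standard C at the time of this thread. The function name `valid_char_switch` and the `unsigned long` parameter are choices made for this example. Note the spaces around `...`, for the preprocessing-number reason described above.

```c
#include <stdbool.h>

/* Uses the GNU case-range extension ("low ... high"), which is a
   compiler extension and not ISO C at the time of the thread.
   The spaces around "..." keep "0 ... 0xD7FF" from being lexed
   as a single preprocessing number. */
bool valid_char_switch(unsigned long c)
{
    switch (c) {
    case 0 ... 0xD7FF:        /* below the surrogate block */
    case 0xE000 ... 0x10FFFF: /* above the surrogate block */
        return true;
    default:
        return false;
    }
}
```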
| From | bart <bc@freeuk.com> |
|---|---|
| Date | 2024-06-04 11:39 +0100 |
| Message-ID | <v3mqtc$ct28$1@dont-email.me> |
| In reply to | #385515 |
On 04/06/2024 08:14, Lawrence D'Oliveiro wrote:
> Would it break backward compatibility for C to add a feature like this
> from Python? Namely, the ability to check if a value lies in an interval:
>
> def valid_char(c) :
>     "is integer c the code for a valid Unicode character." \
>     " This excludes surrogates."
>     return \
>         (
>                 0 <= c <= 0x10FFFF
>             and
>                 not (0xD800 <= c < 0xE000)
>         )
> #end valid_char
Yes it would break compatibility. The first '0 <= c' yields a 0 or 1 value.
But Python can also do it as `c in range(0, 0x10FFFF+1)`.
That could conceivably be added; the main obstacle would be introducing
that new `in` keyword, while a better solution than `range` would be likely.
The chances of it actually happening are infinitesimal, and I'd be long
dead before it became widely available.
This is the upside of devising your own language; I daily use these forms:
a <= b <= c
b in a .. c
in my systems language. The only stipulation with the first form is that
if there are any angle brackets, then they all point the same way,
otherwise the result is too confusing.
The language also needs to ensure that middle terms are evaluated only once.
If I ever want to have the C meaning of 'a <= b <= c' (say I'm porting
some code), then it can be written like this to break it up:
(a <= b) <= c
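The point about evaluating the middle term only once is the same issue a naive C macro runs into. A minimal sketch, with hypothetical names throughout (`BETWEEN`, `between`, `next_value`, and the two counting helpers are invented for this example): the macro expands its middle argument twice, while the function evaluates it once.

```c
#include <stdbool.h>

/* Naive macro: the middle argument x appears twice in the expansion. */
#define BETWEEN(lo, x, hi) ((lo) <= (x) && (x) <= (hi))

/* Function form: x is evaluated once, at the call site. */
static inline bool between(int lo, int x, int hi)
{
    return lo <= x && x <= hi;
}

static int calls; /* counts how often next_value() runs */

/* Middle term with a visible side effect. */
static int next_value(void)
{
    return ++calls;
}

/* Number of next_value() calls made by one use of the macro. */
int calls_via_macro(void)
{
    calls = 0;
    (void)BETWEEN(0, next_value(), 10);
    return calls;
}

/* Number of next_value() calls made by one use of the function. */
int calls_via_function(void)
{
    calls = 0;
    (void)between(0, next_value(), 10);
    return calls;
}
```

With these definitions `calls_via_macro()` returns 2 and `calls_via_function()` returns 1, which is why a built-in chained comparison would have to behave like the function and not like the macro.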
| From | Thiago Adams <thiago.adams@gmail.com> |
|---|---|
| Date | 2024-06-04 08:32 -0300 |
| Message-ID | <v3mu14$dhe9$1@dont-email.me> |
| In reply to | #385515 |
On 04/06/2024 04:14, Lawrence D'Oliveiro wrote:
> Would it break backward compatibility for C to add a feature like this
> from Python? Namely, the ability to check if a value lies in an interval:
>
> def valid_char(c) :
>     "is integer c the code for a valid Unicode character." \
>     " This excludes surrogates."
>     return \
>         (
>                 0 <= c <= 0x10FFFF
>             and
>                 not (0xD800 <= c < 0xE000)
>         )
> #end valid_char

See Chaining Comparisons

https://www.open-std.org/JTC1/SC22/WG21/docs/papers/2018/p0893r0.html

https://medium.com/@barryrevzin/chaining-comparisons-seeking-information-from-the-audience-abec909a1366

I don't know what the current status of this proposal is.
| From | Bonita Montero <Bonita.Montero@gmail.com> |
|---|---|
| Date | 2024-06-04 13:37 +0200 |
| Message-ID | <v3muaj$dl2t$1@raubtier-asyl.eternal-september.org> |
| In reply to | #385533 |
On 04.06.2024 at 13:32, Thiago Adams wrote:
> https://medium.com/@barryrevzin/chaining-comparisons-seeking-information-from-the-audience-abec909a1366
> I don't know what the current status of this proposal is.

This is for C++, and usually you'd want to write the chained
comparison explicitly rather than as a fold-expression.
| From | Keith Thompson <Keith.S.Thompson+u@gmail.com> |
|---|---|
| Date | 2024-06-04 15:29 -0700 |
| Message-ID | <871q5c1gwe.fsf@nosuchdomain.example.com> |
| In reply to | #385533 |
Thiago Adams <thiago.adams@gmail.com> writes:
> On 04/06/2024 04:14, Lawrence D'Oliveiro wrote:
>> Would it break backward compatibility for C to add a feature like this
>> from Python? Namely, the ability to check if a value lies in an interval:
>> def valid_char(c) :
>>     "is integer c the code for a valid Unicode character." \
>>     " This excludes surrogates."
>>     return \
>>         (
>>                 0 <= c <= 0x10FFFF
>>             and
>>                 not (0xD800 <= c < 0xE000)
>>         )
>> #end valid_char
>
> See Chaining Comparisons
> https://www.open-std.org/JTC1/SC22/WG21/docs/papers/2018/p0893r0.html
>
> https://medium.com/@barryrevzin/chaining-comparisons-seeking-information-from-the-audience-abec909a1366
>
> I don't know what the current status of this proposal is.
That's a proposal for C++.
One interesting piece of information is that the authors did some research
on existing code:
"""
To that end, we created a clang-tidy check for all uses of chained
comparison operators, ran it on many open source code bases, and
solicited help from the C++ community to run it on their own. The check
itself casts an intentionally wide net, matching any instance of a @ b @
c for any of the six comparison operators, regardless of the types of
these underlying expressions.
Overall, what we found was:
- Zero instances of chained arithmetic comparisons that are correct
today. That is, intentionally using the current standard behavior.
- Four instances of currently-erroneous arithmetic chaining, of the
assert(0 <= ratio <= 1.0); variety. These are bugs that compile today
but don’t do what the programmer intended, but with this proposal would
change in meaning to become correct.
- Many instances of using successive comparison operators in DSLs that
overloaded these operators to give meaning unrelated to comparisons.
"""
I presume they searched only C++ code, but I'd expect similar results
for C.
As indicated above, such a change would quietly break any existing
code that uses something like `a < b < c` that's intended to mean
`(a < b) < c`, but it would quietly *fix* any code that uses `a < b < c`
under the incorrect assumption that the comparisons are chained.
(Though the latter code will not have been tested under the new semantics.)
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
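To see why the `assert(0 <= ratio <= 1.0)` pattern found by the survey is a bug, it may help to spell out how C groups the expression. The sketch below uses hypothetical function names: the first function writes out the grouping C applies to `lo <= x <= hi`, and the second is what the programmer most likely meant.

```c
#include <stdbool.h>

/* How C parses "lo <= x <= hi": the left comparison yields the int
   0 or 1, and that value is then compared with hi. */
bool chained_as_parsed(double lo, double x, double hi)
{
    return (lo <= x) <= hi;
}

/* The mathematical reading: x lies in the closed interval [lo, hi]. */
bool chained_as_intended(double lo, double x, double hi)
{
    return lo <= x && x <= hi;
}
```

With `lo = 0`, `x = 5.0` and `hi = 1.0`, the parsed form is true (the left comparison gives 1, and 1 <= 1.0) while the intended form is false. Since the left comparison can only produce 0 or 1, an assert written the first way with an upper bound of 1.0 can never fire.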