
Why 'break' even has effects outside of its function?

Started by	Cong Wang <xiyou.wangcong@gmail.com>
First post	2012-03-05 08:23 +0000
Last post	2012-03-05 13:29 -0500
Articles	20 on this page of 26 — 10 participants


  Why 'break' even has effects outside of its function? Cong Wang <xiyou.wangcong@gmail.com> - 2012-03-05 08:23 +0000
    Re: Why 'break' even has effects outside of its function? Lew Pitcher <lpitcher@teksavvy.com> - 2012-03-05 01:03 -0800
      Re: Why 'break' even has effects outside of its function? Cong Wang <xiyou.wangcong@gmail.com> - 2012-03-05 09:42 +0000
        Re: Why 'break' even has effects outside of its function? Kaz Kylheku <kaz@kylheku.com> - 2012-03-05 13:59 +0000
        Re: Why 'break' even has effects outside of its function? Janis Papanagnou <janis_papanagnou@hotmail.com> - 2012-03-05 15:40 +0100
          Re: Why 'break' even has effects outside of its function? Kaz Kylheku <kaz@kylheku.com> - 2012-03-05 15:18 +0000
            Re: Why 'break' even has effects outside of its function? "Ed Morton" <mortonspam@gmail.com> - 2012-03-05 16:03 +0000
              Re: Why 'break' even has effects outside of its function? Kaz Kylheku <kaz@kylheku.com> - 2012-03-05 16:37 +0000
                Re: Why 'break' even has effects outside of its function? "Ed Morton" <mortonspam@gmail.com> - 2012-03-05 17:49 +0000
                  Re: Why 'break' even has effects outside of its function? gazelle@shell.xmission.com (Kenny McCormack) - 2012-03-05 18:02 +0000
                    Re: Why 'break' even has effects outside of its function? Ed Morton <mortonspam@gmail.com> - 2012-03-05 17:57 -0600
            Re: Why 'break' even has effects outside of its function? Janis Papanagnou <janis_papanagnou@hotmail.com> - 2012-03-05 19:22 +0100
              Re: Why 'break' even has effects outside of its function? pacman@kosh.dhis.org (Alan Curry) - 2012-03-05 22:15 +0000
                Re: Why 'break' even has effects outside of its function? Janis Papanagnou <janis_papanagnou@hotmail.com> - 2012-03-06 12:04 +0100
                  Re: Why 'break' even has effects outside of its function? pacman@kosh.dhis.org (Alan Curry) - 2012-03-06 23:10 +0000
                Re: Why 'break' even has effects outside of its function? Kaz Kylheku <kaz@kylheku.com> - 2012-03-07 05:58 +0000
                Re: Why 'break' even has effects outside of its function? Kaz Kylheku <kaz@kylheku.com> - 2012-03-07 05:58 +0000
                Re: Why 'break' even has effects outside of its function? Kaz Kylheku <kaz@kylheku.com> - 2012-03-07 07:25 +0000
                  Re: Why 'break' even has effects outside of its function? pacman@kosh.dhis.org (Alan Curry) - 2012-03-07 08:57 +0000
                    Re: Why 'break' even has effects outside of its function? Kaz Kylheku <kaz@kylheku.com> - 2012-03-07 09:39 +0000
                      Re: Why 'break' even has effects outside of its function? pacman@kosh.dhis.org (Alan Curry) - 2012-03-08 01:28 +0000
                        Re: Why 'break' even has effects outside of its function? Sven Mascheck <mascheck@email.invalid> - 2012-03-13 22:50 +0000
                          Re: Why 'break' even has effects outside of its function? Kaz Kylheku <kaz@kylheku.com> - 2012-03-13 23:00 +0000
                        Re: Why 'break' even has effects outside of its function? Kaz Kylheku <kaz@kylheku.com> - 2012-03-14 05:59 +0000
              Re: Why 'break' even has effects outside of its function? Kaz Kylheku <kaz@kylheku.com> - 2012-03-07 05:48 +0000
            Re: Why 'break' even has effects outside of its function? Barry Margolin <barmar@alum.mit.edu> - 2012-03-05 13:29 -0500


#4323 — Why 'break' even has effects outside of its function?

From	Cong Wang <xiyou.wangcong@gmail.com>
Date	2012-03-05 08:23 +0000
Subject	Why 'break' even has effects outside of its function?
Message-ID	<jj1t5r$nsj$1@wangcong.dont-email.me>

Hi, list!

I just found something interesting that surprised me: why can
the following "break 2" even affect the loop outside of the
function foo()?

bash-4.2$ foo() { local i; for i in 1 2 3; do break; echo $i; done;}
bash-4.2$ for i in 1 2 3; do foo; echo $i; done
1
2
3
bash-4.2$ unset foo
bash-4.2$ foo() { local i; for i in 1 2 3; do break 2; echo $i; done;}
bash-4.2$ for i in 1 2 3; do foo; echo $i; done

What is the rationale behind this behaviour?

Thanks!


#4324

From	Lew Pitcher <lpitcher@teksavvy.com>
Date	2012-03-05 01:03 -0800
Message-ID	<4e827b34-16d9-4a03-884c-f62c27d34a08@y10g2000vbn.googlegroups.com>
In reply to	#4323

On Mar 5, 3:23 am, Cong Wang <xiyou.wangc...@gmail.com> wrote:
> Hi, list!
>
> I just found an interesting thing which surprised me, that is, why
> the following "break 2" can even effect the loop outside of the
> function foo()?
>
> bash-4.2$ foo() { local i; for i in 1 2 3; do break; echo $i; done;}
> bash-4.2$ for i in 1 2 3; do foo; echo $i; done
> 1
> 2
> 3
> bash-4.2$ unset foo
> bash-4.2$ foo() { local i; for i in 1 2 3; do break 2; echo $i; done;}
> bash-4.2$ for i in 1 2 3; do foo; echo $i; done
>
> What is the rationale behind this behaviour?
>
> Thanks!

The number argument to break indicates the number of loops (at most)
to break out of.

Thus,
  do #loop 1
    do #loop 2
      break 1
    done #loop 2
  done #loop 1
will only break out of loop 2, and not loop 1, while
  do #loop 1
    do #loop 2
      break 2
    done #loop 2
  done #loop 1
will break out of both loop 2 /and/ loop 1

Your use of break matches this second case.

FWIW, see bash(1) ("man 1 bash")

       break [n]
              Exit from within a for, while, until, or select loop.  If n is
              specified, break n levels.  n must be ≥ 1.  If n is greater than
              the number of enclosing loops, all enclosing loops are exited.
              The return value is 0 unless n is not greater than or equal to
              1.
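
The two schematic cases above can be tried directly. This is a minimal sketch, assuming a shell that supports "local" (bash, dash); the function name run_nested and its variables are made up for illustration. Both loops live inside the same function, so the function-boundary question from the original post does not arise here.

```bash
# run_nested LEVEL: run two nested loops and call "break LEVEL" in the inner one.
run_nested() {
  local a b out=""
  for a in 1 2; do            # loop 1
    for b in x y; do          # loop 2
      out="${out:+$out }$a$b" # record which iteration was reached
      break "$1"
    done
  done
  echo "$out"
}

run_nested 1   # prints: 1x 2x   (only loop 2 is left each time)
run_nested 2   # prints: 1x      (loop 2 and loop 1 are both left)
```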


#4326

From	Cong Wang <xiyou.wangcong@gmail.com>
Date	2012-03-05 09:42 +0000
Message-ID	<jj21pu$es9$1@wangcong.dont-email.me>
In reply to	#4324

On Mon, 05 Mar 2012 at 09:03 GMT, Lew Pitcher <lpitcher@teksavvy.com> wrote:
> On Mar 5, 3:23 am, Cong Wang <xiyou.wangc...@gmail.com> wrote:
>> Hi, list!
>>
>> I just found an interesting thing which surprised me, that is, why
>> the following "break 2" can even effect the loop outside of the
>> function foo()?
>>
>> bash-4.2$ foo() { local i; for i in 1 2 3; do break; echo $i; done;}
>> bash-4.2$ for i in 1 2 3; do foo; echo $i; done
>> 1
>> 2
>> 3
>> bash-4.2$ unset foo
>> bash-4.2$ foo() { local i; for i in 1 2 3; do break 2; echo $i; done;}
>> bash-4.2$ for i in 1 2 3; do foo; echo $i; done
>>
>> What is the rationale behind this behaviour?
>>
>> Thanks!
>
> The number argument to break indicates the number of loops (at most)
> to break out of.
>

Of course I knew this...

The question is why "break 2" can even break the loop *outside
of* the function that contains it. From my current understanding, it
should only affect the loops *within* the function (foo() in the example
above); breaking the loop outside of foo() is odd...


#4327

From	Kaz Kylheku <kaz@kylheku.com>
Date	2012-03-05 13:59 +0000
Message-ID	<20120305051612.291@kylheku.com>
In reply to	#4326

On 2012-03-05, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> The question is that why "break 2" can even break the loop *outside
> of* the function which contains it? From my current understanding, it
> should only effect the loops *within* the function (in above example,
> foo()), breaking the loop outside of foo() is odd...

The answer is that bash uses dynamic scoping to implement break rather than
static (i.e. lexical) scoping. 

That is to say, the loop body establishes a break escape point not lexically
(via physical enclosure of one abstract syntax tree within another, resolved
without the help of the run-time environment) but dynamically (via the run-time
stack of evaluation frames, resolved at run time).

The break is lexically/statically outside of the loop in the caller, but
dynamically it is nested within that loop, inside the contour of the function
invocation.

The Single Unix Specification has nothing to say about which sense is to be
used (it simply uses the word "enclosing"), and so it means that
implementations should probably behave conservatively and provide a dynamic
interpretation of "enclosing".
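
Which interpretation a given shell uses can be probed rather than assumed. The following is a rough sketch, not a statement about any particular version: the names inner and probe_break_scope are invented here, and the labels "dynamic" and "lexical" are just shorthand for the two outcomes discussed above. The bash 4.2 transcript in the first post corresponds to "dynamic"; other shells, or other bash versions, may well report "lexical".

```bash
# inner: a loop that asks to break out of two levels, though only one
# loop is written inside this function.
inner() {
  local j
  for j in a b c; do
    break 2
  done
}

# probe_break_scope: call inner from within a loop and see whether the
# caller's loop survives.
probe_break_scope() {
  local i seen=""
  for i in 1 2 3; do
    inner
    seen="$seen$i"   # reached only if the caller's loop was not broken
  done
  if [ -z "$seen" ]; then
    echo dynamic     # break 2 also left the caller's loop
  else
    echo lexical     # break 2 was confined to the loop inside inner
  fi
}

result=$(probe_break_scope)
echo "break scope in this shell: $result"
```

Running the same file under several shells is a quick way to see the differences mentioned in this thread.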


#4328

From	Janis Papanagnou <janis_papanagnou@hotmail.com>
Date	2012-03-05 15:40 +0100
Message-ID	<jj2j84$t11$1@news.m-online.net>
In reply to	#4326

On 05.03.2012 10:42, Cong Wang wrote:
> On Mon, 05 Mar 2012 at 09:03 GMT, Lew Pitcher <lpitcher@teksavvy.com> wrote:
>> On Mar 5, 3:23 am, Cong Wang <xiyou.wangc...@gmail.com> wrote:
>>> Hi, list!
>>>
>>> I just found an interesting thing which surprised me, that is, why
>>> the following "break 2" can even effect the loop outside of the
>>> function foo()?
>>>
>>> bash-4.2$ foo() { local i; for i in 1 2 3; do break; echo $i; done;}
>>> bash-4.2$ for i in 1 2 3; do foo; echo $i; done
>>> 1
>>> 2
>>> 3
>>> bash-4.2$ unset foo
>>> bash-4.2$ foo() { local i; for i in 1 2 3; do break 2; echo $i; done;}
>>> bash-4.2$ for i in 1 2 3; do foo; echo $i; done
>>>
>>> What is the rationale behind this behaviour?
>>>
>>> Thanks!
>>
>> The number argument to break indicates the number of loops (at most)
>> to break out of.
>>
> 
> Of course I knew this...
> 
> The question is that why "break 2" can even break the loop *outside
> of* the function which contains it? From my current understanding, it
> should only effect the loops *within* the function (in above example,
> foo()), breaking the loop outside of foo() is odd...

I agree with you that this - even if considering the "dynamic scope"
explanation for bash (and zsh) - is odd. The ksh behaves differently.

The POSIX text apparently is just a slightly modified text of the
description of break in "The [new] Kornshell" book. I disagree with
the other poster, though, that "enclosing" as described in:

  "exit from the smallest enclosing for, while, or until loop"

should be seen or interpreted as some dynamic structure. This sort of
formulation is commonly used as a syntactic, thus static, description
of structure.

Janis


#4329

From	Kaz Kylheku <kaz@kylheku.com>
Date	2012-03-05 15:18 +0000
Message-ID	<20120305064121.221@kylheku.com>
In reply to	#4328

On 2012-03-05, Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
> I agree with you that this - even if considering the "dynamic scope"
> explanation for bash (and zsh) - is odd. The ksh behaves differently.

Users of compiled languages will find this odd, because they have
been conditioned that way.

It's hardly odd in a dynamic language geared toward interpretation.

It's a useful behavior for refactoring because it lets you move code from the
body of a loop into a helper function even if it contains breaks.

Say, what happens in ksh if you do "eval break" instead of "break"?

The dynamic interpretation trivially still sees the following break as being
enclosed in the loop. It works as naively expected and life goes on:

  while true ; do
    eval break
  done

I do not regard this as being lexically enclosed, reason being that "break" is
just a piece of data at this point, and not a command, and therefore it has no
lexical relationship with the loop. The point where it becomes a command is in
the bowels of eval, which is a function with a remote scope somewhere in the
bowels of the shell.

But it is textually enclosed, and good luck explaining to newbies
the difference. Some shell programmers might well not find it odd and expect
the eval to work, even if they don't expect break-in-a-function to work.
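
A small check of the eval case, as a sketch: the counter and the bound of 5 are additions for illustration only, there so the loop terminates even in a shell where "eval break" does not leave it.

```bash
count=0
while [ "$count" -lt 5 ]; do
  count=$((count + 1))
  eval break          # "break" reaches eval as a plain word and is parsed at run time
done
echo "iterations: $count"   # 1 if eval break left the loop on the first pass
```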


#4330

From	"Ed Morton" <mortonspam@gmail.com>
Date	2012-03-05 16:03 +0000
Message-ID	<201203051603522493@webuse.net>
In reply to	#4329

Kaz Kylheku <kaz@kylheku.com> wrote:

> On 2012-03-05, Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
> > I agree with you that this - even if considering the "dynamic scope"
> > explanation for bash (and zsh) - is odd. The ksh behaves differently.
> 
> Users of compiled languages will find this odd, because they have
> been conditioned that way.
> 
> It's hardly odd in a dynamic language geared toward interpretation.

The good news is that anyone writing code that uses break to jump through
multiple levels of loops will almost certainly have FAR more problems with their
software than just whether or not the break takes them out of a function, so none
of this really matters...

     Ed.



#4331

From	Kaz Kylheku <kaz@kylheku.com>
Date	2012-03-05 16:37 +0000
Message-ID	<20120305081421.785@kylheku.com>
In reply to	#4330

On 2012-03-05, Ed Morton <mortonspam@gmail.com> wrote:
> Kaz Kylheku <kaz@kylheku.com> wrote:
>
>> On 2012-03-05, Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
>> > I agree with you that this - even if considering the "dynamic scope"
>> > explanation for bash (and zsh) - is odd. The ksh behaves differently.
>> 
>> Users of compiled languages will find this odd, because they have
>> been conditioned that way.
>> 
>> It's hardly odd in a dynamic language geared toward interpretation.
>
> The good news is that anyone writing code that uses break to jump through
> multiple levels of loops will almost certainly have FAR more problems with their
> software than just whether or not the break takes them out of a function so none
> of this really matters...

That is nonsense. Without a multi level break, you end up with messy gotos
or extra variables that are tested at every level, both of which are worse.

Also, a function return is a kind of multi-level break.

You've never written a C (or other) program that returned from inside a loop?

Please rewrite the following to meet the religious obligations of avoiding such
a thing:

int find_val_in_3d_vec(int ***a, int rowsize, int val, int *p2, int *p1, int *p0)
{
  int i2, i1, i0;

  /* top two levels of a are null-terminated vectors;
     for the leaf arrays, rowsize specifies the length. */

  for (i2 = 0; a[i2]; i2++)
    for (i1 = 0; a[i2][i1]; i1++)
       for (i0 = 0; i0 < rowsize; i0++)
         if (a[i2][i1][i0] == val) {
           *p2 = i2, *p1 = i1, *p0 = i0;
           return 1;
         }

  return 0;
}
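
The same "return as a multi-level break" idea carries over to shell functions. This is a sketch only: find_cell and its hard-coded 3x3 grid are hypothetical stand-ins for the C function above, not a translation of it.

```bash
# find_cell VALUE: search a small fixed grid; on success print "row col"
# (zero-based) and return 0, otherwise return 1.
find_cell() {
  local r=0 c row cell
  for row in "1 2 3" "4 5 6" "7 8 9"; do
    c=0
    for cell in $row; do      # unquoted on purpose: split the row into cells
      if [ "$cell" = "$1" ]; then
        echo "$r $c"
        return 0              # leaves both loops and the function at once
      fi
      c=$((c + 1))
    done
    r=$((r + 1))
  done
  return 1
}

find_cell 6    # prints: 1 2
```

Because the return stays inside the function that contains both loops, it does not depend on how a shell treats break across a function boundary.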


#4332

From	"Ed Morton" <mortonspam@gmail.com>
Date	2012-03-05 17:49 +0000
Message-ID	<201203051749142493@webuse.net>
In reply to	#4331

Kaz Kylheku <kaz@kylheku.com> wrote:

> On 2012-03-05, Ed Morton <mortonspam@gmail.com> wrote:
> > Kaz Kylheku <kaz@kylheku.com> wrote:
> >
> >> On 2012-03-05, Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
> >> > I agree with you that this - even if considering the "dynamic scope"
> >> > explanation for bash (and zsh) - is odd. The ksh behaves differently.
> >> 
> >> Users of compiled languages will find this odd, because they have
> >> been conditioned that way.
> >> 
> >> It's hardly odd in a dynamic language geared toward interpretation.
> >
> > The good news is that anyone writing code that uses break to jump through
> > multiple levels of loops will almost certainly have FAR more problems with their
> > software than just whether or not the break takes them out of a function so none
> > of this really matters...
> 
> That is nonsense. Without a multi level break, you end up with messy gotos
> or extra variables that are tested at every level, both of which are worse.
> 
> Also, a function return is a kind of multi-level break.
> 
> You've never written a C (or other) program that returned from inside a loop?

Not since my first programming class before I knew any better.
 
> Please rewrite the following to meet the religious obligations of avoiding such
> a thing:

Nothing to do with religion - it's about clarity and maintainability.
 
> int find_val_in_3d_vec(int ***a, int rowsize, int val, int *p2, int *p1, int *p0)
> {
>   int i2, i1, i0;
> 
>   /* top two levels of a are null-terminated vectors;
>      for the leaf arrays, rowsize specifies the length. */
> 
>   for (i2 = 0; a[i2]; i2++)
>     for (i1 = 0; a[i2][i1]; i1++)
>        for (i0 = 0; i0 < rowsize; i0++)
>          if (a[i2][i1][i0] == val) {
>            *p2 = i2, *p1 = i1, *p0 = i0;
>            return 1;
>          }
> 
>   return 0;
> }
>
 
int find_val_in_3d_vec(int ***a, int rowsize, int val, int *p2, int *p1, int *p0)
{
  int found_val = 0;
  int i2, i1, i0;

  /* top two levels of a are null-terminated vectors;
     for the leaf arrays, rowsize specifies the length. */

  for (i2 = 0; a[i2] && !found_val; i2++)
    for (i1 = 0; a[i2][i1] && !found_val; i1++)
       for (i0 = 0; (i0 < rowsize) && !found_val; i0++)
         if (a[i2][i1][i0] == val) {
           *p2 = i2, *p1 = i1, *p0 = i0;
           found_val = 1;
         }

  printf("DEBUG: %s result is %d\n",__FUNCTION__,found_val);

  return found_val;
}

Notice that now the reasons for exiting each "for" loop are clearly shown in the
terminating conditions of each instead of one of the reasons being hidden and
un-named inside an "if" statement inside the code of the innermost loop. The
code is now a bit easier to understand and less likely to get broken by future
enhancements.

Also notice how easy it was for me to add a debugging statement to just print
the return value of the function.

    Ed.



#4334

From	gazelle@shell.xmission.com (Kenny McCormack)
Date	2012-03-05 18:02 +0000
Message-ID	<jj2v4j$f3f$1@news.xmission.com>
In reply to	#4332

In article <201203051749142493@webuse.net>,
Ed Morton <mortonspam@gmail.com> wrote:
...
>> Please rewrite the following to meet the religious obligations of avoiding such
>> a thing:
>
>Nothing to do with religion - it's about clarity and maintainability.

I.e., religion.

Note that religious people do not consider the things they believe to be
"religious" - they consider them to be "facts".

Similarly, programming zealots do not consider the things they believe to be
zealous - they consider them to be "about clarity and maintainability".

-- 
Religion is regarded by the common people as true,
	by the wise as foolish,
	and by the rulers as useful.

(Seneca the Younger, 65 AD)


#4345

From	Ed Morton <mortonspam@gmail.com>
Date	2012-03-05 17:57 -0600
Message-ID	<jj3jsj$ulb$1@dont-email.me>
In reply to	#4334

On 3/5/2012 12:02 PM, Kenny McCormack wrote:
> In article <201203051749142493@webuse.net>,
> Ed Morton <mortonspam@gmail.com> wrote:
> ...
>>> Please rewrite the following to meet the religious obligations of avoiding such
>>> a thing:
>>
>> Nothing to do with religion - it's about clarity and maintainability.
>
> I.e., religion.

No, clarity and maintainability are some of the many goals during software 
development. Religion is not.

Not sure where you're going with this as I don't know how you feel about
religion: is it that you think clarity and maintainability are divine,
unfathomable entities, or that you think clarity and maintainability are
superstitious nonsense that isn't worth considering?

For me they're just a couple of things to consider when writing code but YMMV.

     Ed.


#4335

From	Janis Papanagnou <janis_papanagnou@hotmail.com>
Date	2012-03-05 19:22 +0100
Message-ID	<jj309m$7d9$1@news.m-online.net>
In reply to	#4329

On 05.03.2012 16:18, Kaz Kylheku wrote:
> On 2012-03-05, Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
>> I agree with you that this - even if considering the "dynamic scope"
>> explanation for bash (and zsh) - is odd. The ksh behaves differently.
> 
> Users of compiled languages will find this odd, because they have
> been conditioned that way.

I don't want to discuss that mindset of what people might think or not.

So I'll just tell you that I, personally, am used to a lot of language
types: interpreted and compiled, procedural, object-oriented, and
functional, assembler and high-level languages. And I do consider this
interpretation of bash's break to be very odd.

Your red herring about "users of compiled languages" isn't effective.

(BTW, for a really interesting dynamic concept have a look at Simula's
execution model, which is one of the most complex and dynamic that I've
seen in programming languages.)

> 
> It's hardly odd in a dynamic language geared toward interpretation.
> 
> It's a useful behavior for refactoring because it lets you move code from the
> body of a loop into a helper function even if it contains breaks.
> 
> Say, what happens in ksh if you do "eval break" instead of "break"?

What do you think happens? It's not even relevant in the given context!

Simply put, eval makes one more parsing pass over its argument, which here
names the special built-in command break, and then executes that command.
The result is the same as without eval: bash's odd result and ksh's
not-so-odd result.

> 
> The dynamic interpretation trivially still sees the following break as being
> enclosed in the loop. It works as naively expected and life goes on:
> 
>   while true ; do
>     eval break
>   done

What do you think this tells us WRT the topic in question? - That break
will still "exit from the smallest enclosing for, while, or until loop",
(and independent of eval), in bash and in ksh. A static relation.

> 
> I do not regard this as being lexically enclosed, reason being that "break" is
> just a piece of data at this point, and not a command, and therefore it has no
> lexical relationship with the loop.

You are introducing the eval command to declare break as data, but you know
that after the shell has parsed it, it is a command and is handled as such.
Camouflaging break behind eval doesn't make that argument any better.
(Yet another red herring.)

> The point where it becomes a command is in
> the bowels of eval, which is a function with a remote scope somewhere in the
> bowels of the shell.

It's just a "special built-in command" of the shell, with very specific
semantics; eval is a prominent exception in the shell's command set, but
it doesn't contribute to the question of "break N".

> 
> But it is textually enclosed, and good luck explaining to newbies
> the difference. Some shell programmers might well not find it odd and expect
> the eval to work, even if they don't expect break-in-a-function to work.

Why do you think there's a problem explaining break, eval, or both?

Eval doesn't contribute to an explanation why break in bash behaves as
it does.

Janis


#4339

From	pacman@kosh.dhis.org (Alan Curry)
Date	2012-03-05 22:15 +0000
Message-ID	<jj3du8$jqm$1@speranza.aioe.org>
In reply to	#4335

In article <jj309m$7d9$1@news.m-online.net>,
Janis Papanagnou  <janis_papanagnou@hotmail.com> wrote:
>On 05.03.2012 16:18, Kaz Kylheku wrote:
>> On 2012-03-05, Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
>>> I agree with you that this - even if considering the "dynamic scope"
>>> explanation for bash (and zsh) - is odd. The ksh behaves differently.
>> 
>> Users of compiled languages will find this odd, because they have
>> been conditioned that way.
>
>I don't want to discuss that mindset of what people might think or not.
>
>So I'll just tell you that I, personally, am used in a lot of language
>types, interpretative and compiled, procedural and object oriented, or
>functional, assembler and high level languages. And this interpretation
>of bash's break I do consider to be very odd.

perl allows this, but generates a warning for it.

>
>> 
>> The dynamic interpretation trivially still sees the following break as being
>> enclosed in the loop. It works as naively expected and life goes on:
>> 
>>   while true ; do
>>     eval break
>>   done
>>
>
>What do you think this tells us WRT the topic in question? - That break
>will still "exit from the smallest enclosing for, while, or until loop",
>(and independent of eval), in bash and in ksh. A static relation.

Right, but the second parsing is done on the eval'ed command line later, and
the parsing of the loop is not in progress at the same time, so it's weird to
think of the loop as "lexically enclosing" the second parse. It's more
natural (for me at least) that eval should create a new top-level lexical
environment.

perl also generates a warning for a loop operator that looks outside an eval
to find its loop.

>
>> 
>> I do not regard this as being lexically enclosed, reason being that "break" is
>> just a piece of data at this point, and not a command, and therefore it has no
>> lexical relationship with the loop.
>
>You are introducing a command eval to declare break as data but know that
>after the shell parsing it's a command and handled as such. Introducing
>eval doesn't make that argument any better even if you camouflage it behind
>eval. (Yet another red herring.)

I think of lexical relationships as existing between "in-progress parse tree
nodes", so that once you've finished parsing something, it's impossible for
anything else to later reach into its lexical environment and attach to
something there (in this case, the loop's endpoint).

>
>Why are you thinking there's a problem explaining break, eval, or both?
>
>Eval doesn't contribute to an explanation why break in bash behaves as
>it does.

What do you think of this:

  while :; do
  eval done

Should it work? Or this one:

  while :; do
  echo before
  eval done
  echo after
  done

Some of us just think of "break" as being bound to the enclosing "while" in
exactly the same syntactic manner as the "done" that ends it.

-- 
Alan Curry


#4363

From	Janis Papanagnou <janis_papanagnou@hotmail.com>
Date	2012-03-06 12:04 +0100
Message-ID	<jj4qv9$68n$1@news.m-online.net>
In reply to	#4339

On 05.03.2012 23:15, Alan Curry wrote:
> In article <jj309m$7d9$1@news.m-online.net>,
> Janis Papanagnou  <janis_papanagnou@hotmail.com> wrote:
>> On 05.03.2012 16:18, Kaz Kylheku wrote:
>>>
>>>   while true ; do
>>>     eval break
>>>   done
>>
>> What do you think this tells us WRT the topic in question? - That break
>> will still "exit from the smallest enclosing for, while, or until loop",
>> (and independent of eval), in bash and in ksh. A static relation.
> 
> Right, but the second parsing is done on the eval'ed command line later, and
> the parsing of the loop is not in progress at the same time, so it's weird to
> think of the loop as "lexically enclosing" the second parse. It's more
> natural (for me at least) that eval should create a new top-level lexical
> environment.

I understand that the one interpretation is more natural for you than
the other; since this seems odd to me I will abstain from continuing on that.

[...]
> I think of lexical relationships as existing between "in-progress parse tree
> nodes", so that once you've finished parsing something, it's impossible for
> anything else to later reach into its lexical environment and attach to
> something there (in this case, the loop's endpoint).
> 
>>
>> Why are you thinking there's a problem explaining break, eval, or both?
>>
>> Eval doesn't contribute to an explanation why break in bash behaves as
>> it does.
> 
> What do you think of this:
> 
>   while :; do
>   eval done
> 
> Should it work?

This makes no sense to me, seems completely wrong, and I suppose it would
just lead to an error. A quick check with some common shells confirms that.

> Or this one:
> 
>   while :; do
>   echo before
>   eval done
>   echo after
>   done
> 
> Some of us just think of "break" as being bound to the enclosing "while" in
> exactly the same syntactic manner as the "done" that ends it.

As I said initially, this seems odd to me, so I can't comment on that mindset.
All I want to say WRT your example is that 'done' and 'break' are different
sorts of lexical elements in shell. If you think of them as being similar in
some respect, I can only suggest rethinking the mental model (which stems
from your perl view, IIUC) that you have of the shell syntax and its
associated operational semantics.

Janis


#4372

From	pacman@kosh.dhis.org (Alan Curry)
Date	2012-03-06 23:10 +0000
Message-ID	<jj65gb$5gf$1@speranza.aioe.org>
In reply to	#4363

In article <jj4qv9$68n$1@news.m-online.net>,
Janis Papanagnou  <janis_papanagnou@hotmail.com> wrote:
>On 05.03.2012 23:15, Alan Curry wrote:
>> 
>> What do you think of this:
>> 
>>   while :; do
>>   eval done
>> 
>> Should it work?
>
>This makes no sense to me, seems completely wrong, and I suppose it would
>just lead to an error. A quick check with some common shells confirms that.

At least everyone agrees on that.

>> 
>> Some of us just think of "break" as being bound to the enclosing "while" in
>> exactly the same syntactic manner as the "done" that ends it.
>
>As said initially; this seems odd to me, so I can't comment on that mindset.
>All I want to say WRT your example is that 'done' and 'break' are different
>sorts of lexical elements in shell. If you think of them as being in some
>respect similar I can just propose to rethink about that mental model (that
>stems from your perl view, IIUC) that you have of the shell syntax and its
>associated operative semantics.

I wrote many shell scripts before I learned perl, so that influence didn't
work that way. Besides, if influenced by perl I'd be expecting dynamic
scoping. perl has an optional warning for it, but I wouldn't expect one from
a shell since shells mostly don't do warnings.

Now that I've seen that breaking out of your caller's loop from within a
shell function is possible, I'll be sure to continue never doing it. It's too
spooky.

Oh and it's not quite universal. pdksh and posh reject it.

-- 
Alan Curry


#4374

From	Kaz Kylheku <kaz@kylheku.com>
Date	2012-03-07 05:58 +0000
Message-ID	<20120306214509.680@kylheku.com>
In reply to	#4339

On 2012-03-05, Alan Curry <pacman@kosh.dhis.org> wrote:
> In article <jj309m$7d9$1@news.m-online.net>,
> Janis Papanagnou  <janis_papanagnou@hotmail.com> wrote:
>>On 05.03.2012 16:18, Kaz Kylheku wrote:
>>> On 2012-03-05, Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
>>>> I agree with you that this - even if considering the "dynamic scope"
>>>> explanation for bash (and zsh) - is odd. The ksh behaves differently.
>>> 
>>> Users of compiled languages will find this odd, because they have
>>> been conditioned that way.
>>
>>I don't want to discuss that mindset of what people might think or not.
>>
>>So I'll just tell you that I, personally, am used in a lot of language
>>types, interpretative and compiled, procedural and object oriented, or
>>functional, assembler and high level languages. And this interpretation
>>of bash's break I do consider to be very odd.
>
> perl allows this, but generates a warning for it.
>
>>
>>> 
>>> The dynamic interpretation trivially still sees the following break as being
>>> enclosed in the loop. It works as naively expected and life goes on:
>>> 
>>>   while true ; do
>>>     eval break
>>>   done
>>>
>>
>>What do you think this tells us WRT the topic in question? - That break
>>will still "exit from the smallest enclosing for, while, or until loop",
>>(and independent of eval), in bash and in ksh. A static relation.
>
> Right, but the second parsing is done on the eval'ed command line later, and
> the parsing of the loop is not in progress at the same time, so it's weird to
> think of the loop as "lexically enclosing" the second parse. It's more
> natural (for me at least) that eval should create a new top-level lexical
> environment.
>
> perl also generates a warning for a loop operator that looks outside an eval
> to find its loop.
>
>>
>>> 
>>> I do not regard this as being lexically enclosed, reason being that "break" is
>>> just a piece of data at this point, and not a command, and therefore it has no
>>> lexical relationship with the loop.
>>
>>You are introducing a command eval to declare break as data but know that
>>after the shell parsing it's a command and handled as such. Introducing
>>eval doesn't make that argument any better even if you camouflage it behind
>>eval. (Yet another red herring.)
>
> I think of lexical relationships as existing between "in-progress parse tree
> nodes", so that once you've finished parsing something, it's impossible for
> anything else to later reach into its lexical environment and attach to
> something there (in this case, the loop's endpoint).
>
>>
>>Why are you thinking there's a problem explaining break, eval, or both?
>>
>>Eval doesn't contribute to an explanation why break in bash behaves as
>>it does.
>
> What do you think of this:
>
>   while :; do
>   eval done

while, do and done are lexical tokens in the syntax of the shell, and not
functions/commands. This is recognized and codified in the standard,
which presents a grammar.

So I think what you have above is a parse error which is recognized
inside eval as such.

> Some of us just think of "break" as being bound to the enclosing "while" in
> exactly the same syntactic manner as the "done" that ends it.

That isn't the case syntactically though. break is just a command, which
can occur in any syntax whatsoever.

   while true ; do
     case $bar in 
        fluff )
          break ;;
      ...
     esac
   done

Do you suppose there are two parser production rules for case/esac, one for
when case is embedded in a loop, and one when it isn't? (I don't
see such a thing in the spec.)
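
A runnable version of the fragment above, as a minimal sketch. The function name first_fluff and the word list are made up for illustration; the point is only that break sits inside case/esac and still targets the enclosing loop.

```shell
# Hypothetical helper: print words until "fluff" is seen, then leave the loop.
first_fluff() {
  for bar in "$@"; do
    case $bar in
      fluff)
        # break appears inside case/esac, yet it terminates the for loop
        break ;;
      *)
        echo "$bar" ;;
    esac
  done
  echo "stopped at: $bar"
}

first_fluff a b fluff c
```

The same case/esac text would parse identically outside a loop; only the run-time effect of break differs.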


#4375

From	Kaz Kylheku <kaz@kylheku.com>
Date	2012-03-07 05:58 +0000
Message-ID	<20120306215443.547@kylheku.com>
In reply to	#4339

On 2012-03-05, Alan Curry <pacman@kosh.dhis.org> wrote:
> Some of us just think of "break" as being bound to the enclosing "while" in
> exactly the same syntactic manner as the "done" that ends it.

Some of us are wrong.


#4376

From	Kaz Kylheku <kaz@kylheku.com>
Date	2012-03-07 07:25 +0000
Message-ID	<20120306215813.85@kylheku.com>
In reply to	#4339

On 2012-03-05, Alan Curry <pacman@kosh.dhis.org> wrote:
> Right, but the second parsing is done on the eval'ed command line later, and
> the parsing of the loop is not in progress at the same time, so it's weird to
> think of the loop as "lexically enclosing" the second parse. It's more
> natural (for me at least) that eval should create a new top-level lexical
> environment.

If you think that, you are more educated than the naive user who expects the
eval to just work in the "obvious" environment where it occurs, because
you know what a "top level lexical environment" is. Yet, you are incompletely
educated if you think that is the only right behavior.

The Emacs people disagree with you, for instance. Emacs Lisp has an eval
function which does not introduce any environment. It is a dynamically scoped
dialect.  This hasn't stopped that crowd from doing useful things, and probably
a lot of things in Emacs depend on eval doing what it does in that dialect.

Anyway, the standard shell language does not have lexical /anything/.
Just top level functions and variables.

Bash has local variables, and they are dynamically scoped.  This allows
eval hacks to work when locals are involved.

bash_func()
{
  local x=3;
  local varname="x";
  eval local y=\$$varname # y becomes 3
  ...
}

This kind of thing is useful in a language which has no macro facility.
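
To make the dynamic scoping of locals concrete, here is a small sketch. The function names show_x and with_local are invented for the demonstration; nothing here is specific to eval.

```shell
x=global

show_x() {
  # No local x here: the lookup finds whichever x is live in the call chain.
  echo "x is $x"
}

with_local() {
  local x=local
  show_x        # sees with_local's x, not the global one
}

show_x          # global value
with_local      # local value, seen from inside show_x
show_x          # global value is visible again after with_local returns
```

A lexically scoped language would print the global value all three times, since show_x is not written inside with_local.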

For instance you can write a function which accepts a variable
name (for a variable in the calling environment) and can change its value.

In Bash, your eval'ed code can not only refer to the variable whose name
was passed in, but also to the local variables in the function.
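
A minimal sketch of such a function, under the assumption that the caller passes a valid variable name. The names setvar and caller are hypothetical, and the double-underscore locals just follow the naming habit used in the dotimes function in this post.

```shell
# Hypothetical helper: assign $2 to the variable whose name is in $1.
setvar() {
  local __name="$1"
  local __value="$2"
  # \$__value survives into the eval, so the eval'ed code reads setvar's
  # own local and the value is expanded exactly once.
  eval "$__name=\$__value"
}

caller() {
  local result=unset
  setvar result 'two  words; $HOME'   # assigns to caller's local "result"
  echo "$result"
}

caller
```

The eval'ed text refers both to the caller's variable (by the name passed in) and to a local of setvar itself, which is the combination described above.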

You can write a function which controls the evaluation of syntax
that is passed in:

dotimes ()
{
  local __count_var=$1
  local __max_count=$2
  local __count=0

  shift; shift;

  while [ $__count -lt $__max_count ] ; do
     eval $__count_var=\$__count;
     eval "$@"
     __count=$(( __count + 1 ))
  done
}

dotimes x 10 'printf "x = %d\n" $x'

Why do we use underscores? For some measure of hygiene (which is exactly like
macro hygiene).  If the code being evaluated contains $__count, it resolves
to the local variable inside dotimes.  So if we call it count, stuff like
this will break:

dotimes x 10 'mkdir -p foo-$count-$x'   # user's $count expected, not ours
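
The capture problem can be reproduced with a deliberately unhygienic variant. This is only a sketch; naive_twice is a hypothetical name and the loop count of 2 is arbitrary.

```shell
# Deliberately unhygienic: the loop counter is called plain "count".
naive_twice() {
  local count=0
  while [ "$count" -lt 2 ]; do
    eval "$@"                 # caller's code runs with our "count" in scope
    count=$(( count + 1 ))
  done
}

count=user
naive_twice 'echo "count is $count"'
echo "afterwards: $count"
```

The eval'ed code sees 0 and then 1 instead of the caller's value, while the global is untouched once the function returns.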

And by the way:

  #!/bin/bash

  br=break

  while true ; do
    echo in loop
    eval $br
  done

  echo out of loop

output:

  in loop
  out of loop

Bash does the "right thing" by conforming to the expectations of shell coders.
The alternative is to die with an error, which is nonproductive.

That naive behavior is not so naive: it lets you do things that are not
possible otherwise.

Now here is my point. Think about this.  Suppose the above "eval $br" were to
die with error like "bash: break: only meaningful in a `for', `while', or
`until' loop".  Given that we can write the dotimes function above, with
the crazy eval hacks that work, and that we have to hide our local variables
with underscores, etc, don't you think that this diagnostic would be
/laughably/ inconsistent?

> Some of us just think of "break" as being bound to the enclosing "while" in
> exactly the same syntactic manner as the "done" that ends it.

done is just punctuation, like a closing parenthesis.  break has semantics of
its own.

Moreover, you do not necessarily know, statically, where it goes. Remember, it
takes an argument: break $run_time_value .
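
A short sketch of that last point. The function name nested and its loop values are made up; the only thing being shown is that the number of loops left by break is not known until the function is called.

```shell
# Hypothetical demo: $1 decides at run time how many loops break leaves.
nested() {
  local levels="$1"
  for outer in 1 2; do
    for inner in a b; do
      echo "$outer$inner"
      break "$levels"
    done
    echo "after inner loop ($outer)"
  done
  echo "done"
}

nested 1    # leaves only the inner loop each time
nested 2    # leaves both loops on the first iteration
```

No static reading of the function body can say which loop the break belongs to.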


#4377

From	pacman@kosh.dhis.org (Alan Curry)
Date	2012-03-07 08:57 +0000
Message-ID	<jj77u1$jn3$1@speranza.aioe.org>
In reply to	#4376

In article <20120306215813.85@kylheku.com>,
Kaz Kylheku  <kaz@kylheku.com> wrote:
>On 2012-03-05, Alan Curry <pacman@kosh.dhis.org> wrote:
>> Right, but the second parsing is done on the eval'ed command line later, and
>> the parsing of the loop is not in progress at the same time, so it's weird to
>> think of the loop as "lexically enclosing" the second parse. It's more
>> natural (for me at least) that eval should create a new top-level lexical
>> environment.
>
[...]
>
>Anyway, the standard shell language does not have lexical /anything/.
>Just top level functions and variables.

The shell itself contains a parser for the language. An obvious
implementation would create a new one of those, with no way of knowing
whether it's inside a loop in the outer execution environment, and thus no
way to influence an outer loop's execution.

>
>For instance you can write a function which accepts a variable
>name (for a variable in the calling environment) and can change its value.

Having access to variables is nowhere near as deep a relationship as having
access to the execution control stack that keeps track of loops currently in
progress!

>
>In Bash, your eval'ed code can not only refer to the variable whose name
>was passed in, but also to the local variables in the function.

Global variables are accessible everywhere, including functions and evals.
That doesn't seem like a big deal to me.

>
>You can write a function which controls the evaluation of syntax
>that is passed in:

This should be good...

>
>dotimes ()
>{
>  local __count_var=$1
>  local __max_count=$2
>  local __count=0
>
>  shift; shift;
>
>  while [ $__count -lt $__max_count ] ; do
>     eval $__count_var=\$__count;

I think that would be just as good without the backslash. And indirect
assignments are surely one of the most common uses of eval, so not
surprising.

>     eval "$@"

Whether this is a spooky eval or not depends on what the caller put in it...

>     __count=$(( __count + 1 ))
>  done
>}
>
>dotimes x 10 'printf "x = %d\n" $x'

And this caller doesn't put anything interesting there. This works perfectly
in pdksh, which doesn't support "eval break". If you try

dotimes x 10 'printf "x = %d\n" $x;break'

then most shells just print "x = 0" and quit, but pdksh prints the 10 values
of x, with a "break: cannot break" error after each one. And pdksh is still
matching up with my intuition, so at least I'm not crazy in a *unique* way.
Reading and writing of global variables is just not surprising. Breaking out
of a "foreign" loop (one that was not parsed by the same parsing pass as the
break statement) is a whole different category.

>
>Why do we use underscores? For some measure of hygiene (which is exactly like
>macro hygiene).  If the code being evaluated contains $__count, it resolves
>to the local variable inside dotimes.  So if we call it count, stuff like
>this will break:
>
>dotimes x 10 'mkdir -p foo-$count-$x'   # user's $count expected, not ours
>
>And by the way:
>
>  #!/bin/bash
>
>  br=break
>
>  while true ; do
>    echo in loop
>    eval $br
>  done
>
>  echo out of loop
>
>output:
>
>  in loop
>  out of loop
>
>Bash does the "right thing" by conforming to the expectations of shell coders.
>The alternative is to die with an error, which is nonproductive.

pdksh does the "right thing" by telling the coder he's nuts.

>
>That naive behavior is not so naive: it lets you do things that are not
>possible otherwise.
>
>Now here is my point. Think about this.  Suppose the above "eval $br" were to
>die with error like "bash: break: only meaningful in a `for', `while', or

...like pdksh's error message but more verbose....

>`until' loop".  Given that we can write the dotimes function above, with
>the crazy eval hacks that work, and that we have to hide our local variables
>with underscores, etc, don't you think that this diagnostic would be
>/laughably/ inconsistent?

Me and pdksh don't think so, since we see looping control structures as
super-low-level stuff that should resolve at parse time, and shell variables
as a simple global mapping of strings to other strings.

>
>> Some of us just think of "break" as being bound to the enclosing "while" in
>> exactly the same syntactic manner as the "done" that ends it.
>
>done is just punctuation, like a closing parenthesis.  break has semantics of
>its own.
>
>Moreover, you do not necessarily know, statically, where it goes. Remember, it
>takes an argument: break $run_time_value .

Oh hell, that seals it. All the previous examples are only slightly spooky.
That's like a computed goto. No... it's more than that. Combined with the
ability to break out of a function, you can't even know how many functions
you'll be prematurely remotely terminating. It's pure evil... it's LONGJMP!

The Bourne shell has been hiding a longjmp equivalent all this time. Must
wash hands... can't get clean...

Come back, csh, all is forgiven

-- 
Alan Curry


#4378

From	Kaz Kylheku <kaz@kylheku.com>
Date	2012-03-07 09:39 +0000
Message-ID	<20120307010429.677@kylheku.com>
In reply to	#4377

On 2012-03-07, Alan Curry <pacman@kosh.dhis.org> wrote:
> In article <20120306215813.85@kylheku.com>,
> Kaz Kylheku  <kaz@kylheku.com> wrote:
>>On 2012-03-05, Alan Curry <pacman@kosh.dhis.org> wrote:
>>> Right, but the second parsing is done on the eval'ed command line later, and
>>> the parsing of the loop is not in progress at the same time, so it's weird to
>>> think of the loop as "lexically enclosing" the second parse. It's more
>>> natural (for me at least) that eval should create a new top-level lexical
>>> environment.
>>
> [...]
>>
>>Anyway, the standard shell language does not have lexical /anything/.
>>Just top level functions and variables.
>
> The shell itself contains a parser for the language. An obvious
> implementation would create a new one of those, with no way of knowing
> whether it's inside a loop in the outer execution environment, and thus no
> way to influence an outer loop's execution.
>
>>
>>For instance you can write a function which accepts a variable
>>name (for a variable in the calling environment) and can change its value.
>
> Having access to variables is nowhere near as deep a relationship as having
> access to the execution control stack that keeps track of loops currently in
> progress!

It is exactly the same thing. The control stack contains variables that
keep track of this.

>
>>In Bash, your eval'ed code can not only refer to the variable whose name
>>was passed in, but also to the local variables in the function.
>
> Global variables are accessible everywhere, including functions and evals.
> That doesn't seem like a big deal to me.
>
>>
>>You can write a function which controls the evaluation of syntax
>>that is passed in:
>
> This should be good...
>
>>
>>dotimes ()
>>{
>>  local __count_var=$1
>>  local __max_count=$2
>>  local __count=0
>>
>>  shift; shift;
>>
>>  while [ $__count -lt $__max_count ] ; do
>>     eval $__count_var=\$__count;
>
> I think that would be just as good without the backslash.

Yes, but only because we control __count and we know that __count contains a
number. A double evaluation of a number just yields that number.

In the general case, we want that dollar sign to survive into the eval, so the
eval will properly evaluate $__count, expanding just once.
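
A sketch of the general case, where the value is not a plain number. The function name demo_backslash and the sample string are invented for illustration.

```shell
demo_backslash() {
  # The value contains shell syntax on purpose.
  value='a; echo INJECTED'

  # Without the backslash, $value is expanded before eval parses the line,
  # so eval sees:  copy1=a; echo INJECTED
  eval copy1=$value

  # With the backslash, eval sees:  copy2=$value  and expands it exactly once.
  eval copy2=\$value

  echo "copy1=$copy1"
  echo "copy2=$copy2"
}

demo_backslash
```

With the double expansion, part of the value is executed as a command and only the first word is assigned.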

> And indirect
> assignments are surely one of the most common uses of eval, so not
> surprising.
>
>>     eval "$@"
>
> Whether this is a spooky eval or not depends on what the caller put in it...

Not really. Unless the caller knows about __count_var and __count, 
the eval will behave in an expected way.

>>     __count=$(( __count + 1 ))
>>  done
>>}
>>
>>dotimes x 10 'printf "x = %d\n" $x'
>
> And this caller doesn't put anything interesting there. This works perfectly
> in pdksh, which doesn't support "eval break". 

Thus, it is inconsistent.

> If you try
>
> dotimes x 10 'printf "x = %d\n" $x;break'
>
> then most shells just print "x = 0" and quit,

But that is a feature! break bails out of dotimes properly. dotimes is a wrapper
around a while loop.
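
For anyone who wants to try it, here is a self-contained sketch. It repeats the dotimes function from earlier in the thread with quoting added, and it assumes a shell that accepts a break run through eval (bash is assumed here; pdksh rejects it, as noted in this thread). The count of 3 and the break condition are arbitrary.

```shell
# Copy of the dotimes function from earlier in the thread, with quoting added.
dotimes() {
  local __count_var="$1"
  local __max_count="$2"
  local __count=0
  shift; shift

  while [ "$__count" -lt "$__max_count" ]; do
    eval "$__count_var=\$__count"
    eval "$@"      # a break in here runs inside this while loop
    __count=$(( __count + 1 ))
  done
}

# Runs all three iterations.
dotimes x 3 'printf "x = %d\n" $x'

# The caller's code ends the loop early, once x reaches 1.
dotimes x 3 'printf "x = %d\n" $x; if [ "$x" -ge 1 ]; then break; fi'
```

The break here never crosses a function boundary: the eval runs inside the while loop of dotimes itself.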

>>Bash does the "right thing" by conforming to the expectations of shell coders.
>>The alternative is to die with an error, which is nonproductive.
>
> pdksh does the "right thing" by telling the coder he's nuts.

Which is out of place in a language that allows nutty eval hacks.

Bash is consistent.

>>
>>That naive behavior is not so naive: it lets you do things that are not
>>possible otherwise.
>>
>>Now here is my point. Think about this.  Suppose the above "eval $br" were to
>>die with error like "bash: break: only meaningful in a `for', `while', or
>
> ...like pdksh's error message but more verbose....
>
>>`until' loop".  Given that we can write the dotimes function above, with
>>the crazy eval hacks that work, and that we have to hide our local variables
>>with underscores, etc, don't you think that this diagnostic would be
>>/laughably/ inconsistent?
>
> Me and pdksh don't think so, since we see looping control structures as
> super-low-level stuff that should resolve at parse time, and shell variables
> as a simple global mapping of strings to other strings.

That is laughably inconsistent. Part of the language has delusions of grandeur
about becoming a compiled language one day, and part of the language is
a dynamically scoped hack. 

Simple global mapping? Are there no locals then?


>
>>
>>> Some of us just think of "break" as being bound to the enclosing "while" in
>>> exactly the same syntactic manner as the "done" that ends it.
>>
>>done is just punctuation, like a closing parenthesis.  break has semantics of
>>its own.
>>
>>Moreover, you do not necessarily know, statically, where it goes. Remember, it
>>takes an argument: break $run_time_value .
>
> Oh hell, that seals it. All the previous examples are only slightly spooky.
> That's like a computed goto.

Precisely: with N nestings, there are N possible target points, selected by number.
Nothing "like" about it.

> No... it's more than that. Combined with the
> ability to break out of a function, you can't even know how many functions
> you'll be prematurely remotely terminating. It's pure evil... it's LONGJMP!

It would be even more like longjmp if it used named blocks instead.

 break main_event_loop_way_up_high

> The Bourne shell has been hiding a longjmp equivalent all this time. Must
> wash hands... can't get clean...

But bash has the internal unwind-protect to undo the local variables, whereas
longjmp doesn't clean up (unless you wrap some functionality around it and use
only the disciplined interface).

  # global x
  x=42

  func()
  {
     local x=0
     break;
  }

  while true; do
    func
  done

  echo $x

Output:

  42

longjmp, but with proper unwinding of the dynamic scope which restores the
global value of x.

Man, is this break hack ever well-supported, wouldn't you say? :)

I'd be curious what happens if you step through this in the bash debugger, bashdb.
Will it preserve the semantics? Ha.

