From: Steven Simpson <ss@domain.invalid>
Newsgroups: comp.lang.java.programmer
Subject: Re: Why is that in JDK8: value used in lambda expression should be effectively final?
Date: Sat, 05 Jan 2013 13:20:10 +0000
Organization: A noiseless patient Spider
Lines: 148
Message-ID: <atllr9-v74.ln1@s.simpson148.btinternet.com>

On 05/01/13 01:24, Peter Duniho wrote:
> On Fri, 04 Jan 2013 23:45:54 +0000, Steven Simpson wrote:
>> [...]
>> Do these examples fall into just a couple of categories, in terms of the
>> guarantees that the user of a function object makes to the provider
>> about how it will be invoked?:
>>
>>   1. The object will be invoked on the same thread that provided it (e.g.
>>      in an event-driven environment)
>>   2. (1) + the object will not be invoked once the call that provided it
>>      has completed (e.g. in control abstraction)
>>
>> ...or are there more, especially ones which make fewer or alternative
>> guarantees?
> Honestly, I'm not aware of a lambda/closure taxonomy that uses those
> categories at all.  If such exists, I am ignorant of it.

That's because I'm making them up!  :-D

The goal I'm trying to reason my way towards is this: since Java is
supposed to provide a degree of safety (which is why mutable local
capture (MLC) and other features are currently missing from its
'closures'), there ought to be a way to tell the compiler when it is
safe to enable those features.  Rather than the provider of the function
object choosing when it is safe (e.g. by annotating shared variables
with @Shared, or manually boxing them in 1-element arrays), the user of
the object should choose, by annotating the parameter that conveys the
object, because that user is in a position to make guarantees about its
use.
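
For reference, a minimal sketch of the "manual boxing in 1-element arrays" workaround mentioned above. It assumes the Iterable.forEach method of Java 8; the class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

class BoxedSum {
    // Sums a list by mutating a one-element array captured by the lambda.
    // The array reference is effectively final; only its element changes.
    static int sum(List<Integer> list) {
        int[] sum = { 0 };                    // manual "box" for the local
        list.forEach(v -> { sum[0] += v; });  // safe only if invoked serially
        return sum[0];
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1, 2, 3, 4)));  // prints 10
    }
}
```

Nothing in this code records the assumption that the lambda is invoked serially; the compiler accepts it either way.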


> First and foremost is that the "user of a function object" often has no
> knowledge that a closure is being used at all.  It knows it's received a
> function object of some sort (e.g. a delegate instance in C#/.NET), but the
> origin of that object could be varied.

That's why I used the term 'function object'.


> A given API may or may not make promises regarding usage of the function
> object, but these promises likely have little to do with the declaration
> and implementation of the function object.

Whatever form the function object was originally expressed as, the one 
who invokes it is in a position to make guarantees about how it is 
invoked, and I'm suggesting that these promises could influence the 
options available for expressing the function object.


> Beyond that, wrt point #1: function objects, even those made from closures
> with captured variables, may be safely invoked even in a multi-threaded
> environment, so long as the code is written correctly.  For example,
> enforcing volatile or fully-synchronized access to shared variables
> (whether due to capturing or otherwise).

I would expect a good proportion of these cases to end up requiring
certain objects to be created explicitly anyway, simply to have
something to synchronize on.  In those cases Java's lambdas would be
adequate, because the variables would explicitly not be local.

But I don't know, hence I'm asking for examples.
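
One case of the kind described above can be sketched as follows. This is only an illustration under assumptions: the Total holder class is hypothetical, and parallelStream() from Java 8 stands in for any multi-threaded invoker.

```java
import java.util.Arrays;
import java.util.List;

class SynchronizedSum {
    // Hypothetical holder: the shared state lives in a heap object,
    // which also serves as the monitor to synchronize on.
    private static final class Total {
        private int value;
        synchronized void add(int v) { value += v; }
        synchronized int get() { return value; }
    }

    static int sum(List<Integer> list) {
        Total total = new Total();                  // effectively final reference
        list.parallelStream().forEach(total::add);  // may run on several threads
        return total.get();
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1, 2, 3, 4)));  // prints 10
    }
}
```

No mutable local is captured here, so the effectively-final rule does not get in the way.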


> Wrt point #2: because variable capturing involves effectively "lifting" the
> variable out of the declaring method's local context, invoking the function
> object after the declaring method has completed is not a concern at all.

(That wasn't particularly the concern, though I think there have been
suggestions that MLC could be achieved by means other than boxing, and
such means would depend on the local's lifetime.  I don't know enough
about this to elaborate.)

The idea behind #2 is that there is a class of closure use cases which
really demand control abstraction.  If the function-object user makes
sufficient promises about how the object is invoked, the compiler could
permit a control-abstraction syntax as an alternative to a lambda, in
which the code can do anything that a local block could do, including
throwing, returning, breaking, continuing.
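
To make the gap concrete, here is a rough sketch (names invented for illustration, Java 8 Iterable.forEach assumed) contrasting a local block with a lambda. It shows only what lambdas do today, not any proposed control-abstraction syntax.

```java
import java.util.Arrays;
import java.util.List;

class FirstNegative {
    // In a local block, 'return' leaves the enclosing method at once.
    static Integer withLoop(List<Integer> list) {
        for (Integer v : list) {
            if (v < 0) return v;
        }
        return null;
    }

    // In a lambda, 'return' only ends the current invocation of the lambda,
    // so the result is carried out through a box and iteration runs to the end.
    static Integer withLambda(List<Integer> list) {
        Integer[] found = { null };
        list.forEach(v -> {
            if (found[0] != null) return;  // behaves like 'continue', not 'break'
            if (v < 0) found[0] = v;
        });
        return found[0];
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(1, -3, -5);
        System.out.println(withLoop(list));    // prints -3
        System.out.println(withLambda(list));  // prints -3
    }
}
```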

If one can show that practically all cases with a /prima facie/ demand 
for additional features from Java lambdas actually fall into categories 
#1 or #2, it guides us in developing additional closure features for 
Java that satisfy the conflicting demands of safety versus expressiveness.

An argument about category #1 could run something like this...  One camp 
demands to be able to write:

   int sum = 0;
   list.forEach((v) -> { sum += v; });

Another camp says, "ouch, don't like that!" because forEach might 
execute the lambda in parallel.  You should therefore get an error or 
warning about shared access to 'sum'.

The first camp says, "okay, we'll take responsibility for 'sum' by 
declaring that we know it's shared," and either boxes the variable 
manually, or proposes an annotation to silence the warning or permit the 
code:

   @Shared
   int sum = 0;
   list.forEach((v) -> { sum += v; });

The problem is that this depends on whether the author has understood 
the (informal) guarantees that might be provided by forEach: "This 
method invokes its argument multiple times in serial."

Instead of the author of the code above deciding when it's safe to use
MLC, a serial contract for forEach should formally express the
guarantee, say by annotating its parameter:

   void forEach(@Serial Block action);

The compiler then turns on MLC for any lambda assigned to 'action', 
without having to use @Shared or explicit boxing.  Of course, the 
implementation has to make the corresponding guarantee to meet the 
method's contract.
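
As a sketch of what the declaration side might look like in today's Java: the @Serial annotation below is hypothetical and purely documentary, since no compiler gives it any meaning, and java.util.function.Consumer stands in for the Block type used above.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Arrays;
import java.util.function.Consumer;

class SerialContract {
    // Hypothetical marker: "the annotated function object is only ever
    // invoked serially, on the calling thread".  Not understood by javac.
    @Documented
    @Retention(RetentionPolicy.CLASS)
    @Target(ElementType.PARAMETER)
    @interface Serial {}

    // The implementation must honour the promise itself; nothing checks it.
    static <T> void forEach(Iterable<T> items, @Serial Consumer<? super T> action) {
        for (T item : items) {
            action.accept(item);
        }
    }

    public static void main(String[] args) {
        int[] sum = { 0 };  // still boxed by hand: the annotation enables nothing
        forEach(Arrays.asList(1, 2, 3), v -> sum[0] += v);
        System.out.println(sum[0]);  // prints 6
    }
}
```

The caller still has to box 'sum' manually; turning on MLC for such parameters is the part that would need compiler support.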

Alongside forEach could be parallelForEach:

   void parallelForEach(Block action);

The lack of annotation means that its implementation does not have to 
make the guarantee.

The provider of a lambda now has a choice between a potentially parallel 
implementation, which won't allow MLC, and a guaranteed serial one, 
which will.

The no-MLC camp is happy, because they can provide parallelForEach 
exploiting parallelism, and know that the user won't inadvertently mess 
about with locals unsafely.  The MLC camp is happy, because they can use 
forEach to mutate their locals safely.
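
For the potentially parallel side, a minimal sketch of how the sum can be written with no captured mutable state at all. It assumes the Java 8 stream API (parallelStream and reduce); the class name is made up.

```java
import java.util.Arrays;
import java.util.List;

class ParallelSum {
    // No local is mutated: partial results are combined by the reduction,
    // so the effectively-final restriction never comes into play.
    static int sum(List<Integer> list) {
        return list.parallelStream().reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1, 2, 3, 4)));  // prints 10
    }
}
```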

Some obvious holes include:

  * A method accepting a @Serial Block could fail to make the serial
    guarantee, and the compiler would not be able to check it.  However,
    I don't think that's much worse than the compiler not being able to
    check that (say) a Comparator implementation isn't returning random
    values.
  * I'm using annotations as if they are some sort of type qualifier,
    which I doubt is a valid way to regard them. Not sure.
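
The Comparator comparison in the first point can be illustrated with a small sketch (class and method names invented): the code below compiles cleanly even though it ignores Comparator's documented contract.

```java
import java.util.Comparator;
import java.util.Random;

class UncheckedContract {
    // Returns -1, 0 or 1 regardless of the arguments.  The compiler cannot
    // tell that this breaks the ordering contract of Comparator.
    static Comparator<Integer> arbitrary(Random rnd) {
        return (a, b) -> rnd.nextInt(3) - 1;
    }

    public static void main(String[] args) {
        Comparator<Integer> c = arbitrary(new Random());
        // The same pair may compare differently on each call.
        System.out.println(c.compare(1, 2));
        System.out.println(c.compare(1, 2));
    }
}
```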



-- 
ss at comp dot lancs dot ac dot uk