Reaching Scope of Lambda Expressions

ISO/IEC JTC1 SC22 WG21 N2998 = 09-0188 - 2009-10-22

Lawrence Crowl, crowl@google.com, Lawrence@Crowl.org

This paper revises N2957 = 09-0147 - 2009-09-25.

Problem
Importance
Solution
Wording
Acknowledgments
References

Problem

The adoption of [N2927] restricts the referents of a lambda expression to the immediately-enclosing function or lambda expression. This restriction is severe and does damage to the utility of the lambda feature.

One principle of lambda design is that programmers should be able to replace all well-structured control constructs with function calls taking lambda expressions as arguments. [N2927] prevents that replacement, and is essentially equivalent to saying that the body of a nested loop cannot access function parameters.

Consider the following function to multiply square matricies.

matrix operator*( const matrix& a, const matrix& b )
{
    int n = a.numrows();
    matrix c( n, n );
    for ( int i = 0; i < n; i++ ) {
        for ( int j = 0; j < n; j++ ) {
            double t = 0.0;
            for ( int k = 0; k < n; k++ ) {
                t += a[i][k] * b[k][j];
            }
            c[i][j] = t;
        }
    }
    return c;
}

Its rewrite into functions and lambdas, as intended by the adopted [N2550], is as follows. Note that the rewrite is approximately the same size.

matrix operator*( const matrix& a, const matrix& b )
{
    int n = a.numrows();
    matrix c( n, n );
    for_range( 0, n-1, [&]( int i ){
        for_range( 0, n-1, [&]( int j ){
            double t = 0.0;
            for_range( 0, n-1, [&]( int k ){
                t += a[i][k] * b[k][j];
            } );
            c[i][j] = t;
        } );
    } );
    return c;
}

However this straightforward reformulation is ill-formed as per [N2927] wording for 5.1.1 [expr.prim.lambda] paragraphs 9–12. To make the code well-formed, the function must be rewritten as follows, where inserted text shows new code, which sometimes alter variable references.

matrix operator*( const matrix& a, const matrix& b )
{
    int n = a.numrows( );
    matrix c( n, n );
    for_range( 0, n-1, [&]( int i ){
        const matrix& a1 = a;
        const matrix& b1 = b;
	int& n1 = n;
	matrix& c1 = c;
        for_range( 0, n1-1, [&]( int j ){
            const matrix& a2 = a1;
            const matrix& b2 = b1;
	    int& n2 = n1;
	    int& i2 = i;
            double t = 0.0;
            for_range( 0, n1-1, [&]( int k ){
                t += a2[i2][k] * b2[k][j];
            } );
            c1[i][j] = t;
        } );
    } );
    return c;
}

Such extensive mechanical editing will inhibit use, make errors more likely, and complicate code maintenance. Such editing is properly the domain of the compiler.

Importance

One view of lambdas is that they are primarily useful for very short inline function arguments to standard algorithms, and hence do not need general scoping support. This view considerably underestimates the power of control abstraction. Indeed, the algorithms in the standard library are the beginning of control abstraction, not the end. In a highly concurrent world, programmers need more control constructs than can be reasonably provided by the language. Programmers will need to define and use their own constructs, and they will be using those constructs with the same frequency that current programmers use for and if.

Consider this following example, adapted from [CrowlLeBlanc94], for Gaussian elimination. It contains two constrol structures, one for the triangulation and one for the row reduction. Only the latter is a common control construct.

int size = system.size();
triangulate( size, [&]( int pivot, int reduce ) {
    auto fraction = system[reduce][pivot] / system[pivot][pivot];
    forall( pivot, size-1, [&]( int variable ) {
        system[reduce][variable] -= system[pivot][variable] * fraction;
    } );
} );

In particular, the triangulate control construct encapsulates all the available parallelism in the triangulation. Just as importantly, it also captures all the necessary synchronization. So, this highly parallel code contains no threads and no synchronization, which are the source of most effort and bugs in parallel programming. More imporantly, there is no particular commitment to any sequential or parallel execution of its control constructs. A vector machine might choose to execute triangulate serially and forall concurrently. On the other hand, a NUMA machine would likely execute triangulate with coarse parallelism and forall sequentially.

Good control abstraction enables more effective use of machine resources. But to be effectively programmed, lambda expressions must be able to easily substitute for ordinary well-structured control statements. For that easy substitution, we must restore the intent of [N2550].

Solution

In outline, the solution is as follows.

Wording

All edits fall in section 5.1.2 [expr.prim.lambda] and are based on N2960 (which includes the changes of N2927).

Replace paragraph 9, including the example, by (a counterpart of the original paragraph 9 will be inserted in paragraph 12 below):

A lambda-expression whose smallest enclosing scope is a block scope ([basic.scope.local] 3.3.3) is a local lambda expression; any other lambda-expression shall not have a capture-list in its lambda-introducer. The reaching scope of a local lambda expression is the set of enclosing scopes up to and including the innermost enclosing function and its parameters. [Note: This reaching scope includes any intervening lambda-expressions. —end note]

Edit paragraph 10 as follows.

The identifiers in a capture-list are looked up using the usual rules for unqualified name lookup ([basic.lookup.unqual] 3.4.1); each such lookup shall find a variable or reference with automatic storage duration declared in the reaching scope of the local lambda expression. An entity (i.e., a variable, a reference, or this) is said to be explicitly captured if it appears in the lambda-expression's capture-list. An explicitly captured entity is used ([basic.def.odr] 3.2).

Edit paragraph 11 as follows.

If a lambda-expression has an associated capture-default and its compound-statement uses ([basic.def.odr] 3.2) this or a variable or reference with automatic storage duration declared in an enclosing function or lambda-expression and the used entity is not explicitly captured, then the used entity is said to be implicitly captured; such entities shall be declared within the reaching scope of the lambda expression. [Note: The implicit capture of an entity by a nested lambda-expression can cause its implicit capture by the containing lambda-expression (see below). Implicit uses of this can result in implicit capture. —end note]

Edit paragraph 12 as follows.

An entity is captured if it is captured explicitly or implicitly. An entity captured by a lambda-expression is used ([basic.def.odr] 3.2) in the scope containing the lambda-expression. If this is captured , either explicitly or implicitly, the lambda-expression shall appear directly in within the definition of a non-static member function, i.e., not in another lambda-expression by a local lambda expression, its nearest enclosing function shall be a non-static member function . [Note: This rule prevents access from a nested lambda-expression to the members of the enclosing lambda-expression's closure object. —end note] If a lambda-expression uses ([basic.def.odr] 3.2) this or a variable or reference with automatic storage duration from its reaching scope, that entity shall be captured by the lambda-expression. If a lambda-expression captures an entity, and that entity is not defined or captured in the immediately enclosing lamba expression or function, the program is ill-formed. [Example:


void f1(int i) {
  int const N = 20;
  auto m1 = [=]{
     int const M = 30;
     auto m2 = [i]{
        int x[N][M]; // OK: N and M are not "used"
        x[0][0] = i; // OK: i is explicitly captured by m2
                     // and implicitly captured by m1
     };
  };
  struct s1 {
    int f;
    int work(int n) {
      int m = n*n;
      int j = 40;
      auto m3 = [this,m]{
        auto m4 = [&,j]{ // error: j not captured by m3
          int x = n; // error: n implicitly captured by m4
                     // but not captured by m3
          x += m; // OK: m implicitly captured by m4
                  // and explicitly captured by m3
          x += i; // error: i is outside of the reaching scope
          x += f; // OK: this captured implicitly by m4
                  // and explicitly by m3
        };
      };
    }
  };
}

end example]

After paragraph 15, add a new paragraph as follows.

If a lambda-expression m1 captures an entity, and that entity is captured by an immediately enclosing lambda-expression m2, then m1's capture is transformed as follows:

[Example: The nested lambda expressions and invocations below will output 123234.


int a = 1, b = 1, c = 1;
auto m1 = [a, &b, &c]() mutable {
    auto m2 = [a, b, &c]() mutable {
        std::cout << a << b << c;
        a = 4; b = 4; c = 4;
    };
    a = 3; b = 3; c = 3;
    m2();
};
a = 2; b = 2; c = 2;
m1();
std::cout << a << b << c;

end example]

Acknowledgments

William M. Miller provided many helpful comments and alternate wording suggestions.

References

[N2927]
New wording for C++0x Lambdas (rev. 2), Daveed Vandevoorde, ISO/IEC JTC1 SC22 WG21 N2927 = 09-0117 - 2009-07-15, http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2009/n2927.pdf
[N2550]
Lambda Expressions and Closures: Wording for Monomorphic Lambdas (Revision 4), Jaakko Järvi, John Freeman, Lawrence Crowl, ISO/IEC JTC1 SC22 WG21 N2550 = 08-0060 - 2008-02-29, http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2550.pdf
[CrowlLeBlanc94]
Parallel Programming with Control Abstraction, Lawrence A. Crowl, Thomas J. LeBlanc, ACM Transactions on Programming Languages and Systems 16(3): 524-576, May 1994, http://www.Crowl.org/Lawrence/paper/journals/1994J-TOPLAS-03-524/