Doc Number:	X3J16/93-0158
					WG21/N0365
			    Date:	September 28, 1993
			    Project:	Programming Language C++
			    Ref Doc:	93-0062/N0269
			    Reply to:	Sam Kendall
					Sun Microsystems Laboratories, Inc.
					Sam.Kendall@East.Sun.COM


				 EXPRESSION ATTRIBUTES

				      Sam Kendall
			  Sun Microsystems Laboratories, Inc.


	1.  INTRODUCTION

	   We all know that every expression has two attributes: a type, and a
	boolean "lvalue-ness".  This is true in ISO C as well as in C++.  But
	C++ expressions have several attributes besides those two.  For example,
	some functions have default arguments; the list of default arguments is
	another attribute.  I have found that trying to analyze expressions
	solely in terms of attributes yields a great deal of clarity.

	   I am dealing here with type analysis only.  In other words, I'm not
	trying to define the full meaning of each expression; I am just trying
	to provide a foundation to decide very clearly, for each expression,
	whether it is well-formed and what its type is.  If we can get that
	far in understanding expressions we will have clarified a great deal.

	   The basic setup should be familiar to any student of programming
	languages and compilers.  (See Aho, Sethi and Ullman chapter 2 for more
	information.)  We consider each expression as a syntax tree.  Our aim is
	to assign _attributes_ to each node in a bottom-up fashion.

	   The attributes of leaf expressions are determined by name lookup and
	other rules; for example, when I use the identifier `x', name lookup
	determines which declared function(s) or variable `x' I am naming, and
	from the associated declaration(s) we get the attributes.  The rules for
	assigning attributes to leaf expressions are not the focus of this
	paper, which is why I am hand-waving.

	   The attributes of interior nodes are determined solely from the
	attributes of IMMEDIATE child nodes.  This is a key point; it is our
	source of clarity.

	   I have not worked this out, but I think a type conversion will be
	considered to ADD a node between some node (the expression being
	converted) and its parent.  This "conversion node" has attributes which
	are typically different from those of its immediate child.  Overload
	selection may also be handled this way.

	   One point of notation: I use "f" (as in "f3.7") to indicate sections
	of the WP (working paper).

	   Thanks to Tom Plum and Scott Turner for excellent feedback; I have
	tried to incorporate it with varying success.


	1.1  Relationship to the WP

	   These attributes are all implied by the WP.  For example, the
	sentence "The address-of operator & may not be applied to a bit-field"
	in f9.6 implies the existence of the boolean attribute I call
	"Bit_field".  Even for this simple attribute there are cases the WP
	leaves ambiguous; see the example `&(i ? i : b)' below.

	   I'd like my terminology to be as close to the WP as possible.  If you
	catch me departing from the WP unnecessarily please let me know.

	   I'm not sure what changes should be made to the WP as a result of the
	clarifications that arise from thinking about expression attributes.  It
	is easy to say "we should rewrite the WP in terms of attributes", and
	this would surely make the WP less ambiguous, but it would also be a ton
	of work, and the increased formalism would probably leave the WP less
	accessible to novices.

	   The ultimate point of all this is to be able to write a more-or-less
	formal definition of expression type analysis.  With luck this
	definition will encompass the built-in operators as well as user-defined
	operators and functions.  I think we need to write this before we can
	tell whether the relevant parts of the WP need to be rewritten, or
	whether they can be edited into clarity.


	1.2  What about Merging Lvalue and Type?

	   Alternate schemes are possible.  For example, in 92-0053/N0130 I
	proposed (in this paper's terms) to fold the Lvalue and Bit_field
	attributes into Type.  I still think folding Lvalue (though not
	Bit_field) into Type is a good idea.

	   But I'm not following that line of thought here.  In this paper I
	stay as close to the WP as I can.


	2.  THE ATTRIBUTES

	   Each expression (and subexpression) has the following attributes.


	Attribute	Value				Section	In ISO C also?
	---------	-----				-------	--------------

	Overload_set	a set of 4-tuples as follows		no
	 [
	  Type    	a C++ type as in 3.6.2 except	3.6.2	yes
			no "constant" type modifier
			(see note below); never a
			reference type

	  Lvalue	boolean: is the expression an	3.7	yes
			lvalue?

	  Default_args	nil				8.2.6	no
			OR
			a sequence of boolean:
			does each argument have a
			default?

	  NS_member	nil				9.2	no
			OR
			a class type: the expression is
			a nonstatic member of that class
	 ]

	Constant	nil				3.6.2,	yes
			OR				5.19
			a value of type Type: the
			value of the constant expression

	Bit_field	boolean: is the expression	9.6	yes
			a bit-field?

	Class_qualifier	nil				5.3/2	no
			OR
			a class type: the expression
			is class-qualified

	Was_reference	boolean: was a reference type	-	no
			modifier stripped?


	2.1  Rationale for Each Attribute

	   For most expressions the Overload_set is a singleton set, so as a
	shorthand we often talk about, for example, the type of an expression.
	But an overloaded function name or operator may have many types; some
	may be lvalues while others are not; it may have more than one sequence
	of default arguments, and some of the overloadings may be nonstatic
	members while others are not.  For this reason Type, Lvalue,
	Default_args, and NS_member are not independent attributes.  The other
	attributes are not in Overload_set because they have only one value even
	for an overloaded function name.

	   The reasons for Type and Lvalue are obvious.

	   Default_args allows default arguments to be omitted in calls.  Notice
	that Default_args is a sequence of booleans, not of "expressions".  This
	is because for type analysis we don't care what the default argument is,
	only whether it is present.

	   NS_member allows nonstatic member names to appear as the right-hand
	operand of `.'  and `->', while disallowing other names from appearing
	there.  NS_member also governs the appearance of nonstatic member names
	in the bodies of member functions (and elsewhere).

	   Bit_field allows us to enforce the prohibition on reference to
	bit-field and pointer to bit-field.

	   Constant allows the integer constant expression 0 to be implicitly
	converted to pointer type, while disallowing other constant expressions,
	and nonconstant integers, from being so converted.  Constant also allows
	error-checking of contexts requiring a constant expression, such as case
	labels.

	   Class_qualifier allows the expressions `&m' and `&C::m' to have different
	types (pointer and pointer to member, respectively, for `m' a nonstatic
	member of `C').

	   Was_reference is necessary in analyzing some expressions involving
	members.  Given this class:

		struct A {
		    A();
		    ~A();
		    int i;
		    int& ir;
		};

	Note that the Types of `i' and `ir', are the same.  We use Was_reference
	to tell them apart: `A().ir' is an lvalue while `A().i' is not; and
	`&A::i' is an `int A::*', while it's not clear what `&A::ir' is.  (See
	93-0135/N0342 section 4 for a little bit more information.)


	2.2  Why Constant is not part of Type

	   f3.6.2 essentially states that Constant is part of Type, not an
	independent attribute.  (We are talking about constant expressions, not
	about the `const' type modifier.)  However, constant-ness is essentially
	orthogonal to type.  Constant-ness is not like a type modifier; if it
	were, "constant int" would be a type distinct from "int", and for a
	"constant int" to be used as an "int" there would have to be a type
	conversion.  But there are no such conversions mentioned in the WP.

	   For this reason, and because there is no syntax to talk about a
	"constant" type, I have made Constant a separate attribute.  The same
	needs to happen in the WP.


	2.3  The List is Not Enough for ID-Expressions

	   My list of attributes isn't enough to define the right-hand side of
	the `.' and `->' operators.  For that more attributes come into play,
	including whether `m' is a static member.  In the grammar, the
	right-hand side of `.'  and `->' is an id-expression.  We can view these
	additional attributes (not mentioned again) as attributes of
	id-expressions, but not of other kinds of expressions.


	3.  CONTEXT

	   Each expression occurs in a context.  For purposes of type analysis
	the context of an expression consists of the type system (all visible
	type declarations) and a _class context_.

	   A class context is either nil or a possibly-cv-qualified class type.
	For example:

		void f();
		struct A {
			void g() {
				...; // class context is `A'
			}
			void h() const {
				...; // class context is `const A'
			}
		};
		main() {
			...;	     // class context is nil
		}


	4.  EXAMPLES

	   This section is driven entirely by examples.  The first, "simple"
	examples are just to illustrate the concepts.  The second, "tough"
	examples are an attempt to show ambiguities that attributes can help us
	with.

	   I haven't succeeded in understanding all of the examples, but they
	can serve as a basis for discussion.


	4.1  Notation and Defaults

	   Our notation is to show the attributes as a list:

		Name1: value1; Name2: value2

	   A boolean attribute is false unless its name appears, in which case
	it is true.  So for example "Lvalue" is short for "Lvalue: true", and if
	no "Lvalue" appears in the list then "Lvalue: false" is implied.

	   Other attributes are nil unless otherwise indicated.

	   The Overload_set is delimited using { } and [ ] if necessary, but
	when it is the singleton set the delimiters are omitted.

	   Note: I am NOT proposing to introduce this notation into the WP.


	4.2  Simple Examples

	   Examples are based on this program fragment:

		int i;
		struct A {
		    int m;
		    int b :2;
		    void g();
		    void cg() const;
		    ~A();
		};
		A a;
		void f();
		void f(int);
		int& r = i;
		void A::g() {
		    // All expressions below are in this context.
		}


	Expression	Attributes
	----------	----------
	i		Type: int; Lvalue
	m		Type: int; Lvalue; NS_member: A
	&m		Type: int*
	A::m		Type: int; Lvalue; NS_member: A; Class_qualifier: A
	&A::m		Type: int A::*
	this->m		Type: int; Lvalue
	r		Type: int; Lvalue; Was_reference

	Expression	Attributes
	----------	----------
	f		{ [ Type: void ()    ],
			  [ Type: void (int) ] }; Lvalue
	&f		{ [ Type: void (*)()    ],
			  [ Type: void (*)(int) ] }
	g		Type: void ()      ; NS_member: A; Lvalue
	cg		Type: void () const; NS_member: A; Lvalue

	Expression	Attributes
	----------	----------
	b		Type: int; Lvalue; Bit_field
	b = 0		Type: int; Lvalue; Bit_field
	i ? i : b	Type: int; Lvalue; Bit_field
	&(b = 0)	not well-formed since unary & does not allow bit-fields
	&(i ? i : b)	not well-formed since unary & does not allow bit-fields

	   Note: how Bit_field propagates through the assignment and ?:
	operators is not clearly described in the WP, but these are reasonable
	interpretations.

	Expression	Attribute
	----------	---------
	A::~A		Type: void () const volatile; NS_member: A;
			Class_qualifier: A

	   Note: Thinking of a destructor as having const volatile type (when it
	appears in an expression) is a possible way to show that it is
	well-formed to invoke it explicitly on const or volatile objects, eg:

		volatile A va;
		va.A::~A();

	Note also that a destructor is not an lvalue, since you cannot take its
	address.


	4.3  Tough Examples

	>>> Example 1

		struct A {
		  void f();
		};
		void (A::*fp1)() = &A::f;	// ok
		void (A::*fp2)() = &(A::f);	// well-formed or not?

	>>> Example 1 Attributes

	Expression	Attributes
	----------	----------
	A::f		Type: void (); NS_member: A; Class_qualifier: A
	(A::f)		EITHER
				the same
			OR
				the same except that Class_qualifier is nil
	&A::f		Type: void (A::*)()
	&(A::f)		EITHER
				Type: void (A::*)()
			OR
				not well-formed since if a member function's
				address is taken Class_qualifier must be non-nil

	>>> Example 1 Discussion

	   f5.3/2 favors the second outcome (not well-formed), since it claims
	that the syntax `& qualified-id' is necessary to produce a
	pointer-to-member.

	   On the other hand, one expects `( E )' to mean the same as `E' for
	any expression `E'.  Cfront 3.0 implements the first outcome
	(well-formed).  Since compilers are typically implemented using
	attributes much like what this letter discusses, we can expect most of
	them to follow cfront in this matter.

	   Rewording f5.3/2 in terms of the Class_qualifier attribute would
	clarify this area.  Then f5.1/5, which defines the meaning of `(
	expression )', can either state that Class_qualifier is preserved by the
	parens (giving us the first outcome), or that it is made nil (giving us
	the second outcome).


	>>> Example 2

		struct A {
		  void f();
		  static void f(int);
		};
		void (A::*fp1)() = &A::f;	// gets void f()
		void (*fp2)(int) = &A::f;	// gets static void f(int)
		void A::f() {
		  void (*fp3)(int) = &f;	// not well-formed
		}

	>>> Example 2 Attributes

	Expression	Attributes
	----------	----------
	A::f		{ [ Type: void ();            NS_member: A ],
			  [ Type: void (int); Lvalue               ] };
			Class_qualifier: A
	&A::f		{ [ Type: void (A::*)() ],
			  [ Type: void (*)(int)	] }
	&f		{ [ Type: void ();            NS_member: A ],
			  [ Type: void (int); Lvalue               ] }


	>>> Example 2 Discussion

	   It is surprising enough to find that we can have a static and a
	nonstatic member function with the same name.  But `&A::f' is an
	expression with two types, one of which is a pointer type and the other
	of which is a pointer-to-member type.  This may seem bizarre.

	   `&f' is not well-formed because one of the two `f's is not an lvalue.

	   This is an example of why Lvalue and NS_member need to be part of the
	Overload_set and not independent attributes.

	   In f5.3/2 and f13.3 the WP gives the rules for this example, but it
	is never said that the rule for address-of (both address-of qualified-id
	and normal address-of) can apply to overloaded names, in particular to
	overloaded names some of which are nonstatic member and some of which
	are not.  Rewording in terms of attributes would clarify this example.


	>>> Example 3

		struct A {
		  void f(A&);
		  void f(const A&);
		  void g(A&) const;
		  void g(const A&);
		};

		void h(A a, const A ca) {
		  a.f(a);       // calls f(A&) by 13.2/12
		  a.g(a);       // ambiguous
		  ca.g(a);	// calls g(A&) const
		}

	>>> Example 3 Attributes

	Expression	Attributes
	----------	----------
	f		{ [ Type: void (A&);       NS_member: A ],
			  [ Type: void (const A&); NS_member: A ] }
	g		{ [ Type: void (A&) const; NS_member: A ],
			  [ Type: void (const A&); NS_member: A ] }
	a.f		{ [ Type: void (A&)       ],
			  [ Type: void (const A&) ] }
	a.g		{ [ Type: void (A&) const ],
			  [ Type: void (const A&) ] }
	ca.g		Type: void (A&) const
			OR
			Type: void (A&)

	>>> Example 3 Discussion

	   This is a straightforward application of the rules of f13.2.
	Thinking about it in terms of attributes is interesting, though.

	   Whether the function type of `ca.g' still has its const qualifier is
	an interesting question.  It doesn't matter in this example, but
	intuitively the const qualifier has "served its purpose" after being
	used to select one of the two `g's.  It probably matters in Example 4.


	>>> Example 4

		struct A {
		  void f() const;
		  static void f();
		};

		void A::f() const {
		  f();       // ?
		}

	>>> Example 4 Attributes

	Expression	Attributes
	----------	----------
	f		{ [ Type: void () const;         NS_member: A ],
			  [ Type: void ()	 Lvalue	              ] };

	>>> Example 4 Discussion

	   This is a difficult one.  It is not specified by f13.2 how to resolve
	overloaded static and nonstatic member functions.


	>>> Example 5

		struct B;
		struct A {
		    void operator+(B&);
		};
		struct B : A {};

		void operator+(B&, A&);

		void f(B b1, B b2) {
		    b1 + b2;		// ambiguous
		}

	>>> Example 5 Attributes

	Expression	Attributes
	----------	----------
	+		{ [ Type: void (B&);             NS_member: A ],
			  [ Type: void (B&, A&); Lvalue	              ] };
	b1		Type: B; Lvalue
	b2		Type: B; Lvalue

	>>> Example 5 Discussion

	   Obviously overloaded operators are very special.  They are the only
	case where a (nonstatic) member function and a non-member function
	"compete" for the same arguments.  "+" is not an expression, of course,
	but we need some way to talk about the set of overloadings introduced by
	that operator.


	5.  WORDING IN THE WP

	5.1  Introducing the Attributes

	   I don't suggest we introduce this notation into the WP.  Instead, I
	suggest we introduce the attributes in a less formal way, as is done
	with lvalues and non-lvalues.

	   We may want to use the word "property" instead; it has a friendlier
	sound but means the same thing.

	   The section introducing the attributes might be something like this:

		3.8  Other Expression Properties

		An expression's type, and whether it is an lvalue, are two
		_properties_ of the expression.  Expressions have a number of
		other properties.  Some expressions have a list of default
		arguments (f8.2.6).  Some expressions are nonstatic members of
		some class (f9.2).  Some expressions are constant expressions
		with some value (f5.19).  Some expressions are bit-field
		expressions (f9.6).  Some expressions are class-qualified by a
		given class (f5.3).  Some expressions are formerly-reference
		expressions, meaning that they are derived from an identifier of
		reference type.

	I am unsure how to introduce the overload set.


	5.2  Adding Bit-Field into the Rest of the WP

	   Here is an example of how we would add one attribute -- Bit_field --
	to other areas of the WP.  Doing so would clarify some of the bit-field
	examples in section 4.2 above.  Unfortunately some other attributes are
	quite a bit harder to add.

	   In f5.1/5 (Primary Expressions), add to the last sentence

		or a bit-field.

	   To f5.16/3 (Conditional Operator), append this sentence:

		The result is a bit-field if it is an lvalue and the second and
		third operands are both bit-fields.

	   To f5.17/1 (Assignment Operators), append this sentence:

		The result is a bit-field if the left operand is a bit-field.

	   In f5.18/1 (Comma Operator), add to the last sentence:

		; the result is a bit-field if its right operand is.

	   The part bit-fields play in argument matching must also be clarified.
	Just as a non-lvalue cannot be bound to a nonconst reference, so a
	bit-field cannot be bound to *any* reference.  But is this checked
	during, or after, overloading resolution?  It makes a difference, as in
	this example:

		void f(long);
		void f(int&);
		struct A {
			int i;
			int b :2;
		};
		A a;
		f(a.i);		// calls f(int&)
		f(a.b);		// either not well-formed or calls f(long)


	5.3  The Overload Set

	   As Scott Turner has pointed out, the Overload_set is not really
	accounted for in the WP.  We don't want to say

		The overload set must be a singleton set.

	in countless locations of the WP.  Instead, we can say something like

		Except in the function call position of a function call operator
		or as an initializer, an expression's overload set must have a
		cardinality of 1.  Because of this restriction, some sections of
		this WP can refer to "the type" of an expression.

	This is not enough, though.  This issue, and many others, awaits a
	comprehensive revision of argument matching, initialization, and
	built-in operators.  If built-in operators can be expressed as
	overloadings, then the number of expression contexts will be greatly
	reduced and this issue will become more tractable.