Ticket #342 (closed defect: fixed)
Templates/memory views: Resolve parser ambiguities at a later stage
Reported by: dagss | Owned by: robertwb
Description (last modified by dagss)
Especially if we want to introduce templates, the scheme below should be used to resolve a syntax ambiguity. This holds whether [] or () is selected as the template syntax:
- A[B] can (in type context) mean either a C array of size B, or a template with B as argument if [] is chosen.
- A(B) can (in type context) mean either an unnamed C function returning type A and taking an argument of type B (yes, really!), or a template with B as argument if () is chosen.
Both of these are a problem only where the declarator name can be dropped, i.e. inside sizeof or in cdef extern function arguments.
Extract from conversation from Dag to Kurt:
SomeName[OtherName] is actually *not* ambiguous, it's just that it is ambiguous in the parser! Later on, SomeName can be resolved, and it will be known whether SomeName is a Cython type (=> buffer) or a struct/typedef/C type (=> C array without name).
a) Forget about deciding this at parse time. Instead parse to a much rawer "BracketTypeNode" (containing base_type and axes), and leave the decision until Cython's declaration analysis phase (where the base_type can be analysed before axes, so base_type will tell what needs to be done with axes).
b) However, this requires that the axes are also parsed without making too many assumptions, which is potentially hard. It calls for a method, in addition to p_expr and p_c_declarator, that parses something which can be "either an expression or declarator", i.e. p_expr_or_c_declarator (covering only the empty=True case of p_c_declarator).
- Some things must be type declarations -- like "a*", "(a*)()", "unsigned int".
- Some things must be expressions -- like "a+b", "a::b" etc.
- Some things are ambiguous:
  - "somename" can of course be either
  - "a(b)" can either be a function call, or a declaration like this:
      # takes a function returning a and taking b as argument:
      cdef extern foo(a(b))
      # If giving the argument a name, it is written like this:
      cdef extern foo(a(argname)(b))
      # weird stuff...
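Point a) above could be sketched roughly as follows. This is only an illustration in plain Python, not Cython's actual node classes: the fields base_type and axes come from the description, while KNOWN_TYPES, analyse_bracket_type and the tuple results are hypothetical names made up for the example.

```python
from dataclasses import dataclass, field

# Hypothetical symbol table mapping a name to the kind of type it resolves to.
# A real compiler would look the name up in its scopes instead.
KNOWN_TYPES = {
    "MyBuffer": "cython",   # Cython-level type => buffer
    "int": "c",             # C type            => C array without name
    "my_struct": "c",
}


@dataclass
class BracketTypeNode:
    """Raw parse result for base_type[axes]; no interpretation chosen yet."""
    base_type: str
    axes: list = field(default_factory=list)


def analyse_bracket_type(node, known_types=KNOWN_TYPES):
    """Resolve base_type first, then let it decide what the axes mean."""
    kind = known_types.get(node.base_type)
    if kind is None:
        raise NameError(f"unknown type name: {node.base_type}")
    if kind == "cython":
        # Cython type: the bracket contents are buffer/template arguments.
        return ("buffer", node.base_type, tuple(node.axes))
    # struct/typedef/C type: the bracket holds a single array size.
    if len(node.axes) != 1:
        raise TypeError("a C array declarator takes exactly one size")
    return ("c_array", node.base_type, node.axes[0])


print(analyse_bracket_type(BracketTypeNode("int", ["10"])))
print(analyse_bracket_type(BracketTypeNode("MyBuffer", ["float", "2"])))
```

The parser only records what was written; the choice between buffer and C array is made once the base type is known.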
So the strategy would be to have p_expr_or_c_declarator return a parse tree which is "unresolved" (e.g. an ExprOrTypeNode). One could then call either analyse_as_expr or analyse_as_type on the tree, once it is known what to expect. If the tree contained something which could only be interpreted as an expression and one called analyse_as_type, an error would be raised at that point.
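A minimal sketch of such an unresolved tree, again in plain Python and not taken from the Cython sources. The method names analyse_as_expr and analyse_as_type follow the description; the node classes (Name, Paren, Add, Pointer), the InterpretationError exception and the tuple results are assumptions for illustration only.

```python
from dataclasses import dataclass


class InterpretationError(Exception):
    """Raised when a node cannot take the requested interpretation."""


@dataclass
class Name:  # "somename": can be either
    ident: str

    def analyse_as_expr(self):
        return ("name_expr", self.ident)

    def analyse_as_type(self):
        return ("named_type", self.ident)


@dataclass
class Paren:  # "a(b)": function call, or unnamed function declarator
    head: object
    arg: object

    def analyse_as_expr(self):
        return ("call", self.head.analyse_as_expr(), self.arg.analyse_as_expr())

    def analyse_as_type(self):
        # Function returning `head` and taking an argument of type `arg`.
        return ("func_type", self.head.analyse_as_type(), self.arg.analyse_as_type())


@dataclass
class Add:  # "a+b": must be an expression
    left: object
    right: object

    def analyse_as_expr(self):
        return ("add", self.left.analyse_as_expr(), self.right.analyse_as_expr())

    def analyse_as_type(self):
        raise InterpretationError("'a + b' can only be an expression")


@dataclass
class Pointer:  # "a*": must be a type declaration
    base: object

    def analyse_as_expr(self):
        raise InterpretationError("'a*' can only be a type")

    def analyse_as_type(self):
        return ("pointer", self.base.analyse_as_type())


tree = Paren(Name("a"), Name("b"))   # what the parser would return for "a(b)"
print(tree.analyse_as_expr())        # interpreted as a call
print(tree.analyse_as_type())        # interpreted as a function type
```

The error for a mismatch such as Paren(Name("a"), Add(Name("b"), Name("c"))).analyse_as_type() surfaces only when the wrong interpretation is requested, not at parse time.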
This seems like quite a big task which I'm unsure about spending time on. But the result is much more "correct", in that the parser doesn't make decisions it really can't make. It also helps move logic out of the parser in general. What do you think?