Ticket #144 (closed task: fixed)

Opened 20 months ago

Last modified 16 months ago

Fix BlockNode/move literals/constants to code generation

Reported by: dagss Owned by: scoder
Priority: minor Milestone: 0.12
Component: Code Generation Keywords: cleanup
Cc:

Description

In summer 2008, I moved the things that BlockNode was made for doing over to code.funcstate. However, things are a bit ugly: Constants/literals are recorded in env during analysis, then each function environment pipes the literals to code.funcstate (possibly making duplicate reports from the env), which again makes efforts to merge duplicate entries.

This is ugly, and I'm not guaranteeing that it is working 100% either.

One side-effect is that if literals cannot be added in tree transformations, and if you remove literals in transformations then it is still allocated etc.

The best fix is to move registration of literals/constants over to code generation phase (modifying code.funcstate along the way). This can be done one literal type at the time, nice and incrementally. When all is done, BlockNode can be deleted.

Change History

Changed 20 months ago by dagss

I raised the question on whether char* literal constants are needed at all on the mailing list. The result is "yes". Note that a fix for this issue likely corresponds loosely to a fix of BlockNode.

Excerpt:

> > Hi,
> >
> > Greg Ewing wrote:
>> >> Dag Sverre Seljebotn wrote:
>> >>
>>> >>> Can anyone think of a reason why C string literals are allocated as
>>> >>> variables in C source?
>> >>
>> >> I can't remember all the reasons I did it that way in Pyrex,
>> >> but one of them may have been so that I could leave calculating
>> >> the length of the string to the C compiler. It's not entirely
>> >> trivial to do that when escape sequences are involved.
> >
> > That is very true. I put a whole lot of work into proper string  
> > escaping to
> > make sure byte strings end up in the binary the way they were  
> > written in
> > the source. That usually makes non-ASCII strings a bit longer than  
> > their
> > pure byte sequence.
> >
> >
>> >> Another reason is so that I can refer to the C version of the
>> >> string in the generated code concisely, instead of having to
>> >> insert the whole string literal at that point, making it
>> >> easier to audit the generated code.
> >
> > It's also more readable, as some important information such as  
> > identifier
> > state or unicode type are currently stored in the string tab behind  
> > the
> > string literal. Although one could argue that it's trivial to move it
> > before the literal...
> >
> > All of this makes me think that it's not a bad idea to keep them  
> > separate
> > in the code.

Changed 20 months ago by dagss

#99 will probably be fixed as a consequence of this one.

Changed 20 months ago by dagss

#68 has a comment on the same subject:

Pyrex has some really good changes gone into it in this area (Greg can be a lot more confident about not breaking things so I suppose he just took it all the way at once); see the commits prior to and leading up to this one:

 http://hg.cython.org/pyrex/rev/da6e97bb7e6d

Changed 17 months ago by dagss

  • priority changed from major to minor

Changed 17 months ago by scoder

There is a test case for #99 (-T99) that tests for this.

Changed 17 months ago by scoder

  • owner changed from somebody to scoder
  • status changed from new to assigned
  • milestone changed from wishlist to 0.12

Changed 17 months ago by scoder

Changed 17 months ago by scoder

The constant handling has been moved to codegen time now. The only exception is the builtins cache, which remains inside the scoping infrastructure (for now), as there is no technical need to move it.

Two problems remain:

  • CArrayDeclaratorNode needs the array size at type analysis time. In the case of a C compile time constant expression (such as "C_CONST_VAL+3"), there is currently no way to retrieve a string for the size. This is tested in carrays.pax.
  • Some PyrexTypes have a create_convert_utility_code() method that dynamically generates a UtilityCode? instance during type analysis. If this requires constant allocation (such as in CStructOrUnionType), this cannot be done before the code generation phase.

Both of these need fixing as they break previously working code.

Changed 16 months ago by scoder

The latter is not ticket #257.

Changed 16 months ago by scoder

The latter is now ticket #257.

Changed 16 months ago by scoder

  • status changed from assigned to closed
  • resolution set to fixed
Note: See TracTickets for help on using tickets.