Ticket #561 (closed enhancement: fixed)

Opened 3 years ago

Last modified 3 years ago

major performance bug with special functions

Reported by: cwitty Owned by: cwitty
Priority: major Milestone: 0.13
Component: Code Generation Keywords:
Cc: Carl.Witty@…

Description

I found (and, I think, fixed) a major performance bug with special functions in Cython.

If you define a special function (__len__, __add__, __getattr__, etc.) in a cdef class, and then inherit from that class with a non-cdef class (either a Cython class that's not cdef, or a Python class), then calling the special function on the subclass will be vastly slower than necessary.

For example, with this Cython code:

cdef class NullAdder:
    def __add__(self, other):
        return self

class NullAdder2(NullAdder):
    pass

I get these timings:

sage: na = NullAdder()
sage: timeit('na + na', number=10^5, repeat=30)
100000 loops, best of 30: 94.3 ns per loop
sage: na2 = NullAdder2()
sage: timeit('na2 + na2', number=10^5, repeat=30)
100000 loops, best of 30: 261 ns per loop

where it takes about 170ns extra to call the method in the subclass. (The overhead differs by special function; for __getattr__ it's more like 700ns of overhead on the same machine.)

After my patch, the timings become:

sage: na = NullAdder()
sage: timeit('na + na', number=10^5, repeat=30)
100000 loops, best of 30: 92 ns per loop
sage: na2 = NullAdder2()
sage: timeit('na2 + na2', number=10^5, repeat=30)
100000 loops, best of 30: 92.9 ns per loop

That is, to within measurement error, the special function on the subclass is exactly the same speed as the special function on the superclass.

What's going on (described in terms of __getattr__/.tp_getattro, but the same thing happens for any special function):

In typeobject.c, there's a subtle interplay between the contents of the .tp_getattro slot and the "__getattr__" in the type's dictionary, to make sure they always match. When defining a type in C, you can set .tp_getattro, and typeobject.c:add_operators() will create a "__getattr__" containing a wrapper for the .tp_getattro function. When defining a type in Python, you can set __getattr__, and typeobject.c:update_one_slot() will set .tp_getattro to a function that will look up __getattr__ and call it.

If you define a type in C, that sets .tp_getattro (so add_operators() creates a __getattr__ wrapper), and then inherit from the type in Python, it needs to avoid setting .tp_getattro to the function that looks up __getattr__; instead, it should leave .tp_getattro alone (inherited from its parent). This case is detected in update_one_slot() by checking if __getattr__ is a wrapper object; if so, it was presumably created by add_operators(), and presumably matches .tp_getattro, so nothing has to be done.

Cython was breaking this optimization by setting both .tp_getattro and __getattr__. Then update_one_slot() notices that __getattr__ is set, and is not a wrapper object; so it sets .tp_getattro to call __getattr__.

The fix is simple... just don't create the Python versions of the methods and let PyType_Ready (which calls add_operators()) do it for you.

I do know of one behavioral change with this patch: if you define __getattr__ in a cdef class, then Python introspection will claim that you implemented __getattribute__ instead. (There may be other similar changes.)

I tested the patch as follows: applied it to Cython 0.12.1, Cython 0.13.beta0, and current cython-devel; and ran the test suite. All tests passed (except the two failures reported by Arfrever for Cython 0.13.beta0). I then applied it to the Sage Cython spkg from Sage 4.5.1, rebuilt Sage's cython code using the new version ("sage -ba"), and ran the entire test suite ("make ptestlong"). All tests passed.

I'm not sure if it's too late, but I think this patch (or something like it) should be seriously considered for 0.13.

Attachments

trac_561-cython-special-functions.patch Download (1.9 KB) - added by cwitty 3 years ago.
trac_561-test-suite.patch Download (20.0 KB) - added by cwitty 3 years ago.
trac_561-python3-doctest-fixes.patch Download (14.9 KB) - added by cwitty 3 years ago.

Change History

Changed 3 years ago by cwitty

Changed 3 years ago by scoder

  • type changed from defect to enhancement

It looks like your patch lacks dedicated tests, which I find necessary for this kind of change.

Also, what Python versions did you test this on? I'd like to avoid accidentally breaking Python 2.3 compatibility just because we start relying on some 'better' behaviour of PyType_Ready() in newer Python versions.

Changed 3 years ago by cwitty

Changed 3 years ago by cwitty

  • cc Carl.Witty@… added

I added an extensive test suite (to be applied on top of the first patch) that tests all of the bound methods affected by the patch. The only behavioral change of the patch is that with the previous version of Cython, defining __getattr__ gets you Python-level methods of __getattr__ and __getattribute__; after the patch, defining __getattr__ only gets you __getattribute__.

I tested with Python 2.5 and 2.6.

Changed 3 years ago by robertwb

  • owner changed from somebody to cwitty

Changed 3 years ago by robertwb

  • status changed from new to closed
  • resolution set to fixed

Changed 3 years ago by robertwb

  • status changed from closed to reopened
  • resolution fixed deleted

Changed 3 years ago by cwitty

Changed 3 years ago by cwitty

This fixes the Python 3.1 doctest failures by splitting some of the tests into a _py2 and a _py3 file.

Also, I simply removed the failing doctest from type_slots_int_long_T287.pyx. (Lisandro suggested changing the test, but I couldn't figure out how to change it and still be valid for Python 3.)

Changed 3 years ago by robertwb

  • status changed from reopened to closed
  • resolution set to fixed

This has been fixed except for __getattr__.

Note: See TracTickets for help on using tickets.