Ticket #299 (new task)

Opened 5 years ago

Last modified 5 years ago

Buffers: Clean up internal representation of acquired buffers

Reported by: dagss Owned by: somebody
Priority: major Milestone: wishlist
Component: Code Generation Keywords: kurtgsoc
Cc:

Description (last modified by dagss) (diff)

For speed purposes, some information (shape, and strides and suboffsets when applicable) is unpacked into local variables when acquiring a buffer.

The rest of the Cython codebase is (naturally) centered around "one variable Cython side is one variable C side". For instance temporary expressions must store their result in a single temporary variable C-side (which is acquired through code.funcstate.allocate_temp, Code.py).

Using structs instead of loose variables for unpacking buffers would remedy this problem, while keeping the necesarry unpacked variables on the stack (so it should compile to as optimal code).

E.g. for strided 1D mode, one would rather than

PyObject* __pyx_v_myvar
Py_buffer __pyx_buf_myvar
Py_ssize_t __pyx_bufshape0_myvar
Py_ssize_t __pyx_bufstride0_myvar

use a single struct like this:

typedef struct {
  Py_buffer bufinfo;
  Py_ssize_t shape0, stride0;
} __Pyx_StridedBuf_1D;

typedef struct {
  __Pyx_StridedBuf_1D buffer;
  PyObject* object;
} __Pyx_StridedBuf_1D_Obj;

// in function...:

__Pyx_StridedBuf_1D_Obj __pyx_v_myvar;

The reason to split into a struct without and one with object, is to make it easier to implement non-object-linked buffers later (e.g. int[:]).

In addition to simply changing the variable allocation scheme (in Buffer.py), one would then need to output __pyx_v_myvar.object rather than __pyx_v_myvar everywhere the Python object is needed; this can rather easily be done in NameNode.

Change History

Changed 5 years ago by dagss

In addition to {{NameNode?}}}, various cleanup code (exception handling etc.) would need to be taught using __pyx_v_myvar.object instead for their Py_DECREF.

Changed 5 years ago by dagss

Note that the object field present in Py_buffer is filled in by the getbuffer implementation in the object in question, and so we should not rely on it for this purpose, but must store our own object reference.

Changed 5 years ago by dagss

  • description modified (diff)

Changed 5 years ago by dagss

  • description modified (diff)

Changed 5 years ago by dagss

Note 1

Note that in combination with #311, this looks like

typedef struct {
  __Pyx_buffer* bufinfo;
  Py_ssize_t shape0, stride0;
} __Pyx_StridedBuf_1D;

One would copy it by value, and in doing so modify the reference count inside bufinfo. When the bufinfo refcount hits 0, one would release the buffer.

Note 2 Furthermore, doing efficient slicing (#178) requires moving the base pointer. E.g. with arr[10:] the strides are not affected, only the base pointer. This might be an argument in favor of storing the buffer pointer itself by value, making it

typedef struct {
  __Pyx_buffer* bufinfo;
  char* data;
  Py_ssize_t shape0, stride0;
} __Pyx_StridedBuf_1D;

Note 3 Finally, I'm not sure how much of an issue it is to have some unecessarry fields here and there. Perhaps the full version with shape, strides and suboffsets will be all that is needed (*shrug*).

Changed 5 years ago by dagss

Perhaps some variation on

typedef struct {
  Py_ssize_t shape, stride, suboffset;
} __Pyx_Buf_DimInfo;

typedef struct {
  __Pyx_buffer* bufinfo;
  char* data;
  __Pyx_Buf_DimInfo diminfo[2];
} __Pyx_Buffer_2D;

Then one could have a generic version without a size in diminfo? Not sure how C works here...

Changed 5 years ago by dagss

Apparently most C89 compilers will allow

typedef struct {
   ...
   __Pyx_Buf_DimInfo diminfo[1]
} __Pyx_Buffer_2D;

and then malloc-ing too much memory for it. Using diminfo[] is a C99 feature which we thus can't use.

Not really useful as we must allocate on the stack.

Note: See TracTickets for help on using tickets.