
from hell import interesting_revelations

Somewhat inspired by the philosophical thickets in the depths of one of the more fundamental discussions on python-list aka comp.lang.python, I wrote a little function in C that grossly violates the Python object model's integrity and swaps two object structures in-place. What I wasn't expecting is that this can be used to illustrate some interesting facets of the CPython internals.

The original version looked like this:

static PyObject *
swap(PyObject *self, PyObject *args)
{
    PyObject *obj1, *obj2;
    Py_ssize_t len;
    PyObject *temp;

    if (!PyArg_ParseTuple(args, "OO", &obj1, &obj2)) {
        return NULL;
    }

    len = obj1->ob_type->tp_basicsize;
    if (obj2->ob_type->tp_basicsize != len) {
        PyErr_SetString(PyExc_TypeError, "types have different sizes (incompatible)");
        return NULL;
    }

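    temp = PyMem_Malloc(len);
    if (temp == NULL) {
        return PyErr_NoMemory();
    }
    /* Swap the raw object structures through a temporary buffer. */
    memcpy(temp, obj1, len);
    memcpy(obj1, obj2, len);
    memcpy(obj2, temp, len);
    /* The reference counts belong to the memory locations, not to the
       contents, so put them back where they were. */
    obj2->ob_refcnt = obj1->ob_refcnt;
    obj1->ob_refcnt = temp->ob_refcnt;
    PyMem_Free(temp);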

    Py_INCREF(Py_None);
    return Py_None;
}

Simple: get the object size in memory, and swap using a temporary variable. This sort of works — but not quite.

Python 3.1.2 (release31-maint, Jul  8 2010, 09:18:08) 
[GCC 4.4.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from hell import swap
>>> a = "this is the first string"
>>> b = "this is the second string!"
>>> swap(a,b)
>>> a
'this is the second string!'
>>> b
'this is the first string'
>>> t1 = (1,2,3)
>>> t2 = (a,)
>>> swap(t1, t2)
>>> t1
(1,)
>>> t2
zsh: segmentation fault  python3

As you can see, it swapped the strings without any problems (I'll show you some problems further below), but it behaved strangely with the tuples: the new t1 does have only one element, like the old t2, but that one element is the first element of the old t1! Also, what the flip happens when you try to access t2?

Turns out tuple is a variable-size type. That means it can be created with any number of items, and its size in memory depends on how many items it holds. My original code only respected the “basic size” of the type, meaning that, in the case of tuples, it copied the information on how many items there are, but not the actual items. When trying to print t2, Python reads beyond the end of the tuple structure, probably dereferences an invalid pointer, and dies a painful death.
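The two numbers involved can be inspected from Python itself. The following is a minimal sketch, assuming CPython, where `__basicsize__` and `__itemsize__` are the Python-level views of `tp_basicsize` and `tp_itemsize`; the helper name `tuple_struct_size` is made up for this illustration.

```python
# Per-type layout constants that CPython exposes to Python code.
basic = tuple.__basicsize__   # tp_basicsize: the fixed part of every tuple
item = tuple.__itemsize__     # tp_itemsize: bytes per stored item pointer


def tuple_struct_size(t):
    """Size of the tuple's object structure: fixed part plus one slot per item.

    Hypothetical helper, named only for this example.
    """
    return basic + len(t) * item


t1 = (1, 2, 3)
t2 = ("a",)
size1 = tuple_struct_size(t1)
size2 = tuple_struct_size(t2)

# Copying only `basic` bytes moves the item count but not the item slots,
# which is the mismatch the session above runs into.
print(basic, item, size1, size2)
```

The exact values depend on the platform and the Python version, so none are shown here; what matters is that the two tuples differ by two item slots.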

On a side note, Python's list type is, contrary to what you might expect, not a variable-size type. It cannot be, since, in the case of variable-size types, the length must be known when the object is created (and allocated), and can never change. (The reason is that realloc(3)-ing an object might move it, which would invalidate pointers, which is when all hell would break loose.) Lists don't keep their items in the actual object structure; they simply keep a pointer to a separately allocated array.
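This difference is also visible from Python. A small sketch, again assuming CPython (where `id()` happens to be the object's address): the list type reports an item size of zero, and a list keeps its identity however much it grows.

```python
# list stores its items out of line, so the type has no per-item size.
list_itemsize = list.__itemsize__
tuple_itemsize = tuple.__itemsize__

lst = [1, 2, 3]
address_before = id(lst)

# Growing the list may reallocate its item array, but never the list
# object itself, so every existing reference to it stays valid.
lst.extend(range(1000))
address_after = id(lst)

print(list_itemsize, tuple_itemsize, address_before == address_after)
```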

Armed with the knowledge of variable-size types, we can fix hell.swap to work for tuples:

    len1 = obj1->ob_type->tp_basicsize
           + ((PyVarObject*)obj1)->ob_size * obj1->ob_type->tp_itemsize;

    len2 = obj2->ob_type->tp_basicsize
           + ((PyVarObject*)obj2)->ob_size * obj2->ob_type->tp_itemsize;

    if (len1 != len2) {
        PyErr_SetString(PyExc_TypeError, "objects have different sizes (incompatible)");
        return NULL;
    }

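    temp = PyMem_Malloc(len1);
    if (temp == NULL) {
        return PyErr_NoMemory();
    }
    memcpy(temp, obj1, len1);
    memcpy(obj1, obj2, len1);
    memcpy(obj2, temp, len1);
    /* Restore the reference counts, which must stay with the locations. */
    obj2->ob_refcnt = obj1->ob_refcnt;
    obj1->ob_refcnt = temp->ob_refcnt;
    PyMem_Free(temp);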

Recompile, and we're ready for more apocalyptic idiocy. This time, after checking that tuples actually work as expected, we will be swapping strings in the wrong place to the great detriment of our sanity.

Python 3.1.2 (release31-maint, Jul  8 2010, 09:18:08) 
[GCC 4.4.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from hell import swap
>>> t1, t2, t3 = (1,2,3), (None,), ("a", "b", "erm...")
>>> swap(t1, t2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: objects have different sizes (incompatible)
>>> swap(t1, t3)
>>> t1
('a', 'b', 'erm...')
>>> t3
(1, 2, 3)
>>> s = set(t1)
>>> swap(t1[0], t1[1])
>>> s
{'b', 'erm...', 'a'}
>>> 'b' in s
False
>>> 'a' in s
False
>>> 'erm...' in s
True
>>> 'a' in list(s)
True
>>> 'b' in list(s)
True
>>> 

Okay, erm, what? It looks like it's there, but it's not, but then it is? There is, of course, a simple explanation for this:

Sets (like dicts) are, for speed, implemented as hash tables. When you look up something in a set or dict, Python first calculates the hash of the key, and then searches the table for that hash. However, since it's possible for two objects to have the same hash, it also checks for equality. You will only get a result when there is an object around that both has the same hash and compares equal.
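The equality check is easy to observe with a deliberately bad hash function. A minimal sketch; the class `SameHash` is invented for this illustration and makes every instance collide:

```python
class SameHash:
    """Toy key type whose instances all share one hash value."""

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return 1  # every instance collides on purpose

    def __eq__(self, other):
        return isinstance(other, SameHash) and self.value == other.value


s = {SameHash("x"), SameHash("y")}

distinct_members = len(s)           # equality keeps the two apart
has_x = SameHash("x") in s          # same hash and equal
has_z = SameHash("z") in s          # same hash, but equal to nothing
print(distinct_members, has_x, has_z)
```

Both members survive in the set despite the collision, and a lookup only succeeds for a key that also compares equal.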

So, what happens here is: when you execute 'a' in s, the hash of 'a' is calculated, and all the items of the set that are filed under that hash are checked to see whether they actually are 'a'. Since the swap, however, the hash of 'a' is associated with 'b' and vice versa. The set is corrupted because it assumes, as it is entitled to, that the hash of an object either never changes or does not, in fact, exist (lists, for example, aren't hashable at all, since they're mutable, and the hash would have to change when the object changes, which would defeat the whole point).
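The same corruption can be reproduced without touching memory, by giving objects a hash that depends on mutable state. This is only an analogy to what hell.swap does, and the class `Label` is made up for the purpose:

```python
class Label:
    """Toy hashable whose hash follows its (mutable) text."""

    def __init__(self, text):
        self.text = text

    def __hash__(self):
        return hash(self.text)

    def __eq__(self, other):
        return isinstance(other, Label) and self.text == other.text


a, b = Label("a"), Label("b")
s = {a, b}   # each object is filed under the hash it has right now

# Exchange the contents in place, much like swapping the two strings.
a.text, b.text = b.text, a.text

# The set looks under hash("a") and finds an object that now says "b";
# the object that says "a" is still filed under hash("b").
found_in_set = Label("a") in s

# A list compares by equality only, so a linear search still succeeds.
found_in_list = Label("a") in list(s)
print(found_in_set, found_in_list)
```

This is the reason the usual advice is to define `__hash__` only in terms of state that never changes while the object sits in a set or dict.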

There you have it: that's what you get when you muck around in Python's memory.

I've uploaded the source code to JollyBOX code. Use it wisely.

>>> from hell import swap
>>> swap(str, int)
zsh: segmentation fault  python3


Bach's St. John Passion

The video shown below contains the beginning of J. S. Bach's St. John Passion, a very baroque piece of music. I am not asking you to watch it all, there's a lot of repetition in it.

This post isn't really about music. It is about certain ideas that can be expressed through music.

To be perfectly honest, this beginning chorus scares me. Of course, the music is meant to be somewhat creepy, what with the restless, quick dancing of the strings and the dissonant woodwind melodies. The music effectively creates a certain mood, but there's much more to it than that. Let's have a look at the text[1]:

Herr, unser Herrscher, dessen Ruhm
In allen Landen herrlich ist!

Lord, our ruler, whose glory
is magnificent everywhere!

So basically[2], we are dealing with some kind of disembodied “lord” who is evidently, judging by the way the composer stresses the word „Herr”/“lord”, rather important. In fact, he is being completely and utterly glorified. Obviously, they[3] love him a lot. However, if we just glance back at the mood the music creates, that sinister, dark mood, we must conclude that something doesn't quite fit. This disembodied “lord” person can't be all that great and lovely. If he were, the music would certainly be cheerful.

Now, I happen to know that this music, or at least the fact that it is so sinister, is all about the death of a certain Jesus (as the name is popularly transliterated from Hebrew to Latin via ancient Greek) of Nazareth (Palestine). Depending on how you interpret this, you could conclude that they are either glorifying death, glorifying and worshipping a dead guy, or glorifying some kind of “lord” that made sure the aforementioned Jesus was murdered.

Anyway, what we can see is a great amount of glorification and love of some kind of “lord” that has a profound connexion to death, and I think we can agree that death is, in general, not a very nice thing. Natural, yes, but we do, as a culture, or as a species, have a certain amount of dislike for it. All of that is perfectly okay. A bit strange, maybe, but, in essence, okay. But then they start calling this “lord” a “ruler”. To recap, they love him, they might be a little afraid of him, death connexion and all, they hold him to be fabulously glorious (just listen to that melisma...), and they might do just about anything for him, this lord-ruler.

Of course, “lord” refers to a deity also known as “God”, or, more precisely, one deity that is three deities all at once, actually worshipped all around the planet. This deity is supposed to be all-powerful (no wonder they sound a bit scared), all-loving (no wonder they love him?), and glorious (umm, yeah). Also, even when this deity is around, at least two thirds of him tend to be invisible. All of this story sounds rather unlikely, and in fact, there is (you guessed it) not so much as a scrap of scientific evidence around to support any of it.

Yet, it appears that there are still people who glorify this “lord” and would do anything for him. The fact that he almost certainly doesn't exist doesn't help a whole lot here: there are plenty of people who claim to be his representatives. Or representatives of his representative. (A well-known claimant is Joseph Ratzinger, who styles himself Benedict XVI, which sounds like something a mediæval lord might be called.) The thing is, the people glorifying this fantastic “lord” (or at least a number of them) tend to believe some of the people claiming to represent that same lord-ruler. It would appear that these representatives have quite a devoted army behind them. Dangerously devoted. Dangerously devoted to somebody who (almost certainly) doesn't exist.

Does that not scare you?

[1] Text and translations may be found at http://www.bach-cantatas.com/Vocal/BWV245.htm. The above English is the translation English-3.
[2] No, I'm not being scientific here. I might be attempting some form of humour.
[3] I won't attempt to pinpoint who they are, but there is bound to be a relevant they. After all, this piece of music is famous and played often to this day.
