Introducing tjol.eu
On the good old JollyBOX blog, I try to maintain at least a bit of editorial quality; you could say I like my posts here to be, in a way, “worth printing”. This means that every now and then, something springs to mind that I feel I would like to share, but that doesn't make for a blog post in the style and quality a reader of the JollyBOX blog might expect.
These days, tumble-blogging is quite en vogue, and I have finally found the time and energy to set one up within this my web flotilla (code name: JollyBOX v6), and I call it tjol.eu. It may or may not be the Next Big Thing™, and it is, emphatically, not a journal. This new blog-minor takes its place between the good old blog-major and the blog-micro, held jointly by Twitter and identi.ca.
I took care to implement this in a way that allows posts to be moved from one blog to the other without permalinks breaking, and even supports making selected posts, such as this one, part of both blogs, with a single set of comments and pingbacks.
This will be fun!
My life as a duck
Naturally, you have all thoroughly read Harry Potter. One particular line from the epic has been making its rounds in my mind for months — I am certain you know it well:
WIT BEYOND MEASURE IS MAN'S GREATEST TREASURE
A fabulous motto, don't you think? You might ask how this phrase managed to stick in my mind for months, and there is indeed a simple answer: I haven't been using it a lot, not by my usual standards. My mind, that is. And it would appear that I've adopted the motto to a point, for purely defensive purposes of course, along with muttering to myself in French.
So I wound up working with mentally disabled people, which is, by all means, not unpleasant, and certainly valuable work, but on the other hand it is not exactly voluntary, nothing like well-paid, and, most importantly, frightfully, excruciatingly dull. There are days with a lot of work, and they're not really that bad, but typically, there isn't much that requires my attention. My reaction to this situation was to spin down my mental faculties to a point where I don't get to write blog posts or programming code an awful lot, but where maintaining an inane French inner monologue and crafting the odd mildly humorous but mostly rather snide remark is enough to keep me going without going totally batty (you be the judge of that).
On a wildly different note, my slice of aforementioned treasure still appears to be in place; after all, I got a place at Mansfield College at the University of Oxford to read Physics, starting Michaelmas next.
from hell import interesting_revelations
Somewhat inspired by the philosophical thickets in the depths of one of the more fundamental discussions on python-list aka comp.lang.python, I wrote a little function in C that grossly violates the Python object model's integrity and swaps two object structures in-place. What I wasn't expecting is that this can be used to illustrate some interesting facets of the CPython internals.
The original version looked like this:
    static PyObject *
    swap(PyObject *self, PyObject *args)
    {
        PyObject *obj1, *obj2;
        Py_ssize_t len;
        PyObject *temp;

        if (!PyArg_ParseTuple(args, "OO", &obj1, &obj2)) {
            return NULL;
        }

        len = obj1->ob_type->tp_basicsize;
        if (obj2->ob_type->tp_basicsize != len) {
            PyErr_SetString(PyExc_TypeError, "types have different sizes (incompatible)");
            return NULL;
        }

        temp = PyMem_Malloc(len);
        memcpy(temp, obj1, len);
        memcpy(obj1, obj2, len);
        memcpy(obj2, temp, len);
        obj2->ob_refcnt = obj1->ob_refcnt;
        obj1->ob_refcnt = temp->ob_refcnt;

        Py_INCREF(Py_None);
        return Py_None;
    }
Simple: get the object size in memory, and swap using a temporary variable. This sort of works — but not quite.
    Python 3.1.2 (release31-maint, Jul 8 2010, 09:18:08)
    [GCC 4.4.4] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from hell import swap
    >>> a = "this is the first string"
    >>> b = "this is the second string!"
    >>> swap(a,b)
    >>> a
    'this is the second string!'
    >>> b
    'this is the first string'
    >>> t1 = (1,2,3)
    >>> t2 = (a,)
    >>> swap(t1, t2)
    >>> t1
    (1,)
    >>> t2
    zsh: segmentation fault python3
As you can see, it swapped the strings without any problems (I'll show you some problems further below), but it behaved strangely with the tuples: the new t1 does have only one element, like the old t2, but that one element is the first element of the old t1! Also, what the flip happens when you try to access t2?
Turns out tuple is a variable-size type. That means a tuple can be created with any number of items, and its size in memory depends on how many there are. My original code only respected the “basic size” of the type, meaning that, in the case of tuples, it copied the information on how many items there are, but not the actual items. When trying to print t2, Python reads beyond the end of the tuple structure, probably dereferences an invalid pointer, and dies a painful death.
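To make the gap between the basic size and the full size concrete, here is a small sketch in pure Python. It assumes CPython, where `__basicsize__` and `__itemsize__` expose the type's tp_basicsize and tp_itemsize; the helper name `full_size` is made up for illustration and is not part of hell.

```python
def full_size(obj):
    """Basic size plus one item slot per element (hypothetical helper)."""
    cls = type(obj)
    return cls.__basicsize__ + len(obj) * cls.__itemsize__


small = (None,)
large = (1, 2, 3)

# The basic size belongs to the type, so it is identical for both tuples.
print(tuple.__basicsize__, tuple.__itemsize__)

# The full size grows with the number of items stored inline.
print(full_size(small), full_size(large))

# On CPython, __sizeof__() should report the same figure
# (it does not include the garbage collector's header).
print(large.__sizeof__() == full_size(large))
```

Copying only `tuple.__basicsize__` bytes therefore moves the header, including the item count, but none of the item slots that follow it.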
On a side note, Python's list type is, contrary to what you might expect, not a variable-size type. It cannot be, since, in the case of variable-size types, the length must be known when the object is created (and allocated), and can never change. (The reason is that realloc(3)-ing an object might move it, which would invalidate pointers, which is when all hell would break loose.) Lists don't keep their items in the actual object structure; they simply keep a pointer to a separately allocated array.
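The same distinction can be observed from Python itself. This is only a sketch and assumes CPython, where `id()` happens to be the object's memory address:

```python
# Tuples store their items inline, so the type has a non-zero per-item size.
print(tuple.__itemsize__)   # size of one pointer on this platform

# Lists keep only a pointer to a separately allocated item array,
# so the list object itself is fixed-size.
print(list.__itemsize__)    # 0

items = []
before = id(items)          # on CPython, id() is the object's memory address
for n in range(1000):
    items.append(n)

# The item array has been resized along the way, but the list object
# itself has not moved.
print(id(items) == before)  # True
```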
Armed with the knowledge of variable-size types, we can fix hell.swap to work for tuples:
    len1 = obj1->ob_type->tp_basicsize
         + ((PyVarObject*)obj1)->ob_size * obj1->ob_type->tp_itemsize;
    len2 = obj2->ob_type->tp_basicsize
         + ((PyVarObject*)obj2)->ob_size * obj2->ob_type->tp_itemsize;

    if (len1 != len2) {
        PyErr_SetString(PyExc_TypeError, "objects have different sizes (incompatible)");
        return NULL;
    }

    temp = PyMem_Malloc(len1);
    memcpy(temp, obj1, len1);
    memcpy(obj1, obj2, len1);
    memcpy(obj2, temp, len1);
    obj2->ob_refcnt = obj1->ob_refcnt;
    obj1->ob_refcnt = temp->ob_refcnt;
    PyMem_Free(temp);
Recompile, and we're ready for more apocalyptic idiocy. This time, after checking that tuples actually work as expected, we will be swapping strings in the wrong place to the great detriment of our sanity.
    Python 3.1.2 (release31-maint, Jul 8 2010, 09:18:08)
    [GCC 4.4.4] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from hell import swap
    >>> t1, t2, t3 = (1,2,3), (None,), ("a", "b", "erm...")
    >>> swap(t1, t2)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: objects have different sizes (incompatible)
    >>> swap(t1, t3)
    >>> t1
    ('a', 'b', 'erm...')
    >>> t3
    (1, 2, 3)
    >>> s = set(t1)
    >>> swap(t1[0], t1[1])
    >>> s
    {'b', 'erm...', 'a'}
    >>> 'b' in s
    False
    >>> 'a' in s
    False
    >>> 'erm...' in s
    True
    >>> 'a' in list(s)
    True
    >>> 'b' in list(s)
    True
    >>>
Okay, erm, what? It looks like it's there, but it's not, but then it is? There is, of course, a simple explanation for this:
Sets (like dicts) are, for speed, implemented as hash tables. When you look something up in a set or dict, Python first calculates its hash and then searches for that. However, since it's possible for two objects to have the same hash, it also checks for equality. You will only get a result when the container holds an object that both has the same hash and compares equal.
So, what happens here is: when you execute 'a' in s, the hash of 'a' is calculated, and all the items of the set that are filed under that hash are checked to see whether they actually are 'a'. Since the swap, however, the hash of 'a' is associated with 'b' and vice versa. The set is corrupted because it correctly assumes that the hash of an object either never changes or does not, in fact, exist (lists, for example, aren't hashable at all, since they're mutable, and the hash would have to change when the object changes, which would defeat the whole point).
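The same corruption can be reproduced without touching memory at all, by breaking the "hash never changes" rule in pure Python. The following is a sketch only; the `Label` class is invented for this illustration and stands in for the swapped strings.

```python
class Label:
    """Hashable wrapper whose contents can be changed after the fact."""

    def __init__(self, text):
        self.text = text

    def __hash__(self):
        return hash(self.text)

    def __eq__(self, other):
        return isinstance(other, Label) and self.text == other.text


a, b, c = Label("a"), Label("b"), Label("erm...")
s = {a, b, c}

# Swap the contents in place, roughly what hell.swap did to the two strings.
a.text, b.text = b.text, a.text

print(Label("a") in s)        # False: filed under the hash of "a" is an object that now says "b"
print(Label("b") in s)        # False, for the mirror-image reason
print(Label("erm...") in s)   # True: this entry was never touched
print(Label("a") in list(s))  # True: searching a list only uses ==, never the hash
```

The set still holds all three objects, which is why converting it to a list finds them again; only the lookup by hash has gone wrong.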
There you have it: that's what you get when you muck around in Python's memory.
I've uploaded the source code to JollyBOX code. Use it wisely.
    >>> from hell import swap
    >>> swap(str, int)
    zsh: segmentation fault python3
Some thoughts on proprietary software
Just now I read Bradley Kuhn's recent blog post entitled “Proprietary Software Licensing Produces No New Value In Society”. The argument made is, in essence, that by receiving money for a proprietary license, a developer is paid without doing any work. This argument is, in its simplicity, quite pre-industrial in nature and fundamentally flawed. Let me explain:
Bradley, in your post, you compared software development to constructing houses. The problem with this is that houses aren't copied; they're singletons. We need a better analogy.
Think suits
Let's say you want to buy a new suit. You have a couple of fundamentally different options. The first is to contact a tailor and pay them to make one. This is a very simple model: the tailor works on a suit, knowing that they'll be paid. In the end, you pay them for the actual work involved. It makes a lot of sense. This is akin to custom software development, where one is paid by the hour.
However, there is a cheaper alternative: go to a store and buy a ready-to-wear, off-the-shelf product. It's probably good enough, and you'll pay a lot less. You're actually getting value for money, and I'm sure you'll agree that it's perfectly reasonable to pay for this. However, the way the money flows is a lot less direct and obvious:
At the beginning, someone designed the suit you're buying, without being paid (or being paid by a company that isn't getting paid yet). Someone set up a production line, without being paid directly, on the mere speculation that someone might buy the suit. And now, you, the customer, are (in addition to the manufacturing and distribution costs, which don't exist in software development) retroactively paying the designer for the work they might have done years ago.
Instead of clothing, I could have used any number of other examples, such as any kind of engineered hardware, or even books. However, nobody buys custom-tailored books.
With software, in addition to financing speculative work done in the past without direct remuneration, you're usually paying for support, for bug-fixes, and for future upgrades: You are, actually, helping to finance continued work. Here, I'm mostly thinking of small software development shops, not so much big corporations like Oracle or Microsoft. For more of an insider's perspective (I myself am a student and have experience only with custom (web) software development and free software projects), I can recommend a nice article by Virgil Dupras of Hardcoded Software, recently linked on the python-dev list.
There is an ongoing micro-discussion on identi.ca that might interest you.
As a small clarification: I support free software, but I think that a strict interpretation of freedoms 2 and 3 can have its problems in a world governed by markets and money.
On the evolution of snakes.
It's been a number of years since I first learned programming in Python with Mark Pilgrim's excellent, but now somewhat outdated, book, Dive Into Python. It has managed to become outdated because the Python language is being developed and improved all the time, and new features keep being added. One of the best features of Python, besides the standard library, is arguably its documentation, which is good enough to include “What's New” documents for every release.
I've decided to have a look at the backlog of new features, and consider how I use Python today in ways that simply didn't exist when I originally came across the language.
Read about my findings after the break. (Technical language is used. Knowledge of Python and its features is presumed.)
