Distribute Must Die

A.K.A. distribute is dead; long live distribute!

Why?

Hopefully most people know that distribute is now deprecated, and the entirety of its functionality has been merged into the updated and now active setuptools project. If you didn't, consider yourself informed.

If you've never used distribute or setuptools before, they are almost worth it solely for the install_requires argument they add to a python package's setup() function, which lets you declare the software your python code depends on so that it gets installed along with your package. That, and they are required by the wonderful pip. With active development, setuptools and python packaging are improving at a rapid pace, making things easier and simpler for everyone. If you are currently using distribute, or your packages work directly with it, please consider updating your little bit of python to the latest tools!
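As a rough sketch of what that looks like, here is the kind of metadata a setup.py might hand to setup(). The project name, the version and both dependencies are made up for illustration; only install_requires itself comes from setuptools.

```python
def setup_kwargs():
    """Keyword arguments a setup.py could pass to setuptools.setup().

    Everything here is hypothetical example data, not a real project.
    """
    return dict(
        name="example_project",
        version="0.1.0",
        packages=["example_project"],
        # Each entry names a dependency, optionally with a version constraint.
        install_requires=[
            "requests>=2.0",
            "six",
        ],
    )


# In a real setup.py the file would end with something like:
#     from setuptools import setup
#     setup(**setup_kwargs())
```

When a package declared this way is installed with pip, pip reads install_requires and fetches anything that is missing.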

There are also practical reasons for updating. Besides a myriad of bug fixes, setuptools/distribute didn't properly use a verified HTTPS connection to download resources until roughly v1.0. You want this for the thing that installs your software! Over time it has also gained support for a new installation format, which you'll hear about below. At the time of writing, PyPI tells me that distribute has been downloaded 384810 times this week. We need to move on!

How?

With a python package:

You might have a file called distribute_setup.py in the root of your python project. This is a bootstrap file that can import all that nice installation functionality if the user doesn't have setuptools or distribute installed. When run directly through python it would also install distribute for you. Essentially, you should replace this file with one called ez_setup.py which can be retrieved from https://bitbucket.org/pypa/setuptools/raw/bootstrap/ez_setup.py

Notice the use of bitbucket's TLS connection to ensure you receive the correct file. The top of your setup.py might then look like this:

try:
    import setuptools
except ImportError:
    from ez_setup import use_setuptools
    use_setuptools()

from setuptools import setup, find_packages

...

setup(
    ...
)

If you have any automated process to retrieve and install distribute, replace it with setuptools.

With your system's python:

If you have distribute installed on your system (or a very old <=0.6 setuptools), you should first uninstall or delete it as necessary, and then install a new setuptools. This is the easiest way to ensure you upgrade cleanly.

Installing setuptools from scratch can be done on a *nix system like so:

wget https://bitbucket.org/pypa/setuptools/raw/bootstrap/ez_setup.py -O - | python

It may need to be done as root to update your system files.

On Windows you should be able to simply download the file and double-click it to have python run it.

Your goal is to have setuptools 2.x if you run Python >=2.6, or 1.4.2 if you are unfortunate enough to run 2.4 or 2.5. Or at the very least some version >1.
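If you'd rather check from python than by eye, something like the following might do. It's only a sketch: it assumes a python new enough to ship importlib.metadata (3.8+, so far newer than anything mentioned in this post), the helper names are my own, and the version parsing is deliberately crude.

```python
import re
from importlib import metadata


def installed_version(name):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def parse_release(version):
    """Return the leading numeric components of a version as a tuple.

    Stops at the first component that does not start with a digit, and
    ignores any suffix, so "0.6c11" becomes (0, 6).
    """
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(version, minimum=(1,)):
    """True if the version is at least the given minimum (1 by default)."""
    return parse_release(version) >= tuple(minimum)


def needs_attention():
    """Return a list of human-readable problems found in this environment."""
    problems = []
    if installed_version("distribute") is not None:
        problems.append("distribute is still installed")
    setuptools_version = installed_version("setuptools")
    if setuptools_version is None:
        problems.append("setuptools is not installed")
    elif not meets_minimum(setuptools_version):
        problems.append("setuptools %s is older than 1" % setuptools_version)
    return problems
```

Calling needs_attention() and printing the result gives a quick list of anything left to clean up; an empty list means there is nothing obvious to fix.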

Please also consider updating to the latest versions of pip, setuptools and virtualenv, if you use those. One command to do so is pip install -U pip setuptools virtualenv.

Why now?

If you've been in the python packaging business for a few years, either trying to distribute your own code, or just installing others', you might have seen this image floating around at some point in your travels:

[Image: "Distribute and Pip are the new hotness!"]

Ahh, the days when "new hotness" was still a cool phrase.

It's from a time when not so many cared about the python packaging ecosystem, but a few were trying to improve it. Ian Bicking was working on some cool new installation tool called pip (now with the ability to uninstall things!) and Tarek Ziadé had forked the languishing setuptools project into distribute, to further improve it (python 3 compatibility was nice). Tarek has evidently put a lot of work into python packaging as a whole when many others weren't, so unless I missed it I believe a lot of thanks is owed to him that he never received, for pushing it as far as he did when it needed it most. As far as I can gather, he took so much heat for his troubles and effort that he decided to stop working on the problem altogether. Fortunately it didn't stop him writing great python; his newest baby looks extremely cool.

Around 2011, a lot of other people realised that the state of python's "standard" packaging tools could be in a hell of a lot better shape than they were, and weren't serving users that well. Some decided to create alternative systems like conda, while some decided to improve what was already there. Probably the largest sea-change was to reinvigorate setuptools by merging most of distribute into it (or possibly the other way around). To this day I am a little confused as to why the merge was named "setuptools" rather than "distribute", because it retroactively made the above image and its advice unfortunately misleading. I imagine more than a few heads were scratched coming to terms with the fact that the name of the active project to be preferred had completely swapped. But it is as it is.

There is a lot more to be said about the history of python's packaging ecosystem, and I really haven't done any justice to it, but that's a little outside of this post's scope. This video should be an interesting watch if you want to know more.

As 2014 comes around, a lot has changed and practically all for the better, so it's a great time to update what you're using. Wheels are out: a binary distribution format that essentially supersedes eggs, is pip-installable, and makes things a lot easier for anyone distributing code that isn't pure-python. But you'll need a recent setuptools and pip to use them! I now whole-heartedly recommend virtualenv to manage isolated python environments, which are amazingly useful if you ever happen to work on more than one python project involving 3rd party packages.

There is also still a lot of work to be done, especially for folks like Numpy. They have complex binary dependencies to make their scientific python as fast as it is, and sorting out how to manage those and still make things easy for the user is by no means a solved problem. So pip install numpy is not yet a recommended practice (you'll most likely get a very slow Numpy from doing so, if it even installs at all). There is also a lot of bad blood to wash away from when setuptools was a completely inadequate solution for them.

So of course, in order to improve things as fast as possible, it's best to move on from legacy packaging tools, which are as of writing distribute and easy_install.

More Hints & Tips

For anyone that ever installs and tests a lot of package-based software, wheels should make this process a lot faster. Not only are they fast to install, you can create a cache of them from which pip can fetch them locally, rather than downloading and installing each time. Check out the pip wheel command and the Usage Section of the wheel docs. Integrating this workflow into python software building should cut down almost anyone's build times drastically.
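A rough sketch of that workflow, driven from python so it can slot into a build script. The wheelhouse directory and requirements.txt names are placeholders of mine; the pip options used (wheel --wheel-dir, install --no-index --find-links) are standard pip options, but do check them against the pip you have.

```python
import sys


def wheel_cache_commands(wheel_dir, requirements="requirements.txt"):
    """Build the two pip command lines for a local wheel cache.

    Returns (build, install): the first fills wheel_dir with wheels for
    everything in the requirements file, the second installs from that
    directory only, without contacting a package index.
    """
    pip = [sys.executable, "-m", "pip"]
    build = pip + ["wheel", "--wheel-dir", wheel_dir, "-r", requirements]
    install = pip + [
        "install",
        "--no-index",               # do not fall back to a package index
        "--find-links", wheel_dir,  # look for packages in the cache instead
        "-r", requirements,
    ]
    return build, install


# Possible usage (runs pip for real, so it is left commented out here):
#     import subprocess
#     build, install = wheel_cache_commands("./wheelhouse")
#     subprocess.check_call(build)    # once, or whenever requirements change
#     subprocess.check_call(install)  # in every fresh environment
```

The build step is the slow one and only needs repeating when the requirements change; the install step is what each fresh virtualenv or test run would use.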

For anyone that's interested in cryptographically making sure their packages are what they say they are and come from who they're supposed to, wheel also comes with the keygen, sign, and verify commands to do this easily. I bet there will come a time when this attack vector becomes important enough for everyone to really start taking it seriously, so it's great to see python is doing something about it now.

The soon-to-be-released python 3.4 should also start coming with pip out of the box, which will be awesome for starting off newbies. Lastly, there's also work to refresh the venerable PyPI with a better, tested code base to allow even more cool stuff to be done in the future - warehouse.

It's a bright and fervorous time for python packaging!

- Matt.
