Here is a report about something I've been playing with lately, namely a way to install a full blown custom Odoo-based project entirely with Debian packages. In this report, we'll work on a Debian system, but keep in mind that this would apply readily to any derivative distribution, such as Ubuntu.
It's been a funny experiment, about something that has been on my mind at least since the introduction of packaging options in the OpenERP/Odoo buildout recipe.
At this point I don't know if it's going to become the way Anybox deploys its custom Odoo-based projects, but time will show.
In this blog post, I will
- try and summarize what are those needs that zc.buildout alone does not address,
- then proceed with a commented shell session of a pure Debian-style installation of a real-life project.
- The third part explains how it's been brought and present the whole architecture that backs this up.
- The final part will be about the limitations of this approach.
As we've always said, zc.buildout is a great tool to provide uniformity of installation along several axes:
- uniformity accross projects: always the same bin/buildout command
- uniformity accross installations of a given project
- means to set apart those parameters that must depend on each installation (system wide or per install).
All of this while allowing each project to have more dependencies than the upstream Odoo, or to use newer features and bugfixes from those libraries. Given that these features and bugfixes may well be critical for a precise project while being minor to the ecosystem as a whole, we need flexibility. In turn, full flexibility without a reproducibility warranty would just bring hostile chaos upon us. Buildout gives us such a warranty, albeit with shortcomings.
Namely, installation can fail. This means that the reproducibility promise is actually to be understood as "if it worked, then it is what has been required".
The bootstrap phase is the typical chicken-and-egg problem: one needs to have a local installation of zc.buildout with a compatible enough version to actually run bin/buildout. If the configuration file requires a different version, then the actual first execution will also install it and switch to it before proceeding .
There are several ways to bootstrap  and I would need several pages to write on the details of how each can break. The important thing, though, is that making one work reliably at a given point in time is far from being a guarantee that it won't stop working in an unpredictable manner. Usually, developers won't notice immediately, and that increases the probability of embarassment.
Considering that this process actually involves setuptools, on which zc.buildout also depends for its regular operation actually makes me think that a reliable procedure is quite impossible, at least without the a priori knowledge of versions of both and some assumptions on the environment.
When the bootstrap breaks, the errors are quite obscure, but there's nothing that couldn't be overcome easily by someone accustomed to them in manual mode. On the other hand, we want to let developers focus on their precise project, and give something that works to the one doing the installation. Also consider that the latter one might not even be a person, and part of a wider automated process. Automated processes are great, but sometimes don't let operators switch to manual mode easily.This first execution of bin/buildout is not part of the bootstrap, but one can hardly call the bootstrap successful if the first execution fails.
A very common practice is to include the bootstrap.py script directly in the project. This has the obvious advantage of making the project self-contained, with instructions as simple as: "just do python bootstrap.py && bin/buildout. But it also implies that this precise bootstrap must work in all contexts and without time limitations, if one really wish to ensure reproducibility of a precise build of the project.
Another way is to shortcut the bootstrap.py script and instead install directly zc.buildout in a virtualenv with pip. It can be tightened by installing directly the target version.
Shipping an archived buildout such as one the recipe's extract-downloads-to option can help you build is great: the precise revision of Odoo and all the modules that the project uses are already extracted and packaged.
Still, without further efforts, the installation on a given system will perform these steps:
- download of all needed Python libraries
- compilation of those Python libraries that need it (extensions)
Of these, the most problematic is the download. First, it's not frequent, but versions can disappear. It's not frequent anymore, but the Python Package Index can go offline.
I personnally never had to install Odoo/OpenERP on a machine that could not reach the internet at all, but I've lost some valuable time and energy in the frustration of having to battle with proxy configurations. Your fellow operators might just not be ready to provide you with access codes.
In general, if a step that you're used to perform on your own in 5 minutes suddenly needs assistance from a third party (proxy admin) and overall 1-2 hours of tinkering, you're very likely to have a schedule problem.
A full Debian backed installation
Before explaining how it works, let's demonstrate.
I'm on a fairly barebones Debian 7 system (actually it's a fresh install from the OpenVZ template, of which I removed some stuff for fun). It has only 209 installed packages, and doesn't take much disk space:root@aohdeb:~# dpkg -l | grep '^ii' | wc -l 209 root@aohdeb:~# df -h / Filesystem Size Used Avail Use% Mounted on /dev/simfs 20G 448M 20G 3% /
I have gone through the instructions to hook the system to Anybox package repositories, installed anybox-keyring and additionally registered a special source:root@aohdeb:~# cat /etc/apt/sources.list.d/anybox.list deb http://apt.anybox.fr/openerp common main # anybox-odoo-host is in testing only for now deb http://apt.anybox.fr/openerp anybox-testing contrib # special source for Python libs expected by buildout: deb http://pylib.apt.anybox.fr wheezy-pythonlib contrib
I have a local package of my project that I managed to put on this server. I'll explain how it's been built later, but let's say for now that it has all the needed dependencies, including anybox-odoo-host.root@aohdeb:~# ls *.deb project_name_220.127.116.11.2_all.deb
This is actually a complex project, with lots of dependencies. In particular, it makes use of Aeroo reports, which means a full headless LibreOffice stack.
For the demonstration, we'll use gdebi, a tool that's able to download the dependencies from the repository for a locally available package (see also below for discussion of downloads):root@aohdeb:~# aptitude install gdebi-core
Okay, so that took 10 minutes, of which 8.5 spent downloading (283 MB of at 553 kB/s). As you can see, I'm not much cheating: I forgot to prepare the locale, and it shows. In real life, it'd a better idea to install and prepare the PostgreSQL cluster, maybe with a precise version from apt.postgresql.org, but anyway, I followed the instructions (no need to repeat them here) and remade the cluster with the french locale.
From now, we'll operate fully offline, so let's kill connectivity:root@aohdeb:~# ifdown venet0 SIOCDELRT: No such process root@aohdeb:~# ping pypi.python.org ping: unknown host pypi.python.org
and install our project:root@aohdeb:/# anybox-odoo-add-instance-user instance Adding system user `instance' (UID 105) ... Adding new user `instance' (UID 105) with group `openerp' ... Creating home directory `/srv/openerp/instance' ... sending incremental file list ./ .buildout/ .buildout/default.cfg sent 274 bytes received 38 bytes 624.00 bytes/sec total size is 147 speedup is 0.47 CREATE ROLE instance NOSUPERUSER CREATEDB NOCREATEROLE INHERIT LOGIN; root@aohdeb:~# time anybox-project_name-deploy --port 8069 --cron-port 8070 instance Running command: ['anybox-odoo-deploy', '--offline', '--bootstrap-setuptools-egg-path', '/opt/anybox/lib/python/setuptools-7.0-py2.7.egg', '--bootstrap-buildout-version', '2.2.5', '--port', '8069', '--cron-port', '8070', '--buildout-install-only-specified-part', 'instance', '/usr/share/anybox/project/project_name-18.104.22.168.2.tar.bz2'] This looks like a brand new deployment has to be made for '/srv/openerp/instance' ! Unpacking /usr/share/anybox/project/project_name-22.214.171.124.2.tar.bz2 Creating symlink '/srv/openerp/instance/current_buildout' to project_name-126.96.36.199.2 Creating subdirectory log/ Creating subdirectory dumps/ BUILDOUT OPERATIONS INFO:anybox-odoo-deploy:offline bootstrap: all conditions fulfilled, now attempting Creating directory '/srv/openerp/instance/project_name-188.8.131.52.2/bin'. Creating directory '/srv/openerp/instance/project_name-184.108.40.206.2/develop-eggs'. Generated script '/srv/openerp/instance/project_name-220.127.116.11.2/bin/buildout'. INFO:anybox-odoo-deploy:offline bootstrap: success Calling buildout on /srv/openerp/instance/aohdeb.cfg anybox.recipe.openerp.base: Created etc/ directory Installing openerp. (..) there are some attemps to check what external find-links have but they are harmless (...): Download error on http://download.gna.org/pychart/: [Errno -2] Name or service not known -- Some packages may not be found! (...) anybox.recipe.openerp.base: 'openerp-command' is a direct soft requirement, retrying without it Generated interpreter '/srv/openerp/instance/current_buildout/bin/python_openerp'. Generated script '/srv/openerp/instance/current_buildout/bin/upgrade_openerp'. Generated script '/srv/openerp/instance/current_buildout/bin/start_openerp'. Generated script '/srv/openerp/instance/current_buildout/bin/nosetests'. Generated script '/srv/openerp/instance/current_buildout/bin/test_openerp'. anybox.recipe.openerp.base: Creating config file: etc/openerp.cfg (..) DATABASE OPERATIONS. DB name: 'instance' Calling create/upgrade script 'bin/upgrade_openerp' Starting upgrade, logging details to /srv/openerp/instance/log/upgrade_create.log at level INFO, and major steps to console at level INFO 2014-11-23 19:42:35,690 INFO Database 'instance' base initialization done. Proceeding further 2014-11-23 19:42:35,691 INFO Read package version: 1.16.9 from /srv/openerp/instance/current_buildout/VERSION.txt 2014-11-23 19:43:21,993 INFO This is a fresh database, load init data (...) skip this project's specific initialization (...) 2014-11-23 19:44:31,190 INFO setting version 1.16.9 in database 2014-11-23 19:44:31,194 INFO Initialization successful. Total time: 129 seconds. real 2m17.912s user 0m37.380s sys 0m2.853s
Is it really better ?
We've seen quite a bit of downloading in the above quoted session, so it doesn't look like so much of an improvement at first sight. The key feature however is that all downloads happen at once, following a standard that is very much widespread and easy to tell your counterparts about.
Here's a non exhaustive list of what can be done from there:
- tell in advance the admin of the target system to go through the APT hooking procedure while provisioning the system. Chances are that the machine has a way to reach on the mainline Debian repositories anyway. In any case, that's presenting a prerequisite that has clear bounds and can be decided by the people that have the network mastery.
- in case where hooking to an Anybox repo is against local policy, simply transmit all the involved packages directly.
- use a tool such as apt-offline
- mount a local CD copy of the involved Anybox repos
- someone really committed could go all the way to create a bootable USB stick.
Build process and infrastructure
By default, anybox-odoo-host makes all buildouts use a common shared eggs directory: /opt/anybox/lib/python , so the idea was to prefill this directory. Because we still wish to allow installation of different instances on a single system, this meant building a Debian package for each egg in each version, since overlapping binary Debian packages are strictly prohibited .
- the exact wished versions of setuptools and zc.buildout are passed in command line arguments
- these versions are already available within the eggs cache.
- there's a working system-wide setuptools for inspection (provided by package dependencies)
So that leaves us with providing the eggs packages and passing the right options to anybox-odoo-deploy.
For the eggs, I wrote a script to be executed in the same python environment as a working buildout. It looks at buildout installed eggs and makes a package for each of them, directly in binary form.
To allow for coexistence of several version, the version must actually be set in the package name
This script also detects whether each egg is pure Python or architecture dependent and builds packages accordingly.
A second script takes the output of the first and the archived tarball to cook the final package. It includes the anybox-project_name-deploy that we've seen in action and is nothing but a rewrapping of anybox-odoo-deploy with the correct options.
Early results where encouraging, I could finally get a project to install offline by copying all the Debian packages over.This gets specified in ~instance/.buildout/defaults.cfg.dpkg will throw an error if a package has a file that is listed in another package that's already installed.Actually anybox-odoo-deploy also has two layers of fallback after this before attempting a bare python bootstrap.py.
Two triggered post-release builders
In Anybox's buildbot, we already have builders to produce the tarballs for all our significant projects. This is actually published as part of our buildbot configurator.
Of course I wanted to add Debian package production steps to them, but that would not have been enough, since the eggs packages can be architecture dependent, and there are at least two processor architectures to support: i386 and amd64. It's not a bit stretch to imagine more, completely foreign ones. Also, it'll be necessary to rebuild them for other Debian versions and derivatives… So we need to run on several build hosts.
Luckily, buildbot has this very flexible notion of triggerable builds, so I could make the tarball build trigger two separate builders, one for each architecture, and have them run on appropriate hosts.
Details of the build
The steps that these builds consist of are:
- grab the tarball
- bootstrap and buildout (actually I've just reused the standard steps collection provided by the configurator and also used by testing builds)
- run the eggs extraction script. I didn't mention it above, but it leverages APT to extract only those packages that are not in pylib.apt.anybox.fr yet.
- upload the produced eggs packages
- include master-side the new packages in pylib.apt.anybox.fr, using reprepro.
- (only for one build) produce the final package and upload it
The whole picture
That makes a lot of infrastructure for an experiment, but that was fun, for some value of "fun".
Quircks and limitations
There are some remaining problems with that approach. Some just need to throw a bit more work at them, but some are deeper:
A lot relies on the script that does the eggs extraction, and there's no a priori closed list of what it must do.
For instance, I had to introduce a special rule for Gunicorn, in order to exclude a Python 3 source file that would break the compilation of bytecode that happens at installation.
- The architecture dependent eggs have to be build for each Debian version or derivative. Typically, the stable distributions guarantee ABI stability of the system libraries during a the distribution version lifetime, but it'll differ for each
There's currently no way to rebuild a Debian package for an egg. This will be necessary if we become serious about using this system.
There's currently no way to require additional system libraries, except by hardcoding in the extraction script. Maybe dh_python could be of use here.
local configuration templates may require additional libraries that the buildbot could not have seen. Typically, if the local buildout template adds the gunicorn option, and if that option is not present in the project's buildout, the deb extractor cannot be aware of it, and we have a broken dependency.
Buildout is a multi-component system. A single buildout configuration can very well consist of an Odoo-based application, alongside a Django one, and a part providing Sphinx to build the documentation — or even something totally foreign to Python, such as a daemon written in Rust.
In full generality, it's impossible to guess what secondary parts aim to provide and how. Neither can one foretell whether they are dispendable or essential for the operation. Therefore, the anybox-project_name-deploy wrapper script above restricts the installation to the Odoo/OpenERP part. This can be overridden.
But, for more complicated setups, prebuilding and shipping a whole system image à la Docker might well be the only viable solution. Nevertheless, these Debian packages might serve well as an intermediary towards image production for various systems.