Maintaining language-specific module package stacks

I posted this to the ubuntu-devel@ mailing list. Comments appreciated. In particular, I’m looking to gather consensus for what Ubuntu should be recommending to users who want to develop on and deploy software using language-community-specific modules.


I'd like to talk about addressing the difficulty in maintenance of long-tail language-specific stacks in Ubuntu. For example, right now `src:rails` is stuck in disco-proposed[1]. It seems to me that we spend a disproportionate amount of effort trying to get this class of package migrated to the release pocket compared to the number of Ubuntu users who actually care about and use them.

I suggest that:

  1. If a language-specific package, or stack of packages, is stuck in proposed, and nobody is volunteering to get them migrated, then we should be more willing to delete them from the release pocket and release without that stack.

  2. We recommend, as a project, that users who wish to use these language stacks directly do so via the language-specific packaging tooling.

Background

It’d be great to get input from language-specific communities who may not be intimately familiar with distribution development process, so here’s a quick summary of what I’m talking about. Those already familiar with distribution development workflow can skip to the next section.

You’re presumably aware that language-specific communities generally have their own package repositories and package managers, such as PyPI/pip, RubyGems.org/gem, npm Registry/npm and so on. My understanding is that these communities generally advise that users consume from these repositories directly.

Debian often packages a subset of these repositories - usually for the purpose of fulfilling the dependency requirement of some higher level component, such as Rails in my example. Ubuntu then autosyncs these. Some users prefer these higher level components to come through their distribution, rather than via a language-specific package manager - presumably for the release management consolidation that this provides. However, it’s my understanding that in practice the majority of users do not consume the distribution packaging for these components, instead using the language-specific package managers directly as generally recommended by those upstreams. It also seems quite common for language communities to specifically recommend against consuming language module packages through the distribution.

In general, distributions include only one version of each component in a given distribution release. Bringing the entire dependency web in line to make this possible can take considerable effort on the part of distribution developers, particularly when upstreams tend to use a large number of small dependencies and also require specific dependency versions due to frequent API breaks.

In practice, the dependency web is more complicated than this, and entire swathes of packages get held up at once until all the dependencies can be resolved together; this takes work in figuring out what the holdups are, making decisions about which versions to ship, and possibly patching code to change what dependency versions are actually required, in order to make everything work together. Think of it like trying to come up with a single requirements.txt (as generated by pip freeze), Gemfile.lock or package-lock.json file that works for all packages across the entire distribution release.
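To make the single-lockfile analogy a little more concrete, here is a minimal sketch in Python. It is an illustration only: the package names, version strings and the `REQUIREMENTS` table are hypothetical, and real dependency resolution in the archive is considerably more involved than a set intersection.

```python
from typing import Dict, Set

# Hypothetical data: for each package, the versions of each library
# that it is known to work with. None of these names are real.
REQUIREMENTS: Dict[str, Dict[str, Set[str]]] = {
    "app-a": {"libfoo": {"1.0", "2.0"}, "libbar": {"3.1"}},
    "app-b": {"libfoo": {"2.0"}, "libbar": {"3.0", "3.1"}},
    "app-c": {"libfoo": {"1.0"}},
}


def common_versions(
    requirements: Dict[str, Dict[str, Set[str]]]
) -> Dict[str, Set[str]]:
    """Intersect the acceptable versions of every library across all packages.

    An empty set for a library means no single version satisfies everyone,
    so something has to be patched, updated or dropped.
    """
    acceptable: Dict[str, Set[str]] = {}
    for deps in requirements.values():
        for lib, versions in deps.items():
            if lib in acceptable:
                acceptable[lib] = acceptable[lib] & versions
            else:
                acceptable[lib] = set(versions)
    return acceptable


if __name__ == "__main__":
    for lib, versions in sorted(common_versions(REQUIREMENTS).items()):
        status = ", ".join(sorted(versions)) if versions else "no common version"
        print(f"{lib}: {status}")
```

In this made-up example `libbar` can be shipped at a single version, but `libfoo` cannot, because `app-b` and `app-c` accept disjoint versions. That is the kind of conflict that holds a whole set of packages in the staging area.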

Until this is resolved, distribution package updates are held in a staging area, and won’t be part of the next distribution release. In Ubuntu we say that our packages are “stuck in proposed”. Once resolved, the packages “migrate” to the “release pocket”.

The details of this process in Ubuntu are documented here: https://wiki.ubuntu.com/ProposedMigration

For example, we are blocked from shipping a newer version of the PHP interpreter until the wordpress package (which is written in PHP) works with the proposed newer PHP version. Now consider that there are hundreds of these reverse dependency packages like wordpress, including many PHP language modules. The addition of each one makes it a little harder for Ubuntu to move on updating PHP itself. My suggestion is that we more readily remove these reverse dependencies from the distribution release to free up the update of PHP itself in this example, rather than spend what seems like a disproportionate amount of effort fixing things that we suspect very few users actually care about.

Discussion

Part of the purpose of my suggestion is to make it far easier to transition to new language interpreters without worrying too much about the very long tail of barely used reverse dependency language modules that usually hold up these transitions. My understanding is that far more users rely on the distribution supplying the language interpreter and language-specific package manager than the modules themselves. In case of a transition being held up like this, I’m proposing to simply permit the deletion of the long tail from the release pocket, get the newer interpreter stack migrated into the release pocket, and consider it done. The long tail will then migrate if maintained actively by others, and if it isn’t, we’ll ship without it.

Right now for example, I’m suggesting that we simply delete src:rails from the release pocket, including its reverse dependencies, unless the reverse dependency list contains something outside the Rails stack that is unacceptable to us to delete. Rails users who use gem install rails, as recommended by Rails upstream[2], will not be affected.
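As a rough sketch of the check I have in mind, the following Python walks a reverse dependency map to find everything that would go along with a removal, and flags anything we have decided we must keep. The map, the package names and the `must_keep` set are all hypothetical; this is not how the archive tooling works, only an illustration of the decision.

```python
from collections import deque
from typing import Dict, Iterable, List, Set

# Hypothetical map: package -> packages that depend on it directly.
REVERSE_DEPENDS: Dict[str, List[str]] = {
    "framework": ["framework-plugin", "webapp"],
    "framework-plugin": ["webapp"],
    "webapp": [],
    "general-tool": [],
}


def removal_set(package: str, reverse_depends: Dict[str, List[str]]) -> Set[str]:
    """Return the package plus everything that transitively depends on it."""
    seen = {package}
    queue = deque([package])
    while queue:
        current = queue.popleft()
        for rdep in reverse_depends.get(current, []):
            if rdep not in seen:
                seen.add(rdep)
                queue.append(rdep)
    return seen


def unacceptable_removals(
    package: str,
    reverse_depends: Dict[str, List[str]],
    must_keep: Iterable[str],
) -> Set[str]:
    """Return the packages in the removal set that we are not willing to lose."""
    return removal_set(package, reverse_depends) & set(must_keep)


if __name__ == "__main__":
    print(sorted(removal_set("framework", REVERSE_DEPENDS)))
    print(sorted(unacceptable_removals("framework", REVERSE_DEPENDS, {"general-tool"})))
```

If the second result is empty, nothing outside the stack would be lost and the removal can go ahead; otherwise the flagged packages need a closer look first.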

My suggestion deliberately leaves the door open for able volunteers to be able to maintain these packages in Ubuntu if they wish.

One downside to my suggestion is that availability of particular language module packages may become unreliable between Ubuntu releases from a user’s perspective (if a particular package skips an Ubuntu release before being restored, for example). I suggest that this can be tackled later if it becomes a problem, for example by blacklisting such packages for longer unless a team is prepared to commit to preventing this from happening.

Another potential downside is in packages that are generally useful to users outside their own language ecosystems, yet depend on these language-specific dependency stacks. For example, take Vagrant. Vagrant is written in Ruby, so needs packages originating from RubyGems.org. I’m sure there are plenty of users who aren’t deploying a Ruby-based stack but who do use Vagrant. I think that a significant number of these users probably prefer to consume Vagrant from the distribution package (apt install vagrant) rather than from RubyGems.org (gem install vagrant)[3]. Because distributions include all of their dependencies in their own repositories, this means that to ship Vagrant as a package in Ubuntu, we also need to ship a package for everything in RubyGems.org that Vagrant requires. I suggest that we don’t apply my deletion policy to such packages and their dependencies - that we continue trying to maintain them on a best effort basis as we do today.

If we do change anything in this regard, I think an important part of this is for us to decide as a project what affected users can expect, what we recommend that they do, and that we communicate this clearly. I suggest that, in the absence of Ubuntu development teams volunteering to maintain this class of packages in Ubuntu, we generally follow upstream’s recommendations in using their language-specific package management stack rather than apt/dpkg. This isn’t all that different from the situation today, where users can’t rely on us having packaged the specific modules that they need at particular versions anyway.

We might want to be more specific in our recommendations to users. For example, it generally is better for system stability, particularly when installing software from third party sources, for third party software to be confined well. Where such a system is available, such as virtualenv for Python stacks, we could specifically recommend its use, and recommend against installing to the “system”. I note that Ruby is available from upstream as a snap, so if suitable for general deployments that might be the gold standard for recommended confinement; if not, then at least rvm.
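For the Python case, a minimal sketch of what keeping modules out of the “system” looks like is below. Note that I'm using the `venv` module from the Python standard library here rather than the separate virtualenv tool, simply to keep the example self-contained; the function name and the temporary target directory are my own choices for illustration.

```python
import tempfile
import venv
from pathlib import Path


def create_isolated_env(target: Path) -> Path:
    """Create a virtual environment at `target` and return its pyvenv.cfg path.

    with_pip=False keeps this sketch independent of ensurepip being
    available; a real deployment would normally want with_pip=True so that
    pip can install modules into the environment.
    """
    venv.create(target, with_pip=False)
    return target / "pyvenv.cfg"


if __name__ == "__main__":
    # Assumption: a throwaway directory is used purely for demonstration.
    with tempfile.TemporaryDirectory() as tmp:
        cfg = create_isolated_env(Path(tmp) / "env")
        print(f"environment created: {cfg.exists()}")
```

Modules installed from PyPI into such an environment live under its own directory, so they neither replace nor depend on the versions that the distribution packages happen to ship.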

Please discuss. I intend to draw this thread to the attention of language-specific communities too, to try and get their input. In particular it’d be great to agree on the same recommendations for Ubuntu users. Note that this list is moderated; I propose to permit emails from language-specific communities in response to this thread for a few weeks to avoid fragmenting discussion between here and the unmoderated list. I’ll make sure replies get through moderation promptly. Don’t worry about not being subscribed: just replying to the list will be fine.

Thanks,

Robie

[1] http://people.canonical.com/~ubuntu-archive/proposed-migration/update_excuses.html#rails

[2] https://guides.rubyonrails.org/getting_started.html#installing-rails

[3] I note that Vagrant upstream specifically advises against apt install vagrant from distributions[4]. However, unlike the typical direct users of language-specific stacks on whom I’m basing my suggestion, my understanding is that many of those who use Vagrant but aren’t Ruby developers prefer the distribution package regardless. It makes sense to me that a non-Ruby-developer wouldn’t want to learn, use and maintain an entirely different package manager, or add an additional third party software source, just for one top level package that the distribution ships anyway, unless they specifically need a version newer than the one provided by the distribution release.

[4] https://www.vagrantup.com/docs/installation/
