This is a proposal for a new, distributed ebuild processing system. In short, the idea is to add a new "most-minor" version tag to the ebuild name (just before the .ebuild part) that reflects the real-life status of an ebuild. This will make it possible to automate the inclusion of newly submitted ebuilds into the portage tree in a controllable (and secure) manner. Below I describe the proposal in depth, starting with the motivation, based on an overview of the present situation, followed by the user's and maintainer's views of the proposed system, and concluding with a discussion of the security of such a setup.
Gentoo is undeniably gaining popularity. Mailing list numbers show no fewer
than 1000 actively participating users. In fact, new mailing lists are being
created in order to make life on the existing ones bearable :-).
Now, with the 1.0 release out, it is a good time to look at the
design principles that went into this distribution in order to pinpoint and
address its limitations. To have a basis for such an analysis, let's take a look at
the present system, specifically at how ebuilds get accepted and processed.
Now, the obligatory "Why I am doing this":
I consider Gentoo to be the superior source-based distribution (from the
distribution developer's perspective) for the
reasons outlined below. However, I want to mention the main reason: it has the
potential to become a "unifying base" for other distributions, not by design but
by evolution.
Gentoo Linux is a source-based distribution oriented toward developers. Very
importantly, users are treated as developers: everybody is encouraged to
participate in distribution development by reviewing system scripts and ebuilds
and by submitting their own (ebuilds and patches). No less important (for a
growing user base) is its ease of use, maintenance and (provided good
install docs) setup, which makes it popular even among regular Linux users.
Portage plays an important role in bringing more people to this distribution by
providing easy and powerful control over what gets built and
installed on the system, and how. The core of the portage database (the data, as
opposed to the scripts that control its behavior) is a collection of package
directories containing
ebuild scripts and supplementary files. You can find a description of the system
on the Gentoo website. Here I want to address
the other side of the system: how ebuilds get submitted, accepted and processed.
The ebuild submission process is described in detail in its
howto.
In short, people who want to contribute are required to create an
account at bugs.gentoo.org.
Then they can submit their ebuilds via the bugzilla interface (however, see the not yet
resolved bugs
#71, 1038, 1195). After
the ebuild is submitted it gets assigned a maintainer. Later
it gets reviewed by people from the core group (either the assigned maintainer
or somebody who has time and happens upon the submission), who incorporate
it into CVS. A new package might first appear in incoming/, or,
especially for system/core packages and updates, it can go directly into its
proper place under /usr/portage. This is when it becomes visible to the majority
of users and when everybody can install it.
This is a very reasonable ebuild processing system; however, it has its
limitations.
So, what are the indications that we are hitting these limitations?
One observation is the increased number of requests posted on the mailing lists
for the URL where one ebuild or another (which apparently has not made it
to CVS yet) can be found. In addition, ebuild announcements are occasionally made
on the mailing lists instead of, or in addition to, the official way. My position on such
matters is that when submissions are made in ways alternative to the accepted
procedure, it is time to reconsider the procedure.
Another observation is that there are bugs
that get sufficient attention on the user side and are even resolved, yet the
proposed solutions do not make it into the ebuilds in CVS for a long time. I have
been involved in one such situation, a "too-new-tool" problem with gnucash.
The problem had been resolved for about 3 months (January-March) with a
solution that requires two-line modifications to the gnucash and guppy ebuilds.
However, users trying this program were still occasionally complaining. The bug
still has "assigned" status, while the issue was largely resolved with an
updated version of gnucash and a new ebuild (Thanks Achim and Seemant!). Also, my personal
experience with ebuild submissions is that it is possible to wait for one or
two weeks before a submission gets attention.
Now, please do not take this as any kind of offence. There clearly are good
reasons behind such delays in response. All these examples involve non-critical
user-side packages. I am sure the core developers are very busy updating the system and
incorporating urgent security updates (there were a few this month and last),
especially in view of the recently announced ebuild freeze (my examples are actually
from the time before the announcement). However, these are exactly the situations
that show that central review is becoming a bottleneck.
And these observations are only the tip of the iceberg. Now, with tools
like mkebuild or ebuilder available, we can expect many more
users to start contributing new ebuilds.
Thus the idea of this proposal is to design a system that off-loads the
processing of all non-critical stuff from the core team and lets them concentrate on
what is most important.
Below I propose an ebuild processing system (no longer limited
to submission and review only) that avoids the mentioned
bottleneck on the one hand, while letting users maintain a secure and stable
system and letting the core
group (which can remain relatively small) keep control over what is happening
with the distribution.
On a somewhat unrelated note: I think many people (from the open source/free
software world at least) agree that the key element of Linux's success is the
ease of reusing what was done by other people. Essential to this is that
every developer not only solves a particular problem for himself but also
propagates his solution to the maintainers of the related project. Gentoo was
designed around just that idea: to automate as much as possible of the repetitive
work normally done manually by system administrators, and thus to allow everybody
using it to benefit from already created install scripts (ebuilds).
Interestingly, this design not only allows for such reuse, it in fact silently
pushes people towards sharing their work. If you want to avoid maintaining your
own portage tree, which you have to do if you have a few ebuilds not included in
the official Gentoo distribution, then you will have to submit your work to the central
repository so that others can reuse it.
This is excellent, and just what most of us want. However, this also means that
the increased load from the large number of submitted ebuilds is not an
occasional and temporary situation; it is likely to persist (and
actually grow) as Gentoo gains more and more users.
Central to this proposal is the concept of ebuild stability levels, which are to
be introduced in order to control ebuild flow. The stability level (I will
often use the word "status" instead) can be
considered a new "minor-minor" tag and should be represented in the ebuild
name. An appropriate place seems to be a one-word suffix right before the .ebuild
part. In this way it naturally extends the present ebuild notation. For example,
vim-6.0-r5.ebuild (present notation) would become
vim-6.0-r5-status.ebuild, where status would be one of: new, confirmed,
unstable, approved, core.
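To make the naming scheme concrete, here is a minimal Python sketch of how a tool might split such a file name into its parts. The function name parse_ebuild_name and the regular expression are my own illustrative assumptions, not part of portage; the pattern only covers the simple name-version[-rN][-status].ebuild shape used in the example above.

```python
import re

# The five status suffixes listed in the proposal.
STATUSES = ("new", "confirmed", "unstable", "approved", "core")

# Hypothetical pattern: <name>-<version>[-r<revision>][-<status>].ebuild
_EBUILD_RE = re.compile(
    r"^(?P<name>.+?)-(?P<version>\d[^-]*)"
    r"(?:-r(?P<revision>\d+))?"
    r"(?:-(?P<status>" + "|".join(STATUSES) + r"))?"
    r"\.ebuild$"
)


def parse_ebuild_name(filename):
    """Split an ebuild file name into name, version, revision and status.

    Missing optional parts (revision, status) are returned as None.
    """
    match = _EBUILD_RE.match(filename)
    if match is None:
        raise ValueError("not a recognised ebuild name: %r" % filename)
    return match.groupdict()


if __name__ == "__main__":
    print(parse_ebuild_name("vim-6.0-r5-approved.ebuild"))
    print(parse_ebuild_name("vim-6.0-r5.ebuild"))
```

A name in the present notation simply yields a status of None, so a tool could decide separately how to treat ebuilds that carry no suffix yet.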
The second important provision is that all new
submissions are pushed to CVS immediately, so that they become available as soon
as the user does emerge rsync. Users are encouraged to vote for ebuilds
they like to use or want to see repaired or blocked. These votes serve as the
basis for ebuild status changes (see vote0.html
for possible vote system implementations, but please read through this
chapter first).
High-level scripts (such as emerge and the emerge-based utils from gentool and
gentoolkit) will have to be modified to accommodate this and treat ebuilds
differently depending on their status. New flags are introduced for use in
make.conf (as well as make.defaults and make.globals, as at present), which
define the default emerge behavior.
Now the same thing over again, but slowly and in detail.
As soon as a new ebuild is submitted it gets the "new" status (plus the appropriate
suffix in the name) and is automatically incorporated into the portage tree
(see the "server side
perspective" part for a discussion of CVS security). Users are able to see and
try the submission upon doing
emerge rsync.
Users can and are encouraged to vote for ebuilds:
The last ebuild stability level in this basic structure is "core". It is reserved
for "system" packages, such as system libraries and other stuff that is
critical for system functionality. All updates and patches shall go through the core
developer group and are the sole exception to the immediate availability rule.
For stricter security there may be an additional "core_new" status, which gets
promoted directly to "core". In direct analogy to "new" packages, such an ebuild shall
accumulate a threshold positive vote value before the "new" tag is cleared. It may
be beneficial to have different promotion thresholds for the "new" vs.
"approved_new" vs. "core_new" levels.
All "*_new" ebuilds shall be treated equivalently to "new" by emerge and other
high-level tools. As such they do not need three different suffixes and can all
use the "new" suffix instead. For security reasons an internal list of core and
approved packages should be maintained anyway.
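The per-level thresholds can be sketched in a few lines of Python. Everything numeric here is made up for illustration: the proposal does not fix any threshold values, and the mapping of "approved_new" to "approved" is my assumption by analogy with "core_new" going to "core" and "new" going to "confirmed". The function names are hypothetical as well.

```python
# Hypothetical threshold values; the proposal only says they may differ.
PROMOTION_THRESHOLDS = {"new": 10, "approved_new": 25, "core_new": 50}

# Status an ebuild reaches once its "new" tag is cleared.
# "approved_new" -> "approved" is assumed by analogy with the other two.
PROMOTED_STATUS = {
    "new": "confirmed",
    "approved_new": "approved",
    "core_new": "core",
}


def status_after_votes(internal_status, vote_score):
    """Return the internal status after taking the vote score into account."""
    threshold = PROMOTION_THRESHOLDS.get(internal_status)
    if threshold is None or vote_score < threshold:
        # Not a "*_new" level, or not enough positive votes yet.
        return internal_status
    return PROMOTED_STATUS[internal_status]


def filename_suffix(internal_status):
    """All "*_new" levels share the plain "new" suffix in the ebuild name."""
    if internal_status == "new" or internal_status.endswith("_new"):
        return "new"
    return internal_status


if __name__ == "__main__":
    print(status_after_votes("core_new", 12), filename_suffix("core_new"))
    print(status_after_votes("core_new", 50))
```

Because the file name only ever shows "new", the distinction between the three "*_new" levels has to come from the internal list of core and approved packages mentioned above.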
However, simply allowing everybody to see everything will
quickly make things unbearable as well as kill any stability. Therefore these
ebuild stability levels are to be used not just to denote ebuild status,
but also as a reference for emerge as to how to process ebuilds. By default (no
options specified) emerge should only see and install packages with the
corresponding status level. That is, emerge category/package shall only
install (and see, if for example emerge --world update was called)
packages of the default stability level and higher (approved and core typically). In
order to see packages of a lower stability level the user will need to specify the
appropriate --enable-status flag. For example, to let emerge see
confirmed packages the user will need to invoke it as emerge --enable-confirmed
category/package.
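The visibility rule could look roughly like the following Python sketch. It rests on two assumptions of mine: that the levels are ordered from least to most stable in the order they were listed (new, confirmed, unstable, approved, core), and that an --enable-status flag adds exactly that one status rather than everything above it. The function names are hypothetical.

```python
# Assumed ordering, least stable first, taken from the order in which
# the proposal lists the levels.
LEVELS = ("new", "confirmed", "unstable", "approved", "core")


def visible_statuses(default_level="approved", enabled=()):
    """Return the set of statuses emerge would see.

    default_level: the default stability level (and everything above it).
    enabled: statuses switched on explicitly, e.g. ("confirmed",) for
             a hypothetical `emerge --enable-confirmed` call.
    """
    floor = LEVELS.index(default_level)
    visible = set(LEVELS[floor:])
    for status in enabled:
        if status not in LEVELS:
            raise ValueError("unknown status: %r" % status)
        visible.add(status)
    return visible


def is_visible(status, default_level="approved", enabled=()):
    """True if an ebuild with this status would be listed and installable."""
    return status in visible_statuses(default_level, enabled)


if __name__ == "__main__":
    print(sorted(visible_statuses()))
    print(is_visible("confirmed"))
    print(is_visible("confirmed", enabled=("confirmed",)))
```

If enabling a level should also pull in everything more stable than it, the loop would add the whole tail of LEVELS instead of a single entry.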
In order to set the default status, two flags in make.conf (and the corresponding
options to emerge) should be used:
| Stability_Level | General stability level of ebuilds accepted for installation. emerge should consult this flag when processing requests to list and install packages. If omitted, it should default to "approved", thus providing a distribution with a stability level corresponding to the present Gentoo system. |
| rsync_Stability_Level | Defines the stability level of ebuilds that will be checked out upon rsync. It should default to Stability_Level if omitted or if set stricter than Stability_Level. If this flag has a more relaxed setting than Stability_Level (for example approved and All), the user will be able to enjoy a stable default setup (non-approved packages are not even shown by emerge and other high-level utilities!) while retaining the ability to see what's in stock, either by looking at what's under /usr/portage or by specifying the --enable-status flag (same as above). |
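The defaulting rule for the two flags can be sketched as follows. This is only an illustration: the dictionary stands in for parsed make.conf settings, resolve_levels is a hypothetical helper, the level ordering is assumed to follow the order in which the levels were listed, and "All" is assumed to mean "no filtering at all".

```python
# Assumed ordering, least stable first.
LEVELS = ("new", "confirmed", "unstable", "approved", "core")


def _rank(level):
    """Position of a level in LEVELS; "All" is assumed to be the most relaxed."""
    name = level.lower()
    if name == "all":
        return 0
    return LEVELS.index(name)


def resolve_levels(conf):
    """Return (install_level, rsync_level) from make.conf-style settings.

    conf is a plain dict standing in for the parsed configuration.
    """
    install = conf.get("Stability_Level", "approved")
    sync = conf.get("rsync_Stability_Level")
    # Fall back to Stability_Level if the rsync flag is missing or stricter.
    if sync is None or _rank(sync) > _rank(install):
        sync = install
    return install, sync


if __name__ == "__main__":
    print(resolve_levels({}))
    print(resolve_levels({"Stability_Level": "approved",
                          "rsync_Stability_Level": "All"}))
    print(resolve_levels({"Stability_Level": "confirmed",
                          "rsync_Stability_Level": "core"}))
```

With the second configuration the user installs only approved and core packages by default, yet still has every submitted ebuild sitting under /usr/portage for inspection.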
OK, let's finally consider how all this can be done.
Let's follow the ebuild starting from its birth (when it gets released by its
author :-)).
Ebuild submissions can be kept on bugs.gentoo.org, as is done now.
The author should have a bugzilla account, which will serve as his feedback link (if
only that problem with bugzilla not finding attachments
(bug #71) were finally sorted
out!). The procedure seems to be well established and tested, and I don't
see any reason why it should be replaced with something else.
So, there is a robot which immediately pushes new submissions to CVS.
Obviously this requires the attachment mechanism to be working on
bugs.gentoo.org. Alternatively, it should also be rather easy to pick up the
package if the author provides a link to the ebuild.
Two issues immediately come up here:
I am definitely interested in comments. Please do try to beat this down. I will try to play goalkeeper and defend this design :-).
OK, the ebuild has been submitted and has made its way to CVS. What now?
Now it has the "new" status. In order to make its way to the "confirmed" status it
should accumulate votes. See my
thoughts on the voting system for a
discussion of what can be implemented (it's a large topic in itself).
Additionally, core developers can bestow their blessing on one package or another at
their discretion. If a core developer has time to look at new ebuilds but does not
have a personal preference, he can consult an "approval-wanted" list. It seems
natural to maintain this list on bugs.gentoo.org.
It is not completely clear, though, how to deal with the situation when some ebuild does not get much attention but somebody from the core group decides to review it. A parallel situation arises when a new ebuild is submitted by a core developer. I see the following alternatives:
Please take a look at the possible implementations of
the voting system.
Also please check the Summary to refresh your memory
after this long tale :-).
After submitting this proposal to gentoo-dev I received some responses. Links
to the postings and part of the private conversation can be found in the
reflections section.
Email me, post a message to the gentoo-dev mailing list, or use my feedback form.