The Multilib Problem
====================

Olivier Crete
June 27, 2003

======== DRAFT ==========

This document mostly discusses the AMD64 architecture, but the proposed
solution should also work on other architectures (sparc, mips, ppc, etc).

The Problem
-----------

AMD64 (as well as other 64-bit extensions of 32-bit architectures)
requires the same libraries to be compiled and installed once for each
architecture. This causes problems with a source-based distribution like
Gentoo.

Rejected solutions
------------------

- Pure 64 bit system

  A pure 64-bit system would be very nice and will maybe exist one day.
  But for now, Intel is not adopting amd64, so many proprietary
  applications will only be produced for IA-32. Those applications, such
  as Maya, CFX TASCflow, Quake, etc, are vital for many of our users.

- Separate roots

  Another possibility was to have two completely separate roots and to
  chroot into the appropriate one. This would allow for two completely
  independent systems, but would not allow the sharing of configuration
  files (such as resolv.conf). Also note that /etc cannot be bind-mounted,
  because the two systems need separate copies of make.conf and similar
  files.

- Separate /var/db/pkg trees inside the same root

  It might also be possible to have two separate trees in which portage
  stores its information, with both installing into the same root but
  possibly into different directories (for example, keeping only the lib64
  directories from a 64-bit install and everything from the 32-bit
  install). This seems difficult to implement and error prone. Also, in
  the case of libraries which include configuration files or utilities,
  those files would be shared by the 64-bit and 32-bit versions, so both
  would have to be kept at the same version.

Proposed solution
-----------------

First, I'll use the terms "primary architecture" and "alternate
architectures". Each system would have one primary arch (sparc32 for sparc
or x86-64 for AMD Opteron) and one or more alternate archs (sparc64 or
i386).
I think that allowing more than one alternate architecture will some day
become necessary and that we should be ready for it. This will also force
ebuilds not to make assumptions about the primary and alternate
architectures (32->64 or 64->32).

The core of this proposal is to have both the primary and alternate
versions of the "program" reside inside the same "image" (an image being
the content of the /var/tmp/portage/??$/image directory). A program would
then be compiled for the primary architecture and, at the user's choice,
for one or more alternate architectures. Emerge would proceed as normal
until after src_unpack; it would then run src_compile/src_install for each
alternate architecture and then for the primary architecture. The primary
architecture is compiled and installed last to make sure that executable
files (bin/) are for that architecture.

After each src_install is executed for an architecture, the .so files that
have been installed in the lib/ directories would be moved to the
architecture-specific directory, unless that directory is lib/ itself, in
which case they would be moved to another directory (such as lib.temp/,
which would be copied back into lib/ after all iterations are done). This
is done to protect the files installed in one iteration from the files
built in the next.

Changing architecture between iterations would mean changing the CHOST as
well as CFLAGS/CXXFLAGS. Users could specify different C(XX)FLAGS in
make.conf for each architecture. I'm not sure whether ARCH should be
changed or not. The lib suffix (64 for lib64) would also be configurable,
so the content of lib/ would not always belong to the primary
architecture. On an AMD64 system, the primary arch would be in lib64.

Each package installed for more than the primary architecture would be
flagged as such in /var/db/pkg and would be upgraded for all architectures
for which it is installed when updates are available.
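
As a rough illustration of the iteration order and the lib staging
described above, here is a minimal Python sketch. It is not Portage code:
ARCH_SETTINGS, build_order, stage_libs, restore_parked_libs and
build_image are hypothetical names, the install argument stands in for an
ebuild's src_compile/src_install, the CHOST and CFLAGS values are only
placeholders, and a single top-level lib/ directory is assumed.

```python
import os
import shutil
import tempfile

# Hypothetical per-architecture settings; the keys and values are
# illustrative and do not correspond to an existing make.conf syntax.
ARCH_SETTINGS = {
    "x86_64": {"CHOST": "x86_64-pc-linux-gnu", "CFLAGS": "-O2", "libdir": "lib64"},
    "i386": {"CHOST": "i386-pc-linux-gnu", "CFLAGS": "-O2", "libdir": "lib"},
}


def build_order(primary, alternates):
    """Alternates first, primary last, so bin/ ends up holding primary binaries."""
    return list(alternates) + [primary]


def stage_libs(image, libdir):
    """Move freshly installed .so files out of lib/ before the next iteration."""
    src = os.path.join(image, "lib")
    if not os.path.isdir(src):
        return
    # An arch whose own directory is lib/ is parked in lib.temp/ for now.
    dst = os.path.join(image, "lib.temp" if libdir == "lib" else libdir)
    os.makedirs(dst, exist_ok=True)
    for name in sorted(os.listdir(src)):
        path = os.path.join(src, name)
        if os.path.isfile(path) and ".so" in name:
            shutil.move(path, os.path.join(dst, name))


def restore_parked_libs(image):
    """Put the contents of lib.temp/ back into lib/ once all iterations are done."""
    tmp = os.path.join(image, "lib.temp")
    if not os.path.isdir(tmp):
        return
    lib = os.path.join(image, "lib")
    os.makedirs(lib, exist_ok=True)
    for name in os.listdir(tmp):
        shutil.move(os.path.join(tmp, name), os.path.join(lib, name))
    os.rmdir(tmp)


def build_image(image, primary, alternates, install):
    """Run the install step once per architecture into one shared image."""
    for arch in build_order(primary, alternates):
        # install() stands in for src_compile/src_install with this arch's settings.
        install(image, arch, ARCH_SETTINGS[arch])
        stage_libs(image, ARCH_SETTINGS[arch]["libdir"])
    restore_parked_libs(image)


def demo():
    """Run the loop with a fake install step and report which arch owns each file."""
    def fake_install(image, arch, settings):
        for rel in ("lib/libfoo.so", "bin/foo"):
            path = os.path.join(image, rel)
            os.makedirs(os.path.dirname(path), exist_ok=True)
            with open(path, "w") as f:
                f.write(arch)

    with tempfile.TemporaryDirectory() as image:
        build_image(image, "x86_64", ["i386"], fake_install)
        result = {}
        for rel in ("lib/libfoo.so", "lib64/libfoo.so", "bin/foo"):
            with open(os.path.join(image, rel)) as f:
                result[rel] = f.read()
        return result
```

In this sketch demo() builds a fake package for i386 and then x86_64. The
i386 library is parked in lib.temp/ while the x86_64 build runs, so the two
copies of libfoo.so never overwrite each other, and bin/foo is left over
from the last (primary) iteration.
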
The dependency tree would also be aware of architectures and would force
recompiling a dependency if it has not been compiled for the appropriate
alternate architecture. Portage might need to build separate dependency
trees for the various architectures.

The alternate src_compile/src_install would be overridable with
src_alt_compile/src_alt_install so that ebuild maintainers can do
architecture-specific things if necessary. By default, they would just
call the primary version.

The munging inside the lib directories would be done in the following way:
first, a list of directories where libs could be would be built from
ld.so.conf, any file in ${D}/etc/env.d and the defaults (/lib /usr/lib
/usr/X11R6/lib). Then, for each of those directories, each file would be
scanned and, if it is a library, its arch would be detected and it would
be moved to the "appropriate" directory. Moving all libraries without
detecting their arch is also an option, but I'm not sure whether it might
break things. Moving the whole directory is not possible, because some
non-library files might be installed in subdirectories (like mozilla), and
in any case ld does not look into subdirectories, so working on the first
level would be sufficient.

GNU ld scripts will be problematic, but on my system such scripts appear
in only two cases: glibc, whose install scripts already handle multilib,
and custom gentoo scripts for some files moved to /lib (readline, pam,
ncurses, etc). It might be possible to modify other ld scripts with sed.
Symlinks which are not auto-generated by ldconfig are also a problem. I do
not have a magic solution for them; an addition to src_alt_install might
be required for each offending ebuild.

We might also need a flag for no-arch ebuilds which would not depend on
architecture-specific features (purely scripted applications, java
applications, etc).
Such a flag might also be needed for applications that are already
multilib aware and would compile themselves twice, or that come with
binaries for both architectures, but I do not know of any such application
or library for now.
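
The library munging step described earlier (collect candidate directories,
scan their first level, detect each library's arch, move it) could be
sketched in Python as follows. This is only one possible approach under
stated assumptions: arch detection reads the e_machine field of the ELF
header, env.d files are assumed to list directories in an LDPATH variable,
"include" lines in ld.so.conf are ignored, and only directories named lib
are renamed. All function names are hypothetical, and the sketch only
plans the moves rather than performing them.

```python
import os
import struct
import tempfile

# Default search directories named in the text above.
DEFAULT_DIRS = ["/lib", "/usr/lib", "/usr/X11R6/lib"]

# e_machine values from the ELF specification (a small subset).
ELF_MACHINES = {2: "sparc", 3: "i386", 43: "sparc64", 62: "x86_64"}


def elf_arch(path):
    """Return the architecture of an ELF file, or None for anything else."""
    with open(path, "rb") as f:
        header = f.read(20)
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return None  # ld scripts and other text files end up here
    # EI_DATA (byte 5) gives the byte order: 1 = little endian, 2 = big endian.
    fmt = "<H" if header[5] == 1 else ">H"
    (machine,) = struct.unpack(fmt, header[18:20])
    return ELF_MACHINES.get(machine)


def library_dirs(ld_so_conf, env_d_files):
    """Collect candidate directories from ld.so.conf text, env.d texts and defaults."""
    found = []
    for line in ld_so_conf.splitlines():
        line = line.split("#", 1)[0].strip()
        if line and not line.startswith("include"):
            found.extend(line.replace(":", " ").replace(",", " ").split())
    for text in env_d_files:
        for line in text.splitlines():
            if line.strip().startswith("LDPATH="):  # assumed variable name
                value = line.strip()[len("LDPATH="):].strip("\"'")
                found.extend(p for p in value.split(":") if p)
    ordered = []
    for d in found + DEFAULT_DIRS:
        d = os.path.normpath(d)
        if d not in ordered:
            ordered.append(d)
    return ordered


def plan_moves(image, dirs, libdir_for_arch):
    """List (source, destination) pairs for first-level libraries in the image."""
    moves = []
    for d in dirs:
        if os.path.basename(d) != "lib":
            continue  # this sketch only renames directories called lib
        full = os.path.join(image, d.lstrip("/"))
        if not os.path.isdir(full):
            continue
        for name in sorted(os.listdir(full)):
            path = os.path.join(full, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # symlinks and subdirectories are left alone
            libdir = libdir_for_arch.get(elf_arch(path))
            if libdir and libdir != "lib":
                dest = os.path.join(os.path.dirname(full), libdir, name)
                moves.append((path, dest))
    return moves


def demo():
    """Build a tiny fake image and return the planned moves, relative to it."""
    with tempfile.TemporaryDirectory() as image:
        libdir = os.path.join(image, "usr", "lib")
        os.makedirs(libdir)
        # 16-byte e_ident (64-bit, little endian), then e_type and e_machine.
        fake = b"\x7fELF\x02\x01\x01" + b"\0" * 9 + struct.pack("<HH", 3, 62)
        with open(os.path.join(libdir, "libfoo.so.1"), "wb") as f:
            f.write(fake)
        with open(os.path.join(libdir, "libbar.so"), "w") as f:
            f.write("GROUP ( /lib/libbar.so.1 )\n")  # an ld script, not ELF
        dirs = library_dirs("/usr/local/lib\n", ['LDPATH="/usr/lib"'])
        moves = plan_moves(image, dirs, {"x86_64": "lib64", "i386": "lib"})
        return [(os.path.relpath(s, image), os.path.relpath(d, image))
                for s, d in moves]
```

Because detection relies on the ELF magic number, GNU ld scripts and other
text files are simply skipped here, which matches the observation that
they need separate handling. Symlinks are skipped as well, so they would
still have to be fixed up by hand or in src_alt_install.
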