|
Native (automatic) Multilib for Portage | |||
|---|---|---|---|
| Recent | |||
|
Relevant IRC discussion with kugelfang, blubb, ferringb, and geoman. | |||
| Introduction | |||
|
Many moons ago there was a bug (75420) that proposed adding real multilib support to portage. The issues in gcc and libtool that blocked it have been fixed since the bug was closed. I am also using this functionality daily on real-life systems, and it works. This is an attempt to resurrect that idea, and I have filed bug 145737 to restart the discussion. Besides the bug and the IRC discussion, two other documents provide background: Eradicator's original multilib notes and blubb's GLEP draft for multilib. | |||
| Basic functionality | |||
|
I treat the installation of non-default ABI libraries and headers as equivalent to having a USE flag. This is essentially what the toolchain packages (glibc, gcc, libstdc++-v3) have used the "multilib" USE flag to mean. Currently, when you emerge a package, portage installs and merges all executables, libraries, headers, and config files for the default ABI of the system. With native multilib support, if the "multilib" USE flag is set and the package supports multilib (IUSE="multilib"), portage first processes each non-default ABI up to the install phase, then processes the default ABI up to the install phase, and finally merges all the installed files to the file system. Because the default ABI is processed last, it overwrites any files with the same path installed by the alternate ABIs; files from the alternate ABIs that have different paths (or names) remain and are merged. Library files land in a different location for each ABI because the $(get_libdir) function is ABI sensitive. The net result is that you get non-ABI-specific files for the default ABI, and ABI-specific files for each ABI that is requested. | |||
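The overwrite behaviour described above can be modelled in a few lines of Python. This is only a sketch, not portage code: the lib64/lib32 directory names, the package "foo", and the helper names `install_image` and `merged_image` are assumptions made up for illustration.

```python
# Assumed example values; real profiles define the libdir for each ABI.
LIBDIR = {"amd64": "lib64", "x86": "lib32"}


def get_libdir(abi):
    """Model of the ABI-sensitive $(get_libdir) helper."""
    return LIBDIR[abi]


def install_image(abi):
    """Hypothetical install image for one ABI: path -> ABI that produced it."""
    return {
        "/usr/bin/foo": abi,                            # same path for every ABI
        "/usr/" + get_libdir(abi) + "/libfoo.so": abi,  # path differs per ABI
    }


def merged_image(abis, default_abi):
    """Install the non-default ABIs first, then the default ABI, into one image."""
    order = [a for a in abis if a != default_abi] + [default_abi]
    image = {}
    for abi in order:
        image.update(install_image(abi))  # later installs overwrite equal paths
    return image


image = merged_image(["amd64", "x86"], default_abi="amd64")
for path in sorted(image):
    print(path, "<-", image[path])
```

In this model the executable ends up coming from the default ABI, while each ABI keeps its own copy of the library under its own libdir.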
| Rationale | |||
|
Combined package install vs. slotted ABI approach: One approach to native multilib support in portage is to make an "ABI slot" for each ABI of a package that is installed. However, I don't think this is the best approach. The main advantages I see in the slotted ABI approach are that alternate ABIs can be treated as separate units for dependencies, and that a package which already has one ABI installed doesn't need that ABI to be re-installed. However, ABI slotting with correct dependency handling and correct handling of files that are owned by multiple ABI slots is complex.
The biggest downside of having separately installed "atoms" for each ABI is the risk of version skew between the ABIs. If a user has one ABI of a library installed and later installs another ABI of the same library, there is a very good chance that the two will be different versions. There are also two more subtle, and potentially more problematic, version skew scenarios. First, the ebuild for the library or (more likely) the eclasses that it uses may have changed between the ABI installs without the ebuild version being bumped. Second, the user may have installed other packages or changed their system in some way (CFLAGS, for example), so that the build process of the new ABI (especially the configure checks) differs from that of the original ABI. In both cases the new ABI's files appear to have the same configuration as the original ABI's files, but they do not.
I think there are several advantages to treating alternate-ABI libraries and headers as an additional feature or option to activate for a package. First, the impact on portage is smaller, and there is almost no change to the code path if multilib is not being used. Second, /var/db/pkg and the code that handles it do not require restructuring to support a new form of slotting (ABI slotting). And there is the "show me the code" factor: I'm using it right now to build systems that I use in real life.
ABI+gcc-config wrapper vs. all ABI info in CHOST: My current method uses the existing ABI support in gcc-config, which reads certain environment variables to determine which ABI mode the compiler should run in. Another option for changing the compiler output during the build of each ABI is to have a different CHOST/compiler executable for each ABI. In my opinion the ABI-in-CHOST route is less well-trodden ground than the ABI+gcc-config approach. On MIPS (and maybe some other architectures) it requires non-standard modifications of the CHOST GNU configuration names, because ABI information isn't guaranteed to be fully encoded in CHOST. For example, right now mips64el-unknown-linux-gnu is the GNU configuration name for both the n32 and n64 ABIs (because from the kernel down they are both 64-bit ABIs). One option is to overload the vendor field. I consider this more "icky" than the ABI+gcc-config approach; more importantly, it is not standard practice, and I'm concerned about subtle problems with it. I think it would result in more struggle against upstream maintainers than the ABI+gcc-config option I'm currently using. That said, I'm less dogmatic about the ABI-in-CHOST approach than I am about the slotted ABI approach. The ABI+gcc-config wrapper has been working well so far, and very few packages have needed to be patched to work with it. | |||
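To illustrate why CHOST alone cannot select the ABI in the MIPS example, here is a small Python model of a wrapper that picks flags from an ABI environment variable. It is a sketch under assumptions: the `-mabi=` values are real gcc MIPS options, but the `CFLAGS_BY_ABI` table and the `wrapped_gcc` function are hypothetical and do not reproduce the actual gcc-config wrapper.

```python
CHOST = "mips64el-unknown-linux-gnu"

# Per-ABI compiler flags, loosely modelled on per-ABI CFLAGS variables.
# The table contents are an assumption for this example.
CFLAGS_BY_ABI = {"n32": "-mabi=n32", "n64": "-mabi=64"}


def wrapped_gcc(chost, env, args):
    """Return the argv a wrapper might exec, adding the flags for env['ABI']."""
    abi = env.get("ABI")
    abi_flags = CFLAGS_BY_ABI.get(abi, "").split()  # no ABI set -> no extra flags
    return [chost + "-gcc"] + abi_flags + list(args)


n32 = wrapped_gcc(CHOST, {"ABI": "n32"}, ["-c", "foo.c"])
n64 = wrapped_gcc(CHOST, {"ABI": "n64"}, ["-c", "foo.c"])
print(n32)
print(n64)
```

Both command lines name the same compiler executable; only the flag injected from the ABI variable differs, which is the information that CHOST by itself does not carry.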
| Basic Usage | |||
|
Requirements:
| |||
| Internal Operation | |||
|
The four portage files that have been modified are:
In ebuild.sh, the functions dyn_unpack, dyn_compile and dyn_install each have a new loop that iterates through each ABI, with the DEFAULT_ABI occurring last. Each of those functions still calls src_unpack, src_compile and src_install, but now calls them once per ABI. Before each call, certain environment variables and the on-disk structure are changed so that the resulting action applies to the ABI currently being processed. Here is the pseudo-code for the original portage process in ebuild.sh:
The code changes to ebuild.sh are fairly minor. The majority of the auto-multilib code, including the functions set_abi and _finalize_abi_install, is in auto-multilib.sh. Most of the code in auto-multilib.sh is contained in _finalize_abi_install and its supporting routines, which are derived from the header processing routines in multilib.eclass. It is also worth noting that the /usr/bin/*-config processing in _finalize_abi_install uses the environment variable ABI_REDIRECT_PROGS to determine whether to create an ABI-aware redirect script in place of the real script. In order to support multilib dependencies I created special USE flags of the form multilib_abis_XXX, where XXX is the ABI to be installed. These USE flags are special in several ways.
The changes to support the multilib_abis_XXX USE flags are in emerge and portage.py. | |||
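The per-phase ABI loop described in this section can be sketched as follows. This is a Python model rather than the bash in ebuild.sh, and it rests on assumptions: the helper names `abis_from_use` and `run_phase` are hypothetical, the environment dictionary merely stands in for whatever set_abi exports, and the default ABI is assumed to be built whether or not it has a flag of its own.

```python
USE_PREFIX = "multilib_abis_"  # flag form named in the text


def abis_from_use(use_flags):
    """Extract ABI names from multilib_abis_XXX flags, preserving their order."""
    return [f[len(USE_PREFIX):] for f in use_flags if f.startswith(USE_PREFIX)]


def run_phase(phase, abis, default_abi, log):
    """Call one phase once per ABI, with the default ABI always last."""
    order = [a for a in abis if a != default_abi] + [default_abi]
    for abi in order:
        env = {"ABI": abi}  # stand-in for the per-ABI environment setup
        log.append((phase, env["ABI"]))  # a real phase would run src_* here


log = []
abis = abis_from_use(["X", "multilib_abis_x86"])
for phase in ("src_unpack", "src_compile", "src_install"):
    run_phase(phase, abis, "amd64", log)

for entry in log:
    print(entry)
```

Each phase runs for x86 first and amd64 last, so the default ABI's results are the ones left in place when paths collide.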
| Current Testing | |||
|
I have built a multilib amd64 system from a stage1 with both ABIs (x86, amd64). Some packages needed patches, mostly because of incorrect ebuild assumptions; they are listed below. I have 294 packages installed, including modular X, and remote X applications seem to work ok. I also have an automated daily script that uses auto-multilib to cross-compile, boot, and test a mips64el system image with the three MIPS ABIs: o32, n32, and n64. This is basically emerge system plus a bunch of other packages (125 total). In addition, I have a Python regression test script that uses pexpect and several test packages in the overlay to verify the native multilib functionality. Overall the multilib changes seem to be pretty robust and flexible. | |||
| Additional Notes | |||
|
gcc-wrapper: The gcc wrapper in gcc-config-1* needs to be patched because it doesn't do the right thing in certain cross-compile situations. The broken case is when ABI is set to the HOST ABI but the BUILD gcc is called: the wrapper still tries to use CFLAGS_${ABI}, which gives the wrong CFLAGS for the native toolchain. The fix is to use CFLAGS_${ABI} only if CHOST matches the compiler's build architecture, and otherwise not to use it.
Notable packages:
| |||
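The gcc-wrapper rule described above can be written as a small decision function. This Python sketch is not the actual gcc-config patch: the function `abi_cflags` is hypothetical, and it assumes that "the compiler's build architecture" means the triplet the invoked compiler generates code for.

```python
def abi_cflags(env, compiler_target):
    """Return the extra flags for a compiler that targets compiler_target.

    CFLAGS_<ABI> is applied only when the compiler targets CHOST, so the
    BUILD compiler in a cross-compile never receives HOST ABI flags.
    """
    abi = env.get("ABI")
    if not abi or env.get("CHOST") != compiler_target:
        return []
    return env.get("CFLAGS_" + abi, "").split()


# Hypothetical cross-compile environment: building for MIPS on an x86_64 box.
env = {
    "CBUILD": "x86_64-pc-linux-gnu",
    "CHOST": "mips64el-unknown-linux-gnu",
    "ABI": "n32",
    "CFLAGS_n32": "-mabi=n32",
}

print(abi_cflags(env, "mips64el-unknown-linux-gnu"))  # cross compiler
print(abi_cflags(env, "x86_64-pc-linux-gnu"))         # native BUILD compiler
```

Only the cross compiler receives the n32 flag; the native compiler is called without it, which avoids passing MIPS-only options to the x86_64 toolchain.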
| Related Discussion | |||
|