| GLEP: | 58 |
|---|---|
| Title: | Security of distribution of Gentoo software - Infrastructure to User distribution - MetaManifest |
| Version: | 1.10 |
| Last-Modified: | 2010/04/07 21:34:24 |
| Author: | Robin Hugh Johnson <robbat2 at gentoo.org> |
| Status: | Draft |
| Type: | Standards Track |
| Content-Type: | text/x-rst |
| Requires: | 44 60 |
| Created: | October 2006 |
| Updated: | November 2007, June 2008, July 2008, October 2008, January 2010 |
| Post-History: | December 2009, January 2010 |
MetaManifest provides a means of verifiable distribution from Gentoo Infrastructure to a user system, while data is conveyed over completely untrusted networks and systems. It does so by extending the Manifest2 specification and adding a top-level Manifest file, with support for other nested Manifests.
As part of a comprehensive security plan, we need a way to prove that something originating from Gentoo as an organization (read: Gentoo-owned hardware, run by Infrastructure) has not been tampered with. This allows the use of third-party rsync mirrors without worrying that they have modified something critical (e.g. eclasses, which are still unsigned).
Securing the untrusted distribution is one of the easier tasks in the security plan - in short, all that is required is having a hash of every item in the tree, and signing that hash to prove it came from Gentoo.
Ironically, we already have a hashed and signed distribution, although most users do not use it because of its drawbacks: our tree snapshot tarballs have hashes and signatures.
So now we want to add the same verification to our material that is distributed by rsync. We already provide hashes of subsets of the tree - our Manifests protect individual packages. However, metadata, eclasses and profiles are not protected at this time. The directories of packages and distfiles are NOT covered by this, as they are not distributed by rsync.
This portion of the tree-signing work provides only the following guarantee: A user can prove that the tree from the Gentoo infrastructure has not been tampered with since leaving the Gentoo infrastructure. No other guarantees, either implicit or explicit, are made.
Additionally, distributing a set of the most recent MetaManifests from a trusted source allows validation of trees that come from community mirrors, and allows detection of all cases of malicious mirrors (either by deliberate delay, replay [C08a, C08b] or alteration).
For lack of a better name, the following solution should be known as the MetaManifest. Those responsible for the name have already been sacked.
MetaManifest basically contains hashes of every file in the tree, either directly or indirectly. The direct case applies to ANY file that does not appear in an existing Manifest file (e.g. eclasses, Manifest files themselves). The indirect case is covered by the CONTENTS of existing Manifest files. If the Manifest itself is correct, we know that by tracking the hash of the Manifest, we can be assured that the contents are protected.
In the following, the MetaManifest file is a file named 'Manifest', located at the root of a repository.
The objective of creating the MetaManifest file(s) is to ensure that every single file in the tree occurs in at least one Manifest.
The above does not conflict with the proposal contained in [GLEP33], which restructures eclasses to include subdirectories and Manifest files, as the Manifest rules above still provide indirect verification for all files after the [GLEP33] restructuring, if it comes to pass.
Additional levels of Manifests are required, such as per-category, and in the eclasses, profiles and metadata directories. This ensures that a change to a singular file causes the smallest possible overall change in the Manifests as propagated. Creation of the additional levels of Manifests uses the same process as described above, simply starting at a different root point.
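The direct and indirect coverage rules can be illustrated with a minimal Python sketch. It is not the Infrastructure implementation. The function names `manifest_entry` and `build_entries` are hypothetical, SHA256 is an arbitrary choice of hash, and the sketch makes the simplifying assumption that a subdirectory containing a Manifest is fully covered by it (in-tree Manifests are assumed correct), so only that Manifest is hashed. All entries use type 'MISC', as would be the case without the [GLEP60] filetypes.

```python
import hashlib
import os


def manifest_entry(root, path, filetype="MISC"):
    """Format one Manifest2-style line: TYPE name size HASHNAME digest."""
    with open(path, "rb") as fh:
        data = fh.read()
    rel = os.path.relpath(path, root).replace(os.sep, "/")
    digest = hashlib.sha256(data).hexdigest()
    return f"{filetype} {rel} {len(data)} SHA256 {digest}"


def build_entries(root):
    """Collect the direct entries for a Manifest located at `root`."""
    entries = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # deterministic traversal order
        if dirpath != root and "Manifest" in filenames:
            # Indirect case (assumption): the nested Manifest covers this
            # subtree, so hash only that Manifest and do not descend further.
            entries.append(manifest_entry(root, os.path.join(dirpath, "Manifest")))
            dirnames[:] = []
            continue
        for name in sorted(filenames):
            if dirpath == root and name == "Manifest":
                continue  # the Manifest being generated cannot list itself
            # Direct case: a file not covered by any nested Manifest.
            entries.append(manifest_entry(root, os.path.join(dirpath, name)))
    return entries
```

Calling `build_entries()` on the repository root yields the MetaManifest entries; calling it on a category, eclass, profiles or metadata directory yields an additional-level Manifest, since only the root point differs.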
MetaManifest generation will take place as part of the existing process by infrastructure that takes the contents of CVS and prepares it for distribution via rsync, which includes generating metadata. In-tree Manifest files are not validated at this point, as they are assumed to be correct.
There are two times at which verification of the MetaManifest may happen. First, immediately after the rsync has completed: this has the advantage that the kernel file cache is hot, so checking the entire tree can be accomplished quickly. Second, the MetaManifest should be checked during installation of a package.
In the following, I've used the term 'M2-verify' to denote following the hash verification procedures defined by the Manifest2 format, which comprise checking the file length and checking that the hashes match. Which filetypes may be ignored when missing is discussed in [GLEP60].
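As an illustration only, an M2-verify check of a single entry might look like the following Python sketch. The function name `m2_verify` is hypothetical. The sketch assumes the entry layout `TYPE name size HASHNAME digest [HASHNAME digest ...]` and assumes that each hash name, lowercased, is one that `hashlib` recognises (true for SHA256, not for every name a Manifest may carry). It treats a missing file as a failure and leaves the [GLEP60] rules on ignorable filetypes to the caller.

```python
import hashlib
import os


def m2_verify(base_dir, entry_line):
    """Check one Manifest2-style entry: file length first, then every hash."""
    fields = entry_line.split()
    name, size = fields[1], int(fields[2])
    # Pair up the remaining fields, e.g. {"SHA256": "ab12..."}.
    hashes = dict(zip(fields[3::2], fields[4::2]))
    path = os.path.join(base_dir, name)
    if not os.path.isfile(path):
        return False  # assumption: the caller decides if this type may be missing
    if os.path.getsize(path) != size:
        return False  # cheap length check before any hashing
    with open(path, "rb") as fh:
        data = fh.read()
    for algo, expected in hashes.items():
        if hashlib.new(algo.lower(), data).hexdigest() != expected.lower():
            return False
    return True
```

Checking the length before hashing lets a truncated or padded file be rejected without reading it.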
For this portion of the tree-signing work, no actions are required of the individual Gentoo developers. They will continue to develop and commit as they do presently, and the MetaManifest is added by Infrastructure during the tree generation process, and distributed to users.
Any scripts generating Manifests and the MetaManifest may find it useful to generate multiple levels of Manifests in parallel, and this is explicitly permitted, provided that every file in the tree is covered by at least one Manifest or the MetaManifest file. The uppermost Manifest (MetaManifest) is the only item that does not occur in any other Manifest file, but is instead GPG-signed to enable its validation.
While [GLEP60] describes the addition of new filetypes, these are NOT needed for implementation of the MetaManifest proposal. Without the new filetypes, all entries in the MetaManifest would be of type 'MISC'.
As discussed by [C08a,C08b], malicious third-party mirrors may use the principles of exclusion and replay to deny an update to clients, while at the same time recording the identity of clients to attack.
This should be guarded against by including a timestamp in the header of the MetaManifest, as well as distributing the latest MetaManifests by a trusted channel.
On all rsync mirrors directly maintained by the Gentoo infrastructure, and not on community mirrors, there should be a new module 'gentoo-portage-metamanifests'. Within this module, all MetaManifests for a recent time frame (e.g. one week) should be kept, named as "MetaManifest.$TS", where $TS is the timestamp from inside the file. The most recent MetaManifest should always be symlinked as MetaManifest.current. The possibility of serving the recent MetaManifests via HTTPS should also be explored to mitigate man-in-the-middle attacks.
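The retention rule for this module could be sketched as follows. This is a hypothetical helper, not part of any existing Infrastructure tooling. It assumes $TS is an integer Unix timestamp (the text above does not fix the format) and uses the one-week example window. It only plans which names to keep, which to drop, and which one MetaManifest.current should point at; it does not touch the filesystem.

```python
def plan_retention(timestamps, now, window=7 * 24 * 3600):
    """Decide which MetaManifest.$TS files to keep for a retention window.

    `timestamps` and `now` are Unix timestamps in seconds (an assumption).
    The newest file is always kept so that 'current' has a valid target.
    """
    newest = max(timestamps)
    keep = sorted(ts for ts in timestamps if now - ts <= window or ts == newest)
    drop = sorted(ts for ts in timestamps if ts not in keep)
    return {
        "keep": [f"MetaManifest.{ts}" for ts in keep],
        "drop": [f"MetaManifest.{ts}" for ts in drop],
        "current": f"MetaManifest.{newest}",
    }
```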
The package manager should obtain MetaManifest.current and use it to decide if the tree is too out of date per operation #2 of the verification process. The decision about freshness should be a user-configuration setting, with the ability to override.
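A freshness decision of this kind might be sketched as below. The names `parse_ts` and `tree_is_fresh`, the timestamp format in `TS_FORMAT`, and the 24-hour default are all assumptions made for illustration; this GLEP only requires that the threshold be user-configurable and that an override exist.

```python
from datetime import datetime, timedelta, timezone

# Assumed timestamp layout; the header format is not fixed by this text.
TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_ts(text):
    """Parse a header timestamp as an aware UTC datetime."""
    return datetime.strptime(text, TS_FORMAT).replace(tzinfo=timezone.utc)


def tree_is_fresh(local_ts, trusted_ts, max_lag=timedelta(hours=24), override=False):
    """Compare the local MetaManifest timestamp with MetaManifest.current.

    Returns True when the local tree lags the trusted copy by at most
    `max_lag` (the user-configured threshold), or when the user overrides.
    """
    if override:
        return True
    lag = parse_ts(trusted_ts) - parse_ts(local_ts)
    return lag <= max_lag
```

A mirror that replays an old tree would then be detected as soon as the trusted timestamp moves more than `max_lag` ahead of the one delivered by the mirror.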
With only two levels of Manifests (per-package and top-level), every rsync will cause a lot of traffic transferring the modified top-level MetaManifest. To reduce this, first-level directory Manifests are required. Alternatively, if the distribution method efficiently handles small patch-like changes in an existing file, using an uncompressed MetaManifest may be acceptable (this would primarily be distributed version control systems). Other suggestions for reducing this traffic are welcomed.
I'd like to thank the following people for input on this GLEP.
| [C08a] | Cappos, J et al. (2008). "Package Management Security". University of Arizona Technical Report TR08-02. Available online from: ftp://ftp.cs.arizona.edu/reports/2008/TR08-02.pdf |
| [C08b] | Cappos, J et al. (2008). "Attacks on Package Managers" Available online at: http://www.cs.arizona.edu/people/justin/packagemanagersecurity/ |
| [GLEP33] | Eclass Restructure/Redesign http://www.gentoo.org/proj/en/glep/glep-0033.html |
| [GLEP60] | Manifest2 filetypes http://www.gentoo.org/proj/en/glep/glep-0060.html |
| [GLEPxx2] | Future GLEP on Developer Process security. |
| [GLEPxx3] | Future GLEP on GnuPG Policies and Handling. |
Copyright (c) 2006-2010 by Robin Hugh Johnson. This material may be distributed only subject to the terms and conditions set forth in the Open Publication License, v1.0.