Disclaimer : This document is a work in progress and should not be considered official yet.

The Complete Gentoo/Linux Handbook

Content:

  • Introduction to Linux
    Linux is a great concept. It is a wonderful kernel with wonderful userland utilities, which together make a perfect operating system. But for most people it is quite new, and too overwhelming to just dive into. This part covers the basics of Linux: what it is, how it works, what you can expect from it, etc.
    1. What is Linux?
      What is Linux exactly? How does this all fit in "Free Software"? What is a distribution and why would you care? How is Linux developed? What can you expect from it? All that is covered in this chapter.
    2. Users and the Linux file system
      Linux builds upon UNIX knowledge and concepts. This means it is fairly robust and uses a very logical approach to files, users and such. But for most people, this logical approach is exactly what seems most illogical, since they are not used to it. In this chapter, we explain how Linux sees a multi-user environment and how the Linux file system is structured.
    3. Freedom, support and finances
      The most powerful asset of the Linux operating system is the freedom it gives you. But many folks are afraid that this freedom comes at a price: no support, and no company backing Linux. This is all FUD (Fear, Uncertainty, Doubt), as this chapter explains.
    4. Staying up to date
      With the decentralised development model that Linux uses, keeping a system up to date might seem like a huge effort. Indeed it is, but the various distributions (such as Gentoo) take care of it for you. However, because of the openness that the development model imposes, users have conflicting feelings about stability and about the differences between the various version models. This chapter explains what those differences are.
    5. Making a choice
      Now that most non-technical stuff is covered, it is time for you to make choices. Will you use Linux? What distribution? What graphical environment? What mail client? In this chapter we discuss the various differences between major distributions and explain that, once you have picked a distribution, all other choices are reversible.
    6. Finding information
      You have a lot of resources at your disposal, but you need to know where to look for them. In this chapter we provide you a quick overview on the available resources, how to use them for your queries and what you can do to contribute to them.
    7. So far so good
      Now that the introduction to Linux has ended, we will talk about the next few parts and the syntax we use throughout this document.
  • Installing Gentoo
    So you decided to install Gentoo. That's great, but how do you continue? Installing Gentoo is a breeze, but not a soft one. You need a fair knowledge of the Linux environment if you want to get it right the first time. In this part, we discuss how to install Gentoo using the available Gentoo media.
    1. Versions, media and installation concerns
      Gentoo provides installation media in specific versions, for specific architectures and for specific installation methods. This chapter explains how to pick the right media for your system.
    2. Starting from a minimal environment
      Every Gentoo installation begins with a minimal Linux environment, which allows you to extract a basic Gentoo environment onto your disk. Regardless of which minimal environment you are in, you need to know a few basic things about the system. This chapter covers the use of important tools.
    3. Preparing the network
      Once you are familiar with the minimal environment, it is time to get the network up and running. We explain how a network is set up, what TCP/IP is and how to deal with networking on Linux, including wireless networks.
    4. Putting the minimal environment in place
      In this chapter we prepare your disk(s) to store the Gentoo Linux environment. We will cover a few additional storage concepts (LVM2, RAID) but the real use of these technologies is postponed until later. Next, we store a minimal Gentoo environment on your disks.
    5. Building the system
      Once the minimal environment is available, we take the dive and chroot into it. We then set up the basic configuration directives and build the Gentoo system until it is bootstrapped and has the core system packages available.
    6. Building the Linux kernel
      The core of any Linux Operating System is the kernel. Configuring a kernel might seem like a difficult task, but once you get to know how it works, it is hardly a challenge anymore. In this chapter we will discuss how to configure and build your kernel, either automatically using genkernel, or manually with the kernel configuration dialog.
    7. Configuring the boot process
      The next step is to configure the boot process. The boot process covers the boot loader tool, which loads the Linux kernel in memory, and the init process which governs all the applications and processes that should start on a Linux system.
    8. Configuring the system
      Most of the system's configuration is stored inside /etc. In this almost final chapter we describe what you should configure prior to rebooting (such as the file system table, networking, user accounts, ...). It is important to read this chapter in its entirety.
    9. Finishing off
      Now that everything is (hopefully) configured correctly, we reboot in the Gentoo Linux Operating System to discover that we have a nice running minimal environment. Now, where to go from here? You obviously can't work immediately since nothing is installed yet...
  • Gentoo Linux for the desktop user
    Most Gentoo users have a Gentoo-powered desktop system, yet are often unaware of the many tools and helpful features that Linux offers to improve their desktop experience. In this part you'll find pointers to various tips and tricks, but also best practices for desktop users.
    1. Graphical Linux
      Linux is not an operating system where command-line utilities must be used. Once the system is installed, any user should be able to use it without any knowledge of command-line utilities. This is perfectly possible, but requires some configuration. This chapter gives an introduction to popular graphical environments and provides pointers to their configuration.
    2. Plug and play
      Your laptop detects an open wireless network and authenticates itself immediately. You plug in your USB key which gets mounted immediately so you can download the latest Gentoo release on it. While you are at it, your calendar synchronises with your friends while your laptop tries to consume as little power as possible because it is working on batteries. No, this is no utopia...
    3. Software collaboration
      Open standards allow for easy integration of different software tools. However, the race for the best open standard hasn't been won, so various tools only work with one standard while other tools use a different one. This chapter gives a quick overview of the various collaboration-related standards and the applications (or libraries which they use) that make use of them.
  • Gentoo Linux for enterprise environments
    Enterprises require more than just a stable Operating System. Depending on their requirements, a system should be highly available, have a high throughput, connect with legacy systems, have a low maintenance cost, require zero manual configuration steps, etc. This part discusses various topics that might help you get more out of Gentoo Linux.
    1. Software RAID
      If you need a very low-cost system but still want redundancy, using software RAID is a minimal requirement. In this chapter we will describe how to use software RAID within Gentoo Linux.
    2. Logical Volume Management
      With LVM, storage concerns can be tackled easily since your files are located on top of a specific layer that hides the complexity of storage from the file system. Learn how to store files across various file systems, move data without having to put the system in a frozen state, take live backups without your data being touched while you are busy, ... all using LVM2.
    3. Backup systems
      You don't want to lose your files, but eventually you will. Having backups at hand is a prerequisite for good system housekeeping, but it is often overlooked or deemed less important. This chapter covers a few basics on backups and proposes a few solutions that will help you keep your data safe.
    4. Print server
      High quality printing is a requirement for every office. Linux makes a fine print server, capable of interacting with all possible applications and operating systems.
  • System Administration
    Once your system is set up, your next concern is to decide how to administer the system. Some people have made system administration their full-time job. We'll try to keep the administration to a minimum without losing flexibility.
    1. Software management
      Gentoo's Portage is a powerful software management tool with lots of features. Not only can you easily install and remove software, rebuild tools when they are affected by changes or update your system entirely, it also supports prebuilt packages and different repositories.
    2. Log files
      Not many resources talk about log files. Most documents assume that log files are mentioned by the application documentation, yet many applications just inform you what information you can find in the logs and that isn't sufficient for a good log management policy. Log rotation, event filtering, summary creation, ... are all aspects that this chapter covers.
    3. Centralised system management
      When you administer more than one system, it might be beneficial to set up and maintain your environment from a single location. You can use SSH to log on to other systems, but more advanced tools exist that offer a wide variety of features.
  • Performance tuning
    No two systems are equal, so many systems have one or more performance bottlenecks caused by general settings that are less suited to them than to others. You can increase your system's performance in many areas as long as you understand why they are bottlenecks and what sacrifices you need to make to increase the throughput.
    1. Input/output performance
      Storage performance. If there is one device in your system that has the highest importance but one of the slowest access times, it is your storage. Disks aren't fast, network storage isn't much better and memory disks are generally too small to contain your entire system. So what can you do to increase the I/O performance?
    2. Network performance
      If you are on a 28.8 kbps network, you'll always find that it is slow. Many people still think their network is slow while they're using a 1 Gbps network. By identifying the bottlenecks in your network and designing a good topology, you can increase the performance of your network to a good level.
    3. Rendering performance
      A gamer's dream: screens and graphics cards that render 3D images so detailed and so fast that it seems you're looking through a window. Of course, this isn't achievable yet, but it is possible to tune the 3D performance of your card. We'll take a look at 2D rendering as well.
    4. Software profiling
      Programmers don't always write the most performant code; perhaps because they rely on the compiler to enhance the machine code or because they want to write clean and maintainable code, setting their priorities elsewhere. Using profiling tools, you can find out where in the software the performance bottlenecks may lie.
    5. User-observed performance
      Lately, developers have come to the conclusion that even the fastest solution can lose against a less performant system, at least if a human being is to observe and judge. Many interesting projects have emerged that discuss design patterns and guidelines which try to take the user's perception of speed and performance into account.
  • Appendix: architecture specific information
    Not all aspects of a Gentoo Linux system are similar on all architectures. Some small differences come up during and after the installation. This part will cover all architecture-specific information, nicely divided in separate chapters for each architecture.
    1. The x86 Architecture
      The x86 architecture covers all 32-bit Intel processors and Intel clones, such as the various AMD processors (the K-series and Athlon/Duron), VIA and Cyrix. The CPU types range from the old (but functional) i386 to the latest Intel Pentium 4 and AMD Athlons.

A. Introduction to Linux

1. What is Linux?

1.a. Linux: concept and history

What is Linux?

Linux is a free operating system, consisting of the Linux kernel, libraries and utilities which allow the user to interact with the system.

The Linux kernel is the core of the Linux Operating System. It is responsible for all hardware interaction, process management, memory management, network protocol support and file system support. We probably forgot a few other responsibilities as well, but it is obvious that the kernel has many important ones. All these tasks are handled in the background: although the kernel is the core of the system, the user has no direct interaction with it.

The core library on a Linux system is the GNU C library, called glibc. This library provides an interface between the Linux kernel, which operates almost independently, and the user applications. The library contains system call definitions and basic features to facilitate the application development for the Linux Operating System.
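As a small illustration of this layering, the sketch below asks the C library for the current process ID and compares it with the value Python's own os module reports. It assumes a Unix-like system on which the C library can be loaded through Python's ctypes module; the helper name pid_via_libc is made up for this example.

```python
import ctypes
import ctypes.util
import os


def pid_via_libc():
    """Return the current process ID by calling getpid() in the C library."""
    # find_library() may return None; CDLL(None) then falls back to the
    # symbols already loaded into the running process, which include libc.
    libc = ctypes.CDLL(ctypes.util.find_library("c"))
    return libc.getpid()


if __name__ == "__main__":
    print("PID via the C library:      ", pid_via_libc())
    print("PID via Python's os module: ", os.getpid())
```

Both lines should report the same number, since both routes end up at the same kernel system call.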

The core utilities on a Linux Operating System provide you, the user, with a way to interact with the system. These utilities allow you to create and manipulate files, navigate around on your system, start and stop processes, etc. There is no "single" core utility package: the Linux Operating System contains a dozen different packages and two Linux systems can have different utilities.

However, the best-known and most widely used utilities (such as those for navigating on the system) are generally called the GNU Core Utilities. GNU is a project devoted to the development of a completely free (as in 'free speech') Unix-like environment. Because GNU plays an important role on most Linux systems, many people talk about GNU/Linux.

So... what is Linux?

The above explanation is quite technical. Put simply, the Linux Operating System is built upon the UNIX idea, delivering UNIX-like features and stability. But it is more than just a UNIX clone. It is developed by several thousand developers who work on the operating system in their free time (although many of them are also paid to work on Linux).

The development of Linux is decentralised: each part of the Linux Operating System (kernel, libraries, tools, graphical environments, office suites, server software, ...) is developed by its own project which works independently of the other projects. Contrary to what many people think, this does not mean that the projects do not work well with each other. Each software title that interacts with another uses standards. A standard is an established or widely recognized technical specification of how to accomplish something. The best standards are open standards.

An open standard is a freely available and sufficiently documented technical specification that allows any developer to write software that operates as the document dictates or supports the communication described by the standard. Such software can therefore interact flawlessly with other software titles that adhere to the same document. The document and its technical implications are free of any legal restrictions (like patents, licenses, ...) and the document is accepted by a standards organisation (like ISO, ANSI, ...).

Examples of such open standards are the various network protocols (like TCP/IP, HTTP, ...), character encodings (ASCII, UTF-8, ...), etc. Because the applications use standards, interoperability amongst the various applications is guaranteed.
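To make this concrete, here is a minimal sketch using Python's built-in text encoding support. It shows that text written as UTF-8 by one program can be read back by any other program that follows the same standard, and that plain ASCII text produces identical bytes in both encodings. The function name roundtrip is made up for this example.

```python
def roundtrip(text, encoding="utf-8"):
    """Encode text to bytes and decode it again, as two programs would."""
    data = text.encode(encoding)     # what the sending program writes
    return data, data.decode(encoding)  # what the receiving program reads


if __name__ == "__main__":
    data, text = roundtrip("Gentoo")
    print(data, text)
    # For plain ASCII characters, UTF-8 and ASCII yield the same bytes.
    print("Gentoo".encode("ascii") == data)
```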

The Linux Operating System is characterised by freedom and choice. Freedom, because the software is free (although non-free software exists for Linux as well). Choice, because you will have the choice between several applications for each action you want to perform.

Where can I find Linux?

You should not be searching for Linux sensu stricto as you'll only find the Linux kernel which you can't use without additional libraries and tools. What you need to look for is a distribution. A distribution is a project that combines the Linux kernel, libraries and tools in a coherent software package. With a distribution you can install, configure and use a Linux system easily.

Next to the distribution, you might need to install additional software. If you cannot install it through your distribution (most distributions offer thousands of software titles out of the box) or you do not know any software title by name, then you can visit one of the many free software repositories around. Well-known repositories are Freshmeat, Icewalkers, SourceForge, etc.

Well-known distributions are Fedora, Mandriva, SuSE, Debian, Ubuntu and of course Gentoo, but many others exist as well.

Linux' history

Linus Torvalds, the creator of the Linux kernel, made a first posting about his hobby project on August 25th, 1991. People could download his code and use it, modify it and redistribute it. Linus also made a few ports available to make it possible for others to run a Linux operating system. Of course, in those days, the operating system contained only a few applications and hardware support was very limited.

In the next few years, the Linux kernel grew and expanded: support for networking, SCSI disks, specific file systems, ... was added and bugs were quickly fixed. Yet installing a Linux Operating System was still difficult as there were no easy installation methods. That changed when the first distributions were released.

Early distributions were hardly maintained, so they were no real candidates for continuous usage. In 1993 Slackware was created, and others followed suit shortly after. Nowadays, several hundred distributions exist.

1.b. Free software model

Freedom of speech

As mentioned previously, Linux is Free Software. The "Free" here should be read as "Freedom of Speech", not "Free beer". The Free Software Foundation defines the freedom as:

  1. freedom to run the program for any purpose,
  2. freedom to study how the program works, and to adapt it to your needs,
  3. freedom to redistribute copies of the program, and
  4. freedom to improve the program, and release your improvements to the public so that the entire community benefits

The Free Software Foundation has prepared and released a specific license that embraces the freedoms listed above. Their license is called the GNU General Public License (GPL) and is used by the Linux kernel and various other applications. The Free Software Model builds upon these freedoms.

The role of distributions

Distributions play an important role: they bundle the free software in a coherent package. A distribution allows you to install a Linux Operating System easily and maintain the software installed on your system. Thanks to distributions, you don't need to know how to build packages, what toolchains are, or how to handle other packaging-related tasks.

One role of a distribution in the free software model is that of quality assurance and marketing. Distributions take the source code of many projects and bundle it together. They test the software and provide feedback to the developers of the individual projects. When they are happy with the end result, they present their distribution to the world: it is this end result that users will install on their systems.

Development model

Most free software is developed on a volunteer basis. A free software project generally has some infrastructure at its disposal:

  • a code repository: a location where several people can work on the source code simultaneously. When you hear about CVS or SVN then the topic of the discussion is most likely code repositories or the tools to manage them. A versioning tool allows developers to deal with collaborative source code development. Such tools keep track of all changes made to the project.
  • a web site, displaying news and information about the project. The web site will probably include download and installation instructions, documentation, etc.
  • a mailing list where developers and users discuss the future of the project, changes and change requests, bugs, etc.
  • a bug tracking system where users can submit bug reports and enhancement requests. Such bug tracking systems allow the developers to keep track of bugs easily.

Most free software projects are run by volunteers. These people put their knowledge of programming, documentation writing, infrastructure, ... into the project. The motivation that drives these developers makes free software evolve quite fast: why else would people work on a project in their free time if they weren't motivated?

Because a project is mostly run by volunteers, there is no limit to the number of developers that can work on the project. The Gentoo distribution has more than 350 developers, the Linux kernel has several hundred developers. Many updates to a project are made by contributors as well: people who have found and fixed an issue but are not part of the development base of the project.

The entire development process is open to the public (everyone can see how the project evolves), so there is a lot of feedback from the users. Users participate in discussions on the mailing lists or through IRC (lots of projects have a chat channel). In many cases, active users are asked to join the development team because they provide valuable feedback.

1.c. Is Linux your thing?

Expectations

So what can you expect from Linux?

Linux is a very stable platform that can be used in every area you might be interested in: desktop, workstation, server, programming, embedded, ... Stability is a core concern with Linux. The Linux kernel, for instance, is a separate entity in the Operating System: it is not integrated in the shell, nor is it hidden from the user. An unstable application will fail without taking the kernel down with it, so the system remains functional.

Because of the development model used, Linux is a fast moving operating system. With Linux you can expect frequent updates with lots of new features. As long as you update your system, it will remain completely up-to-date with the latest developments. Some distributions (Gentoo included) don't even require you to upgrade from one release to the next: once the system is installed, regular updates always keep you at the latest release. Such an approach is quite unique and you can't find this in operating systems like Microsoft Windows.

Programmers will find that Linux offers them the best development platform they can imagine: an Operating System where you can learn a lot from the inner workings of the system, where you can find a plethora of (free) development tools for languages such as C#, C++, Java, C, PHP, ... and where communities help each other in the development of peer projects.

Home users will find Linux to be extremely interesting, with lots of documents available to help you find your way through this new and exciting system. Yes, documentation is a powerful asset: you have sites devoted to the ongoing development of good, professional and clear documents about various Linux-related subjects. Documents are not only available in English but in various languages and this international approach is also taken within the Linux software products themselves: most applications are available in several languages. You can have your entire system available in your native language!

Lots of developers are security-aware. Therefore you will find that most applications are written with a strong sense of secure defaults. E-mail clients are not that likely to be easily trapped by viruses; the system only allows you to alter files you have created, leaving system-wide files intact; free updates lessen the chance that an exploitable bug remains on your system; firewalls and other security-related software are freely available and easy to install. Do not read this as if Linux is secure by default, but it does stress security more than some other operating systems.

A Linux Operating System is quite cheap. Many distributions are freely available (free as in free beer), others are available for a small price compared to what they have to offer. How the Free Software Model remains sustainable is discussed in the chapter Freedom, support and finances. You are also not forced to stay with a single vendor since most applications use standards which improve interoperability.

You will find that Linux is extremely flexible. You can use Linux as a desktop, as a workstation, as a TV receiver/recorder, ... You can save disk space by installing just the applications you need without any additional stuff, or you can install an entire desktop suite. You can optimize your installation for your system, or use a generic installation to speed up the installation of hundreds of desktops you administer. You can choose amongst various applications that offer the same functionality but use different ways to achieve their goals. You can do whatever you want, the way you want.

1.d. Linux is not...

... a Microsoft Windows look-alike

Do not expect Linux to behave like Windows, to run Windows programs, to be compatible with everything Windows offers. Linux is a completely different Operating System with its own way of dealing with things. It is completely different by design, by development model, by community, ... and will most likely stay different.

... secure by default

Security is a major concern with Linux, but "Secure by Default" is something completely different. You should not expect that your Linux environment will always be untouchable; security lies in the hands of whoever controls the system.

Keeping your system up to date is a prerequisite: if you do not update your system regularly, you will eventually have applications on your system that have known exploitable bugs in them. Having a clear policy is important as well: do not trust everyone on the Internet, do not use empty or easy-to-guess passwords, do not use applications from untrusted sources, etc. Know what you do on your system: badly configured services can be the weakest link in someone's security.

... an alternative

The word "alternative" is often used for a less powerful but "sufficient" solution. Linux is more powerful and different. It is not an alternative for any other operating system, but a different operating system.

Forget what you know about the operating system you currently use. Linux is different and you will need to learn it. It will take a while but it is definitely worth it.

2. Users and the Linux file system

2.a. Why multi-user?

Separation of privileges

One of the advantages of a multi-user operating system like Linux is that privileges are separated. Each process runs with specific privileges and can only execute a limited number of tasks. As long as the process does not run as the root user (the almighty administrator account) it can only deal with files and tasks that are assigned to that particular user.

This separation of privileges provides a small but working security wall: as long as all your users use the system with their own user account and not with the root account, the worst that can happen is that a user removes their own files. The system itself is left untouched.

For this reason, you will always be told not to use the root account for your daily work.

System accounts

To enforce the separation of privileges, specific system accounts are created for each task. If you run a mail server on your system, that mail server will have a user account on your system.

These accounts are not usable by regular users: you cannot log in on your system using those accounts. They exist only to allow the specific processes to run with their own permissions and privileges.
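On many systems you can recognise such accounts by their login shell. The sketch below lists the accounts whose shell is one of a few programs commonly used to refuse interactive logins. The set of shell paths is an assumption made for this example and differs between distributions, so treat the output as a hint rather than a definitive list.

```python
import pwd

# Shells commonly used to disable interactive logins. This set is an
# assumption for illustration; the exact paths vary per distribution.
NOLOGIN_SHELLS = {
    "/sbin/nologin",
    "/usr/sbin/nologin",
    "/bin/false",
    "/usr/bin/false",
}


def accounts_without_login():
    """Return the sorted names of accounts that have a no-login shell."""
    return sorted(
        entry.pw_name
        for entry in pwd.getpwall()
        if entry.pw_shell in NOLOGIN_SHELLS
    )


if __name__ == "__main__":
    for name in accounts_without_login():
        print(name)
```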

2.b. Users and permissions

The user ID

To identify an account, a unique user identification is used: the UID. This is a number used by the Linux kernel and other applications, as numbers are easier to deal with than names (strings). However, Linux is intelligent enough to immediately translate the UID to a user name and vice versa, so in most cases you will only see or use the user name instead of the UID.
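The translation between UIDs and user names can be sketched with Python's standard pwd module, which reads the system's account database. The function names are made up for this example; the demonstration uses UID 0, which is conventionally the administrator account.

```python
import pwd


def name_for_uid(uid):
    """Translate a numeric UID into the matching user name."""
    return pwd.getpwuid(uid).pw_name


def uid_for_name(name):
    """Translate a user name into the matching numeric UID."""
    return pwd.getpwnam(name).pw_uid


if __name__ == "__main__":
    # UID 0 is reserved for the administrator account.
    admin = name_for_uid(0)
    print("UID 0 belongs to:", admin)
    print("UID of", admin, "is:", uid_for_name(admin))
```

Both functions raise a KeyError when the account database has no matching entry.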

The process ID

When your Linux system is up and running, it will have started various processes already. Each process is an application (or part of an application) and receives a unique process identifier (which is also a number): the PID.

PIDs play an important role in the administration of a running Linux system: you need the PID of a specific process to be able to terminate it (in case it behaves badly), change its priority or retrieve specific system usage statistics regarding a particular application.

For regular Linux usage however, the PID is less important: you still need to understand what a PID is, but you will probably not encounter any use of it until you administer your system.
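For the curious, the following sketch shows a PID in use. It starts a child process that would otherwise sleep for a minute, reads the PID the kernel assigned to it, and uses that PID to ask the process to terminate. The function name is made up for this example, and the code assumes a Unix-like system.

```python
import os
import signal
import subprocess
import sys


def start_and_terminate():
    """Start a sleeping child process, then terminate it through its PID."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    pid = child.pid  # the process identifier assigned by the kernel
    os.kill(pid, signal.SIGTERM)  # ask that process to terminate
    # On Unix-like systems a negative exit status means the process was
    # ended by the signal with that number.
    exit_status = child.wait(timeout=10)
    return pid, exit_status


if __name__ == "__main__":
    pid, exit_status = start_and_terminate()
    print("terminated process", pid, "with exit status", exit_status)
```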

Privileges

Each process obtains privileges based on the user account it uses. By default, a process runs with the privileges of the user that started the process. For instance, if you start firefox it will run with your privileges.
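A minimal sketch of this inheritance: the code below starts a child process that reports the user ID it runs as, which should match the user ID of the process that started it. The function name is made up for this example.

```python
import os
import subprocess
import sys


def uid_of_child_process():
    """Start a child process and return the user ID it reports running as."""
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.getuid())"],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


if __name__ == "__main__":
    print("our UID:        ", os.getuid())
    print("the child's UID:", uid_of_child_process())
```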

However, some processes have a specific flag set that tells the Linux kernel not to run the process as the user that executed it, but as a specific user instead. This flag is called the set user id (SetUID or SUID) and tells the Linux kernel to run this application with the privileges of the owner of that application instead of the executor.

Most tools that have the SUID bit set are owned by the root user and therefore start running with the privileges of the root user. Because this is a security threat (remember, running things as root can be dangerous) most tools have a feature called privilege separation: when they are started, they first run the tasks that require root privileges, after which they automatically drop their own privileges to a less powerful state.
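Whether a file carries the SUID bit can be read from its permission mode. The sketch below checks for the bit using Python's os and stat modules and demonstrates it on a throwaway temporary file, so no real program is changed. The function names are made up for this example, and the demonstration assumes the file system holding the temporary directory allows the bit to be set.

```python
import os
import stat
import tempfile


def has_setuid_bit(path):
    """Return True when the set user ID bit is set on the given file."""
    return bool(os.stat(path).st_mode & stat.S_ISUID)


def demo():
    """Toggle the bit on a throwaway file and report what was observed."""
    with tempfile.NamedTemporaryFile() as handle:
        os.chmod(handle.name, 0o755)
        before = has_setuid_bit(handle.name)
        os.chmod(handle.name, 0o4755)  # the leading 4 is the SUID bit
        after = has_setuid_bit(handle.name)
    return before, after


if __name__ == "__main__":
    print("SUID bit before and after chmod:", demo())
```

To inspect a real tool, pass its path to has_setuid_bit instead.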

2.c. Linux file system hierarchy

Structure of a file system

The most significant change new Linux users will have to get comfortable with is the file system structure, which is quite different from the one used by operating systems like Microsoft Windows.

In Linux, the entire file system is structured as one huge tree. You start with the root of the tree and traverse down until you reach your goal. The next Code Listing shows you the first level of a Linux file system:

Code Listing 3.1: Incomplete example of a Linux File System

/        (The root)
+- bin/  (Executable programs needed to get the system up and running)
+- boot/ (Files related to the boot loader and Linux kernel)
+- dev/  (Device files)
+- etc/  (Configuration files)
+- home/ (User home directories)
+- lib/  (Libraries needed to get the system up and running)
+- mnt/  (Location for mount points)
+- opt/  (Contains large package installations not part of a regular install)
+- proc/ (Kernel-provided information)
+- root/ (Home directory for the root user)
+- sbin/ (System administration executables to get the system up and running)
+- sys/  (Kernel-provided information)
+- tmp/  (Temporary files)
+- usr/  (Applications for day-to-day system usage)
`- var/  (Variable information like log-files, caches, ...)

Suppose you want to navigate to the CUPS error logs (CUPS is a printing service frequently used on Linux systems), which are located inside /var/log/cups. You will then find the following tree:

Code Listing 3.2: Expanded tree to /var/log/cups

/
+- bin/
(...)
+- usr/
`- var/
   +- cache/
   +- db/
   +- lock/
   +- log/
   |  +- cups/
   |  |  +- access_log
   |  |  +- error_log
   |  |  `- page_log
   |  +- dmesg
   |  +- emerge.log
   |  +- lastlog
   |  `- messages
   +- run/
   +- spool/
   +- state/
   `- tmp/
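The walk from the root down to a file can also be written out step by step. The sketch below is only an illustration (the function name is an assumption) and works on the path text alone, without touching the disk.

```python
from pathlib import PurePosixPath


def traversal_steps(path: str) -> list[str]:
    """List every location visited from the root down to `path`."""
    target = PurePosixPath(path)
    # `parents` runs from the nearest parent up to "/", so reverse it.
    steps = [str(parent) for parent in list(target.parents)[::-1]]
    steps.append(str(target))
    return steps


if __name__ == "__main__":
    for step in traversal_steps("/var/log/cups/error_log"):
        print(step)
```

Each printed line is one level deeper in the tree, starting at / and ending at the error log itself.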

Each location has its purpose as defined in the Filesystem Hierarchy Standard (FHS). As said before, Linux builds upon standards and the file system structure is no exception. Each Linux distribution adheres to this standard (although a few deviations are known). If you want to learn more about the file system structure, please read this standard. A short summary can also be found on your Linux system in the hier manual page ("hier" is short for "hierarchy").

Scattered files

One often frowned-upon result of this file system structure is that applications scatter their files around the system. Indeed, the locations for executable files, data files, documentation files, configuration files, ... are defined in the hierarchy standard, but the result is that files are scattered throughout the file system instead of being kept at a single location.

For instance, for a regular application the executable files will be stored in /usr/bin, data files in /usr/share/<application name>/, documentation files in /usr/share/man (for the manuals) or in the data file location, configuration files in /etc, libraries in /usr/lib, etc.
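The locations named above can be summarised as a small lookup table. This is only a sketch of the convention described in the text: the function name is hypothetical and real packages may deviate from this layout.

```python
def install_layout(app: str) -> dict[str, str]:
    """Return the typical locations for the files of application `app`."""
    return {
        "executable": f"/usr/bin/{app}",
        "data": f"/usr/share/{app}/",
        "manuals": "/usr/share/man/",
        "configuration": "/etc/",
        "libraries": "/usr/lib/",
    }


if __name__ == "__main__":
    for kind, location in install_layout("cups").items():
        print(f"{kind:15} {location}")
```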

It is up to the distribution to keep track of the files that belong to a particular package. The software management system of a distribution is therefore a very important tool and is often the application that distinguishes one distribution from the others. For Gentoo, the software management system is called Portage.

System administration versus system usage

When you are using your Linux system for daily tasks you should be logged on as a regular user. This user will only have write-access to his personal home directory, located in /home, and have read access to most other places on the system (except where sensitive information is stored).

This user will be able to execute most applications that are stored in the regular executable locations (/bin, /usr/bin and a few other places). Whenever the user wants to run an application, the system will search through those directories for a matching application: it will not search through the entire system.
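The search through a fixed list of directories can be sketched as follows. The function name is made up for this example; on a real system the list of directories comes from the PATH environment variable.

```python
import os


def find_executable(name, search_dirs):
    """Return the first executable called `name` in `search_dirs`, else None."""
    for directory in search_dirs:
        candidate = os.path.join(directory, name)
        # Only regular files that the current user may execute qualify.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    dirs = os.environ.get("PATH", "/bin:/usr/bin").split(os.pathsep)
    print(find_executable("ls", dirs))
```

Directories that are not in the list are never looked at, which is why an application stored elsewhere is not found unless you give its full path.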

The administrative user (root), however, has access to every location on the system. When he wants to execute an application, the system will search through system administration locations such as /sbin and /usr/sbin as well. Those locations contain tools that should only be run by the root user. The root user can also read and write to every location on the system (although particular kernel projects exist that allow for more access control, limiting even the root user's capabilities).

The role of hardware

Within the tree structure there does not seem to be any room for the hardware (like disks, CD-ROMs, USB sticks or network mounts). Of course, hardware is important - where else would you store your files if you do not have a hard disk? The use of such hardware, however, happens transparently to the user.

Storing files in Linux happens in a layered structure. At the bottom layer, you have the actual storage (most likely a partition or removable media). On top of the actual storage you have the file system. A file system can span several storage devices, but most users will have one partition per file system. The file system is mounted in the Linux file system structure. Such a mount always happens at a certain directory.

By default, you will have at least one file system for the root of your file system. If you only want to use a single file system, you can have your entire Linux system on a single partition. If you want to use several partitions, you need to think about what directory (and its subdirectories) you want to store on a different file system.

For instance, you might want to have /home stored on a separate file system, which allows you to keep all users' data on a single partition (or drive). To do so, you create a file system on that partition (or drive) and then mount it at /home.

If you do not mount it at /home, /home and all its contents will be stored on the file system that contains the root of the file system. If you do mount it at /home, /home and all its contents will be stored on the other file system.

This mounting does have an important implication: if you forget to mount a file system at a certain location, the Linux file system structure will look as if that location contains no files. You will be able to add files to that location of course, but they will then be stored on the root file system instead of on the file system you forgot to mount.
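To find out on which mounted file system a location really lives, you can walk upwards until you hit a mount point. The sketch below does this with Python's standard library; the function name is an assumption made for this example.

```python
import os


def find_mount_point(path):
    """Walk upwards from `path` until a directory that is a mount point."""
    current = os.path.realpath(path)
    # The root directory is always a mount point, so the loop terminates.
    while not os.path.ismount(current):
        current = os.path.dirname(current)
    return current


if __name__ == "__main__":
    print(find_mount_point("/home"))
```

If this prints / for /home on a system where /home should be a separate file system, the file system was probably not mounted.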

In the next Code Listing we show you an example layered approach. The root file system is stored on /dev/hda1 which represents the first partition on the first IDE disk in your system. The /home location is stored on a separate file system (which happens to be the same kind of file system: an ext3 one). This file system is stored on a meta device (a device that actually consists of multiple devices - in this case two partitions).

Code Listing 3.3: Example layered approach for the Linux file system

+--------------------+----------------------------+
|   / (root)         |   /home (home directories) |  <- location
+--------------------+----------------------------+
|   ext3 instance    |   ext3 instance            |  <- file system
+--------------------+--------------+-------------+
|                    |        /dev/md/0           |
|   /dev/hda1        +--------------+-------------+  <- devices
|                    |  /dev/hdb1   |  /dev/hdc1  | 
+--------------------+--------------+-------------+
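The rule "the deepest mount point above a file decides where it is stored" can be sketched as a lookup in a mount table. The table mirrors the listing above; the function name and the user "alice" are hypothetical.

```python
from pathlib import PurePosixPath

# Mount table taken from the listing above: mount point -> device.
MOUNTS = {"/": "/dev/hda1", "/home": "/dev/md/0"}


def device_for(path, mounts=MOUNTS):
    """Return the device of the deepest mount point that contains `path`."""
    target = PurePosixPath(path)
    # Check the path itself first, then each parent up to the root.
    for directory in [target, *target.parents]:
        if str(directory) in mounts:
            return mounts[str(directory)]
    raise LookupError(f"no file system mounted above {path}")


if __name__ == "__main__":
    print(device_for("/home/alice/notes.txt"))  # stored on the meta device
    print(device_for("/etc/fstab"))             # stored on the root device
```

Calling `device_for("/home/alice", {"/": "/dev/hda1"})` models the forgotten mount: the files then end up on the root file system.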

3. Freedom, support and finances

3.a. Priceless

Userbase

Free Software has a very active user community, filled with people who are eager to help you install, configure and maintain free software. "Help your neighbour" has never been as successful as with Free Software.

Take the #gentoo support channel as an example. It has over 800 users who help any Gentoo user or interested party with whatever question he or she might have. The Gentoo Forums are another example, with over 2,000 posts per day.

You can and will find support for the Free Software you want, support given by users of that software, who believe the software is the best in its field and have good experiences with the software. Of course, there is a trade-off: this support is given on a volunteer basis, so don't expect someone to answer your question immediately. And if you are not friendly, you will undoubtedly be ignored or even removed from the support channels.

Developer base

When you have feature requests, or you have found a bug in the software, the developers are always happy to hear from you. Most projects even have public bug tracking systems where you can submit bug reports or ask for software enhancements.

As a user, you deal with the developers personally and not with some obscure phone number with a robotic voice on the other end, or an automated reply server that thanks you for your submission, after which you never hear about it again. These developers are devoted to bringing you the best software available and hope that you can help them improve it.

Those developers live all over the world, in all possible timezones, so when you mail a developer or talk with him directly (for instance over IRC), do not expect him to be available all the time. Not only can he be very sleepy because it is 03:00 on his side, he can also be unavailable due to real-life issues, phone calls, etc. Remember, most developers work on the software in their free time.

3.b. Timeless

Archives

Free Software is timeless. Most projects keep older releases around and some projects even archive free software for various reasons (such as allowing people to find out how old a certain feature is).

You will also find archives of the mailing lists used by the project, and sometimes even daily IRC logs. Support channels like the bug tracking system or the forums keep all posts and information around in case you ever need them.

Stalled software

When a software project "dies" (for instance because the developers are too busy with real life or have just lost interest in the project), it does not disappear. Such projects are only stalled and ready to be picked up by someone who wants to devote some of his time to the project. Nothing of the project gets lost: the software itself remains, documentation remains, ...

This is one of the major advantages of Free Software: unlike proprietary software, which might get dropped by the company, the software does not disappear. If you require long-term support for any type of software, you can only trust Free Software - you can never know when proprietary software will be discontinued. In the worst case with Free Software, you will need to take on development of the software on your own or hire someone to do it for you.

3.c. Immortal

Freedom

Because the software is free, you cannot kill it. The software cannot be taken over by another company. When the project is turned over to an organisation that you do not like, you can just take the software and fork it (a term used to denote that two or more projects develop software based on the same code, but do this independently with their own goals and development accents).

The author of the software cannot revoke the rights he has given to you: once the software is free, it remains free.

Paid support

If you are not satisfied with the support you receive, you can obtain paid support (contracts with a certain support level attached to them) if you want. In many cases, this paid support isn't given by the project itself but by a third party that knows the software's code base and its maintenance well.

On certain occasions, you can obtain paid support from the software project itself as well.

4. Staying up to date

4.a. Versioning

The role of freedom

When you are allowed to do anything you want, whenever you want and wherever you want, you will probably think total chaos is but a few inches away. Yet with Free Software we see that people work together closely, forming hierarchies built upon knowledge and expertise instead of popularity: most projects are governed as a meritocracy instead of a democracy.

In a meritocracy, the power goes to the people who have shown that they are superior in their field above most others. The hierarchies formed within a larger software project are chosen based on the abilities of the developers and not their charisma or any political decision.

In such projects, the lead developers decide when software is ready. There are generally five states that software can be in: in-development, tagged, development release, stable release and revised. We explain those states in the next few sections.

State: in-development

Software that is in-development is software that the developers work on constantly. When you have software installed on your system that is in-development, your software will already be outdated by the time you use it (unless the project isn't very active of course).

Such software has the latest version of everything: every feature, every bug and every file that has been started is available in the software. Of course, this does mean that there is hardly any quality assurance on the software apart from the quality measures taken by the developers themselves.

Think of buying a painting while the painter is still busy painting it: you will probably see that it isn't finished yet.

If you are interested in such software, you might very well be a good candidate to become a developer for that software. In-development software is often referred to as the software available through the versioning system, such as CVS or SVN. Some distributions allow you to install such software despite it being unfinished. For instance, within Gentoo you can install cinelerra-cvs, the in-development version of the Cinelerra Video Editor. In this case, you will install the software that was in-development at the moment of installing. Every time you reinstall it, the latest in-development program code is used.

Most users however are not interested in the in-development version of any software. Distributions that allow installing such software therefore also protect their users by making sure the user knows what he is doing before he can install such software. In case of Gentoo, you must configure Portage to allow the installation of untested software.

State: tagged and static

When the developers know that the in-development software code works, or they want it as a reference for future development, they tag the software. Such tagged software is often called a snapshot of the in-development code and is given a specific name (most likely the date when the snapshot was taken).

This occurs often when the project does not allow outsiders to use the in-development code straight from the versioning system (for instance because the versioning system is not powerful enough to handle user requests or because they want a subproject to control the quality of the code before it is handed to contributors and interested parties).

Using such snapshots is often preferred to the in-development code because the distributions who use the software can ask the user who uses the snapshot which snapshot they took. Then the developers can install that specific snapshot themselves and see if they can reproduce the problem. When using in-development code, the developer would need to know exactly when the user has installed the in-development code.

Take Gentoo as an example: various (untested) packages are snapshots. gentoo-syntax-20050325 is the snapshot taken on March 25th, 2005 of the gentoo-syntax package, which provides syntax highlighting and indentation settings for vim (a popular command-line editor) for editing Gentoo related files.

State: development release

When the developers feel that the software is in quite good shape, they will tag it again, but instead of having it as a snapshot, they will make it a development release. In most cases, such a release is more than just a snapshot: it is made simultaneously with documentation updates and project web site updates, and after passing quality assurance. The project advises people who want to contribute to the development of the project to use at least the development release.

Because of this, such releases are given a specific version. Sometimes you can see that it is a development release by the name. For instance, kdoc-2.0_alpha54 is a development release (because it contains the atom alpha) of kdoc, a KDE documentation processing/generation tool.

There are three atoms generally used to denote development releases: alpha (quite new, far from ready for production use), beta (should give a nice idea of how the program will look and behave when it is officially released to the public) and rc (release candidate - already contains all the features the final release will have, only bug fixes are accepted).
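Recognising such an atom in a version string can be sketched with a small regular expression. The function name is made up for this example, and the `_alpha54` style of suffix is taken from the kdoc example above; other projects may spell their development releases differently.

```python
import re


def release_stage(version: str) -> str:
    """Return 'alpha', 'beta' or 'rc' if the version ends in that atom."""
    match = re.search(r"_(alpha|beta|rc)\d*$", version)
    # Anything without one of the three atoms is treated as a normal release.
    return match.group(1) if match else "release"


if __name__ == "__main__":
    print(release_stage("kdoc-2.0_alpha54"))
```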

Previously, lots of projects made development releases. Lately, however, most projects have scaled back and only provide official releases and in-development code. Distributions are now taking on the job of making snapshots and (to a lesser extent) development releases. Only for major releases do projects still make development releases, to make sure the final release is as bug-free as possible.

State: stable release

The real release is the stable release. Such releases are generally governed by a specific subproject of the software project and are made simultaneously with documentation updates, web site updates, and public release information.

If you are not interested in contributing to the project, you should use the stable releases of a project since these releases are very stable (they have undergone a lot of testing), have the most support (user and development community), documentation, etc.

For instance, kde-meta-3.4.1 is an official release of the KDE project (version 3.4.1). You will find that the KDE project itself has written a Press Release: this is an official statement by the project meant for various news sites, editors, distributions and interested users to inform them a new release is made. It contains pointers about the new features and enhancements that are put in the software and where you can download the software release.

The given example, kde-meta-3.4.1, is a great way to explain version numbers. The 3 is the major version. This number only changes when very big changes have happened since the previous release. The 4 is the minor version, informing people that the release has big updates but that they don't warrant a major version bump (to bump means to increase a number by one). The 1 is the release revision (although many people will also call it a minor version). New releases that only differ in the revision number have small changes that improve stability, resolve security issues and fix bugs, but have relatively minor feature enhancements.
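Splitting such a three-part version number into its pieces is straightforward. The sketch below assumes exactly three numeric parts separated by dots, which fits 3.4.1 but not every project's numbering; the function name is hypothetical.

```python
def split_version(version: str) -> tuple[int, int, int]:
    """Split 'major.minor.revision' into three integers."""
    major, minor, revision = (int(part) for part in version.split("."))
    return major, minor, revision


if __name__ == "__main__":
    major, minor, revision = split_version("3.4.1")
    print("major:", major, "minor:", minor, "revision:", revision)
```

Because the result is a tuple of integers, two versions can be compared directly: `split_version("3.4.1") < split_version("3.5.0")` is true.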

State: revised

While the official project makes stable releases, it is the distribution that makes sure that regular users can install the software on their system. Of course, you can install the software straight from the official project, but then you don't have the advantages that the distribution offers you with respect to software management.

The distribution takes the official release and makes some minor changes to it so that it installs flawlessly on your system. It might add some eye-candy, add some additional features that the community frequently asks for, or change the location or names of some files to make the installation easier to manage.

Sometimes, the distribution finds a bug in the software (based on feedback it has received from users of the distribution or from the developers themselves). Quite often, the distribution will fix the issue for the users of the distribution and release it: in such cases, a revision update is made.

Take gdm-2.6.0.9-r3 as an example. The official release is 2.6.0.9 (if you think this is a dull version, check out binutils-2.15.94.0.2.2) but Gentoo has made three revision releases since: the first revision (-r1) added Gentoo-specific PAM support. The second revision made the package stable for various architectures (different kinds of systems). The third revision fixed some IPv6 issues.
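A name such as gdm-2.6.0.9-r3 can be taken apart into package name, upstream version and distribution revision. The following is a simplified sketch with made-up names, not the parser Portage itself uses; it assumes the version is the first part after a dash that starts with a digit.

```python
import re

# name, then "-", then a version starting with a digit, then optional "-rN".
_PACKAGE_RE = re.compile(
    r"(?P<name>.+?)-(?P<version>\d[\w.]*)(?:-r(?P<revision>\d+))?"
)


def split_package(full_name):
    """Return (name, version, revision); revision is 0 when absent."""
    match = _PACKAGE_RE.fullmatch(full_name)
    if match is None:
        raise ValueError(f"cannot parse package name: {full_name!r}")
    return match["name"], match["version"], int(match["revision"] or 0)


if __name__ == "__main__":
    print(split_package("gdm-2.6.0.9-r3"))
```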

4.b. Forks

Same software, different software

We have touched upon the idea of a fork previously. A fork happens when one group of developers is not satisfied with another group of developers and starts developing the same software, but differently. This occurs on occasion in the Free Software world.

For instance, one group of developers might not easily accept new features while there is a huge demand for them. This has happened with blackbox: its developers did not accept certain feature enhancements, so a group of developers forked the code. They started fluxbox, which was essentially the same as blackbox, but its development was different, as were the end goals. As of today, both projects still exist.

4.c. The role of distributions

Ease of use

Whereas software as released by the projects can still be quite difficult to install, a distribution makes installing software very easy. When you want to install software as released by the projects, you still need to know how to install it and what options you need to enable. You need to know what you should have installed prior to installing the software (the dependencies).

When you install software using the distribution, the distribution does all this for you. It will automatically resolve dependencies and conflicts, use the correct installation options and merge the software on your system, registering every file it installs so that uninstalling is a piece of cake.
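Resolving dependencies boils down to installing every package after the packages it needs. The sketch below shows the idea with a topological sort from Python's standard library (Python 3.9 or later). The package names are hypothetical, and real package managers handle far more (versions, conflicts, optional features).

```python
from graphlib import TopologicalSorter

# Hypothetical packages: each key depends on the packages in its set.
DEPENDENCIES = {
    "editor": {"libgui", "libspell"},
    "libgui": {"libc"},
    "libspell": {"libc"},
    "libc": set(),
}


def install_order(dependencies):
    """Return the packages in an order where dependencies come first."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    print(install_order(DEPENDENCIES))
```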

Protecting users from themselves

Because projects have development releases and even snapshots and in-development code, distributions help their users by making sure novice users cannot shoot themselves in the foot by installing such software, while more advanced users retain the possibility of using such releases.

Distributions also register every file installation. If a user wants to install software that overwrites a file, the distribution will make sure this cannot happen or that the changes are reversible. A distribution will also make sure that two packages that interfere with each other cannot both be installed on the system.
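Detecting that a new package would overwrite a registered file can be sketched as a lookup in the file registry. The registry layout, the function name and the package names below are assumptions made for illustration; they do not describe how any particular distribution stores this information.

```python
def find_collisions(installed, new_files):
    """Map each colliding path to the installed package that owns it.

    `installed` maps a package name to the set of files it installed.
    """
    collisions = {}
    for path in new_files:
        for owner, files in installed.items():
            if path in files:
                collisions[path] = owner
    return collisions


if __name__ == "__main__":
    registry = {"mailer-a": {"/usr/sbin/sendmail", "/etc/mailer-a.conf"}}
    print(find_collisions(registry, ["/usr/sbin/sendmail", "/etc/mailer-b.conf"]))
```

An empty result means the new package can be installed without overwriting files of another package.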

Feedback to upstream

One of the most important roles of distributions is to provide feedback to the original software projects about how their software functions within the totality of a Linux system. The distributions inform the software projects about bugs that users reported to the distribution and they provide valuable enhancement requests with contributions based on the revision updates they have made themselves.

Quite often distributions have developers working for them who also work on the software projects. It goes without saying that this only improves the cooperation between the two projects.

Taking care of updates

A distribution also takes care of informing the user about updates. Updates can happen for various reasons. The most important ones are security updates. In this case, the distribution warns the user that he needs to update (or accept the pending updates) because there are security issues with his current system.

Other updates are mostly new versions (new features, lots of bug fixes) made by the software project itself (in other words, new releases) or distribution-specific updates (new revision releases by the distribution).

5. Making a choice

5.a. Distributions

Differences

Now that you know a bit about Linux and Free Software, you need to make a choice about the distribution you want to use. As you already know, a distribution makes it easy for a user to install and maintain software. But a distribution does a lot more than this. In the next few sections we describe various topics which are filled in differently by distributions.

Architectures

For a system to become functional, the source code of an application must be translated to machine instructions. These instructions differ from CPU to CPU. A set of machine instructions for a certain brand of CPUs (and its clones) is called the architecture. The best known architecture is the x86 architecture, but several others exist, such as alpha, sparc, ppc, ...

Not all distributions support all possible architectures. Some distributions even limit their support to a single architecture, while others take pride in the fact that they support quite a lot of architectures.

Gentoo supports quite a few architectures: alpha, amd64, hppa, ia64, mips, ppc, ppc64, sparc, x86 and even has unofficial support for arm, m68k, s390, sh, ...

Package building

There are many ways a software title can be packaged. Some distributions do not prebuild the software (so that the system still needs to compile the source code prior to installing it on the system), but most do. Prebuilt software can be packaged in an RPM file (Red Hat Package Manager), a DEB file (Debian Package), ... Each of those package formats has its advantages and disadvantages.

By default Gentoo lets the system build the software. The format Gentoo uses is called an ebuild, which contains instructions for Portage, the Gentoo software manager, to build the software for the user.

Provided software

Some distributions only provide a few software titles because they aim at a very particular niche (like embedded Linux, a Linux terminal server, ...) while others provide a plethora of software titles. This is quite different from other Operating Systems where you need to acquire additional software, or at least have to locate, download and install it separately. With Linux, this process is often embedded in the distribution, which makes it a lot easier for the user.

Gentoo provides more than 9500 packages.

Preconfiguration

When you install software, the distribution can try to preconfigure the software for you. Some distributions go quite far so that the user hardly needs to know how to configure anything - for the common user, everything works out of the box. Other distributions do not try to configure most packages and leave it to the user. After all, the user knows best what he needs and what not.

Gentoo mainly stays with the configuration as provided by the software project and informs its users how to configure the software through excellent step-by-step documentation.

System maintenance

A Linux system is not only a collection of installed software: the software needs to work well (configuration) and should be manageable. System maintenance is the job of making sure that the system works as it should. You can maintain your entire system through a single software package (like webmin) or through a collection of software titles.

While some distributions try to provide an all-in-one maintenance solution, most distributions opt for a decentralised maintenance with specific tools for specific jobs.

Gentoo does not offer any configuration tools - the user should configure his system through the standard Linux tools.

Branding

When a system is branded it is beautified: logos are added, backgrounds changed, behaviour altered, ... so that the system feels as if it was developed and released by a single entity instead of several ones. Not all distributions like branding because it removes the default look and feel that the individual software projects have given to their software. They leave it as-is out of respect for the software projects.

Gentoo does not brand applications by default.

Installation

Whereas several distributions have a similar or even identical way of installing software, almost no distribution has the same installation method. Some distributions provide an installation where you hardly need to provide any information, others require you to perform every single step yourself. And all the other distributions are situated somewhere between those extremes.

Gentoo lets you perform every single installation step yourself, which makes it a great way to learn about Linux internals.

Policies

Although this is less visible in most distributions, some have a policy they adhere to. For instance, some distributions might have a policy that they don't allow non-free software in their distribution. Therefore such distributions will always be free to use with no restrictions whatsoever (apart from those governed by the free software license(s) they use).

Gentoo has a policy, written down in its Social Contract. It is less strict than the one mentioned in the previous paragraph, informing the user that Gentoo will never depend on non-free software. In other words, you will always have the ability to use a completely free Operating System with no crippled features whatsoever. Gentoo does offer non-free software through Portage - at least, it offers the instructions on how to integrate it successfully on your system. It will never allow you to install software against the spirit of the license under which it is released.

5.b. Software choices

Why it doesn't matter

When you are starting with Linux (and Gentoo Linux) you will undoubtedly find it difficult to know what software to install. What is the best e-mail client called? Can you run Windows applications on Linux? How good is the support for the many Word documents you might have? How can you edit pictures?

There are many, many tools available for Gentoo Linux. They offer a plethora of possibilities and functions. You do not need to know right now what software you will use: when you install Gentoo Linux, you first install a minimal, bare-bones system. When you have this, you can start finding out what software you would like to use.

Since all software is freely available, the best way to know what software to use is to try and test them out until you find one that suits you the best. Of course, it is often wise to build upon the knowledge of others: ask around what the best software would be for your needs.

6. Finding information

6.a. People

Friends and colleagues

When you are searching for information, the best place to look is amongst your friends and colleagues. They might not describe everything in full detail as you would expect from a book or technical document, but they are interactive, meaning that you can ask more questions as they come up. Another advantage is that they might reword their answers if you don't understand them.

Having friends and colleagues to ask questions to is a major advantage, especially if they also use the distribution you want to use. They might even give you on-site help: think of a private tutor :) Make sure that this person doesn't mind you asking a lot of questions though.

When you are advanced in a certain topic, remember that other people helped you when you were still a novice and share your knowledge with other people. Be open to questions and help your friends and colleagues. Don't think you know all the answers though: even Einstein made mistakes.

User groups

When you can't find your answer amongst your friends or colleagues (or they aren't immediately available for help) your best bet would be to ask in User Groups. A Linux User Group (abbreviated to LUG) is a group of Linux users who gather to discuss Linux, give Linux-related presentations, etc.

User Groups are often a good place to start as well since they give you a friendly neighbourhood-like environment where you can ask questions, as simple as they might be, without being seen as a "dull newbie". A User Group is also a good place to find distributions so you can get the latest and greatest distribution for a small fee (most likely the costs of an empty CD/DVD) so you don't need to download it yourself.

In many User Groups you will often find events such as Install Fests. An install festival is a social event to which you can bring your computer and where other people will help you install your favorite distribution. Even better, they will help you tweak it, making it faster, up to date and tailored to your needs.

Virtual forums

Web site forums are often called virtual user groups: places where you can find literally hundreds of people willing to help you in any way possible. On these forums (of which the Gentoo Forums are probably a perfect example) you can ask everything you want (as long as it remains on-topic).

Forums have a big advantage: you can always consult them, 24/7, and you will often find that people react quite fast. They also work as a great knowledge base that you can search, hopefully finding someone who has posed your question before and has received all the information he needed. In that case, you don't need to ask again (it is even considered rude to ask questions that have been answered not long ago).

Forums are also a great way to make friends: if you are very helpful yourself, you will undoubtedly get noticed. More often than not you will find out that others live near you and share the same hobbies and interests. What better incentive do you need to get out to a pub and get a beer :)

6.b. Books and guides

Online guides

For specific subjects you might find that online guides prove to be a better resource. Such guides explain a single topic at great length, often step by step, guiding you through the topic.

When you consult one of the more interactive resources (like forums) you will often be referred to an online guide which covers your subject. More often than not those guides provide the best answer to your question, so don't be upset when people don't answer your question but refer you to such a guide.

Gentoo has quite a lot of those helpful guides. If you think Gentoo is missing an interesting subject, don't hesitate to ask for one or even write one. Most documentation is written by volunteering contributors, so why not try and contribute :)

You will also find such guides, often in the form of a HOWTO, at the Linux Documentation Project.

Books

When you want to learn more about a broader subject (like Gentoo in general) or in more detail than any guide could offer, you might want to buy (or download) a full book instead. O'Reilly has several dozen books available covering a lot of subjects. A book is probably the ultimate help you can find for self-teaching, but mind you, books often get outdated and aren't replaced as fast as online guides.

Some books are available online. Quite often they have grown from a small guide into a larger one, eventually changing their layout from a guide to a book. This has happened with the Gentoo Handbook and Gentoo Security Handbook. Once they were only a few pages long. Now they span over a hundred pages.

You can find online books at the Linux Documentation Project.

Massive collaboration guides

Unlike books, which don't get many updates, and guides, which get updates only if the maintainer is active, there is a special kind of online information page that does get a lot of updates: massive collaboration guides, often in the form of so-called wiki projects.

Pages like these can be updated by any user who wishes to, making it quite easy to quickly fix issues and expand the document. But this fast updating has one major drawback: people can easily sneak errors into the guide, or provide you with a step-by-step trail that goes against the spirit of the subject you are interested in.

There is an unofficial Gentoo Wiki filled with guides written by several hundreds of users.

6.c. Online help

Manual pages

Manual pages are documentation pages that cover a single command. A manual page is a reference document that explains all possible options you can give to a command. Unlike guides, manual pages do not provide a step-by-step explanation of the subject and are therefore not suited for guided help. They are however very important once you know the tool but want to know it better.

When you are inside a Linux system, you can obtain the manual page for a specific command or subject by typing man <subject>. For instance, to get the manual page for the emerge command often used on Gentoo:

Code Listing 3.1: Getting the emerge manual page

$ man emerge

Almost every possible command has a manual page on your system.

Info pages

Another commonly used format to display information is the GNU Info browser. Whereas the man pages are a single resource containing a quick and dirty overview of the command (and its options), the info pages are a more extensive resource, dividing information in chapters, sections, ... and allowing you to browse from one subject to another.

To view an info page for a command, type info <command>. You'll be greeted by the info browser where you can navigate up and down using your arrow keys (line by line) or PageUp and PageDown (screen by screen). When you encounter a link (visualized by a * in front of it and :: after) press Enter to go to the page.

Using the keys u (up), n (next) and p (previous) you can navigate through the documentation easily. To quit, press q.

Added documentation

Lots of software tools add documentation to your Linux system. This documentation can be in the form of a PDF document, HTML pages or plain text. In most cases, this documentation is stored in /usr/share/doc/<software title>.

For instance, the bzip2 compression utility has a manual (in PDF format) stored inside /usr/share/doc/bzip2-1.0.3-r4 (the version might be different on your system).
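As a small sketch of how you might look up such added documentation without knowing the exact version, the helper below lists every documentation directory whose name starts with a given package name. The function name find_docs and its optional second argument are made up for this illustration; only the /usr/share/doc location comes from the text above. It is written as a script, so the prompt character is left out.

```shell
#!/bin/sh
# List the documentation directories whose name starts with a package name.
# Usage: find_docs <package> [doc-root]
# find_docs is a hypothetical helper; the doc root defaults to /usr/share/doc.
find_docs() {
    pkg=$1
    root=${2:-/usr/share/doc}
    for dir in "$root"/"$pkg"*; do
        # The pattern stays unexpanded when nothing matches, so check first.
        if [ -d "$dir" ]; then
            printf '%s\n' "$dir"
        fi
    done
    return 0
}

# Prints nothing if no bzip2 documentation is installed on this system.
find_docs bzip2
```

Note that the pattern also matches other packages whose name merely starts with the same letters, so treat the result as a list of candidates.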

Immediate help

Most tools have immediate help available when you run the tool with --help or -h as one of its arguments. Do not hope to find much information here: in most cases the help provided is just a short summary of the available options.

For instance, for the emerge command (which does list quite a lot of detailed information):

Code Listing 3.2: Getting immediate help for the emerge program

$ emerge --help

7. So far so good

7.a. Handbook syntax

Used symbols and colors

Okay, we are now at the end of this first part. As you might have seen, the previous sections suddenly started using some Linux-specific commands. I will quickly explain how this handbook uses those Code Listings and other syntax.

Code listings

A Code Listing can be a command that needs to be executed. When this is the case, the command is prepended with a symbol that refers to the prompt.

A prompt is a short string given by the system to the user, telling that the user can give a command. By default, the prompt for a regular user would look like so on a system with hostname "localhost" and username "john":

Code Listing 1.1: Example prompt

john@localhost ~ $

When you are the root user, the prompt will look like so:

Code Listing 1.2: Example prompt for the root user

root@localhost ~ #

As you can see, it differs not only by the user name, but also the ending character: regular users have a prompt that ends at $, but the root user has an ending character of #. For this reason, we will use this single character throughout the rest of the document to refer to the prompt. When the character is a $ you can (should) execute the command as a regular user. When the character is a # you can (should) execute the command as the root user.

For instance, the ls command (which lists the content of the current working directory) can very well be run as a regular user, but to install a package (like bzip2) you need to be root:

Code Listing 1.3: Example Code Listing usage for commands

$ ls
# emerge bzip2

As you can see, the command itself is highlighted. When there is output from the command to the screen that you do not need to type, it will be in plain text. When we add some comments, you will notice that it has a different layout. For instance, to change your password:

Code Listing 1.4: Changing the current user's password

$ passwd
Old password: (Enter your old password)
New password: (Enter the new password)
Re-enter new password: (Re-enter the new password to verify)
Password changed.

We will also use Code Listings to show the contents of a file.
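If you are ever unsure which of the two prompt characters applies to your current session, a sketch like the following can tell you. It relies on the fact that the root user has the numeric user ID 0; the function name prompt_char is invented for this example. Because it is a script rather than an interactive session, the prompt character itself is left out.

```shell
#!/bin/sh
# Print the prompt character that matches a numeric user ID:
# '#' for root (user ID 0), '$' for any regular user.
# prompt_char is a hypothetical helper, not a standard tool.
prompt_char() {
    if [ "$1" -eq 0 ]; then
        printf '#\n'
    else
        printf '$\n'
    fi
}

# id -u prints the numeric ID of the user running the script.
prompt_char "$(id -u)"
```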

Warnings

When the information in this handbook is incorrect due to a bug or a temporary issue, I use a warning to inform you about this temporary setback. I prefer to do it this way rather than change the content itself because I feel that documentation should not be used to fix bugs (or provide workarounds).

An example warning would look like so:

Warning: Due to a bug in the Evolution ebuild you can not install version 2.2.3-r2 for the time being. Please use 2.2.3-r1 until a fix has been found.

A more permanent warning will look like so:

Warning! Do not set the USE variable on the command line as a variable. This will temporarily assume that those USE flags are given, but the next time your system is updated this information is forgotten.

Important

When we want to stress something important, we will normally do so in the paragraphs themselves using emphasised or bold text. However, when it is quite urgent and would require a larger rewrite, we will temporarily use an importance box like so:

Important: Make sure /etc/hostname is removed afterwards. Otherwise the error will remain since the baselayout package first checks this file prior to /etc/conf.d/hostname.

Notes

At the end of a chapter we might add a few notes, either as a kind of footnote or as a reference to another resource. If the number of notes isn't too large, we will use a note box like so:

Note: The Linux Documentation Project has a few guides on networking as well. Definitely worth a read.

7.b. What can you expect

Installing Gentoo

In the next part, we will give you lots and lots of Linux technical information to allow you to install Gentoo Linux on your system. Unlike with the Gentoo Handbook we will try to be more verbose (yes, that's possible :) but less step-by-step.

Why? Because you should be able to install Gentoo the way you like it without the need to take a look at the step-by-step descriptions. Not that the Gentoo Handbook is written badly (hey, I wrote most of it :p) but just... differently :)

B. Installing Gentoo

1. Versions, media and installation concerns

1.a. About the Gentoo releases

Gentoo versions

One of Gentoo's major advantages is that it does not really know versions. Once installed, you have a Gentoo installation, not a Gentoo 2005.1 or 2004.3 installation. Therefore you will continue to benefit from the Gentoo development with every system update you perform - there is no need to run through a specific upgrade procedure every time Gentoo makes a new release.

Yes, Gentoo does release often - twice a year to be exact. Such a release brings you an up to date installation CD with the latest hardware drivers and features that might improve your installation experience. It can also contain a set of prebuilt packages, helping you to install Gentoo quickly and efficiently.

When you hear someone talk about a specific Gentoo release (Gentoo 2005.1 for instance) they are talking about the installation CDs and set of prebuilt packages, not about the state Gentoo is in at a certain point in time. Gentoo evolves on a daily basis, but can't develop, package, test and release new installation CDs and prebuilt packages every time Gentoo changes...

Gentoo release media

By default, Gentoo releases installation CDs: bootable CDs allowing you to immediately boot in a Gentoo Linux environment containing the necessary tools to help you install Gentoo on your system. Such CDs shouldn't be read from any other operating system but immediately booted from.

For each architecture (see the note below) you will find two installation CDs: a minimal installation CD and a universal one. They both contain the same hardware support drivers (Linux kernel and additional kernel modules) and tools; the universal installation CD however also contains the necessary files to allow any user to install Gentoo without requiring a working Internet connection.

It is important to understand that:

  1. the Gentoo installation procedure is a manual procedure, requiring lots of input from the administrator
  2. there is support for a networkless installation if you use the universal installation CD and the stage-3 installation approach (which will be discussed later), but that Gentoo should not be considered if you don't have a working Internet connection

You will also find a packages CD. As the name implies, this CD contains prebuilt packages you can quickly install to get a working Gentoo installation without going through much software building.

However, these software packages are only available for those who perform a networkless installation and are not maintained by the Gentoo project at all: they are only meant for use during the initial installation of Gentoo. Once installed, your system is not different from any other Gentoo installation.

This part will not talk about the networkless installation. We have decided to postpone any information regarding prebuilt packages to a later stage because of the following reasons:

  • The networkless installation instructions are limiting the user's choices. Only a fraction of the software which a user can (and should) install during installation is available and the user might not be able to deviate from the standard installation routine.
  • The set of prebuilt software is quickly outdated. As Gentoo does not offer a continuously maintained repository of prebuilt packages, any user who does a networkless installation might be facing an installation with insecure software for the time between the (quick) installation and (slower) upgrading.
  • The available prebuilt software differs from architecture to architecture, from release to release. If one wants to have pseudo-static documentation on the Gentoo installation procedure, such variable information should be eliminated.

Note: An architecture is a family of CPUs that support the same instructions. The best-known architecture in the desktop world is x86, referring to the Intel-compatible systems. Others are sparc, ppc, mips, ... amd64 is also an architecture although it has additional Intel compatibility. If you are not sure what architecture to pick, don't hesitate to ask.
Incidentally, amd64 is the most common answer to that question, followed by x86 :)

1.b. Gentoo installation approaches

Introduction

You should understand that the Gentoo installation procedure - at least the officially publicised one - is quite different from most other Linux distributions: where other distributions try to perform most steps for you, Gentoo Linux asks you politely (but firmly) to do things yourself.

Getting the hardware up and running, configuring the network, partitioning your disk(s), copying over the initial files, building additional software (including the kernel), ... all these steps should be performed before you can finally boot in a minimal Gentoo environment. Not that all these steps can't be automated (Gentoo even offers tools to automate a few of those steps and you'll find a lot of unofficial installers that automate most - if not all - steps) but by documenting these steps in great detail Gentoo almost forces you to learn various Linux-related procedures.

Another advantage of letting the user perform all steps himself is that the user can now decide himself how he wants to install Gentoo - the options are there, the user needs to make a choice, over and over again. By clearly identifying the options and documenting the possible roads Gentoo hopes that the user is not scared but rather impressed.

For instance, Gentoo offers the user three initial system states from which to start installing Gentoo. These states are called stages: stage-1, stage-2 and stage-3.

Easy and fast: stage-3

The stage-3 system state starts from a minimal Gentoo environment, containing the core system utils that anyone would need to get Gentoo up and running. This is the preferred initial state for most users and also the quickest way to install Gentoo. From this stage onward, the user installs the additional tools he requires (such as certain networking tools for automated IP information retrieval, cron jobs for scheduled process execution, system logger for keeping track of all log events, ...) and builds a Linux kernel to boot from.

When you want to install Gentoo without a working Internet connection (the networkless installation approach) you must use the stage-3 approach since the universal installation CD only contains source code for the additional tools you should install - not for the tools already available in the stage file.

Although the stage-3 system state is the most full-featured one, many users see it as a bloated stage, thinking they can't tweak as much as they can with the other stages. This is wrong, as you can easily rebuild the entire system with new (compiler and USE) settings - and on many occasions faster too!

Tweaking the system: stage-2

The stage-2 system state contains a built and functional toolchain but no system utilities. This is an intermediate state between a stage-1 and stage-3 and also the least often used approach to install Gentoo with. Those who do consider using this stage often alter their profile with respect to base system packages and perform major tweaks with the CFLAGS, CXXFLAGS and USE variables.

Although Gentoo offers a stage-2 initial system state, you should consider performing a stage-2 installation with a stage-3 initial system state. This will protect you from possible circular dependency issues that are inherent in the stage-2 build.

Tweaking the bootstrapping procedure: stage-1

The stage-1 system state contains a non-optimized toolchain with no system utilities. This is the state where Gentoo Release Engineering developers start from to move to a stage-3 state by rebuilding the toolchain for the specific architecture (migrating to a stage-2) and using this newly rebuilt toolchain to install the system core utilities (migrating to a stage-3).

This state is only interesting for those who attempt to change the bootstrapping procedure (by changing the bootstrap.sh script) or who want to build a non-default Gentoo environment (for instance using a completely different toolchain).

Although Gentoo offers a stage-1 initial system state, you should consider performing a stage-1 installation with a stage-3 initial system state. This will protect you from possible circular dependency issues and bootstrapping failures inherent in the stage-1 procedure (which is quite complex).

1.c. Download, burn and boot

Download the media

With the information handed to you in the previous sections you should have an idea what CD(s) you need to download. Gentoo provides the CDs either as an FTP/HTTP download or through the BitTorrent peer-to-peer network. Pick the latest version available (as that one contains the most up-to-date hardware support and additional features) although this is not mandatory: you can easily install a (current) Gentoo from an older installation CD.

As the directory structure on the FTP/HTTP mirrors suggests, you'll find the CDs in the releases/ directory.

Each CD is fully contained within an ISO file. Such a file contains all the content of a CD and should be burned on the CD using a specific (but well supported) procedure. Most CD/DVD burning tools call it Burn ISO or Raw burning; it differs from the regular burning methods in that it burns the content of the file on the CD, not the file as-is (i.e. the end result is not that you just see the single file on the burnt CD).

If you want, you can verify the downloaded ISO file using the .md5 file we provide. This file contains a Message Digest 5 (MD5) checksum of the file, a mathematical result computed over the entire ISO file that is practically unique to it. In other words, it is extremely unlikely that you will come across a different file with the same checksum by accident. Under Linux, you can use the md5sum tool to verify the checksum.

We also provide a digital signature of the file made with our private Release Engineering key. This digital signature can be used to validate the origin of the ISO file: if the signature verifies against the public key of the Release Engineering team, then the file is authentic. Under Linux, you can use the gpg tool to verify the signature.
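To get a feeling for how checksum verification behaves, the following sketch runs the md5sum tool against a small stand-in file instead of a real ISO. The file name install-minimal.iso and the scratch directory are assumptions made for the example; with a real download you would skip the steps that create the files and only run the md5sum -c step next to the downloaded ISO and its .md5 file.

```shell
#!/bin/sh
# Stand-in for a downloaded ISO; the file name is hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'pretend this is a CD image\n' > install-minimal.iso

# Normally the .md5 file is downloaded next to the ISO; here we create it.
md5sum install-minimal.iso > install-minimal.iso.md5

# -c recomputes the checksum of each listed file and compares it.
md5sum -c install-minimal.iso.md5

# A single extra byte makes the check fail (non-zero exit status).
printf 'x' >> install-minimal.iso
md5sum -c install-minimal.iso.md5 || echo "checksum mismatch, download again"
```

The gpg tool works along the same lines: its --verify option takes the signature file followed by the file that was signed, and needs the signer's public key to be present in your keyring.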

Booting the CD

To get in the initial Gentoo environment, you need to boot from the installation CD. How to achieve this depends on the architecture you are using. The first appendix in this book covers the various architecture-specific aspects of a Linux system, including booting CDs.

Once booted, you will see that the installation CD already tried to load the necessary drivers and hands you over to a root prompt, indicating that the system is waiting for further input:

Code Listing 3.1: Resulting prompt after a successful boot

root ~#

This is the command-line prompt. You are now booted in the initial Gentoo environment, ready to continue.

2. Starting from a minimal environment

2.a. Introduction

The prompt you are now looking at is all-powerful, but a bit daunting at first. If you know how the Linux command line works, this chapter will give you no further surprises and you can easily jump to the next chapter, Preparing the Network.

2.b. Basic navigation

Working command-line

At the prompt you can enter commands to the Gentoo Linux environment. Basic commands are just single words, like ls or ps. Most commands however require additional words to be added, like cd /var or man ls. These added words are called arguments.

Often, these arguments represent certain options to the command. For instance, the ls command (used to list the content of a directory) can take several options, like -l (for a lengthy description of each found file), -a (to include hidden files), etc.

In the next Code Listing you'll see this terminology explained. The command shown will list the content of the /var/tmp directory (a temporary location) showing additional information about each file found, including hidden files.

Code Listing 2.1: Command, arguments and options

root ~# ls -la /var/tmp
<--+--><+> <+> <---+-->
   |    |   |      `- Argument, in this case the target directory to list
   |    |   `- Argument, also two options: lengthy description + hidden files
   |    `- Command (list in this case)
   `- Prompt

After having constructed a command, you can execute it by pressing the return key. Your shell (the command-line environment you are currently "in" which interprets commands and helps you navigate through the Linux environment) will then execute the command and show you the results.
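The terminology can also be shown from the shell's own point of view. The sketch below takes a command line that has already been split into words and labels each word; the function name describe_words is invented for the example, and treating every word that starts with a dash as an option is a simplification that holds for most, but not all, tools.

```shell
#!/bin/sh
# Label the words of a command line: the first word is the command,
# all following words are arguments; arguments starting with '-' are options.
# describe_words is a hypothetical helper for illustration only.
describe_words() {
    printf 'command:  %s\n' "$1"
    shift
    for word in "$@"; do
        case $word in
            -*) printf 'argument: %s (option)\n' "$word" ;;
            *)  printf 'argument: %s\n' "$word" ;;
        esac
    done
}

describe_words ls -la /var/tmp
```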

The next section will give you a crash course in certain Linux commands that will help you explore your current minimal Gentoo environment. We won't go into much detail - there are plenty of guides and books available online that will inform you about the basic Linux tools (Rute's Unix Tutorial Exposed is one of the more famous, freely available books).

Navigating up and down

We have already covered what a Linux file system looks like (the hierarchical structure, remember?). To help you out we'll give you a quick overview of the most common tools you might need to navigate on your Gentoo Linux system:

Command | Description | Example
ls | List the content of a given directory, or the current directory if no directory is given. | ls /mnt/cdrom
pwd | Show the current working directory; this is the full pathname of the directory you are currently in. | pwd
cd | Change the current working directory to a different location. If no directory is given, go to the user's home directory. | cd /mnt/gentoo
less | Show the content of a given file on the screen. You can navigate with the up and down arrows through the file and quit the application by pressing 'q'. | less install.txt
rm | Remove a file from the system (if you have the required privileges). To remove a directory with all files in it, use the -r option. Be careful with this command though, it won't warn you when you are about to destroy all your data. | rm portage.tar
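A safe way to try these commands out is to play with them in a throw-away directory. The following sketch does just that; the directory and file names are made up, mkdir and mktemp are additional tools used only to set things up, and cat stands in for less because a script cannot press 'q' for you.

```shell
#!/bin/sh
# Work in a throw-away directory so nothing important can be removed.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
pwd                          # shows the full path of the scratch directory

mkdir notes
printf 'step one\n' > notes/install.txt
ls notes                     # lists: install.txt

cd notes || exit 1
cat install.txt              # like less, but without the interactive pager
cd ..

rm -r notes                  # -r removes the directory and everything in it
ls                           # prints nothing: the directory is empty again
```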

Concurrent terminals

Gentoo's installation CDs allow you to use a couple of terminals simultaneously. That means you can work in one, browse the internet in another and chat on a third. To switch between terminals, press Alt+F# (with F# one of the function keys). You will notice that your current session is at F1.

You can also work in a terminal and then put your working session in the background using a powerful tool called screen. With screen you can even let other people work on your terminal while you are watching every step they perform.

Code Listing 2.2: Working with screen

(Starting a screen session:)
$ screen -S mySession
(To detach a screen session, type 'Ctrl-A' followed by a 'd')

(Reattach to a screen session:)
$ screen -x mySession

(To quit a screen session, just type 'exit':)
$ exit

2.c. Networking utilities

Gentoo offers a few utilities on the installation CDs which you can use to surf on the Internet, download files, chat on the IRC network, etc. We will cover a few of them in this section, but you can't use them until you have configured your network (and Internet connection) which is described in the next chapter.

Surfing on the Internet

Because no documentation can be perfect and no two environments are alike, you will often search for additional information and help on the Internet. Websites such as the main Gentoo documentation repository or powerful search engines like Google are a welcome resource during your installation quest.

To browse through these sites you need a browser. Because the Gentoo installation CDs don't contain a graphical environment you need to use either a different system, a different installation medium or ... a non-graphical browser. The Gentoo installation CDs offer at least one of the following console browsers:

  • lynx, a general purpose web site browser which operates key-driven (for instance 'D' for downloading, 'G' to go to a different site, ...).
  • links2, a featureful browser with support for frames, limited JavaScript, svgalib/framebuffer, background downloading, ... which operates both menu-driven (press 'Escape' to open the menu) and key-driven.

Both browsers support proxy servers, although the first one uses the standard way (setting an HTTP_PROXY environment variable) while the second one requires you to enter the proxy server in the browser.

Code Listing 3.1: Example surfing information with proxy support

(For lynx - ignore the export command if no proxy is needed:)
$ export HTTP_PROXY="http://myproxy.server.tld:8080"
$ lynx http://www.gentoo.org

(For links2 - ignore the -http-proxy option if no proxy is needed:)
$ links2 -http-proxy http://myproxy.server.tld:8080 http://www.gentoo.org

Chatting on an IRC network

When you can't find useful information on the Internet, you can always ask your question on the #gentoo-install or #gentoo IRC channels on FreeNode. Gentoo delivers a terminal-based, yet extremely powerful chat client called irssi.

Its use is quite simple. First, connect to the IRC network. Then, join the channel(s) you want to participate in. You can use Alt+# (with # the number of the window) to switch between channels (or type in /window # if the key combination fails). To exit the application, type /quit.

Code Listing 3.2: Getting online with irssi

$ irssi -c irc.freenode.net -n YourNickName
(Wait until the connection is made)
[irssi #] /join #gentoo

Remote shell access

The Gentoo installation CD contains an SSH daemon, which is a tool to allow others to securely connect to your system so they can help you install Gentoo. This service isn't started by default, but if you want to use it you should:

  1. get the Internet connection up and running
  2. create a user account
  3. give the root account a password
  4. start the SSH daemon

If you trust the other person, you can give them your root password, but we advise you to only give limited access to the other person - they should help you identify errors and tell you how to resolve them, not fix the errors themselves. Otherwise you won't learn :)

Code Listing 3.3: Steps to get the SSH service running

(First get the Internet connection up and running, then...)
# useradd -m -G users myuser
# passwd myuser
(Give the 'myuser' user a password)
# passwd
(Give the 'root' user a (different) password)
# /etc/init.d/sshd start

The passwords you set here are limited to the Gentoo installation CD environment and only until you reboot. They are not used for your final Gentoo installation!

3. Preparing the network

3.a. Introduction

Why you need it ...

Not everyone requires a working Internet connection, but if you can get one it is best to use it - this part of the book only covers installations with a working Internet connection because it is highly advantageous. Not only can you immediately install the latest versions of all the Gentoo packages, it also allows you to go on the Internet to seek help and support, or to chat while your system is installing Gentoo.

In this chapter we cover one specific network type: Ethernet. An Ethernet network connection is often referred to as the wired network, using an RJ-45 network plug. Wireless networking, also known as WiFi or IEEE 802.11x, is also supported from the Gentoo installation CDs but not covered in this chapter (you will find more information about it further in this book).

3.b. Recognizing the hardware

Auto-detection

You first need to configure your system to support your network card. Chances are that the card is already found and supported - to verify this, use the ifconfig tool. This tool will tell you what interfaces are available on your system. An interface is what Linux assigns IP address information to. It can represent a networking device but doesn't have to: there are interfaces for the local system (localhost, which doesn't need a network card), network tunnels, bridges, etc.

Code Listing 2.1: Showing all available interfaces

# ifconfig -a

The above command will display statistics about each interface found on your system. The interfaces that start with eth are the ones that interest us the most. Those interfaces represent the network cards. If you have one network card, it will probably be called eth0. If you have two, they would be called eth0 and eth1, etc.

If you have several network cards, you can't tell right now what interface represents what network card. There are several ways to find this out, but the easiest method is to assume that eth0 is the one you are interested in - and if it turns out your network doesn't work, try switching the cable :)

Now, if you have seen eth-interfaces then your hardware is found and supported and you can continue with the next section (DHCP or Static IP Address?). Otherwise you need to find out what chipset your network card uses and load the necessary hardware support.
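If the output of ifconfig is too verbose for a quick check, a sketch like the one below prints only the names of the interfaces that start with eth. It assumes that your kernel exposes the interfaces as entries under /sys/class/net (which requires the sysfs file system to be mounted); the function name eth_only is invented for the example.

```shell
#!/bin/sh
# Keep only interface names that start with "eth" (one name per line on stdin).
# eth_only is a hypothetical helper for illustration only.
eth_only() {
    grep '^eth' || true      # grep exits non-zero when nothing matches
}

# /sys/class/net holds one entry per interface when sysfs is mounted.
if [ -d /sys/class/net ]; then
    ls /sys/class/net | eth_only
fi
```

An empty result means the same as seeing no eth-interfaces in the ifconfig output: the hardware support still needs to be loaded.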

Manual hardware discovery

With lspci you can get an overview of all PCI devices found on your system. The interesting part of this tool is that it shows the function of the device and the brand and type, making it excellent to discover what you have under the hood of your system.

We are interested in the Ethernet controller or Network controller, so we filter the output of the lspci command using the grep tool, which only gives us those lines that have 'Ethernet controller' or 'Network controller' in them:

Code Listing 2.2: Showing all Ethernet controller devices

(Substitute with 'Network controller' if this doesn't give satisfying results)
# lspci | grep 'Ethernet controller'
0000:06:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8169 
  Gigabit Ethernet (rev 10)

With this information, you can start searching for the kernel module that offers support for your Ethernet card. In the above example, the module is r8169.ko. Although there is a quite efficient way to find out what kernel module exists for what chipset (search through the Linux kernel menu configuration information) we can't use this from the Gentoo installation CD as you don't have this configuration information available yet.

You are not left in the dark though. You can try digging through all the available kernel modules that support network cards, hoping to find any reference of the chipset you have. An easy method is to 'dump' a list of all kernel modules and filter out those that contain a good identifier in their name.

For instance, the above example could lead to the discovery of the r8169.ko module using the following method:

Code Listing 2.3: Filtering the list of kernel modules

# find /lib/modules | grep -i '8169'

The -i tells the grep tool to match case-insensitively. Okay, it doesn't matter for the example since we're asking grep to filter on a number - which isn't affected by case sensitivity - but this is quite important for different filters.

You can try out various terms that occur in the chipset name as well, either as separate jobs, or by providing grep with an entire list of terms you want to filter on. This can be accomplished by adding the -E option and handing it a list, separated by '|' signs and contained within '(' and ')'. The -E tells grep to interpret the filter as an extended regular expression:

Code Listing 2.4: Filtering the list of kernel modules on a regular expression

# find /lib/modules | grep -iE '(8169|realtek|rtl)'
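To see what such a filter lets through, you can feed it a short list of sample paths instead of the real output of find. The paths below, including the kernel version and the upper-case file name, are made up for this illustration; only the grep options are the ones discussed above.

```shell
#!/bin/sh
# Sample module paths standing in for the output of "find /lib/modules".
# The kernel version and the upper-case name are made up for illustration.
sample_modules() {
    cat <<'EOF'
/lib/modules/2.6.12/kernel/drivers/net/r8169.ko
/lib/modules/2.6.12/kernel/drivers/net/8139too.ko
/lib/modules/2.6.12/kernel/drivers/net/e1000/e1000.ko
/lib/modules/2.6.12/kernel/drivers/usb/net/RTL8150.ko
EOF
}

# Case-insensitive match on any of the three alternatives.
sample_modules | grep -iE '(8169|realtek|rtl)'
```

Two of the four paths are printed: the first one because it contains 8169, the last one because RTL matches rtl once case is ignored. Without the -i option only the first path would remain.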

Once you have found a possible kernel module for your network card, you can try to load it in memory using modprobe. This tool will search for the module, query for possible depending modules (some modules require another module to be loaded first) and then load the module in memory. For instance, for the r8169.ko module:

Code Listing 2.5: Loading a kernel module in memory

(Notice that we have dropped the ".ko" suffix!)
# modprobe r8169

If this command didn't fail, try ifconfig -a again to see if you have a working interface. If not, keep trying...

3.c. DHCP or static IP address?

Setting the IP address information

Now we need to configure the interface to obtain an IP (Internet Protocol) address, a unique address that identifies your system in the network. In most environments, this can be obtained automatically using DHCP (Dynamic Host Configuration Protocol). This protocol allows interfaces to send out a request for an IP address on the network and receive IP address and routing information from the DHCP server (which is often found in home routers or in enterprise environments as a stand-alone service).

Note: TCP, the other part of the TCP/IP combo, stands for Transmission Control Protocol and is responsible for the communication between applications on two systems. It has no knowledge of system addresses (that's IP's job) but uses ports instead to distinguish one communication session from another.

If you have a DHCP service on your network, you need to run a DHCP client for your interface. With Gentoo, you have dhcpcd at your disposal. Other clients exist as well though, such as dhclient and pump. To automatically obtain an IP address for the eth0 interface, run dhcpcd eth0 after which you can continue with Testing the Network.

Code Listing 3.1: Obtaining an IP address

# dhcpcd eth0

If your network card was automatically detected by Gentoo and your network supports DHCP, you'll probably receive a warning telling you that an instance of dhcpcd is already running. That's okay, it means that the installation already had the network configured for you.

If you need to configure your network using a static IP address, you have to know:

  • which IP address you can use,
  • what IP address your gateway listens to, and
  • what part of the IP address is reserved for the network identification

A gateway is a system that acts as the connection between your network and the outside world. If you have a PC that shares an Internet connection (which often uses NAT - Network Address Translation) it is most likely that its IP address is your gateway IP address.

The network part of the IP address is what separates the addresses inside your network from outside IP addresses. For instance, if your IP address is 192.168.1.12 and all IP addresses in your network are in the range of 192.168.1.1 to 192.168.1.255, then the network part of the IP address consists of the first three numbers. This is expressed by the netmask 255.255.255.0.
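
To make the relation between IP address and netmask a bit more concrete: the network part is what remains when you combine both with a bitwise AND, number by number. The following is only a small sketch; the function name network_address is our own invention and not part of any tool on the installation CD:

```shell
# Print the network address for a dotted IP address ($1) and netmask ($2).
network_address() {
    # Split both arguments on the dots, giving eight numbers in total.
    old_ifs=$IFS
    IFS=.
    set -- $1 $2
    IFS=$old_ifs
    # $1..$4 are now the IP address numbers, $5..$8 the netmask numbers.
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
}

network_address 192.168.1.12 255.255.255.0
```

For the example values this prints 192.168.1.0: the first three numbers are kept and the last one, the host part, is zeroed out.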

To configure your interface with a static IP address you can use the ifconfig application which we have already seen when we tried to discover what interfaces were available on your system.

The ifconfig tool requires you to pass it the interface, the IP address and the netmask of the network. Suppose that 192.168.1.12 is your IP address and 255.255.255.0 your netmask:

Code Listing 3.2: Running the ifconfig tool

# ifconfig eth0 192.168.1.12 netmask 255.255.255.0 up

Next we need to configure the system to pass on requests for the Internet to the gateway. With the route command you can set up the default gateway, which is the system to which requests for an unknown network are passed. Assuming that 192.168.1.1 is the gateway IP address:

Code Listing 3.3: Setting up the default gateway

# route add default gw 192.168.1.1

We are almost there. You should be able to get on the Internet ... if you knew all the IP addresses of all the servers by heart. To be able to use hostnames as well, you need to tell the system where the name servers are: systems that can translate hostnames to IP addresses. Your Internet Service Provider or network administrator should be able to tell you what the IP addresses for the name servers are.

You need to place these IP addresses in the /etc/resolv.conf file, whose sole purpose is to configure anything related to name resolution, including where the name servers are located.

To edit this file, you can use nano, a simple text editor for the command line. Other editors that might be available are vi and emacs. nano however is certainly available, so we will use nano as an example. Assuming that the name servers are 123.45.67.89 and 123.45.67.90:

Code Listing 3.4: Editing /etc/resolv.conf

# nano /etc/resolv.conf
(Change the content of the file to contain the name servers:)
nameserver 123.45.67.89
nameserver 123.45.67.90
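
If you prefer not to use an editor, the same content can be generated from the command line. This sketch writes to a scratch file so that it is safe to try out; on the installation CD the target would be /etc/resolv.conf, and the addresses are the same example values as above:

```shell
# Scratch file for experimenting; the real target would be /etc/resolv.conf.
target=$(mktemp)

# printf repeats its format for every argument, so each address
# ends up on its own "nameserver" line.
printf 'nameserver %s\n' 123.45.67.89 123.45.67.90 > "$target"

cat "$target"
```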

Now that's done, you should test your network connectivity.

Testing the network

With the ping tool you can send small requests to servers around the Internet (but also on your network) and ask them to reply back. This makes ping a perfect tool to check if a system is reachable. We'll use this tool to verify the network connectivity.

First, we will try to reach a Google web server. We'll send it three requests - if they come back, your network (and Internet connection) is working great and you can continue with the next chapter.

Code Listing 3.5: Sending three requests to www.google.com

# ping -c 3 www.google.com

If you are not able to ping this system by name, you should try to ping an Internet server by its IP address. In the following example, we send three requests to 66.249.93.104, which is an IP address for a Google server. However, IP addresses might change so it is easier if you first verify that this IP address is really functional on a different system which has a working Internet connection.

Code Listing 3.6: Sending three requests to 66.249.93.104

# ping -c 3 66.249.93.104

If this works, then the problem is with the name resolving. Verify that your /etc/resolv.conf contains the correct IP addresses for the name servers. Those IP addresses should be reachable (you can ping those as well to verify). Also verify that /etc/nsswitch.conf has a line that starts with 'hosts' and contains 'dns' as a keyword. This file tells your system where to look for various resources, such as name resolving information. The 'dns' keyword tells the system that /etc/resolv.conf's name servers should be used. The 'files' keyword tells the system that /etc/hosts contains a few IP addresses with hostnames which should be used as well.

Code Listing 3.7: Verifying the /etc/nsswitch.conf file

# grep -E '^hosts' /etc/nsswitch.conf
hosts:       files dns

If 66.249.93.104 wasn't reachable either, the gateway might be misconfigured. Verify that your gateway is set correctly by running route -n: the gateway IP address is the one mentioned right next to the 0.0.0.0 destination. In the next example, we use the awk tool to filter the output of the route command: of the line that starts with 0.0.0.0 we only show the second 'word' (which, in this case, is the gateway IP address):

Code Listing 3.8: Verifying the configured gateway IP address

# route -n | awk '/^0\.0\.0\.0/ {print $2}'
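
To see what the awk filter does without touching your routing table, you can feed it some text yourself. The routing table below is an invented sample in the layout that route -n uses, and default_gateway is just a name we picked for the wrapper:

```shell
# Wrap the awk filter so it can read any routing table from standard input.
default_gateway() {
    awk '/^0\.0\.0\.0/ {print $2}'
}

# Invented sample in the layout of "route -n" output.
sample='Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.1.0     0.0.0.0         255.255.255.0   U     0      0        0 eth0
0.0.0.0         192.168.1.1     0.0.0.0         UG    0      0        0 eth0'

echo "$sample" | default_gateway
```

Only the last line of the sample starts with 0.0.0.0, so its second column, 192.168.1.1, is printed.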

If the gateway IP address seems correct, try to ping it to see if you can reach it. If you can, then the gateway itself is either blocking your Internet connections (perhaps a firewall issue) or the system isn't the gateway at all but just another host on your network.

If you can't reach the gateway IP address but you are confident that the IP address is correct and that the gateway doesn't have a firewall that is dropping all your requests (including the ping requests), then your interface is malfunctioning. Make sure the network cable is plugged in and the cable is meant for the connection type you are using (the straight versus crossed UTP cable debacle).
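
The troubleshooting steps of this section boil down to a small decision tree. The sketch below only encodes that reasoning; the function name diagnose and its yes/no arguments are our own convention, and you would fill them in based on the outcome of the three ping tests (by name, by IP address, and to the gateway):

```shell
# diagnose NAME_OK IP_OK GATEWAY_OK
# Each argument is "yes" or "no", the outcome of the corresponding ping test.
diagnose() {
    if [ "$1" = yes ]; then
        echo "network is working"
    elif [ "$2" = yes ]; then
        echo "check name resolving (resolv.conf, nsswitch.conf)"
    elif [ "$3" = yes ]; then
        echo "check the gateway setting or its firewall"
    else
        echo "check the interface and the cable"
    fi
}

# Pinging by name failed, but pinging by IP address worked.
diagnose no yes yes
```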

4. Putting the minimal environment in place

4.a. Storage

Introduction

A difficult job of any Linux installation is to prepare the partitions to house the Operating System. Each person has a different taste as to what should be on a different partition and what shouldn't, or what file system to use.

The idea behind partitioning is to make some sort of separation between one set of data and another. For instance, people have their /boot separated from the root file system because they want to be able to hide the /boot content from the system during regular operations. Or they want to have /home separate because that would allow them to store all user-specific information, settings and data on a different disk so they are able to easily retrieve the data after an Operating System reinstallation.

You can also choose to have a separate partition because you want to improve performance by using a different file system tweaked to the use of the data stored on the partition.

Deciding on the partition scheme requires intimate knowledge of the file system:

  • What files are stored where?
  • How often are the files used? Read or written?
  • What is the function of the system you are building?
  • What features do the various file systems have?

Because this is something that comes with age (err, experience ;) your first, second, third, ... attempt at a perfect scheme will undoubtedly fail. Lots of people therefore opt for a simple partitioning scheme (like everything on a single partition, with a separate partition for the swap [1] information). Others attempt to use a more dynamic approach and use Volume Management.

With Volume Management, you place a layer in between the partitions on the disk and the file systems that hold your data. This layer can be used to combine multiple partitions as if they were just a single one, or to use several logical divisions on a single partition. Of course, Volume Management is much more than that: it makes it easier to move data across partitions, shrink or grow file systems, etc.

Note: [1] Swap space is a specific location on the disk where the Linux kernel can store memory pages (regions of data in memory, assigned for use by processes or by the kernel itself) that will most likely not be used in a while. When memory runs short, the Linux kernel moves such memory pages to the swap location, thereby freeing internal memory for other, more important memory pages.

Designing a scheme

You should start designing a partitioning scheme for your system. Telling you what partitioning scheme is best for you is impossible, but here are some pointers:

  • For each partition you want to create, ask around how much space the partition might need given the tools and services you want to install. Make sure you design each partition to be larger than the size given by others: growing a file system is not without danger.
  • Not all Linux file system locations are partitionable:
    1. The /etc, /lib, /bin and /sbin locations must stay on the root file system as they are required by the Operating System under any circumstances. This is because they are needed to be able to mount other partitions on the file system.
    2. Only directories can be separated from a file system. Each separation automatically includes all subdirectories.
    3. You can not separate two different locations and still store them on a single partition (logical, that is - with Volume Management physical partitions can hold several logical ones), so you can't put both /usr and /opt on a single location.
    Some locations also require special attention:
    1. The /dev location will already be separated from the main file system by the device manager, so you don't need to devote a specific partition for /dev.
    2. The /tmp and /var/tmp locations are used for temporary file storage. Although the content of these locations is slim in most situations, it can grow considerably. For instance, during the installation of software through Portage, /var/tmp/portage is used and can require up to a few gigabytes (!) of space.
  • Ask yourself if you really need that nifty file system feature: using Volume Management or (Software) RAID does require more work. Without much guidance, you might lose too much time stabbing at storage related problems.
  • Many backup solutions are file system independent, but some of them aren't. If the backup storage is limited but you are using a file system dependent solution, make sure that the total amount of data that it will back up doesn't exceed the dedicated backup storage size.

We will discuss the more advanced storage solutions in more detail in a different part of this book, but to make you aware of the possibilities we'll give a quick rundown of those features first. We won't integrate these solutions with the installation procedure though as that would complicate things too much.

RAID arrays

RAID stands for Redundant Array of Inexpensive Disks and is a well-known way of putting disks together. There are several RAID levels defined:

  • RAID-Linear places two (or more) disks next to each other and lets the user view them as if those disks were a single one. At first, all data is written to the first disk. Once that disk is filled, the second disk is used, etc.
  • RAID-0 is also called striping: unlike with RAID-Linear data is first partitioned after which each 'stripe' of data is written to a disk. Therefore data written to a RAID-0 array is stored across all disks.
  • RAID-1, also known as mirroring, places all data written to the array on all disks. In other words, each member of the RAID-1 array is an exact copy of the others.
  • RAID-5 requires at least three disks. We'll explain its inner workings for three disks, but more than three is perfectly possible as well. When data is sent to the array, it is split into parts. For every two parts of data, a simple checksum (parity) is calculated, so we have three segments. Each of those three segments is then stored on a different disk. If one of the segments is lost, it can be recalculated from the other two segments.

Other RAID levels exist but are less frequently used.

RAID arrays are interesting when you need high availability of your data. For instance, with RAID-1, if one of the disks crashes, the other one takes over. Something similar happens with RAID-5: if one of the disks crashes, the other disks can work together to generate the data that was stored on the malfunctioning disk.
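
How can a lost segment be regenerated from the other two? RAID-5 implementations commonly use a bitwise XOR as the checksum, and that is what this sketch assumes. The two numbers simply stand in for blocks of data; real arrays do the same on much larger blocks:

```shell
# Two data segments; small numbers stand in for blocks of data.
a=202   # stored on disk 1
b=57    # stored on disk 2

# The checksum (parity) segment is the bitwise XOR of the data segments.
parity=$(( a ^ b ))   # stored on disk 3

# Disk 1 crashes: rebuild its segment from the two surviving ones.
recovered_a=$(( parity ^ b ))

echo "parity=$parity recovered=$recovered_a"
```

Because XOR-ing twice with the same value returns the original, any one of the three segments can be rebuilt from the other two.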

If you have a true hardware RAID card, using RAID within Linux (or any other Operating System for that matter) does not require any input at all: the operating system does not handle anything RAID-related. You only see the result, the hardware RAID card handles all the rest.

Pseudo-hardware RAID cards offer various RAID-related services but still require some (or a lot) of operating system input. Such cards therefore require a good working driver and perhaps even a few tools. Although they aren't as transparent as true hardware RAID cards, they still beat the software RAID.

Software RAID allows users to benefit from most of the RAID functionality without requiring any specific hardware. This does require the operating system to handle all the RAID-related tasks, requiring some computing time.

Information about using Software RAID is discussed further in this book.

Logical Volume Management

Unlike RAID, which is most often used for redundancy, LVM allows the user to maximise the flexibility of his storage. Basically LVM (actually, LVM2) should be seen as a layer in between the physical storage (the disks) and the logical view (the file system).

With LVM, you create logical volumes (comparable to partitions) which hold the file systems. One or more logical volumes are stored in a volume group, which is nothing more than a collection of physical volumes. A physical volume is an entire disk, or a partition, controlled by LVM.

The LVM software offers services on top of this intermediate layer. For instance, you can 'spread' data across several partitions (like RAID-Linear or RAID-0), or have several logical partitions on a single physical one. But that's not it. LVM allows you to add (or remove) physical volumes from a volume group without affecting the logical volumes (unless of course the logical volumes require more space than the volume group has to offer), to move data from one disk to another without requiring any manual copy procedure, or to take a snapshot of an entire file system without really having to take a full copy of the system.

These (and more) features make LVM a powerful tool for many system administrators.

Information about LVM2 is discussed further in this book.

Preparing the devices

Before you can start creating the necessary partitions, you need to make sure that the Linux Operating System can work with your hardware. If the installation CD that you used to boot didn't automatically detect your hardware, you will need to load the appropriate support manually.

Although we can start by educating you how the Linux kernel addresses the disks (like /dev/hd* for IDE, /dev/sd* for SCSI and Serial ATA, ...) and what logic is behind it, we will leave this for another document (or perhaps a later chapter :) and immediately tell you how to discover where your disks are.

If your disks are SCSI or Serial ATA (although some SATA disks are treated as native IDE disks, most of them are using a SCSI-like driver), run dmesg and filter out any occurrence of 'disk'. IDE disk users should filter out 'ide' and 'hd':

Code Listing 1.1: Finding out what disk(s) you have

(For SCSI or SATA:)
# dmesg | grep -i disk
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0

(For IDE:)
# dmesg | grep -i ide | grep -i hd
TODO insert IDE output

In the above example we discover that our SCSI (or SATA) disk is at /dev/sda and our IDE disks are at TODO. If this does not reflect the setup you have, you will need to load the appropriate drivers. Otherwise continue with the next section on RAID Arrays.
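
The device name can be read straight from the kernel message. As a small illustration, this sketch takes the SCSI line from the listing above and turns it into a device path; the sed expression is tailored to exactly that message format and is an assumption for any other kernel message:

```shell
# The kernel message from the example above.
line="Attached scsi disk sda at scsi0, channel 0, id 0, lun 0"

# Keep only the word between "disk" and "at"; print nothing if the
# line does not have this shape.
disk=$(echo "$line" | sed -n 's/^Attached scsi disk \([a-z]*\) at .*/\1/p')

echo "/dev/$disk"
```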

Your Linux system can tell you what controllers you have and what chipset they use. This information is vital if you need to load additional drivers. The combination of the lspci tool and the grep filter proves to be quite efficient.

For instance, if you have an IDE controller but it wasn't loaded by default, try filtering for 'ide'. Similar actions should be performed for Serial ATA ('sata') or SCSI ('scsi'):

Code Listing 1.2: Finding out what IDE controllers are available on a system

# lspci | grep -iE '(ide|sata)'
0000:00:1f.1 IDE interface: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) 
  IDE Controller (rev 04)
0000:00:1f.2 Class 0106: Intel Corporation 82801FBM (ICH6M) SATA Controller 
  (rev 04)

Based on this information you can try searching for the appropriate support drivers. A quick grep on the content of the /lib/modules directory (which stores all the additional kernel modules) can help:

Code Listing 1.3: Searching for support drivers

# find /lib/modules | grep -iE '(82801|ich6)'
TODO

If you found a matching kernel module, load it in memory and rediscover where your disks are:

Code Listing 1.4: Loading the kernel module

# modprobe TODO

4.b. Partitioning

Architecture-specific

Partitioning is architecture-dependent as partitions are generally tagged for some function. The type of a partition is therefore a very important setting. Two important types are:

82 (Linux Swap)
The partition holds swap information
83 (Linux)
The partition holds a Linux file system

The partition structure is also architecture-dependent. Some architectures only allow a few partitions (sometimes also called slices) to be available, others have some weird solution to get over a very narrow limit causing confusion in the numbering scheme of the partitions. There are even architectures where certain partition numbers are reserved for a specific use.

You will find architecture-specific partitioning information in the first Appendix of this book, together with an example partition layout which you can use to get started with Gentoo Linux.

Create them, now!

Read the partitioning information for your architecture now and create the partitions you'll use to store Gentoo Linux on.

4.c. Filesystems

Overview

Partitions alone aren't sufficient to store data (unless the partition is used for a specific purpose, like raw access for databases). You need to apply some structure to the partition so that Linux knows where files and directories are stored, what permissions are set on the file, what security attributes are applied, etc.

This is the task of the file system. A file system is a standard way of storing and retrieving information from a partition. It dictates where the file's data is stored and how to get access to the file metadata (everything about a file except the content, like name, creation date, owner, ...).

Linux supports quite a lot of file systems, but not all of them are functional enough to store Linux files. For instance, the FAT-family (FAT12 for floppies, FAT16 for small partitions and FAT32 for larger ones) has no concept of permissions while the NTFS-family (all the dozen versions that Microsoft has released thus far) is too complex (partially due to its closed-source nature) and not fully supported.

We will describe the best-known file systems that you can use to hold your Linux Operating System together with some information on the tools associated with the file system.

Extended 3

Extended 3 (ext3) is the journaled version of Extended 2 (ext2), the older (but proven) file system that was developed specifically for Linux (before the extended file systems, the Minix file system was used - but that was a long, long time ago). Extended 3 is currently the most used file system as well.

As a journaled file system, ext3 can make sure that the entire file system is consistent at any time. In other words, if your system would ever crash (for instance due to a power interruption), the file system would never contain garbled data - if you were busy writing data to the disk, either the old data is there, or the new data, but not a partial write.

You can choose between three modes: no journaling (which basically means that the file system should be seen as an Extended 2), metadata journaling (where only the metadata is kept consistent) and full journaling (where both data and metadata are journaled). With metadata journaling you can further choose between writeback (no ordering is enforced between data and metadata writes) and ordered (data is written before its metadata is committed). The ordered metadata journaling is the default.

Extended 3 supports access control lists and is therefore a candidate file system for more security enhanced Linux kernels that require ACLs to be available for the file system.

To write an ext3 file system on a device, use the mke2fs tool with the -j option (for journaling):

Code Listing 3.1: Writing an ext3 file system on /dev/hda3

# mke2fs -j /dev/hda3

The mke2fs command has a few interesting options as well:

  • Writing a file system to the disk is quite fast. If you want to check the device for bad blocks during the file system write, add a -c option. If you specify this option twice, a full read-write test is performed instead of a read-only test.
  • To speed up lookups in large directories, you can enable directory indexing by adding -O dir_index.
  • Large file systems can gain some space by adding -O sparse_super. This reduces the number of backup copies of the superblock that are kept on the file system.

The swap file system

Although you can't 'use' the swap space directly, it does use a specific file system to store the memory pages. To create a swap file system on the swap partition, use mkswap:

Code Listing 3.2: Creating a swap file system on /dev/hda2

# mkswap /dev/hda2

Unlike the other file system creation tools, mkswap hardly takes additional options for tuning purposes.

Create the file systems

Create the file systems on your partitions now and don't forget to create the swap file system as well.

4.d. Getting in the minimal environment

Mounting the partitions

The next step is to mount the file systems in the Linux file system hierarchy so that you can use them. As we have said in the previous part, mounting attaches the file system to the current hierarchy at a specified location.

On the Gentoo installation CD, a directory /mnt/gentoo is available to mount your future root file system. Let us suppose that the file system is at /dev/hda3, then the mount command would be:

Code Listing 4.1: Mounting the root file system at /mnt/gentoo

# mount /dev/hda3 /mnt/gentoo

You'll also need to mount the other file systems at the correct place. Because your root file system doesn't contain any directories yet, you'll need to create them. For instance, if you have a separate /boot (at /dev/hda1) and /usr (at /dev/hda4) file system:

Code Listing 4.2: Creating mount points prior to mounting the file systems

# mkdir /mnt/gentoo/boot /mnt/gentoo/usr
# mount /dev/hda4 /mnt/gentoo/usr
# mount /dev/hda1 /mnt/gentoo/boot
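
A mount point must exist before you can mount on it, so a file system always has to be mounted after the one that contains its mount point (the root file system first). With more elaborate layouts you can let sort work out a valid order. This is only a sketch using the example devices from this section; it prints the commands instead of running them:

```shell
# Example layout from this section, deliberately out of order:
# one "device mount-point" pair per line.
layout="/dev/hda4 /mnt/gentoo/usr
/dev/hda3 /mnt/gentoo
/dev/hda1 /mnt/gentoo/boot"

# Sorting on the mount point puts every directory before the
# directories below it.
ordered=$(echo "$layout" | LC_ALL=C sort -k 2)

# Print the commands rather than executing them.
echo "$ordered" | while read -r device mount_point; do
    echo "mkdir -p $mount_point && mount $device $mount_point"
done
```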

You will also need to activate the swap partition. This is accomplished using the swapon command:

Code Listing 4.3: Activating the swap space

(Example for a swap file system at /dev/hda2)
# swapon /dev/hda2

Preparing the stage tarball

A Gentoo stage tarball contains a minimal Gentoo environment. If you are booted from a Gentoo universal installation CD you might find the stage of your choice on the CD (probably at /mnt/cdrom/stages). If not, you can download one from one of our mirrors. They are stored in the appropriate release directory under the name stages/.

A tarball is an archive, usually compressed using Lempel-Ziv coding (LZ77 - gzip) or Burrows-Wheeler compression with Huffman coding (bzip2). Uncompressed, you will have a single file (a tar [2] file) that contains all the files in the archive, nicely appended one after another.

To download such a stage tarball, first go to /mnt/gentoo. This is required so that, once you start downloading the file, it is stored on the disk and not in memory (the Gentoo installation CD creates a virtual 'disk' in memory so that you can use the CD without requiring any pre-installed Linux system).

Code Listing 4.4: Downloading a stage tarball

# cd /mnt/gentoo
# lynx http://www.gentoo.org/main/en/mirrors.xml

Next, extract the tarball to the /mnt/gentoo location. Use the tar tool with xjpf as options and the tarball as argument:

  • x: extract the files from the archive
  • j: use bunzip2 to decompress the archive (j due to shortage of available options :)
  • p: preserve the permissions that were stored in the tarball
  • f: use the next file as the archive

Code Listing 4.5: Extracting the stage tarball

# tar xjpf <file>

If your stage tarball is stored on the CD, just use the path to the file for <file>.

Note: [2] The name tar comes from Tape ARchive. The tar tool was (and still is) commonly used for backing up files to tapes, which only have linear access (unlike disks, where you can quickly jump from one location to another). Because of this limitation, all files are stored one after another, each preceded by a header that describes it. The tar tool is still a very popular tool for creating archives.

Extracting a Portage snapshot

We need to extract another tarball, namely a Portage snapshot. Portage is the software management system Gentoo uses. The package information itself is stored in what we call the Portage tree. A Portage snapshot is a Portage tree taken at a certain point in time.

On a universal installation CD, you might find such a snapshot at /mnt/gentoo/snapshot, but you can always download a snapshot from the Internet as well. Go to one of our mirrors and locate the most recent portage snapshot in the snapshots/ directory.

Code Listing 4.6: Downloading a Portage tree snapshot

# lynx http://www.gentoo.org/main/en/mirrors.xml

We don't need any specific permission information from the snapshot, so the tar command only requires xjf as options. However, the snapshot must be extracted inside /mnt/gentoo/usr. We could do the same as we did with the stage tarball and first go to /mnt/gentoo/usr before we run the extraction command, but you can also use -C <location> (with a capital C) to inform tar that the /mnt/gentoo/usr location is the destination:

Code Listing 4.7: Extracting a Portage tree snapshot

# tar xjf <snapshot> -C /mnt/gentoo/usr
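
If you want to see what -C does before using it on a real snapshot, you can try it on a throwaway archive. The sketch below works entirely inside a temporary directory; the file names are made up, and the j option is left out because this little archive is not compressed:

```shell
# Work in throwaway directories so nothing on the real system is touched.
work=$(mktemp -d)
mkdir "$work/src" "$work/dest"
echo "hello" > "$work/src/example.txt"

# c creates an archive; -C makes tar change into src before adding the file.
tar cf "$work/demo.tar" -C "$work/src" example.txt

# x extracts; -C names the directory where the files end up.
tar xf "$work/demo.tar" -C "$work/dest"

cat "$work/dest/example.txt"
```

The file shows up in dest no matter what your current directory is. Remove the temporary directory with rm -rf "$work" when you are done.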

Preparing the minimal Gentoo environment

As you might have guessed already, we are trying to have /mnt/gentoo contain a fully functional Linux environment. To finish this off, we need to mount a specific pseudo file system called proc at /mnt/gentoo/proc. This is not a file system stored on a disk, but rather an interface to the Linux kernel showing kernel information as regular files. This allows you to retrieve kernel (and system) information by just reading files instead of requiring specific tools.

Code Listing 4.8: Mounting the proc file system

# mount -t proc none /mnt/gentoo/proc

To be able to use the network you have defined (if applicable), you need to copy over the /etc/resolv.conf file to the new Linux environment:

Code Listing 4.9: Copying over the name resolving information file

# cp /etc/resolv.conf /mnt/gentoo/etc

Changing the root from CD to the new environment

The final step now is to change the root of your file system from the one provided by the CD to the one you just set up, namely /mnt/gentoo. Using the chroot tool, your terminal session will not see anything outside /mnt/gentoo until you exit the chroot. Because you need a shell to navigate, we run /bin/bash (the Bourne Again SHell) right after changing the root:

Code Listing 4.10: Changing the root to /mnt/gentoo

# chroot /mnt/gentoo /bin/bash

5. Building the system

5.a. Configuring the Gentoo environment

The build configuration file

As Gentoo is primarily a build-the-software distribution, it requires a few more configuration directives than most other distributions. The first and foremost configuration file is /etc/make.conf (remember, we are now inside the chrooted environment - outside it would be /mnt/gentoo/etc/make.conf). Open up the file with nano, an easy to use editor. We add the -w option to turn off word wrapping since this might harm the configuration itself.

Code Listing 1.1: Editing /etc/make.conf

# nano -w /etc/make.conf

Inside this configuration file you can define various configuration directives that affect Portage's behavior or the software building process. We will only discuss a few here - others exist as well, but are not that important at the beginning of the installation procedure.

Each directive is a variable with some specific content. Variables are often used throughout Linux (and this is even more so within Gentoo). To set a certain variable, you define its name followed by an equal sign and then the content of the variable.

Code Listing 1.2: Example variable definition

MYVARIABLE="value of the variable"

5.b. Compiler directives

The compiler flags

The first directives that we'll discuss are the compiler flags. A compiler is a tool that builds executable code from source code and one of the benefits of using Gentoo is that you can define how the compiler should behave. More precisely, you can tell the compiler to use certain optimizations on your system. Although a compiler can take a lot more options than just optimization options, most Gentoo users only use the optimization options.

The GNU Compiler Collection, the compiler toolchain used by most architectures, supports more than a hundred optimization flags. Some of them are interesting, others hardly used. Some of them are harmless, others quite intrusive. We must warn you that using anything beyond the Gentoo-recommended optimization flags is not supported.

CFLAGS and CXXFLAGS

The CFLAGS and CXXFLAGS directives are immediately passed to the C and C++ compilers respectively as command line arguments. They are often used to inform the compiler about the destination architecture and optimization settings.

Quite often, CXXFLAGS is set to the same value as the CFLAGS variable:

Code Listing 2.1: Setting CXXFLAGS

CXXFLAGS="${CFLAGS}"

The GNU Compiler Collection has lots of possible directives. The first directive that most users will want to set is the destination CPU type. This is configurable with the -march=<CPU-TYPE> setting. For most people, finding out what CPU-TYPE to pick is quite a challenge and indeed, it is not possible to document what you should use. What we can tell you is that it is safe to pick an older type, while picking a type more recent than your actual CPU can produce software that does not run on your system.

To obtain a list of valid -march= settings, please consult the GCC Manual or the GCC Info pages:

Code Listing 2.2: Consulting the gcc info pages

# info gcc
Select GCC Command Options,
       Submodel Options,
and pick your architecture.

A second frequently used compiler setting is the optimization level. By adding -O followed by a number you can ask the compiler to optimize the code to various degrees. Gentoo recommends -O2 (that is "O" for "Optimization", not "0" like in "007"). The highest possible level is -O3 - if you want optimizations beyond that, you'll need to add the individual flags to the variable yourself.

If you don't want to optimize the code for speed but for size, use -Os.

The third option we add is -pipe, which tells GCC that it can use process pipes instead of temporary files for communication between the various stages of compilation. This can speed up compilation on systems with sufficient memory.

CHOST

The CHOST variable is an identifier for the target host for which the compiler should build software. It is vital that this variable identifies your system, but it is even more vital that you do not edit this variable if you are not performing a bootstrap installation. If you alter this variable, any toolchain rebuild will cause the toolchain to be in an intermediate state, possibly producing malfunctioning libraries.

So your settings would be ...

At the end of this handbook you will find architecture-specific chapters. For each architecture, precise (and valid) examples for various systems are listed so that you can pick one of those if the above explanation isn't sufficient.
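
As a rough illustration of how the three compiler settings discussed above combine, a make.conf fragment might look like the sketch below. The i686 value is only an assumed example for a 32-bit x86 system; check the GCC documentation or the architecture-specific chapters for the value that matches your CPU. CHOST is deliberately left untouched.

```shell
# Sketch of the compiler directives in /etc/make.conf.
# "-march=i686" is an assumed example value, not a recommendation for your CPU.
CFLAGS="-march=i686 -O2 -pipe"
# Reuse the C settings for the C++ compiler.
CXXFLAGS="${CFLAGS}"
```

Because CXXFLAGS refers to ${CFLAGS}, a later change to CFLAGS automatically applies to C++ code as well.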

5.c. Gentoo directives

USE

The USE variable is probably the most important setting of all inside /etc/make.conf. With this variable you define what purposes your system serves. Each flag set in the USE variable enables or disables specific support in one or more packages.

The idea is pretty simple: if your system has a DVD reader, you'll probably set dvd as one of the USE flags to add support for DVD readers. When you want to play DVDs as well, dvdread should be added. Writing DVDs will require the dvdr setting. Similarly, if your system will host an IBM DB2 server (or any application you use connects to such a server) you'll probably want db2 as a USE flag.

A list of all possible USE flags can be found in /usr/portage/profiles/use.desc. This is a plain text file which is also reproduced online at the Gentoo web site. But starting to dig through this list is cumbersome as many USE flags will have no clear description (which is not because we want to remain vague on the subject, but because the USE flag is either used by different applications for slightly different purposes, or because the topic itself is too technical).

To resolve this issue, Gentoo provides you with a default USE setting. To find out what the default setting is, run emerge --info and pick out the line that starts with USE:

Code Listing 3.1: Checking the default USE setting

# emerge --info | grep -E ^USE

Any setting you place in /etc/make.conf is added to the default USE setting - you don't substitute the default setting but embrace and extend it. Therefore you need to place a hyphen in front of the USE flags you do not want. For instance, the arts USE flag (which enables support for the aRts sound daemon) might be set by default. To deselect it, use -arts. As an example we'll select support for the Enlightened Sound Daemon and disable support for aRts:

Code Listing 3.2: Example USE flag

USE="-arts esd"

There are also USE flags that are specific to a single package. Such USE flags are called local USE flags. Although you can set those USE flags in /etc/make.conf, it is wiser to enable or disable them on a per-package basis. We'll refrain from explaining how to do this here: you don't need to set USE flags during the installation, because Portage is intelligent enough to handle USE flag changes after an installation.

Before we go on with the next setting there are four remarks we want to make regarding USE flags:

  1. Changing the USE flags at this stage of the installation might result in Portage downloading tools you don't want to install yet. A frequently occurring issue is a USE flag combination that triggers the installation of kde-base/kdebase, which is quite a large build. Consider altering the USE flags at a later point instead and asking Portage to rebuild just those tools that are affected by the USE flag change.
  2. USE flags allow Gentoo to make decisions regarding optional support and features. If any feature or support is not optional but mandatory or inherent to the package, the respective USE flags are ignored. A good example here is the qt USE flag. If a package can (but doesn't have to) support the Qt Graphical Widget library, it uses the qt USE flag to decide whether or not to build in support for Qt. If the package requires Qt to function, it'll install Qt regardless of USE flag.
  3. Some packages override the default USE flag settings (not the one you specify in /etc/make.conf though) if you install them. For instance, the tetex USE flag is not set by default, but when you install the TeX distribution, Gentoo will automatically enable the tetex USE flag so that other programs can now be built with TeX support if they can handle it.
  4. If you want to hard-set a custom USE flag listing regardless of the default USE flags, you can start with deactivating all USE flags and then list those you want to enable. This can be accomplished using USE="-* flag1 flag2 ...".
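
The stacking behaviour described above (defaults first, your own flags on top, a leading hyphen to remove a flag, -* to start from nothing) can be mimicked with a small shell function. This is purely an illustration: merge_use is a hypothetical helper and not part of Portage, and the default flag list in the usage lines is made up.

```shell
# Illustration only: mimics how USE settings stack. Not Portage's own code.
# $1 = default flags, $2 = flags from make.conf (both space-separated).
merge_use() {
    set -f                       # keep "-*" literal instead of globbing it
    result=""
    for flag in $1 $2; do
        case "$flag" in
            "-*")                # forget everything collected so far
                result="" ;;
            -*)                  # remove one flag
                name=${flag#-}
                kept=""
                for f in $result; do
                    [ "$f" = "$name" ] || kept="$kept $f"
                done
                result=$kept ;;
            *)                   # add a flag unless it is already present
                present=no
                for f in $result; do
                    [ "$f" = "$flag" ] && present=yes
                done
                [ "$present" = yes ] || result="$result $flag" ;;
        esac
    done
    echo $result
    set +f
}

merge_use "X arts gtk" "-arts esd"   # prints: X gtk esd
merge_use "X arts gtk" "-* esd"      # prints: esd
```

The first call corresponds to USE="-arts esd" on top of the made-up defaults; the second shows the effect of starting with -*.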

The Gentoo Portage repository

Any Gentoo package information is stored inside an ebuild. That is a small text file which contains some metadata about the package (like the description, home page, source code location, ...) and instructions for Gentoo on how to successfully build and deploy the package on your system.

The complete set of all supported ebuilds is stored inside the Gentoo Portage Tree, also known as the Gentoo Portage Repository. You will find a snapshot of this repository at /usr/portage where all the ebuilds are categorised by function and name. Gentoo Portage, Gentoo's software management tool, makes decisions such as "What packages should be updated" or "What software is available for installation" based on the content of the repository snapshot on your disk.

It is therefore quite important that you regularly update the snapshot on your disk with the latest Gentoo Portage Repository content released by the Gentoo Developers. By default, Gentoo Portage chooses a random mirror [1] but it is more efficient to use a mirror that is either close to you or fastest for your environment. The SYNC variable holds the location of the Gentoo Portage Repository from which you want to take your snapshot.

Because you can't know what mirrors are available, Gentoo provides an easy to remember naming scheme for the mirrors. The default setting picks a random mirror. The country-based setting still picks a random mirror, but only from the pool of mirrors inside that country. The single mirror syntax always picks the mirror you define.

Code Listing 3.3: Mirror syntax

(Default)       rsync://rsync.gentoo.org/gentoo-portage
(Country-based) rsync://rsync.${COUNTRYCODE}.gentoo.org/gentoo-portage
(Single mirror) rsync://rsync#.${COUNTRYCODE}.gentoo.org/gentoo-portage

An example setting for /etc/make.conf could be:

Code Listing 3.4: Setting the SYNC variable

SYNC="rsync://rsync3.us.gentoo.org/gentoo-portage"
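
To make the naming scheme concrete, the sketch below builds the three forms from a country code and an optional mirror number. sync_uri is a hypothetical helper written for this illustration only; whether a given country code or number actually resolves to a mirror is not checked.

```shell
# Hypothetical helper: print the rsync URI for the documented naming scheme.
# $1 = country code (empty for the default rotation)
# $2 = mirror number (empty for a random mirror from the country pool)
sync_uri() {
    country=$1
    number=$2
    if [ -z "$country" ]; then
        echo "rsync://rsync.gentoo.org/gentoo-portage"
    else
        echo "rsync://rsync${number}.${country}.gentoo.org/gentoo-portage"
    fi
}

sync_uri            # default
sync_uri us         # country-based
sync_uri us 3       # single mirror
```

The last call prints the same value as the SYNC example above.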

Note: 1 A mirror is a location on a server on the Internet where the exact content of another server location is replicated. The idea behind mirrors is to decrease the stress put on a single server by allowing clients to retrieve the data from various sources.

Gentoo source code location

Most Gentoo ebuilds inform Gentoo about the original location of the source code. This could be a mirror set (as with the many SourceForge-hosted projects) or a fixed URL. Some ebuilds can't point to this location, for whatever reason. If this is the case, the source code is placed on the Gentoo mirrors in a specific location called the distfiles/ directory.

The GENTOO_MIRRORS variable declares what mirrors Gentoo Portage should check to find the required source code. Each mirror declared in this variable should point to the parent location of the mirror (i.e. not the distfiles/ directory but one level higher). A list of possible mirrors can be found online.
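
A setting in /etc/make.conf could look like the sketch below. Several mirrors can be listed, separated by spaces. mirror.example.org is a hypothetical placeholder for a mirror near you; distfiles.gentoo.org is Gentoo's own distfiles host.

```shell
# Sketch of a GENTOO_MIRRORS setting in /etc/make.conf.
# Each entry points to the parent of distfiles/, not to distfiles/ itself.
# mirror.example.org is a placeholder, not a real mirror.
GENTOO_MIRRORS="http://mirror.example.org/gentoo http://distfiles.gentoo.org"
```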

5.d. Bootstrapping

What is bootstrapping?

A bootstrap procedure prepares a system with the C library and compiler built specifically for a certain environment. This procedure is very sensitive to problems, which is why you shouldn't touch the bootstrap script (called bootstrap.sh inside /usr/portage/scripts).

And because you shouldn't really touch it, you also don't really need to perform a bootstrap yourself: if you are using a stage-2 or stage-3 tarball, the bootstrapping has already been done. If you do want to rebuild your system (for instance, because you altered your compiler directives), you should follow the method explained next.

Bootstrapping your system from stage 3

If you are performing a bootstrap installation where you don't alter the bootstrap.sh script, the procedure should perform the following steps:

  1. Use your current toolchain to rebuild itself using the new settings
  2. Use the new toolchain to rebuild itself again. Unlike the previous time, your toolchain is now built, not only with the new settings, but also by a toolchain built with those settings.
  3. Use the new toolchain to build the rest of the packages using the new settings.
  4. Rebuild the packages again to make sure that all packages are built against rebuilt libraries. If you don't have any circular dependencies, this won't be necessary, but as you will probably not know if this is the case [1] it is better to perform this step anyway.

These steps can all be performed using the following commands:

Code Listing 4.1: Bootstrapping the system

# emerge -e system
# emerge -e world
# emerge -e world

To be honest, the last emerge -e world will rebuild some tools that don't need to be rebuilt: the world collection (all packages that should be installed on your system) contains the system collection (all packages that are vital for your system) as well, so that the system collection is built three times where it only needs to be built twice.

Since in this stage of the installation you don't have any differences between the system collection and the world one, performing emerge -e system twice instead of the system-world-world combination is sufficient.

Note: 1 Although Portage can detect circular dependencies, it only detects those on metadata level. That means that it depends on the content of the ebuild, written by the package maintainer, and not on the real dependencies that the package has (the maintainer can always miss one or two that are almost always met) so Portage doesn't know that it needs to rebuild the dependency before the package.

Bootstrapping your system from stage 1

When you need to bootstrap your system differently from the procedure set forth in the bootstrap.sh script, you can use any tarball you like, including a stage1. You can base your bootstrap procedure on the one documented in the bootstrap.sh script but you don't have to. However, make sure that the toolchain you build is built with Portage (using emerge), otherwise the Portage database will be inconsistent.

Code Listing 4.2: Bootstrapping the toolchain

# /usr/portage/scripts/bootstrap.sh

5.e. Progressing from a bootstrapped system

Installing core system packages

A bootstrapped system doesn't offer much beyond some libraries and a compiler. You will need additional core system packages before you can actually work on your system. Gentoo Portage obtains a list of core system packages from your profile and installs them on your system after building them with the bootstrapped toolchain.

The list of core system packages is available as the system keyword for emerge. emerge is Gentoo's command-line interface to Gentoo Portage, Gentoo's software management system.

If you have bootstrapped your system previously, or you are installing Gentoo through a stage-2 tarball, or you are installing Gentoo through a stage-3 tarball but want to rebuild your packages with the configuration directives you've set in your make.conf previously, run the following command to build all core system packages for your profile:

Code Listing 5.1: Building core system packages

# emerge -e system

Remember, most users will not need to perform this step: the stage3 tarball provided by Gentoo already contains a prepared system.

6. Building the Linux kernel

6.a. Kernel configuration procedure

Introduction

The Linux kernel is the core of the Linux Operating System. It takes care of resource management (processes, memory, ...), hardware support, networking, file systems, ... and is therefore one of the most vital parts of the system.

Gentoo offers various kernel sources. Each source is based on the vanilla kernel source (the one developed by the main kernel developers) and adds in additional features, hardware support, experimental features, etc. You can pick whatever kernel source you like for your system (as long as your profile allows it of course).

When the source code is installed on your system, you still need to configure and build the kernel before you can use it. The kernel configuration is the trickiest part: a mistake can lead to an unusable kernel, while building in too much leaves code in your kernel that you'll never use, that takes up space and that might even cause trouble later if it contains a kernel bug.

Luckily, Gentoo offers a tool called genkernel which configures, builds and installs a kernel automatically. This might be of interest to you if you have no idea about configuring kernels, you want a basic kernel configuration for all your systems, or you require a kernel that can deal with a majority of hardware.

After configuring the kernel, it is built into an image which your computer loads in memory when you boot the system.

Picking a kernel source

Gentoo maintains a list of supported kernel sources which contains a small introduction about the kernel tree. You can make your choice from this guide, although you can very well pick a fairly generic one right now and use a different kernel later on - the kernel is completely interchangeable on a Linux environment so you don't have to decide on the kernel right here, right now.

The most generic kernel source is vanilla-sources. This kernel tree is the one released by the Linux Kernel Developers, unmodified. Gentoo itself offers a patched version called gentoo-sources. Using these sources has the advantage that Gentoo can release a new kernel tree whenever it deems it necessary.

If you have made your choice, install the kernel sources using emerge. Just add the kernel source name as an argument to emerge and emerge will download and extract the kernel sources on your system:

Code Listing 1.1: Installing the kernel sources

# emerge <kernelsources>

6.b. Building the kernel

Automated build process

If you don't want all the hassle surrounding the kernel installation, you can install genkernel and then have genkernel configure, build and install the Linux kernel for you. This process is quite simple:

Code Listing 2.1: Using genkernel

# emerge genkernel
# genkernel all

However, genkernel is a lot more powerful than this. With this tool, you can maintain your personal kernel configuration and let the tool rebuild newer kernel versions with your settings. You can enable specific features (like bootsplash, lvm2, evms2, raid, ...) and tweak the compiler settings used during the kernel build process (which differs from the settings placed in make.conf!).

For more information on genkernel, please read the Genkernel Guide.

Manual build process

The manual build process consists of three steps:

  • configuring the kernel,
  • building the kernel, and
  • installing the kernel

To configure the kernel, go to /usr/src/linux and run make menuconfig. You will get a dialog-based interface where you can configure your kernel.

Configuring a kernel isn't hard if you know what hardware you have and what features you want - but then again, if all this is new to you, finding out what features you want or need can be time consuming.

The kernel configuration dialog has good built-in help, which even includes search functionality (very useful if you want to search for the location of that network card you have but can't seem to place in the configuration structure).

It is not our intention to describe the kernel configuration process for you - there are several guides about this topic on the Internet and if you really aren't able to successfully configure a kernel, use genkernel for the time being.

When you're finished with the configuration part, build the kernel by running make (hold on, we'll do this together with the next step). make is a tool that reads in a script called Makefile in the directory you are in. Makefiles are very powerful when used for building software since they are able to (re)build only those parts that need to be (re)built instead of building the entire software over and over again.

To finish the kernel build process, you need to copy over the resulting kernel image to the boot partition and install the kernel modules you have selected. The location of the kernel image depends on the architecture you're using. The following table gives an overview of possible kernel images with the commands needed to build and install the kernel:

Architecture    Image Location                      Build Command
alpha           arch/alpha/boot/vmlinux.gz          make && make modules_install && make boot
amd64           arch/x86_64/boot/bzImage            make && make modules_install
hppa            vmlinux                             make && make modules_install
mips            vmlinux                             make && make modules_install
ppc Apple/IBM   vmlinux                             make && make modules_install
ppc Pegasos     arch/ppc/boot/images/zImage.chrp    make && make modules_install
ppc64           vmlinux                             make && make modules_install
sparc32         arch/sparc/boot/image               make && make modules_install
sparc64         arch/sparc64/boot/image             make && make image modules_install
x86             arch/i386/boot/bzImage              make && make modules_install

The build command is divided in two parts - the make instruction we've discussed before and a specific instruction to finish off with some additional steps. The parts are separated by the && string. This is a shell operator which tells the system to continue with the next command only if the previous one didn't fail. A similar operator is ||, which tells the system to execute the next command only if the previous one did fail.
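
The behaviour of both operators can be tried out safely with the standard true and false commands, which do nothing except succeed and fail respectively. The sketch below is only a demonstration and does not touch the kernel sources.

```shell
# 'true' always succeeds, 'false' always fails.
true  && echo "&&: previous command succeeded, so this runs"
false && echo "&&: never printed"
false || echo "||: previous command failed, so this runs"
true  || echo "||: never printed"
```

In the build commands of the table, this means that make modules_install is only attempted when the preceding make finished without errors.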

Now, execute the build command to create the kernel image. When the process has finished, copy the image file to /boot. It is wise to give the kernel image file a good name in /boot so that you can distinguish one kernel from another. A commonly used naming scheme is kernel-<version>. The next example copies an x86 kernel image to /boot/kernel-2.6.12-gentoo-r6:

Code Listing 2.2: Copying over the kernel image to /boot

# cp arch/i386/boot/bzImage /boot/kernel-2.6.12-gentoo-r6

7. Configuring the boot process

7.a. Loading the kernel in memory

Introduction to bootloaders

When your system is powered on, it will first perform some sanity checks against its own components. When all tests succeed, the system loads the kernel image which you have built previously into memory. This action is performed by the boot loader.

Since loading files in memory is very architecture specific, you should consult the architecture specific information for your architecture now to find out how to install and configure a boot loader.

7.b. Configuring the kernel

Kernel parameters

Inside the configuration file of your boot loader you can enter specific instructions for the Linux kernel. These parameters allow you to tweak and change the kernel behaviour at boot-time.

Important parameters

Root File System Location

The first parameter we'll discuss is the root parameter. This tells the Linux kernel where the root file system of the Linux system is located. You really should provide this parameter as the Linux kernel will otherwise not know where the Linux installation is.

Code Listing 2.1: Example root parameter

(The root file system in this example is at /dev/sda3)
root=/dev/sda3

However, you can only point root directly at the root file system if you are certain that the kernel image itself (not separate kernel modules, but really the kernel image) has built-in support for:

  • the device controller that governs your disk (for instance the SATA controller if you use SATA disks), and
  • the file system that the partition uses (for instance ext3 support)

If this isn't the case, then you have probably made an initial RAM disk (initrd) which allows the Linux kernel to load the appropriate kernel modules in memory before it continues with the Gentoo boot-up. Users of the genkernel tool have indeed made such an initrd file, perhaps without their knowledge.

To use such an initrd file, you need to specify /dev/ram0 as the root file system and the real root file system with the real_root= parameter [1]. It is also advisable to inform the kernel about the amount of memory you want to reserve for the RAM disk.

You also need to tell the boot loader where the initrd file is. This is boot loader specific so we don't mention that here - consult the information for your boot loader for more information.

For instance:

Code Listing 2.2: Example kernel parameters for initrd users

root=/dev/ram0 ramdisk=8192 real_root=/dev/sda3

The Initial Program

When the kernel has finished its own procedures and mounted the root file system, it hands control of the system over to the init process. This process then takes care of the rest of the boot sequence. By default, the kernel looks for this tool at /sbin/init. You can however define another initial program if you like using the init= parameter.

For instance, to get a Unix shell immediately, use init=/bin/sh. This is often used to allow you to remount your partitions and make fixes that prevent your system from booting regularly (or when you forgot your root password).

Single User Mode

To tell the system to boot up in single user mode (which is the single run level we've talked about in the previous chapter), simply add an S to the kernel parameters.

Note: 1 The real_root parameter is not really a kernel parameter but is intended for a script inside the initrd file. However, the parameter is used just like kernel parameters which is why we list it here.

Hardware related parameters

ACPI Support

Not all hardware devices conform to the ACPI specification, even though they think they do. This sometimes results in unstable behaviour of the device, or of other devices influenced by the device.

You can specifically disable ACPI support using the acpi=off parameter.

Disabling IDE Controllers

When one of your IDE disks is broken, your system might not be able to boot even though the system itself is stored on a disk controlled by a different IDE controller. If this is the case, you can explicitly disable a controller using ide0=noprobe.

Disabling Multi-Processing

If you have an SMP system (Symmetric MultiProcessing), you can tell the Linux kernel to only use one CPU by setting nosmp.

Disabling USB Support

To disable USB support, use nousb.

More parameters

More extensive information about available kernel parameters can be found at /usr/src/linux/Documentation/kernel-parameters.txt.

7.c. The boot sequence

Init

When the Linux kernel has almost finished with its boot process (where it initializes the memory structures, loads drivers, etc.) it mounts the root file system (given by the root parameter) and then starts the init process (which is default at /sbin/init but can be configured).

The init process is responsible for the rest of the system boot sequence. It looks for the /etc/inittab file which contains the instructions how to further boot the system. At first, it fires up the command that is assigned to the boot and bootwait entries which are, in Gentoo's case:

Code Listing 3.1: Bootwait entry

rc::bootwait:/sbin/rc boot

Where init is rather distribution-independent (and quite simple in its use too), /sbin/rc is quite distribution-specific, especially the rc that Gentoo offers. Its task is to make sure that the scripts in a run level are started properly, or to take appropriate action if they aren't.

Once the boot runlevel has succeeded, the init process goes on by executing the command for the specified runlevel. By default, the runlevel entered at the initdefault part of /etc/inittab is started, but you can ask init to start a different run level by specifying its corresponding number as a boot parameter (entirely similar to how you add kernel parameters).

Code Listing 3.2: Default run level and corresponding command

id:3:initdefault:
(...)
l3:3:wait:/sbin/rc default
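
Each inittab entry consists of four colon-separated fields (id:runlevels:action:process), so the default run level can be read from the file with a one-line awk filter. default_runlevel below is a hypothetical helper name used only for this sketch; it reads /etc/inittab unless another file is given.

```shell
# Hypothetical helper: print the run level of the initdefault entry.
# Entry format: id:runlevels:action:process
default_runlevel() {
    awk -F: '$1 !~ /^#/ && $3 == "initdefault" { print $2; exit }' "${1:-/etc/inittab}"
}
```

Running default_runlevel on a file that contains the entries from the listing above prints 3.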

When this run level has also finished starting its required scripts, the init process starts the terminal processes at the various ttys (the Alt+F# locations where you get a logon prompt):

Code Listing 3.3: Example terminal processes for Alt-F1 till F6

c1:12345:respawn:/sbin/agetty 38400 tty1 linux
c2:2345:respawn:/sbin/agetty 38400 tty2 linux
c3:2345:respawn:/sbin/agetty 38400 tty3 linux
c4:2345:respawn:/sbin/agetty 38400 tty4 linux
c5:2345:respawn:/sbin/agetty 38400 tty5 linux
c6:2345:respawn:/sbin/agetty 38400 tty6 linux

Managing runlevels

You can manage the runlevels using the rc-update tool. Its syntax is quite simple:

Code Listing 3.4: rc-update syntax

# rc-update <add | del> <initscript> <runlevel>

All the init scripts that you can use are located inside /etc/init.d. You will most likely use at least the runlevels boot and default.

  • The boot runlevel makes sure that the most important init scripts, which are required for every successful system boot, are started properly. An init script that is added to the boot runlevel may not require any service offered by the init scripts in the default runlevel (as that runlevel is started later). It may depend on other scripts in the boot runlevel though: Gentoo's rc is smart enough to tackle dependencies.
  • The default runlevel contains all init scripts which should be started during normal system operation. This is the runlevel where you will probably add most of the init scripts.

8. Configuring the system

8.a. File system information

The fstab file

fstab stands for file system table; when you take a look at a fully configured /etc/fstab file you can easily see why:

Code Listing 1.1: Example fstab file

/dev/sda8               /             ext3    defaults,noatime          0 0
/dev/sda5               none          swap    sw                        0 0
/dev/sda6               /boot         ext2    noauto,noatime            0 0
/dev/sda7               /home         ext3    defaults,noatime,noexec   0 0
/dev/cdroms/cdrom0      /media/cdrom  auto    defaults,user,noauto      0 0

none                    /proc         proc    defaults                  0 0
none                    /dev/shm      tmpfs   defaults                  0 0

Each line declares what storage location (first field) is mounted at a certain location (second field) using a file system (third field) and mounted with one or more options (fourth field). The last two numbers (the first is read by the dump backup tool, the second sets the order of file system checks at boot) are not that actively used anymore so you can safely set them to 0 0.
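
To see how the fields line up, the sketch below splits each fstab line on whitespace with the shell's read built-in and prints the first four fields in words. describe_fstab is a hypothetical helper for this illustration; the two input lines are taken from the example listing.

```shell
# Hypothetical helper: describe the fields of fstab-formatted lines on stdin.
describe_fstab() {
    while read -r device mountpoint fstype options dump pass; do
        case "$device" in
            ''|'#'*) continue ;;    # skip blank lines and comments
        esac
        echo "$device is mounted at $mountpoint as $fstype with options $options"
    done
}

describe_fstab <<'EOF'
/dev/sda8               /             ext3    defaults,noatime          0 0
/dev/sda6               /boot         ext2    noauto,noatime            0 0
EOF
```

On an installed system, describe_fstab < /etc/fstab gives the same overview for your own file.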

It is vital that your /etc/fstab file is a reflection of your environment. By default, Gentoo offers an almost empty /etc/fstab file with invalid placeholder storage locations (such as /dev/BOOT and /dev/ROOT). Every user should change the file, otherwise the system might not boot.

The fstab file is used during the system boot procedure to find out what file systems should be mounted, but also during regular system operation. For instance, when you insert a CD in your CD-ROM player, Linux ought to know where it should mount the CD so that you (and perhaps other users) can reach it.

Mount options

The mount options which you can place in the fourth field in /etc/fstab are well documented in the mount manual page:

Code Listing 1.2: Reading the mount manual page

# man mount

Each set of mount options is documented in a section pertaining to the file system used (for instance, ext2, reiserfs, ...). Some of them are available to all file systems, such as defaults, auto or noauto (automatically mount file system or not).

Special file systems

Some lines in the /etc/fstab file have a none as the storage location. Such file systems are pseudo file systems and do not require any storage on the disk.

  • The proc file system represents kernel information (like statistics, hardware settings, process information, memory data, ...) as regular files on the file system. You can read from those files to obtain the information you need, but these files are never actually written to disk. Every time you read them, the information is recalculated.
  • The tmpfs file system is storage located entirely in memory. Although it is extremely fast, it is also volatile, meaning that it loses its content when you reboot the system. The tmpfs file system is often used for temporary file storage (hence the name), but in the previously given /etc/fstab example it serves as a storage point for certain applications that want to share memory without using the shared memory functionality offered by the C library.
  • The sysfs file system (not shown in the example as Gentoo mounts /sys automatically when it is present) is the successor of the proc file system. It serves the same purpose, but is restructured so it scales well in larger environments.
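
The "never written to disk" nature of the proc file system is easy to observe. The sketch below assumes a Linux system with /proc mounted and uses /proc/version as an example file; the guard keeps it harmless elsewhere.

```shell
# /proc/version is generated by the kernel each time it is read.
if [ -r /proc/version ]; then
    ls -l /proc/version    # the listed size is typically 0 bytes
    cat /proc/version      # yet reading the file returns the kernel version text
fi
```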

Edit /etc/fstab

Don't forget to edit /etc/fstab to suit your environment. You can use nano to open the file:

Code Listing 1.3: Editing /etc/fstab

# nano -w /etc/fstab

8.b. System logging

Purpose of logging

The system logger is an important daemon on the system. A daemon is a tool that runs in the background; you can't work with it interactively.

The job of the system logger is to obtain information from various processes (and in certain configurations even from remote processes) like logon events, web server requests, security events, kernel messages, ... and write them down in separate files: log files.

Such log files can then be used to resolve issues on the machine (hardware errors are usually quite verbose), generate usage statistics (for instance for web servers), backtrack logon events (for security purposes), etc.

Installing a system logger

Gentoo provides various system loggers, each of them with their own pros and cons: metalog, newsyslog, socklog, sysklogd and syslog-ng. Which one you choose is up to you, but it is quite important that you pick one: if you do not install a system logger, all events will be displayed on your terminal, cluttering up your screen instead of nicely archiving the events in files.

Code Listing 2.1: Installing a system logger

# emerge <systemlogger>

Next you'll need to add the system logger of your choice to the default run level:

Code Listing 2.2: Adding the system logger to the default runlevel

(First find out what the init script is called)
# ls /etc/init.d

(Then, add it to the default runlevel)
# rc-update add <initscript> default

8.c. System information

Root password

With the passwd tool you can set or change any user account password. First, you need to set the root user's password. Run passwd and enter the new password. The tool will ask you to confirm the password by reentering it, after which the password is updated.

Code Listing 3.1: Setting the root user password

# passwd

You might want to verify that your keyboard settings are correct before you enter the root password. If the keyboard settings deviate a bit from what you expect them to be, your root password might actually differ from the one you thought you have entered. As the passwd tool does not echo the characters on screen, you can not verify the password by just looking at it.

On most Gentoo installations, the password itself is stored in a hashed format in /etc/shadow, which is only readable by the root user. Hashed means that the file does not contain the password itself, but a mathematical result derived from it. A hash function gives a practically unique value for a given input (here: the password), and the calculation cannot be reversed (i.e. you cannot use the hash to obtain the password).
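
To make the idea of a hash a little more tangible, the sketch below runs a few sample strings through sha256sum (part of GNU coreutils). It is an illustration only: the entries in /etc/shadow are produced by a salted password-hashing scheme rather than a bare SHA-256, and the helper name hash_of is made up for this example.

```shell
# Illustration only: /etc/shadow uses salted password hashes, not plain SHA-256.
# hash_of is a hypothetical helper name, not a system tool.
hash_of() {
    # Print only the hexadecimal digest of the first argument
    printf '%s' "$1" | sha256sum | cut -d ' ' -f 1
}

# The same input always gives the same hash ...
hash_of "secret"
hash_of "secret"
# ... while a slightly different input gives a completely different one
hash_of "Secret"
```

Nothing in the printed digests hints at the original string, which is the property the shadow file relies on.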

The /etc/passwd file, which contains user account information, is readable by any user. Note though that this file does not always contain your user account information - larger networks will probably store this information on a central server (for instance an LDAP server). Where the system looks for account information is configured in the passwd entry of /etc/nsswitch.conf.
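
As a small sketch of how you might check this, the helper below prints the sources listed for passwd in an nsswitch.conf-style file. The name passwd_sources and the sample file content are assumptions made for this example; getent is the standard glibc tool for querying whichever sources are configured.

```shell
# passwd_sources is a hypothetical helper: it takes the path of an
# nsswitch.conf-style file and prints the sources listed for "passwd".
passwd_sources() {
    sed -n 's/^passwd:[[:space:]]*//p' "$1"
}

# Try it on a throw-away sample file (made-up content)
sample=$(mktemp)
printf 'passwd:    files ldap\ngroup:     files\n' > "$sample"
passwd_sources "$sample"
rm -f "$sample"

# On a live system you would point it at the real file and then look up
# an account through whatever sources are configured:
#   passwd_sources /etc/nsswitch.conf
#   getent passwd root
```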

User account

Next, it is strongly recommended to create a user account for daily tasks. The root user is all-powerful; any mistyped command can severely damage your system. Running your applications as the root user also exposes you to security breaches - although not many Linux viruses exist, the damage that a virus can do depends on the privileges it obtains, and it obtains the privileges from the tool whose flaw it has exploited.

To create a user, use useradd and pass the -m option so that the user's home directory (/home/username for the username user) is created. Also list the groups the user should belong to:

Code Listing 3.2: Creating a user

# useradd -m -g users -G wheel,audio,cdrom,games,users john

Most groups are self-explanatory, but the wheel group might need a small introduction.

The wheel group contains all users who can run su to switch from one user to another (including the root user). Only put trusted users in this group. Since su still requires the user to know the password of the account being switched to, a better alternative is sudo, for which an excellent guide exists on the Gentoo web site.

8.d. Networking information

Introduction

To configure your wired network, Gentoo uses the /etc/conf.d/net file. Its syntax might seem a bit strange at first - you'll find well-documented examples in /etc/conf.d/net.example - but it allows you to configure your entire network easily.

Automatic IP retrieval

If your network interface(s) should retrieve their configuration automatically (using DHCP) you don't need to do anything in this file - you can leave it empty. You will find that Gentoo gives a cosmetic warning that it assumes DHCP since you haven't provided anything. If you don't want any warning, explicitly enable DHCP for each interface:

Code Listing 4.1: Specifying DHCP for eth0

config_eth0=( "dhcp" )

Don't forget to install a DHCP client. Available ones are dhcpcd, dhclient (in the dhcp package), udhcpc (in the udhcp package) and pump.

Code Listing 4.2: Installing a DHCP client

# emerge dhcpcd

If you want to pass additional options to the DHCP client (we refer you to the man page of each DHCP client for more information about the available options) use the <client>_<interface> directive. For instance, to set the timeout to 10 seconds (the default for most clients is 1 minute) for the dhcpcd client:

Code Listing 4.3: Setting the DHCP timeout

dhcpcd_eth0="-t 10"

Static IP address

If your interface should be configured with a static IP address, you need to provide the following information:

  • the IP address,
  • the gateway address (where network packets for other networks are sent first), and
  • the domain name server(s) (which translate hostnames to IP addresses)

For instance, suppose that your IP address is 192.168.0.2 and you're part of a network where all IP addresses start with 192.168.0, then you specify:

Code Listing 4.4: eth0 configuration for a static IP address in /etc/conf.d/net

config_eth0=( "192.168.0.2/24" )

The /24 tells the configuration that the first three numbers (each number uses 8 bits, so 24 bits in total) define the network and the last number the host. If all of your IP addresses start with 192.168, then the configuration would read 192.168.0.2/16.
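
The arithmetic behind the prefix length can be sketched in plain shell. The helper name network_of is hypothetical, and 192.168.3.7 is just an extra sample address; the function keeps the first N bits of an address and clears the rest, which is what the /24 or /16 suffix expresses.

```shell
# network_of is a hypothetical helper: network_of <ip address> <prefix length>
network_of() {
    ip=$1
    prefix=$2
    # Split the dotted address into its four numbers
    a=${ip%%.*}; rest=${ip#*.}
    b=${rest%%.*}; rest=${rest#*.}
    c=${rest%%.*}; d=${rest#*.}
    # Combine them into a single 32-bit number
    addr=$(( (a << 24) | (b << 16) | (c << 8) | d ))
    # Build a mask whose first "prefix" bits are set
    mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    net=$(( addr & mask ))
    printf '%d.%d.%d.%d\n' \
        $(( (net >> 24) & 255 )) $(( (net >> 16) & 255 )) \
        $(( (net >> 8) & 255 ))  $(( net & 255 ))
}

network_of 192.168.0.2 24   # network part with a 24-bit prefix
network_of 192.168.3.7 24   # a different /24 network ...
network_of 192.168.3.7 16   # ... but the same /16 network as 192.168.0.2
```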

Next, we need to define where all network packets should go if they aren't meant for the internal network: the gateway. For instance, if 192.168.0.1 is the gateway:

Code Listing 4.5: Adding routing information

route_eth0=( "default via 192.168.0.1" )

The last setting defines where the domain name service (DNS) servers are. These DNS servers translate hostnames (such as www.google.com) to an IP address (such as 66.249.93.104). Save /etc/conf.d/net first and then open /etc/resolv.conf:

Code Listing 4.6: Example /etc/resolv.conf

# Substitute the IP addresses with your DNS server addresses
# Contact your network administrator or ISP if you don't know what to enter.
nameserver 195.130.130.133
nameserver 195.130.131.10

The hosts file

The /etc/hosts file is a small table the system uses to make immediate translations between hostnames and IP addresses. This file should at least contain one line:

Code Listing 4.7: Important line in /etc/hosts

127.0.0.1       localhost

All other lines should be set under that line using the following syntax:

Code Listing 4.8: Syntax for /etc/hosts

<ip address> <fully qualified hostname> <aliases>

For instance, if you want to assign the host name gentoobox to your eth0 interface address (for instance, 192.168.0.2):

Code Listing 4.9: Example /etc/hosts line

192.168.0.2     gentoobox

If you use a domain name for your network (like boxes, but it can also be a real reserved domain name like company.com), you should set it like so:

Code Listing 4.10: Example /etc/hosts line for named network

192.168.0.2     gentoobox.boxes     gentoobox

Besides the host names assigned to your interfaces, you can also add the IP address and hostname information for the other hosts on your network if they aren't known to the DNS servers you've specified (in /etc/resolv.conf, perhaps automatically with DHCP).
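
The kind of lookup the system performs in this table can be imitated with a few lines of awk. This is only a sketch: lookup_host is a hypothetical helper name, and the sample file simply repeats the example lines used above. On a running system, getent hosts <name> asks the real resolver instead.

```shell
# lookup_host is a hypothetical helper: lookup_host <hosts file> <name>
lookup_host() {
    awk -v name="$2" '
        /^[[:space:]]*#/ { next }          # skip comment lines
        {
            # field 1 is the address, all following fields are names
            for (i = 2; i <= NF; i++)
                if ($i == name) { print $1; exit }
        }
    ' "$1"
}

# Build a throw-away hosts file with the example entries
sample=$(mktemp)
printf '127.0.0.1       localhost\n' > "$sample"
printf '192.168.0.2     gentoobox.boxes     gentoobox\n' >> "$sample"

lookup_host "$sample" gentoobox        # found through the alias
lookup_host "$sample" gentoobox.boxes  # found through the full name
rm -f "$sample"
```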

Automatically starting the network at boot

With the configuration in place, your next step is to ensure that the configuration is loaded when you boot your system. Go to /etc/init.d and make a symbolic link from the net.lo init script for each interface you need. For instance, if you have one interface (called eth0):

Code Listing 4.11: Automatically starting the eth0 interface at boot

# cd /etc/init.d
# ln -s net.lo net.eth0
# rc-update add net.eth0 default

These commands might need some explanation:

  • The ln command makes a named link to a file (in this case, net.lo) called net.eth0. This file shouldn't exist before you run this command. There are two types of links one can make:
    • symbolic links merely point to a file or directory. If you remove the destination (like net.lo) then the file is really gone - the symbolic link will point to a non-existing file. You can create a symbolic link using the -s option to the ln command.
    • hard links don't just point to a file, they are actually a second name for the same file. If you remove the destination file, the hard link still contains the content of that file. More technically, if you create a file, you actually reserve some space on a device and create a hard link to it. ln just makes a second hard link to it.
    The advantage of using hard links should be obvious - if you remove one, the content still remains accessible through the other link. However, hard links have one disadvantage compared to symbolic links: they cannot point to files on a different file system. Symbolic links can.
  • The rc-update command configures the boot sequence Gentoo Linux goes through when starting the system. By using the add option, you tell the system that the given script (in this case, net.eth0) should be added to the default runlevel.
    • A runlevel is a name for a certain set of scripts that need to be started in order for the system to function. By default, Gentoo calls its default runlevel ... default. Others are nonetwork (which doesn't start network related scripts), boot (important scripts that must be started) and single (where only those scripts are started that are needed for an administrator to be able to fix a broken system).
    • The script net.eth0 is an init script. Such scripts are written using a specific syntax and reside in /etc/init.d.
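
The difference between the two link types is easy to try out safely. The sketch below works in a temporary directory, so nothing in /etc/init.d is touched; the function name link_demo and the file names are made up for this example.

```shell
# link_demo is a hypothetical helper; it runs in a subshell and in a
# throw-away directory so that it cannot affect the rest of the system.
link_demo() (
    dir=$(mktemp -d) || exit 1
    cd "$dir" || exit 1

    echo "some content" > original
    ln original hardlink       # a second name for the same data
    ln -s original symlink     # a pointer to the name "original"

    rm original                # remove the first name

    cat hardlink               # the data is still reachable
    if [ -e symlink ]; then
        echo "symlink still works"
    else
        echo "symlink is dangling"
    fi

    cd / && rm -rf "$dir"
)

link_demo
```

After the original name is removed, the hard link still shows the content while the symbolic link points at a name that no longer exists.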

8.e. Various configuration settings

The /etc/rc.conf file

The /etc/rc.conf file contains system-wide settings. You will find lots of variables already defined in the file, accompanied by plenty of documentation.

The first variable you'll see is the UNICODE variable. Unicode is the current standard character set, and UTF-8 is its most widely used encoding. A character encoding tells the system what sequence of bits represents what character. Well-known encodings are ASCII, ISO-8859-1, etc. Unicode with its UTF-8 encoding is important because it is able to represent the characters of practically every language (including special characters like €, but also Chinese characters, etc.).
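
To see what "a sequence of bits per character" means in practice, the sketch below prints the bytes behind two characters using od. The helper name bytes_of is an assumption made for this example; the euro sign is written as octal escapes so that the result does not depend on how this text itself is encoded.

```shell
# bytes_of is a hypothetical helper that prints its input as hexadecimal bytes
bytes_of() {
    od -An -tx1 | tr -d ' \n'
    echo
}

# The letter A is a single byte in ASCII, ISO-8859-1 and UTF-8 alike
printf 'A' | bytes_of
# The euro sign needs three bytes in UTF-8
printf '\342\202\254' | bytes_of
```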

If you are interested in using Unicode on your system, please read the UTF-8 Guide on the Gentoo web site.

Another variable of importance is the DISPLAYMANAGER. A display manager is a tool which shows a graphical logon screen after having booted your system. Most display managers even allow you to automatically log on as a specific user. If you want to use a display manager, you need to install one, add the xdm init script to the default runlevel and make sure that this variable points to the display manager of your choice.

Together with the DISPLAYMANAGER variable you'll find the XSESSION one. This tells the display manager what graphical environment it should load by default if the user didn't specify one explicitly. Well-known graphical environments are KDE, GNOME, XFCE, fluxbox, ... For specific instructions on how this variable influences the graphical logon process please read the comments in the rc.conf file.

Select keyboard language

If you aren't using a US Qwerty keyboard, you'll need to edit the /etc/conf.d/keymaps file to tell the Gentoo system what keyboard layout it should use.

9. Finishing off

9.a. Rebooting the system

Exiting the chrooted environment

The base Gentoo installation is almost finished. Right now, you'll need to exit from the chrooted environment, unmount all mounted file systems from the system and reboot. Then we'll find out if the boot procedure settings are correct: if you can log on to your system, great. If not, well, no worries - you don't need to redo everything all over again :)

To exit the chrooted environment, type exit. When you are back in the installation CD environment, find out what file systems are mounted at the /mnt/gentoo location and unmount them one by one. You can't unmount a file system that still has mounted file systems in it, meaning that you can't unmount /mnt/gentoo before /mnt/gentoo/proc and others are unmounted.

Code Listing 1.1: Exiting the chrooted environment and unmounting the file systems

# exit
# mount | grep '/mnt/gentoo'
/dev/sda3 on /mnt/gentoo type ext3 (rw,noatime)
proc on /mnt/gentoo/proc type proc (rw)
/dev/sda1 on /mnt/gentoo/boot type ext2 (rw,noatime)
# umount /mnt/gentoo/boot /mnt/gentoo/proc /mnt/gentoo
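
If many file systems are mounted, a safe unmount order can be worked out rather than guessed. The following is a rough sketch under a few assumptions: umount_order is a hypothetical helper name, it expects mount-style lines on its input, and it selects mount points by a simple prefix match.

```shell
# umount_order is a hypothetical helper: umount_order <top-level mount point>
# It reads "mount"-style lines on standard input.
umount_order() {
    # Field 3 of each line is the mount point. Sorting in reverse puts
    # nested mount points before the directory that contains them.
    awk -v root="$1" 'index($3, root) == 1 { print $3 }' | LC_ALL=C sort -r
}

# Sample input taken from the listing above
printf '%s\n' \
    '/dev/sda3 on /mnt/gentoo type ext3 (rw,noatime)' \
    'proc on /mnt/gentoo/proc type proc (rw)' \
    '/dev/sda1 on /mnt/gentoo/boot type ext2 (rw,noatime)' \
    | umount_order /mnt/gentoo

# On the installation CD you would feed it the real list instead:
#   mount | umount_order /mnt/gentoo
```

The printed list can then be passed to umount in that order, which leaves /mnt/gentoo itself for last.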

Next, reboot the system and hope for the best...

Code Listing 1.2: Rebooting the system

# reboot

Don't forget to remove the installation media from the system, otherwise you'll boot right into the installation environment again.

Boot failed?

If the reboot failed, you need to dig through the error messages you receive to find out what went wrong. You'll find that the Gentoo Forums represent a wonderful Knowledge Base with solutions to many problems.

To help you get back, we'll explain how to return to the installation environment so that you can fix whatever fault is causing the error.

  1. Reinsert the installation medium and reboot your system so that you are back inside the installation environment, just like you were in the beginning of the Gentoo installation.
  2. Load up any drivers you need and configure your network just like you did with the Gentoo installation.
  3. Instead of taking a stab at the storage configuration, immediately mount all your file systems at /mnt/gentoo. Don't forget to mount the proc file system as well. You never know when you'll need it and it is often forgotten.
  4. Chroot into the Gentoo installation (chroot /mnt/gentoo), run env-update and source /etc/profile so that your session environment is configured correctly.
  5. Now fix whatever needs to be fixed.
  6. Exit the chrooted environment, unmount the partitions and reboot to retry.

9.b. Finishing off the base installation

USE flag changes and rebuilding

If you didn't alter your USE flags during the installation, this is a good time to do it. Log on to your system as root (using the password you supplied previously) and edit the USE variable inside /etc/make.conf using your favorite editor (nano is available by default). You may want to reread the information about USE flags given earlier in this book, in the chapter on Building the System.

Once you've updated your USE flags, we'll tell Portage that it needs to rebuild the tools that are affected by your USE flag change. To verify what Portage wants to do, we'll first ask it to show it to us without actually performing the rebuild. The emerge command has an option called --pretend (or -p in short) that does exactly that. When we add the --verbose (or -v in short) option we'll also ask it to display why it wants to rebuild the packages. And of course, we need to ask Portage to do all that just for the packages that are affected by the USE flag change (--newuse, or -N in short):

Code Listing 2.1: Rebuilding packages affected by USE flag change

(In this example, we've changed the nls USE flag:)
# emerge --pretend --verbose --newuse world

These are the packages that I would merge, in order:

Calculating world dependencies ...done!
[ebuild   R   ] sys-apps/man-pages-2.11  -nls* 0 kB 
[ebuild   R   ] sys-apps/grep-2.5.1-r8  -build -nls* +pcre -static 0 kB 
[ebuild   R   ] media-sound/alsa-utils-1.0.10_rc3  -nls* 0 kB 
(...)

If you are okay with whatever Portage proposes, drop the --pretend so that the packages can be rebuilt.

Orphaned packages

Some packages are installed on your system as dependencies of a tool if certain USE flags are set. When you unset the affecting USE flag, Portage will not unmerge the dependency even though the original package is rebuilt. Such dependencies, which aren't needed by any package on the system anymore but are still present, are called orphaned packages.

You can ask Portage to find such orphaned packages and remove them from the system. The method is called depclean (which stands for dependency cleaning):

Code Listing 2.2: Running depclean on the system

# emerge --pretend --depclean

Take a look at the packages Portage wants to remove. If you want to scroll through the list, try pressing Shift-PgUp or filter the list through the less utility:

Code Listing 2.3: Using 'less' for the depclean output

# emerge --pretend --depclean | less

If you're satisfied with the list, drop the --pretend and let Portage sort the packages out.

Updating the system

Finally, update your system so it uses the latest versions of all packages. First, let Portage obtain a more recent snapshot of the Portage tree:

Code Listing 2.4: Updating the Portage tree

# emerge --sync

Next, ask Portage to update the packages that have a more recent version available. We'll use the --update argument for emerge to inform Portage that we want to update them, but also the --deep argument so that not only those packages you have installed (using emerge <packagename>) and their immediate dependencies are updated, but also the dependencies of the dependencies. You'll also notice we use the --newuse argument again. That is because Gentoo might add a USE flag to the default USE set, either because of a profile update, or because you installed a package that "provides" a USE flag.

Code Listing 2.5: Updating the packages on the system

# emerge --update --deep --newuse world

9.c. Installing additional software

Some recommendations...

Now that your base system is available, you're still left in the dark as you don't have many tools at your disposal. No graphical environment for the desktop users, no services for the servers, no development tools (apart from the toolchain) for the developers.

Your next stop should be to investigate the Portage tree for software you want to have. You can just browse through /usr/portage and use emerge for every tool you want, but better would be to follow one or more guides from our web site that help you install and configure the tool.

For instance, Gentoo has a nice Xorg Configuration Guide for those who want to set up their Gentoo installation as a graphical environment desktop or workstation. The xorg-x11 tool is the service that provides windowing features and other graphical possibilities to the various desktop environments.

Possible graphical environments are KDE, GNOME or fluxbox, which also have configuration guides on the Gentoo web site (KDE Configuration Guide, GNOME Configuration Guide and Fluxbox Configuration Guide).

Those more interested in services should take a look at the Gentoo Security Guide, which helps you harden your system configuration.

Other interesting resources are the Gentoo Forums, Gentoo IRC channels and mailing lists.

C. Gentoo Linux for the desktop user

1. Graphical Linux

1.a. The X server

Introduction

Many users believe that Linux is a command-line driven operating system. This isn't true: the command-line interface is a standard, well-supported input method for Linux, but graphical input is equally well supported and rivals other operating systems in usability, flexibility and stability.

Like all tools, the graphical environment is also "just a tool" built to do what it is supposed to do: provide a graphical environment for the end user and libraries for developers so they can write graphical tools. The base of a graphical environment consists of the X11 libraries and the X11 server.

X11 is a network protocol designed to allow graphical environments to be exported over the network. As such, any graphical environment built using the X11 libraries can run on a server while it is displayed on a client. But we're drifting away now...

The X11 server is a service that performs the rendering of graphical environments. It isn't a graphical environment by itself but offers the base on which graphical environments are built: it is a framework that other software packages build upon.

Gentoo supports the xorg-x11 X11 server.

Configuration

Since the X11 server performs the rendering, you need to configure it to use the hardware you work on. Gentoo has a nice X Server Configuration HOWTO which you definitely should read.

1.b. Desktop Environments

Introduction

With a bare X11 server you won't be able to do much. You need a window manager which takes care of the graphical layout of the environment and possibly even a desktop environment which integrates tools and usability guidelines with a window manager.

A desktop environment is a full blown graphical environment offering everything a desktop might need, all in a coherent package. Backgrounds, file management, drag and drop, screensavers, menus, theming with icons and sounds, virtual desktops, ... you name it, all of that is defined in a desktop environment. This is also why most users are searching for a desktop environment.

Users who want a small graphical environment with just the tools they need often opt for a window manager instead as they don't need all the bells and whistles a desktop environment offers.

The next few paragraphs give a small introduction to various desktop environments. The next section discusses a few window managers. The list is not meant to be exhaustive but rather to provide some guidance to the new Gentoo user.

KDE

With KDE, users are offered a full-blown environment with a plethora of desktop utilities. It seems as if the KDE project tries to contain everything a user might require from a desktop: games, development tools, office suites, imaging support, multimedia tools, desklets, system utilities, ... and all of those are built upon the same libraries, so all tools have a consistent look and feel and offer a well-developed drag and drop mechanism.

The KDE project maintains much documentation (in various languages) and offers a quick release cycle with new features and fixes available at every new release. You'll find that the integration of the tools is flawless (the address book is linked from the Personal Information Management tool kontact, E-mail client kmail, Event Manager kjournal and of course the Address book maintenance tool kaddressbook) and that the configuration interface kcontrol is complete and well documented.

If you are interested in using KDE, don't hesitate to read the KDE Configuration Guide.

GNOME

The GNOME Foundation offers a consistent desktop environment (GNOME) which is developed using strict guidelines aimed at maximum usability (layout and so on are strictly defined). Many GNOME enthusiasts are proud of their environment because it is simple to use, yet powerful and fully functional.

When you load up GNOME, you will notice that its interface is sober but well designed: the GNOME menu limits itself to the tools you'll most likely use while hiding the rest of the tools, which would probably confuse most users anyway. The window decoration is simple, but gives a nice finished look. Configuration options are limited at first sight but are very easy to comprehend. Real configuration gurus know that GNOME has a very extensive configuration model, but it is hidden from the interface because most users wouldn't need it anyway.

The GNOME project has multi-lingual documentation and a good network of related sites where you can find the latest news about GNOME and GNOME tools.

If you are interested in using GNOME, don't hesitate to read the GNOME Configuration Guide.

1.c. Window Managers

Fluxbox

The fluxbox window manager began its life as a spin-off of the blackbox window manager. When you install fluxbox, you'll notice that it is a lot faster than desktop environments. This of course isn't only true for fluxbox but for most other window managers: their job is a lot smaller than that of a desktop environment.

fluxbox offers the user a simple interface for window managing, yet supports everything (and more) you require: we aren't talking about minimizing and maximizing windows here (of course fluxbox supports that) but about tabbed windows, stickiness, virtual desktops, hotkeys, ...

If you are interested in using fluxbox, don't hesitate to read the Fluxbox Configuration Guide.

1.d. The right tool for the right job

Introduction

After an overwhelming first experience with the graphical Linux environment, many users ask "What's next?" They don't know what tools exist, and just playing around, clicking on icons, doesn't help you find out what's possible.

Working well with Linux means that you know what tools to use. In this section, we'll give you a head start on various tools and projects. These tools aren't mandatory, but give a nice idea of Linux' possibilities.

Multimedia related

When you ask on #gentoo what the best media player for Gentoo Linux is, you'll probably get two answers: MPlayer and amaroK.

MPlayer is a well-known movie player and includes many features such as DVD playing (although Ogle has a nicer, intuitive interface), encoding/decoding of dozens of video formats, lots of output formats (ever seen a movie rendered in ASCII characters?) and more. Some might argue that Xine is an easier multimedia player - in the end, it is up to you to decide.

amaroK on the other hand is a music player which integrates nicely in KDE and is very featureful: automated lyric fetching, lots of eye-candy, support for lots of audio formats, ... Another intuitive media player is the GNOME-related Rhythmbox.

Office Related

With OpenOffice.org you have a full-featured office suite which uses a standard office file format and supports Microsoft Office documents to a great extent. There is also GNOME Office, a set of separately developed applications for word processing, spreadsheet calculations and database access.

AbiWord is a good choice if you will only be doing word processing. AbiWord is a full-featured word processor that is much lighter and faster than OpenOffice, while still retaining complete interoperability with industry-standard document types. Additionally, AbiWord integrates particularly well with the GNOME desktop.

2. Plug and play

2.a. Identify your needs

Nothing is automated at first

When you first use your system, you'll notice that you need to perform various steps manually which could be done automatically. Gentoo Linux doesn't put its development effort into making a system user-friendly in the sense of doing everything for you. Not only can that cause instability on some systems, but far from all users appreciate us touching the configuration of their system, and it might cause security issues in some environments.

So, if you want certain tasks automated, you'll need to lend your system a hand at first. But you can't help your system if you don't know what you want. Next is a list of automation tasks we'll cover in the next few sections. This list is far from complete, but should give you a good impression of the possibilities. Ask around on our IRC channel or mailing list if you want to automate other tasks.

  • Removable media covers the automated mounting of removable media like floppies, CD/DVDs, USB storage devices, cameras, ... on your system.
  • Network detection explains how to configure your network card (wireless or not) to automatically discover networks and run a certain configuration based on the network it finds.
  • Data synchronisation describes how to synchronise files and directories on your system with other systems, laptops or PDAs, and also the synchronisation of your journal, address books, etc.
  • Power consumption informs you how you can decrease the power consumption of your system (probably a laptop) by automatically putting the hard disk to sleep, disabling wireless if it isn't needed, adjusting the screen settings to use less power, etc.

2.b. Removable media

What is ivman?

ivman is a tool that listens to hardware-related events. With this tool, removable media can be automounted and actions can be programmed based on the event type (such as closing a laptop's lid).

Installing and setting up ivman

To set up ivman, you should install it and then have it started automatically at boot:

Code Listing 2.1: Installing and setting up ivman

~# emerge ivman
~# rc-update add ivman default
~# /etc/init.d/ivman start

The tool will mount removable media read/write for any user in the plugdev group, so add your users to that group:

Code Listing 2.2: Adding users to the plugdev group

~# gpasswd -a user plugdev

2.c. Network detection

What you can and can't do

If all your networks are nice enough to provide DHCP services, then you just need to set your interface to ask for an IP address. However, if you need to provide static IP addresses for a few networks, then you're out of luck: as networks don't provide an identification service, there is no way for a computer to know what network it is in.

However, when you are using wireless networks, most of them do provide network identification using the ESSID. The wpa_supplicant tool allows you to pick wireless settings based on the ESSID. The Gentoo baselayout package allows for ESSID-specific network settings.

2.d. Data synchronisation

2.e. Power consumption

3. Software collaboration

3.a. Interoperable data formats

Choosing the file type wisely

Converting files

Using legacy formats

3.b. Drag and drop

The underlying widgets

Trolltech Qt

GNOME GTK

3.c. Team collaboration

Concurrent access and versioning

Wrappers

Plug-ins

3.d. Message busses

Theory

DBus

D. Gentoo Linux for enterprise environments

1. Software RAID

1.a. Software RAID

Introduction

Advantages

Disadvantages

1.b. Setting up software RAID

Installing the tools

Using software RAID

Software RAID for root file system

1.c. Managing software RAID

Adding and removing disks

1.d. Further resources

Online

2. Logical Volume Management

2.a. Logical Volume Management

Introduction

Physical, group and logical

Configuring the kernel

Installing the tools

2.b. Configuring LVM

Creating the meta devices

Choosing a file system

Automatic activation during system boot

2.c. Maintaining LVM

Adding or removing physical extents

Creating a snapshot

3. Backup systems

3.a. Purpose of your files

Know what to back up

User versus system files

Immediately recoverable or not?

3.b. Backing up on a per-file basis

3.c. Backing up file systems

3.d. Backing up an entire system

3.e. Backup strategies

Full backups

Incremental backups

Individual backups

Backup locations

Verify the backups

4. Print server

4.a.

E. System Administration

1. Software management

1.a. Software maintenance

1.b. Using prebuilt software packages

1.c. Understanding ebuilds

2. Log files

2.a.

3. Centralised system management

3.a.

F. Performance tuning

1. Input/output performance

1.a. Know what to measure

Benchmarks

Usage sessions

Gut feeling

1.b. Understanding the chain

System calls

Kernel driver

Hardware

1.c. Tuning the system calls

1.d. Tuning the kernel drivers

1.e. Tuning the hardware

2. Network performance

2.a. Network cards and drivers

2.b. Ethernet network

2.c. Connected Internet

2.d. Wireless networks

2.e. Virtual private networks

3. Rendering performance

3.a. All about drivers

3.b. nVidia-based graphics cards

3.c. ATI-based graphics cards

3.d. Render engines

4. Software profiling

4.a. Execution profiling

4.b. Memory profiling

4.c. Benchmarking

5. User-observed performance

5.a. Latency

5.b. Parallel execution

5.c. Gradual detailing

G. Appendix: architecture specific information

1. The x86 Architecture

1.a. Booting CDs

BIOS

The BIOS (Basic Input/Output System) is the first system started when you power on your computer. It first performs a POST (Power-On Self Test) to verify if your hardware is still in good shape. When the POST gives the BIOS an okay, the BIOS will load the boot loader from the boot device configured in its memory. This boot loader then fires up the operating system and the entire system dance starts...

To boot from a CD, you need to configure your BIOS so that the CD-ROM device is the first boot device. Reboot your system and fire up the BIOS. You will be informed about what key to press right after (or during) the POST to get in the BIOS setup. Most BIOSes use Esc, F1, F2, DEL or F8.

Once you are inside the BIOS setup, search for the setting where you can change the boot device sequence. Some BIOSes place it beneath CMOS Setup, but there isn't a standard - each BIOS has its differences. Change the order so that the CD-ROM device is mentioned first before the first hard disk (HD-0). Such a setting will allow you to boot from a bootable CD if there is one in the CD-ROM drive, or boot from the hard drive otherwise.

1.b. Partitioning the disks

Partition layout

Each disk on an x86 system can have at most four primary partitions. This is a remnant of the old days when four partitions were considered enough. Each primary partition has its identification inside the first sector on the disk (the boot record). When you want more than four partitions, you should configure one of the primary partitions to contain all the non-primary partitions. This large primary partition is called the extended partition and the partitions inside it are called logical partitions.

The de facto standard device naming convention tells us that the primary partitions for a disk are numbered as 1 to 4 while the logical ones are 5 and higher, regardless of how many primary partitions you use. The naming convention also tells us how the disks themselves are named.

  • IDE device names start with hd followed by a letter which indicates the location of the disk in the system: the primary master (see Note 1) is hda, the primary slave hdb, the secondary master hdc, etc.
  • SCSI device names (and most Serial ATA ones as well) start with sd followed by a letter which indicates the position of the disk in the disk chain: the first one is sda, the second one sdb, etc.

Device files are located inside /dev, so if you want to identify the primary master IDE drive you would state /dev/hda.

Note 1: IDE drives are controlled by an IDE controller. Each controller can govern two IDE devices at most: a master and a slave. The master drive has higher priority when both drives are attempting to send or receive data. A standard x86 system has two controllers, a primary one (ide0) and a secondary one (ide1).
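
The naming convention can be summarised in a few lines of code. This is just a sketch of the rules given above; the function names are ours and it only builds strings, it does not check whether the devices exist.

```python
import string


def ide_device(controller, is_slave):
    """Device file for an IDE drive (controller 0 = primary, 1 = secondary)."""
    index = controller * 2 + (1 if is_slave else 0)
    return "/dev/hd" + string.ascii_lowercase[index]


def scsi_device(position):
    """Device file for the disk at the given 0-based position in the chain."""
    return "/dev/sd" + string.ascii_lowercase[position]


def partition_device(disk, kind, index):
    """Device file for the index-th (1-based) primary or logical partition."""
    if kind == "primary":
        if not 1 <= index <= 4:
            raise ValueError("a disk has at most four primary partitions")
        number = index
    elif kind == "logical":
        if index < 1:
            raise ValueError("logical partitions are counted from 1")
        number = 4 + index  # logical partitions always start at 5
    else:
        raise ValueError("kind must be 'primary' or 'logical'")
    return f"{disk}{number}"


print(ide_device(0, False))                                 # primary master
print(ide_device(1, False))                                 # secondary master
print(partition_device(scsi_device(0), "logical", 2))       # second logical partition
```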

Partitioning using cfdisk

The cfdisk tool allows you to quickly partition your disks and is a lot easier to use than fdisk. When you have fired up cfdisk (which selects /dev/hda by default - you can use a different disk by giving the device file name as an argument) you will get an overview of the available partitions, each one listed with the device name, partition type, file system type and size.

When you take a look at the interface, you'll notice that it is quite self-explanatory:

  • When you select a free space region, you can add new partitions by pressing New. cfdisk will ask you what kind of partition you want (primary or logical) and its size, after which the partition is added to the overview pane.
  • When you select a partition entry, you can change its partition type (which tells the system what kind of file system to expect). Just select Type and search for the type you are interested in (probably 82 - Linux swap / Solaris, or 83 - Linux).
  • Since some BIOSes require the partition that stores the boot loader for the operating system to be marked as bootable, there is also an option of doing so.

You should create your partitions, not forgetting to mark at least one of them for swap usage. Even though swap files are supported by Linux, they are not recommended since they have some impact on the system's performance. A dedicated swap partition performs far better than a swap file.

A frequently asked question is how to partition the disk. There is no satisfying answer to that, and any attempt to obtain one will result in a cataclysmic series of flamewars. So we'll stick with a simple suggestion: it doesn't hurt to use two partitions: one for the entire Linux system and one for the swap space. You'll get to know your own preferences when you are more experienced with Linux.

1.c. The make.conf file

Introduction

The information given in this section is not meant to be exhaustive. We provide you with the settings Gentoo supports. If you use different settings, we do not claim that Gentoo doesn't support them, but that is possible. We list the settings by subarchitecture - a set of machine instructions supported by a range of x86 systems. All subarchitectures are derived from older, compatible subarchitectures. If at any time you are uncertain which one to pick, you should use the oldest subarchitecture. Picking one that is more recent than the one your system supports will result in segmentation faults or internal errors.

The CXXFLAGS setting is never shown; you should set it to the same value as CFLAGS:

Code Listing 3.1: Setting the CXXFLAGS variable

CFLAGS="..."
CXXFLAGS="${CFLAGS}"

The generic x86 subarchitecture

The next settings work on every x86 system (apart from the i286 and lower):

Code Listing 3.2: Generic x86 settings

CHOST="i386-pc-linux-gnu"
CFLAGS="-march=i386 -O2 -pipe"

The ix86 series

The ix86 series (i486, i586, i686) all refer to Intel and Intel-compatible CPUs. The i586 instruction set is also known as the Pentium and the i686 one as the Pentium Pro, the predecessor of the Pentium II.

Code Listing 3.3: ix86 Series

(Substitute i486 with i586 or i686 accordingly)
CHOST="i486-pc-linux-gnu"
CFLAGS="-march=i486 -O2 -pipe"

You can also substitute the i486 value inside the CFLAGS setting (and not the CHOST one) with pentium or pentiumpro for such systems.

Intel CPUs

Additional support is available for various Intel CPUs, such as pentium-mmx, pentium2 (including Celeron), pentium3, pentium4, and nocona:

Code Listing 3.4: Intel CPU series

(Substitute the pentium-mmx value in CFLAGS with the value you need)
CHOST="i686-pc-linux-gnu"
CFLAGS="-march=pentium-mmx -O2 -pipe"

AMD CPUs

AMD CPU users can use any of the following settings for -march: athlon-xp, athlon-mp, athlon-tbird, athlon, k6, k6-2, k6-3.

Code Listing 3.5: AMD CPU series

(Substitute the athlon-xp value in CFLAGS with the value you need)
CHOST="i686-pc-linux-gnu"
CFLAGS="-march=athlon-xp -O2 -pipe"
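
As a recap, the listings of this section can be generated from a single -march value. The sketch below is our own summary, not a Gentoo tool. The make_conf function is hypothetical, and pairing pentium with an i586 CHOST and pentiumpro with an i686 CHOST is our reading of the text above, so verify it against your own system.

```python
# CHOST CPU part used in this section's listings for the generic values.
# The pentium and pentiumpro rows are an assumption (see the lead-in).
GENERIC = {"i386": "i386", "i486": "i486", "i586": "i586", "i686": "i686",
           "pentium": "i586", "pentiumpro": "i686"}
# CPU-specific values named in this section; its listings all use i686 for them.
INTEL = ["pentium-mmx", "pentium2", "pentium3", "pentium4", "nocona"]
AMD = ["athlon-xp", "athlon-mp", "athlon-tbird", "athlon", "k6", "k6-2", "k6-3"]


def make_conf(march="i386"):
    """Return the CHOST, CFLAGS and CXXFLAGS lines for a -march value.

    The default, i386, is the oldest subarchitecture and thus the safe pick.
    """
    if march in GENERIC:
        chost_cpu = GENERIC[march]
    elif march in INTEL or march in AMD:
        chost_cpu = "i686"
    else:
        raise ValueError(f"unknown -march value {march!r}; i386 is the safe choice")
    return "\n".join([
        f'CHOST="{chost_cpu}-pc-linux-gnu"',
        f'CFLAGS="-march={march} -O2 -pipe"',
        'CXXFLAGS="${CFLAGS}"',  # always mirrors CFLAGS
    ])


print(make_conf("athlon-xp"))
```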

1.d. Bootloaders

GRUB

The grub bootloader is a powerful application, able to boot various operating systems, including Microsoft Windows. One of its most powerful features is its ability to understand various file systems, which makes it possible for grub to aid you in your boot setup, especially when there are some issues you need to fix.

For instance, you can browse a file system looking for files, read different grub configurations, boot various Linux kernels and locate files on the system (and view their contents). You can also hide partitions, boot from a network (using BOOTP to obtain the network settings and TFTP, a simple file transfer protocol very often used to send boot images to various systems, to fetch the files), change the partition table, ...

GRUB: configuration

To use grub, you need to install it first (from within the chrooted environment):

Code Listing 4.1: Installing GRUB

# emerge grub

Next, edit (or create) the /boot/grub/grub.conf file. We'll first give you a simple example of a grub.conf file:

Code Listing 4.2: Example grub.conf file

default 0
timeout 5

title=Gentoo Linux
root (hd0,5)
kernel /kernel-2.6.14-gentoo-r2 root=/dev/sda8

GRUB always starts counting from zero. For instance, to boot the first entry by default, we state default 0. The other line, timeout 5, tells GRUB to wait 5 seconds before it actually boots the entry pointed to by the default setting.

This is of course not the most difficult part of GRUB. The entries themselves however are. In the given example, there are three commands given to GRUB:

  1. The title entry tells GRUB what to display to the user when he is asked to make his selection.
  2. The root entry informs GRUB where its own files are stored. This is not the Linux root file system (it can be, but this isn't always true). If you have /boot (where GRUB stores its files) as a separate partition, you point this directive to that partition.
  3. The kernel entry is used by GRUB to know what Linux kernel to boot (relative to the file system where root points to) and what boot parameters to add.
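
To see how the global settings and the entries relate, the following sketch reads the example grub.conf and picks the entry that default points to. It is an illustrative parser written for this handbook, not the way GRUB itself processes the file, and it only handles the directives shown above.

```python
import re

EXAMPLE = """\
default 0
timeout 5

title=Gentoo Linux
root (hd0,5)
kernel /kernel-2.6.14-gentoo-r2 root=/dev/sda8
"""


def parse_grub_conf(text):
    """Split a grub.conf into global settings and a list of boot entries."""
    settings, entries = {}, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # A directive is separated from its value by whitespace or "=".
        match = re.match(r"(\w+)[\s=]+(.*)", line)
        if not match:
            continue
        key, value = match.groups()
        if key == "title":
            entries.append({"title": value})   # a title starts a new entry
        elif entries:
            entries[-1][key] = value           # belongs to the current entry
        else:
            settings[key] = value              # before any title: global
    return settings, entries


settings, entries = parse_grub_conf(EXAMPLE)
chosen = entries[int(settings.get("default", 0))]  # counting starts at zero
print(chosen["title"], "on", chosen["root"])
```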

Many users make a mistake when they configure the root parameter. That's mostly because the syntax used by GRUB to identify partitions is different from what they're used to working with. Its syntax is quite simple:

Code Listing 4.3: GRUB's partition syntax

(hdharddisk-#,partition-#)

The harddisk-# is the hard disk number, starting from 0. If you only have one hard disk, it is 0, regardless of where the disk is located. If you have several disks, start counting from the one which your system checks first. For instance, if you only have IDE disks, your system will probably start with the primary master, then primary slave, then secondary master, ...

The partition-# is the partition number, starting from 0, and uses the same logic used with the partitioning you did earlier. The first four partitions (0-3) are the primary partitions. The logical partitions start from the number 4. So, in the above example, the GRUB files are stored on the second logical partition on the first disk (also known as /dev/sda6).
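
Since both numbers simply shift by one position, the translation is easy to sketch in code. The two functions below are hypothetical helpers, and they assume that the disk letters follow the order in which your system checks the disks (a becomes 0, b becomes 1, ...). That assumption fails if, for instance, you have an hda and an hdc but no hdb.

```python
import re


def linux_to_grub(device):
    """Translate /dev/sda6 into (hd0,5); a whole disk gives (hd0)."""
    match = re.fullmatch(r"/dev/[hs]d([a-z])(\d+)?", device)
    if not match:
        raise ValueError(f"unrecognised device name: {device}")
    disk = ord(match.group(1)) - ord("a")   # a -> 0, b -> 1, ...
    if match.group(2) is None:
        return f"(hd{disk})"
    partition = int(match.group(2))
    if partition < 1:
        raise ValueError("Linux partition numbers start at 1")
    return f"(hd{disk},{partition - 1})"    # GRUB counts from 0


def grub_to_linux(spec, prefix="sd"):
    """Translate (hd0,5) into /dev/sda6; prefix is 'sd' or 'hd'."""
    match = re.fullmatch(r"\(hd(\d+)(?:,(\d+))?\)", spec)
    if not match:
        raise ValueError(f"unrecognised GRUB device: {spec}")
    disk = int(match.group(1))
    if disk > 25:
        raise ValueError("this sketch only handles single-letter disk names")
    name = f"/dev/{prefix}{chr(ord('a') + disk)}"
    if match.group(2) is not None:
        name += str(int(match.group(2)) + 1)
    return name


print(linux_to_grub("/dev/sda6"))
print(grub_to_linux("(hd0,5)"))
```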

You'll find more information about GRUB (including nicely commented configuration examples) in the GRUB info pages:

Code Listing 4.4: Retrieving GRUB information

# info grub

GRUB: installation

You still have to install grub in the MBR (Master Boot Record) though, so that your BIOS is able to find and start it. Otherwise, your system will inform you that no operating system is found...

The recommended method uses grub-install to set up GRUB. Yet this tool relies on some information not present on your system yet: the /etc/mtab file, a cache file which contains information about the mounted file systems. Create one that makes grub-install happy. You only need to enter the file system for your root partition (/) and, if you have one, for your boot partition (/boot):

Code Listing 4.5: Example /etc/mtab file

/dev/sda8   /       ext3   rw,noatime   0 0
/dev/sda6   /boot   ext2   rw,noatime   0 0

Then, run grub-install with the device that represents the first disk your system will boot from. For instance, if that first disk is /dev/sda:

Code Listing 4.6: Installing GRUB in the MBR using grub-install

(grub-install also supports the (hd0) notation)
# grub-install /dev/sda

The grub-install tool will then search through /etc/mtab to find out where the GRUB files are stored and install a minimal boot loader in the MBR whose only job is to find and start the rest of the GRUB files.
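
The lookup itself is not complicated. The sketch below shows the idea with the example /etc/mtab from above: find the mounted file system whose mount point is the longest match for /boot/grub. It is a simplified illustration with helper names of our own, not the actual code of grub-install.

```python
MTAB = """\
/dev/sda8   /       ext3   rw,noatime   0 0
/dev/sda6   /boot   ext2   rw,noatime   0 0
"""


def parse_mtab(text):
    """Map each mount point to its (device, file system type)."""
    mounts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and not fields[0].startswith("#"):
            device, mount_point, fs_type = fields[:3]
            mounts[mount_point] = (device, fs_type)
    return mounts


def device_holding(path, mounts):
    """Return the device whose mount point is the longest prefix of path."""
    best = None
    for mount_point in mounts:
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix):
            if best is None or len(mount_point) > len(best):
                best = mount_point
    if best is None:
        raise LookupError(f"no mounted file system contains {path}")
    return mounts[best][0]


mounts = parse_mtab(MTAB)
print(device_holding("/boot/grub", mounts))  # separate /boot partition
print(device_holding("/etc", mounts))        # falls back to the root partition
```

Without the /boot line in the file, the same lookup would return the root partition, which is why that line is only needed when /boot is a separate partition.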

If you come to the conclusion that the installation has failed, you can try to perform the grub-install steps manually. Run grub, then enter the configuration commands root (where the GRUB files are located - the same value as in the configuration file grub.conf), setup (where to install GRUB - (hd0) is most likely) and quit (to exit the GRUB installation):

Code Listing 4.7: Performing the GRUB installation steps manually

(The following is just an example)
# grub
grub> root (hd0,5)
grub> setup (hd0)
grub> quit

Updated June 5, 2008

Summary: This handbook tries to extend on various subjects regarding Linux and the Gentoo Linux operating system. It is written with the casual user in mind who wants to learn about Linux rather than just follow instructions to the letter. Although I hope this handbook will eventually be complete, it currently lacks so many important subjects that it is far from ready yet to be officially published.

Sven Vermeulen
Author

Copyright 2001-2009 Gentoo Foundation, Inc.