The roots of Linux can be traced back to the origins of Unix™. In 1969, Ken Thompson of the Research Group at Bell Laboratories began experimenting on a multi-user, multi-tasking operating system using an otherwise idle PDP-7. He was soon joined by Dennis Ritchie and the two of them, along with other members of the Research Group, produced the early versions of Unix™. Ritchie was strongly influenced by an earlier project, MULTICS, and the name Unix™ is itself a pun on the name MULTICS. Early versions were written in assembly code, but the third version was rewritten in a new programming language, C. C was designed and written by Ritchie expressly as a programming language for writing operating systems. This rewrite allowed Unix™ to move onto the more powerful PDP-11/45 and 11/70 computers then being produced by DIGITAL. The rest, as they say, is history. Unix™ moved out of the laboratory and into mainstream computing and soon most major computer manufacturers were producing their own versions.
Linux was the solution to a simple need. The only software that Linus Torvalds, Linux's author and principal maintainer, was able to afford was Minix. Minix is a simple, Unix™-like operating system widely used as a teaching aid. Linus was less than impressed with its features; his solution was to write his own software. He took Unix™ as his model as that was an operating system that he was familiar with in his day-to-day student life. He started with an Intel 386 based PC and started to write. Progress was rapid and, excited by this, Linus offered his efforts to other students via the emerging worldwide computer networks, then mainly used by the academic community. Others saw the software and started contributing. Much of this new software was itself the solution to a problem that one of the contributors had. Before long, Linux had become an operating system. It is important to note that Linux contains no Unix™ code; it is a rewrite based on published POSIX standards. Linux is built with and uses a lot of the GNU (GNU's Not Unix™) software produced by the Free Software Foundation in Cambridge, Massachusetts.
Most people use Linux as a simple tool, often just installing one of the many good CD ROM-based distributions. A lot of Linux users use it to write applications or to run applications written by others. Many Linux users read the HOWTOs[1] avidly and feel both the thrill of success when some part of the system has been correctly configured and the frustration of failure when it has not. A minority are bold enough to write device drivers and offer kernel patches to Linus Torvalds, the creator and maintainer of the Linux kernel. Linus accepts additions and modifications to the kernel sources from anyone, anywhere. This might sound like a recipe for anarchy but Linus exercises strict quality control and merges all new code into the kernel himself. At any one time, though, there are only a handful of people contributing sources to the Linux kernel.
The majority of Linux users do not look at how the operating system works or how it fits together. This is a shame because looking at Linux is a very good way to learn more about how an operating system functions. Not only is it well written, all the sources are freely available for you to look at. This is because although the authors retain the copyrights to their software, they allow the sources to be freely redistributable under the Free Software Foundation's GNU General Public License. At first glance, though, the sources can be confusing; you will see directories called kernel, mm and net but what do they contain and how does that code work? What is needed is a broader understanding of the overall structure and aims of Linux. This, in short, is the aim of this book: to promote a clear understanding of how Linux, the operating system, works, and to provide a mental model that allows you to picture what is happening within the system as you copy a file from one place to another or read electronic mail. I well remember the excitement that I felt when I first realized just how an operating system actually worked. It is that excitement that I want to pass on to the readers of this book.
My involvement with Linux started late in 1994 when I visited Jim Paradis, who was working on a port of Linux to Alpha AXP processor based systems. I had worked for Digital Equipment Co. Limited since 1984, mostly in networks and communications, and in 1992 I started working for the newly formed Digital Semiconductor division. This division's goal was to enter fully into the merchant chip vendor market and sell chips outside of Digital: in particular the Alpha AXP range of microprocessors, but also Alpha AXP system boards. When I first heard about Linux I immediately saw an opportunity to have fun. Jim's enthusiasm was catching and I started to help on the port. As I worked on this, I began more and more to appreciate not only the operating system but also the community of engineers that produces it.
However, Alpha AXP is only one of the many hardware platforms that Linux runs on. Most Linux kernels are running on Intel processor based systems but a growing number of non-Intel Linux systems are becoming available. Amongst these are Alpha AXP, ARM, MIPS, Sparc and PowerPC. I could have written this book using any one of those platforms but my background and technical experiences with Linux are with Linux on the Alpha AXP and, to a lesser extent, on the ARM. This is why this book sometimes uses non-Intel hardware as an example to illustrate some key point. It must be noted that around 95% of the Linux kernel sources are common to all of the hardware platforms that it runs on. Likewise, around 95% of this book is about the machine independent parts of the Linux kernel.
I have deliberately not described the kernel's algorithms, its methods of doing things, in terms of routine_X() calls routine_Y() which increments the foo field of the bar data structure. You can read the code to find these things out. Whenever I need to understand a piece of code or describe it to someone else I often start with drawing its data structures on the white-board. So, I have described many of the relevant kernel data structures and their interrelationships in a fair amount of detail.
Each chapter is fairly independent, like the Linux kernel subsystem that it describes. Sometimes, though, there are linkages; for example, you cannot describe a process without understanding how virtual memory works.
The Hardware Basics chapter gives a brief introduction to the modern PC. An operating system has to work closely with the hardware system that acts as its foundation. The operating system needs certain services that can only be provided by the hardware. In order to fully understand the Linux operating system, you need to understand the basics of the underlying hardware.
The Software Basics chapter introduces basic software principles and looks at the assembly and C programming languages. It looks at the tools that are used to build an operating system like Linux and it gives an overview of the aims and functions of an operating system.
The Memory Management chapter describes the way that Linux handles the physical and virtual memory in the system.
The Processes chapter describes what a process is and how the Linux kernel creates, manages and deletes the processes in the system.
Processes communicate with each other and with the kernel to coordinate their activities. Linux supports a number of Inter-Process Communication (IPC) mechanisms. Signals and pipes are two of them but Linux also supports the System V IPC mechanisms, named after the Unix™ release in which they first appeared. These interprocess communication mechanisms are described in the IPC chapter.
The Peripheral Component Interconnect (PCI) standard is now firmly established as the low cost, high performance data bus for PCs. The PCI chapter describes how the Linux kernel initializes and uses PCI buses and devices in the system.
The Interrupts and Interrupt Handling chapter looks at how the Linux kernel handles interrupts. Whilst the kernel has generic mechanisms and interfaces for handling interrupts, some of the interrupt handling details are hardware and architecture specific.
One of Linux's strengths is its support for the many available hardware devices for the modern PC. The Device Drivers chapter describes how the Linux kernel controls the physical devices in the system.
The File System chapter describes how the Linux kernel maintains the files in the file systems that it supports. It describes the Virtual File System (VFS) and how the Linux kernel's real file systems are supported.
Networking and Linux are terms that are almost synonymous. In a very real sense Linux is a product of the Internet or World Wide Web (WWW). Its developers and users use the web to exchange information, ideas and code, and Linux itself is often used to support the networking needs of organizations. The Networks chapter describes how Linux supports the network protocols known collectively as TCP/IP.
The Kernel Mechanisms chapter looks at some of the general tasks and mechanisms that the Linux kernel needs to supply so that other parts of the kernel work effectively together.
The Modules chapter describes how the Linux kernel can dynamically load functions, for example file systems, only when they are needed.
The Processors chapter gives a brief description of some of the processors that Linux has been ported to.
The Sources chapter describes where in the Linux kernel sources you should start looking for particular kernel functions.
|serif font|identifies commands or other text that is to be typed literally by the user.|
|type font|refers to data structures or fields within data structures.|
Throughout the text there are references to pieces of code within the Linux kernel source tree (for example the boxed margin note adjacent to this text). These are given in case you wish to look at the source code itself and all of the file references are relative to /usr/src/linux. Taking foo/bar.c as an example, the full filename would be /usr/src/linux/foo/bar.c. If you are running Linux (and you should), then looking at the code is a worthwhile experience and you can use this book as an aid to understanding the code and as a guide to its many data structures.
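The file reference convention above can be sketched in a few lines of Python. This is only an illustration: the helper name full_source_path is hypothetical and is not part of the kernel or of this book, and the root /usr/src/linux is simply the location that the text assumes.

```python
import posixpath

# Root of the kernel source tree, as assumed throughout the text.
KERNEL_SOURCE_ROOT = "/usr/src/linux"


def full_source_path(reference, root=KERNEL_SOURCE_ROOT):
    """Return the full filename for a file reference given relative to the source root.

    Hypothetical helper, for illustration only.
    """
    return posixpath.join(root, reference)


# The example from the text: foo/bar.c relative to the source root.
print(full_source_path("foo/bar.c"))  # prints /usr/src/linux/foo/bar.c
```

If your kernel sources are unpacked somewhere else, pass that directory as root; the relative references in the text stay the same.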
Caldera, OpenLinux and the "C" logo are trademarks of Caldera, Inc.
Caldera OpenDOS © 1997 Caldera, Inc.
DEC is a trademark of Digital Equipment Corporation.
DIGITAL is a trademark of Digital Equipment Corporation.
Linux is a trademark of Linus Torvalds.
Motif is a trademark of The Open Software Foundation, Inc.
MS-DOS is a trademark of Microsoft Corporation.
Red Hat, glint and the Red Hat logo are trademarks of Red Hat Software, Inc.
UNIX is a registered trademark of X/Open.
XFree86 is a trademark of XFree86 Project, Inc.
X Window System is a trademark of the X Consortium and the Massachusetts Institute of Technology.
I was born in 1957, a few weeks before Sputnik was launched, in the north of England. I first met Unix at University, where a lecturer used it as an example when teaching the notions of kernels, scheduling and other operating systems goodies. I loved using the newly delivered PDP-11 for my final year project. After graduating (in 1982 with a First Class Honours degree in Computer Science) I worked for Prime Computers (Primos) and then after a couple of years for Digital (VMS, Ultrix). At Digital I worked on many things but for the last 5 years there, I worked for the semiconductor group on Alpha and StrongARM evaluation boards. In 1998 I moved to ARM where I have a small group of engineers writing low level firmware and porting operating systems. My children (Esther and Stephen) describe me as a geek.
People often ask me about Linux at work and at home and I am only too happy to oblige. The more that I use Linux in both my professional and personal life the more that I become a Linux zealot. You may note that I use the term 'zealot' and not 'bigot'; I define a Linux zealot to be an enthusiast who recognizes that there are other operating systems but prefers not to use them. As my wife, Gill, who uses Windows 95, once remarked, "I never realized that we would have his and her operating systems". For me, as an engineer, Linux suits my needs perfectly. It is a superb, flexible and adaptable engineering tool that I use at work and at home. Most freely available software easily builds on Linux and I can often simply download pre-built executable files or install them from a CD ROM. What else could I use to learn to program in C++ or Perl, or to learn about Java, for free?
I must thank the many people who have been kind enough to take the time to e-mail me with comments about this book. I have attempted to incorporate those comments in each new version that I have produced and I am more than happy to receive comments; however, please note my new e-mail address.
A number of lecturers have written to me asking if they can use parts of this book in order to teach computing. My answer is an emphatic yes; this is one use of the book that I particularly wanted. Who knows, there may be another Linus Torvalds sat in the class.
Special thanks must go to John Rigby and Michael Bauer who gave me full, detailed review notes of the whole book. Not an easy task. Alan Cox and Stephen Tweedie have patiently answered my questions - thanks. I used Larry Ewing's penguins to brighten up the chapters a bit. Finally, thank you to Greg Hankins for accepting this book into the Linux Documentation Project and onto their web site.
[1] A HOWTO is just what it sounds like: a document describing how to do something. Many have been written for Linux and all are very useful.