Linux file system glossary!


Guide outline

1.1. Overview
1.2. The Root Directory
1.3. /bin
1.4. /boot
1.5. /dev
1.6. /etc
1.7. /home
1.8. /initrd
1.9. /lib
1.10. /lost+found
1.11. /media
1.12. /mnt
1.13. /opt
1.14. /proc
1.15. /root
1.16. /sbin
1.17. /usr
1.18. /var
1.19. /srv
1.20. /tm
2. Glossary

Additional Linux Resources

Here is a list of resources for learning Linux:

Resources for System Administrators

Linux System Admin Guide- What is Linux Operating System and how it works
Linux System Admin Guide- What are Directory Tree and Filesystem Hierarchy in Linux
Linux System Admin Guide- Introduction to Linux File Systems for System Admins
Linux System Admin Guide- Overview of Linux Virtual Memory and Disk Buffer Cache
Linux System Admin Guide- Best Practices for Monitoring Linux Systems
Linux System Admin Guide- Best Practices for Performing Linux Boots and Shutdowns
Linux System Admin Guide- Best Practices for Making and Managing Backup Operations

Resources for Linux Kernel Programmers

How Linux Operating System Memory Management works
Comprehensive Review of Linux Kernel Operating System Processes
Comprehensive Review of Linux File System Architecture and Management
What are mechanisms behind Linux Kernel task management
How Linux Kernel Sources and Functions work
Comprehensive look at how Linux Data Structures work

Hands-on Linux classes

Introduction to Linux and Shell programming
Introduction to Linux System Administration

Linux Operating System Distributions

Comprehensive list of all Linux OS distributions
Comprehensive list of all special purpose Linux distributions
Comprehensive list of all secure Linux distributions for cybersecurity professionals

Glossary

ARPA
The Advanced Research and Projects Agency of the United States Department of Defense. Also known as DARPA (the “D” stands for “Defense”), it originated in the late 1960s and early 1970s the proposal and standards for the Internet. For this reason, the Internet was initially referred to as ARPANet, and connected the military with the various centers of research around the United States in a way that was intended to have a high degree of survivability against a nuclear attack.
BASH
The Bourne Again Shell and is based on the Bourne shell, sh, the original command interpreter.
Bourne Shell
The Bourne shell is the original Unix shell (command execution program, often called a command interpreter) that was developed at AT&T. Named for its developer, Stephen Bourne, the Bourne shell is also known by its program name, sh. The shell prompt (character displayed to indicate readiness for input) used is the $ symbol. The Bourne shell family includes the Bourne, Korn shell, bash, and zsh shells. Bourne Again Shell (bash) is the free version of the Bourne shell distributed with Linux systems. Bash is similar to the original, but has added features such as command line editing. Its name is sometimes spelled as Bourne Again SHell, the capitalized Hell referring to the difficulty some people have with it.
CLI
A CLI (command line interface) is a user interface to a computer’s operating system or an application in which the user responds to a visual prompt by typing in a command on a specified line, receives a response back from the system, and then enters another command, and so forth. The MS-DOS Prompt application in a Windows operating system is an example of the provision of a command line interface. Today, most users prefer the graphical user interface (GUI) offered by Windows, Mac OS, BeOS, and others. Typically, most of today’s Unix-based systems offer both a command line interface and a graphical user interface.
core
A core file is created when a program terminates unexpectedly, due to a bug, or a violation of the operating system’s or hardware’s protection mechanisms. The operating system kills the program and creates a core file that programmers can use to figure out what went wrong. It contains a detailed description of the state that the program was in when it died. If would like to determine what program a core file came from, use the file command, like this: $ file core That will tell you the name of the program that produced the core dump. You may want to write the maintainer(s) of the program, telling them that their program dumped core. To Enable or Disable Core Dumps you must use the ulimit command in bash, the limit command in tcsh, or the rlimit command in ksh. See the appropriate manual page for details. This setting affects all programs run from the shell (directly or indirectly), not the whole system. If you wish to enable or disable core dumping for all processes by default, you can change the default setting in /usr/include/linux/sched.h. Refer to definition of INIT_TASK, and look also in /usr/include/linux/resource.h. PAM support optimizes the system’s environment, including the amount of memory a user is allowed. In some distributions this parameter is configurable in the /etc/security/limits.conf file. For more information, refer to the Linux Administrator’s Security Guide.
daemon
A process lurking in the background, usually unnoticed, until something triggers it into action. For example, the \cmd{update} daemon wakes up every thirty seconds or so to flush the buffer cache, and the \cmd{sendmail} daemon awakes whenever someone sends mail.
DARPA
The Defense Advanced Research Projects Agency is the central research and development organization for the Department of Defense (DoD). It manages and directs selected basic and applied research and development projects for DoD, and pursues research and technology where risk and payoff are both very high and where success may provide dramatic advances for traditional military roles and missions.
DHCP
Dynamic Host Control Protocol, is a protocol like BOOTP (actually dhcpd includes much of the functionality of BOOTPD). It assigns IP addresses to clients based on lease times. DHCP is used extensively by Microsoft and more recently also by Apple. It is probably essential in any multi-platform environment.
DNS
Domain Name System translates Internet domain and host names to IP addresses. DNS implements a distributed database to store name and address information for all public hosts on the Net. DNS assumes IP addresses do not change (i.e., are statically assigned rather than dynamically assigned). The DNS database resides on a hierarchy of special-purpose servers. When visiting a Web site or other device on the Net, a piece of software called the DNS resolver (usually built into the network operating system) first contacts a DNS server to determine the server’s IP address. If the DNS server does not contain the needed mapping, it will in turn forward the request to a DNS server at the next higher level in the hierarchy. After potentially several forwarding and delegation messages are sent within the DNS hierarchy, the IP address for the given host eventually is delivered to the resolver. DNS also includes support for caching requests and for redundancy. Most network operating systems allow one to enter the IP addresses of primary, secondary, and tertiary DNS servers, each of which can service initial requests from clients. Many ISPs maintain their own DNS servers and use DHCP to automatically assign the addresses of these servers to dial-in clients, so most home users need not be aware of the details behind DNS configuration. Registered domain names and addresses must be renewed periodically, and should a dispute occur between two parties over ownership of a given name, such as in trademarking, ICANN’s Uniform Domain-Name Dispute-Resolution Policy can be invoked. Also known as Domain Name System, Domain Name Service, Domain Name Server.
environment variable
A variable that is available to any program that is started by the shell.
ESD
Enlightened Sound Daemon. This program is designed to mix together several digitized audio streams for playback by a single device.
filesystem
The methods and data structures that an operating system uses to keep track of files on a disk or partition; the way the files are organized on the disk. Also used to describe a partition or disk that is used to store the files or the type of the filesystem.
FSSTND
Often the group, which creates the Linux File System Structure document, or the document itself, is referred to as the ‘FSSTND’. This is short for “file system standard”. This document has helped to standardize the layout of file systems on Linux systems everywhere. Since the original release of the standard, most distributors have adopted it in whole or in part, much to the benefit of all Linux users. It is now often refered to as the FHS (Filesystem Hierarchy Standard) document though since its incorporation into the LSB (Linux Standards Base) Project.
GUI
Graphical User Interface. The use of pictures rather than just words to represent the input and output of a program. A program with a GUI runs under some windowing system (e.g. The X Window System, Microsoft Windows, Acorn RISC OS, NEXTSTEP). The program displays certain icons, buttons, dialogue boxes etc. in its windows on the screen and the user controls it mainly by moving a pointer on the screen (typically controlled by a mouse) and selecting certain objects by pressing buttons on the mouse while the pointer is pointing at them. Though Apple Computer would like to claim they invented the GUI with their Macintosh operating system, the concept originated in the early 1970s at Xerox’s PARC laboratory.
hard link
A directory entry, which maps a filename to an inode, number. A file may have multiple names or hard links. The link count gives the number of names by which a file is accessible. Hard links do not allow multiple names for directories and do not allow multiple names in different filesystems.
init
‘init’ process is the first user level process started by the kernel. init has many important duties, such as starting getty (so that users can log in), implementing run levels, and taking care of orphaned processes. This chapter explains how init is configured and how you can make use of the different run levels. init is one of those programs that are absolutely essential to the operation of a Linux system, but that you still can mostly ignore. Usually, you only need to worry about init if you hook up serial terminals, dial-in (not dial-out) modems, or if you want to change the default run level. When the kernel has started (has been loaded into memory, has started running, and has initialized all device drivers and data structures and such), it finishes its own part of the boot process by starting a user level program, init. Thus, init is always the first process (its process number is always 1). The kernel looks for init in a few locations that have been historically used for it, but the proper location for it is /sbin/init. If the kernel can’t find init, it tries to run /bin/sh, and if that also fails, the startup of the system fails. When init starts, it completes the boot process by doing a number of administrative tasks, such as checking filesystems, cleaning up /tmp, starting various services, and starting a getty for each terminal and virtual console where users should be able to log in. After the system is properly up, init restarts getty for each terminal after a user has logged out (so that the next user can log in). init also adopts orphan processes: when a process starts a child process and dies before its child, the child immediately becomes a child of init. This is important for various technical reasons, but it is good to know it, since it makes it easier to understand process lists and process tree graphs. init itself is not allowed to die. You can’t kill init even with SIGKILL. There are a few variants of init available. Most Linux distributions use sysvinit (written by Miquel van Smoorenburg), which is based on the System V init design. The BSD versions of Unix have a different init. The primary difference is run levels: System V has them, BSD doesn’t.
inode
An inode is the address of a disk block. When you see the inode information through ls, ls prints the address of the first block in the file. You can use this information to tell if two files are really the same file with different names (links). A file has several components: a name, contents, and administrative information such as permissions and modification times. The administrative information is stored in the inode (over the years, the hyphen fell out of “i-node”), along with essential system data such as how long it is, where on the disc the contents of the file are stored, and so on. There are three times in the inode: the time that the contents of the file were last modified (written); the time that the file was last used (read or executed); and the time that the inode itself was last changed, for example to set the permissions. Altering the contents of the file does not affect its usage time and changing the permissions affects only the inode change time. It is important to understand inodes, not only to appreciate the options on ls, but because in a strong sense the inodes are the files. All the directory hierarchy does is provide convenient names for files. The system’s internal name for the file is its i-number: the number of the inode holding the file’s information.
kernel
Part of an operating system that implements the interaction with hardware and the sharing of resources.
libraries
Executables should have no undefined symbols, only useful symbols; all useful programs refer to symbols they do not define (eg. printf or write). These references are resolved by pulling object files from libraries into the executable.
link
A symbolic link (alias in MacOS and shortcut under Windows) is a file that points to another file; this is a commonly used tool. A hard-link rarely created by the user, is a filename that points to a block of data that has several other filenames as well.
man page
Every version of UNIX comes with an extensive collection of online help pages called man pages (short for manual pages). The man pages are the authoritative documentation about your UNIX system. They contain complete information about both the kernel and all the utilities.
MTA
Mail Transfer Agents. Alongside the web, mail is the top reason for the popularity of the Internet. E-mail is an inexpensive and fast method of time-shifted messaging which, much like the Web, is actually based around sending and receiving plain text files. The protocol used is called the Simple Mail Transfer Protocol (SMTP). The server programs that implement SMTP to move mail from one server to another are called MTAs. Once upon a time users would have to Telnet into an SMTP server and use a command line mail program like ‘mutt’ or ‘pine’ to check their mail. Now, GUI based e-mail clients like Mozilla, Kmail and Outlook allow users to check their email off of a local SMTP sever. Additional protocols like POP3 and IMAP4 are used between the SMTP server and desktop mail client to allow clients to manipulate files on, and download from, their local mail server. The programs that implement POP3 and IMAP4 are called Mail Delivery Agents (MDAs). They are generally separate from MTAs.
NFS
Network File System, is the UNIX equivalent of Server Message Block (SMB). It is a way through which different machines can import and export local files between each other. Like SMB though, NFS sends information including user passwords unencrypted, so it’s best to limit its usage to within your local network.
operating system
Software that shares a computer system’s resources (processor, memory, disk space, network bandwidth, and so on) between users and the application programs they run. Controls access to the system to provide security.
PAM
Pluggable Authentication Modules. A suite of shared libraries that determine how a user will be authenticated. For example, conventionally UNIX users authenticate themselves by supplying a password at the password prompt after they have typed their name at the login prompt. In many circumstances, such as internal access to workstations, this simple form of authentication is considered sufficient. In other cases, more information is warranted. If a user wants to log in to an internal system from an external source, like the Internet, more or alternative information may be required – perhaps a one-time password. PAM provides this type of capability and much more. Most important, PAM modules allow you to configure your environment with the necessary level of security.
PATH
The shell looks for commands and programs in a list of file paths stored in the PATH environment variable. An environment variable stores information in a place where other programs and commands can access it. Environment variables store information such as the shell that you are using, your login name, and your current working directory. To see a list of all the environment variables currently defined; type ‘set’ at the prompt. When you type a command at the shell prompt, the shell will look for that command’s program file in each directory listed in the PATH variable, in order. The first program found matching the command you typed will be run. If the command’s program file is not in a directory listed in you PATH environment variable, the shell returns a “commands not found” error. By default, the shell does not look in your current working directory or your home directory for commands This is really a security mechanism so that you don’t execute programs by accident. What if a malicious user put a harmful program called ls in your home directory? If you typed ls and the shell looked for the fake program in your home directory before the real program in the /bin directory, what do you think would happen? If you thought bad things, you are on the right track. Since your PATH doesn’t have the current directory as one of its search locations, programs in your current directory must be called with an absolute path of a relative path specified as ‘./program-name’. To see what directories are part of your PATH enter this command: # echo $PATH /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/bin/X11
pipes and sockets
Special files that programs use to communicate with one another. They are rarely seen, but you might be able to see a socket or two in the /dev/ directory.
process identifier
Shown in the heading of the ps command as PID. The unique number assigned to every process running in the system.
rpc
Remote Procedure Calls. It enables a system to make calls to programs such as NFS across the network transparently, enabling each system to interpret the calls as if they were local. In this case, it would make exported filesystems appear as thought they were local.
set group ID (SGID)
The SGID permission causes a script to run with its group set to the group of the script, rather than the group of the user who started it. It is normally considered extremely bad practice to run a program in this way as it can pose many security problems. Later versions of the Linux kernel will even prohibit the running of shell scripts that have this attribute set.
set user ID (SUID)
The SUID permission causes a script to run as the user who is the owner of the script, rather than the user who started it. It is normally considered extremely bad practice to run a program in this way as it can pose many security problems. Later versions of the Linux kernel will even prohibit the running of shell scripts that have this attribute set.
signal
Software interrupts sent to a program to indicate that an important event has occurred. The events can vary from user requests to illegal memory access errors. Some signals, like the interrupt signal, indicate that a user has asked the program to do something that is not in the usual flow of control.
SSH
The Secure Shell, or SSH, provides a way of running command line and graphical applications, and transferring files, over an encrypted connection, all that will be seen is junk. It is both a protocol and a suite of small command line applications, which can be used for various functions. SSH replaces the old Telnet application, and can be used for secure remote administration of machines across the Internet. However, it also has other features. SSH increases the ease of running applications remotely by setting up X permissions automatically. If you can log into a machine, it allows you to run a graphical application on it, unlike Telnet, which requires users to have an understanding of the X authentication mechanisms that are manipulated through the xauth and xhost commands. SSH also has inbuilt compression, which allows your graphic applications to run much faster over the network. SCP (Secure Copy) and SFTP (Secure FTP) allow transfer of files over the remote link, either via SSH’s own command line utilities or graphical tools like Gnome’s GFTP. Like Telnet, SSH is cross-platform. You can find SSH server and clients for Linux, Unix and all flavours of Windows, BeOS, PalmOS, Java and embedded Oses used in routers.
STDERR
Standard error. A special type of output used for error messages. The file descriptor for STDERR is 2.
STDIN
Standard input. User input is read from STDIN. The file descriptor for STDIN is 0.
STDOUT
Standard output. The output of scripts is usually to STDOUT. The file descriptor for STDOUT is 1.
symbol table
The part of an object table that gives the value of each symbol (usually as a section name and an offset) is called the symbol table. Executables may also have a symbol table, with this one giving the final values of the symbols. Debuggers use the symbol table to present addresses to the user in a symbolic, rather than a numeric form. It is possible to strip the symbol table from executables resulting in a smaller sized executable but this prevents meaningful debugging.
symbolic link or soft link
A special filetype, which is a small pointer file, allowing multiple names for the same file. Unlike hard links, symbolic links can be made for directories and can be made across filesystems. Commands that access the file being pointed to are said to follow the symbolic link. Commands that access the link itself do not follow the symbolic link.
system call
The services provided by the kernel to application programs, and the way in which they are invoked. See section 2 of the manual pages.
system program
Programs that implement high level functionality of an operating system, i.e., things that aren’t directly dependent on the hardware. May sometimes require special privileges to run (e.g., for delivering electronic mail), but often just commonly thought of as part of the system (e.g., a compiler).
tcp-wrappers
Almost all of the services provided through inetd are invoked through tcp-wrappers by way of the tcp-wrappers daemon, tcpd. The tcp-wrappers mechanism provides access control list restrictions and logging for all service requests to the service it wraps. It may be used for either TCP or TCP services as long as the services are invoked through a central daemon process such as inetd. These programs log the client host name of incoming telnet, ftp, rsh, rlogin, finger etc…. requests. Security options are access control per host, domain and/or service; detection of host name spoofing or host address spoofing; booby traps to implement an early-warning system.
ZSH
Zsh was developed by Paul Falstad as a replacement for both the Bourne and C shell. It incorporates features of all the other shells (such as file name completion and a history mechanism) as well as new capabilities. Zsh is considered similar to the Korn shell. Falstad intended to create in zsh a shell that would do whatever a programmer might reasonably hope it would do. Zsh is popular with advanced users. Along with the Korn shell and the C shell, the Bourne shell remains among the three most widely used and is included with all UNIX systems. The Bourne shell is often considered the best shell for developing scripts.