Data Structures Associated with a Process

Each process on the system has its own list of open files, root filesystem, current working directory, mount points, and so on. Three data structures tie together the VFS layer and the processes on the system: the files_struct, fs_struct, and namespace structure.

The files_struct is defined in <linux/file.h>. The files entry in the process descriptor points to this table. All per-process information about open files and file descriptors is contained therein. Here it is, with comments:

    struct files_struct {
            atomic_t    count;              /* structure's usage count */
            spinlock_t  file_lock;          /* lock protecting this structure */
            int         max_fds;            /* maximum number of file objects */
            int         max_fdset;          /* maximum number of file descriptors */
            int         next_fd;            /* next file descriptor number */
            struct file **fd;               /* array of all file objects */
            fd_set      *close_on_exec;     /* file descriptors to close on exec() */
            fd_set      *open_fds;          /* pointer to open file descriptors */
            fd_set      close_on_exec_init; /* initial files to close on exec() */
            fd_set      open_fds_init;      /* initial set of file descriptors */
            struct file *fd_array[NR_OPEN_DEFAULT]; /* default array of file objects */
    };

The fd pointer refers to the array of open file objects. By default, this is the embedded fd_array. Because NR_OPEN_DEFAULT is equal to 32, fd_array has room for 32 file objects. If a process opens more than 32 file objects, the kernel allocates a new, larger array and points fd at it. In this fashion, access to a reasonable number of file objects is quick, because it takes place in a static array, and a process that opens an abnormally large number of files still works, because the kernel can create a new array for it. If the majority of processes on a system open more than 32 files, the administrator can increase the NR_OPEN_DEFAULT preprocessor macro to match for optimum performance.
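The embedded-array-with-fallback idea can be illustrated outside the kernel. The following is a minimal user-space sketch, not the kernel's implementation: the toy_files structure, the toy_* functions, and the doubling growth policy are all assumptions made for illustration. It shows a table that starts out pointing at its own built-in array and switches to a heap allocation only when an index beyond the default capacity is needed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_NR_OPEN_DEFAULT 32  /* mirrors the value quoted in the text */

/* Hypothetical user-space stand-in for files_struct; not kernel code. */
struct toy_files {
    int    max_fds;                        /* capacity of the array fd points at */
    void **fd;                             /* current table of "file objects" */
    void  *fd_array[TOY_NR_OPEN_DEFAULT];  /* embedded default table */
};

/* Start with fd pointing at the embedded array. */
void toy_files_init(struct toy_files *f)
{
    memset(f->fd_array, 0, sizeof(f->fd_array));
    f->fd = f->fd_array;
    f->max_fds = TOY_NR_OPEN_DEFAULT;
}

/* Grow the table so that index `want` is valid. Returns 0 or -1.
 * Doubling is an assumed policy chosen for this sketch. */
int toy_files_expand(struct toy_files *f, int want)
{
    int new_max = f->max_fds;
    void **new_fd;

    while (new_max <= want)
        new_max *= 2;

    new_fd = calloc((size_t)new_max, sizeof(*new_fd));
    if (!new_fd)
        return -1;

    /* Carry over the existing entries, then release the old heap table. */
    memcpy(new_fd, f->fd, (size_t)f->max_fds * sizeof(*new_fd));
    if (f->fd != f->fd_array)
        free(f->fd);

    f->fd = new_fd;
    f->max_fds = new_max;
    return 0;
}

/* Store `file` at slot `fd`, expanding the table first if needed. */
int toy_files_install(struct toy_files *f, int fd, void *file)
{
    assert(fd >= 0);
    if (fd >= f->max_fds && toy_files_expand(f, fd) != 0)
        return -1;
    f->fd[fd] = file;
    return 0;
}

/* Free the heap table, if any, and fall back to the embedded array. */
void toy_files_destroy(struct toy_files *f)
{
    if (f->fd != f->fd_array)
        free(f->fd);
    f->fd = f->fd_array;
    f->max_fds = TOY_NR_OPEN_DEFAULT;
}
```

In this sketch, installing an object at slot 3 leaves fd pointing at fd_array, while installing one at slot 40 moves the table to a 64-entry heap array and preserves the earlier entry. The common case pays no allocation cost at all, which is the design point the text makes.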
The second process-related structure is fs_struct, which contains filesystem information related to a process and is pointed at by the fs field in the process descriptor. The structure is defined in <linux/fs_struct.h>. Here it is, with comments:

    struct fs_struct {
            atomic_t        count;       /* structure usage count */
            rwlock_t        lock;        /* lock protecting structure */
            int             umask;       /* default file permissions */
            struct dentry   *root;       /* dentry of the root directory */
            struct dentry   *pwd;        /* dentry of the current directory */
            struct dentry   *altroot;    /* dentry of the alternative root */
            struct vfsmount *rootmnt;    /* mount object of the root directory */
            struct vfsmount *pwdmnt;     /* mount object of the current directory */
            struct vfsmount *altrootmnt; /* mount object of the alternative root */
    };

This structure holds the current working directory (pwd) and root directory (root) of the current process.

The third and final structure is the namespace structure, which is defined in <linux/namespace.h> and pointed at by the namespace field in the process descriptor. Per-process namespaces were added to the 2.4 Linux kernel. They enable each process to have a unique view of the mounted filesystems on the system: not just a unique root directory, but an entirely unique filesystem hierarchy, if desired. Here is the structure, with the usual comments:

    struct namespace {
            atomic_t            count; /* structure usage count */
            struct vfsmount     *root; /* mount object of root directory */
            struct list_head    list;  /* list of mount points */
            struct rw_semaphore sem;   /* semaphore protecting the namespace */
    };

The list member specifies a doubly linked list of the mounted filesystems that make up the namespace.

These data structures are linked from each process descriptor. For most processes, the process descriptor points to unique files_struct and fs_struct structures. For processes created with the clone flag CLONE_FILES or CLONE_FS, however, these structures are shared[7].
Consequently, multiple process descriptors might point to the same files_struct or fs_struct structure. The count member of each structure provides a reference count to prevent destruction while a process is still using the structure.
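The role of the count member can be sketched in user space. The code below is an illustration under stated assumptions, not the kernel's code: toy_fs, toy_task, TOY_CLONE_FS, and the toy_* functions are hypothetical names, and C11 atomics stand in for the kernel's atomic_t. It shows a child either sharing the parent's structure (bumping the count) or receiving its own copy, and the structure being freed only when the last user drops its reference.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

#define TOY_CLONE_FS 0x1  /* hypothetical flag value for this sketch */

/* Hypothetical stand-in for fs_struct, reduced to two fields. */
struct toy_fs {
    atomic_int count;  /* number of tasks using this structure */
    int        umask;  /* default file permissions */
};

/* Hypothetical stand-in for the process descriptor. */
struct toy_task {
    struct toy_fs *fs;
};

/* Allocate a structure with a single user. */
struct toy_fs *toy_fs_alloc(int umask)
{
    struct toy_fs *fs = malloc(sizeof(*fs));
    if (!fs)
        return NULL;
    atomic_init(&fs->count, 1);
    fs->umask = umask;
    return fs;
}

/* Drop one reference; returns 1 if this was the last user and the
 * structure was freed, 0 otherwise. */
int toy_fs_put(struct toy_fs *fs)
{
    assert(fs != NULL);
    if (atomic_fetch_sub(&fs->count, 1) == 1) {
        free(fs);
        return 1;
    }
    return 0;
}

/* Give the child either a shared reference or a private copy. */
int toy_copy_fs(unsigned long flags, const struct toy_task *parent,
                struct toy_task *child)
{
    if (flags & TOY_CLONE_FS) {
        atomic_fetch_add(&parent->fs->count, 1);  /* one more user */
        child->fs = parent->fs;
        return 0;
    }
    child->fs = toy_fs_alloc(parent->fs->umask);  /* independent copy */
    return child->fs ? 0 : -1;
}
```

With this sketch, a child created with TOY_CLONE_FS ends up with the same pointer as its parent and the count rises to 2, so the parent exiting first (one toy_fs_put()) does not free memory the child still uses.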
The namespace structure works the other way around. By default, all processes share the same namespace (that is, they all see the same filesystem hierarchy from the same mount table). Only when the CLONE_NEWNS flag is specified during clone() is the process given a unique copy of the namespace structure. Because most processes do not provide this flag, all the processes inherit their parents' namespaces. Consequently, on many systems there is only one namespace, although the functionality is but a single CLONE_NEWNS flag away.
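The inverted default can be modeled the same way. The following user-space sketch is illustrative only: toy_namespace, TOY_CLONE_NEWNS, the fixed-size mounts array (standing in for the kernel's linked list of mount points), and the toy_* functions are all assumptions, and the usage count is a plain int rather than an atomic. It shows that a task shares its parent's namespace unless the flag is given, and that a mount made in a private copy is invisible to the original.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_CLONE_NEWNS 0x1  /* hypothetical flag value for this sketch */
#define TOY_MAX_MOUNTS  8    /* arbitrary limit for the toy mount table */

/* Hypothetical stand-in for struct namespace. */
struct toy_namespace {
    int         count;                   /* usage count (not atomic here) */
    int         nr_mounts;               /* entries used in mounts[] */
    const char *mounts[TOY_MAX_MOUNTS];  /* stand-in for the mount list */
};

/* Allocate an empty namespace with a single user. */
struct toy_namespace *toy_ns_alloc(void)
{
    struct toy_namespace *ns = calloc(1, sizeof(*ns));
    if (ns)
        ns->count = 1;
    return ns;
}

/* Record a mount point in this namespace only. */
int toy_ns_mount(struct toy_namespace *ns, const char *path)
{
    assert(ns != NULL);
    if (ns->nr_mounts == TOY_MAX_MOUNTS)
        return -1;
    ns->mounts[ns->nr_mounts++] = path;
    return 0;
}

/* Default: share the parent's namespace. With TOY_CLONE_NEWNS: hand
 * back a private copy of the parent's mount table. */
struct toy_namespace *toy_copy_namespace(unsigned long flags,
                                         struct toy_namespace *parent)
{
    struct toy_namespace *ns;

    if (!(flags & TOY_CLONE_NEWNS)) {
        parent->count++;
        return parent;
    }

    ns = toy_ns_alloc();
    if (!ns)
        return NULL;
    ns->nr_mounts = parent->nr_mounts;
    memcpy(ns->mounts, parent->mounts, sizeof(ns->mounts));
    return ns;
}
```

In this model, calling toy_copy_namespace(0, init_ns) returns init_ns itself with its count raised, while passing TOY_CLONE_NEWNS returns a distinct structure; a later toy_ns_mount() on that copy leaves the original's nr_mounts unchanged.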