Team LiB
Previous Section Next Section

Syscalls

System calls (often called syscalls in Linux) are typically accessed via function calls. They can define one or more arguments (inputs) and might result in one or more side effects[3], for example writing to a file or copying some data into a provided pointer. System calls also provide a return value of type long[4] that signifies success or error. Usually, although not always, a negative return value denotes an error. A return value of zero is usually (but again not always) a sign of success. Unix system calls, on error, write a special error code into the global errno variable. This variable can be translated into human-readable errors via library functions such as perror().

[3] Note the "might" here. Although nearly all system calls have a side effect (that is, they result in some change of the system's state), a few syscalls, such as getpid(), merely return some data from the kernel.

[4] The use of type long is for compatibility with 64-bit architectures.

Finally, system calls have a defined behavior. For example, the system call getpid() is defined to return an integer that is the current process's PID. The implementation of this syscall in the kernel is very simple:

asmlinkage long sys_getpid(void)
{
        return current->tgid;
}

Note that the definition says nothing of the implementation. The kernel must provide the intended behavior of the system call, but is free to do so with whatever implementation it desires as long as the result is correct. Of course, this system call is as simple as they come and there are not too many other ways to implement it (certainly no simpler method exists)[5].

[5] You might be wondering why doesgetpid() returntgid, the thread group ID ? In normal processes, the TGID is equal to the PID. With threads, the TGID is the same for all threads in a thread group. This enables the threads to call getpid() and get the same PID.

You can make a couple of observations about system calls even from this simple example. First, note the asmlinkage modifier on the function definition. This is a bit of magic to tell the compiler to look only on the stack for this function's arguments. This is a required modifier for all system calls. Second, note that the getpid() system call is defined as sys_getpid() in the kernel. This is the naming convention taken with all system calls in Linux: System call bar() is implemented in the kernel as function sys_bar().

System Call Numbers

In Linux, each system call is assigned a syscall number. This is a unique number that is used to reference a specific system call. When a user-space process executes a system call, the syscall number delineates which syscall was executed; the process does not refer to the syscall by name.

The syscall number is important; when assigned, it cannot change, or else compiled applications will break. Likewise, if a system call is removed, its system call number cannot be recycled, or else previous compiled code would invoke the system call but in reality call another. Linux provides a "not implemented" system call, sys_ni_syscall(), which does nothing except return -ENOSYS, the error corresponding to an invalid system call. This function is used to "plug the hole" in the rare event that a syscall is removed or otherwise made unavailable.

The kernel keeps a list of all registered system calls in the system call table, stored in sys_call_table. This table is architecture dependent and typically defined in enTRy.S, which for x86 is in arch/i386/kernel/. This table assigns each valid syscall to a unique syscall number.

System Call Performance

System calls in Linux are faster than in many other operating systems. This is partly because of Linux's incredibly fast context switch times; entering and exiting the kernel is a streamlined and simple affair. The other factor is the simplicity of the system call handler and the individual system calls themselves.

    Team LiB
    Previous Section Next Section