Introduction
In system administration and software development, understanding how programs interact with the kernel is essential for diagnosing performance issues, debugging errors, and optimizing resource usage. One of the most powerful and versatile tools for this purpose on Linux is strace. It intercepts and records the system calls a process makes, giving a detailed view of its behavior without any changes to the source code.
What is strace?
strace is a command-line utility that launches a program (or attaches to one that is already running) and uses the kernel’s ptrace facility to capture each system call (syscall) the process makes, as well as the signals it receives. By default each event is written to standard error, and the trace can be sent to a file for later analysis. Besides the call name, strace shows the arguments the call was invoked with and its return value, which makes it easy to identify which operation is failing or, with the timing options enabled, which one is consuming time.
How to install strace
In most Linux distributions, strace comes preinstalled or is available in the official repositories. On Debian- or Ubuntu-based systems it can be installed with the apt package manager by running: sudo apt update && sudo apt install strace. In Red Hat family distributions, such as CentOS or Fedora, the command is: sudo dnf install strace or sudo yum install strace depending on the version. In Arch Linux it is installed with: sudo pacman -S strace. After installation, strace -V prints the installed version and strace -h displays the basic help; running strace with no arguments only prints a short usage error.
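As a rough sketch of the choice described above, the following script checks whether strace is already present and otherwise prints the install command for the first package manager it finds. The function name suggest_strace_install is a hypothetical helper, not part of strace, and the script only prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: print the install command that matches the first
# package manager found on this machine. Nothing is installed by this script.
suggest_strace_install() {
    if command -v apt >/dev/null 2>&1; then
        echo "sudo apt update && sudo apt install strace"
    elif command -v dnf >/dev/null 2>&1; then
        echo "sudo dnf install strace"
    elif command -v yum >/dev/null 2>&1; then
        echo "sudo yum install strace"
    elif command -v pacman >/dev/null 2>&1; then
        echo "sudo pacman -S strace"
    else
        echo "no supported package manager found" >&2
        return 1
    fi
}

if command -v strace >/dev/null 2>&1; then
    # Already installed: show the first line of the version banner.
    strace -V | head -n 1
else
    suggest_strace_install || true
fi
```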
Basic usage
The simplest way to use strace is to put it in front of the command you want to trace. For example, to see all system calls of a program called mi_programa, run: strace ./mi_programa. The trace appears in the terminal in real time, interleaved with the program's own output. To keep the two apart, or to review the trace at your own pace, write it to a file: strace -o registro.txt ./mi_programa. It is also possible to attach strace to a process that is already running by passing its process ID (PID) to the -p option: strace -p 1234.
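A minimal sketch of the file-based workflow, using the standard true command as a stand-in for mi_programa. The temporary file and the fallback message are choices made for this example only.

```shell
#!/bin/sh
# Trace a short-lived command into a file, then inspect the file afterwards.
log=$(mktemp)

# -o writes the trace to "$log", so the traced program's own output stays clean.
if strace -o "$log" true 2>/dev/null; then
    echo "captured $(wc -l < "$log") lines"
    head -n 3 "$log"    # the first syscalls; normally the first one is execve
else
    echo "strace is missing or ptrace is not permitted here" >&2
fi

rm -f "$log"
```

The fallback branch matters in containers and sandboxes, where ptrace is often blocked even when strace is installed.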
Most useful options
- -c: counts the calls, errors, and time spent for each type of system call and prints a statistical summary at the end; the regular line-by-line trace is suppressed.
- -f: follows child processes created by fork, vfork, or clone, ensuring that no subprocess call is missed.
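- -e expr: filters the calls to display; for example, -e trace=open,read,write will show only those three syscalls. On current systems files are usually opened through openat, so it is worth adding it to the list.
- -s N: sets the maximum number of bytes printed for each string argument (32 by default); raise it when buffer contents are being truncated.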
- -t: adds a timestamp to each output line, facilitating event correlation.
- -T: shows the time taken by each system call, ideal for detecting performance bottlenecks.
- -o file: redirects the entire trace to the indicated file instead of the terminal.
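The options above combine freely. As an illustration, the wrapper below bundles several of them into one call; trace_io is a hypothetical name, and the 256-byte string limit and the syscall list are arbitrary choices for this sketch.

```shell
#!/bin/sh
# Hypothetical wrapper: trace the file I/O of a command with timing details.
# Usage: trace_io LOGFILE COMMAND [ARGS...]
trace_io() {
    log=$1
    shift
    # -f  follow child processes
    # -t  prefix each line with the time of day
    # -T  append the time spent inside each call
    # -s 256  print up to 256 bytes of each string argument
    # -e trace=...  show only these syscalls
    # -o  write the trace to the log file instead of the terminal
    strace -f -t -T -s 256 -e trace=openat,read,write -o "$log" "$@"
}
```

A call such as trace_io io.log ./mi_programa then leaves the filtered trace in io.log.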
Practical examples
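To observe only file opens and reads, you can use the filter option -e: strace -e trace=open,openat,read ./mi_programa. Including openat matters because modern C libraries open files through that syscall, so filtering on open alone may show nothing. This reduces the output to the relevant syscalls and makes it easier to detect problems such as missing files or insufficient permissions.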
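Once such a trace has been saved with -o, missing files can be pulled out of it with standard text tools. The sketch below assumes strace's default line format, in which a failed call ends in "= -1 ENOENT (...)"; missing_files is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: list the paths a traced program tried to open
# but could not find.
# Usage: missing_files TRACE_FILE
missing_files() {
    # Keep open/openat lines that failed with ENOENT, then extract the
    # first double-quoted argument, which is the path.
    grep -E '(^|[[:space:]])(open|openat)\(' "$1" \
        | grep -F '= -1 ENOENT' \
        | sed -E 's/^[^"]*"([^"]*)".*$/\1/' \
        | sort -u
}
```

For example, after strace -e trace=open,openat -o trace.log ./mi_programa, running missing_files trace.log prints each missing path once. Paths containing escaped quotes are not handled by this simple pattern.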
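If you want a summary of time consumption by call type, the -c option is very useful: strace -c ./mi_programa. At the end of execution a table is displayed with, for each syscall, the number of calls, the number of errors, the total time spent, and the average time per call. By default the time reported is CPU time spent in the kernel, not wall-clock time; add -w to summarize elapsed time instead.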
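When the summary is saved to a file, the busiest syscalls can be extracted with awk. This sketch assumes the column layout printed by strace -c (percentage first, call count fourth, syscall name last), which may differ between versions; top_syscalls is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the N most frequently called syscalls
# from a saved "strace -c" summary, as "calls name" pairs.
# Usage: top_syscalls SUMMARY_FILE [N]
top_syscalls() {
    # Data rows start with a numeric percentage; skip the header,
    # the separator lines, and the final "total" row.
    awk '$1 ~ /^[0-9.]+$/ && $NF != "total" { print $4, $NF }' "$1" \
        | sort -rn \
        | head -n "${2:-5}"
}
```

A possible use is strace -c -o summary.txt ./mi_programa followed by top_syscalls summary.txt 3, where summary.txt is just an example file name.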
If the process is already running and you want to inspect its behavior without restarting it, obtain its PID and attach with -p: strace -p 5678 -o trace.log. After a few seconds you can stop tracing with Ctrl+C, which detaches strace and leaves the process running, and then analyze trace.log to see which syscalls were occurring at that moment.
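Instead of pressing Ctrl+C by hand, the attach can be bounded in time. The sketch below relies on the timeout utility from GNU coreutils being available; trace_pid_for is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: trace a running process for a fixed number of seconds.
# Usage: trace_pid_for PID SECONDS LOGFILE
trace_pid_for() {
    pid=$1
    secs=$2
    log=$3
    # kill -0 sends no signal; it only checks that the process exists
    # and that we are allowed to signal it.
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "no such process, or no permission to signal it: $pid" >&2
        return 1
    fi
    # timeout delivers SIGINT after $secs seconds, the same signal Ctrl+C sends.
    # It exits with status 124 when the time limit was reached.
    timeout -s INT "$secs" strace -p "$pid" -o "$log"
}
```

For example, trace_pid_for 5678 10 trace.log collects about ten seconds of activity from PID 5678.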
To debug a program that fails silently, combine -f to follow child processes with -o to save the output: strace -f -o hijo.txt ./programa_que_lanza_hijos. Then search the file for calls that return -1, which indicates an error. strace prints the symbolic errno name and its description right after the return value (for example ENOENT or EACCES), which usually points to the cause.
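Reviewing a long trace by eye is tedious, so a tally of failures is a reasonable first pass. This sketch assumes the default line format, optionally prefixed by a PID (as with -f -o) or a timestamp; failed_calls is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: count failed system calls in a trace file,
# grouped by syscall name and errno.
# Usage: failed_calls TRACE_FILE
failed_calls() {
    # A failed call looks like: name(args) = -1 ERRNAME (description)
    # The leading [0-9:. ]* skips an optional PID or timestamp prefix.
    # Lines split by "<unfinished ...>" / "resumed" markers are not counted.
    sed -nE 's/^[0-9:. ]*([a-z_0-9]+)\(.*\) += -1 ([A-Z0-9]+) .*$/\1 \2/p' "$1" \
        | sort | uniq -c | sort -rn
}
```

Running failed_calls hijo.txt prints one line per syscall and errno pair, most frequent first. Recent strace releases also offer a -Z option that records only failed calls, if your version supports it.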
Limitations and considerations
Although strace is extremely useful, it has some limitations that are important to keep in mind. Each intercepted system call adds context-switch overhead because of ptrace, which can significantly slow down the traced application, especially in programs that perform thousands of syscalls per second. Moreover, the kernel restricts ptrace access for security reasons: attaching to a process owned by another user requires root, and a setuid/setgid binary launched under strace by a regular user runs without its elevated privileges, so it may behave differently than usual. Lastly, the presence of strace alters execution timing and can hide concurrency issues that only manifest under real load, so it is best used in testing or debugging environments rather than on critical production systems.
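On many distributions, part of the ptrace restriction mentioned above comes from the Yama security module, which exposes its setting in /proc/sys/kernel/yama/ptrace_scope. The sketch below reads that value and prints what it means for strace -p; ptrace_scope_hint is a hypothetical helper name, and systems without Yama simply do not have the file.

```shell
#!/bin/sh
# Hypothetical helper: explain the current Yama ptrace_scope setting.
# Usage: ptrace_scope_hint [FILE]   (FILE defaults to the real sysctl path)
ptrace_scope_hint() {
    f=${1:-/proc/sys/kernel/yama/ptrace_scope}
    if [ ! -r "$f" ]; then
        echo "Yama ptrace_scope not available; standard ptrace permission checks apply"
        return 0
    fi
    case $(cat "$f") in
        0) echo "0: a process may attach to other processes of the same user" ;;
        1) echo "1: only descendants can be attached to without extra privileges" ;;
        2) echo "2: attaching requires CAP_SYS_PTRACE (typically root)" ;;
        3) echo "3: attaching is disabled entirely" ;;
        *) echo "unrecognized ptrace_scope value" ;;
    esac
}

ptrace_scope_hint
```

With a value of 1, strace ./mi_programa still works because the traced program is a child of strace, while strace -p on an unrelated process needs root.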
Conclusion
In summary, strace is an essential tool for any system administrator or developer working on Linux. Its ability to reveal exactly how a program interacts with the kernel makes it a powerful ally for debugging failures, optimizing performance, and understanding the behavior of complex software. With a little practice and knowledge of its most common options, precise diagnoses can be obtained and problem resolution accelerated without needing to modify the source code or reinstall packages.