Introduction
In the world of system administration and data processing, having tools that allow quick text manipulation is essential. One of the most useful and simple Linux commands is cut, whose main function is to extract columns or fields from a text stream based on delimiters or character positions. Although it may seem limited at first glance, its combination with pipes and other commands makes it a key piece for scripts and log analysis.
Basic Syntax
The general form of the command is:
cut OPTION... [FILE]...
If no file is specified, cut reads from standard input, making it ideal for use in pipelines. The most important options define which part of the text to extract and how fields are separated.
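As a minimal sketch of reading from standard input, the snippet below pipes one tab-separated line into cut with no file argument. The sample data and the variable name second are made up for illustration. Because tab is the default delimiter, no -d option is needed.

```shell
# Illustrative tab-separated line (made-up data, not from a real file).
# With no FILE argument, cut reads from standard input.
second=$(printf 'alpha\tbeta\tgamma\n' | cut -f 2)

echo "$second"
```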
Most Used Options
- -f N: selects field or fields N (for example, -f 2 for the second field). Multiple fields can be specified separated by commas (-f 1,3,5) or as ranges (-f 2-4).
- -d DELIM: sets the delimiter that separates fields. By default, cut uses a tab, but with this option you can specify any single character, such as a comma (-d ',') or a semicolon (-d ';').
- -b LIST: extracts bytes according to LIST (useful when working with binary or fixed-width data).
- -c LIST: extracts characters by position, similar to -b but counting characters instead of bytes. Some implementations, including GNU cut, have historically treated -c the same as -b, so results on multibyte text may differ between systems.
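The options above can be tried side by side on a single line. The following is only a sketch: the colon-separated record imitates the style of /etc/passwd, its values are sample data, and the variable names are hypothetical.

```shell
# Illustrative colon-separated record (sample values for this sketch).
line='root:x:0:0:root:/root:/bin/bash'

fields=$(printf '%s\n' "$line" | cut -d ':' -f 1,7)   # fields 1 and 7
range=$(printf '%s\n' "$line" | cut -d ':' -f 2-4)    # fields 2 through 4
chars=$(printf '%s\n' "$line" | cut -c 1-4)           # first four characters
bytes=$(printf '%s\n' "$line" | cut -b 1-4)           # first four bytes

printf '%s\n' "$fields" "$range" "$chars" "$bytes"
```

With plain ASCII input like this, -c and -b select the same text; they can only diverge when the input contains multibyte characters.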
Practical Examples
Imagine a CSV file named datos.csv with the following content:
nombre,edad,ciudad
Juan,30,Madrid
Ana,25,Barcelona
Luis,28,Sevilla
To get only the city column, we use:
cut -d ',' -f 3 datos.csv
This returns:
ciudad
Madrid
Barcelona
Sevilla
If we want to remove the header and keep only the values, we can combine with tail:
cut -d ',' -f 3 datos.csv | tail -n +2
Another common case is to extract the first and third field:
cut -d ',' -f 1,3 datos.csv
Result:
nombre,ciudad
Juan,Madrid
Ana,Barcelona
Luis,Sevilla
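To reproduce these examples without touching existing files, one option is to create datos.csv in a temporary directory first. This is a sketch under that assumption; the workdir, cities, and names_cities variables are hypothetical names introduced here.

```shell
# Work in a throwaway directory so no existing file is overwritten.
workdir=$(mktemp -d)

# Recreate the sample file, one record per line.
cat > "$workdir/datos.csv" <<'EOF'
nombre,edad,ciudad
Juan,30,Madrid
Ana,25,Barcelona
Luis,28,Sevilla
EOF

# Third column without the header line.
cities=$(cut -d ',' -f 3 "$workdir/datos.csv" | tail -n +2)

# First and third columns, header included.
names_cities=$(cut -d ',' -f 1,3 "$workdir/datos.csv")

printf '%s\n' "$cities"
printf '%s\n' "$names_cities"

rm -r "$workdir"
```

Keep in mind that cut splits on every comma it sees. It does not understand CSV quoting, so a quoted field that contains a comma would be split in two.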
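When fields are separated by runs of spaces rather than by a single character, -d ' ' alone is not enough: cut treats every space as a delimiter, so consecutive spaces produce empty fields. Squeezing each run into one space with tr -s ' ' before applying cut avoids this.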
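A small sketch of the difference, using a made-up line with several spaces between columns (the variable names are hypothetical):

```shell
# Illustrative line with runs of spaces between columns.
line='Juan    30   Madrid'

# Without squeezing, each space is its own delimiter, so field 2 is empty.
raw=$(printf '%s\n' "$line" | cut -d ' ' -f 2)

# tr -s ' ' collapses each run of spaces into one before cut sees it.
squeezed=$(printf '%s\n' "$line" | tr -s ' ' | cut -d ' ' -f 2)

printf 'raw=[%s] squeezed=[%s]\n' "$raw" "$squeezed"
```

A line that starts with spaces still yields an empty first field after squeezing, because one leading space remains.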
When working with log files where information is in fixed positions, the -c option is very useful. Suppose each line has a 19-character timestamp followed by a message; to get only the message:
cut -c 20- archivo.log
This shows from character 20 to the end of each line.
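As a sketch of how the positions line up, take a hypothetical log line whose timestamp occupies characters 1 to 19. The line format and the variable names are assumptions for illustration only. If a separator space follows the timestamp, as here, it sits at position 20, so starting at 21 drops it.

```shell
# Hypothetical log line: 19-character timestamp, one space, then the message.
line='2024-01-15 10:30:00 Service started'

timestamp=$(printf '%s\n' "$line" | cut -c 1-19)   # characters 1 through 19
from20=$(printf '%s\n' "$line" | cut -c 20-)       # includes the separator space
message=$(printf '%s\n' "$line" | cut -c 21-)      # message without the space

printf '[%s]\n' "$timestamp" "$from20" "$message"
```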
Tips and Tricks
- Remember that cut does not handle delimiters that are regular expressions; if you need something more complex, combine it with awk.
- Use single quotes around the delimiter to prevent the shell from interpreting special characters.
- When working with ranges, -f 2- means from field 2 to the last, while -f -2 indicates from the start up to field 2.
- To quickly view the structure of a file, try cut -f 1-5 -d ',' archivo.csv | head.
- In scripts, store the result in a variable: COL2=$(cut -d ';' -f 2 entrada.txt).
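The open-ended ranges and the variable capture from the tips above can be sketched on one semicolon-separated record. The record contents and the names from_second and up_to_second are made up for this example.

```shell
# Illustrative semicolon-separated record (made-up values).
record='id;name;email;role'

from_second=$(printf '%s\n' "$record" | cut -d ';' -f 2-)    # field 2 to the last
up_to_second=$(printf '%s\n' "$record" | cut -d ';' -f -2)   # start through field 2
COL2=$(printf '%s\n' "$record" | cut -d ';' -f 2)            # one field, kept in a variable

printf '%s\n' "$from_second" "$up_to_second" "$COL2"
```

The single quotes around ';' matter here: unquoted, the shell would treat the semicolon as a command separator.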
Conclusion
The cut command is a lightweight but powerful tool for extracting text columns in Linux. Its simplicity makes it ideal for quick data processing tasks, while its ability to combine with other shell utilities makes it an indispensable component in any administration toolbox. Mastering its options and knowing when to use it will allow you to save time and write cleaner, more efficient scripts.