The strings command in Linux: extracting text from binary files

Introduction

In the world of system administration and malware analysis, we often encounter binary files that we cannot read directly. The Linux strings command allows us to extract readable character sequences that are embedded within those binaries, revealing useful information such as error messages, file paths, library versions, and more.

What is strings?

The strings command is part of the binutils package and is available on almost all Linux distributions. Its main function is to scan an input file, look for printable character sequences (generally ASCII) of a minimum length, and display them on standard output. This makes it a quick tool for obtaining clues about the content of executables, shared libraries, kernel objects, or even firmware files.
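
To make the scanning rule concrete, here is a minimal sketch that builds a throwaway file and runs strings on it. It assumes GNU strings from binutils; the file contents and the variable names (sample, found) are invented for illustration.

```shell
# Build a small throwaway "binary": text fragments separated by non-printable bytes.
sample=$(mktemp)
printf 'abc\000\001\002Hello, world\000\377\376/etc/passwd\000' > "$sample"

# "abc" is shorter than the default minimum of 4 characters, so it is skipped.
found=$(strings "$sample")
printf '%s\n' "$found"
# Prints:
#   Hello, world
#   /etc/passwd

rm -f "$sample"
```

Only runs of printable characters that reach the minimum length are reported; the NUL and high bytes simply act as separators.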

Basic Syntax

The simplest way to use strings is:

strings archivo.bin

This will print all the strings the program finds, one per line. However, the default behavior can produce a lot of output, so it is common to combine it with options that refine the search.

Most Useful Options

  • -n <min-len> or --bytes=<min-len>: sets the minimum length of the strings to report. The default is 4; raising it to 6 or 8 reduces noise.
  • -a or --all: scans the entire file instead of only the initialized, loaded data sections of an object file. Recent GNU binutils releases scan the whole file by default; -d (--data) restores the section-only scan.
  • -t <radix> or --radix=<radix>: prints the offset of each string in the given base (d = decimal, o = octal, x = hexadecimal).
  • -e <encoding> or --encoding=<encoding>: selects the character size and byte order (s = 7-bit single-byte characters, the default; S = 8-bit single-byte characters; b = 16-bit big-endian; l = 16-bit little-endian; B = 32-bit big-endian; L = 32-bit little-endian). Useful for binaries containing UTF-16 or UTF-32 text.
  • -f or --print-file-name: prefixes each string with the name of the input file (useful when processing multiple files).
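
The effect of these options is easier to see on a file whose contents we control. The following is a small sketch, assuming GNU strings; the sample data and variable names are made up for the demonstration.

```shell
# Three text fragments of different lengths, separated by NUL bytes.
sample=$(mktemp)
printf 'tiny\000\000medium-string\000\000a-much-longer-string-here\000' > "$sample"

# Default minimum length (4): all three fragments are reported.
all=$(strings "$sample")

# Minimum length raised to 6: "tiny" is dropped.
long_only=$(strings -n 6 "$sample")

# Each string prefixed with its offset in hexadecimal (0, 6 and 15 here).
with_offsets=$(strings -t x "$sample")

# Each string prefixed with the name of the file it came from.
with_names=$(strings -f "$sample")

printf '%s\n' "$long_only" "$with_offsets" "$with_names"
rm -f "$sample"
```

The options can be combined freely, for example -f -t x -n 6 when sweeping a whole directory of binaries.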

Practical Examples

Suppose we want to examine a binary called /usr/bin/miprograma and we are interested only in strings of at least 6 characters:

strings -n 6 /usr/bin/miprograma

To also see the hexadecimal position of each finding:

strings -t x -n 6 /usr/bin/miprograma
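
An offset is most useful when we feed it to another tool. As a rough sketch (assuming GNU strings, with a hypothetical sample file and an invented config= marker), we can ask for decimal offsets with -t d and hand the result to dd to read the bytes at that position:

```shell
# Five non-printable bytes, then a recognizable string.
sample=$(mktemp)
printf '\001\002\003\004\005config=/etc/app.conf\000\377' > "$sample"

# -t d reports offsets in decimal, which dd accepts directly.
offset=$(strings -t d "$sample" | awk '/config=/ {print $1; exit}')

# Read 7 bytes starting at that offset to confirm what is stored there.
chunk=$(dd if="$sample" bs=1 skip="$offset" count=7 2>/dev/null)
printf 'offset=%s bytes=%s\n' "$offset" "$chunk"
# Prints: offset=5 bytes=config=

rm -f "$sample"
```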

If the binary might contain UTF‑16 little‑endian text, we use:

strings -e l -n 6 /usr/bin/miprograma
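
To see why the encoding option matters, consider this short sketch. It assumes GNU strings and writes the made-up word "Secret" as UTF-16LE, where every ASCII character is followed by a zero byte:

```shell
sample=$(mktemp)
# "Secret" in UTF-16LE, followed by a 16-bit NUL terminator.
printf 'S\000e\000c\000r\000e\000t\000\000\000' > "$sample"

# Default 7-bit scan: no run of 4 consecutive printable bytes, so nothing is found.
narrow=$(strings "$sample")

# 16-bit little-endian scan: the word is recovered.
wide=$(strings -e l "$sample")

printf 'default: [%s]\n-e l:    [%s]\n' "$narrow" "$wide"
rm -f "$sample"
```

Windows executables often store text this way, so running both the default scan and -e l on the same file is a common habit.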

In the context of malware analysis, it is often combined with grep to search for specific indicators:

strings -n 8 sospechoso.exe | grep -i 'http://'

This extracts strings of at least 8 characters and keeps only those that contain http://. To match HTTPS URLs as well, use grep -iE 'https?://'.
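
When a binary repeats the same indicator many times, it helps to extract only the matching part and remove duplicates. The pipeline below is one possible sketch, assuming GNU strings and a grep that supports -o; the sample file and the example.com and example.org URLs are placeholders.

```shell
# A fake binary containing two distinct URLs, one of them twice.
sample=$(mktemp)
printf '\001\002http://example.com/a\000junk\000https://example.org/b\000\003http://example.com/a\000' > "$sample"

# -o prints only the matching part of each line; sort -u removes duplicates.
urls=$(strings -n 8 "$sample" | grep -Eio 'https?://[^[:space:]]+' | LC_ALL=C sort -u)
printf '%s\n' "$urls"
# Prints:
#   http://example.com/a
#   https://example.org/b

rm -f "$sample"
```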

Limitations and Considerations

Although strings is powerful, it has some limitations:

  • It only finds byte sequences that match the printability criteria; any encrypted or compressed text will go unnoticed.
  • In very large binaries, the output can be overwhelming; filtering with grep or redirecting to a file helps.
  • The command does not interpret the file structure; therefore, some strings may appear out of context (for example, within embedded image data).
  • On systems whose locale is not C, what counts as a printable character can vary; running LC_ALL=C strings ... is recommended for consistent results.
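
The first limitation is easy to reproduce. In this sketch (assuming GNU strings and gzip are installed; the message and variable names are invented), the same text is searched for before and after compression, with LC_ALL=C set as suggested above:

```shell
plain=$(mktemp)
packed=$(mktemp)

# Twenty copies of a recognizable message, stored as-is and then gzip-compressed.
i=0
while [ "$i" -lt 20 ]; do
  printf 'connection refused by remote host\n'
  i=$((i + 1))
done > "$plain"
gzip -c < "$plain" > "$packed"

# Count how many extracted strings still contain the message.
hits_plain=$(LC_ALL=C strings "$plain" | grep -c 'remote host')
hits_packed=$(LC_ALL=C strings "$packed" | grep -c 'remote host')

printf 'plain: %s  compressed: %s\n' "$hits_plain" "$hits_packed"
# Prints: plain: 20  compressed: 0

rm -f "$plain" "$packed"
```

An empty or meaningless result is therefore not proof that a file contains no text; it may simply be packed or encrypted.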

Conclusion

The strings command is an essential tool in the arsenal of any system administrator, developer, or security analyst. Its simplicity and speed make it possible to obtain valuable information from binary files without resorting to complex disassembly tools. Knowing its options, and how to combine them with other utilities such as grep, sort, or uniq, maximizes its usefulness in debugging, forensics, and malware reverse engineering.


This work by Francesc Roig (francesc@vivaldi.net) is licensed under a Creative Commons Attribution 4.0 International License.