The file command in Linux: identifying the file type

Introduction

System administrators and developers regularly come across files whose type is not evident from the name or extension. The Linux file command identifies the type of data a file contains by examining its actual content rather than its name.

What does the file command do?

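The file command first checks the filesystem for special cases such as directories, symbolic links, and empty files. It then compares the contents of the file against its signature database (the so-called magic numbers, characteristic byte sequences at known offsets) to determine whether it is a text document, an image, an executable, a compressed file, and so on. If no signature matches, it falls back to text and encoding heuristics. Its output is a readable description that helps decide the next step to take.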
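As a small illustration of detection by content, the following sketch writes a shell script into a file with a misleading .jpg extension and asks file what it really is. The scratch directory and the name holiday.jpg are made up for this example.

```shell
# Create a scratch directory (hypothetical names, for illustration only).
workdir=$(mktemp -d)

# Write a two-line shell script, but give it an image extension.
printf '#!/bin/sh\necho hello\n' > "$workdir/holiday.jpg"

# file ignores the extension and reports what the content looks like.
file "$workdir/holiday.jpg"
```

On a typical installation the description mentions a shell script rather than JPEG image data; the exact wording depends on the version of file.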

Basic syntax

The simplest use is:

file filename

Several files can be passed at once:

file file1 file2 file3

It also reads from standard input when - is given as the filename.
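A minimal sketch of the standard-input form, using a short text string as sample data (the string itself is arbitrary):

```shell
# "-" tells file to read the data from standard input instead of a named file.
printf 'hello world\n' | file -

# Combined with -b, the name prefix is dropped and only the description remains.
printf 'hello world\n' | file -b -
```

This is handy for checking data that arrives over a pipe, for example the output of another command, without first saving it to disk.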

Most useful options

  • -b: shows only the description, without the filename.
  • -i: prints the MIME type and character encoding instead of the human-readable description (for example, text/plain; charset=us-ascii).
  • -L: follows symbolic links and analyzes the file they point to.
  • -z: looks inside compressed files to identify their content.
  • -f file: reads the names of the files to analyze from a list file, one name per line.
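The options above can be tried together in a scratch directory. This is only a sketch: the names notes.txt, latest, and list.txt are invented for the example, and it assumes the POSIXLY_CORRECT environment variable is not set (when it is, file follows symbolic links by default).

```shell
workdir=$(mktemp -d)
printf 'plain text\n' > "$workdir/notes.txt"
ln -s notes.txt "$workdir/latest"    # relative symlink pointing at notes.txt

file "$workdir/latest"               # describes the link itself
file -L "$workdir/latest"            # follows the link and describes the target
file -b -i "$workdir/notes.txt"      # MIME type and charset, no filename

# Build a list file with one name per line and let file read it with -f.
printf '%s\n' "$workdir/notes.txt" "$workdir/latest" > "$workdir/list.txt"
file -f "$workdir/list.txt"
```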

Practical examples

Identify an image:

file photo.jpg

Typical output:

photo.jpg: JPEG image data, JFIF standard 1.01, aspect ratio, density 1x1, segment length 16, baseline, precision 8, 1920x1080, components 3

Determine the MIME type of a document:

file -i report.pdf

Output:

report.pdf: application/pdf; charset=binary

Analyze a compressed file without decompressing it:

file -z backup.tar.gz

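Output (the exact fields vary with the version of file; with -z the description of the inner content comes first, followed by the compression layer in parentheses):

backup.tar.gz: POSIX tar archive (GNU) (gzip compressed data, was "tar.archive", last modified: Tue Sep 30 12:00:00 2025, max compression, OS: Unix)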
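To see what -z adds, the two forms can be compared on a freshly built archive. The sketch below assumes tar with gzip support is installed; data.txt and the scratch directory are placeholders.

```shell
workdir=$(mktemp -d)
printf 'sample\n' > "$workdir/data.txt"

# Pack the sample file into a gzip-compressed tar archive.
tar -czf "$workdir/backup.tar.gz" -C "$workdir" data.txt

# Without -z, file describes only the outer gzip layer.
file "$workdir/backup.tar.gz"

# With -z, file also looks inside and reports the tar archive.
file -z "$workdir/backup.tar.gz"
```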

Use the -b option to obtain only the description:

file -b script.sh

Output:

Bourne-Again shell script, ASCII text executable

Interpretation of the output

The output of file normally consists of two parts separated by a colon: the filename and the description. Scripts usually need only the second part, so it is simpler to use -b to drop the filename than to parse the line, and then base automated decisions on the description alone.
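One way to make such automated decisions is to branch on the MIME type rather than on the free-form description, since it is more stable across versions. The sketch below uses the --mime-type option together with -b; describe_action is a hypothetical helper written for this example, and the actions it prints are placeholders.

```shell
# describe_action is a hypothetical helper, not part of file itself.
describe_action() {
    # --mime-type prints only the type (no charset); -b drops the filename.
    mime=$(file -b --mime-type "$1")
    case "$mime" in
        text/*)                              echo "view as text" ;;
        image/*)                             echo "open in an image viewer" ;;
        application/gzip|application/x-gzip) echo "decompress with gunzip" ;;
        *)                                   echo "unhandled type: $mime" ;;
    esac
}

workdir=$(mktemp -d)
printf 'just some notes\n' > "$workdir/notes"
describe_action "$workdir/notes"
```

Both gzip spellings are listed because different versions of file report one or the other.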

Limitations and tips

Although file is generally reliable, it is not infallible. Very small files, or files with ambiguous signatures, may be misclassified. In those cases, inspecting the raw bytes with other tools such as hexdump or strings can provide more clarity. Keeping the signature database up to date (it ships with the distribution's file package) also helps file recognize newer formats.
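When a result looks doubtful, the leading bytes of the file can be checked by hand against the expected signature. The sketch below uses od as a widely available stand-in for hexdump, and a small gzip stream (which always starts with the bytes 1f 8b) as sample data; the name blob is arbitrary.

```shell
workdir=$(mktemp -d)

# Produce a small gzip stream to inspect.
printf 'hello\n' | gzip -c > "$workdir/blob"

# Ask file for its verdict.
file "$workdir/blob"

# Dump the first two bytes in hex and compare them with the gzip signature 1f 8b.
od -A x -t x1 -N 2 "$workdir/blob"
```

If the bytes and the description disagree, that is a hint that the file is truncated, corrupted, or of a type the signature database does not yet know.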

Conclusion

The file command is an essential tool for any Linux user who needs to quickly know what type of data they are handling. Its simplicity, speed, and wealth of options make it an indispensable ally in administration, debugging, and automation tasks.


This work by Francesc Roig (francesc@vivaldi.net) is licensed under a Creative Commons Attribution 4.0 International License.