The tar command in Linux: archiving and compressing files

Introduction

The tar command is one of the oldest and most versatile tools in Linux for grouping multiple files into a single file, known as a tarball. Although by itself it does not compress, tar is frequently combined with gzip, bzip2, or xz to obtain compressed files that take up less space and are easier to transfer.

Basic Syntax

The general form of tar is:

tar [options] [files or directories]

The most common options are:

  • -c: create a new archive.
  • -x: extract files from a tarball.
  • -t: list contents without extracting.
  • -v: verbose mode, shows processed files.
  • -f: specifies the tarball file name.
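As a quick illustration of how these options combine, the following sketch creates, lists, and extracts a small archive inside a throwaway directory. The file and directory names (demo, a.txt, out) are hypothetical and chosen only for the example.

```shell
# Work in a temporary directory so nothing important is touched.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work" || exit 1

# Hypothetical sample data.
mkdir demo
echo "alpha" > demo/a.txt
echo "beta" > demo/b.txt

# -c creates the archive, -f names it.
tar -cf demo.tar demo

# -t lists the members without extracting them.
listing=$(tar -tf demo.tar)
echo "$listing"

# -x extracts into the current directory (here: out/).
mkdir out
(cd out && tar -xf ../demo.tar)
cat out/demo/a.txt
```

Adding -v to any of these commands prints each member as it is processed.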

Creating a tarball without compression

To group several files or a directory into a tar file without compression:

tar -cvf respaldo.tar /home/usuario/documentos

This creates respaldo.tar and shows each file added thanks to the -v option.
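When given an absolute path like the one above, GNU tar strips the leading / from member names and prints a warning. One way to control the stored paths is to pass -C at creation time, so that tar changes directory before reading the files. The sketch below assumes a temporary directory standing in for /home/usuario; the layout is hypothetical.

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Hypothetical stand-in for /home/usuario/documentos.
mkdir -p "$work/home/usuario/documentos"
echo "nota" > "$work/home/usuario/documentos/nota.txt"

# -C changes directory before reading, so members are stored as
# documentos/... rather than with the full path.
tar -cf "$work/respaldo.tar" -C "$work/home/usuario" documentos

members=$(tar -tf "$work/respaldo.tar")
echo "$members"
```

Archives built this way extract into a single documentos directory wherever they are unpacked.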

Extracting a tarball

To restore the files:

tar -xvf respaldo.tar

If you want to extract to a different directory, use -C followed by the path:

tar -xvf respaldo.tar -C /tmp/restaurar
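The directory passed to -C must already exist, because tar does not create it. A minimal sketch, using a temporary directory in place of /tmp/restaurar and a hypothetical file name:

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Build a small archive to restore from.
echo "dato" > "$work/dato.txt"
tar -cf "$work/respaldo.tar" -C "$work" dato.txt

# Create the target first; tar fails if the -C directory is missing.
dest="$work/restaurar"
mkdir -p "$dest"
tar -xf "$work/respaldo.tar" -C "$dest"

ls "$dest"
```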

Compression with gzip

The most common combination is tar with gzip, conventionally given the extension .tar.gz or .tgz:

tar -czvf copia.tar.gz /var/www

Here -c creates the archive, -z filters it through gzip, -v prints each file as it is processed, and -f names the output file.

Compression with bzip2

bzip2 usually achieves a higher compression ratio than gzip, at the cost of more processing time:

tar -cjvf copia.tar.bz2 /var/www

The -j option indicates bzip2.

Compression with xz

xz typically gives the best compression ratio of the three, which makes it a common choice for software distributions:

tar -cJvf copia.tar.xz /var/www

The -J option activates xz.
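To see how gzip, bzip2, and xz compare on your own data, you can archive the same directory with each and look at the resulting sizes. The sketch below uses a hypothetical www directory filled with repetitive text, so the ratios it shows are not representative of real content. It assumes gzip is installed and reports when bzip2 or xz is not.

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Hypothetical, highly repetitive sample data.
mkdir "$work/www"
yes "linea de ejemplo repetida" | head -n 20000 > "$work/www/index.html"

# Same input, four archives: uncompressed, gzip, bzip2, xz.
tar -cf  "$work/copia.tar"     -C "$work" www
tar -czf "$work/copia.tar.gz"  -C "$work" www
tar -cjf "$work/copia.tar.bz2" -C "$work" www || echo "bzip2 not available"
tar -cJf "$work/copia.tar.xz"  -C "$work" www || echo "xz not available"

# Print the size in bytes next to each archive name.
for f in "$work"/copia.tar*; do
    printf '%s\t%s\n' "$(wc -c < "$f")" "${f##*/}"
done
```

Timing each command with time gives the other half of the trade-off.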

Listing the content of a compressed tarball

You can see what’s inside without decompressing:

tar -tzf copia.tar.gz

For bzip2 use -tjf and for xz -tJf.
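Recent versions of GNU tar and bsdtar also detect the compression format on their own when reading an archive, so a plain -tf usually works too. The following sketch compares both forms; it assumes one of those two implementations, and the www directory is hypothetical.

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir "$work/www"
echo "<h1>hola</h1>" > "$work/www/index.html"
tar -czf "$work/copia.tar.gz" -C "$work" www

# Explicit gzip flag.
explicit=$(tar -tzf "$work/copia.tar.gz")

# No compression flag: tar detects gzip itself when reading
# (assumes GNU tar or bsdtar).
auto=$(tar -tf "$work/copia.tar.gz")

echo "$auto"
```

Auto-detection applies only when reading; creating a compressed archive still needs -z, -j, or -J.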

Adding files to an existing tarball

Although tar is not designed to modify archives in place, the -r (append) option adds files to an existing uncompressed archive:

tar -rvf respaldo.tar nuevo.txt

A compressed tarball cannot be appended to directly: decompress it, append the file, and compress it again.
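The decompress, append, recompress cycle for a gzip archive can be sketched as follows. The file names are hypothetical, and the example runs in a temporary directory.

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work" || exit 1

# Hypothetical starting point: a gzip archive with one file.
echo "original" > viejo.txt
echo "nuevo" > nuevo.txt
tar -czf respaldo.tar.gz viejo.txt

gzip -d respaldo.tar.gz          # leaves respaldo.tar
tar -rf respaldo.tar nuevo.txt   # append to the uncompressed archive
gzip respaldo.tar                # back to respaldo.tar.gz

members=$(tar -tzf respaldo.tar.gz)
echo "$members"
```

For large archives this rewrites the whole file, so it is often simpler to recreate the archive from the source files.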

Tips and best practices

  • Always test extraction in a temporary directory before overwriting important files.
  • Use --exclude to omit patterns, for example --exclude='*.log'.
  • Combine tar with ssh to create remote backups: tar -czf - /etc | ssh usuario@servidor 'cat > /backup/etc.tar.gz'.
  • Verify integrity with tar -tzf or gzip -t as appropriate.
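Several of these tips can be tried locally. The sketch below excludes log files, writes an archive to standard output with -f - (here cat stands in for the ssh step, which needs a real server), and then verifies the result. The sitio directory and its files are hypothetical.

```shell
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir "$work/sitio"
echo "contenido" > "$work/sitio/index.html"
echo "ruido" > "$work/sitio/debug.log"

# Quote the pattern so the shell passes it to tar unexpanded.
tar -czf "$work/copia.tar.gz" --exclude='*.log' -C "$work" sitio

# "-f -" sends the archive to stdout; cat is a local stand-in for
# ssh usuario@servidor 'cat > /backup/...'.
tar -czf - -C "$work" sitio | cat > "$work/via_pipe.tar.gz"

# gzip -t checks the compressed stream; tar -tzf also reads the tar structure.
gzip -t "$work/copia.tar.gz" && echo "gzip stream OK"
members=$(tar -tzf "$work/copia.tar.gz") && echo "tar structure OK"
echo "$members"
```

Both checks exit with a non-zero status on a damaged archive, which makes them easy to use in backup scripts.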

Conclusion

Mastering tar lets you archive and compress data efficiently in Linux, whether for backups, software distribution, or simple file organization. Practicing the different options and trying the different compression algorithms will save you time and disk space.


This work by Francesc Roig (francesc@vivaldi.net) is licensed under a Creative Commons Attribution 4.0 International License.