File and folder management is an essential task for every Linux system administrator. System admins often need to archive and compress older files and store them away so that more space becomes available for active project files. To do so, we need to know how to create an archive file as well as how to work with it: opening it, exploring it, and adding files to or deleting files from it.
To put it concisely, an archive is a single file that contains a collection of other files and/or directories. Archive files are mostly used to transfer (locally or over the internet) or back up a collection of files and directories, so that you work with one file instead of many. Likewise, archives are used for software application packaging. This single file can be easily compressed for ease of transfer, while the files in the archive retain the structure and permissions of the original files.
This tutorial shows how to use tar to create an archive, list the contents of an archive, and extract the files from an archive. Two common options used with all three of these operations are ‘-f’ and ‘-v’: to specify the name of the archive file, use ‘-f’ followed by the file name; use the ‘-v’ (“verbose”) option to have tar output the names of files as they are processed. While the ‘-v’ option is not necessary, it lets you observe the progress of your tar operation.
We cover the following 3 topics in this tutorial: 1- Make an archive file, 2- List contents of an archive file, and 3- Extract contents from an archive file. We conclude this tutorial by reviewing the 9 Frequently Asked Questions or FAQs related to archive file management. What you take away from this tutorial is essential for performing tasks related to cybersecurity and cloud technology.
To make an archive with tar, use the ‘-c’ (“create”) option, and specify the name of the archive file to create with the ‘-f’ option. It’s common practice to use a name with a ‘.tar’ extension, such as ‘my-backup.tar’.
To create an archive called ‘asset.tar’ from the contents of the ‘asset’ directory, type:
$ tar -cvf asset.tar asset
This command creates an archive file called ‘asset.tar’ containing the ‘asset’ directory and all of its contents. The original ‘asset’ directory remains unchanged.
Use the ‘-z’ option to compress the archive with gzip as it is being written. This yields the same result as creating an uncompressed archive and then compressing it with gzip, but it eliminates the extra step. Archive compression is covered in more detail in question 4 of the FAQ section below.
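As a rough illustration of the point above, the following sketch builds the same directory into a gzip-compressed archive both ways and compares the member lists. The scratch directory, the file readme.txt and the archive names asset-onestep.tar.gz and asset-twostep.tar.gz are made up for the demo and are not part of the original tutorial.

```shell
set -eu
# Work in a throwaway directory so no real data is touched (hypothetical paths).
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

mkdir asset
printf 'hello\n' > asset/readme.txt

# One step: create the archive and gzip-compress it in a single tar call.
tar -czf asset-onestep.tar.gz asset

# Two steps: create a plain archive, then compress it with gzip.
tar -cf asset-twostep.tar asset
gzip asset-twostep.tar   # replaces the file with asset-twostep.tar.gz

# Both archives should list the same members.
onestep=$(tar -tzf asset-onestep.tar.gz)
twostep=$(tar -tzf asset-twostep.tar.gz)
printf '%s\n' "$onestep"
```

Note that gzip replaces asset-twostep.tar with asset-twostep.tar.gz, so only the compressed file remains after the second method.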
To list the contents of a tar archive without extracting them, use tar with the ‘-t’ option.
To list the contents of an archive called ‘asset.tar’, type:
$ tar -tvf asset.tar
This command lists the contents of the ‘asset.tar’ archive. Using the ‘-v’ option along with the ‘-t’ option causes tar to output the permissions and modification time of each file, along with its file name, in the same format used by the ls command with the ‘-l’ option.
To extract (or unpack) the contents of a tar archive, use tar with the ‘-x’ (“extract”) option.
To extract the contents of an archive called ‘asset.tar’, type:
$ tar -xvf asset.tar
This command extracts the contents of the ‘asset.tar’ archive into the current directory.
To extract the contents of a compressed archive called ‘asset.tar.gz’, type:
$ tar -zxvf asset.tar.gz
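The three operations can be tried together in a small round trip. This is only a sketch under assumptions: the directory layout, file names and the restore directory are invented for the demo, and the ‘-C’ option (which tells tar to change into a directory before extracting) is not covered in the text above.

```shell
set -eu
# Throwaway working directory (hypothetical paths and file names).
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

mkdir -p asset/img
printf 'logo\n' > asset/img/logo.txt
printf 'notes\n' > asset/notes.txt

tar -cf asset.tar asset          # create the archive
listing=$(tar -tf asset.tar)     # list member names without extracting
mkdir restore
tar -xf asset.tar -C restore     # extract under restore/ instead of the current directory

# diff -r succeeds only if the restored tree matches the original.
roundtrip=failed
if diff -r asset restore/asset; then
  roundtrip=ok
fi
printf '%s\n' "$listing"
```

Extracting into a separate directory keeps the restored copy from overwriting the original files, which makes the comparison meaningful.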
Now that we have learned how to create an archive file and list or extract its contents, we can move on to the following 9 frequently asked questions that you may encounter while working with Linux archives.
Regretfully, once an archive has been compressed there is no way to add content to it directly. You would have to “unpack” it (extract the contents), edit or add content, and then create and compress the archive again. An archive that has not been compressed is different: tar can append files to it in place with the ‘-r’ (“append”) option.
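The unpack, add and recompress cycle described above might look like the sketch below. The names docs, one.txt, two.txt, docs.tar.gz and plain.tar are placeholders. The second half uses tar's ‘-r’ (append) option, which is not part of the original text and only applies to archives that have not been compressed.

```shell
set -eu
# Throwaway working directory (all names here are hypothetical).
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

mkdir docs
printf 'one\n' > docs/one.txt
tar -czf docs.tar.gz docs
rm -r docs

# Compressed archive: unpack, add the new file, then archive and compress again.
tar -xzf docs.tar.gz
printf 'two\n' > docs/two.txt
tar -czf docs.tar.gz docs
repacked=$(tar -tzf docs.tar.gz)

# Uncompressed archive: -r appends the new file in place.
tar -cf plain.tar docs/one.txt
tar -rf plain.tar docs/two.txt
appended=$(tar -tf plain.tar)
printf '%s\n' "$repacked" "$appended"
```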
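Whether content can be deleted from an archive depends on the version of tar being used. GNU tar, the default on most Linux distributions, supports a ‘--delete’ option; it works only on uncompressed archives.
For example, let's say the archive file.tar contains the files file1 and file2. They can be removed from file.tar with the following: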
$ tar -vf file.tar --delete file1 file2
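To remove a directory dir1 and everything under it, pass the directory name itself (an unquoted dir1/* would be expanded by the shell against the local filesystem before tar ever sees it):
$ tar -f file.tar --delete dir1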
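To see the effect of deletion, the sketch below builds a small uncompressed archive, removes two files and a directory, and lists what is left. It assumes GNU tar (other tar implementations may not offer --delete), and inner.txt and keep.txt are invented names.

```shell
set -eu
# Throwaway working directory; assumes GNU tar, which provides --delete.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

mkdir dir1
printf 'a\n' > file1
printf 'b\n' > file2
printf 'c\n' > dir1/inner.txt
printf 'd\n' > keep.txt
tar -cf file.tar file1 file2 dir1 keep.txt

# Remove two members, then a whole directory, from the uncompressed archive.
tar -f file.tar --delete file1 file2
tar -f file.tar --delete dir1

remaining=$(tar -tf file.tar)
printf '%s\n' "$remaining"
```

Listing the archive after each deletion is a cheap way to confirm that only the intended members were removed.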
The simplest way to see the difference between archiving and compressing is to look at the end result. When you archive files, you combine multiple files into one, so archiving ten 100 KB files gives you one file of roughly 1,000 KB (slightly more, because tar adds a small header for each file). If we compress those files instead, the result depends on the content: highly repetitive data could shrink to a few KB, while data that is already compressed would stay close to the original 1,000 KB.
As we saw above, you can create an archive file using the tar command with the cvf options. There are two ways to compress the archive: run the finished archive file through a compression tool such as gzip, or pass a compression flag to the tar command. The most common compression flags are z for gzip, j for bzip2 and J for xz. We can see the first method below:
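The size difference can be measured directly. The sketch below is only an illustration: it fills ten 100 KiB files with zero bytes, which is an artificial, highly compressible input chosen for the demo, so the compressed size shown here is a best case rather than a typical one.

```shell
set -eu
# Throwaway working directory (file names are hypothetical).
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

mkdir data
i=1
while [ "$i" -le 10 ]; do
  # 102400 zero bytes per file: extremely compressible test data.
  head -c 102400 /dev/zero > "data/file$i.bin"
  i=$((i + 1))
done

tar -cf data.tar data       # archive only
tar -czf data.tar.gz data   # archive and compress with gzip

# Byte counts of both results; $(( )) strips any padding wc may add.
archived=$(($(wc -c < data.tar)))
compressed=$(($(wc -c < data.tar.gz)))
printf 'archived:   %s bytes\ncompressed: %s bytes\n' "$archived" "$compressed"
```

With real project files the gap will be smaller, and for files that are already compressed (images, video, other archives) it may nearly disappear.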
$ gzip file.tar
Or we can use a compression flag when running the tar command. Here we use the gzip flag “z” and give the output a ‘.tar.gz’ extension to show that it is compressed:
$ tar -cvzf file.tar.gz /some/directory
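The compression flags can be compared side by side with a sketch like this one. The project directory and archive names are placeholders, and the bzip2 and xz steps are skipped when those tools are not installed, since tar relies on them for the j and J flags.

```shell
set -eu
# Throwaway working directory (hypothetical names).
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

mkdir project
printf 'some text\n' > project/a.txt

tar -czf file.tar.gz project            # z: gzip
made="gz"
if command -v bzip2 >/dev/null 2>&1; then
  tar -cjf file.tar.bz2 project         # j: bzip2
  made="$made bz2"
fi
if command -v xz >/dev/null 2>&1; then
  tar -cJf file.tar.xz project          # J: xz
  made="$made xz"
fi

printf 'created: %s\n' "$made"
ls -l file.tar.*
```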
As a system admin, you will often need to archive multiple files or directories at once. To do this with the tar command, simply list the files or directories you want to archive as arguments:
$ tar -cvzf file.tar.gz file1 file2 file3
or
$ tar -cvzf file.tar.gz /some/directory1 /some/directory2
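Files and directories can also be mixed in one call. The following sketch uses invented names (file1, file2, directory1, directory2 and their contents) to show that every argument ends up in the same archive.

```shell
set -eu
# Throwaway working directory (hypothetical names).
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

printf '1\n' > file1
printf '2\n' > file2
mkdir directory1 directory2
printf 'x\n' > directory1/x.txt
printf 'y\n' > directory2/y.txt

# Every name after the archive name is added, whether file or directory.
tar -czf file.tar.gz file1 file2 directory1 directory2

members=$(tar -tzf file.tar.gz)
printf '%s\n' "$members"
```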
You may run into a situation where you want to archive a directory or file but you don't need certain files to be archived. To avoid archiving those files or “exclude” them you would use the --exclude option with tar:
$ tar --exclude='/home/user/some/directory' -cvf file.tar /home/user
In this example /home/user is archived, but the some/directory subdirectory under it is left out. It is safest to put the --exclude option before the archive name and the files to archive, and to wrap the excluded file or directory in straight single quotation marks so that the shell does not expand any wildcards in it.
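A small experiment makes the effect of --exclude visible. The sketch below uses relative, made-up paths (home/user/cache, home/user/docs and the files inside them) rather than real system directories, and also shows a quoted wildcard pattern, which is an addition to the example above.

```shell
set -eu
# Throwaway working directory; the home/user tree is a stand-in, not a real home.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

mkdir -p home/user/cache home/user/docs
printf 'tmp\n' > home/user/cache/blob.tmp
printf 'report\n' > home/user/docs/report.txt
printf 'noise\n' > home/user/docs/debug.log

# Skip one directory by path and all .log files by a quoted wildcard pattern.
tar --exclude='home/user/cache' --exclude='*.log' -cf file.tar home/user

kept=$(tar -tf file.tar)
printf '%s\n' "$kept"
```

Quoting the pattern matters most for wildcards: without the quotes the shell could expand *.log against the current directory before tar receives it.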
The biggest difference between tar and shar is that a shar archive is a shell script which recreates the files when it is executed. A shar file is plain text, which can be an advantage, but because it is executable it can pose a security risk, so inspect it before running it. Note that shar is mainly found on older Linux systems, so you may still need it if you maintain or patch such systems.
How to use shar:
$ shar file.extension > file.shar
How to unshar:
$ unshar file.shar
ar is mainly used for binary object files, for example to build static libraries. ar creates a flat set of files, whereas tar maintains the directory structure, so tar is much more suitable for distributing directories and files. How to use ar (add.o and sub.o are hypothetical object files):
$ ar cr libmath.a add.o sub.o
where c creates the archive and r inserts the listed files as members of the archive.
To extract an ar file:
$ ar x libmath.a
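The flat layout of ar archives can be checked with a short sketch. The files add.o and sub.o below are plain text placeholders rather than real object code, the objs directory is invented, and the whole demo is skipped if ar (part of binutils) is not installed.

```shell
set -eu
# Throwaway working directory (hypothetical names).
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

members=skipped
tarmembers=skipped
if command -v ar >/dev/null 2>&1; then
  mkdir objs
  printf 'x\n' > objs/add.o   # placeholder content, not real object code
  printf 'y\n' > objs/sub.o

  ar cr libmath.a objs/add.o objs/sub.o
  members=$(ar t libmath.a)        # member names, directory part dropped

  tar -cf objs.tar objs
  tarmembers=$(tar -tf objs.tar)   # tar keeps the objs/ prefix
fi
printf 'ar:\n%s\ntar:\n%s\n' "$members" "$tarmembers"
```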
cpio stands for copy in and out. The functions of cpio and tar are fairly similar. However, tar is more widely used and much simpler. The file format is also different between the two. As you’ll see in the example below, cpio is a little more painful to use compared to tar:
$ ls | cpio -ov > /path/to/output/folder/obj.cpio
where -o creates the archive (“copy-out” mode) from the list of file names read on standard input, and -v is verbose.
$ cpio -idv < /path/to/output/folder/obj.cpio
where -i is extract, -d is make directories and -v is verbose.
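A complete cpio round trip might look like the sketch below. It uses find instead of ls so that files in subdirectories are included, which is a substitution for the ls command shown above. The src and out directories are invented, and the demo is skipped when cpio is not installed.

```shell
set -eu
# Throwaway working directory (hypothetical names).
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

listing=skipped
if command -v cpio >/dev/null 2>&1; then
  mkdir src
  printf 'a\n' > src/a.txt

  # Copy-out: cpio reads the file names to archive from standard input.
  find src | cpio -o > obj.cpio 2>/dev/null

  # List the archive, then copy-in (extract) under a separate directory.
  listing=$(cpio -t < obj.cpio 2>/dev/null)
  mkdir out
  (cd out && cpio -id < ../obj.cpio 2>/dev/null)
fi
printf '%s\n' "$listing"
```

The need to feed cpio a file list on standard input is the main reason it feels more cumbersome than tar for everyday use.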
The tar command is very handy for creating backups or compressing files you no longer need. It's good practice to back up files before changing them: if something fails to work as intended, you will always be able to revert to the old file. Compressing files that are no longer in use helps keep systems clean and lowers disk space usage. There are other utilities available, but tar has reigned supreme for its versatility, ease of use and popularity.
Matt Zand is a serial entrepreneur and the founder of 3 tech startups: DC Web Makers, Coding Bootcamps and High School Technology Services. He is a leading author of the book Hands-on Smart Contract Development with Hyperledger Fabric by O’Reilly Media. He has written more than 100 technical articles and tutorials on blockchain development for Hyperledger, Ethereum and Corda R3 platforms. At DC Web Makers, he leads a team of blockchain experts for consulting and deploying enterprise decentralized applications. As chief architect, he has designed and developed blockchain courses and training programs for Coding Bootcamps. He has a master's degree in business management from the University of Maryland. Prior to blockchain development and consulting, he worked as a senior web and mobile app developer and consultant, angel investor, and business advisor for a few startup companies. You can connect with him on LinkedIn: https://www.linkedin.com/in/matt-zand-64047871