Reading time: 30 minutes | Coding time: 5 minutes
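GNU Tar (Tape Archiver) is an open source tool for bundling files and directories into a single archive file and extracting them again. It can also compress and decompress archives through filters such as gzip. In this article, we will explore how to use it along with its different options.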
We will cover the following sub-topics:
- Create a .tar archive file
- Extract a .tar archive file
- List the contents of archive file
- Append files at end of archive files
- Creating a single archive file for multiple file systems
- Create a .tar.gz archive file
- Extract a .tar.gz archive file
- Shell script to understand .tar versus .tar.gz compressed files
- Checking diff between archive file and source file system
- Updating archive after changes in file system
Syntax:
tar -[Options] filename
Options:
-A, --catenate, --concatenate: appends the contents of one or more other tar archives to the end of an existing tar archive.
-c, --create: Creates a new .tar or .tar.gz archive
-d, --diff, --compare: compares the file system and its archived file
--delete: delete files from the archive
-r, --append: appends file at end of the tar file
-t, --list: show all the files present in the archived tar file
-u, --update: appends files that are not yet in the archive, or that are newer than the copy stored in the archive
-x, --extract, --get: extracts the files from archive to the destination folder.
-v, --verbose: lists each file as tar processes it.
-z, --gzip: filters the archive through gzip, to create or read a .tar.gz file
Note: Append adds a file at the end of a tar file, while concatenation adds the contents of another tar file at the end of the tar file.
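The difference between the two can be seen on a pair of small archives. The following is a minimal sketch, assuming GNU tar (other tar implementations may not support -A); the scratch directory and the file names one.txt, two.txt and three.txt are hypothetical and exist only for illustration.

```shell
#!/bin/bash
set -eu

# Hypothetical scratch directory and file names, used only for illustration
workdir=$(mktemp -d)
cd "$workdir"
echo "one" > one.txt
echo "two" > two.txt
echo "three" > three.txt

tar -cf first.tar one.txt      # archive holding one.txt
tar -cf second.tar two.txt     # a separate archive holding two.txt

# -r appends a plain file to the end of first.tar
tar -rf first.tar three.txt

# -A appends the members of second.tar to the end of first.tar
tar -Af first.tar second.tar

# List what first.tar holds now
members=$(tar -tf first.tar)
echo "$members"
```

After both commands, first.tar lists three.txt (added as a file) and two.txt (taken from second.tar) alongside the original one.txt.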
1. Create a .tar archive file
Syntax:
tar -c[v]f /{destination address}/{compressedfilename}.tar {file system}
Here,
- -c: option is used to create a .tar file
- -v: is an optional flag for verbose output, i.e., tar prints the name of each file as it is added to the archive
- -f: names the archive file to operate on; the argument that follows it is the path of the archive.
- tar creates an archive file called {compressedfilename}.tar and stores it in the destination address.
- The file system to be archived can be given as an absolute or a relative path. With an absolute path, GNU tar removes the leading / from the stored member names by default.
Implementation:
tar cvf compress.tar /home/nishkarshraj/Desktop/HelloWorld
- A HelloWorld directory exists at the absolute path (with respect to the root directory) /home/nishkarshraj/Desktop, where nishkarshraj is a user on the Linux machine.
- tar bundles the files into an archive file compress.tar and lists each file verbosely as it goes.
- Since the destination path of the compress.tar file is not specified, it is stored in the directory from which the command is run, which is the root directory here.
- The first ls command shows the content of the current directory, which has no archive files.
- The tar command archives the HelloWorld directory at the specified path and creates a compress.tar file in the current directory.
- The ls command entered again shows that compress.tar now exists in the current directory.
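The same steps can be tried without touching a real home directory. The following is a minimal sketch, assuming GNU tar; the temporary directory stands in for /home/nishkarshraj/Desktop and the sample files inside HelloWorld are hypothetical.

```shell
#!/bin/bash
set -eu

# Hypothetical scratch area standing in for /home/nishkarshraj/Desktop
workdir=$(mktemp -d)
mkdir "$workdir/HelloWorld"
echo "# Intro" > "$workdir/HelloWorld/Intro.md"
echo "test data" > "$workdir/HelloWorld/test.txt"

# -c creates, -v lists each file, -f names the archive file.
# -C switches to $workdir first, so member names are stored as HelloWorld/...
tar -cvf "$workdir/compress.tar" -C "$workdir" HelloWorld

# The archive now sits next to the directory it was built from
ls "$workdir"
```

Using -C with a relative name is a design choice for this sketch: it keeps the stored paths short instead of recording the full path from the root directory.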
2. Extract a .tar archive file
Syntax:
tar -x[v]f {path to}/{filename}.tar
- -x: tells tar to extract the archive file
- -v: is an optional flag which lists each file as it is extracted.
- -f: names the archive file to read.
- tar extracts filename.tar into the current directory, recreating the directory structure stored in the archive.
Note: tar stores each file under its path rather than its bare file name. Thus, a file called file1.txt stored at /home/Desktop is recorded as home/Desktop/file1.txt (GNU tar drops the leading / by default) rather than as file1.txt. When the archive is extracted from the root directory, as in this example, the files therefore land in the same location from which they were archived. The -C {directory} option extracts into another directory instead.
Implementation:
tar xvf compress.tar
- tar reads the archive at the given path (no directory prefix is given here, so the current directory is used) and, because the command is run from the root directory, recreates the files at the location from which they were archived.
Working on the same compress.tar file which was created in Task 1:
- First list content of /home/nishkarshraj/Desktop using the ls command and check that it does not contain the HelloWorld Directory.
- Use the tar tool to extract the files from archive.
- List content of /home/nishkarshraj/Desktop again to verify HelloWorld directory is created.
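To see that extraction is relative to a directory of your choosing, the sketch below builds a small archive and unpacks it elsewhere with -C. It assumes GNU tar; the src and dest directories and the file content are hypothetical.

```shell
#!/bin/bash
set -eu

# Hypothetical source and destination directories
src=$(mktemp -d)
dest=$(mktemp -d)
mkdir "$src/HelloWorld"
echo "hello" > "$src/HelloWorld/Intro.md"

# Build a small archive to extract from
tar -cf "$src/compress.tar" -C "$src" HelloWorld

# -x extracts, -v lists each file, -f names the archive,
# -C chooses the directory the members are unpacked into
tar -xvf "$src/compress.tar" -C "$dest"

# The directory tree stored in the archive is recreated under $dest
ls "$dest/HelloWorld"
```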
3. List contents of the archive file
It is possible to see the individual files present in the archive file using tar command.
Syntax:
tar -tf filename.tar
- -t option is used to list the content of the archive file
- -f option names the archive file to read.
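As an illustration, the sketch below lists a freshly made archive in both the short and the verbose form. It assumes GNU tar; the scratch directory and sample files are hypothetical.

```shell
#!/bin/bash
set -eu

# Hypothetical scratch directory with two sample files
workdir=$(mktemp -d)
mkdir "$workdir/HelloWorld"
echo "# Intro" > "$workdir/HelloWorld/Intro.md"
echo "test data" > "$workdir/HelloWorld/test.txt"
tar -cf "$workdir/file.tar" -C "$workdir" HelloWorld

# -t prints the stored member names, one per line
listing=$(tar -tf "$workdir/file.tar")
echo "$listing"

# Adding -v also shows permissions, owner, size and modification time
tar -tvf "$workdir/file.tar"
```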
4. Append files at the end of archive files
It is possible to append files at the end of archive files using tar command.
Syntax:
tar -rf {filename}.tar {file to be attached}
or
tar --append -f {filename}.tar {file to be attached}
- -r or --append tag is used to append the file specified at end of archive file.
Here, the following commands are used on the shell:
- tar -cvf file.tar /home/nishkarshraj/Desktop/HelloWorld
It creates an archive file for the specified HelloWorld directory at the current location (root /).
- ls
Lists the content of the current folder to verify the creation of the archive file.
- echo "test data" >> test.txt
It creates a new file called test.txt with the content "test data".
- ls
Lists the content of the current folder to verify the creation of the test.txt file.
- tar -rf file.tar test.txt
Appends the test.txt file at the end of the file.tar archive.
- tar -tf file.tar
Lists the content of the file.tar file, showing the newly added test.txt file at the end of it.
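The sequence above can be reproduced in a throwaway directory. This is a minimal sketch, assuming GNU tar; the scratch directory and the content of HelloWorld are hypothetical.

```shell
#!/bin/bash
set -eu

# Hypothetical scratch directory standing in for the current folder
workdir=$(mktemp -d)
cd "$workdir"
mkdir HelloWorld
echo "# Intro" > HelloWorld/Intro.md

tar -cf file.tar HelloWorld       # initial archive
echo "test data" >> test.txt      # new file that is not yet archived
tar -rf file.tar test.txt         # -r appends it to the end of file.tar

# The appended file is the last member in the listing
last_member=$(tar -tf file.tar | tail -n 1)
echo "$last_member"
```

Note that -r only works on uncompressed archives; a .tar.gz file has to be decompressed before files can be appended to it.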
5. Create a single archive file for multiple file systems
Multiple directories or files can be stored in one archive file by tar.
Specify all of them as a space-separated list after filename.tar in the tar command.
Implementation:
Here, two directories, HelloWorld/ and Test/, are archived into a single archive file.
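A command of this shape can be sketched as follows. The directory names HelloWorld and Test come from the text; the archive name both.tar, the scratch directory and the sample files are hypothetical, and GNU tar is assumed.

```shell
#!/bin/bash
set -eu

# Hypothetical scratch directory holding the two directories to archive
workdir=$(mktemp -d)
mkdir "$workdir/HelloWorld" "$workdir/Test"
echo "# Intro" > "$workdir/HelloWorld/Intro.md"
echo "sample" > "$workdir/Test/sample.txt"

# Every path listed after the archive name is added to the same archive
tar -cvf "$workdir/both.tar" -C "$workdir" HelloWorld Test

# Both directory trees show up in one listing
listing=$(tar -tf "$workdir/both.tar")
echo "$listing"
```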
6. Create a .tar.gz archive file
Tar can also create a compressed archive with the extension .tar.gz, which is a tar archive compressed with gzip (GNU zip).
Syntax:
tar -c[v]zf {Destination path}/{filename}.tar.gz {file system}
- -z: tells tar to compress the archive with gzip
Implementation:
tar cvzf file1.tar.gz /home/nishkarshraj/Desktop/HelloWorld
- It creates a file1.tar.gz archive file in current directory (Here root directory, /)
- The source file system to be compressed is HelloWorld in the /home/nishkarshraj/Desktop path.
- ls command displays that no archive file is present in the current folder (here, root).
- The tar command compresses the HelloWorld directory at the specified path into an archive file file1.tar.gz in the current path.
- ls command entered again displays the newly created .tar.gz archive file.
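The sketch below repeats the step in a temporary directory and checks that the result really is gzip data. It assumes GNU tar and the gzip command; the scratch directory and sample file are hypothetical.

```shell
#!/bin/bash
set -eu

# Hypothetical scratch directory with a sample HelloWorld directory
workdir=$(mktemp -d)
mkdir "$workdir/HelloWorld"
echo "# Intro" > "$workdir/HelloWorld/Intro.md"

# -z passes the archive through gzip while it is being written
tar -cvzf "$workdir/file1.tar.gz" -C "$workdir" HelloWorld

# gzip -t verifies the compressed data; tar -tzf lists the members inside
gzip -t "$workdir/file1.tar.gz"
listing=$(tar -tzf "$workdir/file1.tar.gz")
echo "$listing"
```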
7. Extract a .tar.gz archive file
Syntax:
tar -x[v]zf {path to}/{filename}.tar.gz
It extracts all files from the filename.tar.gz archive into the current directory, recreating the directory structure stored in the archive.
Implementation:
tar xvzf file1.tar.gz
- The ls command for /home/nishkarshraj/Desktop shows that the HelloWorld directory does not exist in the path.
- tar xvzf on the file1.tar.gz archive file extracts the folder into /home/nishkarshraj/Desktop path.
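A round trip through a .tar.gz file can be sketched as follows, again assuming GNU tar. The src and dest directories and the file content are hypothetical.

```shell
#!/bin/bash
set -eu

# Hypothetical source and destination directories
src=$(mktemp -d)
dest=$(mktemp -d)
mkdir "$src/HelloWorld"
echo "hello" > "$src/HelloWorld/Intro.md"

# Create the compressed archive to extract from
tar -czf "$src/file1.tar.gz" -C "$src" HelloWorld

# -x extracts, -z decompresses with gzip on the way, -C picks the target directory
tar -xvzf "$src/file1.tar.gz" -C "$dest"

ls "$dest/HelloWorld"
```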
8. Shell script to understand .tar versus .tar.gz compressed files
Generally speaking, gzip-compressed archives with the extension .tar.gz are smaller than plain .tar archives, which are not compressed at all. The saving depends on the data, though: files that are already compressed shrink very little.
Here, a shell script archives the same file system (HelloWorld directory) once as a plain .tar file and once with gzip compression, and their respective sizes are displayed using the du (disk usage) command.
Code:
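#!/bin/bash
# Plain archive (no compression)
tar cvf file1.tar /home/nishkarshraj/Desktop/HelloWorld
# Show the disk usage of the plain archive
du -sh file1.tar
# gzip-compressed archive
tar cvzf file2.tar.gz /home/nishkarshraj/Desktop/HelloWorld
# Show the disk usage of the compressed archive
du -sh file2.tar.gz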
Output:
Here,
- Disk usage of .tar file: 12 KB
- Disk usage of .tar.gz file: 4 KB
Thus, for this directory the .tar.gz file takes about a third of the disk space of the .tar file.
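The comparison can also be made in exact bytes rather than disk blocks. The following is a minimal sketch, assuming GNU tar; it swaps du for wc -c, and the scratch directory and the repetitive sample file are hypothetical stand-ins for the HelloWorld directory.

```shell
#!/bin/bash
set -eu

# Hypothetical scratch directory with a highly repetitive sample file
workdir=$(mktemp -d)
mkdir "$workdir/HelloWorld"
yes "Hello, World!" | head -n 2000 > "$workdir/HelloWorld/test.txt"

# Plain archive and gzip-compressed archive of the same directory
tar -cf "$workdir/file1.tar" -C "$workdir" HelloWorld
tar -czf "$workdir/file2.tar.gz" -C "$workdir" HelloWorld

# wc -c reports the exact size in bytes
tar_size=$(wc -c < "$workdir/file1.tar")
gz_size=$(wc -c < "$workdir/file2.tar.gz")
echo "file1.tar:    $tar_size bytes"
echo "file2.tar.gz: $gz_size bytes"
```

Repetitive text like this compresses very well; a directory of images or other already compressed files would show a much smaller gap.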
9. Check diff between archive file and source file system
Tar tool can be used to check the difference between the .tar archive file and the source file system.
Syntax:
tar -dvf {filename}.tar {path of source folder}
- -d option is used to see the diff
Let's create an archive file called file.tar from the same HelloWorld directory and then change the directory to check the diff.
Creation of file.tar for HelloWorld directory
Modify the HelloWorld directory
Explanation of the Image
HelloWorld directory consists of two files: Intro.md and test.txt
- We see the diff between the file system and the archive file.
Since no modifications have been made, diff works as a listing of the files in the archive.
- We modify the test.txt file by adding the "mod" string at the end of the file.
- We see the diff again and the tar command lists the files in the archive along with the messages:
mod time differs: Modification time of test.txt in archive file and in the file system differs.
size differs: Size of the test.txt file in archive differs from that in the file system.
Deleting files from the file system
Check the diff if files are deleted from the file system.
Explanation of the image:
- We remove the test.txt file using the rm command.
- We see the diff and it lists the content of the archive file with the following message after test.txt:
Warning: Cannot stat: No such file or directory
This message signifies that the test.txt file in the archive no longer has a matching file in the file system, which means the file has been deleted from the file system.
Creation of new files in filesystem
We check the diff of the archive and the file system after creating a new file that did not exist when the archive was made and is therefore not present in it.
Explanation of the image:
- We create a new file called new.txt containing a string new
- We see the diff but it does not show any output related to new.txt because there was no such file on creation of archive files.
Conclusion:
tar's diff mode checks each file in the archive against the original file system. It reports changes in modification time and size and flags files that have been deleted, but it does not report files created in the file system after the archive was made.
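This behaviour can be checked with a short script. The following is a minimal sketch, assuming GNU tar with English messages (LC_ALL=C); the scratch directory is hypothetical, while the file names follow the example above.

```shell
#!/bin/bash
set -u
export LC_ALL=C   # keep tar's messages in English

# Hypothetical scratch directory with the two files from the example
workdir=$(mktemp -d)
mkdir "$workdir/HelloWorld"
echo "# Intro" > "$workdir/HelloWorld/Intro.md"
echo "test data" > "$workdir/HelloWorld/test.txt"
tar -cf "$workdir/file.tar" -C "$workdir" HelloWorld

echo "mod" >> "$workdir/HelloWorld/test.txt"   # modify an archived file
echo "new" > "$workdir/HelloWorld/new.txt"     # create a file the archive has never seen

# tar -d exits with a non-zero status when it finds differences, hence "|| true"
report=$(tar -df "$workdir/file.tar" -C "$workdir" 2>&1 || true)
echo "$report"
```

The report mentions the changed size of test.txt but says nothing about new.txt. Whether a modification time difference is also reported depends on whether the edit happened in a later second than the one recorded in the archive.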
10. Update the archive file with modified file system
Tar can be used to update the archive file after the source file system has changed, so that new and modified files are added to it.
Syntax:
tar -uf {filename.tar} {path to source file system}
or
tar --update -f {filename.tar} {path to source file system}
Explanation:
Here, we continue from the last step of the diff section: new.txt has been created and test.txt has been deleted since the archive was made.
- We update the tar file with the current state of the file system.
- On seeing the diff again:
- new.txt file is added.
- test.txt file is not removed from the archive even though it is deleted from the main file system.
Conclusion:
The update option of tar appends a new copy of files that have been modified and adds newly created files, but it does not remove archived files that have been deleted from the source file system.
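The same conclusion can be reproduced with the sketch below, which assumes GNU tar. The scratch directory is hypothetical; the file names follow the example above.

```shell
#!/bin/bash
set -eu

# Hypothetical scratch directory with the two files from the example
workdir=$(mktemp -d)
mkdir "$workdir/HelloWorld"
echo "# Intro" > "$workdir/HelloWorld/Intro.md"
echo "test data" > "$workdir/HelloWorld/test.txt"
tar -cf "$workdir/file.tar" -C "$workdir" HelloWorld

echo "new" > "$workdir/HelloWorld/new.txt"   # created after the archive was made
rm "$workdir/HelloWorld/test.txt"            # deleted after the archive was made

# -u appends files that are missing from the archive or newer than the stored copy
tar -uf "$workdir/file.tar" -C "$workdir" HelloWorld

# new.txt has been added, and test.txt is still in the archive
listing=$(tar -tf "$workdir/file.tar")
echo "$listing"
```

To drop test.txt from the archive as well, the --delete option from the options list would be needed.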