Different ways to unzip a file in Python


When working with archived files, there are several ways to unzip them. In this article we will learn how to extract a single file or multiple files from a zip file, how to unzip the whole archive, and so on.

Different ways to unzip a file:

  1. Extracting all the members (content) of an archived file into the current directory.
  2. Extracting all the members (content) of an archived file into another directory.
  3. Extracting only some specific members (content) of an archived file based on different conditions.

Module used:

zipfile: ZIP is one of the most common archive and compression formats. The zipfile module provides the tools to create, read, write, append to, extract and list a ZIP file.

Opening an archive file in read mode

We are going to work with the ZIP file format, so we first need to learn how to open a zip file in read mode in Python. We will use ZipFile() from the zipfile module to do this.

ZipFile(): This class is used to open a zip file in read, write or append mode, or to create a new zip file. It returns a ZipFile object for the specified file, which we then use to read, write or append to the archive.

Syntax: ZipFile(file, mode='r', compression=ZIP_STORED, allowZip64=True)

Parameters

  • file: This argument should be a filename or the path to a zip file, a file object, or a path object.

  • mode: This argument defines the mode in which the zip file is accessed. The default is 'r'.
    There are four modes:

      1. r: This mode is used to read an existing file.
      2. w: This mode is used to truncate and write a new file.
      3. a: This mode is used to append to an existing file.
      4. x: This mode is used to exclusively create a new file and write to it. It raises FileExistsError if the file already exists.

  • compression: This argument sets the compression method used when writing the zip file. The default is ZIP_STORED. It can have the following values:

      1. ZIP_STORED (no compression)
      2. ZIP_DEFLATED (requires the zlib module)
      3. ZIP_BZIP2 (requires the bz2 module)
      4. ZIP_LZMA (requires the lzma module)

  • allowZip64: This argument is a boolean value, True by default. When it is True, zipfile creates archives that use the ZIP64 extensions when the zip file is larger than 4 GiB. When it is False, zipfile raises an exception if the archive would require the ZIP64 extensions.
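
As a minimal sketch of the parameters above, the following snippet first creates a small throwaway archive in write mode and then opens it in read mode to list its members. The archive name sample.zip and the member names are hypothetical and chosen only for illustration.

```python
import os
import tempfile
import zipfile

# Build a small throwaway archive so the example is self-contained.
# "sample.zip" and the member names are hypothetical.
workdir = tempfile.mkdtemp()
archive_path = os.path.join(workdir, "sample.zip")
with zipfile.ZipFile(archive_path, mode="w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("file1.txt", "first file")
    zf.writestr("metadata.txt", "some metadata")

# Open the archive in read mode; the with block closes it automatically.
with zipfile.ZipFile(archive_path, mode="r") as zf:
    names = zf.namelist()  # names of all members, in archive order

print(names)
```

Using ZipFile as a context manager means the archive is closed even if an error occurs while it is open.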

Extracting all the members (content) of an archived file in the current directory

We can use the zipfile module to extract all the content of an archived file. Using the extractall() method of a ZipFile object, we can extract all the content of the archive at once.

extractall():

This method takes the path where the archive needs to be extracted, the members (content) that need to be extracted, and the password of the archive if it is encrypted. If the members argument is left as None, it extracts all the members of the archive.

Syntax: extractall(path=None, members=None, pwd=None)

Parameters:

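path: This argument is the path to the directory where the contents of the zip file should be extracted. If it is not specified, the contents are extracted into the current working directory.

members: This argument should be a list containing the names of the members that we want to extract from the zip file. The names must be a subset of the names returned by namelist(). If it is not specified, all members are extracted.

pwd: This argument is the password of the zip file if the file is encrypted. In Python 3 it must be a bytes object, for example b"secret", not a str.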

Example program:

import zipfile
import os

# Print the items in the current working directory before extraction.
print("List of items in the directory before extraction")
for item in os.listdir(path='.'):
    print(item)
print("\n\n")

# Open the zip file in read mode and extract all of its content
# into the current working directory.
with zipfile.ZipFile("logbackup.zip", "r") as zf:
    zf.extractall()

# Print the items in the current working directory after extraction.
print("List of items in the directory after extraction")
for item in os.listdir(path='.'):
    print(item)

Output:

List of items in the directory before extraction
a
b
c
file6
logbackup.zip
zptry.py

List of items in the directory after extraction
a
b
c
file1.txt
file4
file5
file6
logbackup.zip
metadata.txt
zptry.py

Extracting all the members (content) of an archived file in another directory

Now that we know how to extract the contents of an archived file into the current directory, we can learn how to extract the contents into another directory. We might need this when we want the files to be organized into a new directory or extracted into a particular directory. We will use the extractall() method with the path argument set to the directory where we need to extract the content of the zip file. If that directory does not exist, it is created.

Example program:

import zipfile
import os

# Open the zip file in read mode and extract all of its content
# into the directory 'trial'.
with zipfile.ZipFile("logbackup.zip", "r") as zf:
    zf.extractall('trial')

print("List of items in the specified directory after extraction")
pth = os.path.join(os.getcwd(), 'trial')
for item in os.listdir(path=pth):
    print(item)

Output:

List of items in the specified directory after extraction
file1.txt
file4
file5
metadata.txt

Extracting only some specific members (content) of an archived file based on different conditions

1. Extracting a list of files from all the files in the archive

We can extract only the files we want by passing a list of their names. For this article we assume we have a zip file which contains 12 files named january.txt, february.txt and so on up to december.txt, and we need to extract only january.txt, july.txt and august.txt. We create a list containing the names of the required files and pass this list as the members argument of the extractall() method.

Example program:

import zipfile
import os

# Names of the members we want to extract from the archive.
flist = ['january.txt', 'july.txt', 'august.txt']

# Extract only the listed members into the directory 'months'.
with zipfile.ZipFile("logbackup.zip", "r") as zf:
    zf.extractall('months', members=flist)

print("List of items in the specified directory after extraction")
pth = os.path.join(os.getcwd(), 'months')
for item in os.listdir(path=pth):
    print(item)

Output:

List of items in the specified directory after extraction
august.txt
january.txt
july.txt
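
When only one file is needed, the extract() method of a ZipFile object can be used instead of extractall(). The following is a minimal sketch that builds its own small archive first; the archive name months.zip, the directory name single and the member names are hypothetical.

```python
import os
import tempfile
import zipfile

# Build a small throwaway archive so the example is self-contained.
# "months.zip" and its members are hypothetical.
workdir = tempfile.mkdtemp()
archive_path = os.path.join(workdir, "months.zip")
with zipfile.ZipFile(archive_path, "w") as zf:
    for month in ("january", "february", "march"):
        zf.writestr(month + ".txt", "log for " + month)

# Extract a single member into the directory 'single'.
# extract() returns the path of the file it created.
target_dir = os.path.join(workdir, "single")
with zipfile.ZipFile(archive_path, "r") as zf:
    extracted_path = zf.extract("february.txt", path=target_dir)

extracted_items = os.listdir(target_dir)
print(extracted_items)
```

Only february.txt is written to the target directory; the other members stay in the archive.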

There are various conditions we can use to select the members to extract, such as the size of the file, the date and time of last modification, whether the file name starts with 'XXX', whether the member is a directory or not, etc.
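
One possible way to apply such conditions is to inspect the ZipInfo objects returned by infolist() and pass the matching names to extractall(). The sketch below selects members that are not directories, whose names start with a given prefix and whose uncompressed size is below a limit. The archive name mixed.zip, the prefix log_ and the 1000-byte limit are assumptions made for illustration. The last modification time is available in the same way through the date_time attribute of each ZipInfo object.

```python
import os
import tempfile
import zipfile

# Build a small throwaway archive so the example is self-contained.
# "mixed.zip", the member names and the sizes are hypothetical.
workdir = tempfile.mkdtemp()
archive_path = os.path.join(workdir, "mixed.zip")
with zipfile.ZipFile(archive_path, "w") as zf:
    zf.writestr("log_small.txt", "x" * 10)
    zf.writestr("log_large.txt", "x" * 5000)
    zf.writestr("notes.txt", "x" * 20)

target_dir = os.path.join(workdir, "filtered")
with zipfile.ZipFile(archive_path, "r") as zf:
    # Keep regular files whose name starts with "log_" and whose
    # uncompressed size is under 1000 bytes.
    selected = [
        info.filename
        for info in zf.infolist()
        if not info.is_dir()
        and info.filename.startswith("log_")
        and info.file_size < 1000
    ]
    zf.extractall(target_dir, members=selected)

print(selected)
```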

With this article at OpenGenus, you should now have a good idea of the different ways to unzip a file in Python.