Web application to upload a Word document and convert it to a PDF file


Reading time: 35 minutes | Coding time: 30 minutes

In this article you will learn how to develop a web application in Node.js where you can upload a Word document and download a PDF version of it. Let's get started.

Before we jump in, let us go through the overall flow that you will be implementing:

1-First, set up the working environment: create the working directory and install the required packages.
2-In the second step, you will create the root page (index.html), which contains a form to upload a file and submit it for conversion.
3-In the third step, we will start work on our main file, app.js, which defines all the functionality of the web application.
4-Once app.js is under way, we will add further pages to make the application more interactive: a /download and a /thankyou page, which provide a download link and an after-download message respectively, and we will link all these pages to app.js.
5-Finally, we will define a function to delete the temporary files created on the server during conversion, which completes the web application that converts docx files to pdf.


Set up the environment for the web-app

First, make a directory where all your files and folders will reside.
Open a terminal (or cmd) and run the following commands. They create the project folder, create an uploads folder that temporarily stores the files, initialise package.json (accept the defaults that npm init proposes) and open the folder in Visual Studio Code:

mkdir online_pdf_converter
cd online_pdf_converter
mkdir uploads
npm init
code .

These commands set up your project and open it in Visual Studio Code.
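The uploads folder has to exist before the first file arrives. As an alternative to creating it by hand, the application could create it at startup. This is only a minimal sketch; the helper name ensureUploadDir is hypothetical and not part of the original project.

```javascript
const fs = require('fs');
const path = require('path');

// Create the upload directory if it is missing and return its full path.
// With `recursive: true` the call does nothing when the directory exists.
function ensureUploadDir(baseDir, dirName = 'uploads') {
  const uploadDir = path.join(baseDir, dirName);
  fs.mkdirSync(uploadDir, { recursive: true });
  return uploadDir;
}

// In app.js this could be called once before the server starts:
// const uploadDir = ensureUploadDir(__dirname);
```

Calling it a second time is harmless, so it is safe to run on every start.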

Install the required packages

Now install the following packages/modules that will be required for the web application:

npm install express --save
npm install docx-pdf --save
npm install express-fileupload --save

Let us try to understand what these packages will be used for:
1-> Express will be used for routing pages and creating the server where the web application will run.
2-> docx-pdf is an npm package that converts a docx file to pdf (you can use another package if you prefer).
3-> express-fileupload is an npm package that handles uploading files to the server for conversion.
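Since express-fileupload accepts whatever the browser sends, it can help to cap the upload size. The sketch below only builds an options object; the 10 MB value and the name uploadOptions are illustrative assumptions, while limits and abortOnLimit are options described in the express-fileupload documentation.

```javascript
// Illustrative values only; tune them for your own deployment.
const uploadOptions = {
  // Largest accepted file, in bytes (10 MB here).
  limits: { fileSize: 10 * 1024 * 1024 },
  // Answer with HTTP 413 instead of accepting a truncated file.
  abortOnLimit: true,
};

// In app.js the object would be passed to the middleware, for example:
// app.use(require('express-fileupload')(uploadOptions));
```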

After installation of these packages/modules the package.json should look like this:

{
  "name": "online_pdf_converter",
  "version": "1.0.0",
  "description": "An online application to convert word document to pdf",
  "main": "app.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "keywords": [
    "Nodejs",
    "express"
  ],
  "author": "Sharmaprateek52",
  "license": "ISC",
  "dependencies": {
    "docx-pdf": "0.0.1",
    "express": "^4.17.1",
    "express-fileupload": "^1.1.6"
  }
}

Let's code

Create the root page

Now, make an HTML file which will serve as our root page. You can name it index.html or anything you like, as long as you make the matching change in the app.js file.
In index.html we will define a form with a file input and a submit button, which are used to upload the .docx file and start the conversion process respectively.
index.html should look like this:

<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <title>PDF converter</title>
  </head>
  <body>
    <div class="container">
    <form class="fileupload" action="upload" method="post" enctype="multipart/form-data">
      <h1>Upload File Here</h1>
      <input type="file" name="upfile" value="">
      <input type="submit"/>
    </form>
    </div>
  </body>
</html>

When you open index.html in a browser, it shows the "Upload File Here" heading with a file picker and a submit button.

Create app.js file

Now create an app.js file, which will be the main file of the application. Make sure its name matches the "main" entry in package.json that you defined during the npm init process.

In app.js, first include all the packages and modules that you will use. This is done with require, as shown below:

const express = require('express');
var app = express();
var upload = require('express-fileupload');
var docxConverter = require('docx-pdf');
var path = require('path');
var fs = require('fs');

With all the packages included in app.js, we will use them together with Node's built-in modules to link the pages and run the web application.
First we create some variables that are used during the process and call app.use(upload()) so that the application can use the express-fileupload package. Then we link index.html to the root page ('/') and define the POST handler for /upload, which stores the .docx file on the server and calls the converter to turn the .docx file into a .pdf file.

//Variables
const extend_pdf = '.pdf'
const extend_docx = '.docx'
var down_name
//use express-fileupload
app.use(upload());

//This will route the root page with index.html
app.get('/',function(req,res){
  res.sendFile(__dirname+'/index.html');
})
//Post the upload file
app.post('/upload',function(req,res){
  console.log(req.files);
  //req.files is empty when the form was submitted without a file
  if(req.files && req.files.upfile){
    var file = req.files.upfile,
      //basename() drops any directory part sent by the client
      name = path.basename(file.name);
    //Only .docx files can be converted
    if(path.extname(name).toLowerCase() !== extend_docx){
      return res.status(400).send("Please upload a .docx file !");
    }
    //Name of the file without its extension --ex test,example
    const First_name = path.parse(name).name;
    //Name to download the file
    down_name = First_name;
    //Path where the uploaded .docx will be stored
    var initialPath = path.join(__dirname, 'uploads', `${First_name}${extend_docx}`);
    //Path where the converted pdf will be placed
    var upload_path = path.join(__dirname, 'uploads', `${First_name}${extend_pdf}`);
    //.mv function will be used to move the uploaded file to the
    //upload folder temporarily
    file.mv(initialPath,function(err){
      if(err){
        console.log(err);
        return res.status(500).send("Could not store the uploaded file.");
      }
      //Converter to convert docx to pdf -->docx-pdf is used
      //If you want you can use any other converter
      //For example -- libreoffice-convert or --awesome-unoconv
      docxConverter(initialPath,upload_path,function(err,result){
        if(err){
          console.log(err);
          return res.status(500).send("Conversion failed.");
        }
        console.log('result'+result);
        res.sendFile(__dirname+'/down_html.html')
      });
    });
  }else{
    res.send("No File selected !");
  }
});
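The path handling in the upload handler can also be pulled into a small helper that is easy to check on its own. The following is a sketch under assumptions: buildConversionPaths and DOCX_EXT are hypothetical names, and the helper simply rejects any name that does not end in .docx.

```javascript
const path = require('path');

const DOCX_EXT = '.docx';

// Build the source and target paths for one upload.
// Returns null when the file name does not end in .docx.
function buildConversionPaths(uploadDir, originalName) {
  // basename() drops any directory part a client may have put in the name.
  const safeName = path.basename(originalName);
  const { name: baseName, ext } = path.parse(safeName);
  if (ext.toLowerCase() !== DOCX_EXT || baseName === '') {
    return null;
  }
  return {
    baseName,
    docxPath: path.join(uploadDir, baseName + DOCX_EXT),
    pdfPath: path.join(uploadDir, baseName + '.pdf'),
  };
}
```

Unlike splitting on the first dot, path.parse keeps names such as report.final.docx intact, so the stored file and the file handed to the converter always share the same base name.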

Create and add additional html files

Now, add a down_html.html file to your working directory (you can name it anything you want). This page contains a Download button, and a small jQuery click handler opens the /download route to fetch the converted file.

<html>
    <head>
        <script src = "https://ajax.googleapis.com/ajax/libs/jquery/3.4.1/jquery.min.js"></script>
    </head>
    <body>
        <h1>Thanks for using the converter. Download your converted file.</h1>
        <button id = "btn_download">Download</button>
        <script type="text/javascript">
        $("#btn_download").click(function(){
            window.open('/download');
        })
        </script>
    </body>
</html>

In the browser, this page shows the heading followed by a Download button.

After adding the down_html.html file, let us also add a thank-you page that can be shown after the pdf file has been downloaded. Let's name it thankyou.html. This file contains nothing but a heading that says Thank You for the download.

<html>
    <head>
        <meta charset="utf-8">
        <title>Thank You</title>
    </head>
    <body>
       <h1>Thank You for the download</h1>  
    </body>
</html>

Link all the files with app.js and add the delete function

After creating all the required html files, link them in app.js. The process is similar to linking index.html. We will also add code that deletes the temporary files stored on the server once the user has downloaded the converted pdf file.

app.get('/download', (req,res) =>{
  //Nothing to download until a file has been uploaded and converted
  if(!down_name){
    return res.status(404).send("No converted file available.");
  }
  const pdf_path = path.join(__dirname, 'uploads', `${down_name}${extend_pdf}`);
  const docx_path = path.join(__dirname, 'uploads', `${down_name}${extend_docx}`);
  //This will be used to download the converted file
  res.download(pdf_path,`${down_name}${extend_pdf}`,(err) =>{
    if(err){
      console.error(err);
      //Part of the response may already be sent, so only answer if it is not
      if(!res.headersSent){
        res.status(500).send("Could not download the file.");
      }
    }else{
      //Delete the files from uploads directory after the use
      try {
        fs.unlinkSync(docx_path)
        fs.unlinkSync(pdf_path)
        down_name = undefined;
        console.log('Files deleted');
      } catch(err) {
        console.error(err)
      }
    }
  })
})
//linking of thankyou page
app.get('/thankyou',(req,res) => {
    res.sendFile(__dirname+'/thankyou.html')
})
//Starting the server at port 3000

app.listen(3000,() => {
    console.log("Server Started at port 3000...");
})
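The cleanup step in the /download route can be written as a standalone helper so that one missing file does not stop the other from being removed. This is a minimal sketch; removeTempFiles is a hypothetical name, and it assumes that a file which is already gone (error code ENOENT) should be ignored.

```javascript
const fs = require('fs');

// Delete every path in `paths`; a file that no longer exists is skipped.
// Returns the list of paths that were actually removed.
function removeTempFiles(paths) {
  const removed = [];
  for (const filePath of paths) {
    try {
      fs.unlinkSync(filePath);
      removed.push(filePath);
    } catch (err) {
      // Anything other than "file not found" is unexpected, so rethrow it.
      if (err.code !== 'ENOENT') throw err;
    }
  }
  return removed;
}
```

Inside the download callback it could be called with the docx and pdf paths of the finished conversion.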

Summary

When you upload a valid file and press the submit button, the file is stored on the server (in the uploads folder) and converted to pdf, and the converted pdf is kept there temporarily. When you download the converted pdf, both the docx and the pdf file are deleted from the uploads folder by the code we added to app.js.
That's it. Now run your program:

node app.js

This will run your application.
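One limitation of this flow is that app.js remembers only the most recent upload in the single variable down_name, so two people converting files at the same time could overwrite each other's value. A possible way around this, sketched here under assumptions, is to hand out a random token per upload. The names conversions, registerConversion and takeConversion are hypothetical, and crypto.randomUUID requires a reasonably recent Node.js release.

```javascript
const crypto = require('crypto');

// Maps a random token to the files that belong to one upload.
const conversions = new Map();

// Remember the files of one conversion and return the token that identifies it.
function registerConversion(docxPath, pdfPath) {
  const token = crypto.randomUUID();
  conversions.set(token, { docxPath, pdfPath });
  return token;
}

// Look up and forget one conversion; returns undefined for unknown tokens.
function takeConversion(token) {
  const entry = conversions.get(token);
  conversions.delete(token);
  return entry;
}
```

The upload handler would then return the token to the browser, and the download route could accept it (for example as a URL parameter) instead of relying on a shared variable.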
