Converting a PDF file to an HTML canvas with text selection using PDF.js

In this tutorial, we will display a PDF file inside an HTML canvas element while keeping the text on the page selectable.

1. Create a new folder called testpdfjsselection (referred to below as the “root folder”).
2. Inside this folder, run git clone https://github.com/mozilla/pdf.js.git to download the latest version of the PDF.js library.
3. Put any PDF file you like inside the root folder (the example below uses “oasis.pdf”).
4. In the root folder, create a file called index.html with the following code:

5. Inside the root folder, start a local PHP server with php -S localhost:8080.
6. Open http://localhost:8080 in your browser.
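
The index.html listing that step 4 refers to is not reproduced here. As a rough stand-in, here is a minimal sketch of what the page script could look like. It is not the original code. It assumes a pdf.js 4.x build that exposes a global pdfjsLib with the TextLayer class, and a worker source that is already configured. Older builds used TextLayerBuilder from the web/ folder instead. The function name renderFirstPage and the element ids “pdf-canvas” and “text-layer” are made up for illustration.

```javascript
// Sketch only: assumes a pdf.js 4.x build exposing `pdfjsLib`, with
// pdfjsLib.GlobalWorkerOptions.workerSrc already set by the page.
// `renderFirstPage` and its parameters are illustrative names.
async function renderFirstPage(pdfjsLib, url, canvas, textLayerDiv, scale = 1.5) {
  const pdf = await pdfjsLib.getDocument(url).promise;
  const page = await pdf.getPage(1); // page numbers are 1-based
  const viewport = page.getViewport({ scale });

  // Size the canvas to the scaled page, then paint the page onto it.
  canvas.width = viewport.width;
  canvas.height = viewport.height;
  await page.render({ canvasContext: canvas.getContext("2d"), viewport }).promise;

  // Give the text layer the same size so it can be stacked over the canvas.
  textLayerDiv.style.width = viewport.width + "px";
  textLayerDiv.style.height = viewport.height + "px";
  const textLayer = new pdfjsLib.TextLayer({
    textContentSource: await page.getTextContent(),
    container: textLayerDiv,
    viewport,
  });
  await textLayer.render();
  return viewport;
}

// Browser entry point; skipped when no DOM or no pdfjsLib global is present.
if (typeof document !== "undefined" && typeof pdfjsLib !== "undefined") {
  renderFirstPage(
    pdfjsLib,
    "oasis.pdf",
    document.getElementById("pdf-canvas"),
    document.getElementById("text-layer")
  ).catch(console.error);
}
```

The text layer div also needs the stylesheet shipped with your pdf.js build and must be positioned over the canvas with CSS. Those details differ between versions, so check them against the build you cloned.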

You should see the following output:

You should see the first page of your PDF document rendered in the canvas element. The text is also selectable, because absolutely positioned div tags containing plain text are placed on top of every text string. From here you can add an annotation feature using the fabric.js library, or create a neat flipbook using turn.js. You can then convert all these HTML pieces into a single PDF file using jsPDF.
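
To make the “absolutely positioned div” idea more concrete, here is a minimal sketch of how the position of one such div could be computed from a text item’s transform matrix. It is a simplification, not what PDF.js does internally. It assumes an unrotated page and unrotated text, and it approximates the font ascent by the full font height, where PDF.js uses real font metrics. The helper names multiply and textItemStyle are made up for illustration. The six-number transform layout matches the text items returned by getTextContent().

```javascript
// Multiply two PDF-style affine matrices [a, b, c, d, e, f]:
// the result applies m2 first, then m1.
function multiply(m1, m2) {
  return [
    m1[0] * m2[0] + m1[2] * m2[1],
    m1[1] * m2[0] + m1[3] * m2[1],
    m1[0] * m2[2] + m1[2] * m2[3],
    m1[1] * m2[2] + m1[3] * m2[3],
    m1[0] * m2[4] + m1[2] * m2[5] + m1[4],
    m1[1] * m2[4] + m1[3] * m2[5] + m1[5],
  ];
}

// CSS for one text div. Assumptions: unrotated page and text, and the
// ascent approximated by the full font height.
function textItemStyle(itemTransform, pageHeightPt, scale) {
  // Viewport matrix: scale, and flip y because the PDF origin is bottom-left.
  const viewportTransform = [scale, 0, 0, -scale, 0, pageHeightPt * scale];
  const tx = multiply(viewportTransform, itemTransform);
  const fontHeight = Math.hypot(tx[2], tx[3]);
  return {
    position: "absolute",
    left: tx[4] + "px",
    top: tx[5] - fontHeight + "px", // baseline minus approximate ascent
    fontSize: fontHeight + "px",
  };
}

// 12 pt text whose baseline starts at (72, 720) on a 792 pt tall page, at 1.5x.
const style = textItemStyle([12, 0, 0, 12, 72, 720], 792, 1.5);
console.log(style.left, style.top, style.fontSize); // 108px 90px 18px
```

Because the div is placed from the same matrix that scales the canvas, the invisible text stays roughly aligned with the rendered glyphs when the scale changes.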

13 Comments

  1. Thomas

    Hi – Everything renders EXCEPT the selectable text layer. I’m also getting two console errors: “TypeError: pdfjsLib is undefined” from ui_utils.js (line 38, col 5) and “ReferenceError: TextLayerBuilder is not defined” from index.html (line 43). Any idea?

  2. Thomas

    Hi Vladimir – Thank you for your response. I actually had those files attached. It’s still not working for me (printing the above errors into the console). And I’ve been really hard pressed to find a working example online. Would you mind telling me what version of PDF.js you are using? The latest stable version is 1.4.20 – that’s what I’m using. Maybe I should try the beta version?

  3. Thomas

    I figured it out! The pdf.js library located here () is the very latest, which causes this example to break. I’ve searched high and low for a working example using the latest build and couldn’t find one (is it me, or is the documentation for PDF.js somewhat chaotic?). Anyway, I had to go and grab an earlier version of the plugin. I included it and although the text layer is a bit unaligned with the actual PDF, it works a charm.

    One last question: Is there a way to only write certain words to the text layer while keeping their position/screen coordinates? I know I can do this with additional jQuery, but was hoping for a baked solution.

  4. Koen

    @Thomas, can you provide a working example?
    I’ve tried almost all versions of PDF.js from 1.1.125 to 1.4.11, as well as older versions, but keep getting the same error you had in the first place.

  5. Damjan Pavlica

    Unfortunately, your example is not working for me. I have the same problem as Thomas – everything renders except the selectable text layer. I’m also getting two console errors: “pdfjsLib is undefined” and “TextLayerBuilder is not defined”.

    Can you please put your working example on JSFiddle? I have been looking for a working PDF.js text selection example for more than a year now.

    Thanks for your time.
    Damjan

  6. Demetry Pascal

    Doesn’t work for me. I see only a black rectangle; it never turns into the PDF.
    No errors except [404]: GET /favicon.ico – No such file or directory.
    pdf.js itself works fine: I built it and can use it, but not your script.
