PRO TIP - Validating and checking PDF content in synthetic

Julius_Loman
DynaMight Legend

Browser monitors allow you to handle quite complex scenarios when testing your application. Sometimes a test case needs to check that your application produces a PDF and validate its content. Luckily, there is an easy solution: Mozilla's PDF.js library for working with PDF files in JavaScript.

 

Browser monitors allow you to use JavaScript clickpath events. These events are simply pieces of JavaScript code executed in the browser that runs your browser monitors. The code can also talk back to the synthetic engine using the JavaScript Event API.
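
As a rough illustration of that talk-back pattern, here is a minimal sketch that wraps an asynchronous check and reports the outcome to the engine. It uses only the `api.startAsyncSyntheticEvent()`, `api.finish()` and `api.fail()` calls that appear in the example in this post. The names `runAsyncCheck` and `check` are made up for illustration, and `api` is passed in as a parameter only so the helper can be tried with a stub outside a monitor.

```javascript
// Sketch: run an asynchronous check and report the outcome to the synthetic engine.
// `runAsyncCheck` and `check` are hypothetical names, not part of the Event API.
// `api` is the object available inside a JavaScript event; it is a parameter here
// so the helper can also be exercised with a stub.
function runAsyncCheck(api, check) {
    // Tell the engine to wait until finish() or fail() is called
    api.startAsyncSyntheticEvent();
    return Promise.resolve()
        .then(check)
        .then(function() {
            api.finish();
        })
        .catch(function(error) {
            // Any thrown error or rejected promise marks the step as failed
            api.fail("Check failed (" + error + ")");
        });
}
```

Inside a monitor this could be called as `runAsyncCheck(api, function() { return somePromise; })`, where `somePromise` stands for whatever asynchronous work the step has to wait for.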

 

In the "almost self-explaining" 😁 code example below, we will:

  1. Start an asynchronous execution and tell the synthetic engine to wait for the results.
  2. Load the PDF.js library in the browser.
  3. Get the link to the PDF from the currently opened application page.
  4. Load the PDF with the PDF.js library.
  5. Open the first page.
  6. Finally, validate that the page contains the expected content.
  7. Mark the step as failed if something goes wrong, of course 😀.

 

 

 

 

api.startAsyncSyntheticEvent();

// Insert PDFJS Library into browser window
var pdfjsTag = document.createElement('script');
pdfjsTag.setAttribute('src', '//mozilla.github.io/pdf.js/build/pdf.mjs');
pdfjsTag.setAttribute('type', 'module');
pdfjsTag.setAttribute('id', 'pdfjslib');
document.head.appendChild(pdfjsTag);

// Wait until the PDFJS library is loaded
pdfjsTag.addEventListener('load', function() {
    api.info("PDFJS - Library loaded");
    // Get link to PDF document from the page
    var url = document.getElementById('pdflink').getAttribute('href');
    api.info("PDFJS - Downloading pdf document from " + url);
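    // The .mjs build exposes its exports as globalThis.pdfjsLib; the older
    // non-module build used window['pdfjs-dist/build/pdf'], kept as a fallback
    var pdfjsLib = window.pdfjsLib || window['pdfjs-dist/build/pdf'];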
    pdfjsLib.GlobalWorkerOptions.workerSrc = '//mozilla.github.io/pdf.js/build/pdf.worker.mjs';
    var loadingTask = pdfjsLib.getDocument(url);
    loadingTask.promise.then(function(pdf) {
        api.info('PDFJS - PDF file loaded');
        // Fetch the first page
        var pageNumber = 1;
        pdf.getPage(pageNumber).then(function(page) {
            api.info('PDFJS - Loaded page ' + pageNumber);
            page.getTextContent().then(function(textContent) {
                var textItems = textContent.items;
                var finalString = "";

                // Concatenate the string of the item to the final string
                for (var i = 0; i < textItems.length; i++) {
                    var item = textItems[i];
                    finalString += item.str + " ";
                }
                api.info('PDFJS - Page text:' + finalString);

                // Check if text is present in the PDF page
                if (finalString.includes("Hello")) {
                    api.finish();
                } else {
                    api.fail("Validation error - Text not present in PDF");
                }
            });
        });
    }, function(reason) {
        // PDF loading error
        console.error("Could not open PDF", reason);
        api.fail("Could not open PDF (" + reason + ")");
    });
});
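
In the example above, only the document-loading promise has a rejection handler, so a failure in `getPage` or `getTextContent` would likely leave the step waiting until it times out. Here is a minimal sketch of the same validation as a flat promise chain with a single `catch`. The names `checkPdfText` and `expectedText` are made up for illustration, and `api` and `pdfjsLib` are passed in as parameters so the function does not depend on globals.

```javascript
// Sketch: validate the text of the first PDF page with one error handler.
// `checkPdfText` and `expectedText` are hypothetical names; `api` and
// `pdfjsLib` are the same objects used in the example above.
function checkPdfText(api, pdfjsLib, url, expectedText) {
    return pdfjsLib.getDocument(url).promise
        .then(function(pdf) {
            // Fetch the first page
            return pdf.getPage(1);
        })
        .then(function(page) {
            return page.getTextContent();
        })
        .then(function(textContent) {
            // Join the text fragments of the page into a single string
            var pageText = textContent.items.map(function(item) {
                return item.str;
            }).join(" ");
            api.info("PDFJS - Page text:" + pageText);
            if (pageText.includes(expectedText)) {
                api.finish();
            } else {
                api.fail("Validation error - Text not present in PDF");
            }
        })
        .catch(function(reason) {
            // Reached when loading the document, the page, or the text fails
            api.fail("Could not validate PDF (" + reason + ")");
        });
}
```

Inside the `load` listener this could be called as `checkPdfText(api, pdfjsLib, url, "Hello")`, after `api.startAsyncSyntheticEvent()` has been called at the start of the event.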

 

 

 

 
If you run the test locally, you can see the information messages easily in your browser, which is useful for further troubleshooting and fine-tuning your test scenario:

Event 4
------------------------------------------------------------------------------------
2022-09-19 10:32:56 INFO PDFJS - Library loaded
2022-09-19 10:32:56 INFO PDFJS - Downloading pdf document from https://raw.githubusercontent.com/mozilla/pdf.js/ba2edeae/examples/learning/helloworld.pdf
2022-09-19 10:32:56 INFO PDFJS - PDF file loaded
2022-09-19 10:32:56 INFO PDFJS - Loaded page 1
2022-09-19 10:32:56 INFO PDFJS - Page text:Hello, world!
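
If even the first line, "PDFJS - Library loaded", never shows up, the script tag's `load` event has not fired, for example because the library URL returns a 404. Here is a minimal sketch of a loader that turns the tag's `load` and `error` events into a promise, so that case can be reported. The name `loadScript` is made up for illustration, and `doc` is the page's `document`, passed in as a parameter so the helper can be tried with a stub.

```javascript
// Sketch: insert a script tag and settle a promise on load or on error.
// `loadScript` is a hypothetical helper, not part of PDF.js or the Event API.
function loadScript(doc, src, type) {
    return new Promise(function(resolve, reject) {
        var tag = doc.createElement('script');
        tag.setAttribute('src', src);
        if (type) {
            tag.setAttribute('type', type);
        }
        tag.addEventListener('load', function() {
            resolve(tag);
        });
        // Fires when the script cannot be fetched, for example on a 404
        tag.addEventListener('error', function() {
            reject(new Error("Could not load script " + src));
        });
        doc.head.appendChild(tag);
    });
}
```

In a monitor, `loadScript(document, '//mozilla.github.io/pdf.js/build/pdf.mjs', 'module')` could be followed by a rejection handler that calls `api.fail(...)`, so a missing library fails the step with a clear message instead of running into a timeout.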


Happy PDF validation!

Certified Dynatrace Master | Alanata a.s., Slovakia, Dynatrace Master Partner
11 REPLIES

dannemca
DynaMight Guru

Wow, that's gold!!! Thank you for sharing this!!!

Site Reliability Engineer @ Kyndryl

AntonioSousa
DynaMight Guru

This is 5 stars! Thanks a lot Julius for sharing!

Antonio Sousa

Mizső
DynaMight Guru

Amazing! Really nice solution! Thank you for sharing! A lot more like this, please... 😉

Dynatrace Community RockStar 2024, Certified Dynatrace Professional

Babar_Qayyum
DynaMight Guru

Great Job @Julius_Loman 

xu_guo
Dynatrace Organizer

Hi @Julius_Loman ,

Thank you for sharing. 

We use this code to validate the content of a PDF. However, the PDF.js library URLs below are no longer available; they return HTTP 404.

//mozilla.github.io/pdf.js/build/pdf.js

//mozilla.github.io/pdf.js/build/pdf.worker.js

We found other PDF.js library URLs that are available, listed below. Could you please update the code? Thank you!

pdfjsTag.setAttribute('src', '//mozilla.github.io/pdf.js/build/pdf.mjs');
pdfjsTag.setAttribute('type', 'module');
pdfjsLib.GlobalWorkerOptions.workerSrc = '//mozilla.github.io/pdf.js/build/pdf.worker.mjs';

 

Sure, corrected.

Certified Dynatrace Master | Alanata a.s., Slovakia, Dynatrace Master Partner

ayyanar_chinna1
Visitor

Hi @Julius_Loman 

I have used this script in my browser clickpath monitor, but it is failing with the error "1601 - JS execution took too long". Could you help me fix this?


 

Have you tried running the test locally? 

Certified Dynatrace Master | Alanata a.s., Slovakia, Dynatrace Master Partner

I applied the script directly to my monitor, with the PDF link lookup updated:

// Get link to PDF document from the page
var url = document.querySelector("#\\36 030 > td:nth-child(5) > div > a").getAttribute("href");

@Julius_Loman FYI: when I tested the script you provided, I encountered the following error:

Uncaught TypeError: Cannot read properties of null (reading 'getAttribute')

This issue was also captured during synthetic execution.

Jennifer
Observer

Hi Julius,

How would you handle a PDF element that is retrieved from an "onclick" function? It seems the element can be found by id, but that does not yield the correct URL for downloading the PDF.

Code example:

<a href="javascript:void(0);" onclick="SubmitForm('rdPage.aspx?rdReport=Rpt_Submission.SubReports.SubmissionUpdate&amp;inpMonth=07&amp;inpYear=2024&amp;inpReportID=0002355&amp;LinkHref=True','_blank','false','',null,null);" id="actEdit_Row1"><span class="line ThemeBold black rowFontLarge">TDL-*****l)</span></a>

 

Any help would be greatly appreciated!
