From Previewing Files to Cracking Hashes


In August 2022, we completed a web application penetration test for a relatively new client. The scope was a pre-existing web application that allowed users to manage their calendars, plan events, upload documents and manage their accounts. The application had been tested by a previous penetration testing supplier and received
About Ruptura InfoSecurity
Ruptura InfoSecurity are a fully accredited and trusted UK based cyber security provider. You can rest assured that our technical cyber security expertise and level of service is second to none.


The web application allowed users to upload any type of file. These files could then be “previewed” within a separate part of the application. For most file types, this downloaded the file for the user to view on their own machine. However, the web application provided special functionality for Word documents by converting them to PDFs, then using the browser’s built-in PDF viewer to display them.

This was beneficial for most users of the application: .docx files were the most frequently uploaded file type, and converting them to PDF on the fly meant the user didn’t have to download the Word document and open it on their machine.

As someone who was testing this application, this immediately raised suspicions for me. Generating PDFs or converting existing content to PDFs is a difficult thing to do safely. A vulnerability commonly seen when generating PDFs is Local File Inclusion (LFI), where an attacker can embed the contents of a file from the server into the PDF. One example of this is a bug bounty report by Jonathan Bouman where he was able to chain a server-side XSS into LFI in one of IKEA’s web applications.

However, in this case, the web application’s DOCX to PDF generator wasn’t using an intermediate headless browser (as many PDF libraries do to convert HTML to PDF). To start attacking this, it was important to identify whether the web application implemented its own solution to convert Word documents to PDF or whether it was using an off-the-shelf library. Luckily, many of these libraries embed their name and version in the metadata of the PDFs they produce.

Proof of Concept

I uploaded a Word document containing “Hello World”, previewed it and downloaded the PDF that was embedded in the web application. I then ran exiftool, a command line utility which reads embedded metadata from a wide range of file types.
Bingo! The “Producer” field of the PDF metadata revealed that the web application was using Aspose.Words, an enterprise library for manipulating Word documents. At this point, I was somewhat disappointed though, since I had heard of Aspose’s products and knew they were generally fairly robust (and with licenses starting from $4,000, I would expect them to be…). Nevertheless, as an application tester, it is important to experiment with well-established libraries, just on the off-chance that they are implemented incorrectly or there is a bug in the library itself.
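The same check can be roughly approximated without exiftool. The sketch below is only an illustration: it assumes the PDF stores its Info dictionary uncompressed and writes “Producer” as a plain literal string, which is not guaranteed for every generator. The function name find_producer is hypothetical, and exiftool remains the more robust option.

```python
import re
from typing import Optional

# Matches an Info-dictionary entry such as: /Producer (Some Library 1.2)
# Assumption: the value is an unencrypted literal string in an
# uncompressed object; hex strings and object streams are not handled.
_PRODUCER_RE = re.compile(rb"/Producer\s*\(((?:[^()\\]|\\.)*)\)")


def find_producer(pdf_bytes: bytes) -> Optional[str]:
    """Return the raw Producer value from a PDF, or None if not found."""
    match = _PRODUCER_RE.search(pdf_bytes)
    if match is None:
        return None
    # latin-1 maps every byte to a character, so decoding never fails.
    return match.group(1).decode("latin-1")
```

Calling find_producer(open("preview.pdf", "rb").read()) on a downloaded preview (the filename is a placeholder) returns the producer string when the assumptions above hold, and None otherwise.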

Unfortunately exploiting this CVE was out of the question in the time we had for the engagement, but for someone else, this could be an easy remote code execution vulnerability.

I initially spent a lot of time looking for local file inclusion. In my many years of using Microsoft Word, I had never found a way to (nor needed to) embed a file’s contents into the body of a document.

Unfortunately, the trusty <iframe src="file:///etc/passwd"></iframe> wouldn’t work here as we’re dealing with Word documents, not HTML. So was there another way we could somehow embed files to get local file inclusion?

Well, it turns out, there is! Sort of… Word documents support a feature known as linking to an image. This allows you to insert a link to an image, which will change whenever the linked image changes, similar to a symlink on filesystems. Unfortunately, if the linked file isn’t an image, Word will just render a broken image icon. So at this point, it was only possible to read local images, which isn’t overly useful. So, what now?

Obtaining Hashes

Well, it was also possible to embed a link to a file on a remote SMB share. If you’ve tested Windows-based applications before, you probably know where this is going. When a user requests a file from an SMB share, one method of authentication is Net-NTLMv2. Essentially, the server issues a challenge and the client answers with a response derived from the user’s password hash (commonly called the Net-NTLMv2 hash). The service then verifies the response, checks the access controls and returns the requested file, service, etc.
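To make the challenge-response step concrete, here is a minimal sketch of how the NTLMv2 proof value is derived, following the published NTLM specification. It is an illustration, not a full protocol implementation: the function names are hypothetical, the client “blob” (timestamp, client nonce and target information) is treated as opaque bytes, and the NT hash is taken as an input rather than computed, because MD4 may not be available in every hashlib build.

```python
import hashlib
import hmac


def ntlmv2_key(nt_hash: bytes, user: str, domain: str) -> bytes:
    """Derive the NTLMv2 key from the NT hash and the account identity.

    nt_hash is MD4 over the UTF-16LE password (supplied by the caller).
    The username is uppercased; the domain is used as given.
    """
    identity = (user.upper() + domain).encode("utf-16-le")
    return hmac.new(nt_hash, identity, hashlib.md5).digest()


def nt_proof(nt_hash: bytes, user: str, domain: str,
             server_challenge: bytes, blob: bytes) -> bytes:
    """Compute the 16-byte proof the client sends back to the server."""
    key = ntlmv2_key(nt_hash, user, domain)
    return hmac.new(key, server_challenge + blob, hashlib.md5).digest()
```

Everything except the password is visible to whoever runs the server: the username, domain, challenge, blob and proof. That is why a captured exchange can be attacked offline by recomputing the proof for candidate passwords, and also why the captured value cannot simply be replayed as if it were the NT hash.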

Therefore, if we include a link to a remote SMB share, when Aspose fetches it to render the PDF, it also passes the Net-NTLMv2 hash along with the request! We could then run a malicious SMB server to dump the Net-NTLMv2 hashes. We used impacket’s, but another common choice is Responder. So, to create the document, we insert a link to an image that uses the UNC path for a remote SMB share.
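Under the hood, a .docx file is a ZIP archive, and a linked (rather than embedded) image is recorded as a relationship with TargetMode="External" in a .rels part. The sketch below lists those external targets so you can see exactly what a converter would be asked to fetch. The function name external_targets is hypothetical, and the standard-library XML parser is used for brevity; for untrusted uploads a hardened parser would be a safer choice.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Namespace used by OOXML package relationship parts (*.rels).
REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def external_targets(docx_bytes: bytes) -> List[Tuple[str, str]]:
    """Return (rels part name, target) for every external relationship."""
    found = []
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        for name in archive.namelist():
            if not name.endswith(".rels"):
                continue
            root = ET.fromstring(archive.read(name))
            for rel in root.iter(REL_NS + "Relationship"):
                # Embedded parts omit TargetMode; linked ones mark it External.
                if rel.get("TargetMode") == "External":
                    found.append((name, rel.get("Target", "")))
    return found
```

A document with an image linked to a UNC path would show up here with a target beginning with two backslashes, typically under word/_rels/document.xml.rels. Note that ordinary hyperlinks are also stored as external relationships, so not every result is something a renderer will fetch.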

I uploaded the Word document with the linked image to the web application and previewed it, causing Aspose to render it to PDF and…

The service account “adfs” connected to our SMB share, handily providing its Net-NTLMv2 hash too. And not only do we now have the Net-NTLMv2 hash of the service account of this development instance, this account is also joined to a production domain (we didn’t believe it either at first…). While we can’t directly pass this hash, we can fire up our password cracking rig and hopefully crack the password. This then opens up a whole new environment for us to compromise. In this case, it was deemed out of scope to go any further, but the potential impact and the PoC were clearly highlighted to the client at this stage.

The Fix

So what’s the solution for this? Well, it turns out that the Aspose documentation mentions this exact vulnerability and the remediation is quite simple: prevent Aspose from loading external resources when rendering documents. It seems like an oversight that Aspose leaves external resource loading on by default, so much so that even their penetration testing report recommended disabling it by default.

Maybe this feature should be disabled by default…
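As a defence-in-depth measure alongside the library setting (not a replacement for it), an upload pipeline could also refuse documents whose external targets point at another host. The sketch below is a hedged heuristic, not Aspose’s API: the function name is_remote_target and the list of schemes are assumptions, and a denylist like this is easy to get subtly wrong, so it should only ever back up the documented configuration.

```python
from urllib.parse import urlparse

# Assumption: these are the network schemes worth rejecting outright.
REMOTE_SCHEMES = ("http", "https", "ftp")


def is_remote_target(target: str) -> bool:
    """Heuristically decide whether a link target names a remote host."""
    normalized = target.strip().replace("\\", "/")
    # UNC paths such as \\host\share\a.png become //host/share/a.png.
    if normalized.startswith("//"):
        return True
    parsed = urlparse(normalized)
    if parsed.scheme in REMOTE_SCHEMES:
        return True
    if parsed.scheme == "file":
        # file://host/share/... names a host; file://///host/... is a
        # UNC path in URL form; file:///C:/... stays on the local machine.
        return (parsed.netloc not in ("", "localhost")
                or parsed.path.startswith("//"))
    return False
```

Relative targets such as media/image1.png and local drive paths are treated as non-remote here. Whether local file targets should be allowed at all is a separate decision for the application.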

Lessons Learnt

As a penetration tester, always review the metadata of files downloaded from a web application. It can sometimes provide information about the internal workings of the application and, if we’re lucky, leak information about the use of vulnerable libraries.

As a developer, don’t rely on hiding metadata for security. Make sure you keep your libraries updated and always review the security section of each library’s documentation. Avoid joining your development instance to a production environment, especially if your development instance is publicly accessible (and full of holes :D).
