How Hard Is It To Create a PDF Contract Note?

By Gary Kennedy
July 3, 2026

A small experiment with Apache PDFBox turned into a useful discovery: creating a clear, structured PDF record of a trade is not especially difficult. In an afternoon, I was able to generate a landscape “Transaction Statement” from existing trade email data, include provenance information, and produce a PDF whose extracted text was almost as orderly as a data dictionary. The experiment showed how modest the technical barrier is to producing a durable, human-readable and machine-readable contract note document.

In a previous article I wrote about data friction at EasyEquities, and in particular the oddity that the trade notification email appears to be the contract note.

That left me curious about a simple question:

How hard would it be to create a PDF contract note?

I have no experience writing PDF reports, and I assume there are many specialist tools for producing financial documents at scale, probably with polished templates. But we already use Apache PDFBox in our code base for extracting data from PDF files, so out of curiosity I wanted to see what it would be like to use the same library to write a PDF file.

It turned out to be fairly easy.

Not trivial, because PDFs are still PDFs, and you need to think carefully about layout, spacing, fonts, and text placement. But it was not a major project either. It was roughly an afternoon of work to produce a clean, useful contract note.

That makes EasyEquities' decision to use an email as the contract note all the more surprising.

Starting With Portrait

My first attempt used a portrait layout, mostly because that is how most contract notes seem to be presented.

It worked, but I did not find it especially pleasing. So I tried landscape instead.

That immediately felt better. The document became more compact, more balanced, and easier to scan. The data could sit side by side in convenient sections rather than being stacked into a long page.
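To give a feel for the layout arithmetic, here is a minimal sketch of splitting a landscape page into side-by-side columns. It assumes A4 paper and deliberately avoids any PDFBox dependency; the class name, margin, gutter, and column count are all illustrative and not taken from my actual code. In PDFBox, a landscape page can be created by swapping the portrait dimensions, for example with new PDRectangle(PDRectangle.A4.getHeight(), PDRectangle.A4.getWidth()).

```java
// Hypothetical helper: splits a landscape A4 page into equal columns.
// Units are PDF points (1/72 inch). Margin and gutter values are illustrative.
final class LandscapeColumns {
    static final float A4_WIDTH = 595.28f;   // portrait width in points (approximate)
    static final float A4_HEIGHT = 841.89f;  // portrait height in points (approximate)

    /** Width of one column after removing the outer margins and inner gutters. */
    static float columnWidth(float pageWidth, float margin, float gutter, int columns) {
        return (pageWidth - 2 * margin - gutter * (columns - 1)) / columns;
    }

    /** Left x coordinate of each column, ordered left to right. */
    static float[] columnOrigins(float pageWidth, float margin, float gutter, int columns) {
        float width = columnWidth(pageWidth, margin, gutter, columns);
        float[] xs = new float[columns];
        for (int i = 0; i < columns; i++) {
            xs[i] = margin + i * (width + gutter);
        }
        return xs;
    }

    public static void main(String[] args) {
        // Landscape swaps the two dimensions: the page width is the portrait height.
        float landscapeWidth = A4_HEIGHT;
        for (float x : columnOrigins(landscapeWidth, 36f, 18f, 3)) {
            System.out.printf("column starts at x = %.2f%n", x);
        }
    }
}
```

Each section of the statement can then be drawn starting at one of these x coordinates, which is what lets related data sit side by side instead of in one long stack.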

Example: A generated Babylon Transaction Statement

Example transaction record
Generated from an EasyEquities trade email.

Since my experiment is not an official contract note, I will refer to it as a transaction statement going forward.

A Useful Personal Document Store

The first benefit was immediate and practical.

For my own personal document store, I could produce a PDF transaction statement for every .eml trade email I have from EasyEquities. That gives me a much more practical collection of records than a folder full of email files.

The source email is still important, of course. It remains the original evidence received from the broker. But the generated PDF gives me something easier to read, easier to archive, and easier to use later.

Provenance Belongs In The Document

The second benefit was provenance.

Once I started generating the document myself, it became obvious that every imported transaction should carry provenance information. Not just for EasyEquities, but for all transaction imports.

Where did this transaction come from?

Was it extracted from an email, a PDF, a CSV file, or something else?

What was the source file name?

When was it imported?

This information is easy to overlook when thinking only about the transaction itself. But it becomes important later, especially when records need to be checked, reconciled, or explained.

Our transaction statement now holds this information.
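As a rough illustration, the three questions above map naturally onto a small value type. This is only a sketch under my own assumptions: the names Provenance, SourceType, typeOf, and forFile are hypothetical, and guessing the source type from the file extension is a simplification rather than a description of how our import actually works.

```java
import java.time.Instant;
import java.util.Locale;
import java.util.Objects;

// Illustrative names only; not taken from a real code base.
enum SourceType { EMAIL, PDF, CSV, OTHER }

/** Where an imported transaction came from. */
final class Provenance {
    final SourceType sourceType;  // email, PDF, CSV file, or something else
    final String sourceFileName;  // name of the file the transaction was read from
    final Instant importedAt;     // when the import happened

    Provenance(SourceType sourceType, String sourceFileName, Instant importedAt) {
        this.sourceType = Objects.requireNonNull(sourceType, "sourceType");
        this.sourceFileName = Objects.requireNonNull(sourceFileName, "sourceFileName");
        this.importedAt = Objects.requireNonNull(importedAt, "importedAt");
    }

    /** Guesses the source type from the file extension; unknown extensions map to OTHER. */
    static SourceType typeOf(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".eml")) return SourceType.EMAIL;
        if (lower.endsWith(".pdf")) return SourceType.PDF;
        if (lower.endsWith(".csv")) return SourceType.CSV;
        return SourceType.OTHER;
    }

    static Provenance forFile(String fileName, Instant importedAt) {
        return new Provenance(typeOf(fileName), fileName, importedAt);
    }

    /** Label/value lines in the order they would be written to the statement. */
    String[] toLines() {
        return new String[] {
            "Source type: " + sourceType,
            "Source file: " + sourceFileName,
            "Imported at: " + importedAt
        };
    }
}
```

A call such as Provenance.forFile("trade-001.eml", Instant.now()), with a made-up file name, yields the three lines for the provenance section of the statement.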

A Blueprint For The Transaction Detail Screen

The third surprise was that the PDF layout became a useful blueprint for the application itself.

At the moment, transaction characteristics are shown as line items in a table. That works well for scanning many transactions at once, but it is not ideal for looking closely at one transaction.

The landscape transaction statement suggests a better detail view.

A user could double-click a transaction row and see a screen with almost the same structure as the PDF: transaction, account, instrument, consideration, charges, settlement, and provenance. In other words, the document layout is not just a printable artifact. It is also a good user interface model.

That was unexpected, but pleasing.

Writing Order Matters

The last discovery was the most interesting from a data extraction point of view.

When you create the PDF yourself, you control the order and structure of the text written into the file. With PDFBox, the write order is closely aligned with the text extraction order.

That means the extracted text can be made surprisingly clean.

Example: A generated Babylon Transaction Statement extracted text

Example transaction record extracted text
Generated from an EasyEquities trade email.

The result is almost a small data dictionary. Field names and values appear in a deliberate order, from first to last within each section. The document remains readable for humans, but it is also friendly to machines.
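The idea can be sketched without PDFBox at all. The example below models the statement as an ordered list of label and value pairs, renders them one per line in write order, and then parses that text back. It is a simplified stand-in for the real thing: the class name and sample values are invented for illustration, and the "Label: value" line format is an assumption, not the exact format of my generated PDF. As far as I understand it, PDFBox's PDFTextStripper returns text in content-stream order unless setSortByPosition(true) is called, which is why writing fields in a deliberate order pays off at extraction time.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of statement text; field names and values are illustrative.
final class OrderedStatementText {
    /** Renders fields as "Label: value" lines, preserving insertion order. */
    static String render(Map<String, String> fields) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            out.append(e.getKey()).append(": ").append(e.getValue()).append('\n');
        }
        return out.toString();
    }

    /** Parses "Label: value" lines back into a map that keeps the line order. */
    static Map<String, String> parse(String text) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String line : text.split("\\R")) {
            int sep = line.indexOf(": ");
            if (sep > 0) {
                fields.put(line.substring(0, sep), line.substring(sep + 2));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the order in which fields were added.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("Instrument", "EXAMPLE LTD");
        fields.put("Quantity", "10");
        fields.put("Price", "12.50");
        String text = render(fields);
        System.out.print(text);
        System.out.println(parse(text).equals(fields)); // round trip keeps every field
    }
}
```

Because every label sits directly next to its value and the order is fixed, the parser needs nothing more than a line split and a separator search.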

In live PDF contract notes I have looked at, this is often not the case. Text extraction can be a messier business. Values appear out of order. Labels and numbers become separated. Headers, footers, tables, and layout artifacts all get mixed together.

But when the PDF is authored with extraction in mind, the result can be quite beautiful.

I now wonder whether PDF reports that include table structure, such as a transaction history report, could also be written in a way that allows convenient text extraction.

A Small Experiment With A Large Lesson

This started as a curiosity.

Could I write a contract note with Apache PDFBox?

The answer was yes, and much more easily than I expected.

In an afternoon, it was possible to create a clean landscape PDF, generate transaction statements from existing email records, include useful provenance, and produce a document whose text extraction order is almost as structured as the visual layout.

That does not mean every broker needs to handcraft PDFs with PDFBox. But it does suggest that providing a proper PDF contract note is not an unreasonable expectation from a broker. For a broker, the transaction data already exists. The customer, account, instrument, quantity, price, charges, settlement date, and net amount are all known at the moment the trade is confirmed. Turning that information into a clear PDF contract note can hardly be called difficult.