I have some sewing patterns that I would like to share (and hopefully swap) but all of the PDFs have a

“This was purchased by John Doe [email protected] #ordernumber - if you are not John Doe, please dob in the person you got this from to [email protected] so we can sic our lawyers on them”

sorta footer on every single page.

Obviously, for privacy reasons (and because I don’t actually want lawyers sicced on me), I need to remove this footer.

These are often complex PDFs with more than a hundred pages and multiple layers.

I managed to remove the editing password (not the user/viewing password; the files just can’t be edited without a password) with qpdf --decrypt, but removing the footer has left me at a dead end. I even tried manually removing every single instance of the footer using Master PDF Editor, but saving the file flattened it, so you can no longer show/hide layers, which is essential for correct printing. (Please don’t ask me how many different PDF editors I have tried, because it has been so so SO many that I have lost count.)

Not that I really want to manually edit this out of what could amount to over a thousand pages, but searching for a command to remove a certain phrase has come up empty. Even Master PDF Editor doesn’t seem to have a bulk remove or search-and-replace function (just search).

I use Linux btw.

  • queerlilhayseed@piefed.blahaj.zone
    8 hours ago

    You might be able to do a find and replace with PyMuPDF (https://github.com/pymupdf/PyMuPDF). I’m not an expert on PDFs, so I’m not sure whether it can be done in a way that preserves all the important formatting, but if you feel comfortable DMing me the PDF (or one of similar complexity), I could try to write a script that replaces every instance of the target text while preserving the rest of the document.