I have some sewing patterns that I would like to share (and hopefully swap) but all of the PDFs have a

“This was purchased by John Doe [email protected] #ordernumber - if you are not John Doe, please dob in the person you got this from to [email protected] so we can sic our lawyers on them”

sorta footer on every single page.

Obviously for privacy reasons (and because I don’t actually want lawyers sicced onto me), I need to remove this footer.

These are often complex PDFs with more than a hundred pages and multiple layers.

I managed to remove the editing password (not a user/viewing password; the files just can’t be edited without it) with qpdf --decrypt, but removing that footer has left me at a dead end. I even tried manually removing every single instance of the footer using Master PDF Editor, but saving the file flattened it, so you can no longer show/hide layers, which is essential for correct printing. (Please don’t ask me how many different PDF editors I have tried, because it has been so so SO many I have lost count.)

Not that I really want to have to manually edit this out on what could amount to over a thousand pages but searching for a command to remove a certain phrase has come up empty. Even Master PDF Editor doesn’t seem to have a bulk remove or search and replace function (just search).

I use Linux btw.

  • cecilkorik@piefed.ca
    6 hours ago

    Just because the visible footer gets removed doesn’t mean there isn’t other unique tracking information hidden deep in the PDF that could still get the lawyers sicced on you. Depending on how valuable this information is to the company, and how litigious they are, you have to judge how far they might’ve gone and might yet go to protect it.

    Unfortunately, that’s why this kind of copy protection can actually be an effective tactic to prevent individuals from sharing their copies. While there might be ways to strip this kind of hidden data on simpler PDFs… even resorting to methods like screenshotting or printing and scanning still cannot give you absolute confidence that there isn’t some subtle unique identifier invisibly hidden in the layout or through subtle inconspicuous variations, especially if you’re doing this regularly and they start targeting you and your account for identification. And on complex PDFs there are so many more ways they could hide this information digitally if they know where to look for it and you don’t. 99% of the time it’s going to be pretty obvious to strip out, but are you willing to take that risk even if you do find a technical method of removing the visible footers? If it’s a one-off, maybe you can get away with it, but in the long term this strategy is not viable and is a trap for rookies.

    The only truly safe way to share digitally watermarked content like this is to buy it with a burner account and full opsec in the first place. Nobody to sic lawyers on if it’s a hacked paypal or a stolen/prepaid credit card or an untraceable email and IP, or in a jurisdiction with no enforcement. Smash and grab, get the data anonymously and get out. Don’t share stuff from your personal account that’s literally got your name and banking information attached to it unless you can confirm it’s bit-for-bit indistinguishable from other innocent copies with something like a checksum.
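
For the bit-for-bit comparison mentioned above, a checksum check could look roughly like the sketch below. It is only an illustration: the helper names `sha256_of` and `same_bytes` are made up for this example, and it uses nothing beyond Python's standard `hashlib` and `pathlib` modules.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large PDFs are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_bytes(path_a, path_b):
    """True only if both files have identical content, byte for byte."""
    # Cheap size check first; files of different sizes can never match.
    if Path(path_a).stat().st_size != Path(path_b).stat().st_size:
        return False
    return sha256_of(path_a) == sha256_of(path_b)
```

Calling `same_bytes("copy_a.pdf", "copy_b.pdf")` (placeholder file names) returns `True` only when the two files are identical. If the digests differ, that says nothing about where or why the files differ, only that they are not the same bytes.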