I archived a long thread that started with [Archive] @json_dirs on Twitter: “More fun publisher surveillance: Elsevier embeds a hash in the PDF metadata that is unique for each time a PDF is downloaded, this is a diff between metadata from two of the same paper. Combined with access timestamps, they can uniquely identify the source of any shared PDFs.” / Twitter at [Wayback/Archive] Thread by @json_dirs on Thread Reader App – Thread Reader App.
TL;DR: publishers put hashes in PDF metadata to trace redistributed copies back to the original download; they rarely use smarter watermarking because it is difficult to parse automatically; the hashes are easy to remove.
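To see for yourself what differs between two downloads of the same paper, a rough sketch like the one below might help. It is not the method used in the thread: it uses only the Python standard library and assumes the metadata lives in an XMP packet that is stored uncompressed in the file. If the identifier sits elsewhere (for example in a compressed stream or in the document Info dictionary), this sketch will miss it. The function names `extract_xmp` and `diff_metadata` are made up for illustration.

```python
import difflib
import re
import sys

# Matches an XMP packet wrapper. Assumption: the packet is stored
# uncompressed, so it can be found by scanning the raw file bytes.
XMP_PACKET = re.compile(rb"<\?xpacket begin=.*?<\?xpacket end=.*?\?>", re.DOTALL)


def extract_xmp(pdf_bytes: bytes) -> str:
    """Return the first XMP packet found in the raw bytes, or "" if none."""
    match = XMP_PACKET.search(pdf_bytes)
    return match.group(0).decode("utf-8", errors="replace") if match else ""


def diff_metadata(first: bytes, second: bytes) -> list[str]:
    """Return only the removed ("- ") and added ("+ ") metadata lines."""
    a = extract_xmp(first).splitlines()
    b = extract_xmp(second).splitlines()
    return [
        line
        for line in difflib.ndiff(a, b)
        if line.startswith(("- ", "+ "))
    ]


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python diff_xmp.py first_download.pdf second_download.pdf
    with open(sys.argv[1], "rb") as f1, open(sys.argv[2], "rb") as f2:
        for change in diff_metadata(f1.read(), f2.read()):
            print(change)
```

Any line that shows up in this diff for two otherwise identical downloads is a candidate for a per-download identifier. An empty result only means that the XMP packets match, not that the copies are free of tracking data.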





