Is it possible to produce a PDF with un-copyable text?
66 votes
Is it possible to produce a PDF with un-copyable text? I mean, when you want to copy text from the PDF, you can't copy it or what you copy is nonsense characters.
pdf copy-paste drm
49
Is it possible? Yes. (Well, sort of -- you could always convert to an image and OCR.) Is it a good idea? No. We must push back against the forces of OCR and commercialism, and push for the causes of open access, searchability, and software freedom. If those who favor open source software don't, no one will.
– frabjous
Feb 17 '11 at 15:16
14
IMHO, it is never a good idea to use technical measures to prevent other people from copying text in a PDF file. If we must do such things, don't convert the text to an image (vector or bitmap). Besides the loss of quality, the resulting file may be very large.
– Leo Liu
Feb 17 '11 at 15:39
26
In addition, you'll do a huge disservice to blind people (though I guess PDFs aren't very accessible even in the best of cases).
– Caramdir
Feb 17 '11 at 15:52
4
"Cracked" isn't the right word. OCR = Optical Character Recognition. It takes an image, analyzes it to try to recognize letter shapes, and then outputs text. Of course, they could always retype what you've written, but OCR is usually faster.
– frabjous
Feb 19 '11 at 20:12
18
@warem: No, it's not possible. All you need to break it is a thing called "a typist".
– Brent.Longborough
Jun 3 '11 at 21:23
Asked Feb 17 '11 at 15:01 by warem; edited Jun 3 '11 at 15:56 by xport.
9 Answers
43 votes, accepted
Besides converting all text to images, one method I know of is to destroy the CMaps of the fonts. We can use the cmap package and a special cmap file for this purpose. This cmap file is generated inside the VerbatimOut environment.
(Warning: it does not make much sense to produce an un-copyable PDF. OCR is very easy today.)
% pdflatex is required
\documentclass{article}
\usepackage[resetfonts]{cmap}
\usepackage{fancyvrb}
\begin{VerbatimOut}{ot1.cmap}
%!PS-Adobe-3.0 Resource-CMap
%%DocumentNeededResources: ProcSet (CIDInit)
%%IncludeResource: ProcSet (CIDInit)
%%BeginResource: CMap (TeX-OT1-0)
%%Title: (TeX-OT1-0 TeX OT1 0)
%%Version: 1.000
%%EndComments
/CIDInit /ProcSet findresource begin
12 dict begin
begincmap
/CIDSystemInfo
<< /Registry (TeX)
/Ordering (OT1)
/Supplement 0
>> def
/CMapName /TeX-OT1-0 def
/CMapType 2 def
1 begincodespacerange
<00> <7F>
endcodespacerange
8 beginbfrange
<00> <01> <0000>
<09> <0A> <0000>
<23> <26> <0000>
<28> <3B> <0000>
<3F> <5B> <0000>
<5D> <5E> <0000>
<61> <7A> <0000>
<7B> <7C> <0000>
endbfrange
40 beginbfchar
<02> <0000>
<03> <0000>
<04> <0000>
<05> <0000>
<06> <0000>
<07> <0000>
<08> <0000>
<0B> <0000>
<0C> <0000>
<0D> <0000>
<0E> <0000>
<0F> <0000>
<10> <0000>
<11> <0000>
<12> <0000>
<13> <0000>
<14> <0000>
<15> <0000>
<16> <0000>
<17> <0000>
<18> <0000>
<19> <0000>
<1A> <0000>
<1B> <0000>
<1C> <0000>
<1D> <0000>
<1E> <0000>
<1F> <0000>
<21> <0000>
<22> <0000>
<27> <0000>
<3C> <0000>
<3D> <0000>
<3E> <0000>
<5C> <0000>
<5F> <0000>
<60> <0000>
<7D> <0000>
<7E> <0000>
<7F> <0000>
endbfchar
endcmap
CMapName currentdict /CMap defineresource pop
end
end
%%EndResource
%%EOF
\end{VerbatimOut}
\usepackage{lipsum}
\begin{document}
\lipsum
\end{document}
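As a reading aid for the CMap above (not part of the original answer): a bfrange line of the form <lo> <hi> <dst> maps the source codes lo..hi to consecutive code points starting at dst. So <61> <7A> <0000> sends the OT1 slots of a..z to U+0000..U+0019, that is, to control characters rather than letters. The sketch below works this out with shell arithmetic; the function name bfrange_target is made up for illustration.

```shell
#!/bin/sh
# bfrange_target LO HI DST CODE  (all hexadecimal, without the <> brackets)
# Prints the code point that CODE maps to under the bfrange "<LO> <HI> <DST>".
# Returns status 1 and prints nothing if CODE lies outside LO..HI.
bfrange_target() {
    lo=$((0x$1)); hi=$((0x$2)); dst=$((0x$3)); code=$((0x$4))
    if [ "$code" -lt "$lo" ] || [ "$code" -gt "$hi" ]; then
        return 1
    fi
    # Destinations are consecutive: LO -> DST, LO+1 -> DST+1, and so on.
    printf '%04X\n' $((dst + code - lo))
}

bfrange_target 61 7A 0000 61   # slot of "a"
bfrange_target 61 7A 0000 7A   # slot of "z"
```

The single-code bfchar lines need no arithmetic: each one maps its code straight to U+0000.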
Your method worked, thank you. But if I change \documentclass{article} to \documentclass[titlepage,a4paper,12pt]{article}, it doesn't work.
– warem
Feb 18 '11 at 2:33
I just found that if I don't specify 12pt at the beginning and instead define a new command to set the default font size later, your method works. I don't know why. On the other hand, your method applies to the whole text; is it possible to apply it to just part of the text?
– warem
Feb 18 '11 at 3:15
The resetfonts option doesn't work for 12pt. You can follow cmap.sty to undefine more predefined fonts. I don't have much time.
– Leo Liu
Feb 18 '11 at 5:48
7
That method does not work for me. My evince happily allows me to copy and paste the text. Also, pdftotext extracts all the available text. So this method does not work.
– Frederick Nord
Feb 14 '13 at 20:51
1
For me too, this solution doesn't work.
– dexterdev
Apr 16 '15 at 8:20
23 votes
Luatex allows manipulating fonts in the define_font callback.
Luaotfload facilitates this even more with an extra hook it installs
right after the font loader has finished its job: the
luaotfload.patch_font callback.
Normally it is used for serious and constructive tasks like setting a
couple of font dimensions or ensuring backward compatibility in the data
structures.
Of course, it can also be abused for dirty hacks like disabling copy
and paste.
At the point where the patch_font callback is applied, the font is
already defined and ready to use.
All necessary tables are created and put in a place where Luatex
expects them.
Among these is the characters table that holds preprocessed
information about the glyphs.
In the code below we modify the tounicode field of each glyph so
that it maps to some random location within the printable ASCII range.
Note that this does not affect the shape and metrics of the glyph since
those are unrelated to the actual codepoint.
As a consequence, the PDF will contain legible text that cannot be
copied.
Package file obfuscate.lua:
packagedata = packagedata or { }
local mathrandom   = math.random
local stringformat = string.format
--- this is the callback by means of which we will obfuscate
--- the tounicode values so they map to random characters of
--- the printable ascii range (between 0x21 / 33 and 0x7e / 126)
local obfuscate = function (tfmdata, _specification)
  if not tfmdata or type (tfmdata) ~= "table" then
    return
  end
  local characters = tfmdata.characters
  if characters then
    for codepoint, char in next, characters do
      char.tounicode = stringformat ([[%0.4X]], mathrandom (0x21, 0x7e))
    end
  end
end
--- we also need some functions to toggle the callback activation so
--- we can obfuscate fonts selectively
local active = false
packagedata.obfuscate_begin = function ()
  if not active then
    luatexbase.add_to_callback ("luaotfload.patch_font", obfuscate,
                                "user.obfuscate_font", 1)
    active = true
  end
end
packagedata.obfuscate_end = function ()
  if active then
    luatexbase.remove_from_callback ("luaotfload.patch_font",
                                     "user.obfuscate_font")
    active = false
  end
end
Usage demonstration:
%% we will need these packages
\input luatexbase.sty
\input luaotfload.sty
%% for inspecting the pdf with an ordinary editor
\pdfcompresslevel0
\pdfobjcompresslevel0
%% load obfuscation code
\RequireLuaModule {obfuscate}
%% convenience macro
\def \packagecmd #1{\directlua {packagedata.#1}}
%% the obfuscate environment, mapping to Lua functions that enable and
%% disable tounicode obfuscation
\def \beginobfuscate {\packagecmd {obfuscate_begin ()}}
\def \endobfuscate {\packagecmd {obfuscate_end ()}}
%%···································································%%
%% Demo
%%···································································%%
%% firstly, load some fonts. within the “obfuscate” environment all
%% fonts will get their cmaps scrambled ...
\beginobfuscate
\font \mainfont = "file:Iwona-Regular.otf:mode=base"
\font \italicfont = "file:Iwona-Italic.otf:mode=base"
\endobfuscate
%% ... while fonts defined outside will have the mapping intact
\font \boldfont = "file:Iwona-Bold.otf:mode=base"
\font \bolditalicfont = "file:Iwona-BoldItalic.otf:mode=base"
%% now we can use them in our document like any ordinary font
\mainfont
obfuscated text before {\italicfont obfuscated too} and after \par
obfuscated text before {\boldfont not obfuscated} and after \par
obfuscated text before {\bolditalicfont not obfuscated} and after \par
\bye
Result in PDF viewer: (image not preserved)
Contrast this with the output of pdftotext:
rf2yC'I_J I_dI r_f{_ 9;H`bp<<L& <99 '5J 'fI_{
rf2yC'I_J I_dI r_f{_ not obfuscated '5J 'fI_{
rf2yC'I_J I_dI r_f{_ not obfuscated '5J 'fI_{
But please forget about all this immediately and never obfuscate a
production text -- don’t be mean to your readers!
EDIT
Because the generous karma donor specifically asked for a Context
solution, I’ll throw that one in as a bonus.
It is a good deal more elegant since it relies on the font goodies
mechanism that allows applying postprocessors to specific fonts which
can afterwards be used just like common font features.
\startluacode
local mathrandom   = math.random
local stringformat = string.format
--- create a postprocessor
local obfuscate = function (tfmdata)
  fonts.goodies.registerpostprocessor (tfmdata, function (tfmdata)
    if not tfmdata or type (tfmdata) ~= "table" then
      return
    end
    local characters = tfmdata.characters
    if characters then
      for codepoint, char in next, characters do
        char.tounicode = stringformat ([[%0.4X]], mathrandom (0x21, 0x7e))
      end
    end
  end)
end
--- now register as a font feature
fonts.handlers.otf.features.register {
  name         = "obfuscate",
  description  = "treat the reader like a piece of garbage",
  default      = false,
  initializers = {
    base = obfuscate,
    node = obfuscate,
  }
}
\stopluacode
%%···································································%%
%% demonstration
%%···································································%%
%% we can now treat the obfuscation postprocessor like any other
%% font feature
\definefontfeature [obfuscate] [obfuscate=yes]
\definefont [mainfont] [file:Iwona-Regular.otf*obfuscate]
\definefont [italicfont] [file:Iwona-Italic.otf*obfuscate]
\definefont [boldfont] [file:Iwona-Bold.otf]
\definefont [bolditalicfont] [file:Iwona-BoldItalic.otf]
\starttext
\mainfont
obfuscated text before {\italicfont obfuscated too} and after \par
obfuscated text before {\boldfont not obfuscated} and after \par
obfuscated text before {\bolditalicfont not obfuscated} and after \par
\stoptext
1
I'll see your obfuscator and raise you by some free OCR package :D
– Mark K Cowan
Nov 6 '13 at 15:20
1
@Mark Never mind OCR, this is a deterministic one-to-one character mapping: t=I, e=_, x=d, etc. A few minutes with a document could produce a sed substitution expression for all the changed glyphs. Pipe your pdftotext output into that and you have a 100% fix. All this does is waste both the author's and the reader's time without solving anything, other than making them feel as though they have. Poor man's DRM is even worse than the real thing.
– Caleb
Dec 20 '14 at 15:49
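The point about a fixed one-to-one mapping can be sketched in shell. Assuming only the three pairs quoted in the comment above (t=I, e=_, x=d), tr undoes the substitution; a real table would have to list every scrambled glyph, and the function name deobfuscate is hypothetical. Since the Lua code draws each replacement at random, two glyphs may happen to land on the same character, in which case no table can tell them apart.

```shell
#!/bin/sh
# Undo a fixed character substitution read from standard input.
# The table holds only the three pairs quoted above: I->t, _->e, d->x.
deobfuscate() {
    tr 'I_d' 'tex'
}

# The scrambled word "I_dI" from the pdftotext output, decoded.
printf 'I_dI\n' | deobfuscate
```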
19 votes
You can disable the copying of text with the help of PDF encryption. With it you can also disable other things like printing.
You need to use an external PDF tool like pdftk or of course the full version of Adobe Acrobat to encrypt the PDF.
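A minimal sketch of the pdftk route, assuming pdftk is installed; the file names, the password and the wrapper name protect_pdf are placeholders. Setting an owner password turns encryption on, and "allow Printing" grants printing only, so copying stays disallowed in readers that honour the flag. With DRY_RUN set, the function prints the command instead of running it.

```shell
#!/bin/sh
# protect_pdf INPUT OUTPUT OWNER_PASSWORD
# Encrypts INPUT into OUTPUT and grants the printing permission only.
protect_pdf() {
    # Build the argument list once so the dry run and the real run agree.
    set -- pdftk "$1" output "$2" owner_pw "$3" allow Printing
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"      # show the command without executing it
    else
        "$@"           # run pdftk for real
    fi
}

# Preview the command; drop DRY_RUN=1 to actually write protected.pdf.
DRY_RUN=1 protect_pdf input.pdf protected.pdf secret
```

A password given on the command line is visible to other users of the machine, so this is only suitable as an illustration.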
10
However, encryption doesn't work for (almost all, as far as I know) non-Adobe PDF readers.
– Leo Liu
Feb 17 '11 at 15:14
1
I often use a certain open-source reader (with just one line of code commented out) to bypass PDF protection and passwords. Anyone familiar with SourceForge, GIT and MAKE can easily roll their own in a matter of minutes too.
– Mark K Cowan
Aug 12 '13 at 23:54
@MarkKCowan I know of other, less sophisticated (but also effective) ways than what you describe. Out of sheer curiosity (though not enough curiosity to try patching it myself): could you provide more details, or a link to a Git repository with the commented-out line marked? Sorry about the overuse of (brackets); I'm drunk.
– nutty about natty
Aug 14 '13 at 20:22
It was a long time ago when I built it. I think it was a Java application, there was one particular line which was a const "final boolean <something> = <somevalue>;" which related to password protection. Apparently, Ubuntu was the only distro where the password protection had been enabled, so I flipped that boolean and recompiled to produce a binary which didn't bother with the whole password fake-DRM stuff. Strictly speaking, I changed the value of a boolean constant, rather than commenting out a line.
– Mark K Cowan
Aug 14 '13 at 23:27
Okular has a user setting which determines whether it recognises DRM or not...
– cfr
Aug 25 '14 at 3:39
18 votes
Remarks
I use a little script which converts all fonts to paths. The script takes a .pdf file as its first parameter and writes the output to a file with the same name and the suffix -rst.pdf.
You need Ghostscript for the script to run.
Implementation
Runs on bash
#!/bin/sh
GS=/usr/bin/gs
# Convert to PostScript, then back to PDF.
# ${1%.*} strips only the final extension from the input file name.
$GS -sDEVICE=pswrite -dNOCACHE -sOutputFile=- -q -dBATCH -dNOPAUSE "$1" -c quit | ps2pdf - > "${1%.*}-rst.pdf"
if [ $? -eq 0 ]; then
    echo "Output written to ${1%.*}-rst.pdf"
else
    echo "There were errors. See the output."
fi
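A side note on how such output names are derived: suffix-stripping parameter expansion comes in two forms that behave differently. The form with %% removes the longest suffix that starts with a dot, the form with a single % removes the shortest. They agree for a name like slides.pdf but differ once the name or its directory contains another dot. A small sketch (the function names are invented here):

```shell
#!/bin/sh
# Strip from the FIRST dot onward (longest matching suffix).
strip_longest()  { printf '%s\n' "${1%%.*}"; }
# Strip only the final extension (shortest matching suffix).
strip_shortest() { printf '%s\n' "${1%.*}"; }

strip_longest  "talk.v2.pdf"     # prints: talk
strip_shortest "talk.v2.pdf"     # prints: talk.v2
strip_longest  "./out/talk.pdf"  # prints an empty line: the path starts with a dot
strip_shortest "./out/talk.pdf"  # prints: ./out/talk
```

For a script that is called with relative paths, the single-% form is usually the one that gives the expected file name.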
Use ps2write (instead of pswrite) these days, as seen here.
Result (image not preserved)
No match for OCR though :D
– Mark K Cowan
Aug 12 '13 at 17:54
6
Well, I guess there is no way to trick OCR software (without adding things like striking or crossing out the text), because OCR software can read whatever you can read.
– Henri Menke
Aug 12 '13 at 20:50
Specialised OCR software can break some types of CAPTCHA too... You can use excessive striking/deforming/noise to harden your file against this, but then humans won't be able to read half of it either!
– Mark K Cowan
Aug 12 '13 at 23:52
1
Your script works, but (at least on my files, which are slides produced with beamer) produces very pale and large files. It also takes a relatively long time to finish.
– Anthony Labarre
Oct 22 '13 at 8:46
1
@Trickster You run it like bash script_name "path/to/pdf/file". You don't need sudo, as no additional privileges are needed for this script.
– Henri Menke
Nov 24 '13 at 9:30
10 votes
If content can be viewed, it can be copied.
No matter what encryption and restrictions are used, at some point the content must be put out in plain view in order for it to be of any use.
This is probably true of all digital content and most physical content larger than the nanoscale...
For example, a PDF:
- Rasterisation: Printscreen => OCR
- Any protection: Re-type it out
- Content protection: Modified build of an open-source reader
Web content:
- Right-click popup: Opera=>Prevent page receiving context menu events
- Right-click popup: "Menu" button on any modern keyboard
- Flash: Download the SWF file, decompile it using free software
- View page source, use Chrome/Opera/Firefox debugger to get URL of desired content
Audio (e.g. HDCP):
- Headphones socket on TV => line-in socket on PC
- Solder to tap into preamplifier => line-in socket on PC
Video (e.g. HDCP):
- Many, many options... A quick google search will show you.
Encrypted content on someone's laptop/pendrive:
- Digital brute force: brute-force cracking of the encryption key
- Physical brute force: http://imgs.xkcd.com/comics/security.png
One of these is not like the others. The last item is both wrong and a different scenario from your premise.
– Caleb
Dec 20 '14 at 15:54
5 votes
The answer is: Yes.
There is a way described here: http://spivey.oriel.ox.ac.uk/corner/Obfuscated_PDF
But it looks tedious and doesn't use pdflatex. The method, however, is described as being portable to PDF. It involves changing glyphs of a font and other dirty things that get you bad dreams.
I didn't find a method that works directly on PDF, let alone something automated for pdflatex. I'll happily buy you a beverage of your choice if you implement it :-)
2 votes
I use gswin32 (Ghostscript). First convert the PDF to PostScript:
"C:\Program Files (x86)\gs\gs9.09\bin\gswin32.exe" -sDEVICE=ps2write -r9000
-dNOPAUSE -sOutputFile=OUTPUT.ps input_insecure.pdf
Then convert it back to PDF with permission value 4 (read and print only):
"C:\Program Files (x86)\gs\gs9.09\bin\gswin32.exe" -sDEVICE=pdfwrite -r9000
-dNOPAUSE -sPAPERSIZE=a4 -dPDFSETTINGS=/prepress -dMaxSubsetPct=100
-dSubsetFonts=true -dEmbedAllFonts=true -sOwnerPassword=null -dEncryptionR=3
-dKeyLength=40 -dPermissions=4 -sOutputFile=OUTPUT_secure.pdf output.ps
On a Unix-based system, you can probably type gs instead of "C:\Program Files (x86)\gs\gs9.09\bin\gswin32.exe".
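For orientation, the value passed to -dPermissions is a bit mask defined by the PDF specification: 4 allows printing, 8 modifying, 16 copying or extracting text, and 32 annotating. A value of 4 therefore sets the print bit and leaves the copy bit clear. The sketch below checks this with shell arithmetic; the helper name perm_allows is illustrative, and whether a given viewer actually enforces the flags is a separate matter.

```shell
#!/bin/sh
# perm_allows VALUE ACTION -> prints "yes" or "no"
# Bit values follow the PDF standard security handler:
# print=4, modify=8, copy=16, annotate=32.
perm_allows() {
    case $2 in
        print)    bit=4  ;;
        modify)   bit=8  ;;
        copy)     bit=16 ;;
        annotate) bit=32 ;;
        *) echo "unknown action: $2" >&2; return 2 ;;
    esac
    # A permission is granted when its bit is set in VALUE.
    if [ $(( $1 & bit )) -ne 0 ]; then echo yes; else echo no; fi
}

perm_allows 4 print
perm_allows 4 copy
```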
1 vote
Use XeTeX to at least get some "nonsense characters", see here and here.
Though this would obviously be just a nuisance for most cases/users (which can be avoided using LuaLaTeX instead), depending on what you are trying to achieve, compiling with XeTeX may prove to add at least some value to your solution...
1 vote
You can use ImageMagick to convert the PDF to an image-only PDF.
Running
convert file1.pdf file2.pdf
will create a PDF called file2.pdf which is about the same size as the input PDF, but since it's an image, the text cannot be selected. There is a notable decrease in quality, though.
Accepted answer: answered Feb 17 '11 at 15:29 by Leo Liu; edited Oct 13 '17 at 19:56 by Altynbek Isabekov.
your method worked. thank you. but if i changedocumentclass{article}
todocumentclass[titlepage,a4paper,12pt]{article}
, it didn't work.
– warem
Feb 18 '11 at 2:33
i just found if i didn't define12pt
at the beginning, then defined a newcommand to set the default font size later, your method worked now. i don't why. on the other hand, your method works for the whole text, is it possible to just work part of text?
– warem
Feb 18 '11 at 3:15
resetfonts
doesn't work for12pt
. You can followcmap.sty
to undefine more predefined fonts. I have no much time.
– Leo Liu
Feb 18 '11 at 5:48
7
That method does not work me. My evince allows me happily to copy and paste the text. Alsopdftotext
extracts all the available text. So this method does not work.
– Frederick Nord
Feb 14 '13 at 20:51
1
For me too, this solution don't work.
– dexterdev
Apr 16 '15 at 8:20
up vote
23
down vote
LuaTeX allows manipulating fonts in the define_font callback.
Luaotfload facilitates this even more with an extra hook it installs
right after the font loader has finished its job: the
luaotfload.patch_font callback.
Normally it is used for serious and constructive tasks like setting a
couple of font dimensions or ensuring backward compatibility in the data
structures.
Of course, it can also be abused for dirty hacks like disabling copy
and paste.

At the point where the patch_font callback is applied, the font is
already defined and ready to use.
All necessary tables are created and put in a place where LuaTeX
expects them.
Among these is the characters table that holds preprocessed
information about the glyphs.
In the code below we modify the tounicode field of each glyph so
that it maps to some random location within the printable ASCII range.
Note that this does not affect the shape and metrics of the glyph since
those are unrelated to the actual codepoint.
As a consequence, the PDF will contain legible text that cannot be
copied in any meaningful way.

Package file obfuscate.lua:
packagedata = packagedata or { }

local mathrandom   = math.random
local stringformat = string.format

--- this is the callback by means of which we will obfuscate
--- the tounicode values so they map to random characters of
--- the printable ascii range (between 0x21 / 33 and 0x7e / 126)
local obfuscate = function (tfmdata, _specification)
  if not tfmdata or type (tfmdata) ~= "table" then
    return
  end
  local characters = tfmdata.characters
  if characters then
    for codepoint, char in next, characters do
      char.tounicode = stringformat ([[%0.4X]], mathrandom (0x21, 0x7e))
    end
  end
end

--- we also need some functions to toggle the callback activation so
--- we can obfuscate fonts selectively
local active = false

packagedata.obfuscate_begin = function ()
  if not active then
    luatexbase.add_to_callback ("luaotfload.patch_font", obfuscate,
                                "user.obfuscate_font", 1)
    active = true
  end
end

packagedata.obfuscate_end = function ()
  if active then
    luatexbase.remove_from_callback ("luaotfload.patch_font",
                                     "user.obfuscate_font")
    active = false
  end
end
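The effect of the callback can be tried without LuaTeX on a mock font table. The following is only a sketch under assumptions: the table layout is a heavily simplified stand-in for the real tfmdata, and obfuscate_mock and the width values are made up for illustration. It shows that only the tounicode field changes while the metrics stay as they were.

```lua
-- Same idea as the callback, applied to a mock table so that it runs
-- in plain Lua. "rng" is any function with the math.random (lo, hi) shape.
local function obfuscate_mock(tfmdata, rng)
  for _, char in pairs(tfmdata.characters) do
    char.tounicode = string.format("%04X", rng(0x21, 0x7e))
  end
end

-- Hypothetical three-glyph font; real fonts carry many more fields.
local font = {
  characters = {
    [0x61] = { width = 500 },  -- "a"
    [0x62] = { width = 550 },  -- "b"
    [0x63] = { width = 450 },  -- "c"
  },
}

obfuscate_mock(font, math.random)

for code, char in pairs(font.characters) do
  print(string.format("U+%04X width=%d tounicode=%s",
                      code, char.width, char.tounicode))
end
```

Each glyph draws its value independently from only 94 printable codes, so two glyphs can receive the same value, and a font with more than 94 glyphs necessarily has such collisions. The sample output further down appears to show one in the italic font, where both "a" and "t" come out as "<".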
Usage demonstration:

%% we will need these packages
\input luatexbase.sty
\input luaotfload.sty

%% for inspecting the pdf with an ordinary editor
\pdfcompresslevel0
\pdfobjcompresslevel0

%% load obfuscation code
\RequireLuaModule {obfuscate}

%% convenience macro
\def \packagecmd #1{\directlua {packagedata.#1}}

%% the obfuscate environment, mapping to Lua functions that enable and
%% disable tounicode obfuscation
\def \beginobfuscate {\packagecmd {obfuscate_begin ()}}
\def \endobfuscate   {\packagecmd {obfuscate_end   ()}}

%%···································································%%
%% Demo
%%···································································%%

%% firstly, load some fonts. within the “obfuscate” environment all
%% fonts will get their cmaps scrambled ...
\beginobfuscate
  \font \mainfont   = "file:Iwona-Regular.otf:mode=base"
  \font \italicfont = "file:Iwona-Italic.otf:mode=base"
\endobfuscate

%% ... while fonts defined outside will have the mapping intact
\font \boldfont       = "file:Iwona-Bold.otf:mode=base"
\font \bolditalicfont = "file:Iwona-BoldItalic.otf:mode=base"

%% now we can use them in our document like any ordinary font
\mainfont
obfuscated text before {\italicfont     obfuscated too} and after \par
obfuscated text before {\boldfont       not obfuscated} and after \par
obfuscated text before {\bolditalicfont not obfuscated} and after \par

\bye
Result in PDF viewer:
Contrast this with the output of pdftotext:
rf2yC'I_J I_dI r_f{_ 9;H`bp<<L& <99 '5J 'fI_{
rf2yC'I_J I_dI r_f{_ not obfuscated '5J 'fI_{
rf2yC'I_J I_dI r_f{_ not obfuscated '5J 'fI_{
But please forget about all this immediately and never obfuscate a
production text -- don’t be mean to your readers!

EDIT

Because the generous karma donor specifically asked for a ConTeXt
solution, I’ll throw that one in as a bonus.
It is a good deal more elegant since it relies on the font goodies
mechanism that allows applying postprocessors to specific fonts which
can afterwards be used just like common font features.
\startluacode
  local mathrandom   = math.random
  local stringformat = string.format

  --- create a postprocessor
  local obfuscate = function (tfmdata)
    fonts.goodies.registerpostprocessor (tfmdata, function (tfmdata)
      if not tfmdata or type (tfmdata) ~= "table" then
        return
      end
      local characters = tfmdata.characters
      if characters then
        for codepoint, char in next, characters do
          char.tounicode = stringformat ([[%0.4X]], mathrandom (0x21, 0x7e))
        end
      end
    end)
  end

  --- now register as a font feature
  fonts.handlers.otf.features.register {
    name         = "obfuscate",
    description  = "treat the reader like a piece of garbage",
    default      = false,
    initializers = {
      base = obfuscate,
      node = obfuscate,
    }
  }
\stopluacode

%%···································································%%
%% demonstration
%%···································································%%

%% we can now treat the obfuscation postprocessor like any other
%% font feature
\definefontfeature [obfuscate] [obfuscate=yes]

\definefont [mainfont]       [file:Iwona-Regular.otf*obfuscate]
\definefont [italicfont]     [file:Iwona-Italic.otf*obfuscate]
\definefont [boldfont]       [file:Iwona-Bold.otf]
\definefont [bolditalicfont] [file:Iwona-BoldItalic.otf]

\starttext
\mainfont
obfuscated text before {\italicfont     obfuscated too} and after \par
obfuscated text before {\boldfont       not obfuscated} and after \par
obfuscated text before {\bolditalicfont not obfuscated} and after \par
\stoptext
1
I'll see your obfuscator and raise you by some free OCR package :D
– Mark K Cowan
Nov 6 '13 at 15:20
1
@Mark Never mind OCR, this is a deterministic 1 to 1 character mapping: t=I, e=_, x=d, etc. A few minutes with a document could produce a sed substitution expression for all the changed glyphs. Pipe your pdftotext output into that and you have a 100% fix. All this does is waste both author (and reader) time without actually solving anything but making them feel like they have. Poor man's DRM is even worse than the real thing.
– Caleb
Dec 20 '14 at 15:49
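The substitution described in the last comment can be sketched in a few lines of Lua instead of sed. This is only an illustration: the pairs in the reverse table were read off the sample pdftotext output of the answer for the main font by lining it up with the known source text, and the names reverse and decode are made up for this example. The letter "o" does not seem to appear in the extracted sample at all, so it cannot be recovered from it.

```lua
-- Extracted character -> original letter, read off the sample output
-- ("obfuscated text before ... and after") for the main font only.
local reverse = {
  ["r"] = "b", ["f"] = "f", ["2"] = "u", ["y"] = "s", ["C"] = "c",
  ["'"] = "a", ["I"] = "t", ["_"] = "e", ["J"] = "d", ["d"] = "x",
  ["{"] = "r", ["5"] = "n",
}

-- Replace every character that has a known counterpart; leave the rest.
local function decode(s)
  return (s:gsub(".", function (c) return reverse[c] or c end))
end

print(decode("I_dI r_f{_ '5J 'fI_{"))  -- text befre and after
```

The table is only valid for one font in one compilation, since every run draws new random values, and it cannot separate glyphs that happened to receive the same value.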
edited Sep 11 '13 at 20:27
doncherry
34.6k23134207
answered Aug 12 '13 at 20:22
Philipp Gesang
6,5782336
up vote
19
down vote
You can disable the copying of text with the help of PDF encryption. With it you can also disable other things like printing.
You need to use an external PDF tool like pdftk or of course the full version of Adobe Acrobat to encrypt the PDF.
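As a rough sketch of the pdftk route, the Lua snippet below only assembles the command line. The file names, the password and the helper name pdftk_encrypt_cmd are placeholders made up for this example. It uses pdftk's owner_pw and allow keywords: with an owner password set, features that are not listed after allow (such as copying) are switched off. No shell quoting is done, so the sketch assumes arguments without spaces.

```lua
-- Hypothetical helper: build a pdftk call that encrypts "input" into
-- "output" with an owner password and an optional list of permissions.
local function pdftk_encrypt_cmd(input, output, owner_pw, allowed)
  local parts = { "pdftk", input, "output", output, "owner_pw", owner_pw }
  if allowed and #allowed > 0 then
    parts[#parts + 1] = "allow"
    for _, perm in ipairs(allowed) do
      parts[#parts + 1] = perm
    end
  end
  return table.concat(parts, " ")
end

-- Keep printing enabled, leave copying disabled.
local cmd = pdftk_encrypt_cmd("in.pdf", "out.pdf", "secret", { "Printing" })
print(cmd)  -- pdftk in.pdf output out.pdf owner_pw secret allow Printing

-- To actually run it (requires pdftk on the PATH): os.execute(cmd)
```

These permissions are only flags that a viewer is asked to respect, so a viewer that ignores them will still let the text be copied.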
10
However, encryption doesn't work for (almost all, as far as I know) non-Adobe PDF readers.
– Leo Liu
Feb 17 '11 at 15:14
1
I often use a certain open-source reader (with just one line of code commented out) to bypass PDF protection and passwords. Anyone familiar with SourceForge, Git and make can easily roll their own in a matter of minutes too.
– Mark K Cowan
Aug 12 '13 at 23:54
@MarkKCowan I know of other, less sophisticated (but also effective) ways than what you describe; out of sheer curiosity (though that curiosity is not so large that I'd try to patch it myself): could you provide more details or a link to a commented Git repository (indicating the commented-out line)? Sorry about the overuse of (brackets); I'm drunk.
– nutty about natty
Aug 14 '13 at 20:22
It was a long time ago when I built it. I think it was a Java application; there was one particular line which was a constant, "final boolean <something> = <somevalue>;", which related to password protection. Apparently, Ubuntu was the only distro where the password protection had been enabled, so I flipped that boolean and recompiled to produce a binary which didn't bother with the whole password fake-DRM stuff. Strictly speaking, I changed the value of a boolean constant, rather than commenting out a line.
– Mark K Cowan
Aug 14 '13 at 23:27
Okular has a user setting which determines whether it recognises DRM or not...
– cfr
Aug 25 '14 at 3:39
edited Feb 17 '11 at 15:17
answered Feb 17 '11 at 15:08
Martin Scharrer♦
up vote
18
down vote
Remarks
I use a little script which converts all fonts to paths. The script takes the input .pdf file as its first parameter and writes the output to a file with the same name and the suffix -rst.pdf.
You need Ghostscript for my script to run.
Implementation
Runs on bash
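#!/bin/sh
# Convert all text in a PDF to outlines (paths) via a PostScript round trip.
GS=/usr/bin/gs
# Strip only the last extension, so that ./file.pdf and my.file.pdf work too.
out="${1%.*}-rst.pdf"
# pswrite with -dNOCACHE emits glyphs as paths; ps2pdf turns the result back into a PDF.
"$GS" -sDEVICE=pswrite -dNOCACHE -sOutputFile=- -q -dBATCH -dNOPAUSE "$1" -c quit | ps2pdf - > "$out"
# Note: $? reflects only ps2pdf, the last command in the pipeline.
if [ $? -eq 0 ]; then
    echo "Output written to $out"
else
    echo "There were errors. See the output."
fi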
Use ps2write instead of pswrite these days, as seen here.
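On Ghostscript releases that no longer ship pswrite, the PostScript round trip can probably be skipped altogether. The sketch below assumes a build recent enough to support the `-dNoOutputFonts` switch of the pdfwrite device, which asks Ghostscript to emit text as outlines instead of embedding fonts; the function name `outline_pdf` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: write <name>-rst.pdf with all text converted to outlines.
# Assumes a Ghostscript version that understands -dNoOutputFonts.
outline_pdf() {
    out="${1%.*}-rst.pdf"            # strip only the last extension
    # -o sets the output file and implies -dBATCH -dNOPAUSE.
    gs -q -o "$out" -sDEVICE=pdfwrite -dNoOutputFonts "$1" &&
        echo "Output written to $out"
}
```

If the switch is not recognised by your Ghostscript, the text will most likely stay selectable, so check the result by trying to select some text in it.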
Result
No match for OCR though :D
– Mark K Cowan
Aug 12 '13 at 17:54
6
Well, I guess there is no way to trick OCR software (without adding things like striking or crossing out the text), because OCR software can read whatever you can read.
– Henri Menke
Aug 12 '13 at 20:50
Specialised OCR software can break some types of CAPTCHA too... You can use excessive striking/deforming/noise to harden your file against this, but then humans won't be able to read half of it either!
– Mark K Cowan
Aug 12 '13 at 23:52
1
Your script works, but (at least on my files, which are slides produced with beamer) produces very pale and large files. It also takes a relatively long time to finish.
– Anthony Labarre
Oct 22 '13 at 8:46
1
@Trickster You run it like bash script_name "path/to/pdf/file". You don't need sudo, as no additional privileges are needed for this script.
– Henri Menke
Nov 24 '13 at 9:30
edited May 23 '17 at 12:39
Community♦
answered Aug 12 '13 at 17:45
Henri Menke
up vote
10
down vote
If content can be viewed, it can be copied.
No matter what encryption and restrictions are used, at some point the content must be put out in plain view in order for it to be of any use.
This is probably true of all digital content and most physical content larger than the nanoscale...
For example, a PDF:
- Rasterisation: Printscreen => OCR
- Any protection: Re-type it out
- Content protection: Modified build of an open-source reader
Web content:
- Right-click popup: Opera => Prevent page receiving context menu events
- Right-click popup: "Menu" button on any modern keyboard
- Flash: Download the SWF file, decompile it using free software
- View page source, use Chrome/Opera/Firefox debugger to get URL of desired content
Audio (e.g. HDCP):
- Headphones socket on TV => line-in socket on PC
- Solder to tap into preamplifier => line-in socket on PC
Video (e.g. HDCP):
- Many, many options... A quick google search will show you.
Encrypted content on someone's laptop/pendrive:
- Digital brute force: brute-force cracking of the encryption key
- Physical brute force: http://imgs.xkcd.com/comics/security.png
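To make the "Rasterisation => OCR" item concrete, here is a minimal sketch that assumes the poppler tool pdftoppm and the tesseract OCR program are both installed. It renders only the first page and prints whatever text tesseract recognises; the function name `ocr_first_page` is invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: render page 1 of a PDF to PNG and run OCR on it.
ocr_first_page() {
    tmp=$(mktemp -d) || return 1
    # -singlefile writes exactly $tmp/page.png; 300 dpi usually helps OCR accuracy.
    pdftoppm -r 300 -png -f 1 -singlefile "$1" "$tmp/page" &&
        tesseract "$tmp/page.png" stdout
    status=$?
    rm -rf "$tmp"
    return $status
}
```

The point is only that text which is visible on screen can be recovered this way, whatever was done to the PDF.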
One of these is not like the others. The last item is both wrong and a different scenario from your premise.
– Caleb
Dec 20 '14 at 15:54
answered Aug 12 '13 at 17:48
Mark K Cowan
up vote
5
down vote
The answer is: Yes.
There is a way described here: http://spivey.oriel.ox.ac.uk/corner/Obfuscated_PDF
But it looks tedious and doesn't use pdflatex. The method, however, is described as being portable to PDF. It involves changing the glyphs of a font and other dirty things that give you bad dreams.
I didn't find a method that works on PDF directly, let alone something automated for pdflatex. I'll happily buy you a beverage of your choice if you implement it :-)
answered Feb 15 '13 at 15:26
Frederick Nord
up vote
2
down vote
I am using gswin32.
First convert the PDF to PostScript:
"C:\Program Files (x86)\gs\gs9.09\bin\gswin32.exe" -sDEVICE=ps2write -r9000 ^
-dNOPAUSE -sOutputFile=OUTPUT.ps input_insecure.pdf
Now convert back to PDF with permission mode 4 (read and print only):
"C:\Program Files (x86)\gs\gs9.09\bin\gswin32.exe" -sDEVICE=pdfwrite -r9000 ^
-dNOPAUSE -sPAPERSIZE=a4 -dPDFSETTINGS=/prepress -dMaxSubsetPct=100 ^
-dSubsetFonts=true -dEmbedAllFonts=true -sOwnerPassword=null -dEncryptionR=3 ^
-dKeyLength=40 -dPermissions=4 -sOutputFile=OUTPUT_secure.pdf OUTPUT.ps
On a Unix-based system, you can probably type gs instead of "C:\Program Files (x86)\gs\gs9.09\bin\gswin32.exe".
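On a Unix-like system the same permission flags can probably be passed to gs directly. The following is a single-step sketch, not the exact procedure above: it skips the PostScript intermediate, uses a 128-bit key (which encryption revision 3 also permits) instead of 40 bits, and wraps the call in a function named `secure_pdf`, which is an invented name.

```shell
#!/bin/sh
# Hypothetical helper: re-write a PDF with an owner password and
# -dPermissions=4, the value used in this answer for "read and print only".
# Usage: secure_pdf input.pdf output.pdf owner-password
secure_pdf() {
    gs -q -dBATCH -dNOPAUSE -sDEVICE=pdfwrite \
       -sOwnerPassword="$3" -dEncryptionR=3 -dKeyLength=128 -dPermissions=4 \
       -sOutputFile="$2" "$1"
}
```

As with any permission-based protection, this depends on the reader choosing to honour the flags.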
edited Apr 8 '16 at 19:59
Mico
answered Apr 8 '16 at 19:50
Rafael Duarte
up vote
1
down vote
Use XeTeX to at least get some "nonsense characters", see here and here.
Though this would obviously be just a nuisance for most cases/users (which can be avoided using LuaLaTeX instead), depending on what you are trying to achieve, compiling with XeTeX may prove to add at least some value to your solution...
edited Apr 13 '17 at 12:35
Community♦
answered Aug 14 '13 at 20:31
nutty about natty
up vote
1
down vote
You can use ImageMagick to convert the PDF to an image-only PDF.
Running
convert file1.pdf file2.pdf
will create a PDF called file2.pdf which is about the same size as the input PDF, but since it's an image, the text cannot be selected. There is a notable decrease in quality, though.
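The quality loss is probably due to ImageMagick rasterising at its default of 72 dpi. A hedged sketch of a workaround is to pass `-density` before the input file; the wrapper name `rasterize_pdf` and the 300 dpi default are my own choices, and on ImageMagick 7 the command may be `magick` rather than `convert`.

```shell
#!/bin/sh
# Hypothetical helper: rasterise a PDF at a chosen resolution (default 300 dpi).
# Usage: rasterize_pdf input.pdf output.pdf [dpi]
rasterize_pdf() {
    # -density must come before the input so it applies when the PDF is read.
    convert -density "${3:-300}" "$1" "$2"
}
```

Expect the output file to grow considerably at higher densities.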
answered Feb 23 '17 at 0:04
Matt G