I’ve written a script to help with this. It’s written for the bash shell in macOS Automator, so it may need tweaks to run elsewhere.
The idea is to split each pdf into 3 parts and then splice them back together: the cover, which I’ve rasterized; the images from each page, again rasterized; and the text from each page, blackened and inserted after the images. This makes the text easier for me to read, and makes it easier for the Kindle to handle the images regardless of how they’ve been constructed. Note that it breaks tables of contents.
I’ve also written a variant with -dev dx after each k2pdfopt -mode copy, and with different output file names, for a grayscale output optimized for the Kindle DX.
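A minimal sketch of how the two variants might share one command line. The `variant` switch, the `Gray` prefix and `input.pdf` are hypothetical names used only for illustration; the block just prints the cover command so it can be checked before anything is run:

```shell
variant="dx"            # hypothetical switch: "dx" for the Kindle DX build, anything else for color
k2_extra=""
prefix="RGB"
if [ "$variant" = "dx" ]; then
  k2_extra="-dev dx"    # goes right after -mode copy
  prefix="Gray"         # assumed output prefix, not taken from the original script
fi
# Dry run: print the cover command instead of executing it.
# $k2_extra is deliberately unquoted so "-dev dx" splits into two arguments.
echo "$HOME/Applications/k2pdfopt" -ui -mode copy $k2_extra -p 1 -x -o "$HOME/Splice/${prefix}Cover_copy.pdf" "input.pdf"
```

Dropping the `echo` would run the command for real, assuming k2pdfopt is installed where the post says it is.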
By default k2pdfopt increases contrast, so if you’d rather it didn’t, that’s another tweak to make.
It requires Ghostscript, Cpdf, K2pdfopt, and Qpdf. Cpdf should be free for non-commercial use, but I’d still prefer an open source alternative to it, and it’s no longer available via Homebrew.
I’ve installed k2pdfopt to ~/Applications and I’ve installed the others using Homebrew.
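Since Automator doesn't necessarily share a terminal's PATH, it can help to confirm the four tools are where the script expects them before running it. A rough sketch, assuming the Homebrew tools live in /usr/local/bin (as in the script) and using a hypothetical helper name `find_missing`:

```shell
# Print every path from the argument list that is not an executable file.
find_missing() {
  local path missing=""
  for path in "$@"; do
    [ -x "$path" ] || missing="$missing $path"
  done
  echo "${missing# }"   # strip the leading space
}

# Paths assumed from the install notes above; adjust for your machine.
missing="$(find_missing /usr/local/bin/gs /usr/local/bin/cpdf /usr/local/bin/qpdf "$HOME/Applications/k2pdfopt")"
if [ -n "$missing" ]; then
  echo "Missing tools: $missing" >&2
fi
```

On Apple Silicon Macs Homebrew usually installs under /opt/homebrew/bin instead, so the paths would need changing there.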
Each app seems to have slightly inconsistent standards for standard output and standard input. In the end, I instructed each one to export a set filename to a “Splice” folder, or import a set filename from there. I’ve been able to run the whole sequence that way, first splitting, then processing, and then splicing the pdf back together.
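Because every step reads and writes fixed filenames, a leftover file from an earlier run could end up spliced into the wrong book if one step fails. One possible guard, sketched with a hypothetical helper `prepare_splice_dir`, is to create the folder and clear the known intermediates first:

```shell
# Create the working folder if needed and delete intermediates from a previous run.
# The file names match the ones used in the script; other files are left alone.
prepare_splice_dir() {
  local dir="$1" name
  mkdir -p "$dir" || return 1
  for name in RGBCover_copy Text Blacktext Images RGBImages_copy; do
    rm -f "$dir/$name.pdf"
  done
}

# Usage (inside the loop, before the first k2pdfopt call):
# prepare_splice_dir "$HOME/Splice"
```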
I haven’t yet replaced all the older code that uses backticks (`) for command substitution with $(); maybe eventually.
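For anyone unfamiliar with the two forms, a small illustration (the path is a made-up example): both capture a command's output, but $() nests without escaping.

```shell
f="/tmp/My Book.pdf"                        # hypothetical input path
old_style=`basename "$f" .pdf`              # older backtick form
new_style="$(basename "$f" .pdf)"           # equivalent $() form
parent="$(basename "$(dirname "$f")")"      # nesting is straightforward with $()
echo "$old_style"
echo "$new_style"
echo "$parent"
```

The first two lines print `My Book` and the third prints `tmp`.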
for f in "$@"
do
# Copy and rasterize the 1st page from the source pdf using k2pdfopt
~/Applications/k2pdfopt -ui -mode copy -p 1 -x -o "/Users/Marja/Splice/RGBCover_copy.pdf" "$f"
# Copy text from source pdf file using Ghostscript, turn text black using Cpdf
# The color conversion strategy should help with the 2nd stage if I switch to Ghostscript
# - and -_ indicate standard output and input
# Due to compatibility issues, dumping to ~/Splice/Text.pdf
/usr/local/bin/gs -sDEVICE=pdfwrite -dFILTERIMAGE -dFILTERVECTOR -dCompatibilityLevel=1.4 -sColorConversionStrategy=RGB -sstdout=%stderr -dNOPAUSE -dQUIET -dBATCH -sOutputFile="/Users/Marja/Splice/Text.pdf" "$f" &&
/usr/local/bin/cpdf "/Users/Marja/Splice/Text.pdf" -blacktext -o "/Users/Marja/Splice/Blacktext.pdf"
# Copy images from same source pdf file using Ghostscript, rasterize images using K2pdfopt
# Due to compatibility issues, dumping to ~/Splice/Images.pdf
/usr/local/bin/gs -sDEVICE=pdfimage24 -dFILTERTEXT -dCompatibilityLevel=1.4 \
-g800x1080 -r150 -dPDFFitPage \
-sstdout=%stderr -dNOPAUSE -dQUIET -dBATCH -sOutputFile="/Users/Marja/Splice/Images.pdf" "$f" &&
~/Applications/k2pdfopt -ui -mode copy -x -o "/Users/Marja/Splice/RGBImages_copy.pdf" "/Users/Marja/Splice/Images.pdf" &&
# Splice files using qpdf
# Output goes next to the source file; the _spliced suffix is only a placeholder name, change to taste
outputfile="$(dirname "$f")/$(basename "$f" .pdf)_spliced.pdf" &&
/usr/local/bin/qpdf --collate "/Users/Marja/Splice/RGBCover_copy.pdf" --pages "/Users/Marja/Splice/RGBCover_copy.pdf" "/Users/Marja/Splice/RGBImages_copy.pdf" "/Users/Marja/Splice/Blacktext.pdf" -- "$outputfile"
done
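To make the final splice easier to picture, here is a rough model of the page order that --collate should produce. As far as I understand the qpdf documentation, it takes the first page from each listed file, then the second from each, and carries on with the remaining files once one runs out. The function name `collate_order` and the labels are invented for this illustration; it only prints labels and never touches a pdf:

```shell
# Print the expected page order for a cover, image and text pdf
# with the given page counts, mimicking qpdf --collate.
collate_order() {
  local cover="$1" images="$2" text="$3"
  local max="$cover" i
  if [ "$images" -gt "$max" ]; then max="$images"; fi
  if [ "$text" -gt "$max" ]; then max="$text"; fi
  for ((i = 1; i <= max; i++)); do
    if [ "$i" -le "$cover" ]; then echo "cover:$i"; fi
    if [ "$i" -le "$images" ]; then echo "image:$i"; fi
    if [ "$i" -le "$text" ]; then echo "text:$i"; fi
  done
}

# A 3-page book: 1 cover page, 3 image pages, 3 text pages.
collate_order 1 3 3
```

For a 3-page source this lists cover:1, image:1, text:1, image:2, text:2, image:3, text:3, so the output should have 1 + 2N pages for an N-page source.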