Revisions
Matthew Bafford revised this gist
Jul 2, 2024 . 1 changed file with 15 additions and 9 deletions.
@@ -1,14 +1,20 @@

PDF tools for comparing PDFs visually (overlaying two PDFs to see changed areas) and using a perceptual hash (a numerical value indicating the visual difference between the two files). Useful for command-line review of PDFs and de-duplication. Configure `git` to use these tools for better PDF history / comparison in `git`.

These scripts require `imagemagick` and `poppler`. Both can be installed from Homebrew.

---

Set up `git` to use a custom diff using:

`.gitattributes`:

    *.pdf binary diff=pdf

`.gitconfig`:

    [diff "pdf"]
        ; textconv = ~/bin/pdf2layout
        command = ~/bin/git-diff-pdf
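One thing to keep in mind with `command = ...`: git invokes an external diff driver with seven arguments (`path old-file old-hex old-mode new-file new-hex new-mode`), so the two files to compare arrive as `$2` and `$5` rather than `$1` and `$2`. A minimal sketch of how a script could accept both calling conventions follows. `pick_diff_files` is a hypothetical helper name and is not part of this gist.

```shell
#!/bin/bash
# Hypothetical helper (not part of the gist): print the old and new file paths,
# one per line, whether the script is run by hand or as a git diff driver.
pick_diff_files() {
  if [ "$#" -eq 7 ]; then
    # git external diff: path old-file old-hex old-mode new-file new-hex new-mode
    printf '%s\n%s\n' "$2" "$5"
  elif [ "$#" -eq 2 ]; then
    # manual use: <pdf1> <pdf2>
    printf '%s\n%s\n' "$1" "$2"
  else
    echo "Usage: <pdf1> <pdf2> (or the 7 arguments git passes)" >&2
    return 1
  fi
}
```

Inside a driver script the result could be read back with `{ read -r OLD; read -r NEW; } < <(pick_diff_files "$@")`. This assumes paths without embedded newlines.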
Matthew Bafford revised this gist
Jul 2, 2024 . 2 changed files with 29 additions and 0 deletions.
@@ -0,0 +1,14 @@

Set up `git` to use a custom diff using:

.gitattributes:

    *.pdf binary diff=pdf

.gitconfig:

    [diff "pdf"]
        ; textconv = ~/bin/pdf2layout
        command = ~/bin/git-diff-pdf

These scripts require `imagemagick` and `poppler`. Both can be installed from Homebrew.

@@ -0,0 +1,15 @@

    #!/bin/bash

    # Require two PDF paths.
    if [[ -z "$1" || -z "$2" ]]; then
      echo "Usage: $0 <pdf1> <pdf2>"
      exit 1
    fi

    echo "comparing [$1] and [$2]"

    # pdf2layout from poppler on homebrew (brew install poppler)
    echo "*** text content"
    diff <(~/bin/pdf2layout "$1") <(~/bin/pdf2layout "$2")

    echo "*** image perceptual hash"
    ~/bin/pdf-compare-phash "$1" "$2"
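`~/bin/pdf2layout` is not included in this gist. Since poppler ships `pdftotext`, one plausible shape for it is a thin wrapper around `pdftotext -layout`. The sketch below is an assumption about that helper, not the author's script.

```shell
#!/bin/bash
# Assumed stand-in for ~/bin/pdf2layout (the real helper is not shown in the gist).
# Writes the text of a PDF to stdout, keeping the physical layout.
pdf2layout() {
  if [ -z "${1:-}" ]; then
    echo "Usage: pdf2layout <pdf>" >&2
    return 2
  fi
  if ! command -v pdftotext >/dev/null 2>&1; then
    echo "pdftotext not found (brew install poppler)" >&2
    return 127
  fi
  # "-" as the output file sends the extracted text to stdout
  pdftotext -layout "$1" -
}
```

A helper of this shape would also fit the commented-out `textconv` setting, since git passes a textconv program a single file path and reads the converted text from stdout.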
Matthew Bafford revised this gist
Jul 2, 2024 . 1 changed file with 8 additions and 0 deletions.
@@ -0,0 +1,8 @@

    #!/bin/bash

    # Require two input paths.
    if [[ -z "$1" || -z "$2" ]]; then
      echo "Usage: $0 <pdf1> <pdf2>"
      exit 1
    fi

    # Composite the two inputs with the Difference operator and print the
    # mean of the result (0 means the inputs are identical).
    convert -metric phash "$1" null: "$2" -compose Difference -layers composite -format '%[fx:mean]\n' info:
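In the command above, the printed value is the mean of the Difference composite. The `-metric` setting is normally consumed by ImageMagick's `compare`, so it likely has no effect on that number. If an actual perceptual-hash distance is wanted, a sketch using `magick compare -metric phash` could look like this. `phash_distance` is a hypothetical name, and ImageMagick 7 (`magick`) is assumed to be on the `PATH`.

```shell
#!/bin/bash
# Hypothetical alternative, not the gist's script (assumes ImageMagick 7).
phash_distance() {
  if [ -z "${1:-}" ] || [ -z "${2:-}" ]; then
    echo "Usage: phash_distance <image1> <image2>" >&2
    return 2
  fi
  # compare writes the metric to stderr, so fold it into stdout;
  # null: discards the difference image. compare exits non-zero when the
  # images differ, so the exit status is deliberately ignored here.
  magick compare -metric phash "$1" "$2" null: 2>&1 || true
}
```

For multi-page PDFs, passing a page selector such as `file.pdf[0]` keeps the comparison to a single page, as the visual comparison script does.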
Matthew Bafford created this gist
Jul 2, 2024
@@ -0,0 +1,52 @@

    #!/bin/bash

    # Require two PDF paths.
    if [[ -z "$1" || -z "$2" ]]; then
      echo "Usage: $0 <pdf1> <pdf2>"
      exit 1
    fi

    # NOTE: --suffix is a GNU mktemp option; the BSD mktemp shipped with macOS
    # does not accept it.
    TMP=$(mktemp --suffix=.png)

    echo "Comparing [$1] to [$2]"
    echo "Saving difference in $TMP"
    echo

    # Rasterization resolution (DPI) used when rendering the PDFs.
    DENSITY=100

    # this supports both simple file names and page indexed file names like:
    # file[0] file[1] - will either return one line for each page, or a single
    # line if a single page is specified
    PAGES1=$(magick identify "$1" | wc -l)
    PAGES2=$(magick identify "$2" | wc -l)

    if (( PAGES1 != PAGES2 )); then
      echo "Number of pages between documents does not match: $PAGES1 != $PAGES2"
      echo "Only comparing the first page."

      magick compare -density "$DENSITY" -background white "$1[0]" "$2[0]" "$TMP"
      PHASH_DIFF=$(~/bin/pdf-compare-phash "$1[0]" "$2[0]")
    elif (( PAGES1 > 5 )); then
      echo "Too many pages ($PAGES1 > 5) to create hyper-image with all pages."
      echo "Only comparing first page."

      magick compare -density "$DENSITY" -background white "$1[0]" "$2[0]" "$TMP"
      PHASH_DIFF=$(~/bin/pdf-compare-phash "$1[0]" "$2[0]")
    else
      # convert the PDFs into a single image with the pages vertically stacked
      ALL1=$(mktemp --suffix=.png)
      magick convert -density "$DENSITY" "$1" -append "$ALL1"

      ALL2=$(mktemp --suffix=.png)
      magick convert -density "$DENSITY" "$2" -append "$ALL2"

      magick compare -density "$DENSITY" -background white "$ALL1" "$ALL2" "$TMP"
      PHASH_DIFF=$(~/bin/pdf-compare-phash "$ALL1" "$ALL2")

      # remove the intermediate stacked images; only $TMP is shown below
      rm -f "$ALL1" "$ALL2"
    fi

    if [ "$TERM_PROGRAM" = "iTerm.app" ]; then
      echo "Visual difference between images:"
      echo "--------------------------------"
      # imgcat-small is not part of this gist; assumed to be a local helper
      # that displays an image inline in iTerm
      imgcat-small "$TMP"
      echo "--------------------------------"
    else
      open "$TMP"
    fi

    echo "Perceptual hash difference (0 is exactly the same): $PHASH_DIFF"
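Because the script otherwise targets macOS (`open`, iTerm, Homebrew), the `mktemp --suffix=.png` calls only work if GNU coreutils' `mktemp` comes first on the `PATH`. A portable sketch, assuming it is acceptable to create a private temporary directory and place the PNG inside it (`make_temp_png` is a hypothetical name):

```shell
#!/bin/bash
# Hypothetical helper: print a fresh path ending in .png inside a new temp
# directory. A template with trailing X's is accepted by both GNU and BSD mktemp.
make_temp_png() {
  _tmp_png_dir=$(mktemp -d "${TMPDIR:-/tmp}/pdfdiff.XXXXXX") || return 1
  # optional first argument: base name of the file (defaults to "diff")
  printf '%s/%s.png\n' "$_tmp_png_dir" "${1:-diff}"
}
```

With this in place, `TMP=$(make_temp_png)` and `ALL1=$(make_temp_png all1)` could stand in for the `mktemp --suffix=.png` calls. The file itself is not created until ImageMagick writes to it.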