I have a bunch of PDF files and my Perl program needs to do a full-text search of them to return which ones contain a specific string.
To date I have been using this:
my @search_results = `grep -i -l \"$string\" *.pdf`;
where $string is the text to look for.
However this fails for most pdf's because the file format is obviously not ASCII.
What can I do that's easiest?
There are about 300 pdf's whose name I do not know in advance. PDF::Core is probably overkill. I am trying to get pdftotext and grep to play nice with each other given I don't know the names of the pdf's, I can't find the right syntax yet.
Final solution using Adam Bellaire's suggestion below:
@search_results = `for i in \$( ls ); do pdftotext \$i - | grep --label="\$i" -i -l "$search_string"; done`;