A friend of mine is asking this question. Do any of you librarian types have some wisdom to share with him? "Has anyone had to OCR/index thousands of scanned PDFs? If so, what did you use to automate this one-time operation? I have 13,000 scanned PDFs, and I'd like to extract text info from the 1st page of each... I'm looking into tesseract and its derivatives..."
Would you like me to repost it on the Library Society of the World (https://mokum.place/lsw)? · bentley
Good idea, @bentleywg. I just did that. ‎· Mr. Noodle
