Turning PDFs from nightmares into opportunities.
We usually stay away from PDFs as data sources because they are so inconvenient to handle. For my own research needs and to help out a colleague, I have developed two functions that open up PDFs as new research opportunities. You can run:
- complex search queries on batches of PDFs
- text extraction on zones *you define* in batches of PDFs
Both are free, fully point-and-click, need no registration or installation, and respect your data.
Check them out here: https://nocodefunctions.com
Use case: we often have access to great data sources such as news articles, academic papers, forms from public administrations and private organizations, press releases, and database extractions of all kinds.
If these documents are formatted as PDFs, the text they contain is hard to access cleanly. The two functions change that by making it easy to search and extract text from many PDFs at once, precisely and with advanced controls.