PDF text search - 2013

I have been looking through the forum archives, and it seems that a couple of years ago it was not possible to search the content of a PDF in a site search. In 2009, Franz suggested that an updated File Manager would change this. Has it changed? Is it now possible to search the contents of a PDF and return it within page results?

Thanks,
Tim

tofraser
 
Responsive replied:
I guess you could use Google search on the site and make sure the PDF is set up as recommended here: http://blog.hubspot.com/blog/tabid/6307/bid/28898/How-to-Optimize-a...
tofraser replied:
That's not ideal, but it may be the best solution...
ryan replied (Best Answer):
concrete5 supports extensible file importers, so you can run code on files based on their type when they're added to the File Manager.
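
As a rough illustration of the idea (this is not concrete5's actual API), a type-based importer hook can be modeled as a registry keyed by file extension. All names here (`IMPORTERS`, `register_importer`, `on_file_added`, the `extracted_text` key) are hypothetical, and the PDF "parser" is only a stand-in for a real text extraction step.

```python
import os

# Hypothetical registry mapping a file extension to an importer callback.
IMPORTERS = {}


def register_importer(extension):
    """Decorator that registers a callback for one file extension."""
    def decorator(func):
        IMPORTERS[extension.lower()] = func
        return func
    return decorator


@register_importer("pdf")
def import_pdf(path, attributes):
    # Stand-in for real PDF text extraction (assumption: a real importer
    # would call an external tool or library here).
    attributes["extracted_text"] = "text extracted from " + os.path.basename(path)


def on_file_added(path):
    """Run the importer matching the file's type, if any, and return its attributes."""
    attributes = {}
    extension = os.path.splitext(path)[1].lstrip(".").lower()
    importer = IMPORTERS.get(extension)
    if importer is not None:
        importer(path, attributes)
    return attributes
```

Files with no registered importer simply pass through with no extra attributes, which is why the lookup uses `IMPORTERS.get` rather than indexing directly.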

Once you have parsed the contents of a file and stored them in a text attribute, you can make that attribute searchable using the file list object. You could also use the Document Library add-on from the marketplace. If you specifically want the PDF to show up in the page search, make sure the file block's getSearchableContent returns the text from the attribute where the PDF's content is stored. Then, when a file block is placed on a page, the contents of the PDF are added to that page's searchable content.
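
The flow described above can be sketched as a toy model. This is Python, not concrete5 code: `StoredFile`, `FileBlock`, `Page`, `search_pages`, and the `pdf_text` attribute handle are hypothetical names chosen for illustration; only the idea of a block's getSearchableContent (here `get_searchable_content`) comes from the answer itself.

```python
from dataclasses import dataclass, field


@dataclass
class StoredFile:
    name: str
    # Attribute handle -> value; "pdf_text" is an assumed handle.
    attributes: dict = field(default_factory=dict)


@dataclass
class FileBlock:
    file: StoredFile
    link_text: str = ""

    def get_searchable_content(self):
        # Return the link text plus the PDF text stored in the attribute.
        pdf_text = self.file.attributes.get("pdf_text", "")
        return (self.link_text + " " + pdf_text).strip()


@dataclass
class Page:
    title: str
    blocks: list = field(default_factory=list)

    def searchable_content(self):
        # A page's index entry is the concatenation of its blocks' content.
        return " ".join(b.get_searchable_content() for b in self.blocks)


def search_pages(pages, keyword):
    """Return titles of pages whose indexed content contains the keyword."""
    needle = keyword.lower()
    return [p.title for p in pages if needle in p.searchable_content().lower()]
```

In this model a page only becomes findable by the PDF's text once a file block pointing at that file is placed on it, which mirrors the behaviour the answer describes.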