Developing (v7+)

PDF text search - 2013

October 02, 2013 at 9:58 AM

I have been looking through the forum archives, and it seems that a couple of years ago it was not possible to search the content of a PDF in a site search. Franz suggested in 2009 that an updated File Manager would change this. Has that changed? Is it now possible to search the contents of a PDF and return it within page results?

Thanks,
Tim

Responsive replied on Oct 2, 2013 at 10:38 am

I guess you could use Google search on the site and make sure you set up the PDF as recommended here: http://blog.hubspot.com/blog/tabid/6307/bid/28898/How-to-Optimize-a...

tofraser replied on Oct 2, 2013 at 10:59 am

That's not ideal, but it may be the best solution...

ryan replied on Oct 4, 2013 at 2:32 pm (Best Answer)

concrete5 supports extendable file importers, so you can run code on files, based on their type, when they're added to the file manager.

Once you have parsed the contents of a file and placed them in a text attribute, you could make that attribute searchable using the file list object. You could also use the document library add-on from the marketplace. If you specifically wanted the PDF to show up in the page search, you could make sure the file block's getSearchableContent returned the text from the attribute that the PDF's content was stored in. Then, when a file block is placed on a page, it adds the contents of the PDF to the page's searchable content.
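To make that flow a bit more concrete, here is a rough, language-neutral sketch in Python. It is not concrete5 code and does not use the concrete5 API: `StoredFile`, `register_importer`, `extract_pdf_text`, `FileBlock`, `page_search` and the `pdf_text` attribute name are all hypothetical stand-ins, and the "extractor" just decodes bytes where a real site would call a PDF text-extraction tool. It only illustrates the three steps: run an importer by file type, store the text in an attribute, and have the block hand that text to the page search.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class StoredFile:
    """Hypothetical stand-in for a file manager entry."""
    name: str
    data: bytes
    attributes: Dict[str, str] = field(default_factory=dict)


# Maps a lowercase file extension to a function that runs when a file is added.
IMPORTERS: Dict[str, Callable[[StoredFile], None]] = {}


def register_importer(extension: str):
    """Register a function to run on every added file with this extension."""
    def decorator(func: Callable[[StoredFile], None]):
        IMPORTERS[extension.lower()] = func
        return func
    return decorator


def extract_pdf_text(data: bytes) -> str:
    """Placeholder only: a real site would call a PDF text extractor here."""
    return data.decode("utf-8", errors="ignore")


@register_importer("pdf")
def index_pdf(f: StoredFile) -> None:
    # Step 2: keep the parsed text in a text attribute on the file.
    f.attributes["pdf_text"] = extract_pdf_text(f.data)


def add_file(name: str, data: bytes) -> StoredFile:
    """Step 1: on add, run the importer registered for the file's type, if any."""
    f = StoredFile(name, data)
    ext = name.rsplit(".", 1)[-1].lower() if "." in name else ""
    importer = IMPORTERS.get(ext)
    if importer is not None:
        importer(f)
    return f


class FileBlock:
    """Hypothetical block that places a file on a page."""

    def __init__(self, f: StoredFile) -> None:
        self.file = f

    def get_searchable_content(self) -> str:
        # Step 3: expose the stored attribute text to the page search.
        return self.file.attributes.get("pdf_text", "")


def page_search(pages: Dict[str, List[FileBlock]], term: str) -> List[str]:
    """Return the pages whose blocks contain the term (case-insensitive)."""
    term = term.lower()
    return [
        path
        for path, blocks in pages.items()
        if any(term in b.get_searchable_content().lower() for b in blocks)
    ]
```

With this shape, a page only becomes findable by a PDF's text once a file block for that PDF is on it, which matches the behaviour described above. Files of other types pass through untouched because no importer is registered for them.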
