Searching Contents of Attachments

This forum is for all Flare issues related to the HTML5, WebHelp, WebHelp Plus, and Adobe AIR targets.
wrleinwe
Propeller Head
Posts: 13
Joined: Thu Aug 04, 2016 8:10 am

Searching Contents of Attachments

Post by wrleinwe »

Hi all,

I've attached a series of PDF files to the web outputs I manage, and I anticipate adding more. The PDFs work great (especially for our clients working with Chrome), but Flare's search engine doesn't seem to look into their contents: if I search for a sentence or a term that appears in one of the PDFs, nothing comes up.

I'm wondering if I'm missing something obvious that would prevent the search feature from looking into the contents of attached files.

Thanks,

Bill
roboHAL
Sr. Propeller Head
Posts: 254
Joined: Mon Dec 31, 2012 9:57 am

Re: Searching Contents of Attachments

Post by roboHAL »

Hi Bill. No, you're not missing anything obvious. The Flare search engine does not index the content of PDF files. For that matter, to my understanding it does not index the content of most other non-HTML files either. There is (or was) an output type called "WebHelp Plus" which, when hosted on a Microsoft IIS server, was supposed to solve the issue of searching PDFs. In my opinion that never really worked all that well. Most PDFs do contain real, searchable text (scanned ones are just page images and need OCR first), which is why the search in Adobe Reader can find it. To my understanding, though, Flare's built-in search only indexes the topics it builds and never opens the PDF. Would be nice though! :)
wrleinwe
Propeller Head
Posts: 13
Joined: Thu Aug 04, 2016 8:10 am

Re: Searching Contents of Attachments

Post by wrleinwe »

roboHAL wrote:Hi Bill. No, you're not missing anything obvious.

...
I suspected this would be the answer. Bummer. Any ideas for a potential workaround? I feel like I could embed the PDF text in the 'background' of the HTML topic... but even assuming that works, it would lead to odd search results.
roboHAL
Sr. Propeller Head
Posts: 254
Joined: Mon Dec 31, 2012 9:57 am

Re: Searching Contents of Attachments

Post by roboHAL »

In a way, I can offer a workaround. 8)

Realistically, my suggestion should already be the case: titling. The title of the PDF should be accurate and indicative of its content. The text of that title should appear somewhere in the HTML topics, where the Flare search engine can find it, and the PDF can be hyperlinked from that text.

To take that further: a title usually runs somewhere between one and eight words. If there are significantly more "key" words that you want to associate with that title/PDF, you could create index keywords and assign them to the title text. Flare search will find those words, but again, the presumption is that the user will understand to click the link and open/download the PDF.
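
If you'd rather not pick those keywords by hand, here's a rough sketch (Python, standard library only) of how candidate index keywords could be pulled from a PDF's text. It assumes you've already extracted the text with some other tool. The function name, the stop-word list, and the thresholds are all made up for illustration, and none of this is part of Flare. You'd still add the resulting words as index keywords in Flare yourself.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; extend it for real content).
STOP_WORDS = frozenset({
    "about", "also", "from", "have", "into", "that", "their",
    "there", "these", "this", "when", "which", "will", "with", "your",
})


def candidate_keywords(text, limit=10, min_length=4):
    """Return up to `limit` frequent words from `text` as keyword candidates.

    Words shorter than `min_length` and words in STOP_WORDS are skipped.
    Results are ordered by frequency (highest first), then alphabetically.
    """
    # Lowercase words, allowing internal hyphens (e.g. "server-based").
    words = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    counts = Counter(
        w for w in words if len(w) >= min_length and w not in STOP_WORDS
    )
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:limit]]


if __name__ == "__main__":
    # Hypothetical text, standing in for what was extracted from one PDF.
    sample = (
        "The pump calibration procedure. Calibration of the pump requires "
        "the calibration kit. Pump pressure limits."
    )
    print(candidate_keywords(sample, limit=3))
```

Treat the output as a list of suggestions to review, not something to paste in blindly. Frequent words aren't always the ones your users will actually search for.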

Moreover, you could add a short note explaining that the search in the Flare output does not look inside PDFs, and that once a PDF is open, the search function in Adobe Reader (or another viewer) should be used.

Hope that helps :)
wrleinwe
Propeller Head
Posts: 13
Joined: Thu Aug 04, 2016 8:10 am

Re: Searching Contents of Attachments

Post by wrleinwe »

roboHAL wrote:In a way, I can offer a workaround. 8)

...
Haha. Well, I'm happy to say that I'm familiar with all of these items, and already have these in place!

I'll be sure to update this post if I find some other strategy.
NorthEast
Master Propellus Maximus
Posts: 6363
Joined: Mon Mar 05, 2007 8:33 am

Re: Searching Contents of Attachments

Post by NorthEast »

Actually, I think you could be missing something obvious...

If your server is running IIS, you can use HTML5 Server-based output, which would index PDF and other file formats for your search.

https://help.madcapsoftware.com/flare2017/Content/Output/HTML5_Output/Enabling_HTML5_Server_Output.htm
wrleinwe
Propeller Head
Posts: 13
Joined: Thu Aug 04, 2016 8:10 am

Re: Searching Contents of Attachments

Post by wrleinwe »

Dave Lee wrote:Actually, I think you could be missing something obvious...

...
Dave, you rock. Thanks! I'm going to set this up.