Hi there,
I had a bit of a play on the weekend and was able to get the search results using the following approach (note that this was for a Windows application rather than a web app):
I passed the search URL to a hidden WebBrowser control:
Code:
<YourPath>/Default.htm#search-<YourSearchQuery>
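For anyone scripting this step, here is a rough Python sketch of how that URL pattern could be assembled. The function name build_search_url is made up for illustration, and percent-encoding the query is my assumption; I have not confirmed whether the search page requires it.

```python
from urllib.parse import quote


def build_search_url(base_path: str, query: str) -> str:
    """Return <base_path>/Default.htm#search-<query> (hypothetical helper)."""
    # Assumption: spaces and other special characters in the query are
    # percent-encoded before being placed in the URL fragment.
    return f"{base_path.rstrip('/')}/Default.htm#search-{quote(query)}"
```

The base path would be wherever your published output lives.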
Once the document was loaded, I located the <ul> element whose id attribute is "resultList". This element contains all of the search results and looks something like this:
Code:
<ul id="resultList">
  <li>
    <h3 class="title"><a href="#Tech Writing Procedures/Automated Builds.htm?Highlight=Automated Build"><b>Automated</b> <b>Build</b>s</a></h3>
    <div class="description">The technical documentation is automatically built and published each night at 5 pm. Overview of the <b>Build</b> Process The overnight <b>build</b> currently performs the following tasks: Author-it documents are built. Flare documents are built. The documents are uploaded to Overnight<b>Build</b> folder. The document ...</div>
    <div class="url"><cite>Tech Writing Procedures/Automated Builds.htm</cite></div>
  </li>
</ul>
There is one <li> child element per search result. Each <li> contains the following child elements:
<h3> - The title and URL for the topic (I think this is what you need)
<div class="description"> - The meta description, or the first paragraph of the topic if no meta description has been set.
<div class="url"> - The path to the topic file.
From there, it was simply a case of iterating through each <li> child element and extracting the required information.
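The iteration described above could be sketched roughly as follows. This is not the WebBrowser control code I used; it is a stand-in using Python's built-in html.parser, and it assumes you already have the rendered HTML of the results page as a string. The names ResultListParser and extract_results, and the dictionary keys, are made up for illustration.

```python
from html.parser import HTMLParser


class ResultListParser(HTMLParser):
    """Collects one dict per <li> found inside <ul id="resultList">."""

    def __init__(self):
        super().__init__()
        self.results = []      # one dict per search result
        self._in_list = False  # True while inside the resultList <ul>
        self._field = None     # key currently receiving text, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "ul" and attrs.get("id") == "resultList":
            self._in_list = True
        elif not self._in_list:
            return
        elif tag == "li":
            self.results.append(
                {"title": "", "href": "", "description": "", "path": ""}
            )
        elif not self.results:
            return
        elif tag == "h3":
            self._field = "title"
        elif tag == "a" and self._field == "title":
            self.results[-1]["href"] = attrs.get("href", "")
        elif tag == "div" and attrs.get("class") == "description":
            self._field = "description"
        elif tag == "div" and attrs.get("class") == "url":
            self._field = "path"

    def handle_endtag(self, tag):
        if tag == "ul":
            # Assumes the result list contains no nested <ul> elements.
            self._in_list = False
        elif tag in ("h3", "div"):
            self._field = None

    def handle_data(self, data):
        # Text inside <b> highlight tags is appended to the current field.
        if self._in_list and self._field and self.results:
            self.results[-1][self._field] += data


def extract_results(html: str) -> list:
    """Return the search results found in the given HTML string."""
    parser = ResultListParser()
    parser.feed(html)
    return parser.results
```

Calling extract_results on the page HTML gives a list with one entry per result, and each entry's "title" and "href" values come from the <h3> element.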
At this point I should probably point out that I am a technical writer who does a bit of development work on the side. Being a self-taught coder, I suspect there is a more elegant way to extract this information that I am not aware of. However, the above does work, so at least there is one option.
"In an ideal world, software should be simple, well designed, and completely intuitive to end users. In the real world, good documentation is king."