Modify stop words list for HTML5 output

JRtechw
Propeller Head
Posts: 68
Joined: Thu Oct 05, 2017 8:08 pm

Post by JRtechw »

Yes, it’s possible, but it does involve making an extremely minor change to one of Flare’s main .js files and overwriting this file in the output. I’m not sure if this kind of solution is frowned upon in the user forums, so if this needs to be quietly excised, so be it.

However, in my defence:

1. It’s such a minor change it doesn’t even deserve to be called a ‘hack’. You don’t need to know JavaScript in the slightest. All you need is Notepad. I'd say it's about equivalent to overriding skin styling with custom CSS.

2. The business case for being able to modify the stop word list is so phenomenally strong that I don’t care that much anyway. Just adding three stop words to the default list stripped a huge number of junk results out of our Knowledge Base search and our users love the new search.

So, here it is. This works for Flare 2019 at least, and I'm pretty sure it would work for Flare 2018 versions too.

1. Generate your HTML5 output as normal.
2. Go to the Scripts folder in your output: Output / <username> / <target> / Resources / Scripts.
3. Copy the file MadCapAll.js.
4. Open this file in a basic text editor.
5. Search for ‘stop’. You’ll be taken to this chunk of code:

Code:

MadCap.Utilities.StopWords=Array("a","an","the","to","of","is","for","and","or","do","be","by","how","he","she","on","in","i","at","it","not","no","are","as","but","her","his","its","non","only","than","that","then","they","this","we","were","which","what","with","you","into","about","after","all","also","been","can","come","from","had","has","have","me","made","many","may","more","most","near","over","some","such","their","there","these","under","use","was","when","where","against","among","became","because","between","during","each","early","found","however","include","late","later","med","other","several","through","until","who","your");
(It should be the first result, but if it isn’t, keep searching until you find it.)

6. Add or delete the stop words you want, within double quotes and separated by commas. Don’t worry about ordering. Don’t edit anything outside the parentheses.
7. Save the file with the same name in a local folder.
8. Add a post-build command in your HTML5 Target > Build Events tab:

Code:

copy /Y "C:\<path>\MadCapAll.js" "$(OutputDirectory)\Resources\Scripts\"
Where <path> is where you saved the modified .js file. The /Y forces the overwrite without prompting.

9. Rerun your build and publish and check the difference in search in the output. Your new stop words should produce no results.
10. Distribute the modified file and path to any collaborative authors. If you use source control, each author working on the project will need to put the modified file in the same local path.
11. Each time a new build of Flare is released, repeat the edit of the MadCapAll.js file to capture any new build changes.

That’s it.
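
For anyone who would rather script step 6 than hand-edit in Notepad, here is a minimal Python sketch of the same edit. It assumes the array assignment looks exactly like the chunk quoted in step 5; the function name edit_stop_words and the sample string are made up for illustration, and it only works on text in memory (reading and writing MadCapAll.js is left to you).

```python
import re

# Matches the array assignment quoted in step 5. Group 2 captures the
# comma-separated, double-quoted words between the parentheses.
STOP_WORDS_RE = re.compile(r'(MadCap\.Utilities\.StopWords=Array\()([^)]*)(\))')


def edit_stop_words(js_text, add=(), remove=()):
    """Return js_text with words added to / removed from the stop word array."""
    match = STOP_WORDS_RE.search(js_text)
    if match is None:
        raise ValueError("StopWords array not found; has the file format changed?")
    to_remove = set(remove)
    words = [w for w in re.findall(r'"([^"]*)"', match.group(2)) if w not in to_remove]
    for word in add:
        if word not in words:  # skip words that are already in the list
            words.append(word)
    new_array = ",".join(f'"{w}"' for w in words)
    # Only the text between the parentheses is replaced.
    return js_text[:match.start(2)] + new_array + js_text[match.end(2):]


# Hypothetical, shortened stand-in for the contents of MadCapAll.js.
sample = 'var x=1;MadCap.Utilities.StopWords=Array("a","an","the","near");var y=2;'
print(edit_stop_words(sample, add=["which", "a"], remove=["near"]))
# var x=1;MadCap.Utilities.StopWords=Array("a","an","the","which");var y=2;
```

Raising an error when the array isn't found is deliberate: if a new Flare build changes the file, you want the script to fail loudly rather than silently do nothing.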

Possible Troubleshooting

If nothing has changed, check the timestamp of the MadCapAll.js file in the output > Scripts folder. It should be different to the other files in the folder.

If it’s the same, check your build log to see if your post build command worked or failed.

If it’s different, view the output project in the browser, right-click, and select Inspect. At the bottom of the Inspection window (in Chrome, might be different for other browsers) check the Console tab for error messages. If you see any JavaScript errors, you might have accidentally edited outside of the array definition. Start again. It's also possible that you may have saved the file with a different encoding, but this *shouldn't* matter.
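
If you'd like to catch a slipped quote or comma before it ever reaches the browser, a rough shape check like the one below could be run on the edited file's text. This is only a sketch: it is a pattern match, not a JavaScript parser, it assumes the array looks like the chunk quoted in step 5, and the name stop_words_look_valid is hypothetical.

```python
import re

# One or more double-quoted words separated by single commas, nothing else,
# followed by the closing parenthesis and semicolon.
ARRAY_RE = re.compile(
    r'MadCap\.Utilities\.StopWords=Array\("[^",()]*"(?:,"[^",()]*")*\);'
)


def stop_words_look_valid(js_text):
    """True if the stop word array is present and cleanly formed."""
    return ARRAY_RE.search(js_text) is not None


# Hypothetical, shortened examples of the edited line.
good = 'MadCap.Utilities.StopWords=Array("a","an","the");'
missing_quote = 'MadCap.Utilities.StopWords=Array("a","an,"the");'
unquoted_word = 'MadCap.Utilities.StopWords=Array("a",an,"the");'

print(stop_words_look_valid(good))           # True
print(stop_words_look_valid(missing_quote))  # False
print(stop_words_look_valid(unquoted_word))  # False
```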

I strongly recommend editing MadCapAll.js from the Output folder rather than the original source file that is loaded during build.

Hope this is useful.
WriterAndrew
Propeller Head
Posts: 50
Joined: Tue Mar 05, 2019 2:43 am

Re: Modify stop words list for HTML5 output

Post by WriterAndrew »

JRtechw,
Sounds interesting...
I didn't see it mentioned in your post, but what "three stop words" did you add that stripped lots of your junk results? (If they worked so well for you, they might work equally well for others...)
Thanks
JRtechw
Propeller Head
Posts: 68
Joined: Thu Oct 05, 2017 8:08 pm

Re: Modify stop words list for HTML5 output

Post by JRtechw »

They were 'I', 'how', and 'which' (or at least I'm pretty sure the third one was 'which'). Definitely adding 'I' helped, because most Google-esque search attempts seemed to be 'how do I...', 'how can I...', and variations thereof.
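
To picture why those words matter, here is a toy Python illustration of stop word filtering on a 'how do I...' query. It is not Flare's actual search code; the word set is a small illustrative subset, the exact-match comparison is an assumption, and strip_stop_words is a made-up name.

```python
# Illustrative subset only, not the full default list.
STOP_WORDS = {"how", "do", "can", "I", "i", "which"}


def strip_stop_words(query, stop_words=STOP_WORDS):
    """Drop any query term that exactly matches a stop word."""
    return [term for term in query.split() if term not in stop_words]


print(strip_stop_words("how do I reset my password"))
# ['reset', 'my', 'password']
```

With the filler words gone, only the terms that actually describe the topic are left to match against.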
kodster28
Jr. Propeller Head
Posts: 3
Joined: Wed Mar 04, 2020 4:19 pm

Re: Modify stop words list for HTML5 output

Post by kodster28 »

Howdy all,

Thanks JRtechw for your solution.

I made some improvements to make it a bit easier for my team:
1. Created a Python program to add stop words
2. Built an .exe file to run the program for other people on our team
3. Added the .exe file to a post-build event

I've included the files in a Google Drive folder if they're helpful: https://drive.google.com/file/d/1HDYiNJ ... sp=sharing

I'll at least have to double-check the program every time Flare creates a new release, but it's a start!
doloremipsum
Sr. Propeller Head
Posts: 290
Joined: Mon Aug 26, 2019 2:11 pm

Re: Modify stop words list for HTML5 output

Post by doloremipsum »

With the help of a developer who was conveniently sitting nearby, I've found an alternative solution to this problem using PowerShell.

There are three components to this solution:
1) StopWords.txt, which contains a list of words which I need to add to the stop words list. I stored this in the Flare folder, Project/Advanced/StopWords.

Code:

how
what
I
i
(I found I needed to put both uppercase and lowercase "i" in there to prevent the search from highlighting every instance of the letter i on the page.)

2) AddStopWords.ps1, a PowerShell script which takes the content from the text file and inserts the words into MadCapAll.js at an appropriate location. You'll see it has a couple of parameters for the textFilePath and the jsFilePath - this allows it to work on different computers under different users. It is also stored in Project/Advanced/StopWords.

Code:

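# Arguments: the folder containing StopWords.txt, then the folder containing MadCapAll.js
$textFilePath = $args[0]
$jsFilePath = $args[1]
$jsFile = Join-Path $jsFilePath 'MadCapAll.js'
# Read the extra words and turn them into a quoted, comma-separated list
$additionalStopWords = Get-Content -Path (Join-Path $textFilePath 'StopWords.txt') -Raw
$additionalStopWords = '"' + ((-split $additionalStopWords) -Join '","') + '",'
# Insert the new words at the start of the existing array and write the file back
(Get-Content -Path $jsFile) -replace [regex]::Escape('MadCap.Utilities.StopWords=Array('), ('MadCap.Utilities.StopWords=Array(' + $additionalStopWords) | Out-File -encoding ASCII $jsFile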
3) Flare post-build script which runs this PowerShell script and passes the necessary file paths to the parameters. If you have nice file directories that don't have spaces in them, this is simple:

Code:

powershell -Command $(ProjectDirectory)Project\Advanced\StopWords\AddStopWords.ps1 $(ProjectDirectory)Project\Advanced\StopWords\ $(OutputDirectory)\Resources\Scripts\
The -Command part runs the PowerShell script, and the final two arguments tell it where to find the text file and MadCapAll.js respectively.
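
If the string juggling in AddStopWords.ps1 is hard to follow, this Python sketch walks through the same two transformations on in-memory text: build a quoted prefix from the word list, then splice it in right after the opening parenthesis of the array. It is an illustration rather than a replacement for the script; the function names and the shortened sample strings are invented here.

```python
MARKER = "MadCap.Utilities.StopWords=Array("


def build_prefix(stop_words_text):
    """Turn whitespace-separated words into '"w1","w2",' (note the trailing comma)."""
    words = stop_words_text.split()  # splits on any whitespace, like unary -split
    return '"' + '","'.join(words) + '",'


def add_stop_words(js_text, stop_words_text):
    """Insert the extra words at the front of the existing array."""
    return js_text.replace(MARKER, MARKER + build_prefix(stop_words_text))


# Stand-ins for the contents of StopWords.txt and MadCapAll.js.
txt = "how\nwhat\nI\ni\n"
js = 'MadCap.Utilities.StopWords=Array("a","an");'

print(build_prefix(txt))
# "how","what","I","i",
print(add_stop_words(js, txt))
# MadCap.Utilities.StopWords=Array("how","what","I","i","a","an");
```

The trailing comma on the prefix is what lets it sit in front of the existing first word. Note that running it twice on the same text would add the words twice, which is fine for a post-build step that always starts from freshly generated output.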

However, if you do have spaces in your file directory names, Flare absolutely does not accept this no matter what combination of quotation marks you add in. It really can't handle spaces within filenames when trying to run the script and pass arguments to it. However, I did manage to work around this:

Code:

cd "$(OutputDirectory)"
powershell -Command ..\..\..\..\Project\Advanced\StopWords\AddStopWords.ps1 ..\..\..\..\Project\Advanced\StopWords\ .\Resources\Scripts\
Basically, you have to go to the output directory first, then work your way back up to locate the things in the Project directory. I guess this means it's resolving the spaces in the filepaths at a different point in the process where it handles them properly, and starting from the output directory avoids thorny issues like the username. It does work on both my PC and my coworker's so I'm calling it a success.
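
How many ..\ steps you need depends on how deep your output folder sits below the project folder, so it's worth checking against your own layout. The Python sketch below works it out for two invented layouts; every folder name in it is hypothetical, and ntpath is used so the Windows path rules apply even if you run it elsewhere.

```python
import ntpath  # Windows path rules, usable on any OS


def relative_from_output(output_dir, target_dir):
    """Relative path that reaches target_dir when starting in output_dir."""
    return ntpath.relpath(target_dir, start=output_dir)


# Hypothetical locations; substitute your own project layout.
project_dir = r"C:\Docs\My Project"
stop_words_dir = ntpath.join(project_dir, "Project", "Advanced", "StopWords")

shallow_output = ntpath.join(project_dir, "Output", "jane", "HTML5")
deeper_output = ntpath.join(project_dir, "Output", "jane", "Online", "HTML5")

print(relative_from_output(shallow_output, stop_words_dir))
# ..\..\..\Project\Advanced\StopWords
print(relative_from_output(deeper_output, stop_words_dir))
# ..\..\..\..\Project\Advanced\StopWords
```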

One little gotcha: your PC might not allow the PowerShell script to run by default (see https://learn.microsoft.com/en-us/power ... rshell-7.3). You can change this by running PowerShell and using the command:

Code:

Set-ExecutionPolicy -ExecutionPolicy RemoteSigned
Bit of a lengthy writeup, but there's so little guidance out there on what you can actually do with post-build commands that I thought I would give a worked example. Hope this helps people trying to escape the tyranny of "how do I" searches!