Post-build task to strip whitespace characters from code snippets


Post by AlexFox »

As the title suggests, this PowerShell script searches code snippets for non-breaking space entities and removes them.

Specifically, it looks for <code> blocks inside <div class="codeSnippetBody"> blocks and removes any instances of &#160; or &nbsp;. It then collapses any remaining runs of whitespace into a single space and trims the result.
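
To make the effect concrete, here is a minimal sketch that applies the same regex and clean-up steps to a single string instead of a file. The sample markup and the function name Remove-SnippetNbsp are made up for illustration; real Flare output will look different.

```powershell
# Hypothetical helper (not part of the script itself): applies the same
# regex and clean-up steps to one HTML string and returns the result.
function Remove-SnippetNbsp {
    param([string]$Html)

    $pattern = '(?is)(<div[^>]*class="codeSnippetBody"[^>]*>.*?<code>)(.*?)(</code>.*?</div>)'

    return [regex]::Replace($Html, $pattern, {
        param($match)

        # Strip the non-breaking space entities from the <code> content
        $code = $match.Groups[2].Value -replace '&#160;', '' -replace '&nbsp;', ''
        # Collapse every run of whitespace (line breaks included) into one space
        $code = ($code -replace '\s+', ' ').Trim()

        return $match.Groups[1].Value + $code + $match.Groups[3].Value
    })
}

# Made-up sample: two indented lines inside a code snippet
$sample = '<div class="codeSnippetBody"><code>&#160;&#160;line one' + "`n" + '&nbsp;&nbsp;line two</code></div>'

Remove-SnippetNbsp -Html $sample
# Output: <div class="codeSnippetBody"><code>line one line two</code></div>
```

Note that the line break between the two lines is collapsed as well, which is why the indentation disclaimer below matters. Entities outside a codeSnippetBody div are left alone.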

Disclaimers:
  • This is a combination of my remedial PS knowledge and some ChatGPT heavy lifting. It works for me, but it might not work for you.
  • You will obviously lose all indentation from your code block. You could fix this with some very specific CSS padding, but that is out of scope for this thread.
How to use:
1) Save the following content into a .ps1 file in the root of your Project (same place as your .flprj file). Call it anything you like, but remember the name, because the post-build command in step 2 refers to it. I went with remove_160.ps1.

Code:

# Define the directory path as an argument or hard-code it if needed
param (
    [string]$directoryPath = "C:\path\to\your\directory"  # Replace with your directory path
)

# Stop with a non-zero exit code if the directory does not exist
if (!(Test-Path $directoryPath)) {
    Write-Host "Directory does not exist: $directoryPath"
    exit 1
}

# Recursively iterate through all .html and .htm files in the directory and subdirectories
Get-ChildItem -Path $directoryPath -Recurse -Include *.html, *.htm -File | ForEach-Object {
    # Keep the path in a named variable rather than relying on $_ inside the regex callback below
    $filePath = $_.FullName
    # Read as UTF-8 (assumes the Flare output is UTF-8) so non-ASCII characters survive the round trip
    $fileContent = Get-Content $filePath -Raw -Encoding UTF8

    # Use regex to find <code> elements inside <div class="codeSnippetBody"> and remove &#160; and &nbsp;
    $updatedContent = [regex]::Replace($fileContent, '(?is)(<div[^>]*class="codeSnippetBody"[^>]*>.*?<code>)(.*?)(</code>.*?</div>)', {
        param($match)

        # Log each match to confirm that this section is triggered
        Write-Host "Match Found in File: $filePath"
        Write-Host "Group 1 (Opening Tags): $($match.Groups[1].Value)"
        Write-Host "Group 2 (Code Content): $($match.Groups[2].Value)"
        Write-Host "Group 3 (Closing Tags): $($match.Groups[3].Value)"

        # Remove all occurrences of &#160; and &nbsp; within the <code> content
        $codeContent = $match.Groups[2].Value -replace '&#160;', '' -replace '&nbsp;', ''
        # Collapse every run of whitespace (spaces, tabs, and line breaks) into a single space
        $codeContent = $codeContent -replace '\s+', ' '
        $codeContent = $codeContent.Trim()  # Trim leading and trailing whitespace

        # Return the updated content to replace the original match
        return $match.Groups[1].Value + $codeContent + $match.Groups[3].Value
    })

    # Debug marker: add "<p>Processed by post-build task</p>" before the closing </body> tag (visible in the output)
    if ($updatedContent -notlike '*<p>Processed by post-build task</p>*') {
        $updatedContent = $updatedContent -replace '(</body>)', '<p>Processed by post-build task</p>$1'
    }

    # Write the updated content back to the file, using the same encoding it was read with
    Set-Content $filePath -Value $updatedContent -Encoding UTF8
    Write-Host "Updated file: $filePath"
}

2) In your Target settings, put the following code in the Post-Build Event Command:

Code:

powershell -ExecutionPolicy Bypass -File "$(ProjectDirectory)remove_160.ps1" -directoryPath "$(OutputDirectory)\content\topics"

3) Build the target!
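
Before wiring the script into the target, it can be useful to see how many entities it would remove from an existing build. The following is only a sketch of a dry run that uses the same regex as the script: Get-SnippetNbspCount is a hypothetical helper name, the path is a placeholder, and nothing is written back to disk.

```powershell
# Hypothetical helper: counts the &#160; / &nbsp; entities that sit inside
# <code> blocks within <div class="codeSnippetBody"> blocks.
function Get-SnippetNbspCount {
    param([string]$Html)

    $pattern = '(?is)(<div[^>]*class="codeSnippetBody"[^>]*>.*?<code>)(.*?)(</code>.*?</div>)'
    $count = 0
    foreach ($m in [regex]::Matches($Html, $pattern)) {
        # Group 2 is the content between <code> and </code>
        $count += [regex]::Matches($m.Groups[2].Value, '(?i)&#160;|&nbsp;').Count
    }
    return $count
}

# Placeholder path; point this at your own output folder
$directoryPath = "C:\path\to\your\directory"

if (Test-Path $directoryPath) {
    Get-ChildItem -Path $directoryPath -Recurse -Include *.html, *.htm -File | ForEach-Object {
        $count = Get-SnippetNbspCount -Html (Get-Content $_.FullName -Raw)
        if ($count -gt 0) {
            # Report only; the file itself is not modified
            Write-Host "$($_.FullName): $count entities would be removed"
        }
    }
}
```

If nothing is listed for a folder that you know contains code snippets, the class name or markup in your output probably differs from what the regex expects.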

Re: Post-build task to strip whitespace characters from code snippets

Post by AlexFox »

IMPORTANT: I left some debug code in the above example (the part that adds a "Processed by post-build task" paragraph to every page). I can't edit the post because these Forums are awful. I apologise. Use the revised code below for the PS script; all other steps are the same:

Code:

# Define the directory path as an argument or hard-code it if needed
param (
    [string]$directoryPath = "C:\path\to\your\directory"  # Replace with your directory path
)

# Stop with a non-zero exit code if the directory does not exist
if (!(Test-Path $directoryPath)) {
    Write-Host "Directory does not exist: $directoryPath"
    exit 1
}

# Recursively iterate through all .html and .htm files in the directory and subdirectories
Get-ChildItem -Path $directoryPath -Recurse -Include *.html, *.htm -File | ForEach-Object {
    # Keep the path in a named variable rather than relying on $_ inside the regex callback below
    $filePath = $_.FullName
    # Read as UTF-8 (assumes the Flare output is UTF-8) so non-ASCII characters survive the round trip
    $fileContent = Get-Content $filePath -Raw -Encoding UTF8

    # Use regex to find <code> elements inside <div class="codeSnippetBody"> and remove &#160; and &nbsp;
    $updatedContent = [regex]::Replace($fileContent, '(?is)(<div[^>]*class="codeSnippetBody"[^>]*>.*?<code>)(.*?)(</code>.*?</div>)', {
        param($match)

        # Log each match to confirm that this section is triggered
        Write-Host "Match Found in File: $filePath"
        Write-Host "Group 1 (Opening Tags): $($match.Groups[1].Value)"
        Write-Host "Group 2 (Code Content): $($match.Groups[2].Value)"
        Write-Host "Group 3 (Closing Tags): $($match.Groups[3].Value)"

        # Remove all occurrences of &#160; and &nbsp; within the <code> content
        $codeContent = $match.Groups[2].Value -replace '&#160;', '' -replace '&nbsp;', ''
        # Collapse every run of whitespace (spaces, tabs, and line breaks) into a single space
        $codeContent = $codeContent -replace '\s+', ' '
        $codeContent = $codeContent.Trim()  # Trim leading and trailing whitespace

        # Return the updated content to replace the original match
        return $match.Groups[1].Value + $codeContent + $match.Groups[3].Value
    })

    # Write the updated content back to the file, using the same encoding it was read with
    Set-Content $filePath -Value $updatedContent -Encoding UTF8
    Write-Host "Updated file: $filePath"
}