Friday, April 20, 2018

Retaining document relative links when copying from HTML to Word

We have an automated document generation tool that creates HTML documentation for a given database. The generated HTML uses relative links between elements, which is very convenient for users who browse the documentation.
For example, somewhere in the document a database table is documented
<h2>Table <a name="TheTable">TheTable</a></h2>
and somewhere else the table is referenced
<a href="#TheTable">TheTable</a>
Let's create the simplest possible HTML as an example

This is a reference to <a href="#TheTable">TheTable</a>.

This is the definition of <a name="TheTable">TheTable</a>

and copy/paste it into Word.
Take a closer look at what happened to the relative link: it has been pasted as a link to the source HTML document! That's kind of a disaster. I definitely don't want my relative links to suddenly become absolute and, what's worse, point from Word to an external source HTML document!
This inconvenience can be fixed manually: I can right-click the link, edit its properties and change the link type from an existing file or web page to a bookmark in this document.
However, manually fixing hundreds of links is a daunting task.
Fortunately, this can be fixed automatically with a VBA script stored in the Word document. The script creates a new link over the very same range of the document, but with an empty Address property. After much trial and error, I've determined that this is the only difference between external links and internal, relative links.
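To see that difference for yourself before changing anything, a small diagnostic macro along these lines might help. It is only a sketch: the name ListHyperlinks is made up for illustration, and it prints the Address and SubAddress of every hyperlink to the Immediate window (Ctrl+G in the VBA editor).

```vba
Sub ListHyperlinks()
    ' Hypothetical helper, not part of the fix itself:
    ' prints both address parts of every hyperlink in the document.
    Dim h As Hyperlink
    For Each h In ActiveDocument.Hyperlinks
        Debug.Print "Address=[" & h.Address & "]", "SubAddress=[" & h.SubAddress & "]"
    Next h
End Sub
```

For a freshly pasted link I would expect Address to hold the location of the source HTML file and SubAddress to hold the anchor name (TheTable in the example), whereas a link that stays inside the document should show an empty Address.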
Follow these steps to fix your links:
  1. Copy/paste your HTML into Word
  2. Press Alt+F11 to open the VBA editor
  3. Double-click ThisDocument to open the code editor for scripts in the current document
  4. Paste the script into the editor window
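    Sub FixHyperlinks()
        Dim i As Long
        Dim h As Hyperlink
        ' Walk backwards by index, because Hyperlinks.Add changes the collection
        For i = ActiveDocument.Hyperlinks.Count To 1 Step -1
            Set h = ActiveDocument.Hyperlinks(i)
            ' Skip links without an anchor part; they cannot point to a bookmark
            If h.SubAddress <> "" Then
                ActiveDocument.Hyperlinks.Add h.Range, "", h.SubAddress
            End If
        Next i
    End Sub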
  5. Place the cursor somewhere inside the script and press F5 (or click the green triangle) to run the macro
Relative links now correctly point to elements in the very same document, and even exporting to PDF retains the correct behavior.
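If you want to double-check the result, something like the following sketch could be run afterwards. It assumes that Word has turned each pasted <a name="..."> anchor into a bookmark with the same name, which I have not verified for names that Word does not accept as bookmark names (for example ones containing spaces). The macro name CheckInternalLinks is hypothetical.

```vba
Sub CheckInternalLinks()
    ' Hypothetical helper: reports internal links whose target bookmark is missing.
    ' Assumes pasted HTML anchors became bookmarks with unchanged names.
    Dim h As Hyperlink
    For Each h In ActiveDocument.Hyperlinks
        If h.Address = "" And h.SubAddress <> "" Then
            If Not ActiveDocument.Bookmarks.Exists(h.SubAddress) Then
                Debug.Print "Missing bookmark: " & h.SubAddress
            End If
        End If
    Next h
End Sub
```

Anything it prints to the Immediate window is a link that would lead nowhere inside the document and is worth checking by hand.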
