While I was pondering which topic to write about first, along came a project with an interesting challenge that I thought could turn into an article.
In some highly regulated industries, such as Legal and Pharmaceuticals/Healthcare, it is not uncommon to get requests to update an existing translation where the edits in the new version of the source document are indicated by the Track Changes feature.
This type of project presents a unique challenge for quoting, scheduling and processing.
The Challenge: How to get the Track Changes statistics?
The main challenge in this type of project is to get reliable revision statistics on which to base the quote and the schedule.
The first statistic that comes to mind is the word count of the insertions and deletions. However, the word count alone is usually not enough, because this is not a straightforward translation project. The number of insertions and deletions can significantly affect the complexity of the project: each edit needs to be identified in the source, then its corresponding location needs to be found in the translated document (which can be quite difficult in documents that lack orienting elements such as numbers, bullet lists or unique formatting), and only then does the real work of translation begin. In other words, translating an insertion of 1,000 words in one continuous text segment is a lot easier than revising 1,000 individual insertions of one word each, although at first glance the word count of these two examples is the same.
Hence, more detailed statistics are needed.
The Solution: The Reviewing Pane and a Macro
There are different ways to approach this challenge. Some save a copy of the file, accept all the changes, and then compare the two versions to get a rough word count. If a TM of the previous translation is available, it can be leveraged in a Translation Environment Tool such as OmegaT, SDL Studio, MemoQ, etc. This includes the very efficient way of working with DOCX files that contain Track Changes that was introduced in SDL Studio 2011 SP2, as explained in more detail in an article by Paul Filkin. I’m sure there are more methods and tricks in use that I know nothing about.
Personally, I find the first method described above too risky and inaccurate, and there isn’t always a TM available. True, the existing translation could be aligned, but this is not always practical or efficient when one is only interested in preparing a quote, not to mention that not everyone uses a Translation Environment Tool to begin with.
So, assuming that there isn’t a TM available, or that it isn’t practical to use one, there is a need for a fairly reliable, quick, and hassle-free method for getting the relevant statistics. For that I like to use two built-in features of Word: the Reviewing Pane and a macro.
Please note that I’m using Word 2010. The Reviewing Pane works the same in Word 2007 and 2013, but I’m not sure about Word 2003 or previous versions.
The Reviewing Pane
Word has a nice and often overlooked feature called the Reviewing Pane and, more specifically, its Revision Summary section. This section shows the number of insertions, deletions, moves, and other edits made to the document. Note that a move is also counted as one insertion and one deletion; nevertheless, it can be handy to know how many text segments were moved within the document.
To bring up the Reviewing Pane:
- Switch to the Review tab on the Ribbon.
- Click the Reviewing Pane button in the Tracking group of the Review tab.
This data is important but incomplete. It is incomplete because one insertion could add one word or any number of words to the document, and therefore the revision statistics must be supplemented by the word count statistic to provide a more accurate and reliable estimation of the work involved.
The Track Changes Statistics Macro
The easiest way to get the revision word count is probably to use a macro. I thought that such a macro would be readily available, but I was wrong. A quick web search produced several resources, including this great article by Emma Goldsmith in her Signs & Symptoms of Translation blog. That macro counts only the words in the insertions. That is generally enough, because the insertion word count is the more important one to have (for deletions, the number of instances is usually more significant), but I still prefer to have a more complete set of statistics available. So I sat down to write one, and a few hours later… I failed! It didn’t work.
Luckily, when I went back online to search for hints as to why some of the commands weren’t working, I stumbled upon the full solution in an article by Allen Wyatt of Allen Wyatt’s Word Tips.
Mr. Wyatt was kind enough to agree to share his code in this article, and I thank him for that.
The Code for the Track Changes Statistics Macro (last updated: November 2015):
Sub GetTCStats()
    Dim lInsertsWords As Long
    Dim lInsertsChar As Long
    Dim lDeletesWords As Long
    Dim lDeletesChar As Long
    Dim sTemp As String
    Dim oRevision As Revision

    lInsertsWords = 0
    lInsertsChar = 0
    lDeletesWords = 0
    lDeletesChar = 0

    For Each oRevision In ActiveDocument.Revisions
        Select Case oRevision.Type
            Case wdRevisionInsert
                lInsertsChar = lInsertsChar + Len(oRevision.Range.Text)
                lInsertsWords = lInsertsWords + oRevision.Range.Words.Count
            Case wdRevisionDelete
                lDeletesChar = lDeletesChar + Len(oRevision.Range.Text)
                lDeletesWords = lDeletesWords + oRevision.Range.Words.Count
        End Select
    Next oRevision

    sTemp = "Insertions" & vbCrLf
    sTemp = sTemp & " Words: " & lInsertsWords & vbCrLf
    sTemp = sTemp & " Characters: " & lInsertsChar & vbCrLf
    sTemp = sTemp & "Deletions" & vbCrLf
    sTemp = sTemp & " Words: " & lDeletesWords & vbCrLf
    sTemp = sTemp & " Characters: " & lDeletesChar & vbCrLf
    MsgBox sTemp
End Sub
Note: GetTCStats (in the first line of code) is the name of the macro. It could be renamed as long as the new name contains up to 80 letters or numbers (without symbols or spaces) and begins with a letter, as per Word’s macro name limitations.
When this macro is run, a small window pops up and displays the word and character counts of the insertions and deletions.
Figuring out how much to charge for updating an existing translation when the changes in the source are indicated by the Track Changes feature is somewhat of a challenge. The word count alone doesn’t account for the time and effort required to locate the edits in the source and target documents, and therefore a more complete set of statistics is needed.
When no TM is available, or when it is not practical to create or use one, I think that the method described in this article for getting the revision summary (i.e. the number of insertions, deletions, and moves that can affect the complexity level of the project) and the word count statistics is a fairly simple, quick, and reliable one.
It is always a good idea to follow up and visually inspect the document to determine how many of the insertions are also deletions (i.e. text replacements) and what the average segment size is. Working with larger segments of text is usually easier than working with a large number of edits that consist of individual words or short sentences.
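The average segment size can also be roughly estimated with a small variation on the macro above. The following is only a sketch and is not part of Mr. Wyatt's code: the macro name GetTCSegmentStats and its variables are my own hypothetical choices. It counts the insertions and deletions as instances and divides the word totals by those counts. Keep in mind that Word's Words collection treats punctuation and paragraph marks as separate items, so the averages will tend to run a little higher than a regular word count, and the instance counts may not match the Revision Summary exactly.

```vba
Sub GetTCSegmentStats()
    ' Hypothetical companion to GetTCStats (name and variables are assumptions).
    ' Counts insertion and deletion instances and reports the average
    ' number of Words items per instance.
    Dim oRevision As Revision
    Dim lInsertCount As Long
    Dim lInsertWords As Long
    Dim lDeleteCount As Long
    Dim lDeleteWords As Long
    Dim sTemp As String

    For Each oRevision In ActiveDocument.Revisions
        Select Case oRevision.Type
            Case wdRevisionInsert
                lInsertCount = lInsertCount + 1
                lInsertWords = lInsertWords + oRevision.Range.Words.Count
            Case wdRevisionDelete
                lDeleteCount = lDeleteCount + 1
                lDeleteWords = lDeleteWords + oRevision.Range.Words.Count
        End Select
    Next oRevision

    sTemp = "Insertions: " & lInsertCount & vbCrLf
    ' Guard against division by zero when there are no insertions
    If lInsertCount > 0 Then
        sTemp = sTemp & " Average words each: " & _
            Format(lInsertWords / lInsertCount, "0.0") & vbCrLf
    End If

    sTemp = sTemp & "Deletions: " & lDeleteCount & vbCrLf
    ' Same guard for deletions
    If lDeleteCount > 0 Then
        sTemp = sTemp & " Average words each: " & _
            Format(lDeleteWords / lDeleteCount, "0.0") & vbCrLf
    End If

    MsgBox sTemp
End Sub
```

A low average (close to one word per insertion) suggests many scattered edits that each need to be located in the translation, while a high average suggests a few large blocks of new text. It does not replace the visual inspection, because it cannot tell which insertions are paired with deletions as replacements.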