
Remove Hidden Metadata from Word Documents

posted Oct 9, 2010, 9:04 PM by Brian Clark

Unfortunately, metadata has curtailed one of the courtesies attorneys in litigation formerly extended to one another: providing discovery requests in an electronic format so that opposing counsel’s assistant didn’t have to re-type the requests when answering discovery. Without going into a full definition of metadata and its various forms, metadata is the ‘hidden’ information inside electronic documents such as Word documents (.doc) and Adobe PDFs (.pdf). This information can be as simple as the author’s name, or it can include scary things such as a document’s change history and the comments made on it. Obviously, this is not information one always wants to disseminate when distributing a copy of a document.

A colleague once informed me that, because of metadata, he refused to provide a digital copy of his tendered discovery to opposing counsel in any case where it was requested. His reason for this breach of professional courtesy was the possible existence of metadata in his discovery. Needless to say, most attorneys whose requests were denied did not take it kindly and returned the perceived lack of cooperation in kind whenever they could throughout litigation. When we returned to his office after lunch, I downloaded a free add-on for Microsoft Word that removes metadata from Word documents, and I also demonstrated how to remove metadata from PDFs using Acrobat 8.0’s built-in features. Having learned how to remove the metadata, my friend now works together with opposing counsel once again. Today I’m going to show you how you too can once again share your electronic documents (either .pdf or .doc) without fear of disclosing sensitive information through metadata.



The easiest and cheapest way to remove metadata from Word documents is Microsoft’s free Office 2003/XP plugin, which removes metadata from a .doc file for you. According to Microsoft’s download page, the plug-in will *“permanently remove hidden and collaboration data, such as change tracking and comments, from Word 2003/XP, Excel 2003/XP, and PowerPoint 2003/XP files”. In practice, I have never found metadata in a Word document after installing and running this plugin, and I have checked with all of the metadata extractors TechnoEsq uses in its computer forensics consulting. To use the plugin, you simply run it, and a new icon that removes the metadata from the open document will appear on your Microsoft Word, Excel and PowerPoint toolbars. If the icon does not appear, you can click ‘File’ and then select ‘Remove Hidden Data’. You then save the document (we suggest saving it under a different name so that you keep both your ‘cleaned’ version and your commented version) and you may then provide that document to anyone with the *assurance that it is metadata free.
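If you would like a quick sanity check of your own before sending the cleaned copy, here is a minimal Python sketch. It is not part of Microsoft’s plugin and is no substitute for a real metadata extractor: it only searches the raw bytes of a file for strings you supply (an author name, a firm name, a phrase from a deleted comment). The function name `find_leftover_strings` is made up for this illustration, and the sketch assumes the strings would be stored either as single-byte Windows text (cp1252) or as UTF-16, which is a simplification; anything compressed or encoded differently will be missed.

```python
from pathlib import Path


def find_leftover_strings(path, needles):
    """Return the needles that still appear somewhere in the file's raw bytes.

    Crude, case-sensitive check (hypothetical helper, not a forensic tool):
    each needle is searched for as cp1252 text and as UTF-16LE text.
    """
    data = Path(path).read_bytes()
    found = []
    for needle in needles:
        for encoding in ("cp1252", "utf-16-le"):
            try:
                pattern = needle.encode(encoding)
            except UnicodeEncodeError:
                # The needle cannot be written in this encoding; try the next one.
                continue
            if pattern in data:
                found.append(needle)
                break  # one hit is enough to flag this needle
    return found
```

You would call it with the path of the cleaned file and a list of strings you do not want to disclose; an empty list back means none of them were found in those two encodings, not that the file is guaranteed clean.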

Removing metadata from PDFs requires Adobe Acrobat 8.0 or, for prior versions, a commercial third-party plug-in. Simply open a .pdf in Acrobat 8.0, select ‘Document’ and then click ‘Examine Document’. A window will appear that displays all of the hidden metadata Acrobat found in your .pdf. From this point, you can simply check the items you would like to remove (careful: Acrobat defaults to checking all of the items, which include bookmarks and comments that you sometimes don’t want to remove). After removing the items, you will need to save the .pdf, just as in Microsoft’s programs, to preserve the metadata-free .pdf.

So now you can once again return to the golden age of professional courtesy. Maybe you even want to go so far as emailing your discovery to opposing counsel at the same time as you send it by snail mail. Now wouldn’t that set a tone for cooperation in your case?

*As this is Microsoft, TechnoEsq accepts no liability for the accuracy of this statement.