Tech Notes

Saving WordPerfect Files Without Metadata

posted Oct 9, 2010, 9:11 PM by Brian Clark

by Laura Acklen

According to Wikipedia, the word "metadata" comes from the Greek word "meta" and the Latin word "data," so its literal meaning is "data about data." A library catalog is considered metadata because it contains information about publications. In recent years, the term has been used to describe hidden data that is stored within a document. We frequently hear news reports about how the discovery of metadata exposed information that was meant to be kept secret.

This issue is particularly sticky for law offices. Legal information is often confidential, and its inclusion within files could be detrimental to court cases or contract negotiations. Attorneys state that sending documents containing metadata can lead to the disclosure of confidential client information and the breach of client/attorney confidentiality. As a result, the potential for malpractice claims increases. Given the strict confines of the client/attorney relationship, law offices cannot afford to send a document outside the firm without first cleaning up the metadata.

Fortunately, the issue of metadata is primarily a Microsoft® Word problem. Word stores information about a document in a hidden area at the end of the file. You can't access it, so you can't remove it. Although there are macro routines that can scrub this invisible area, they are not foolproof.

WordPerfect® handles metadata differently from Word. Features that might store hidden or attached data are readily available, so confidential or sensitive information can be removed from a file before it is shared electronically. This is one of the many reasons that legal professionals favor WordPerfect.

Where is metadata stored?
WordPerfect has several areas and features that can be used to store information about a document. You can enable or disable each of these options when you use the Save Without Metadata feature.

Undo/Redo history
As you use the Undo and Redo features, WordPerfect maintains a list of your actions. This list allows you to undo or redo an action if it is performed within a specific number of the most recent actions. The specific number of actions that can be undone or redone is set in the options for the Undo/Redo history. Choose Edit > Undo/Redo History to display the Undo/Redo History dialog box. Choose Options to display the Undo/Redo Options dialog box (see Figure 1). You can either type a new number of history items or use the arrows to increase or decrease the setting.

Figure 1: The number of undo/redo items can be set in the Undo/Redo Options dialog box.

Whether the Undo/Redo history is saved with a document depends on whether the Save Undo/Redo Items with document check box is enabled. If the option is enabled, that information is saved with the document, so whoever opens the document can access the information. For example, if you are negotiating a settlement, you would not want opposing counsel to know that you changed a figure from $300,000 to $1,300,000. For confidential or sensitive documents, the Save Undo/Redo Items with document check box should remain disabled so that WordPerfect removes the Undo/Redo history when the document is saved.

Reviewer's annotations
The Document Review feature is used to review a document and make changes. These changes are then accepted or rejected by the author. If you send the document to someone else without first accepting all of the changes, the recipient will be able to see the proposed changes.

The main difference between the Document Review feature in WordPerfect and the Track Changes feature in Word is that in a WordPerfect document, you see the Review Document dialog box whenever you open the document (see Figure 2), so you are less likely to forget that you have unaccepted changes in the document. In Word, you can simply hide the changes, and Word never prompts you about them, so you can easily forget they are there. Furthermore, if you are working with a Word document that someone else created, you may not know that the document contains unaccepted changes.

Figure 2: The Review Document dialog box appears each time you open a document that contains unaccepted reviewer's annotations.

Comment information
Through the Comment feature, you can insert information into a document that will neither be printed nor affect the pagination. The information that you put in a comment is saved in the comment code, so it isn't actually part of the body text. Typically, a comment contains the name of the person who created it. In some cases, comments contain the date and time of their creation.

If a user has entered their initials and selected a user color in the User Information section of the Environment Settings dialog box, their comment bubbles will appear in the user color and their initials will appear in the comment bubble. These small cues help you quickly identify who inserted the comment. You can quickly strip out the user initials and user colors with the Comment Information option in the Save Without Metadata dialog box. The comment text isn't touched, and all of the comment bubbles then look the same: each appears as a generic white bubble.

Document summary data
When creating or saving a document in WordPerfect, you can save summary information within the document. By default, the following information may be present in the summary of a WordPerfect document (other fields can be added and used as well):

  • Descriptive Name
  • Descriptive Type
  • Creation Date
  • Revision Date
  • Author
  • Typist
  • Subject
  • Account
  • Keywords
  • Abstract

Hidden Text
In WordPerfect documents, it is possible to assign a hidden attribute to text (select the text, choose Format > Font, and then enable the Hidden check box in the Appearance section). The display is toggled on and off with View > Hidden Text, so anyone who receives the file can easily turn on the display and view the information. If the hidden text contains information that should not be distributed, it should be removed before sending out the file.

Headers and footers
Headers and footers in documents may contain identifying information, such as the name of the person who last modified the document and the date and time the changes were made. Although this information is useful to you, it may not be appropriate to share with others.

Hyperlinks
WordPerfect documents may contain hyperlinks to other documents or Web pages on an intranet or the Internet. Hyperlinks typically appear as blue underlined text strings. The path and filename of the hyperlinked document can be viewed by looking at the hyperlink's properties (see Figure 3). If the recipients have access to the area in which the hyperlinked document is stored, they will be able to access the files that are hyperlinked.

Figure 3: The path and file name for a hyperlink can be viewed in the Hyperlink Properties dialog box.

OLE Object Information
OLE linked images and other objects may contain linking information, such as the path to the linked image or object. This information can be removed from the document by terminating the OLE link between the object and the associated application. Keep in mind that removing the link may mean that the image or object will no longer be editable from within WordPerfect.

If an object is embedded within a document, the object still retains its own properties, regardless of what is done to the document. These properties include any information not visible in the OLE box window in the document. In other words, if someone extracts an embedded object and opens it in its native application, all of the information that is saved with the object will be visible, including items that cannot be seen in the WordPerfect OLE interface.

Routing Slip
Routing slips allow you to e-mail a document to multiple reviewers. Each reviewer opens the attachment, makes changes, then closes the document. The document is then sent to the next person on the Reviewers list. When the document has been edited by all of the reviewers, it is then sent back to you. See this tutorial for more information on creating routing slips in WordPerfect.

Using the Save Without Metadata Feature
In earlier versions of WordPerfect, the manual cleanup of metadata required a series of steps. Now, you can use a single dialog box to accomplish the same task. WordPerfect X3 introduces the Save Without Metadata feature to ensure that you'll never get caught with confidential or sensitive information in your documents. The new feature makes it simple to remove all metadata from a document, without having to purchase an additional utility or perform a series of manual steps.

To save a WordPerfect document without metadata, choose File > Save Without Metadata. The Save Without Metadata dialog box appears (see Figure 4). WordPerfect automatically adds _mtd to the end of the filename to identify that the file does not contain metadata. If you change the filename, take care not to remove the _mtd. If necessary, click the browse button to move to the drive and folder where you want to save the document.

Figure 4: The new Save Without Metadata feature makes it a snap to remove metadata from a document.
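For offices that script their document handling, the _mtd naming convention can be sketched in a few lines of Python. This is only an illustration, not part of WordPerfect: the function names are hypothetical, and it assumes the _mtd marker is placed just before the file extension. The marker shows only how a file was named, not that its contents are actually free of metadata.

```python
from pathlib import Path

MTD_MARKER = "_mtd"  # marker the article says is added to the filename


def metadata_free_name(original: str) -> Path:
    """Build the name a cleaned copy might receive.

    Assumption: the marker goes before the extension,
    e.g. settlement.wpd -> settlement_mtd.wpd.
    """
    path = Path(original)
    return path.with_name(f"{path.stem}{MTD_MARKER}{path.suffix}")


def has_mtd_marker(name: str) -> bool:
    """Return True if the filename still carries the _mtd marker."""
    return Path(name).stem.endswith(MTD_MARKER)


print(metadata_free_name("settlement.wpd"))
print(has_mtd_marker("settlement_mtd.wpd"))
```

A check like has_mtd_marker could be run on outgoing attachments as a reminder that a file was never saved through Save Without Metadata, or that the marker was removed when the file was renamed.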

If you enable the Keep Original Document Open check box, both the original document and the metadata-free version remain open. Otherwise, the original document closes, and the metadata-free version remains open. Finally, specify which metadata elements to remove from the document by enabling the appropriate check boxes in the Select Metadata to Remove area, and click Save.
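The selection step can also be thought of as a checklist. The Python sketch below is a hypothetical pre-send reminder, not a WordPerfect API: it lists the metadata areas discussed in this article and reports which ones were not selected for removal. The category labels follow this article's section topics and may not match the exact check box names in the dialog box.

```python
# Metadata areas discussed in this article (labels are illustrative).
METADATA_CATEGORIES = (
    "Undo/Redo history",
    "Reviewer's annotations",
    "Comment information",
    "Document summary data",
    "Hidden text",
    "Headers and footers",
    "Hyperlinks",
    "OLE object information",
    "Routing slip",
)


def not_selected(selected):
    """Return the categories left unselected, in article order."""
    unknown = set(selected) - set(METADATA_CATEGORIES)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    return [c for c in METADATA_CATEGORIES if c not in selected]


remaining = not_selected(
    {"Undo/Redo history", "Hidden text", "Comment information"}
)
for category in remaining:
    print("Still in document:", category)
```

Anything the sketch reports as still in the document is a category to reconsider before clicking Save; some items, such as headers and footers, may be left in deliberately.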

Remove Hidden Metadata from Word Documents

posted Oct 9, 2010, 9:04 PM by Brian Clark

Unfortunately, metadata has curtailed one of the courtesies that attorneys in litigation formerly extended to one another: providing discovery requests in an electronic format so that opposing counsel’s assistant didn’t have to re-type the requests when answering discovery. Without going into the definition of metadata and its various forms, metadata is the ‘hidden’ information inside electronic documents such as Word documents (.doc) and Adobe PDFs (.pdf). This information can include something as simple as the author’s name, or something as scary as the change history of a document and the comments made on it. Obviously, this is not the sort of information one always wants to disseminate when distributing a copy of a document.

A colleague once informed me that because of this metadata, he refused to provide a digital copy of his tendered discovery to opposing counsel in any case where it was requested. His reasoning for this departure from professional courtesy was the possible existence of metadata in his discovery. Needless to say, most attorneys whose requests were denied didn’t take kindly to it and returned the perceived lack of assistance whenever they could throughout litigation. When we returned to his office after lunch, I downloaded a free add-on for Microsoft Word that removes metadata from Word documents, and I also demonstrated how to remove metadata from PDFs using Acrobat 8.0’s built-in features. Having learned how to remove the metadata, my friend can once again work together with opposing counsel. Today I’m going to show you how you too can once again share your electronic documents (either .pdf or .doc) without fear of disclosing sensitive information through metadata.



The easiest and cheapest way to remove metadata from Word documents is Microsoft’s free Office 2003/XP plug-in, which removes metadata from a .doc file for you. According to the download page, the plug-in will *“permanently remove hidden and collaboration data, such as change tracking and comments, from Word 2003/XP, Excel 2003/XP, and PowerPoint 2003/XP files.” In practice, I have never found metadata in a Word document after installing and running this plug-in, and I’ve used all of TechnoEsq’s metadata extractors from computer forensics consulting. To use the plug-in, you simply run it, and a new icon that removes the metadata from the open document will appear on your Microsoft Word, Excel and PowerPoint toolbars. If the icon does not appear, you can click ‘File’ and then select ‘Remove Hidden Data’. You then save the document (we suggest saving it under a different name so that you keep both your ‘cleaned’ version and your commented version), and you may then provide that document to anyone with the *assurance that it is metadata free.

Removing metadata from .pdf files requires Adobe Acrobat 8.0 or a commercial third-party plug-in for prior versions. Simply open a .pdf in Acrobat 8.0, select ‘Document’, and then click ‘Examine Document’. A window will appear that displays all of the hidden metadata Acrobat found in your .pdf. From this point, you can simply check the items you would like to remove (careful: Acrobat defaults to checking all of the items, including bookmarks and comments, which you sometimes don’t want to remove). After removing the items, you will need to save the .pdf, just as in Microsoft’s programs, to preserve the metadata-free version.

So now you can once again return to the golden age of professional courtesy. Maybe you even want to go so far as e-mailing your discovery to opposing counsel at the same time as you send it by snail mail. Now wouldn’t that set a tone for cooperation in your case?

*As this is Microsoft, TechnoEsq accepts no liability for the accuracy of this statement.
