Office files and PDFs are frequently handled in whistleblowing.
They include meeting minutes, contracts, reports, email attachments, spreadsheets, presentation materials, and scanned PDFs. All of these can be strong evidence, but they are also formats in which metadata and edit history tend to remain.
In whistleblowing, the issue is not only the file contents, but also how the file was created, who edited it, and what environment it came from.
Information left in Office files
Word, Excel, and PowerPoint files can retain author names, the name of the user who last saved the file, company names, comments, tracked changes, hidden sheets, embedded objects, and similar information.
| Information | Risk |
| --- | --- |
| Author / last saved by | Real names or internal accounts appear |
| Comments | Names of people involved and review contents remain |
| Tracked changes | It becomes clear who edited what |
| Hidden sheets | Data that is not displayed remains |
| Embedded files | Separate materials or internal information are included |
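As a minimal sketch of where the author and last-saved-by fields live: modern Office files (.docx, .xlsx, .pptx) are ZIP packages, and their document properties are normally stored in the parts `docProps/core.xml` and `docProps/app.xml`. The function name `read_office_properties` is hypothetical, and the sketch assumes it is run on a working copy in an isolated environment, never on the original in an everyday environment. Legacy binary formats (.doc, .xls, .ppt) are not ZIP packages and are not covered.

```python
import zipfile
import xml.etree.ElementTree as ET

# Parts of an OOXML package that normally hold document properties.
PROPERTY_PARTS = ("docProps/core.xml", "docProps/app.xml")


def read_office_properties(path):
    """Return {property_name: text} for non-empty entries in the property parts."""
    found = {}
    with zipfile.ZipFile(path) as package:
        names = set(package.namelist())
        for part in PROPERTY_PARTS:
            if part not in names:
                continue
            root = ET.fromstring(package.read(part))
            for element in root.iter():
                text = (element.text or "").strip()
                if text:
                    # Drop the "{namespace}" prefix ElementTree puts on tag names.
                    found[element.tag.rsplit("}", 1)[-1]] = text
    return found
```

Calling `read_office_properties` on a working copy and printing the result typically shows keys such as `creator`, `lastModifiedBy`, `Company`, and `Application`. An empty result only means these two parts held no text; it does not mean the file is clean.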
In an Office file, the visible page is not the whole content.
Office files easily carry work-in-progress information.
Even if a document looks finished on the surface, traces of editing may remain inside the file.
Excel needs particular care: hidden sheets, hidden rows, rows hidden by filters, comments, formulas, defined names, external links, and external data connections can all expose data that is not on screen.
In PowerPoint, pay attention to speaker notes, hidden slides, embedded images, and templates.
In Word, check tracked changes, comments, headers, footers, and document properties.
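Several of the items above can be detected by looking inside the package rather than at the rendered document. The following is a rough sketch, not a complete audit: `find_hidden_office_content` is a hypothetical name, the part paths are the conventional ones Office writes, and the check covers only comment parts, speaker-notes parts, embedded objects, external workbook links, hidden sheets, tracked insertions and deletions in Word, and hidden slides. It does not look at hidden rows, filters, formulas, defined names, headers, or footers.

```python
import zipfile
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the "{namespace}" prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def find_hidden_office_content(path):
    """Return human-readable findings about non-visible content in an OOXML file."""
    findings = []
    with zipfile.ZipFile(path) as package:
        for name in package.namelist():
            lowered = name.lower()
            if "/embeddings/" in lowered:
                findings.append(f"embedded object: {name}")
            if not lowered.endswith(".xml"):
                continue
            if "comments" in lowered:
                findings.append(f"comments part: {name}")
            elif lowered.startswith("ppt/notesslides/"):
                # Presence of the part only; the notes themselves may be empty.
                findings.append(f"speaker notes part: {name}")
            elif lowered.startswith("xl/externallinks/"):
                findings.append(f"external link part: {name}")
            elif lowered == "xl/workbook.xml":
                root = ET.fromstring(package.read(name))
                for element in root.iter():
                    hidden = element.get("state") in ("hidden", "veryHidden")
                    if local_name(element.tag) == "sheet" and hidden:
                        findings.append(f"hidden sheet: {element.get('name')}")
            elif lowered == "word/document.xml":
                root = ET.fromstring(package.read(name))
                # Tracked insertions and deletions are stored as ins/del elements.
                if any(local_name(e.tag) in ("ins", "del") for e in root.iter()):
                    findings.append("tracked changes in word/document.xml")
            elif lowered.startswith("ppt/slides/slide"):
                root = ET.fromstring(package.read(name))
                if root.get("show") in ("0", "false"):
                    findings.append(f"hidden slide: {name}")
    return findings
```

An empty list means none of these specific markers were found, which is weaker than saying the file has no hidden content. Opening the working copy in the application and checking by hand remains necessary.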
Information left in PDFs
PDFs are often assumed to be safe because they look like final versions.
However, PDFs can also retain author, creation software, creation date and time, edit history, annotations, bookmarks, embedded files, and OCR text.
| Information | Risk |
| --- | --- |
| Author | Original document or worker becomes visible |
| Creation software | The environment where it was created can be inferred |
| Annotations / comments | Review history and names remain |
| OCR text | Text you thought you redacted may remain |
| Embedded files | Original materials or attachment information are included |
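One crude way to see that a PDF carries more than its pages is to search the raw bytes for the standard PDF names under which this information is stored. The sketch below is an assumption-laden first look, not a parser: `scan_pdf_markers` and `PDF_MARKERS` are hypothetical names, counts are rough because a plain substring match can also hit longer names, and PDFs that use compressed object streams or escaped names can hide every one of these markers. A marker that is absent therefore proves nothing.

```python
# Standard PDF names and what their presence usually indicates.
PDF_MARKERS = {
    b"/Author": "author field in the document information dictionary",
    b"/Creator": "application that created the source document",
    b"/Producer": "software that produced the PDF",
    b"/CreationDate": "creation timestamp",
    b"/ModDate": "modification timestamp",
    b"/Metadata": "XMP metadata stream",
    b"/Annots": "annotations (comments, but also ordinary links)",
    b"/Outlines": "bookmarks",
    b"/EmbeddedFile": "embedded file attachments",
    b"/OCProperties": "optional content, i.e. layers",
}


def scan_pdf_markers(data):
    """Return {marker: occurrence_count} for markers found in raw PDF bytes."""
    counts = {}
    for marker in PDF_MARKERS:
        hits = data.count(marker)
        if hits:
            counts[marker.decode("ascii")] = hits
    return counts
```

Reading a working copy with `open(path, "rb").read()` and passing the bytes to `scan_pdf_markers` gives a quick list of things to inspect with a proper tool. The same scan should be repeated on the publication copy, since conversion can add its own `/Producer` and timestamps.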
Simply converting to PDF does not make it safe.
If redaction or pixelation is done poorly, the original text can still be extracted.
Because a PDF looks like a finished distribution copy, it invites complacency, yet bookmarks, attachments, and hidden layers can persist alongside the items listed above.
If redaction only places a black rectangle over the text, the underlying characters may remain.
Even when a page appears to be an image, as with a scan, a text layer such as OCR output may remain behind it.
When preparing a PDF for publication, check not only its appearance, but also copyable text, annotations, attachments, and properties.
For redacted areas, check that they are not visible when copied, searched, selected, or opened with another tool.
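The copy, search, and select checks can be backed by a simple automated comparison. The sketch below assumes the text has already been extracted from the publication PDF by a separate tool, ideally by more than one, and that you keep a private list of the strings you redacted. The names `normalize` and `find_leaked_terms` are hypothetical. Whitespace is removed before matching because extracted text is often broken across lines, so the check errs toward false alarms rather than misses.

```python
import re
import unicodedata


def normalize(text):
    """Fold Unicode compatibility forms, case, and whitespace before matching."""
    folded = unicodedata.normalize("NFKC", text).casefold()
    return re.sub(r"\s+", "", folded)


def find_leaked_terms(extracted_text, redacted_terms):
    """Return the redacted terms that still appear in the extracted text."""
    haystack = normalize(extracted_text)
    leaked = []
    for term in redacted_terms:
        needle = normalize(term)
        if needle and needle in haystack:
            leaked.append(term)
    return leaked
```

For example, `find_leaked_terms("Approved by JANE\n  Doe, ref 42", ["Jane Doe", "Acme"])` returns `["Jane Doe"]`. An empty result covers only the text that the extraction tool could see; it says nothing about pixelated images or content the tool failed to read.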
Especially dangerous points in whistleblowing
In whistleblowing, metadata can be dangerous even if it does not directly show a name.
If creation time, version numbers, department names, document numbers, names in comments, or traces of distribution remain, the flow of the material becomes visible.
| Information that remains | What can be inferred |
| --- | --- |
| Version number | When and to whom the material was distributed |
| Document number | Managing department or material classification |
| Commenter | Related department or review staff |
| Hidden data | Information not meant to be published |
| Creation date and time | When someone touched the material |
If the publishing side casually releases materials, not only the whistleblower but also people involved and unrelated employees may be drawn in.
In whistleblowing, there are people who look for the source of the material.
Those people look not only at the body text, but also at version numbers, distribution destinations, commenters, document numbers, templates, and creation times.
For example, if wording that exists only in the latest version is published, people who had access to the latest version may be suspected.
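The version example can be made concrete. If, and only if, more than one version of a document is available, a line-by-line comparison shows which wording exists solely in the version being considered for publication. This is an illustrative sketch with a hypothetical function name, `lines_unique_to`; it compares whole lines after trimming whitespace and will miss changes inside tables, images, or reflowed paragraphs.

```python
def lines_unique_to(candidate, other_versions):
    """Return the non-empty lines of `candidate` found in none of `other_versions`."""
    seen = set()
    for text in other_versions:
        seen.update(line.strip() for line in text.splitlines())
    unique = []
    for line in candidate.splitlines():
        stripped = line.strip()
        if stripped and stripped not in seen:
            unique.append(stripped)
    return unique
```

Lines returned by this function are the passages that narrow the set of possible sources to people who held that specific version, so they deserve extra thought before being quoted or published.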
If commenter names remain, related departments and review paths become visible.
Metadata affects not only the whistleblower, but also document creators, departments that received the material, reviewers, and co-editors.
The publishing side has a responsibility not to publish received materials as-is.
Cautions for checking and processing
When handling Office files and PDFs, separate files for checking, storage, and publication.
If you carelessly process an original file that is needed as evidence, it may become a problem later. On the other hand, you must not leave unnecessary information in a publication file.
| Stage | Caution |
| --- | --- |
| Receiving | Do not casually open the original file in your everyday environment |
| Checking | Look at properties, comments, tracked changes, and hidden elements |
| Storage | Separate the original file and the publication copy |
| Processing | Check the method for redaction, deletion, and conversion |
| Rechecking | Check whether information remains in the publication file |
Specific tools for checking and removing metadata are covered in another article.
The point to take from this section is that format conversion alone does not make a file safe.
Because the original file may matter as evidence, processing and overwriting it directly can damage its evidentiary value and verifiability, while the publication file must not carry unnecessary information.
For this reason, keep the original file, working copy, and publication copy separate.
| File type | How to handle it |
| --- | --- |
| Original file | Store safely to preserve evidentiary value |
| Working copy | Use for checking and processing |
| Publication copy | Remove unnecessary information and recheck |
| Consultation copy | Adjust the scope shown to lawyers or specialists |
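One way to keep the original verifiably untouched is to record a cryptographic hash of it before any work starts and to do all checking on a byte-for-byte copy. The sketch below is a minimal illustration under that assumption; `sha256_of` and `make_working_copy` are hypothetical names, and where the hash is stored, and whether hashing is appropriate at all for your situation, is a separate decision.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_working_copy(original, work_dir):
    """Copy `original` into `work_dir`; return (copy_path, sha256_of_original)."""
    original = Path(original)
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)
    fingerprint = sha256_of(original)
    copy_path = work_dir / original.name
    # copyfile copies the contents only, not permissions or timestamps.
    shutil.copyfile(original, copy_path)
    if sha256_of(copy_path) != fingerprint:
        raise OSError("working copy does not match the original")
    return copy_path, fingerprint
```

Recomputing `sha256_of` on the original later and comparing it with the recorded value shows whether the file's contents have changed since intake. The working copy keeps the original filename here, so it should be renamed before anything derived from it is published.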
For high-risk whistleblowing, do not judge from an article alone; also consider consulting a lawyer, news organization, or trusted support contact.
Whether a file should be deleted or preserved relates not only to anonymity, but also to evidentiary value and legal risk.
Pre-publication check
Before publishing Office files or PDFs, check the following in this order.
1. Does the filename contain a real name, department name, or project name?
2. Do the properties still contain an author, company name, or last-saved-by name?
3. Do comments, tracked changes, or annotations remain?
4. Are there hidden sheets, hidden slides, or speaker notes?
5. Does text remain under PDF redactions?
6. Did you recheck the converted file in another environment?
Checking does not end after one pass.
Recheck after processing, after conversion, and immediately before publication.
Especially after converting to PDF, treat it as a separate file from the source document, and recheck properties, annotations, and copyable text.
Summary
Office files and PDFs can retain author and last-saved-by names, company names, comments, tracked changes, hidden sheets, annotations, OCR text, and embedded files.
In whistleblowing, this information connects to the whistleblower, departments, the flow of materials, and distribution scope.
Simply converting to PDF does not make it safe.
Separate the original file, checking copy, and publication copy, and check metadata and invisible elements before publication.
Office files and PDFs can be strong evidence, while also being file formats that describe where they came from.
Related tools
Metadata inspection
ExifTool
An external resource related to this article. Open it only when it fits your situation and threat model.
Why it is listed: It can help with the article topic, but it is outside Anonymity Sense and should be checked before use.