How to Inspect Documents Before Publishing

Before publishing a document, check not only the body text but also the file's internal data, its filename, comments, revision history, sharing history, and its state after PDF conversion.

A real name left in the body text is not the only threat to anonymity.

Author names, organization names, comments, revision history, hidden sheets, annotations, filenames, and cloud sharing links can also become clues that identify the author or people connected to them.

This article walks through the process of inspecting documents before publishing.

Inspection Targets

In document inspection, check the visible surface of the document and its internal data separately.

| Target | What to check |
| --- | --- |
| Body text | Names, place names, affiliations, timeline, internal terms |
| Comments | Related people's names, editing notes, review content |
| Revision history | Information before deletion, editors, edit times |
| Metadata | Creator, creation date and time, app name, company name |
| Filename | Real name, department name, project or case name, date |
| Sharing method | Cloud link, permissions, recipients, login state |

Documents cannot be judged by appearance alone.

Before publishing, check body text, internal information, and sharing method separately.

Basic Inspection Steps

Pre-publication inspection should be done in a fixed order.

| Order | Task | Reason |
| --- | --- | --- |
| 1 | Copy the original | Do not mix the original and public copy |
| 2 | Read the body text | Check direct identifiers and proper nouns |
| 3 | Look at comments and revision history | Check information from the editing process |
| 4 | Look at metadata | Check creator and app information |
| 5 | Change the filename | Reduce personal information on the outside |
| 6 | Convert to publication format | Convert to PDF or regenerate if necessary |
| 7 | Recheck after conversion | See whether information remains in the new file |

The order matters because creating a new file partway through, for example by converting to PDF, can attach different metadata, so checks made before that step do not cover the new file.

Inspection includes rechecking after conversion.
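Step 1 of the table above (keeping the original and the public copy apart) can be sketched in Python. This is a minimal sketch under stated assumptions: the function names `sha256_of` and `make_public_copy` and the neutral filename are hypothetical, chosen only for illustration.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_public_copy(original: Path, workdir: Path, neutral_name: str) -> Path:
    """Copy `original` into `workdir` under a neutral filename.

    shutil.copyfile copies file contents only, so filesystem timestamps
    are not carried over. Metadata stored inside the file is copied
    unchanged and still has to be inspected.
    """
    workdir.mkdir(parents=True, exist_ok=True)
    target = workdir / neutral_name
    if target.exists():
        raise FileExistsError(f"refusing to overwrite {target}")
    shutil.copyfile(original, target)
    # The copy must be byte-identical before any editing starts.
    if sha256_of(original) != sha256_of(target):
        raise IOError("copy does not match the original")
    return target
```

All later edits are then made only on the returned path, so the original stays untouched and can be compared against the public copy at any point.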

Inspecting the Body Text

First, check the body text.

The body text contains not only direct personal information, but also information that narrows down who the author or the people mentioned could be.

| Type | Example |
| --- | --- |
| Direct identifier | Real name, email, phone number, address |
| Affiliation information | Company name, school name, department, role |
| Timeline | Date, time, description immediately after an event |
| Internal terms | Internal abbreviations, project names, distinctive terms |
| Related-person information | Family, colleagues, sources, participants |

For anonymization, deleting proper nouns is not enough.

Events known to only a small number of people and wording used only by a specific department also become clues.
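A first pass over the body text can be partly automated. The sketch below is an assumption-laden illustration, not a complete detector: the regular expressions are deliberately simple, the `watch_terms` list is something you would supply yourself (internal abbreviations, project names), and the function name `scan_text` is hypothetical. It produces false positives (an ISO date also looks like a phone number) and cannot replace reading the text.

```python
import re

# Simplistic patterns; real-world formats vary far more than this.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
    "date": re.compile(r"\b\d{4}[-/]\d{1,2}[-/]\d{1,2}\b"),
}


def scan_text(text, watch_terms=()):
    """Return (kind, matched_text) pairs that deserve a manual look."""
    findings = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((kind, match.group(0)))
    lowered = text.lower()
    for term in watch_terms:
        # Case-insensitive substring check for internal terms.
        if term.lower() in lowered:
            findings.append(("watch_term", term))
    return findings
```

An empty result only means that none of these patterns matched. Events known to a few people or department-specific wording are not detectable this way.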

Inspecting Filenames and Storage Locations

Filenames are an easily overlooked source of personal information.

Even if the body text and metadata are cleaned up, anonymity becomes weaker if a real name, department name, project or case name, or date remains in the filename.

| What to check | Example |
| --- | --- |
| Real name | yamada_report.pdf, Tanaka_materials.pdf |
| Department name | sales_internal.pdf, hr_case.docx |
| Case name | project_x_final.pdf |
| Date | 2026-06-12_meeting.pdf |
| Storage path | /Users/name/Company/ and similar paths |

Also be careful with storage locations.

If you work in a cloud sync folder, file history and sharing history may remain.

When working on a workplace device or school device, device management logs and antivirus software logs may also be relevant.
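The filename checks in the table above can be sketched as a small helper. This is a minimal sketch: `filename_flags` and `neutral_filename` are hypothetical names, the date pattern covers only `YYYY-MM-DD`-style dates, and the watch terms (names, departments, case names) are assumptions you would provide.

```python
import re
import secrets
import string
from pathlib import PurePath

# Matches dates such as 2026-06-12, 2026_06_12, or 20260612.
DATE_RE = re.compile(r"\d{4}[-_.]?\d{2}[-_.]?\d{2}")


def filename_flags(name, watch_terms):
    """Return reasons why a filename may reveal identifying information."""
    stem = PurePath(name).stem.lower()
    flags = []
    if DATE_RE.search(stem):
        flags.append("date")
    for term in watch_terms:
        if term.lower() in stem:
            flags.append(f"term:{term}")
    return flags


def neutral_filename(suffix):
    """Build a filename from random letters, with no date or name in it."""
    letters = "".join(secrets.choice(string.ascii_lowercase) for _ in range(8))
    return f"document-{letters}{suffix}"
```

For example, `filename_flags("2026-06-12_meeting.pdf", ["yamada"])` returns `["date"]`. The helper looks only at the filename itself, so the storage path and any cloud sync history still need a separate check.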

Information Added After Conversion

When you convert a document to PDF, turn it into images, or convert it to another format, new metadata may be attached.

| Conversion | Information that may be added |
| --- | --- |
| Office to PDF | Creation app, creation date and time, PDF producer |
| PDF to image | Image creation date and time, conversion software name |
| Resaving an image | Editing software name, update time |
| Audio/video re-encoding | Encoder information, creation app |
| Download from cloud | Filename or download time |

Conversion is sometimes done to reduce information.

However, the converted file is a new inspection target.

After converting, always check again.
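As a rough illustration of rechecking a converted PDF, the sketch below searches the raw bytes for document information entries such as `/Producer` and `/Creator`, and for markers of annotations and embedded files. It rests on strong assumptions: it only sees entries stored uncompressed as literal strings, it does not handle nested parentheses or hex strings, and the function name `pdf_markers` is hypothetical. A clean result from it is not proof that the file is clean.

```python
import re

# Info dictionary entries written as literal strings, e.g. /Producer (Name).
INFO_RE = re.compile(
    rb"/(Author|Creator|Producer|Title|Subject|Keywords|CreationDate|ModDate)"
    rb"\s*\(((?:\\.|[^\\)])*)\)"
)

# Byte markers that suggest further internal content is present.
MARKERS = {
    "annotations": b"/Annots",
    "embedded_files": b"/EmbeddedFile",
    "xmp_metadata": b"<x:xmpmeta",
}


def pdf_markers(data):
    """Report visible Info entries and structural markers in raw PDF bytes."""
    info = {
        match.group(1).decode("ascii"): match.group(2).decode("latin-1")
        for match in INFO_RE.finditer(data)
    }
    found = {name: marker in data for name, marker in MARKERS.items()}
    return {"info": info, "markers": found}
```

It could be called as `pdf_markers(Path("document.pdf").read_bytes())`, where the path is a placeholder. Because many PDFs store objects in compressed streams, a dedicated metadata tool should be used as a second check.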

Inspecting Internal Information

Next, check information inside the document.

For Office documents, look at comments, revision history, creator, and hidden sheets.

For PDFs, look at creator, annotations, embedded files, redaction, and hidden text.

For documents containing images or scans, also check text and backgrounds inside the images.

| Format | What to check |
| --- | --- |
| Office | Comments, revision history, creator, company name, hidden sheets |
| PDF | Creator, annotations, embedded files, redaction, hidden text |
| Documents with images | Background, reflections, text inside images, filename |
| Collaborative document | Sharing history, editors, comments, permissions |

Even if you convert a document to PDF, recheck it as a PDF.

Conversion does not end the inspection; it creates a new inspection target.
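For Office files in the Open XML formats (.docx, .xlsx, .pptx), which are ZIP archives of XML parts, the internal items listed above can be read with the Python standard library. The following is a sketch under assumptions: `inspect_ooxml` is a hypothetical name, it counts only tracked insertions and deletions in Word documents, and it does not cover older binary formats such as .doc or .xls.

```python
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "ep": "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties",
    "w": "http://schemas.openxmlformats.org/wordprocessingml/2006/main",
    "s": "http://schemas.openxmlformats.org/spreadsheetml/2006/main",
}


def _root(zf, member):
    """Parse a ZIP member as XML, or return None if it is absent."""
    if member not in zf.namelist():
        return None
    return ET.fromstring(zf.read(member))


def _text(root, xpath):
    """Return the text of the first matching child element, if any."""
    node = root.find(xpath, NS) if root is not None else None
    return node.text if node is not None else None


def inspect_ooxml(source):
    """Summarize identifying internals; `source` is a path or binary file object."""
    with zipfile.ZipFile(source) as zf:
        names = zf.namelist()
        core = _root(zf, "docProps/core.xml")
        app = _root(zf, "docProps/app.xml")
        doc = _root(zf, "word/document.xml")
        book = _root(zf, "xl/workbook.xml")

    tracked = 0
    if doc is not None:
        # w:ins and w:del mark tracked insertions and deletions in Word.
        for tag in ("ins", "del"):
            tracked += sum(1 for _ in doc.iter(f"{{{NS['w']}}}{tag}"))

    hidden = []
    if book is not None:
        for sheet in book.iter(f"{{{NS['s']}}}sheet"):
            if sheet.get("state") in ("hidden", "veryHidden"):
                hidden.append(sheet.get("name"))

    return {
        "creator": _text(core, "dc:creator"),
        "last_modified_by": _text(core, "cp:lastModifiedBy"),
        "company": _text(app, "ep:Company"),
        "application": _text(app, "ep:Application"),
        "has_comments": any("comments" in n.lower() for n in names),
        "tracked_changes": tracked,
        "hidden_sheets": hidden,
    }
```

Any non-empty field in the returned dictionary is a prompt to open the file in its editor and remove or review that item. The sketch reports what is present; it does not clean anything.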

Usable Tools and Limits

ExifTool reads metadata from many file formats and is often used for metadata checks.

URL : https://exiftool.org/

qpdf can be used to check PDF structure and to convert PDFs.

URL : https://qpdf.readthedocs.io/

MAT2 is an option for metadata removal.

URL : https://0xacab.org/jvoisin/mat2

Tools like MAT2 should be used after checking the distribution source, maintenance status, and supported formats. After processing, do not trust only the result from the same tool; recheck with another method too.

These are useful tools, but they do not judge the safety of document contents for you.

Even if a tool removes metadata, internal terms in the body text, identifying details inside images, filenames, and sharing paths remain.
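One way to fold ExifTool into a repeatable check is to call it from Python and filter its JSON output. This sketch assumes `exiftool` is installed and on the PATH; `-j` is ExifTool's JSON output option. The function names and the `WATCH_TAGS` list are hypothetical and should be extended for the formats you handle.

```python
import json
import shutil
import subprocess

# Illustrative tag names only; which tags appear depends on the file format.
WATCH_TAGS = (
    "Author", "Creator", "Producer", "LastModifiedBy",
    "Company", "Software", "CreateDate", "ModifyDate",
)


def read_metadata(path):
    """Run `exiftool -j` on one file and return its tags as a dict."""
    if shutil.which("exiftool") is None:
        raise RuntimeError("exiftool not found on PATH")
    result = subprocess.run(
        ["exiftool", "-j", str(path)],
        capture_output=True, text=True, check=True,
    )
    # ExifTool prints a JSON list with one object per input file.
    return json.loads(result.stdout)[0]


def tags_to_review(meta, watch=WATCH_TAGS):
    """Keep only the tags that commonly identify a person or tool."""
    return {key: value for key, value in meta.items() if key in watch}
```

Running `tags_to_review(read_metadata("document.pdf"))` before and after a removal step shows what the removal actually changed; the filename here is a placeholder. As noted above, this covers metadata only, not body text, image contents, or the sharing path.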

Deciding to Stop Before Publishing

If unclear items remain during inspection, it is better not to rush publication.

| Stop sign | Reason |
| --- | --- |
| Cannot confirm whether creator name disappeared | It may lead closer to the person or organization |
| Change history remains | Information before deletion may be visible |
| Unclear whether redaction was done correctly | Underlying text may remain |
| Only a cloud sharing link exists | Owner and permission information are relevant |
| Contains high-risk content | Specialist or trusted consultation may be needed |

When anonymity matters, do not treat items you cannot judge as safe.

Choose one of these: check it, delay publication, reduce information, consult, or do not publish.

Sharing Method After Inspection

Even after inspection is complete, anonymity can break through the sharing method.

Email, cloud sharing, social media DMs, upload forms, and anonymous posting tools leave different records.

| Sharing method | Caution |
| --- | --- |
| Email | Sender, recipient, time, and attachment filename remain |
| Cloud sharing | Owner, sharing permissions, and access history remain |
| Social media DM | Connects to account, sending time, and device information |
| Upload form | IP, User-Agent, and sending time may be recorded |
| Anonymous posting tool | Check the tool's trust model and file contents |

Even if you inspect a file, anonymity breaks if you send it from a real-name account.

Think of document inspection together with checking the sharing method.

Summary

In pre-publication document inspection, check body text, comments, revision history, metadata, filename, and sharing method separately.

You cannot judge document safety by appearance alone.

Even after PDF conversion, creator, annotations, embedded files, and hidden text may remain.

Tools such as ExifTool, qpdf, and MAT2 are useful, but tools alone do not complete anonymity.

Before publishing, separate the original from the public copy, recheck after every deletion or conversion, and check everything from the body text to the transmission path.
