Portable Network Graphics
.png
full
textual metadata + date
removal of harmful fields is done with hachoir
JPEG
.jpeg, .jpg
partial
comment + exif/photoshop/adobe
removal of harmful fields is done with hachoir
Canon Raw tags:
http://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/CanonRaw.html
Open Document
.odt, .odx, .ods, ...
full
a meta.xml file
removal of the meta.xml file
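The removal step above can be sketched with the standard zipfile module, since an Open Document file is a zip container. This is only an illustration under stated assumptions, not the tool's actual code: the function name strip_odf_meta is hypothetical, the sketch leaves META-INF/manifest.xml untouched (so it may still list meta.xml), and the rewritten members get fresh timestamps from zipfile.

```python
import zipfile


def strip_odf_meta(src_path, dst_path):
    """Copy an Open Document file to dst_path, leaving out meta.xml.

    Hypothetical helper for illustration only.
    """
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            if item.filename == "meta.xml":
                continue  # drop the metadata member
            # The "mimetype" member of an ODF package is expected to be
            # stored uncompressed, so keep it that way.
            if item.filename == "mimetype":
                method = zipfile.ZIP_STORED
            else:
                method = zipfile.ZIP_DEFLATED
            dst.writestr(item.filename, src.read(item), compress_type=method)
```

The source file is never modified; the cleaned copy is written to a separate path so that the original can be kept until the result has been checked.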
Office Open XML
.docx, .pptx, .xlsx, ...
full
a docProps folder containing XML metadata files
removal of the docProps folder
Portable Document Format
.pdf
full
a lot
rendering of the PDF file on a cairo surface with the help of
poppler, in order to remove all the internal metadata.
For now, cairo creates some metadata of its own.
It can be removed if you install either exiftool or python-pdfrw.
The next version of python-cairo will support PDF metadata.
Tape ARchive
.tar, .tar.bz2, .tar.gz
full
metadata from the file itself, metadata from the files contained
in the archive, and metadata added by tar to each file at the
creation of the archive
extraction of each file, treatment of the file, then addition of the
treated file to a new archive. Right before the addition, the metadata added
by tar itself is removed. When the new archive is complete, all of its
metadata is removed.
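The tar procedure above could look roughly like the following sketch, written with the standard tarfile module. Everything named here is an assumption made for illustration: clean_member is a hypothetical placeholder for the per-format treatment, rebuild_tar is a hypothetical name, and only regular files are carried over.

```python
import io
import tarfile


def clean_member(data):
    """Hypothetical placeholder: a real tool would dispatch to the
    cleaner matching the member's file format."""
    return data


def rebuild_tar(src_path, dst_path):
    """Write a new archive whose members carry no owner or time information."""
    with tarfile.open(src_path) as src, tarfile.open(dst_path, "w") as dst:
        for member in src.getmembers():
            if not member.isfile():
                continue  # this sketch only carries over regular files
            data = clean_member(src.extractfile(member).read())
            info = tarfile.TarInfo(name=member.name)
            info.size = len(data)
            # Reset the fields tar records about the creator of the archive.
            info.mtime = 0
            info.uid = info.gid = 0
            info.uname = info.gname = ""
            info.mode = 0o644
            dst.addfile(info, io.BytesIO(data))
```

Building a fresh TarInfo for each member, instead of reusing the one read from the source archive, means that no header field is copied over by accident.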
Zip
.zip
partial
metadata from the file itself, metadata from the files contained
in the archive, and metadata added by zip to each file when it is added to
the archive.
extraction of each file, treatment of the file, then addition of the
treated file to a new archive. When the new archive is complete, all of its
metadata is removed.
metadata added by zip itself to internal files
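To make the per-file metadata concrete, here is a minimal sketch using the standard zipfile module. It is not the tool's implementation: rebuild_zip is a hypothetical name, the treatment of each file's content is left out, and choosing 1980-01-01 (the earliest date the zip format can store) as a neutral timestamp is an assumption.

```python
import zipfile


def rebuild_zip(src_path, dst_path):
    """Copy file contents into a new zip with neutral per-entry fields."""
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            if item.is_dir():
                continue  # this sketch only carries over files
            data = src.read(item)  # a real tool would clean the data here
            # A fresh ZipInfo has an empty comment and empty extra field;
            # only the name and a fixed timestamp are set explicitly.
            info = zipfile.ZipInfo(item.filename,
                                   date_time=(1980, 1, 1, 0, 0, 0))
            info.compress_type = zipfile.ZIP_DEFLATED
            dst.writestr(info, data)
        # The archive-level comment of the source is simply not copied.
```

Note that zipfile still fills in some fields by itself, such as the originating system, which is in line with the partial support stated for this format.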
MPEG Audio
.mp3, .mp2, .mp1, .mpa
full
ID3
removal of harmful fields is done with hachoir
Ogg Vorbis
.ogg
full
Vorbis
removal of harmful fields is done with mutagen
Free Lossless Audio Codec
.flac
full
Flac, Vorbis
removal of harmful fields is done with mutagen
Torrent
.torrent
full
torrent
remove all the compromising metadata with a heavily tuned version
of the bencode lib by Petru Paler
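A torrent file is a single bencoded dictionary, so the idea can be illustrated with a tiny self-contained codec. This sketch is not the tuned bencode library mentioned above, and the list of keys treated as compromising (STRIP_KEYS) is an assumption, since the fields actually removed are not listed here.

```python
def bdecode(data, i=0):
    """Decode one bencoded value from bytes; return (value, next_index)."""
    c = data[i:i + 1]
    if c == b"i":  # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if c == b"l":  # list: l<items>e
        i += 1
        items = []
        while data[i:i + 1] != b"e":
            item, i = bdecode(data, i)
            items.append(item)
        return items, i + 1
    if c == b"d":  # dictionary: d<key><value>...e
        i += 1
        result = {}
        while data[i:i + 1] != b"e":
            key, i = bdecode(data, i)
            value, i = bdecode(data, i)
            result[key] = value
        return result, i + 1
    # byte string: <length>:<bytes>
    colon = data.index(b":", i)
    length = int(data[i:colon])
    start = colon + 1
    return data[start:start + length], start + length


def bencode(value):
    """Encode ints, bytes, lists and dicts (with bytes keys) to bencode."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Keys are written in sorted order, as the format requires.
        body = b"".join(bencode(k) + bencode(value[k]) for k in sorted(value))
        return b"d" + body + b"e"
    raise TypeError("cannot bencode %r" % type(value))


# Assumed list of top-level keys to drop; adjust to the threat model.
STRIP_KEYS = (b"comment", b"created by", b"creation date")


def strip_torrent(raw):
    """Return the torrent bytes with the STRIP_KEYS entries removed."""
    meta, _ = bdecode(raw)
    for key in STRIP_KEYS:
        meta.pop(key, None)
    return bencode(meta)
```

Only top-level keys are touched. The info dictionary is decoded and re-encoded without changes, so a torrent that was canonically encoded keeps the same info hash.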