Portable Network Graphics
.png
full
textual metadata + date
removal of harmful fields is done with hachoir
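To make the PNG case concrete, here is a minimal sketch of what dropping textual metadata and the date amounts to at the file level. It is a standard-library stand-in, not hachoir's API, and the set of chunk types treated as harmful (`tEXt`, `zTXt`, `iTXt`, `tIME`) is an assumption. The names `strip_png_metadata` and `make_chunk` are hypothetical.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Assumption: these chunk types carry the textual metadata and the date.
METADATA_CHUNKS = {b"tEXt", b"zTXt", b"iTXt", b"tIME"}


def strip_png_metadata(data: bytes) -> bytes:
    """Return a copy of a PNG byte string without text and time chunks."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    kept = [PNG_SIGNATURE]
    pos = len(PNG_SIGNATURE)
    while pos < len(data):
        # A chunk is: 4-byte length, 4-byte type, payload, 4-byte CRC.
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        chunk_type = data[pos + 4:pos + 8]
        end = pos + 12 + length
        if chunk_type not in METADATA_CHUNKS:
            kept.append(data[pos:end])
        pos = end
    return b"".join(kept)


def make_chunk(chunk_type: bytes, payload: bytes) -> bytes:
    """Build one PNG chunk (helper used only to create sample data)."""
    crc = zlib.crc32(chunk_type + payload) & 0xFFFFFFFF
    return (struct.pack(">I", len(payload)) + chunk_type + payload
            + struct.pack(">I", crc))
```

Untouched chunks are copied byte for byte, so their CRCs stay valid and the image data is not re-encoded.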
JPEG
.jpeg, .jpg
partial
comment + exif/photoshop/adobe
removal of harmful fields is done with hachoir
Canon Raw tags:
http://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/CanonRaw.html
Open Document
.odt, .odx, .ods, ...
full
a meta.xml file
removal of the meta.xml file
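Since an Open Document file is a zip package, removing meta.xml can be sketched with the standard `zipfile` module. This is an illustration under assumptions, not the tool's actual code: the function name `remove_meta_xml` is hypothetical, and the sketch leaves the package manifest and the zip-level timestamps untouched.

```python
import zipfile


def remove_meta_xml(src_path: str, dst_path: str) -> None:
    """Copy an Open Document package to dst_path, leaving out meta.xml."""
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w") as dst:
        for info in src.infolist():
            if info.filename == "meta.xml":
                continue
            # Reusing the original ZipInfo keeps the entry order and each
            # entry's compression method, so "mimetype" stays first and
            # uncompressed if it was stored that way in the source.
            dst.writestr(info, src.read(info.filename))
```

Entries are rewritten rather than deleted in place because `zipfile` has no call for removing a member from an existing archive.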
Office Open XML
.docx, .pptx, .xlsx, ...
full
a docProps folder containing XML metadata files
removal of the docProps folder
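Removing the docProps folder can be sketched the same way, by copying every zip entry whose name does not start with `docProps/`. This is a rough illustration with a hypothetical function name (`remove_docprops`). It assumes the consuming application tolerates the leftover references to the removed parts in `[Content_Types].xml` and `_rels/.rels`, which this sketch does not rewrite.

```python
import zipfile


def remove_docprops(src_path: str, dst_path: str) -> list:
    """Copy an Office Open XML package without its docProps/ entries.

    Returns the names of the entries that were dropped.
    """
    dropped = []
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            if info.filename.startswith("docProps/"):
                dropped.append(info.filename)
                continue
            # Writing by name (not by ZipInfo) stamps the entry with the
            # current local time instead of the original timestamp.
            dst.writestr(info.filename, src.read(info.filename))
    return dropped
```

Returning the dropped names makes it easy to log or check what was actually removed.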
Portable Document Format
.pdf
full
a lot
rendering of the PDF file onto a cairo surface with the help of
poppler in order to remove all the internal metadata,
then removal of the remaining metadata fields of the PDF itself with
pdfrw (the next version of python-cairo will support metadata,
so pdfrw should no longer be needed then)
Tape ARchive
.tar, .tar.bz2, .tar.gz
full
metadata from the file itself, metadata from the files contained
in the archive, and metadata added by tar to each file at the
creation of the archive
extraction of each file, treatment of the file, addition of the treated
file to a new archive; right before the addition, removal of the metadata
added by tar itself. When the new archive is complete, removal of all its
metadata.
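The tar procedure can be sketched with the standard `tarfile` module. This is a minimal illustration under assumptions: `clean_tar` and its `clean_member` callback are hypothetical names, only regular files are kept, the output is written uncompressed, and the fixed `0o644` mode is a choice made for the example. Cleaning the metadata of the finished archive file itself is not shown.

```python
import io
import tarfile


def clean_tar(src_path, dst_path, clean_member=lambda name, data: data):
    """Rebuild a tar archive with neutral per-member metadata.

    clean_member(name, data) stands for the per-format treatment of each
    extracted file; by default it returns the data unchanged.
    """
    with tarfile.open(src_path) as src, tarfile.open(dst_path, "w") as dst:
        for member in src.getmembers():
            if not member.isfile():
                continue  # this sketch keeps regular files only
            data = clean_member(member.name, src.extractfile(member).read())
            # A fresh TarInfo starts with mtime 0, uid/gid 0 and empty
            # user and group names, so nothing from the source leaks in.
            info = tarfile.TarInfo(name=member.name)
            info.size = len(data)
            info.mode = 0o644  # assumption: one fixed mode for every file
            dst.addfile(info, io.BytesIO(data))
```

Building a new `TarInfo` instead of editing the extracted one means any field not set explicitly falls back to a neutral default.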
Zip
.zip
partial
metadata from the file itself, metadata from the files contained
in the archive, and metadata added by zip to each file when it is added to
the archive.
extraction of each file, treatment of the file, addition of the treated
file to a new archive. When the new archive is complete, removal of all its
metadata
metadata added by zip itself to internal files
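A comparable sketch for zip, using the standard `zipfile` module, is given below. The names `clean_zip` and `clean_member` are hypothetical, and the fixed timestamp and permission bits are assumptions made for the example. Some header fields, such as the creating system recorded by `zipfile`, are still written for each entry, which is in line with the partial support noted above.

```python
import zipfile

# Assumption: the earliest timestamp a zip entry can carry is used as a
# neutral value for every member.
NEUTRAL_DATE = (1980, 1, 1, 0, 0, 0)


def clean_zip(src_path, dst_path, clean_member=lambda name, data: data):
    """Rebuild a zip archive with neutral per-entry metadata.

    clean_member(name, data) stands for the per-format treatment of each
    extracted file; by default it returns the data unchanged.
    """
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w") as dst:
        for old in src.infolist():
            if old.is_dir():
                continue  # this sketch keeps regular files only
            data = clean_member(old.filename, src.read(old.filename))
            # A fresh ZipInfo has an empty comment and no extra field.
            info = zipfile.ZipInfo(old.filename, date_time=NEUTRAL_DATE)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.external_attr = 0o644 << 16  # assumption: fixed mode
            dst.writestr(info, data)
        # The archive-level comment of the new file is left empty.
```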
MPEG Audio
.mp3, .mp2, .mp1, .mpa
full
ID3
removal of harmful fields is done with hachoir
Ogg Vorbis
.ogg
full
Vorbis
removal of harmful fields is done with mutagen
Free Lossless Audio Codec
.flac
full
FLAC, Vorbis
removal of harmful fields is done with mutagen
Torrent
.torrent
full
torrent
removal of all the compromising metadata
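For torrents, the idea can be sketched with a small bencode decoder and encoder that keeps only a whitelist of top-level keys. Which fields count as compromising is not stated here, so the whitelist `KEPT_KEYS` is an assumption, as are the function names `bdecode`, `bencode` and `strip_torrent`.

```python
# Assumption: only these top-level keys are needed and considered safe.
KEPT_KEYS = {b"announce", b"announce-list", b"info"}


def bdecode(data: bytes, pos: int = 0):
    """Decode one bencoded value at pos; return (value, next_position)."""
    lead = data[pos:pos + 1]
    if lead == b"i":  # integer: i<digits>e
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":  # list: l<items>e
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":  # dictionary: d<key><value>...e
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            result[key], pos = bdecode(data, pos)
        return result, pos + 1
    # byte string: <length>:<bytes>
    colon = data.index(b":", pos)
    start = colon + 1
    length = int(data[pos:colon])
    return data[start:start + length], start + length


def bencode(value) -> bytes:
    """Encode ints, bytes, lists and dicts; dict keys are written sorted."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        body = b"".join(bencode(k) + bencode(v)
                        for k, v in sorted(value.items()))
        return b"d" + body + b"e"
    raise TypeError("cannot bencode %r" % type(value))


def strip_torrent(data: bytes) -> bytes:
    """Return the torrent with every non-whitelisted top-level key removed."""
    meta, _ = bdecode(data)
    return bencode({k: v for k, v in meta.items() if k in KEPT_KEYS})
```

The `info` dictionary is kept whole. If the original file already stored its keys in sorted order, re-encoding should reproduce the same bytes for it, so the info hash would stay the same.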