| File format | Extension | Support | Metadata | Method | Remaining metadata |
|---|---|---|---|---|---|
| Portable Network Graphics | .png | full | textual metadata + date | removal of harmful fields is done with hachoir | |
| JPEG | .jpeg, .jpg | full | comment + exif/photoshop/adobe | removal of harmful fields is done with hachoir | |
| Open Document | .odt, .odx, .ods, ... | full | a meta.xml file | removal of the meta.xml file | |
| Office Open XML | .docx, .pptx, .xlsx, ... | full | a docProps folder containing XML metadata files | removal of the docProps folder | |
| Portable Document Format | .pdf | full | a lot | rendering of the PDF file on a cairo surface with the help of poppler in order to remove all the internal metadata, then removal of the remaining metadata fields of the PDF itself with pdfrw (the next version of python-cairo will support metadata, so we should get rid of pdfrw) | |
| Tape ARchive | .tar, .tar.bz2, .tar.gz | full | metadata from the archive file itself, metadata from the files contained in the archive, and metadata added by tar to each file at the creation of the archive | extraction of each file, treatment of the file, addition of the treated file to a new archive; right before the addition, the metadata added by tar itself is removed. When the new archive is complete, all of its metadata is removed. | |
| Zip | .zip | partial | metadata from the archive file itself, metadata from the files contained in the archive, and metadata added by zip to each file when it is added to the archive | extraction of each file, treatment of the file, addition of the treated file to a new archive. When the new archive is complete, all of its metadata is removed. | metadata added by zip itself to internal files |
| MPEG Audio | .mp3, .mp2, .mp1 | full | id3 | removal of harmful fields is done with hachoir | |
| Ogg Vorbis | .ogg | full | Vorbis | removal of harmful fields is done with mutagen | |
| Free Lossless Audio Codec | .flac | full | Flac, Vorbis | removal of harmful fields is done with mutagen | |
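
The tar method described above (extract each file, treat it, add it to a new archive, and strip what tar itself records) can be sketched with Python's standard `tarfile` module. This is only an illustration under stated assumptions, not the project's actual code: `rebuild_tar` and `clean_member` are hypothetical names, and `clean_member` is a placeholder where the per-format cleaning would happen. For brevity the sketch keeps regular files only and skips directories and links.

```python
import io
import tarfile


def clean_member(data: bytes) -> bytes:
    """Hypothetical placeholder for per-format cleaning (returns data unchanged)."""
    return data


def rebuild_tar(src: bytes) -> bytes:
    """Copy regular files into a new tar, dropping tar-added metadata."""
    out = io.BytesIO()
    with tarfile.open(fileobj=io.BytesIO(src), mode="r:*") as old, \
         tarfile.open(fileobj=out, mode="w", format=tarfile.GNU_FORMAT) as new:
        for member in old.getmembers():
            if not member.isfile():
                continue  # assumption of this sketch: regular files only
            data = clean_member(old.extractfile(member).read())
            # Build a fresh header instead of reusing the old one, so the
            # owner, group, timestamp and permissions are not carried over.
            info = tarfile.TarInfo(name=member.name)
            info.size = len(data)
            info.mtime = 0
            info.uid = info.gid = 0
            info.uname = info.gname = ""
            info.mode = 0o644
            new.addfile(info, io.BytesIO(data))
    return out.getvalue()
```

Building a new `TarInfo` for every member, rather than editing the one read from the source archive, means any header field not set explicitly falls back to a neutral default.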