From 697cb36b814d7e01da336c43b1932264302a2528 Mon Sep 17 00:00:00 2001
From: georg
Date: Thu, 28 Nov 2019 02:15:20 +0000
Subject: This is mat2, not MAT2

Closes #131
---
 doc/implementation_notes.md | 18 +++++++++---------
 doc/mat2.1                  |  6 +++---
 doc/threat_model.md         | 24 ++++++++++++------------
 3 files changed, 24 insertions(+), 24 deletions(-)

(limited to 'doc')

diff --git a/doc/implementation_notes.md b/doc/implementation_notes.md
index 7555d2e..e298646 100644
--- a/doc/implementation_notes.md
+++ b/doc/implementation_notes.md
@@ -4,7 +4,7 @@ Implementation notes
 Lightweight cleaning mode
 -------------------------
-Due to *popular* request, MAT2 is providing a *lightweight* cleaning mode,
+Due to *popular* request, mat2 is providing a *lightweight* cleaning mode,
 that only cleans the superficial metadata of your file, but not the ones that might be in **embedded** resources. Like for example, images in a PDF or an office document.
@@ -19,7 +19,7 @@ are entirely removed.
 deleted. For example journalists that are editing a document to erase mentions sources mentions.
-- Or they are aware of it, and will likely not expect MAT2 to be able to keep
+- Or they are aware of it, and will likely not expect mat2 to be able to keep
 the revisions, that are basically traces about how, when and who edited the document.
@@ -27,15 +27,15 @@ are entirely removed.
 Race conditions
 ---------------
-MAT2 does its very best to avoid crashing at runtime. This is why it's checking
-if the file is valid __at parser creation__. MAT2 doesn't take any measure to
+mat2 does its very best to avoid crashing at runtime. This is why it's checking
+if the file is valid __at parser creation__. mat2 doesn't take any measure to
 ensure that the file is not changed between the time the parser is instantiated, and the call to clean or show the metadata.
 Symlink attacks
 ---------------
-MAT2 output predictable filenames (like yourfile.jpg.cleaned).
+mat2 output predictable filenames (like yourfile.jpg.cleaned).
 This may lead to symlink attack. Please check if you OS prevent against them
@@ -65,10 +65,10 @@ didn't remove any *deep metadata*, like the ones in embedded pictures.
 This was on of the reason MAT was abandoned: the absence of satisfying solution to handle PDF. But apparently, people are ok with [pdf redact tools](https://github.com/firstlookmedia/pdf-redact-tools), that simply
-transform the PDF into images. So this is what's MAT2 is doing too.
+transform the PDF into images. So this is what's mat2 is doing too.
 Of course, it would be possible to detect images in PDf file, and process them
-with MAT2, but since a PDF can contain a lot of things, like images, videos,
+with mat2, but since a PDF can contain a lot of things, like images, videos,
 javascript, pdf, blobs, … this is the easiest and safest way to clean them.
 Images handling
@@ -81,7 +81,7 @@ XML attacks
 -----------
 Since our threat model conveniently excludes files crafted to specifically
-bypass MAT2, fileformats containing harmful XML are out of our scope.
-But since MAT2 is using [etree](https://docs.python.org/3/library/xml.html#xml-vulnerabilities)
+bypass mat2, fileformats containing harmful XML are out of our scope.
+But since mat2 is using [etree](https://docs.python.org/3/library/xml.html#xml-vulnerabilities)
 to process XML, it's "only" vulnerable to DoS, and not memory corruption: odds are that the user will notice that the cleaning didn't succeed.
diff --git a/doc/mat2.1 b/doc/mat2.1
index c63b46b..c03842d 100644
--- a/doc/mat2.1
+++ b/doc/mat2.1
@@ -1,4 +1,4 @@
-.TH MAT2 "1" "May 2019" "MAT2 0.9.0" "User Commands"
+.TH mat2 "1" "May 2019" "mat2 0.9.0" "User Commands"
 .SH NAME
 mat2 \- the metadata anonymisation toolkit 2
@@ -32,7 +32,7 @@ show program's version number and exit
 list all supported fileformats
 .TP
 \fB\-\-check\-dependencies\fR
-check if MAT2 has all the dependencies it needs
+check if mat2 has all the dependencies it needs
 .TP
 \fB\-V\fR, \fB\-\-verbose\fR
 show more verbose status information
@@ -41,7 +41,7 @@ show more verbose status information
 how to handle unknown members of archive-style files (policy should be one of: abort, omit, keep)
 .TP
 \fB\-s\fR, \fB\-\-show\fR
-list harmful metadata detectable by MAT2 without
+list harmful metadata detectable by mat2 without
 removing them
 .TP
 \fB\-L\fR, \fB\-\-lightweight\fR
diff --git a/doc/threat_model.md b/doc/threat_model.md
index 31bfe91..8b97c67 100644
--- a/doc/threat_model.md
+++ b/doc/threat_model.md
@@ -3,7 +3,7 @@ Threat Model
 The Metadata Anonymisation Toolkit 2 adversary has a number of goals, capabilities, and counter-attack types that can be
-used to guide us towards a set of requirements for the MAT2.
+used to guide us towards a set of requirements for the mat2.
 This is an overhaul of MAT's (the first iteration of the software) one.
@@ -53,7 +53,7 @@ Adversary
 user. This is the strongest position for the adversary to have. In this case, the adversary is capable of inserting arbitrary, custom watermarks specifically for tracking
- the user. In general, MAT2 cannot defend against this
+ the user. In general, mat2 cannot defend against this
 adversary, but we list it for completeness' sake.
 - The adversary created the document for a group of users.
@@ -65,7 +65,7 @@ Adversary
 - The adversary did not create the document, the weakest position for the adversary to have.
 The file format is (most of the time) standard, nothing custom is added:
- MAT2 must be able to remove all metadata from the file.
+ mat2 must be able to remove all metadata from the file.
 Requirements
@@ -73,28 +73,28 @@ Requirements
 * Processing
- - MAT2 *should* avoid interactions with information.
+ - mat2 *should* avoid interactions with information.
 Its goal is to remove metadata, and the user is solely responsible for the information of the file.
- - MAT2 *must* warn when encountering an unknown
- format. For example, in a zipfile, if MAT2 encounters an
+ - mat2 *must* warn when encountering an unknown
+ format. For example, in a zipfile, if mat2 encounters an
 unknown format, it should warn the user, and ask if the file should be added to the anonymised archive that is produced.
- - MAT2 *must* not add metadata, since its purpose is to
+ - mat2 *must* not add metadata, since its purpose is to
 anonymise files: every added items of metadata decreases anonymity.
- - MAT2 *should* handle unknown/hidden metadata fields,
+ - mat2 *should* handle unknown/hidden metadata fields,
 like proprietary extensions of open formats.
- - MAT2 *must not* fail silently. Upon failure,
- MAT2 *must not* modify the file in any way.
+ - mat2 *must not* fail silently. Upon failure,
+ mat2 *must not* modify the file in any way.
- - MAT2 *might* leak the fact that MAT2 was used on the file,
+ - mat2 *might* leak the fact that mat2 was used on the file,
 since it might be uncommon for some file formats to come without any kind of metadata, an adversary might suspect that
- the user used MAT2 on certain files.
+ the user used mat2 on certain files.
--
cgit v1.3