Diffstat (limited to 'doc')
-rw-r--r--  doc/implementation_notes.md  18
-rw-r--r--  doc/mat2.1                    6
-rw-r--r--  doc/threat_model.md          24
3 files changed, 24 insertions, 24 deletions
diff --git a/doc/implementation_notes.md b/doc/implementation_notes.md
index 7555d2e..e298646 100644
--- a/doc/implementation_notes.md
+++ b/doc/implementation_notes.md
@@ -4,7 +4,7 @@ Implementation notes
 Lightweight cleaning mode
 -------------------------
 
-Due to *popular* request, MAT2 is providing a *lightweight* cleaning mode,
+Due to *popular* request, mat2 is providing a *lightweight* cleaning mode,
 that only cleans the superficial metadata of your file, but not
 the ones that might be in **embedded** resources. Like for example,
 images in a PDF or an office document.
@@ -19,7 +19,7 @@ are entirely removed.
  deleted. For example journalists that are editing a document to erase
  mentions sources mentions.
 
-- Or they are aware of it, and will likely not expect MAT2 to be able to keep
+- Or they are aware of it, and will likely not expect mat2 to be able to keep
  the revisions, that are basically traces about how, when and who edited the
  document.
 
@@ -27,15 +27,15 @@ are entirely removed.
 Race conditions
 ---------------
 
-MAT2 does its very best to avoid crashing at runtime. This is why it's checking
-if the file is valid __at parser creation__. MAT2 doesn't take any measure to
+mat2 does its very best to avoid crashing at runtime. This is why it's checking
+if the file is valid __at parser creation__. mat2 doesn't take any measure to
 ensure that the file is not changed between the time the parser is
 instantiated, and the call to clean or show the metadata.
 
 Symlink attacks
 ---------------
 
-MAT2 output predictable filenames (like yourfile.jpg.cleaned).
+mat2 output predictable filenames (like yourfile.jpg.cleaned).
 This may lead to symlink attack. Please check if you OS prevent
 against them
 
@@ -65,10 +65,10 @@ didn't remove any *deep metadata*, like the ones in embedded pictures. This was
 on of the reason MAT was abandoned: the absence of satisfying solution to
 handle PDF. But apparently, people are ok with [pdf redact
 tools](https://github.com/firstlookmedia/pdf-redact-tools), that simply
-transform the PDF into images. So this is what's MAT2 is doing too.
+transform the PDF into images. So this is what's mat2 is doing too.
 
 Of course, it would be possible to detect images in PDf file, and process them
-with MAT2, but since a PDF can contain a lot of things, like images, videos,
+with mat2, but since a PDF can contain a lot of things, like images, videos,
 javascript, pdf, blobs, … this is the easiest and safest way to clean them.
 
 Images handling
@@ -81,7 +81,7 @@ XML attacks
 -----------
 
 Since our threat model conveniently excludes files crafted to specifically
-bypass MAT2, fileformats containing harmful XML are out of our scope.
-But since MAT2 is using [etree](https://docs.python.org/3/library/xml.html#xml-vulnerabilities)
+bypass mat2, fileformats containing harmful XML are out of our scope.
+But since mat2 is using [etree](https://docs.python.org/3/library/xml.html#xml-vulnerabilities)
 to process XML, it's "only" vulnerable to DoS, and not memory corruption:
 odds are that the user will notice that the cleaning didn't succeed.
diff --git a/doc/mat2.1 b/doc/mat2.1
index c63b46b..c03842d 100644
--- a/doc/mat2.1
+++ b/doc/mat2.1
@@ -1,4 +1,4 @@
-.TH MAT2 "1" "May 2019" "MAT2 0.9.0" "User Commands"
+.TH mat2 "1" "May 2019" "mat2 0.9.0" "User Commands"
 
 .SH NAME
 mat2 \- the metadata anonymisation toolkit 2
@@ -32,7 +32,7 @@ show program's version number and exit
 list all supported fileformats
 .TP
 \fB\-\-check\-dependencies\fR
-check if MAT2 has all the dependencies it needs
+check if mat2 has all the dependencies it needs
 .TP
 \fB\-V\fR, \fB\-\-verbose\fR
 show more verbose status information
@@ -41,7 +41,7 @@ show more verbose status information
 how to handle unknown members of archive-style files (policy should be one of: abort, omit, keep)
 .TP
 \fB\-s\fR, \fB\-\-show\fR
-list harmful metadata detectable by MAT2 without
+list harmful metadata detectable by mat2 without
 removing them
 .TP
 \fB\-L\fR, \fB\-\-lightweight\fR
diff --git a/doc/threat_model.md b/doc/threat_model.md
index 31bfe91..8b97c67 100644
--- a/doc/threat_model.md
+++ b/doc/threat_model.md
@@ -3,7 +3,7 @@ Threat Model
 
 The Metadata Anonymisation Toolkit 2 adversary has a number
 of goals, capabilities, and counter-attack types that can be
-used to guide us towards a set of requirements for the MAT2.
+used to guide us towards a set of requirements for the mat2.
 
 This is an overhaul of MAT's (the first iteration of the software) one.
 
@@ -53,7 +53,7 @@ Adversary
  user. This is the strongest position for the adversary to
  have. In this case, the adversary is capable of inserting
  arbitrary, custom watermarks specifically for tracking
- the user. In general, MAT2 cannot defend against this
+ the user. In general, mat2 cannot defend against this
  adversary, but we list it for completeness' sake.
 
  - The adversary created the document for a group of users.
@@ -65,7 +65,7 @@ Adversary
  - The adversary did not create the document, the weakest
  position for the adversary to have. The file format is
  (most of the time) standard, nothing custom is added:
- MAT2 must be able to remove all metadata from the file.
+ mat2 must be able to remove all metadata from the file.
 
 
 Requirements
@@ -73,28 +73,28 @@ Requirements
 
 * Processing
 
- - MAT2 *should* avoid interactions with information.
+ - mat2 *should* avoid interactions with information.
  Its goal is to remove metadata, and the user is solely
  responsible for the information of the file.
 
- - MAT2 *must* warn when encountering an unknown
- format. For example, in a zipfile, if MAT2 encounters an
+ - mat2 *must* warn when encountering an unknown
+ format. For example, in a zipfile, if mat2 encounters an
  unknown format, it should warn the user, and ask if the
  file should be added to the anonymised archive that is
  produced.
 
- - MAT2 *must* not add metadata, since its purpose is to
+ - mat2 *must* not add metadata, since its purpose is to
  anonymise files: every added items of metadata decreases
  anonymity.
 
- - MAT2 *should* handle unknown/hidden metadata fields,
+ - mat2 *should* handle unknown/hidden metadata fields,
  like proprietary extensions of open formats.
 
- - MAT2 *must not* fail silently. Upon failure,
- MAT2 *must not* modify the file in any way.
+ - mat2 *must not* fail silently. Upon failure,
+ mat2 *must not* modify the file in any way.
 
- - MAT2 *might* leak the fact that MAT2 was used on the file,
+ - mat2 *might* leak the fact that mat2 was used on the file,
  since it might be uncommon for some file formats to come
  without any kind of metadata, an adversary might suspect that
- the user used MAT2 on certain files.
+ the user used mat2 on certain files.
 