author jvoisin 2019-05-16 20:59:15 +0200
committer jvoisin 2019-05-16 20:59:15 +0200
commit 13d71a256587c2eb41904480ea9a7bce8e46cd3d
tree 81f165f4fa41dc10710adbbe69a96e88218a2f1d /doc
parent 35d550d229b219f5a02cb9194c3bd24329f975ed
Document the archives handling implementation's details
Diffstat (limited to 'doc')
-rw-r--r-- doc/implementation_notes.md 26
1 files changed, 21 insertions, 5 deletions
diff --git a/doc/implementation_notes.md b/doc/implementation_notes.md
index cbf76ee..7555d2e 100644
--- a/doc/implementation_notes.md
+++ b/doc/implementation_notes.md
@@ -12,11 +12,16 @@ images in a PDF or an office document.
 Revisions handling
 ------------------
 
-Revisions are handled according to the principle of least astonishment: they are entirely removed.
+Revisions are handled according to the principle of least astonishment: they
+are entirely removed.
 
-- Either the users aren't aware of the revisions, are thus they should be deleted. For example journalists that are editing a document to erase mentions sources mentions.
+- Either the users aren't aware of the revisions, are thus they should be
+  deleted. For example journalists that are editing a document to erase
+  mentions sources mentions.
 
-- Or they are aware of it, and will likely not expect MAT2 to be able to keep the revisions, that are basically traces about how, when and who edited the document.
+- Or they are aware of it, and will likely not expect MAT2 to be able to keep
+  the revisions, that are basically traces about how, when and who edited the
+  document.
 
 
 Race conditions
@@ -37,8 +42,19 @@ against them
 Archives handling
 -----------------
 
-MAT2 doesn't support archives yet, because we haven't found an usable way to ask the user
-what to do when a non-supported files are encountered.
+By default, when cleaning a non-support file format in an archive,
+mat2 will abort with a detailed error message.
+While strongly discouraged, it's possible to override this behaviour to force
+the exclusion, or inclusion of unknown files into the cleaned archive.
+
+While Python's [zipfile](https://docs.python.org/3/library/zipfile.html) module
+provides *safe* way to extract members of a zip archive, the
+[tarfile](https://docs.python.org/3/library/tarfile.html) one doesn't,
+meaning that it's up to mat2 to implement safety checks. Currently,
+it defends against path-traversal, both relative and absolute,
+symlink-related attacks, setuid/setgid attacks, duplicate members, block and
+char devices, … but there might still be dragons lurking there.
+
 
 PDF handling
 ------------
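
Note on the tarfile checks described in the added paragraph: the sketch below illustrates the kind of per-member validation it lists (absolute and relative path traversal, links, setuid/setgid bits, duplicate members, block and char devices). It is not mat2's code. The function name `check_tar_members` is hypothetical, and rejecting every link outright is a simplifying assumption that is probably stricter than what mat2 does.

```python
import posixpath
import stat
import tarfile


def check_tar_members(archive: tarfile.TarFile) -> None:
    """Raise ValueError on the first member that looks unsafe to extract.

    Hypothetical helper, written only to illustrate the checks listed in
    the documentation; it is not taken from mat2.
    """
    seen = set()
    for member in archive.getmembers():
        name = member.name

        # Absolute paths would be written outside the extraction directory.
        if posixpath.isabs(name):
            raise ValueError(f"absolute path: {name}")

        # A ".." component allows relative path traversal.
        if ".." in name.split("/"):
            raise ValueError(f"relative path traversal: {name}")

        # Two members resolving to the same path could overwrite each other.
        normalised = posixpath.normpath(name)
        if normalised in seen:
            raise ValueError(f"duplicate member: {name}")
        seen.add(normalised)

        # Assumption: refuse all symlinks and hardlinks instead of
        # resolving their targets.
        if member.issym() or member.islnk():
            raise ValueError(f"link member: {name}")

        # Device nodes have no place in a cleaned archive.
        if member.ischr() or member.isblk():
            raise ValueError(f"device member: {name}")

        # setuid/setgid bits could grant privileges after extraction.
        if member.mode & (stat.S_ISUID | stat.S_ISGID):
            raise ValueError(f"setuid/setgid member: {name}")
```

A caller would run this on an opened archive before extracting anything, for example `check_tar_members(tarfile.open(path))`, and treat any `ValueError` as a reason to abort.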