Diffstat (limited to 'README.md')
| -rw-r--r-- | README.md | 194 |
1 files changed, 1 insertions, 193 deletions
| @@ -1,193 +1 @@ | |||
| 1 | ``` | # This repository is deprecated, please use https://github.com/jvoisin/mat2 instead \ No newline at end of file | |
| 2 | _____ _____ _____ ___ | ||
| 3 | | | _ |_ _|_ | Keep your data, | ||
| 4 | | | | | |_| | | | | _| trash your meta! | ||
| 5 | |_|_|_|_| |_| |_| |___| | ||
| 6 | |||
| 7 | ``` | ||
| 8 | |||
| 9 | # Metadata and privacy | ||
| 10 | |||
| 11 | Metadata consist of information that characterizes data. | ||
| 12 | Metadata are used to provide documentation for data products. | ||
| 13 | In essence, metadata answer who, what, when, where, why, and how about | ||
| 14 | every facet of the data that are being documented. | ||
| 15 | |||
| 16 | Metadata within a file can tell a lot about you. | ||
| 17 | Cameras record data about when a picture was taken and what | ||
| 18 | camera was used. Office applications automatically add author and | ||
| 19 | company information to PDFs, documents, and spreadsheets. | ||
| 20 | You might not want to disclose this information. | ||
| 21 | |||
| 22 | This is precisely the job of mat2: removing as much metadata as | ||
| 23 | possible. | ||
| 24 | |||
| 25 | mat2 provides: | ||
| 26 | - a library called `libmat2`; | ||
| 27 | - a command line tool called `mat2`; | ||
| 28 | - a service menu for Dolphin, KDE's default file manager. | ||
| 29 | |||
| 30 | If you prefer a regular graphical user interface, you might be interested in | ||
| 31 | [Metadata Cleaner](https://metadatacleaner.romainvigier.fr/), which uses | ||
| 32 | `mat2` under the hood. | ||
| 33 | |||
| 34 | # Requirements | ||
| 35 | |||
| 36 | - `python3-mutagen` for audio support | ||
| 37 | - `python3-gi-cairo` and `gir1.2-poppler-0.18` for PDF support | ||
| 38 | - `gir1.2-gdkpixbuf-2.0` for image support | ||
| 39 | - `gir1.2-rsvg-2.0` for SVG support | ||
| 40 | - `FFmpeg`, optionally, for video support | ||
| 41 | - `libimage-exiftool-perl` for everything else | ||
| 42 | - `bubblewrap`, optionally, for sandboxing | ||
| 43 | |||
| 44 | Please note that mat2 requires at least Python 3.5. | ||
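The requirements above can be checked quickly before installing anything. The following is a minimal sketch, not part of mat2: the executable names `exiftool`, `ffmpeg` and `bwrap` are assumptions about what the listed packages install, and the function names are hypothetical. mat2's own `--check-dependencies` flag remains the documented way to verify an installation.

```python
import shutil
import sys

# Assumed executable names; the packages listed above may install
# binaries under different names on some systems.
EXTERNAL_TOOLS = ("exiftool", "ffmpeg", "bwrap")


def python_is_recent_enough(version_info=sys.version_info):
    """Return True when the interpreter is at least Python 3.5."""
    return tuple(version_info[:2]) >= (3, 5)


def find_tools(names=EXTERNAL_TOOLS):
    """Map each executable name to its location on PATH, or None if absent."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    print("Python >= 3.5:", python_is_recent_enough())
    for name, path in find_tools().items():
        print(name, "->", path or "not found")
```

This only looks for executables on `PATH`; it does not check the Python bindings such as `python3-mutagen` or the GObject introspection packages.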
| 45 | |||
| 46 | # Requirements setup on macOS (OS X) using [Homebrew](https://brew.sh/) | ||
| 47 | |||
| 48 | ```bash | ||
| 49 | brew install exiftool cairo pygobject3 poppler gdk-pixbuf librsvg ffmpeg | ||
| 50 | ``` | ||
| 51 | |||
| 52 | # Running the test suite | ||
| 53 | |||
| 54 | ```bash | ||
| 55 | $ python3 -m unittest discover -v | ||
| 56 | ``` | ||
| 57 | |||
| 58 | And if you want to see the coverage: | ||
| 59 | |||
| 60 | ```bash | ||
| 61 | $ python3-coverage run --branch -m unittest discover -s tests/ | ||
| 62 | $ python3-coverage report -m --include '*/libmat2/*' | ||
| 63 | ``` | ||
| 64 | |||
| 65 | # How to use mat2 | ||
| 66 | |||
| 67 | ``` | ||
| 68 | usage: mat2 [-h] [-V] [--unknown-members policy] [--inplace] [--no-sandbox] | ||
| 69 | [-v] [-l] [--check-dependencies] [-L | -s] | ||
| 70 | [files [files ...]] | ||
| 71 | |||
| 72 | Metadata anonymisation toolkit 2 | ||
| 73 | |||
| 74 | positional arguments: | ||
| 75 | files the files to process | ||
| 76 | |||
| 77 | optional arguments: | ||
| 78 | -h, --help show this help message and exit | ||
| 79 | -V, --verbose show more verbose status information | ||
| 80 | --unknown-members policy | ||
| 81 | how to handle unknown members of archive-style files | ||
| 82 | (policy should be one of: abort, omit, keep) [Default: | ||
| 83 | abort] | ||
| 84 | --inplace clean in place, without backup | ||
| 85 | --no-sandbox Disable bubblewrap's sandboxing | ||
| 86 | -v, --version show program's version number and exit | ||
| 87 | -l, --list list all supported fileformats | ||
| 88 | --check-dependencies check if mat2 has all the dependencies it needs | ||
| 89 | -L, --lightweight remove SOME metadata | ||
| 90 | -s, --show list harmful metadata detectable by mat2 without | ||
| 91 | removing them | ||
| 92 | ``` | ||
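When mat2 is driven from a script, the usage text above maps naturally onto an argument list. The sketch below is one possible wrapper, not something shipped with mat2: `build_mat2_command` and `run_mat2` are hypothetical names, and only flags that appear in the usage text are used.

```python
import subprocess


def build_mat2_command(files, show=False, lightweight=False,
                       inplace=False, unknown_members=None):
    """Build an argument list for the mat2 command line tool."""
    if show and lightweight:
        # The usage text lists these as alternatives: [-L | -s].
        raise ValueError("--show and --lightweight are mutually exclusive")
    if unknown_members not in (None, "abort", "omit", "keep"):
        raise ValueError("policy should be one of: abort, omit, keep")

    command = ["mat2"]
    if show:
        command.append("--show")
    if lightweight:
        command.append("--lightweight")
    if inplace:
        command.append("--inplace")
    if unknown_members is not None:
        command.extend(["--unknown-members", unknown_members])
    command.extend(files)
    return command


def run_mat2(files, **options):
    """Run mat2 on the given files; raises CalledProcessError on failure."""
    return subprocess.run(build_mat2_command(files, **options), check=True)
```

For example, `build_mat2_command(["report.pdf"], lightweight=True)` yields `["mat2", "--lightweight", "report.pdf"]`. Passing a list to `subprocess.run` avoids shell quoting problems with unusual file names.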
| 93 | |||
| 94 | Note that, unless `--inplace` is passed, mat2 **will not** clean files in place: | ||
| 95 | given a file named "myfile.png", for example, it produces a cleaned copy named | ||
| 96 | "myfile.cleaned.png". | ||
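A script that post-processes mat2's output needs to know where the cleaned copy will be. The sketch below reproduces the naming convention described above for the "myfile.png" example; `cleaned_name` is a hypothetical helper, and how mat2 itself derives the name is not confirmed here.

```python
import os.path


def cleaned_name(path):
    """Insert ".cleaned" before the last extension of a file name."""
    root, extension = os.path.splitext(path)
    return root + ".cleaned" + extension


if __name__ == "__main__":
    print(cleaned_name("myfile.png"))
```

Note that `os.path.splitext` only splits off the last extension, so this sketch turns `archive.tar.gz` into `archive.tar.cleaned.gz`; check the real output before relying on it for multi-part extensions.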
| 97 | |||
| 98 | ## Web interface | ||
| 99 | |||
| 100 | It's possible to run mat2 as a web service, via | ||
| 101 | [mat2-web](https://0xacab.org/jvoisin/mat2-web). | ||
| 102 | |||
| 103 | If you're using WordPress, you might be interested in [wp-mat](https://git.autistici.org/noblogs/wp-mat) | ||
| 104 | and [wp-mat-server](https://git.autistici.org/noblogs/wp-mat-server). | ||
| 105 | |||
| 106 | ## Desktop GUI | ||
| 107 | |||
| 108 | For GNU/Linux desktops, it's possible to use the | ||
| 109 | [Metadata Cleaner](https://gitlab.com/rmnvgr/metadata-cleaner) GTK application. | ||
| 110 | |||
| 111 | # Supported formats | ||
| 112 | |||
| 113 | The following formats are supported: avi, bmp, css, epub/ncx, flac, gif, jpeg, | ||
| 114 | m4a/mp2/mp3/…, mp4, odc/odf/odg/odi/odp/ods/odt/…, ogg/opus/oga/spx/…, pdf, | ||
| 115 | png, ppm, pptx/xlsx/docx/…, svg/svgz/…, tar/tar.gz/tar.bz2/tar.xz/…, tiff, | ||
| 116 | torrent, wav, wmv, zip, … | ||
| 117 | |||
| 118 | # Notes about detecting metadata | ||
| 119 | |||
| 120 | While mat2 does its best to display metadata when the `--show` flag is | ||
| 121 | passed, the absence of output does not mean that a file is free of | ||
| 122 | metadata. There is no reliable way to detect every possible piece of metadata in | ||
| 123 | complex file formats. | ||
| 124 | |||
| 125 | This is why you shouldn't rely on the output of `--show` to decide whether a file | ||
| 126 | needs to be cleaned. | ||
| 127 | |||
| 128 | # Notes about the lightweight mode | ||
| 129 | |||
| 130 | By default, mat2 might slightly alter the data of your files in order to remove | ||
| 131 | as much metadata as possible. For example, text in PDFs might no longer be selectable, | ||
| 132 | compressed images might get compressed again, … | ||
| 133 | Since some users might be willing to accept that some metadata remains in exchange | ||
| 134 | for the guarantee that mat2 won't modify the data of their files, the | ||
| 135 | `-L` flag does precisely that. | ||
| 136 | |||
| 137 | # Related software | ||
| 138 | |||
| 139 | - The first iteration of [MAT](https://mat.boum.org) | ||
| 140 | - [Exiftool](https://sno.phy.queensu.ca/~phil/exiftool/mat) | ||
| 141 | - [pdf-redact-tools](https://github.com/firstlookmedia/pdf-redact-tools), which | ||
| 142 | tries to deal with *printer dots* too. | ||
| 143 | - [pdfparanoia](https://github.com/kanzure/pdfparanoia), which removes | ||
| 144 | watermarks from PDFs. | ||
| 145 | - [Scrambled Exif](https://f-droid.org/packages/com.jarsilio.android.scrambledeggsif/), | ||
| 146 | an open-source Android application to remove metadata from pictures. | ||
| 147 | - [Dangerzone](https://dangerzone.rocks/), designed to sanitize harmful documents | ||
| 148 | into harmless ones. | ||
| 149 | |||
| 150 | # Contact | ||
| 151 | |||
| 152 | If possible, use the [issues system](https://github.com/jvoisin/mat2/issues) | ||
| 153 | or the [mailing list](https://www.autistici.org/mailman/listinfo/mat-dev). | ||
| 154 | Should a more private contact be needed (e.g. for reporting security issues), | ||
| 155 | you can email Julien (jvoisin) Voisin at `julien.voisin+mat2@dustri.org`, | ||
| 156 | using the gpg key `9FCDEE9E1A381F311EA62A7404D041E8171901CC`. | ||
| 157 | |||
| 158 | # Donations | ||
| 159 | |||
| 160 | If you want to donate some money, please give it to [Tails](https://tails.boum.org/donate/?r=contribute). | ||
| 161 | |||
| 162 | # License | ||
| 163 | |||
| 164 | This program is free software: you can redistribute it and/or modify | ||
| 165 | it under the terms of the GNU Lesser General Public License as published by | ||
| 166 | the Free Software Foundation, either version 3 of the License, or | ||
| 167 | (at your option) any later version. | ||
| 168 | |||
| 169 | This program is distributed in the hope that it will be useful, | ||
| 170 | but WITHOUT ANY WARRANTY; without even the implied warranty of | ||
| 171 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | ||
| 172 | GNU General Public License for more details. | ||
| 173 | |||
| 174 | You should have received a copy of the GNU Lesser General Public License | ||
| 175 | along with this program. If not, see <http://www.gnu.org/licenses/>. | ||
| 176 | |||
| 177 | Copyright 2018 Julien (jvoisin) Voisin <julien.voisin+mat2@dustri.org> | ||
| 178 | Copyright 2016 Marie-Rose for mat2's logo | ||
| 179 | |||
| 180 | The `tests/data/dirty_with_nsid.docx` file is licensed under GPLv3, | ||
| 181 | and was borrowed from the Calibre project: https://calibre-ebook.com/downloads/demos/demo.docx | ||
| 182 | |||
| 183 | The `narrated_powerpoint_presentation.pptx` file is in the public domain. | ||
| 184 | |||
| 185 | # Thanks | ||
| 186 | |||
| 187 | mat2 wouldn't exist without: | ||
| 188 | |||
| 189 | - the [Google Summer of Code](https://summerofcode.withgoogle.com/); | ||
| 190 | - the fine people from [Tails](https://tails.boum.org); | ||
| 191 | - friends | ||
| 192 | |||
| 193 | Many thanks to them! | ||
