| Age | Commit message | Author | |
|---|---|---|---|
| 2018-10-01 | Files processed via MAT2 are now accepted without warnings by MS Office | jvoisin | |
| 2018-09-30 | Please mypy | jvoisin | |
| 2018-09-30 | Remove dangling references in MS Office's [Content_types].xml | jvoisin | |
| 2018-09-24 | Second pass of minor formatting | jvoisin | |
| 2018-09-24 | Fix some minor formatting issues | jvoisin | |
| 2018-09-24 | Implement rsid stripping for office files | jvoisin | |
| MS Office XML rsid is a "unique identifier used to track the editing session when the physical character representing this section mark was last formatted." See the following links for details: https://msdn.microsoft.com/en-us/library/office/documentformat.openxml.wordprocessing.previoussectionproperties.rsidrpr.aspx and https://blogs.msdn.microsoft.com/brian_jones/2006/12/11/whats-up-with-all-those-rsids/ | |||
| 2018-09-24 | Lexicographical sort on xml attributes for office files | jvoisin | |
| In XML, the order of attributes shouldn't be meaningful; however, MS Office sorts the attributes of a given XML tag differently than LibreOffice. | |||
| 2018-09-06 | Split office and archives | jvoisin | |
| 2018-09-05 | Unknown Members: make policy use an Enum | Daniel Kahn Gillmor | |
| Closes #60. Note: this changeset also ensures that clean.cleaned.docx is removed after the pytest run is over. | |||
| 2018-09-05 | Remove defusedxml support and document why | jvoisin | |
| 2018-09-05 | Improve the previous commit | jvoisin | |
| 2018-09-04 | office: try all members, even when one fails | Daniel Kahn Gillmor | |
| the end result will be the same -- an abort -- but the user will get to see all the warnings for a particular file, instead of getting them one at a time. | |||
| 2018-09-04 | document all unknown/unhandlable files even on abort | Daniel Kahn Gillmor | |
| This makes it easy to get a list of all files that mat2 doesn't know how to handle, without having to choose -u keep or -u omit. | |||
| 2018-09-04 | office: create policy for what to do about unknown members | Daniel Kahn Gillmor | |
| Previously, encountering an unknown member meant that any parser of this type would abort. Now, the user can set parser.unknown_member_policy to either 'omit' or 'keep' if they don't want the current action of 'abort'. Note that this causes pylint to complain about branching depth for remove_all() because of the nuanced error handling; I've disabled this check. | |||
| 2018-09-01 | Fix a minor formatting issue | jvoisin | |
| 2018-09-01 | Logging cleanup | dkg | |
| 2018-07-19 | Improve the code's documentation | jvoisin | |
| 2018-07-19 | Minor simplification in how we're handling xml for office files | jvoisin | |
| 2018-07-10 | Remove `print` from libmat, and use the `logging` module instead | jvoisin | |
| This should close #28 | |||
| 2018-07-09 | Make pylint even happier | jvoisin | |
| 2018-07-08 | Fix some pep8 issues spotted by pyflakes | jvoisin | |
| 2018-07-08 | Achieve 100% coverage! | jvoisin | |
| 2018-07-08 | Bump coverage for office files and fix some related crashes | jvoisin | |
| 2018-07-08 | Silence a stupid mypy warning | jvoisin | |
| 2018-07-08 | Add defusedxml as an (optional) way to prevent XML-based attacks | jvoisin | |
| Those attacks are DoS-only. | |||
| 2018-07-07 | Fix a mistake in office file revisions handling | jvoisin | |
| 2018-07-02 | Slightly improve the formatting of the code thanks to pyflakes3 | jvoisin | |
| 2018-07-01 | Remove docx revisions | jvoisin | |
| 2018-07-01 | MAT2 is now cleaning revisions from odt files! | jvoisin | |
| 2018-07-01 | Remove the thumbnails from libreoffice files | jvoisin | |
| 2018-06-27 | Massively simplify how we're cleaning office files | jvoisin | |
| 2018-06-21 | Improve the reliability of the office parser | jvoisin | |
| 2018-06-21 | Fix some linter warnings | jvoisin | |
| 2018-06-21 | Refactor how office files are handled | jvoisin | |
| XML files are no longer considered harmless; factorization of the `remove_all` method for office files; explicit whitelists are used; blacklists are used to skip files completely; non-blacklisted files are _still cleaned_; unsupported files still trigger an error. | |||
| 2018-06-21 | Minor simplification of the office-related code | jvoisin | |
| 2018-06-10 | Minor code simplification | jvoisin | |
| 2018-06-10 | Make the parsing of office format's metadata more robust | jvoisin | |
| 2018-06-10 | Add some tests for non-supported embedded fileformats | jvoisin | |
| 2018-06-04 | Add more typing and use mypy in the CI | jvoisin | |
| 2018-05-18 | Rename some files to simplify packaging | jvoisin | |
| The `src` folder is now `libmat2`; the `main.py` script is now `mat2.py`. | |||
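
The two 2018-09-24 commits on rsid stripping and lexicographical attribute sorting describe normalisation steps for office XML. A minimal sketch of both, assuming the standard library's `xml.etree.ElementTree`; the helper names `local_name` and `normalise_attributes` are hypothetical, and this is not necessarily how libmat2 implements either step:

```python
import xml.etree.ElementTree as ET


def local_name(qualified: str) -> str:
    """Return the attribute name without its '{namespace}' prefix."""
    return qualified.rsplit("}", 1)[-1]


def normalise_attributes(xml_text: str) -> str:
    """Drop rsid* attributes and re-insert the rest in sorted order.

    Assumption: every attribute whose local name starts with 'rsid'
    is an editing-session identifier that can be removed safely.
    """
    root = ET.fromstring(xml_text)
    for element in root.iter():
        kept = {
            key: value
            for key, value in element.attrib.items()
            if not local_name(key).startswith("rsid")
        }
        element.attrib.clear()
        # Sorting on the qualified '{namespace}name' key gives a
        # deterministic order regardless of which suite wrote the file.
        for key in sorted(kept):
            element.set(key, kept[key])
    return ET.tostring(root, encoding="unicode")
```

Note that `ElementTree` rewrites namespace prefixes (for example `w:` becomes `ns0:`) unless they are registered first, so a real cleaner would need additional care to keep the output acceptable to MS Office.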
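
The 2018-09-04 and 2018-09-05 commits describe a policy for unknown archive members with the values 'abort', 'omit' and 'keep', later turned into an Enum, and a change so that every member is tried before aborting. A rough sketch of that behaviour under stated assumptions: the class name `UnknownMemberPolicy` and the function `select_members` are illustrative only and are not taken from the log.

```python
import logging
from enum import Enum
from typing import Iterable, List, Optional, Set


class UnknownMemberPolicy(Enum):
    # Values mirror the three actions named in the commit message.
    ABORT = "abort"
    OMIT = "omit"
    KEEP = "keep"


def select_members(
    names: Iterable[str],
    known: Set[str],
    policy: UnknownMemberPolicy,
) -> Optional[List[str]]:
    """Return the members to write out, or None if processing must abort.

    Every member is inspected even under ABORT, so the user sees all
    warnings for a file at once instead of one per run.
    """
    kept: List[str] = []
    abort = False
    for name in names:
        if name in known:
            kept.append(name)
            continue
        logging.warning("unknown member: %s", name)
        if policy is UnknownMemberPolicy.ABORT:
            abort = True
        elif policy is UnknownMemberPolicy.KEEP:
            kept.append(name)
        # Under OMIT the member is silently left out of the result.
    return None if abort else kept
```

Using an Enum rather than bare strings means a misspelled policy fails loudly at construction, for example `UnknownMemberPolicy("kep")` raises `ValueError`.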
