Age | Commit message | Author
2018-10-03 | Bump the changelog (0.4.0) | jvoisin
2018-10-03 | Don't break office files for MS Office | jvoisin
We didn't take the whitelist into account while removing dangling files from [Content_Types].xml
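As an illustration of the fix described above, here is a minimal sketch of pruning `Override` entries from `[Content_Types].xml` while respecting the set of files that are kept. The function name and the `kept_members` parameter are hypothetical, and this is not mat2's actual code; it only assumes that whitelisted files are included in `kept_members`.

```python
import xml.etree.ElementTree as ET

# Namespace used by the Open Packaging Conventions content-types part.
CT_NS = 'http://schemas.openxmlformats.org/package/2006/content-types'


def drop_dangling_overrides(root, kept_members):
    """Remove <Override> entries that point at files absent from the archive.

    `kept_members` (hypothetical parameter) must contain every file that
    stays in the archive, including whitelisted ones; forgetting those is
    the kind of bug the commit above describes.
    """
    removed = []
    for override in list(root.findall('{%s}Override' % CT_NS)):
        part = override.get('PartName', '').lstrip('/')
        if part not in kept_members:
            root.remove(override)
            removed.append(part)
    return removed
```

An entry whose target was deleted is dropped, while an entry for a kept (for example whitelisted) file stays in place.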
2018-10-03 | Remove file left behind by the testsuite | jvoisin
2018-10-03 | Fix the testsuite | jvoisin
2018-10-03 | Improve mat2's cli reliability | jvoisin
- Replace some class members by instance members
- Don't thread the cleaning process anymore for now
2018-10-02 | Use [Content_Types].xml to improve MS Office coverage | jvoisin
2018-10-02 | fix typo | georg
2018-10-02 | Check that cleaning twice doesn't break the file | jvoisin
2018-10-02 | Silence a bit the testsuite | jvoisin
2018-10-02 | Update the CONTRIBUTING.md file wrt. the previous commit | jvoisin
2018-10-01 | manpage: this is about mat2, not mat | georg
2018-10-01 | Files processed via MAT2 are now accepted without warnings by MS Office | jvoisin
2018-10-01 | Fix a typo in the README spotted by @georg | jvoisin
2018-09-30 | Please mypy | jvoisin
2018-09-30 | Remove dangling references in MS Office's [Content_Types].xml | jvoisin
2018-09-26 | Document mat2's output scheme in the manpage as well | jvoisin
2018-09-26 | Document the output scheme in the README | jvoisin
2018-09-25 | Run the testsuite exclusively on Whitewhale for now | jvoisin
This should fix the intermittent failures, thanks to @pollo for the tip
2018-09-24 | Second pass of minor formatting | jvoisin
2018-09-24 | Fix some minor formatting issues | jvoisin
2018-09-24 | Implement rsid stripping for office files | jvoisin
MS Office XML rsid is a "unique identifier used to track the editing session when the physical character representing this section mark was last formatted." See the following links for details:
- https://msdn.microsoft.com/en-us/library/office/documentformat.openxml.wordprocessing.previoussectionproperties.rsidrpr.aspx
- https://blogs.msdn.microsoft.com/brian_jones/2006/12/11/whats-up-with-all-those-rsids/
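To make the idea concrete, here is a minimal sketch of rsid stripping on a parsed WordprocessingML tree. It is an assumption-laden illustration rather than mat2's implementation: the function name is hypothetical, and it only removes attributes whose local name starts with `rsid`, not other rsid-related elements.

```python
import xml.etree.ElementTree as ET


def strip_rsid_attributes(root):
    """Delete every attribute whose local name starts with 'rsid'.

    Returns the number of attributes removed. ElementTree stores
    namespaced attribute names as '{namespace}local', so the namespace
    part is split off before the check.
    """
    removed = 0
    for element in root.iter():
        for name in list(element.attrib):
            local_name = name.rsplit('}', 1)[-1]
            if local_name.lower().startswith('rsid'):
                del element.attrib[name]
                removed += 1
    return removed
```

Attributes such as `w:rsidR`, `w:rsidRPr` and `w:rsidRDefault` are dropped, while unrelated attributes are left untouched.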
2018-09-24 | Lexicographical sort on xml attributes for office files | jvoisin
In XML, the order of the attributes shouldn't be meaningful; however, MS Office sorts attributes for a given XML tag differently than LibreOffice.
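A minimal sketch of such a normalisation, assuming the document is handled with Python's `xml.etree.ElementTree` (the function name is hypothetical and this is not taken from mat2's source). It rebuilds each element's attribute dictionary in sorted key order, so the serialised output no longer reveals which office suite wrote the file.

```python
import xml.etree.ElementTree as ET


def sort_attributes(root):
    """Reorder the attributes of every element lexicographically by name."""
    for element in root.iter():
        sorted_items = sorted(element.attrib.items())
        # attrib is a plain dict: clearing and refilling it resets the
        # insertion order, which recent Python versions keep on output.
        element.attrib.clear()
        element.attrib.update(sorted_items)
```

Namespaced attributes are sorted by their expanded `{namespace}local` name, which is how ElementTree stores them.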
2018-09-20 | Add a test for zip ordering | jvoisin
2018-09-20 | Make pyflakes happy | jvoisin
2018-09-20 | Split the tests | jvoisin
2018-09-18 | Insert archive members in lexicographic order | jvoisin
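The ordering change can be sketched with the standard `zipfile` module. This is only an illustration under the assumption that the archive fits in memory; the function name is hypothetical and the real code may treat some members specially.

```python
import io
import zipfile


def rewrite_sorted(src):
    """Return a copy of the zip archive `src` (bytes) with sorted members."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(src)) as zin, \
            zipfile.ZipFile(out, 'w') as zout:
        # Sorting by file name hides the order in which the original
        # producer inserted the members.
        for info in sorted(zin.infolist(), key=lambda i: i.filename):
            zout.writestr(info, zin.read(info))
    return out.getvalue()
```

Member contents are copied unchanged; only the order of the entries differs in the output archive.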
2018-09-17 | Add a link to the gentoo overlay | jvoisin
2018-09-12 | trivial modification of all shebang. | Yoann Lamouroux
`/usr/bin/python3` -> `/usr/bin/env python3`
It's always better to trust the environment-defined path to bin/python, as virtualenv has become the way to go.
2018-09-12 | Bump coverage back to 100% | jvoisin
2018-09-09 | Improve the resilience of MAT2 wrt. corrupted PNG | jvoisin
2018-09-06 | Fix a setuptool-related warning | jvoisin
2018-09-06 | Make pylint happy | jvoisin
2018-09-06 | Split office and archives | jvoisin
2018-09-06 | Improve a cli test resilience | jvoisin
2018-09-06 | Mention "Scrambled Exif" as a related software | jvoisin
2018-09-06 | Change a bit the previous commit | jvoisin
2018-09-05 | Unknown Members: make policy use an Enum | Daniel Kahn Gillmor
Closes #60
Note: this changeset also ensures that clean.cleaned.docx is cleaned up after the pytest run is over.
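An Enum-based policy of this kind could look roughly like the sketch below. The three values mirror the 'abort', 'omit' and 'keep' options named in the related commits; the class layout, the `select_members` helper and its parameters are assumptions made for illustration, not the project's actual API.

```python
import enum


class UnknownMemberPolicy(enum.Enum):
    ABORT = 'abort'
    OMIT = 'omit'
    KEEP = 'keep'


def select_members(members, supported, policy):
    """Return the archive members to keep under the given policy.

    `members` is an iterable of member names and `supported` a set of
    names the parser knows how to clean (both hypothetical inputs).
    """
    kept = []
    unknown = []
    for name in members:
        if name in supported:
            kept.append(name)
        else:
            unknown.append(name)
            if policy is UnknownMemberPolicy.KEEP:
                kept.append(name)
    # Abort only after the whole archive has been inspected, so that
    # every unknown member is reported at once.
    if unknown and policy is UnknownMemberPolicy.ABORT:
        raise ValueError('unknown members: ' + ', '.join(unknown))
    return kept
```

Compared with plain strings, the Enum rejects misspelled policies early: `UnknownMemberPolicy('kep')` raises a `ValueError` instead of silently falling through to a default branch.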
2018-09-05 | spelling correction. | Daniel Kahn Gillmor
while mat2 has both a thread model (a thread pool that strips metadata in parallel) and a threat model (a list of malicious adversaries and their capabilities that we are trying to defeat), i think this paragraph is talking about the latter.
2018-09-05 | Remove defusedxml support and document why | jvoisin
2018-09-05 | Remove short version of dangerous/advanced options | jvoisin
2018-09-05 | Add missing dependencies for the Nautilus extension to INSTALL.md | Christian
2018-09-05 | Make sure target directory exists, assume MAT2 is in parent directory | Christian
2018-09-05 | Improve the previous commit | jvoisin
2018-09-04 | office: try all members, even when one fails | Daniel Kahn Gillmor
the end result will be the same -- an abort -- but the user will get to see all the warnings for a particular file, instead of getting them one at a time.
2018-09-04 | document all unknown/unhandlable files even on abort | Daniel Kahn Gillmor
This makes it easy to get a list of all files that mat2 doesn't know how to handle, without having to choose -u keep or -u omit.
2018-09-04 | add --unknown-members argument to mat2 | Daniel Kahn Gillmor
This allows the user to make use of parser.unknown_member_policy for archive formats. At the suggestion of @jvoisin, it also prints a scary warning if the user explicitly chooses 'keep'.
2018-09-04 | office: create policy for what to do about unknown members | Daniel Kahn Gillmor
previously, encountering an unknown member meant that any parser of this type would abort. now, the user can set parser.unknown_member_policy to either 'omit' or 'keep' if they don't want the current action of 'abort'.
note that this causes pylint to complain about branching depth for remove_all() because of the nuanced error-handling. I've disabled this check.
2018-09-03 | Update the release process to create signed tarballs | jvoisin
2018-09-01 | Bump the coverage back to 100% | jvoisin
2018-09-01 | Add a link to the mailing list | jvoisin