path: root/libmat2
Age | Commit message | Author
2018-10-23 | Improve type annotation coverage | jvoisin
2018-10-23 | Implement lightweight cleaning for png and tiff | jvoisin
2018-10-23 | Optimize the handling of problematic files | jvoisin
2018-10-22 | Improve problematic filenames support | jvoisin
2018-10-22 | Test mat2's reliability wrt. corrupted video files | jvoisin
2018-10-22 | Implement support for .avi files, via ffmpeg | jvoisin
- This commit introduces optional dependencies (namely ffmpeg): mat2 will spit a warning when trying to process an .avi file if ffmpeg isn't installed.
- Since metadata are obtained via exiftool, this commit also refactors our exiftool wrapper a bit.
2018-10-12 | Bump mypy typing coverage | jvoisin
2018-10-12 | Refactor lightweight mode implementation | jvoisin
2018-10-11 | Implement recursive metadata for FLAC files | jvoisin
Since FLAC files can contain covers, it makes sense to parse their metadata
2018-10-11 | Delete pictures of FLAC files | jvoisin
2018-10-05 | Improve both the typing and the comments | jvoisin
2018-10-05 | Hide unsupported extensions in `mat2 -l` | jvoisin
2018-10-04 | Trash word/people.xml in office files | jvoisin
2018-10-03 | libmat2: fix shebang | georg
Relates 0a2a398c9c797f8a93e8a4d91e80c0582f127354
2018-10-03 | Don't break office files for MS Office | jvoisin
We didn't take the whitelist into account while removing dangling files from [Content_Types].xml
2018-10-03 | Improve mat2's cli reliability | jvoisin
- Replace some class members by instance members
- Don't thread the cleaning process anymore for now
2018-10-02 | Use [Content_Types].xml to improve MS Office coverage | jvoisin
2018-10-02 | fix typo | georg
2018-10-01 | Files processed via MAT2 are now accepted without warnings by MS Office | jvoisin
2018-09-30 | Please mypy | jvoisin
2018-09-30 | Remove dangling references in MS Office's [Content_Types].xml | jvoisin
2018-09-24 | Second pass of minor formatting | jvoisin
2018-09-24 | Fix some minor formatting issues | jvoisin
2018-09-24 | Implement rsid stripping for office files | jvoisin
MS Office XML rsid is a "unique identifier used to track the editing session when the physical character representing this section mark was last formatted."
See the following links for details:
- https://msdn.microsoft.com/en-us/library/office/documentformat.openxml.wordprocessing.previoussectionproperties.rsidrpr.aspx
- https://blogs.msdn.microsoft.com/brian_jones/2006/12/11/whats-up-with-all-those-rsids/
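A minimal sketch of what rsid stripping can look like, assuming the identifiers show up as attributes whose local name starts with `rsid` (for example `w:rsidR`). The function name `strip_rsid_attributes` is hypothetical, and this is not necessarily how mat2 implements it.

```python
import xml.etree.ElementTree as ET


def strip_rsid_attributes(root: ET.Element) -> int:
    """Delete every attribute whose local name starts with 'rsid'.

    Returns the number of attributes removed. Hypothetical helper,
    written for illustration only.
    """
    removed = 0
    for element in root.iter():
        # Copy the keys first: we mutate the mapping while looping.
        for key in list(element.attrib):
            # ElementTree stores namespaced names as '{uri}local'.
            local_name = key.rsplit("}", 1)[-1]
            if local_name.startswith("rsid"):
                del element.attrib[key]
                removed += 1
    return removed


# Made-up sample paragraph with three rsid attributes.
SAMPLE = (
    '<w:p xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" '
    'w:rsidR="00A1B2C3" w:rsidRDefault="00D4E5F6">'
    '<w:r w:rsidRPr="00112233"><w:t>hello</w:t></w:r></w:p>'
)
sample_root = ET.fromstring(SAMPLE)
removed_count = strip_rsid_attributes(sample_root)
```

The sketch only touches attributes; it leaves elements and text content alone.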
2018-09-24 | Lexicographical sort on xml attributes for office files | jvoisin
In XML, the order of attributes shouldn't be meaningful; however, MS Office sorts the attributes of a given XML tag differently than LibreOffice.
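As an illustration of sorting attributes lexicographically, here is a small sketch using the standard library's `xml.etree.ElementTree`. The helper name `sort_attributes` is made up for this example, and it assumes Python 3.8 or later, where ElementTree serializes attributes in insertion order.

```python
import xml.etree.ElementTree as ET


def sort_attributes(root: ET.Element) -> None:
    """Reorder the attributes of every element, in place, by name.

    Hypothetical helper for illustration; relies on dict insertion order.
    """
    for element in root.iter():
        ordered = sorted(element.attrib.items())
        element.attrib.clear()
        element.attrib.update(ordered)


# Made-up document with attributes in non-sorted order.
doc = ET.fromstring('<p z="1" a="2"><r m="3" b="4"/></p>')
sort_attributes(doc)
serialized = ET.tostring(doc, encoding="unicode")
```

Sorting gives a stable serialization regardless of which office suite wrote the file.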
2018-09-18 | Insert archive members in lexicographic order | jvoisin
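A rough sketch of writing archive members in lexicographic order with the standard `zipfile` module. The function name `rewrite_sorted` is an assumption for this example, not part of libmat2's API. Note that `writestr` with a plain name does not copy the original per-member metadata such as timestamps.

```python
import io
import zipfile


def rewrite_sorted(source: bytes) -> bytes:
    """Return a new zip archive whose members are stored in sorted name order.

    Hypothetical helper for illustration only.
    """
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(source)) as zin, \
            zipfile.ZipFile(out, "w") as zout:
        for name in sorted(zin.namelist()):
            zout.writestr(name, zin.read(name))
    return out.getvalue()


def member_names(archive: bytes) -> list:
    """List member names in the order they are stored in the archive."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        return zf.namelist()


# Build a small in-memory archive with members in non-sorted order.
_buffer = io.BytesIO()
with zipfile.ZipFile(_buffer, "w") as _zf:
    _zf.writestr("word/document.xml", "<doc/>")
    _zf.writestr("[Content_Types].xml", "<types/>")
    _zf.writestr("docProps/app.xml", "<app/>")
unsorted_archive = _buffer.getvalue()
sorted_archive = rewrite_sorted(unsorted_archive)
```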
2018-09-12 | Bump coverage back to 100% | jvoisin
2018-09-09 | Improve the resilience of MAT2 wrt. corrupted PNG | jvoisin
2018-09-06 | Make pylint happy | jvoisin
2018-09-06 | Split office and archives | jvoisin
2018-09-06 | Change a bit the previous commit | jvoisin
2018-09-05 | Unknown Members: make policy use an Enum | Daniel Kahn Gillmor
Closes #60
Note: this changeset also ensures that clean.cleaned.docx is cleaned up after the pytest run is over.
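To make the idea of an Enum-based policy concrete, here is a hedged sketch. The values 'abort', 'omit' and 'keep' come from the commit messages in this log; the class name, the `select_members` helper and its signature are assumptions made for illustration, not libmat2's actual code.

```python
import logging
from enum import Enum
from typing import Iterable, List, Set


class UnknownMemberPolicy(Enum):
    """What to do with archive members that no parser can handle."""
    ABORT = "abort"
    OMIT = "omit"
    KEEP = "keep"


def select_members(members: Iterable[str], supported: Set[str],
                   policy: UnknownMemberPolicy) -> List[str]:
    """Return the members to keep, warning about every unknown one.

    Hypothetical helper: all members are inspected before aborting,
    so the caller sees every warning at once.
    """
    kept: List[str] = []
    unknown: List[str] = []
    for name in members:
        if name in supported:
            kept.append(name)
            continue
        logging.warning("unknown member: %s", name)
        unknown.append(name)
        if policy is UnknownMemberPolicy.KEEP:
            kept.append(name)
    if unknown and policy is UnknownMemberPolicy.ABORT:
        raise ValueError("unknown members: " + ", ".join(unknown))
    return kept
```

Using an Enum rather than bare strings means a misspelled policy fails at lookup time, for instance `UnknownMemberPolicy("omti")` raises `ValueError`.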
2018-09-05 | Remove defusedxml support and document why | jvoisin
2018-09-05 | Improve the previous commit | jvoisin
2018-09-04 | office: try all members, even when one fails | Daniel Kahn Gillmor
The end result will be the same (an abort), but the user will get to see all the warnings for a particular file at once, instead of getting them one at a time.
2018-09-04 | document all unknown/unhandlable files even on abort | Daniel Kahn Gillmor
This makes it easy to get a list of all files that mat2 doesn't know how to handle, without having to choose -u keep or -u omit.
2018-09-04 | office: create policy for what to do about unknown members | Daniel Kahn Gillmor
Previously, encountering an unknown member meant that any parser of this type would abort. Now, the user can set parser.unknown_member_policy to either 'omit' or 'keep' if they don't want the current action of 'abort'.
Note that this causes pylint to complain about branching depth for remove_all() because of the nuanced error-handling. I've disabled this check.
2018-09-01 | Bump the coverage back to 100% | jvoisin
2018-09-01 | three minor spelling fixes | Daniel Kahn Gillmor
2018-09-01 | Add archlinux to the CI | jvoisin
2018-09-01 | Fix a minor formatting issue | jvoisin
2018-09-01 | Logging cleanup | dkg
2018-08-23 | Improve the detection of unsupported extensions in uppercase | jvoisin
2018-08-23 | libmat2: images: fix handling of .JPG files | Antoine Tenart
Pixbuf only supports .jpeg files, not .jpg, so libmat2 looks for such an extension and converts it if necessary. As this check is case sensitive, processing .JPG files does not work.
Fixes #47.
Signed-off-by: Antoine Tenart <antoine.tenart@ack.tf>
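The case-sensitivity problem described above can be sketched as follows. The function name `normalized_extension` is hypothetical; the point is only that lowercasing the extension before comparing makes `.JPG` and `.jpg` behave the same.

```python
import os


def normalized_extension(filename: str) -> str:
    """Return the lowercased extension, mapping '.jpg' to '.jpeg'.

    Hypothetical helper for illustration; not libmat2's actual code.
    """
    _, extension = os.path.splitext(filename)
    extension = extension.lower()
    if extension == ".jpg":
        return ".jpeg"
    return extension
```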
2018-07-21 | AbstractParser: Fix typos | georg
2018-07-19 | Improve the code's documentation | jvoisin
2018-07-19 | Minor simplification in how we're handling xml for office files | jvoisin
2018-07-15 | Add a check for a missed dependency in `./mat2 -c` | jvoisin
2018-07-10 | Remove `print` from libmat, and use the `logging` module instead | jvoisin
This should close #28
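A short sketch of the general pattern of reporting through `logging` rather than `print` in library code. The logger name, the `check_supported` function and its default extension list are all assumptions made for this example.

```python
import logging
from typing import Tuple

# Hypothetical logger name; a library would typically use its own module name.
logger = logging.getLogger("example_parser")


def check_supported(filename: str,
                    supported: Tuple[str, ...] = (".png", ".jpeg")) -> bool:
    """Return True if the extension is supported, logging a warning otherwise."""
    if not filename.lower().endswith(supported):
        # The caller decides where (and whether) this message is shown.
        logger.warning("%s has an unsupported extension", filename)
        return False
    return True
```

With `logging`, a command-line front end can attach its own handler and verbosity level, while other users of the library can silence or redirect the messages.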
2018-07-10 | Implement a check for dependencies in mat2 | jvoisin
Example use:
```
$ mat2 -c
Dependencies required for MAT2 0.1.3:
- Cairo: yes
- Exiftool: yes
- GdkPixbuf from PyGobject: yes
- Mutagen: yes
- Poppler from PyGobject: yes
- PyGobject: yes
```
This should close #35
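One way such a dependency check could be written, as a sketch only: it probes an executable with `shutil.which` and Python modules with `importlib.util.find_spec`. The function names, the choice of probes, and the module names `mutagen` and `gi` are assumptions; the result depends on the machine it runs on.

```python
import importlib.util
import shutil
from typing import Dict


def check_dependencies() -> Dict[str, bool]:
    """Probe a few dependencies and report whether each one is available.

    Hypothetical helper; the probes below are assumptions, not mat2's code.
    """
    return {
        "Exiftool": shutil.which("exiftool") is not None,
        "Mutagen": importlib.util.find_spec("mutagen") is not None,
        "PyGobject": importlib.util.find_spec("gi") is not None,
    }


def format_report(status: Dict[str, bool]) -> str:
    """Render the status as '- Name: yes/no' lines, sorted by name."""
    return "\n".join(
        "- %s: %s" % (name, "yes" if available else "no")
        for name, available in sorted(status.items())
    )
```

Calling `print(format_report(check_dependencies()))` prints one line per dependency in the same `- Name: yes` style as the example output.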