mirror of https://github.com/ilri/csv-metadata-quality.git synced 2024-11-24 14:50:17 +01:00

Strip filename descriptions before checking
Some checks failed
continuous-integration/drone/push Build is failing

When checking for uncommon file extensions in the filename field,
we should strip descriptions that are meant for SAFBuilder, for
example: Annual_Report_2020.pdf__description:Report. Left in place,
the description suffix causes a false positive that spams the
output with warnings.
Alan Orth 2023-02-13 10:59:14 +03:00
parent bde38e9ed4
commit 8bc4cd419c
Signed by: alanorth
GPG Key ID: 0FB860CC9C45B1B9
2 changed files with 7 additions and 0 deletions

@@ -15,6 +15,8 @@ because it is deprecated and outdated
- Don't run `fix.separators()` on title or abstract fields
- Don't run whitespace or newline fixes on abstract fields
- Ignore some common non-SPDX licenses
- Ignore `__description` suffix in filenames meant for SAFBuilder when checking
  for uncommon file extensions

### Updated
- Python dependencies

@@ -286,6 +286,11 @@ def filename_extension(field):
    # Iterate over all values
    for value in values:
        # Strip filename descriptions that are meant for SAFBuilder, for
        # example: Annual_Report_2020.pdf__description:Report
        if "__description" in value:
            value = value.split("__")[0]

        # Assume filename extension does not match
        filename_extension_match = False
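
The stripping step in this hunk can be tried in isolation. The following is a minimal sketch, not the project's code: `strip_description`, `has_common_extension`, and the `COMMON_EXTENSIONS` list are hypothetical names invented for illustration, and the regular-expression check is an assumption about how an extension test might look. It splits on the full `__description` marker instead of on any `__`, because `value.split("__")[0]` would also cut a filename at an earlier double underscore.

```python
import re

# Hypothetical list for illustration only; the project's real list of
# common extensions is not shown in this diff.
COMMON_EXTENSIONS = ["pdf", "docx", "xlsx", "jpg", "png"]


def strip_description(filename: str) -> str:
    """Return the filename without a `__description:...` suffix."""
    # partition() splits on the first occurrence of the full marker, so a
    # double underscore elsewhere in the name is left alone. If the marker
    # is absent, the whole filename is returned unchanged.
    name, _, _ = filename.partition("__description")
    return name


def has_common_extension(filename: str) -> bool:
    """Check whether the stripped filename ends in a listed extension."""
    name = strip_description(filename)
    return any(
        re.search(rf"\.{re.escape(ext)}$", name, re.IGNORECASE)
        for ext in COMMON_EXTENSIONS
    )


if __name__ == "__main__":
    for candidate in [
        "Annual_Report_2020.pdf__description:Report",
        "My__File.pdf__description:Report",
        "data.xyz__description:Odd",
    ]:
        print(candidate, has_common_extension(candidate))
```

Whether filenames with an extra double underscore occur in practice depends on the data; if they never do, the simpler `split("__")[0]` in the hunk behaves the same way.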