1
0
mirror of https://github.com/ilri/csv-metadata-quality.git synced 2025-05-08 06:06:00 +02:00

Always fix invalid multi-value separators
All checks were successful
continuous-integration/drone/push Build is passing

This is no longer class-ified as "unsafe" as I have yet to see a
case where this was intentional, and it always causes issues when
you import the data in a DSpace repository.
This commit is contained in:
2021-03-13 12:59:45 +02:00
parent f00a07e2cd
commit 1008acf35e
3 changed files with 14 additions and 10 deletions

View File

@ -111,10 +111,9 @@ def run(argv):
df[column] = df[column].apply(check.suspicious_characters, field_name=column)
# Fix: invalid and unnecessary multi-value separators
if args.unsafe_fixes:
df[column] = df[column].apply(fix.separators, field_name=column)
# Run whitespace fix again after fixing invalid separators
df[column] = df[column].apply(fix.whitespace, field_name=column)
df[column] = df[column].apply(fix.separators, field_name=column)
# Run whitespace fix again after fixing invalid separators
df[column] = df[column].apply(fix.whitespace, field_name=column)
# Fix: duplicate metadata values
df[column] = df[column].apply(fix.duplicates, field_name=column)