Mirror of https://github.com/ilri/csv-metadata-quality.git (synced 2025-05-09 14:46:00 +02:00)
Add support for fixing "unnecessary" Unicode
These are characters such as non-breaking spaces and "replacement" characters, which add nothing to the metadata and often cause errors when it is parsed or displayed in a UI.
@@ -25,6 +25,9 @@ def main(argv):
 # Fix: whitespace
 df[column] = df[column].apply(fix.whitespace)
 
+# Fix: unnecessary Unicode
+df[column] = df[column].apply(fix.unnecessary_unicode)
+
 # Check: invalid multi-value separator
 df[column] = df[column].apply(check.separators)
 
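The hunk above only shows the call site, not the body of `fix.unnecessary_unicode`. As a rough illustration of the kind of cleanup the commit message describes, here is a minimal, hypothetical sketch. It is not the project's actual implementation: the function body, the `_REMOVE` and `_NBSP` names, and the choice to treat zero-width spaces as "unnecessary" are all assumptions made for this example.

```python
import re

# Hypothetical sketch only; NOT the project's actual fix.unnecessary_unicode.
# Characters dropped outright: zero-width space (U+200B, an assumption) and
# the Unicode replacement character (U+FFFD, named in the commit message).
_REMOVE = re.compile("[\u200b\ufffd]")

# Non-breaking space (U+00A0), normalized to an ordinary space.
_NBSP = "\u00a0"


def unnecessary_unicode(field):
    """Return field with unnecessary Unicode removed or normalized."""
    # Leave missing or non-string values (None, NaN) untouched.
    if not isinstance(field, str):
        return field

    # Drop characters that carry no information.
    field = _REMOVE.sub("", field)

    # Replace non-breaking spaces with regular spaces.
    return field.replace(_NBSP, " ")
```

A function of this shape could be applied per column in the same way as the other fixes in the hunk, for example `df[column].apply(unnecessary_unicode)`. Returning non-string values unchanged keeps missing cells from raising errors inside `apply`.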