1
0
mirror of https://github.com/ilri/csv-metadata-quality.git synced 2024-06-29 09:33:46 +02:00
Commit Graph

204 Commits

Author SHA1 Message Date
232d28e13e
Refactor as package with subpackages
This makes it cleaner for introducing checks, fixes, tests, docs,
and tests in the future. Currently can be run like this:

  python -m csv_metadata_quality

CSV input and output paths are still hard coded.

See: https://dev.to/codemouse92/dead-simple-python-project-structure-and-imports-38c6
2019-07-26 22:11:10 +03:00
ef5b8f7244
fix.py: Massive improvements
Use Python's str.strip() instead of kludgy regular expressions and
use split/join to handle multi-value fields more cleanly.
2019-07-26 19:31:55 +03:00
801870e0ba
Add fix.py
Initial working version of metadata cleaning script that fixes lea-
ding and trailing whitespace (even in DSpace multi-value fields).
2019-07-26 19:08:28 +03:00
21b78b9519
Initial commit
Pipenv environment with Pandas.
2019-07-26 17:54:13 +03:00