
5 Commits

Author SHA1 Message Date
Alan Orth 5c6453b397
Add GPLv3 license 2019-07-26 22:16:16 +03:00
Alan Orth 232d28e13e
Refactor as package with subpackages
This makes it cleaner to introduce checks, fixes, tests, and docs
in the future. It can currently be run like this:

  python -m csv_metadata_quality

CSV input and output paths are still hard-coded.

See: https://dev.to/codemouse92/dead-simple-python-project-structure-and-imports-38c6
2019-07-26 22:11:10 +03:00
Alan Orth ef5b8f7244
fix.py: Massive improvements
Use Python's str.strip() instead of kludgy regular expressions and
use split/join to handle multi-value fields more cleanly.
2019-07-26 19:31:55 +03:00
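
The strip and split/join approach described in this commit message might look roughly like the sketch below. It is not the contents of fix.py: the function name fix_whitespace is hypothetical, and the "||" separator is an assumption based on the default multi-value separator in DSpace CSV exports.

```python
def fix_whitespace(field, separator="||"):
    """Strip leading and trailing whitespace from each value in a field.

    `separator` is assumed to be "||"; the commit message does not say
    which separator fix.py actually uses.
    """
    # Split the field into its individual values, strip each one,
    # then join them back together with the same separator.
    values = [value.strip() for value in field.split(separator)]
    return separator.join(values)


print(fix_whitespace("  Orth, Alan || Doe, Jane  "))  # prints "Orth, Alan||Doe, Jane"
```

A single-value field takes the same path, since splitting a string that contains no separator yields a one-element list.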
Alan Orth 801870e0ba
Add fix.py
Initial working version of metadata cleaning script that fixes
leading and trailing whitespace (even in DSpace multi-value fields).
2019-07-26 19:08:28 +03:00
Alan Orth 21b78b9519
Initial commit
Pipenv environment with Pandas.
2019-07-26 17:54:13 +03:00