|
7fb7f7e03c
|
Add requirements.txt
Generated using pipenv:
$ pipenv lock -r > requirements.txt
|
2019-07-26 23:54:07 +03:00 |
|
|
df1087b26f
|
README.md: Improve introduction, checks, and todo
|
2019-07-26 23:50:41 +03:00 |
|
|
84c3b17678
|
csv_metadata_quality/app.py: Add comment
|
2019-07-26 23:49:13 +03:00 |
|
|
844b968098
|
tests/test.csv: Add invalid multi-value field separator
|
2019-07-26 23:48:45 +03:00 |
|
|
aaf3537ba4
|
Add check for invalid multi-value separators
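A minimal sketch of what such a check could look like, not the code from this commit. It assumes the valid DSpace multi-value separator is "||" and treats any pipe left over after splitting on it as invalid; the function name has_invalid_separator is hypothetical.

```python
def has_invalid_separator(value, separator="||"):
    """Return True if a field contains a pipe that is not part of a valid separator.

    Assumption: "||" is the only valid multi-value separator, so any "|"
    remaining in a part after splitting on it (e.g. "a|b" or "a|||b") is invalid.
    """
    return any("|" in part for part in value.split(separator))


if __name__ == "__main__":
    for field in ("Kenya||Uganda", "Kenya|Uganda", "Kenya|||Uganda"):
        print(repr(field), has_invalid_separator(field))
```

This sketch does not flag empty values produced by doubled separators such as "a||||b"; that would need a separate check.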
|
2019-07-26 23:48:24 +03:00 |
|
|
02f9d8a736
|
csv_metadata_quality/check.py: Add check for missing isbn values
|
2019-07-26 23:45:18 +03:00 |
|
|
64e7a73417
|
README.md: Add information about checks and fixes
|
2019-07-26 23:20:16 +03:00 |
|
|
dfd961d720
|
Bring test.csv into project
|
2019-07-26 23:14:37 +03:00 |
|
|
e160b17fb0
|
Add ISSN and ISBN checks using python-stdnum
|
2019-07-26 23:14:10 +03:00 |
|
|
30a4b0005f
|
csv_metadata_quality/fix.py: Remove test function
|
2019-07-26 22:56:40 +03:00 |
|
|
b657c51fd2
|
Add initial README.md with intro, license, and todo
|
2019-07-26 22:18:38 +03:00 |
|
|
5c6453b397
|
Add GPLv3 license
|
2019-07-26 22:16:16 +03:00 |
|
|
232d28e13e
|
Refactor as package with subpackages
This makes it cleaner to introduce checks, fixes, docs, and tests
in the future. It can currently be run like this:
python -m csv_metadata_quality
CSV input and output paths are still hard-coded.
See: https://dev.to/codemouse92/dead-simple-python-project-structure-and-imports-38c6
|
2019-07-26 22:11:10 +03:00 |
|
|
ef5b8f7244
|
fix.py: Massive improvements
Use Python's str.strip() instead of kludgy regular expressions and
use split/join to handle multi-value fields more cleanly.
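The split/join approach described above might look roughly like the following. This is an illustrative sketch rather than the contents of fix.py: the function name strip_whitespace and the "||" separator (the usual DSpace multi-value delimiter) are assumptions.

```python
def strip_whitespace(value, separator="||"):
    """Strip leading and trailing whitespace from each value in a field.

    Assumption: multi-value fields are joined with "||". Splitting on the
    separator, stripping each part with str.strip(), and joining again
    avoids the need for regular expressions.
    """
    parts = value.split(separator)
    return separator.join(part.strip() for part in parts)


if __name__ == "__main__":
    print(repr(strip_whitespace(" Kenya ||Uganda  ")))
```

A field without the separator goes through the same path, since split() then returns a single-element list.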
|
2019-07-26 19:31:55 +03:00 |
|
|
801870e0ba
|
Add fix.py
Initial working version of metadata cleaning script that fixes leading
and trailing whitespace (even in DSpace multi-value fields).
|
2019-07-26 19:08:28 +03:00 |
|
|
21b78b9519
|
Initial commit
Pipenv environment with Pandas.
|
2019-07-26 17:54:13 +03:00 |
|