1
0
mirror of https://github.com/ilri/csv-metadata-quality.git synced 2024-11-16 02:57:04 +01:00
Commit Graph

13 Commits

Author SHA1 Message Date
196bb434fa
Add date validation
I'm only concerned with validating issue dates here. In DSpace they
are generally always YYYY, YYY-MM, or YYYY-MM-DD (though in theory
they could be any valid ISO8601 format).

This also checks for cases where the date is missing and where the
metadata has specified multiple dates like "1990||1991", as this is
valid, but there is no practical value for it in our system.
2019-07-28 16:11:36 +03:00
e2bb2d4df9
Main function should be "main()" 2019-07-27 23:09:16 +03:00
c47c064a13
Make output less debuggy 2019-07-27 09:21:13 +03:00
2b41f9416b
csv_metadata_quality/fix.py: Remove extra newline 2019-07-27 01:29:22 +03:00
3cf9f9452b
csv_metadata_quality/check.py: Always return field
We always need to return the field back so apply doesn't set it to
null when creating the new data frame.
2019-07-27 01:28:08 +03:00
18f26c343d
csv_metadata_quality/app.py: Fix path to test.csv 2019-07-27 00:25:30 +03:00
84c3b17678
csv_metadata_quality/app.py: Add comment 2019-07-26 23:49:13 +03:00
aaf3537ba4
Add check for invalid multi-value separators 2019-07-26 23:48:24 +03:00
02f9d8a736
csv_metadata_quality/check.py: Add check for missing isbn values 2019-07-26 23:45:18 +03:00
dfd961d720
Bring test.csv into project 2019-07-26 23:14:37 +03:00
e160b17fb0
Add ISSN and ISBN checks using python-stdnum 2019-07-26 23:14:10 +03:00
30a4b0005f
csv_metadata_quality/fix.py: Remove test function 2019-07-26 22:56:40 +03:00
232d28e13e
Refactor as package with subpackages
This makes it cleaner for introducing checks, fixes, tests, docs,
and tests in the future. Currently can be run like this:

  python -m csv_metadata_quality

CSV input and output paths are still hard coded.

See: https://dev.to/codemouse92/dead-simple-python-project-structure-and-imports-38c6
2019-07-26 22:11:10 +03:00