Mirror of https://github.com/ilri/csv-metadata-quality.git (synced 2025-07-02 04:27:24 +02:00)
Compare commits: v0.4.3...dbbbc0944a (4 commits)
| Author | SHA1 | Date | |
|---|---|---|---|
| | dbbbc0944a | | |
| | d17bf3033c | | |
| | 2ec52f1b73 | | |
| | aa1abf15a7 | | |
@@ -1,8 +1,12 @@
-# CSV Metadata Quality  [](https://ci.mjanja.ch/alanorth/csv-metadata-quality)
-A simple, but opinionated metadata quality checker and fixer designed to work with CSVs in the DSpace ecosystem (though it could theoretically work on any CSV that uses Dublin Core fields as columns). The implementation is essentially a pipeline of checks and fixes that begins with splitting multi-value fields on the standard DSpace "||" separator, trimming leading/trailing whitespace, and then proceeding to more specialized cases like ISSNs, ISBNs, languages, etc.
+# DSpace CSV Metadata Quality Checker  [](https://ci.mjanja.ch/alanorth/csv-metadata-quality)
+A simple, but opinionated metadata quality checker and fixer designed to work with CSVs in the DSpace ecosystem (though it could theoretically work on any CSV that uses Dublin Core fields as columns). The implementation is essentially a pipeline of checks and fixes that begins with splitting multi-value fields on the standard DSpace "||" separator, trimming leading/trailing whitespace, and then proceeding to more specialized cases like ISSNs, ISBNs, languages, unnecessary Unicode, AGROVOC terms, etc.

 Requires Python 3.7 or greater (3.8 recommended). CSV and Excel support comes from the [Pandas](https://pandas.pydata.org/) library, though your mileage may vary with Excel because this is much less tested.

+If you use the DSpace CSV metadata quality checker please cite:
+
+*Orth, A. 2019. DSpace CSV metadata quality checker. Nairobi, Kenya: ILRI. https://hdl.handle.net/10568/110997.*
+
 ## Functionality

 - Validate dates, ISSNs, ISBNs, and multi-value separators ("||")
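The pipeline the README describes (split multi-value fields on "||", trim whitespace, then run specialized checks such as ISSN validation) could be sketched roughly as follows. This is only an illustration using the standard library, not the project's actual code: the function names `split_values`, `is_valid_issn`, and `check_issn_field` are hypothetical, and the check uses the standard ISSN check-digit rule (weights 8 down to 2, modulo 11).

```python
import re
from typing import List, Tuple

SEPARATOR = "||"  # standard DSpace multi-value separator
ISSN_PATTERN = re.compile(r"^\d{4}-\d{3}[\dX]$")


def split_values(field: str) -> List[str]:
    """Split a multi-value field, trim each value, and drop empty values.

    Hypothetical helper: empty values typically come from stray or
    doubled separators.
    """
    return [value.strip() for value in field.split(SEPARATOR) if value.strip()]


def is_valid_issn(issn: str) -> bool:
    """Return True if the ISSN has the NNNN-NNNC shape and a correct check digit."""
    if not ISSN_PATTERN.match(issn):
        return False
    digits = issn.replace("-", "")
    # Weight the first seven digits 8, 7, ..., 2 and sum them.
    total = sum(int(d) * w for d, w in zip(digits[:7], range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return digits[7] == expected


def check_issn_field(field: str) -> List[Tuple[str, bool]]:
    """Run the split, trim, and validate steps over one CSV cell (hypothetical)."""
    return [(value, is_valid_issn(value)) for value in split_values(field)]
```

For instance, `check_issn_field("0378-5955|| 2049-3630 ")` would report both trimmed values as valid, while a value with a wrong final digit would be flagged.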