mirror of https://github.com/ilri/csv-metadata-quality.git
synced 2024-11-25 15:18:19 +01:00
Compare commits
No commits in common. "d3880a9dfa6b605c64f4d84b69140493ca56cf98" and "a6709c7f826b59713978a5749440c562864cad86" have entirely different histories.
d3880a9dfa...a6709c7f82
16
.drone.yml
@@ -46,4 +46,20 @@ steps:
   - python setup.py install
   - csv-metadata-quality -i data/test.csv -o /tmp/test.csv -e -u --agrovoc-fields dc.subject,cg.coverage.country
 
+---
+kind: pipeline
+type: docker
+name: python36
+
+steps:
+- name: test
+  image: python:3.6-slim
+  commands:
+  - id
+  - python -V
+  - pip install -r requirements-dev.txt
+  - pytest
+  - python setup.py install
+  - csv-metadata-quality -i data/test.csv -o /tmp/test.csv -e -u --agrovoc-fields dc.subject,cg.coverage.country
+
 # vim: ts=2 sw=2 et
@@ -7,7 +7,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## Unreleased
 ### Changed
 - Reformat with black
-- Requires Python 3.7+ for pandas 1.2.0
 
 ### Updated
 - Run `poetry update`
@@ -1,7 +1,7 @@
 # CSV Metadata Quality ![GitHub Actions](https://github.com/ilri/csv-metadata-quality/workflows/Build%20and%20Test/badge.svg) [![builds.sr.ht status](https://builds.sr.ht/~alanorth/csv-metadata-quality.svg)](https://builds.sr.ht/~alanorth/csv-metadata-quality?)
 A simple, but opinionated metadata quality checker and fixer designed to work with CSVs in the DSpace ecosystem (though it could theoretically work on any CSV that uses Dublin Core fields as columns). The implementation is essentially a pipeline of checks and fixes that begins with splitting multi-value fields on the standard DSpace "||" separator, trimming leading/trailing whitespace, and then proceeding to more specialized cases like ISSNs, ISBNs, languages, etc.
 
-Requires Python 3.7 or greater (3.8 recommended). CSV and Excel support comes from the [Pandas](https://pandas.pydata.org/) library, though your mileage may vary with Excel because this is much less tested.
+Requires Python 3.6 or greater (3.8 recommended). CSV and Excel support comes from the [Pandas](https://pandas.pydata.org/) library, though your mileage may vary with Excel because this is much less tested.
 
 ## Functionality
 
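The README text in the hunk above describes the first stage of the pipeline: split multi-value fields on "||", then trim whitespace. A minimal sketch of that stage follows. The function names `split_and_trim` and `fix_field` are hypothetical and are not taken from the project, and dropping empty values before re-joining is an assumption rather than documented behaviour.

```python
from typing import List

# Standard DSpace multi-value separator, as stated in the README.
SEPARATOR = "||"


def split_and_trim(field: str) -> List[str]:
    """Split a multi-value field on "||" and strip each value."""
    return [value.strip() for value in field.split(SEPARATOR)]


def fix_field(field: str) -> str:
    """Re-join the trimmed values (hypothetical helper).

    Assumption: empty values left by stray separators are dropped.
    """
    values = [value for value in split_and_trim(field) if value]
    return SEPARATOR.join(values)
```

For example, `fix_field("Kenya|| Uganda ||")` would yield a field with two clean values joined by a single separator.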
@@ -69,9 +69,7 @@ def test_check_unnecessary_separators(capsys):
     check.separators(field, field_name)
 
     captured = capsys.readouterr()
-    assert (
-        captured.out == f"Unnecessary multi-value separator ({field_name}): {field}\n"
-    )
+    assert captured.out == f"Unnecessary multi-value separator ({field_name}): {field}\n"
 
 
 def test_check_valid_separators():
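The test in the hunk above only pins down the message that the check prints. As a rough illustration of what such a check might look like, here is a sketch that treats any empty value between "||" separators as unnecessary. That rule is an assumption, and `separator_warning` and `check_separators` are hypothetical names, not the project's `check.separators`.

```python
from typing import Optional


def separator_warning(field: str, field_name: str) -> Optional[str]:
    """Return the warning text if the field has an empty value, else None.

    Assumption: a separator is "unnecessary" when splitting on "||"
    produces at least one empty string.
    """
    if any(value == "" for value in field.split("||")):
        return f"Unnecessary multi-value separator ({field_name}): {field}"
    return None


def check_separators(field: str, field_name: str) -> None:
    """Print the warning, using the message format asserted in the test."""
    message = separator_warning(field, field_name)
    if message is not None:
        print(message)
```

Keeping the message construction separate from the `print` call makes the logic testable without capturing stdout, whereas the project's own test uses pytest's `capsys` fixture.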