mirror of https://github.com/ilri/csv-metadata-quality.git synced 2024-11-25 23:28:18 +01:00

Compare commits

No commits in common. "27b2d81ca867a8dfe7ac368b5bd7799dbeab8608" and "d76e72532a3e2c07480cba1ce72b3f3588a1cf3f" have entirely different histories.

3 changed files with 3 additions and 6 deletions

@@ -4,11 +4,6 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
-## Unreleased changes
-### Added
-- Check dates in dcterms.issued field as well, not just fields that have the
-word "date" in them
## [0.4.4] - 2021-02-21
### Added
- Accept dates formatted in ISO 8601 extended with combined date and time, for

@@ -109,6 +109,8 @@ This currently uses the [Python langid](https://github.com/saffsd/langid.py) lib
- Add an option to drop invalid AGROVOC subjects?
- Add tests for application invocation, ie `tests/test_app.py`?
- Validate ISSNs or journal titles against CrossRef API?
+- Better ISO 8601 date parsing (currently only supports simple dates, perhaps we need to use dateutil.parser.parseiso())
+- Fix lazy date check (assumes field name has "date" but could be dcterms.issued etc!)
## License
This work is licensed under the [GPLv3](https://www.gnu.org/licenses/gpl-3.0.en.html).
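
The README hunk above lists better ISO 8601 date parsing as a TODO. As a rough illustration of what accepting more than one date shape could look like, here is a minimal sketch using only the standard library. The `FORMATS` tuple and the `parse_date` helper are hypothetical names chosen for this example, not the project's code. Note also that the ISO parser in python-dateutil is `dateutil.parser.isoparse()`, not `parseiso()` as the TODO spells it; the sketch avoids that dependency so it runs as-is.

```python
from datetime import datetime

# Formats to try, most specific first. This list is an assumption made for
# illustration; it is not taken from the project's check.date function.
FORMATS = ("%Y-%m-%dT%H:%M:%SZ", "%Y-%m-%d", "%Y-%m", "%Y")


def parse_date(value):
    """Return a datetime for the first format that matches, or None."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            # This format did not fit; try the next one.
            continue
    return None


if __name__ == "__main__":
    for sample in ("2021-02-21", "2021-02-21T10:30:00Z", "2021", "21 Feb 2021"):
        print(sample, "->", parse_date(sample))
```

A fixed list of formats keeps the accepted inputs explicit, at the cost of rejecting valid ISO 8601 variants (time zone offsets, week dates) that a dedicated parser would handle.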

@@ -142,7 +142,7 @@ def run(argv):
df[column] = df[column].apply(check.isbn)
# Check: invalid date
-match = re.match(r"^.*?(date|dcterms\.issued).*$", column)
+match = re.match(r"^.*?date.*$", column)
if match is not None:
df[column] = df[column].apply(check.date, field_name=column)
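
To make the effect of this hunk concrete, the following sketch runs both patterns from the two `re.match` lines against a few column names. The patterns are copied from the hunk; the column names and the `matches` helper are assumptions made up for illustration and are not part of the project.

```python
import re

# The two patterns that appear in the hunk above.
WITH_ISSUED = re.compile(r"^.*?(date|dcterms\.issued).*$")
DATE_ONLY = re.compile(r"^.*?date.*$")

# Hypothetical column names, chosen only for illustration.
COLUMNS = ["dc.date.issued", "dcterms.issued", "dcterms.title"]


def matches(pattern, column):
    """Return True if the date check would be applied to this column."""
    return pattern.match(column) is not None


if __name__ == "__main__":
    for column in COLUMNS:
        print(column, matches(WITH_ISSUED, column), matches(DATE_ONLY, column))
```

Only the pattern with the `dcterms\.issued` alternative selects a column named `dcterms.issued`; the pattern that looks for "date" alone skips it, which is the lazy check described in the README TODO.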