1
0
mirror of https://github.com/ilri/csv-metadata-quality.git synced 2025-05-09 14:46:00 +02:00

Add support for validating languages

Will validate against ISO 639-2 or ISO 639-3 depending on how long
the language field is. Otherwise will return that the language is
invalid.

Does not currently have any support for generic values like "Other".
This commit is contained in:
2019-07-29 18:59:42 +03:00
parent 1978fa7b48
commit a36454a3ac
5 changed files with 117 additions and 15 deletions

View File

@ -8,6 +8,7 @@ Requires Python 3.6 or greater. CSV and Excel support comes from the [Pandas](ht
- Read/write CSV files
- Read Excel files
- Validate dates, ISSNs, ISBNs, and multi-value separators ("||")
- Validate languages against ISO 639-2 and ISO 639-3
- Fix leading, trailing, and excessive whitespace
- Fix invalid multi-value separators (`|`) using `--unsafe-fixes`
- Remove unnecessary Unicode like [non-breaking spaces](https://en.wikipedia.org/wiki/Non-breaking_space), [replacement characters](https://en.wikipedia.org/wiki/Specials_(Unicode_block)#Replacement_character), etc