mirror of https://github.com/ilri/csv-metadata-quality.git synced 2024-06-26 08:03:46 +02:00

README.md: Try to simplify list of functionality

Alan Orth 2019-07-29 18:25:38 +03:00
parent 0eb852a65b
commit e49b4e8f22
Signed by: alanorth
GPG Key ID: 0FB860CC9C45B1B9

@@ -5,14 +5,14 @@ Requires Python 3.6 or greater. CSV and Excel support comes from the [Pandas](ht
 ## Functionality
-- Read/write CSV files
-- Read Excel files
-- Validate dates, ISSNs, ISBNs, and multi-value separators ("||")
-- Fix leading, trailing, and excessive whitespace
-- Fix invalid multi-value separators ("|") using `--unsafe-fixes`
-- Remove unnecessary Unicode like [non-breaking spaces](https://en.wikipedia.org/wiki/Non-breaking_space), [replacement characters](https://en.wikipedia.org/wiki/Specials_(Unicode_block)#Replacement_character), etc
-- Check for "suspicious" characters that could indicate encoding or copy/paste issues, for example "foreˆt" should be "forêt"
-- Remove duplicate metadata values
+- Read/write CSV files
+- Read Excel files
+- Validate dates, ISSNs, ISBNs, and multi-value separators ("||")
+- Fix leading, trailing, and excessive whitespace
+- Fix invalid multi-value separators (`|`) using `--unsafe-fixes`
+- Remove unnecessary Unicode like [non-breaking spaces](https://en.wikipedia.org/wiki/Non-breaking_space), [replacement characters](https://en.wikipedia.org/wiki/Specials_(Unicode_block)#Replacement_character), etc
+- Check for "suspicious" characters that indicate encoding or copy/paste issues, for example "foreˆt" should be "forêt"
+- Remove duplicate metadata values
 ## Installation
 The easiest way to install CSV Metadata Quality is with [pipenv](https://github.com/pypa/pipenv):