mirror of https://github.com/ilri/csv-metadata-quality.git synced 2025-05-09 22:56:01 +02:00

Use py3langid instead of langid

Faster and more modern code for Python 3 as a drop-in replacement.

See: https://adrien.barbaresi.eu/blog/language-detection-langid-py-faster.html
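
A minimal sketch of what "drop-in replacement" could look like in calling code, assuming py3langid exposes the same module-level `classify()` function as langid. The `detect_language` helper and the import fallback are hypothetical illustrations, not code from this commit:

```python
# Assumption: py3langid mirrors langid's module-level classify() API,
# so it can be imported under the old name without touching call sites.
try:
    import py3langid as langid  # faster Python 3 port
except ImportError:
    try:
        import langid  # original package, used only as a fallback
    except ImportError:
        langid = None  # neither package is installed


def detect_language(text):
    """Hypothetical helper: return (language_code, score), or None if
    no detector is installed or the text is blank."""
    if langid is None or not text.strip():
        return None
    lang, score = langid.classify(text)
    return lang, score
```

Because the module is aliased to `langid`, existing calls such as `langid.classify(...)` would not need to change under this assumption.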
2023-12-28 14:11:21 +03:00
parent fb341dd9fa
commit a21ffb0fa8
3 changed files with 3 additions and 2 deletions


@@ -15,6 +15,7 @@ fields
### Changed
- Don't run newline fix on description fields
- Install requests-cache in main run() function instead of check.agrovoc() function so we only incur the overhead once
- Use py3langid instead of langid, see: [How to make language detection with langid.py faster](https://adrien.barbaresi.eu/blog/language-detection-langid-py-faster.html)
### Updated
- Python dependencies, including Pandas 2.0.0 and [Arrow-backed dtypes](https://datapythonista.me/blog/pandas-20-and-the-arrow-revolution-part-i)