Python's built-in unicodedata module includes the is_normalized()
function starting with Python 3.8. This utility function lets us do
the same check on earlier Python versions.
See: https://docs.python.org/3/library/unicodedata.html
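A minimal sketch of such a fallback, assuming NFC is the normalization
form we care about and that the helper name is_nfc() is our own (both
are assumptions here): normalize the string and compare it with the
original.

    import unicodedata

    def is_nfc(field):
        # Emulate unicodedata.is_normalized("NFC", field) on Python < 3.8
        # by normalizing the string and comparing it with the original.
        return unicodedata.normalize("NFC", field) == field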
The Unicode normalization feature requires Python 3.8 because the
unicodedata.is_normalized() function first appears there. If I find
another way to check whether a string is normalized without normalizing
it first, I will drop the requirement back down to Python 3.6.
See: https://docs.python.org/3/library/unicodedata.html
Black only releases pre-release versions, which causes issues with
pipenv. Instead of always running pipenv with "--pre" and potentially
letting in some other pre-release versions for other dependencies,
I would rather specify the latest black version explicitly.
See: https://github.com/psf/black/issues/517
See: https://github.com/microsoft/vscode-python/issues/5171
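One way to do that, assuming pipenv resolves an exact pre-release pin
without "--pre" (behaviour may vary by pipenv version), is to pin black
in the Pipfile; the version below is only an example, not necessarily
the latest release:

    [dev-packages]
    # Pin an exact pre-release so other dependencies stay on stable releases
    # (the version shown here is an example, not necessarily the latest)
    black = "==19.10b0"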
The Pipfile knows it was created with Python 3.8, yet we're running
with multiple Python versions on Travis. I'm curious whether it would
work better to use pip to install dependencies instead of pipenv in
this case.
Works decently well assuming the title, abstract, and citation fields
are an accurate representation of the language as identified by the
language field. Handles ISO 639-1 (alpha-2) and ISO 639-3 (alpha-3)
values seamlessly.
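A rough sketch of that flow, assuming langid is the detection library
and that the field values are plain strings (both are assumptions; the
real code may differ):

    import langid
    import pycountry

    def language_matches(declared, title, abstract, citation):
        # Resolve the declared code whether it is ISO 639-1 (two letters)
        # or ISO 639-3 (three letters).
        if len(declared) == 2:
            language = pycountry.languages.get(alpha_2=declared)
        else:
            language = pycountry.languages.get(alpha_3=declared)

        if language is None:
            return False

        # Detect the language of the combined text fields; langid returns
        # an ISO 639-1 code and a confidence score.
        detected, _confidence = langid.classify(" ".join([title, abstract, citation]))

        # Compare against the declared language's two-letter code, if it has one.
        return detected == getattr(language, "alpha_2", None)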
This includes an updated pipenv environment, test data, pytest tests
for both correct and incorrect ISO 639-1 and ISO 639-3 languages,
and a new command line option "-e".
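As a sketch of what such tests might look like (the helper below is
hypothetical and stands in for the real check; the invalid codes are
ones I believe to be unassigned):

    import pycountry
    import pytest

    def is_valid_language(code):
        # Hypothetical helper standing in for the real check: accept either
        # an ISO 639-1 (alpha-2) or ISO 639-3 (alpha-3) code.
        if len(code) == 2:
            return pycountry.languages.get(alpha_2=code) is not None
        return pycountry.languages.get(alpha_3=code) is not None

    @pytest.mark.parametrize("code", ["en", "eng", "es", "spa"])
    def test_correct_language_code(code):
        assert is_valid_language(code)

    @pytest.mark.parametrize("code", ["zz", "zzz"])
    def test_incorrect_language_code(code):
        assert not is_valid_language(code)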
ISO 639-1 uses two-letter codes and ISO 639-3 uses three-letter codes.
Technically there are also ISO 639-2/T and ISO 639-2/B, which use
three-letter codes as well, but those are not supported by the
pycountry library so I won't even worry about them.
See: https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes
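For illustration, a couple of pycountry lookups showing how both code
lengths resolve to the same language record (German is just an example
here):

    import pycountry

    # ISO 639-1 and ISO 639-3 codes resolve to the same record
    pycountry.languages.get(alpha_2="de")   # Language(alpha_2='de', alpha_3='deu', name='German', ...)
    pycountry.languages.get(alpha_3="deu")  # the same German record

    # Unassigned codes simply return None
    pycountry.languages.get(alpha_2="zz")   # None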