mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Update notes for 2020-07-06
@ -261,4 +261,11 @@ $ csvcut -c1 /tmp/2020-07-05-subjects-upper.csv | head -n 6500 > 2020-07-05-cgsp
$ ./agrovoc-lookup.py -i 2020-07-05-cgspace-subjects.txt -om 2020-07-05-cgspace-subjects-matched.txt -or 2020-07-05-cgspace-subjects-rejected.txt -d
```
## 2020-07-06
- I made some optimizations to the suite of Python utility scripts in our DSpace directory as well as the [csv-metadata-quality](https://github.com/ilri/csv-metadata-quality) script
  - Mostly to make more efficient use of the requests cache and to use parameterized requests instead of building the request URL by concatenating the base URL with query parameters
- I modified the `agrovoc-lookup.py` script to save its results as a single CSV with the subject, language, type of match (preferred or alternate), and total number of matches, rather than saving two separate files
- Note that I see `prefLabel`, `matchedPrefLabel`, and `altLabel` in the REST API responses and I'm not sure what the second one means
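
For reference, a minimal sketch of the two changes described above: passing query parameters to `requests` instead of concatenating them onto the URL, and writing one CSV instead of separate matched and rejected files. This is not the actual code from `agrovoc-lookup.py`. The endpoint URL is a placeholder, the `query` and `lang` parameter names and the CSV column names are assumptions, and the sample row is made up.

```python
import csv
import io

import requests

# Placeholder endpoint (assumption): substitute the real AGROVOC REST search URL.
SEARCH_URL = "https://agrovoc.example.org/rest/v1/search"

# Assumed column names, following the fields listed in the notes above.
FIELDNAMES = ["subject", "language", "match type", "number of matches"]


def prepare_lookup(subject, language="en"):
    """Build a GET request and let requests encode the query parameters."""
    params = {"query": subject, "lang": language}
    # prepare() builds the final URL without sending anything over the network.
    return requests.Request("GET", SEARCH_URL, params=params).prepare()


def write_results(rows, handle):
    """Write all lookup results to one CSV, header row first."""
    writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


prepared = prepare_lookup("SOIL FERTILITY")
print(prepared.url)

# Made-up sample row, written to an in-memory buffer instead of a file.
buffer = io.StringIO()
write_results(
    [
        {
            "subject": "SOIL FERTILITY",
            "language": "en",
            "match type": "preferred",
            "number of matches": 1,
        }
    ],
    buffer,
)
print(buffer.getvalue())
```

Letting the library encode the parameters handles spaces and special characters in subject terms, and it should produce consistently formed URLs, which helps a cache keyed on the request URL.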
<!-- vim: set sw=2 ts=2: -->