Add notes for 2023-03-09

2023-03-09 17:01:50 +03:00
parent 5787bc326c
commit bee6532af2
31 changed files with 61 additions and 36 deletions

@ -164,6 +164,12 @@ value.replace("<jats:sub>","").replace("</jats:sub>", "").replace("<jats:sup>","
```
- I uploaded the 350 items to DSpace Test so Peter and Abenet can explore them
- I exported a list of authors, affiliations, and funders from the new items to let Peter correct them:
```console
$ csvcut -c dc.contributor.author /tmp/new-items.csv | sed -e 1d -e 's/"//g' -e 's/||/\n/g' | sort | uniq -c | sort -nr | awk '{$1=""; print $0}' | sed -e 's/^ //' > /tmp/new-authors.csv
```
- Meeting with FAO AGRIS team about how to detect duplicates
- They are currently using a SHA-256 hash of the titles, which works, but only catches exact matches
- I told them to try normalizing the string, dropping stop words, etc., to increase the chance that the hashes match
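
A minimal sketch of that normalization idea in Python, not what the AGRIS team actually runs. The function names and the tiny stop word list are hypothetical, and a real list would be larger and language-aware. It strips accents, lowercases, drops punctuation and stop words, then hashes the result with SHA-256:

```python
import hashlib
import re
import unicodedata

# Hypothetical, deliberately tiny stop word list (an assumption for illustration)
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "on", "for", "to"}


def normalize_title(title: str) -> str:
    """Reduce a title to a canonical form before hashing."""
    # Decompose accented characters and drop the combining marks
    decomposed = unicodedata.normalize("NFKD", title)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Lowercase, then treat every run of non-alphanumeric characters as a separator
    # (note: this also discards non-Latin scripts, which a real tool must handle)
    words = re.sub(r"[^a-z0-9]+", " ", stripped.lower()).split()
    # Drop stop words and rejoin with single spaces
    return " ".join(w for w in words if w not in STOP_WORDS)


def title_hash(title: str) -> str:
    """SHA-256 hex digest of the normalized title."""
    return hashlib.sha256(normalize_title(title).encode("utf-8")).hexdigest()
```

With this, "The Impact of Climate Change on Maize Yields" and "Impact of climate change on maize yields." hash to the same value. It still only matches titles that are identical after normalization, so typos or reworded titles would need fuzzy matching instead.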
@ -172,4 +178,10 @@ value.replace("<jats:sub>","").replace("</jats:sub>", "").replace("<jats:sup>","
- I said I would prefer to write a small script for her that checks the first author and first affiliation... I could do it easily in Python, but I would need to put a web frontend on it for her
- Unless we could do that in AReS reports somehow
## 2023-03-09
- Apply a bunch of corrections to authors, affiliations, and donors on the new items on DSpace Test
- Meeting with Peter and Abenet about future OpenRXV developments, DSpace 7, etc
- I submitted an [issue on MEL asking them to add provenance metadata when submitting to CGSpace](https://github.com/CodeObia/MEL/issues/11173)
<!-- vim: set sw=2 ts=2: -->