Add notes for 2020-02-11

This commit is contained in:
2020-02-11 12:52:11 +02:00
parent 4bd85f9323
commit dc637490f1
89 changed files with 187 additions and 96 deletions

View File

@ -389,4 +389,51 @@ $ cat out.dspace510-1 | ../FlameGraph/stackcollapse-perf.pl | grep -E '^java' |
- He uploaded them here: https://cgspace.cgiar.org/handle/10568/105926
- On a whim I checked and found five duplicates there, which means Sisay didn't even check
## 2020-02-10
- Follow up with [Atmire about DSpace 6.x upgrade](https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=706)
- I raised the issue of targetting 6.4-SNAPSHOT as well as the Discovery indexing performance issues in 6.x
## 2020-02-11
- Maria from Bioversity asked me to add some ORCID iDs to our controlled vocabulary so I combined them with our existing ones and updated the names from the ORCID API:
```
$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-id.xml /tmp/bioversity-orcid-ids.txt | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq > /tmp/2020-02-11-combined-orcids.txt
$ ./resolve-orcids.py -i /tmp/2020-02-11-combined-orcids.txt -o /tmp/2020-02-11-combined-names.txt -d
# sort names, copy to cg-creator-id.xml, add XML formatting, and then format with tidy (preserving accents)
$ tidy -xml -utf8 -iq -m -w 0 dspace/config/controlled-vocabularies/cg-creator-id.xml
```
- Then I noticed some author names had changed, so I captured the old and new names in a CSV file and fixed them using `fix-metadata-values.py`:
```
$ ./fix-metadata-values.py -i 2020-02-11-correct-orcid-ids.csv -db dspace -u dspace -p 'fuuu' -f cg.creator.id -t correct -m 240 -d
```
- On a hunch I decided to try to add these ORCID iDs to existing items that might not have them yet
- I checked the database for likely matches to the author name and then created a CSV with the author names and ORCID iDs:
```
dc.contributor.author,cg.creator.id
"Staver, Charles",charles staver: 0000-0002-4532-6077
"Staver, C.",charles staver: 0000-0002-4532-6077
"Fungo, R.",Robert Fungo: 0000-0002-4264-6905
"Remans, R.",Roseline Remans: 0000-0003-3659-8529
"Remans, Roseline",Roseline Remans: 0000-0003-3659-8529
"Rietveld A.",Anne Rietveld: 0000-0002-9400-9473
"Rietveld, A.",Anne Rietveld: 0000-0002-9400-9473
"Rietveld, A.M.",Anne Rietveld: 0000-0002-9400-9473
"Rietveld, Anne M.",Anne Rietveld: 0000-0002-9400-9473
"Fongar, A.",Andrea Fongar: 0000-0003-2084-1571
"Müller, Anna",Anna Müller: 0000-0003-3120-8560
"Müller, A.",Anna Müller: 0000-0003-3120-8560
```
- Running the `add-orcid-identifiers-csv.py` script I added 144 ORCID iDs to items!
```
$ ./add-orcid-identifiers-csv.py -i /tmp/2020-02-11-add-orcid-ids.csv -db dspace -u dspace -p 'fuuu'
```
<!-- vim: set sw=2 ts=2: -->