diff --git a/content/posts/2021-11.md b/content/posts/2021-11.md
index c09401c93..ce19dfc10 100644
--- a/content/posts/2021-11.md
+++ b/content/posts/2021-11.md
@@ -249,6 +249,102 @@ Total number of hits from bots: 295492
 $ ./ilri/fix-metadata-values.py -i /tmp/authors.csv -db dspace -u dspace -p 'fuuu' -f dc.contributor.author -t 'correct' -m 3
 ```
 
-- Then I imported to CGSpace and started a full Discovery re-index
+- Then I imported to CGSpace and started a full Discovery re-index:
+
+```console
+$ time chrt -b 0 ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
+real 272m43.818s
+user 183m4.543s
+sys 2m47.988
+```
+
+## 2021-11-28
+
+- Run system updates on AReS server (linode20) and update all Docker containers and reboot
+  - Then I started a fresh harvest as I always do on Sunday
+- I am experimenting with pinning npm version 7 on OpenRXV frontend because of these Angular errors:
+
+```console
+npm WARN EBADENGINE Unsupported engine {
+npm WARN EBADENGINE package: '@angular-devkit/architect@0.901.15',
+npm WARN EBADENGINE required: { node: '>= 10.13.0', npm: '^6.11.0 || ^7.5.6', yarn: '>= 1.13.0' },
+npm WARN EBADENGINE current: { node: 'v12.22.7', npm: '8.1.3' }
+npm WARN EBADENGINE }
+```
+
+## 2021-11-29
+
+- Tezira reached out to me to say that submissions on CGSpace are taking forever
+- I see a definite increase in locks in the last few days:
+
+![PostgreSQL locks week](/cgspace-notes/2021/11/postgres_locks_ALL-week.png)
+
+- The locks are all held by dspaceWeb (XMLUI):
+
+```console
+$ psql -c "SELECT application_name FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid" | sort | uniq -c | sort -n
+ 1
+ 1 ------------------
+ 1 (1394 rows)
+ 1 application_name
+ 9 psql
+ 1385 dspaceWeb
+```
+
+- I restarted PostgreSQL and the locks dropped down:
+
+```console
+$ psql -c "SELECT application_name FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid" | sort | uniq -c | sort -n
+ 1
+ 1 ------------------
+ 1 (103 rows)
+ 1 application_name
+ 9 psql
+ 94 dspaceWeb
+```
+
+## 2021-11-30
+
+- IWMI sent me ORCID identifiers for some new staff
+  - We currently have 1332 unique identifiers, so this adds sixteen new ones:
+
+```console
+$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-identifier.xml /tmp/iwmi-orcids.txt | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq > /tmp/2021-11-30-combined-orcids.txt
+$ wc -l /tmp/2021-11-30-combined-orcids.txt
+1348 /tmp/2021-11-30-combined-orcids.txt
+```
+
+- After I combined them and removed duplicates, I resolved all the names using my `resolve-orcids.py` script:
+
+```console
+$ ./ilri/resolve-orcids.py -i /tmp/2021-11-30-combined-orcids.txt -o /tmp/2021-11-30-combined-orcids-names.txt
+```
+
+- Then I updated some ORCID identifiers that had changed in the XML:
+
+```console
+$ cat 2021-11-30-fix-orcids.csv
+cg.creator.identifier,correct
+"ADEBOWALE AKANDE: 0000-0002-6521-3272","ADEBOWALE AD AKANDE: 0000-0002-6521-3272"
+"Daniel Ortiz Gonzalo: 0000-0002-5517-1785","Daniel Ortiz-Gonzalo: 0000-0002-5517-1785"
+"FRIDAY ANETOR: 0000-0003-3137-1958","Friday Osemenshan Anetor: 0000-0003-3137-1958"
+"Sander Muilerman: 0000-0001-9103-3294","Sander Muilerman-Rodrigo: 0000-0001-9103-3294"
+$ ./ilri/fix-metadata-values.py -i 2021-11-30-fix-orcids.csv -db dspace -u dspace -p 'fuuu' -f cg.creator.identifier -t 'correct' -m 247
+```
+
+- Tag existing items from the IWMI's new authors with ORCID iDs using `add-orcid-identifiers-csv.py` (7 new metadata fields added):
+
+```console
+$ cat 2021-11-30-add-orcids.csv
+dc.contributor.author,cg.creator.identifier
+"Liaqat, U.W.","Umar Waqas Liaqat: 0000-0001-9027-5232"
+"Liaqat, Umar Waqas","Umar Waqas Liaqat: 0000-0001-9027-5232"
+"Munyaradzi, M.","Munyaradzi Junia Mutenje: 0000-0002-7829-9300"
+"Mutenje, Munyaradzi","Munyaradzi Junia Mutenje: 0000-0002-7829-9300"
+"Rex, William","William Rex: 0000-0003-4979-5257"
+"Shrestha, Shisher","Nirman Shrestha: 0000-0002-0996-8611"
+$ ./ilri/add-orcid-identifiers-csv.py -i 2021-11-30-add-orcids.csv -db dspace -u dspace -p 'fuuu'
+```
diff --git a/docs/2021-11/index.html b/docs/2021-11/index.html
index 20c84b903..29098a2b6 100644
--- a/docs/2021-11/index.html
+++ b/docs/2021-11/index.html
@@ -18,7 +18,7 @@ $ zstd statistics-2019.json
-
+
@@ -42,9 +42,9 @@ $ zstd statistics-2019.json
  "@type": "BlogPosting",
  "headline": "November, 2021",
  "url": "https://alanorth.github.io/cgspace-notes/2021-11/",
- "wordCount": "1682",
+ "wordCount": "2080",
  "datePublished": "2021-11-02T22:27:07+02:00",
- "dateModified": "2021-11-27T12:18:52+02:00",
+ "dateModified": "2021-11-27T14:37:33+02:00",
  "author": {
  "@type": "Person",
  "name": "Alan Orth"
@@ -389,9 +389,91 @@ Found 3 hits from 188.134.31.88 in statistics
$ ./ilri/fix-metadata-values.py -i /tmp/authors.csv -db dspace -u dspace -p 'fuuu' -f dc.contributor.author -t 'correct' -m 3
$ time chrt -b 0 ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
+real 272m43.818s
+user 183m4.543s
+sys 2m47.988
+
npm WARN EBADENGINE Unsupported engine {
+npm WARN EBADENGINE package: '@angular-devkit/architect@0.901.15',
+npm WARN EBADENGINE required: { node: '>= 10.13.0', npm: '^6.11.0 || ^7.5.6', yarn: '>= 1.13.0' },
+npm WARN EBADENGINE current: { node: 'v12.22.7', npm: '8.1.3' }
+npm WARN EBADENGINE }
+
$ psql -c "SELECT application_name FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid" | sort | uniq -c | sort -n
+ 1
+ 1 ------------------
+ 1 (1394 rows)
+ 1 application_name
+ 9 psql
+ 1385 dspaceWeb
+
$ psql -c "SELECT application_name FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid" | sort | uniq -c | sort -n
+ 1
+ 1 ------------------
+ 1 (103 rows)
+ 1 application_name
+ 9 psql
+ 94 dspaceWeb
+
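The `sort | uniq -c | sort -n` pipeline above also counts psql's header, separator, and row-count footer as if they were application names. A minimal Python sketch of the same tally that skips those lines, assuming the default aligned psql output format; the function name `tally_lock_holders` and the sample text are hypothetical and only illustrate the idea:

```python
import re
from collections import Counter

# Matches psql's footer, e.g. "(1394 rows)" or "(1 row)"
FOOTER = re.compile(r"^\(\d+ rows?\)$")

def tally_lock_holders(psql_output):
    """Count lock rows per application_name, ignoring psql decoration lines."""
    counts = Counter()
    for raw in psql_output.splitlines():
        name = raw.strip()
        # Skip blank lines (this also drops rows with an empty application_name),
        # the column header, the dashed rule, and the row-count footer.
        if not name or name == "application_name":
            continue
        if set(name) == {"-"} or FOOTER.match(name):
            continue
        counts[name] += 1
    return counts

# Hypothetical sample shaped like default psql output
sample = """ application_name
------------------
 dspaceWeb
 dspaceWeb
 psql
(3 rows)
"""
print(tally_lock_holders(sample).most_common())
# [('dspaceWeb', 2), ('psql', 1)]
```

Comparing the tallies before and after the PostgreSQL restart then becomes a plain dictionary comparison rather than eyeballing two listings.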
$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-identifier.xml /tmp/iwmi-orcids.txt | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort | uniq > /tmp/2021-11-30-combined-orcids.txt
+$ wc -l /tmp/2021-11-30-combined-orcids.txt
+1348 /tmp/2021-11-30-combined-orcids.txt
+
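The combine-and-deduplicate step above can be sketched in Python as well. This is only an illustration, assuming the same pattern as the `grep -oE` command (real ORCID iDs are digits with an optional final `X`, so the pattern is looser than strictly necessary); `unique_orcids` and the sample strings are hypothetical, built from identifiers that appear in these notes:

```python
import re

# Same pattern as the grep -oE command in the notes
ORCID_PATTERN = re.compile(r"[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}")

def unique_orcids(*texts):
    """Return the sorted, de-duplicated ORCID iDs found in all given texts."""
    found = set()
    for text in texts:
        found.update(ORCID_PATTERN.findall(text))
    return sorted(found)

# Hypothetical stand-ins for the controlled vocabulary and the new list
existing = "Umar Waqas Liaqat: 0000-0001-9027-5232\nWilliam Rex: 0000-0003-4979-5257\n"
incoming = "0000-0003-4979-5257\n0000-0002-7829-9300\n"

combined = unique_orcids(existing, incoming)
new_count = len(combined) - len(unique_orcids(existing))
print(combined)
# ['0000-0001-9027-5232', '0000-0002-7829-9300', '0000-0003-4979-5257']
print(new_count)
# 1
```

The difference between the combined count and the existing count is how the "sixteen new ones" figure (1348 minus 1332) falls out of the shell version.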
resolve-orcids.py script:
$ ./ilri/resolve-orcids.py -i /tmp/2021-11-30-combined-orcids.txt -o /tmp/2021-11-30-combined-orcids-names.txt
+
$ cat 2021-11-30-fix-orcids.csv
+cg.creator.identifier,correct
+"ADEBOWALE AKANDE: 0000-0002-6521-3272","ADEBOWALE AD AKANDE: 0000-0002-6521-3272"
+"Daniel Ortiz Gonzalo: 0000-0002-5517-1785","Daniel Ortiz-Gonzalo: 0000-0002-5517-1785"
+"FRIDAY ANETOR: 0000-0003-3137-1958","Friday Osemenshan Anetor: 0000-0003-3137-1958"
+"Sander Muilerman: 0000-0001-9103-3294","Sander Muilerman-Rodrigo: 0000-0001-9103-3294"
+$ ./ilri/fix-metadata-values.py -i 2021-11-30-fix-orcids.csv -db dspace -u dspace -p 'fuuu' -f cg.creator.identifier -t 'correct' -m 247
+
add-orcid-identifiers-csv.py (7 new metadata fields added):
$ cat 2021-11-30-add-orcids.csv
+dc.contributor.author,cg.creator.identifier
+"Liaqat, U.W.","Umar Waqas Liaqat: 0000-0001-9027-5232"
+"Liaqat, Umar Waqas","Umar Waqas Liaqat: 0000-0001-9027-5232"
+"Munyaradzi, M.","Munyaradzi Junia Mutenje: 0000-0002-7829-9300"
+"Mutenje, Munyaradzi","Munyaradzi Junia Mutenje: 0000-0002-7829-9300"
+"Rex, William","William Rex: 0000-0003-4979-5257"
+"Shrestha, Shisher","Nirman Shrestha: 0000-0002-0996-8611"
+$ ./ilri/add-orcid-identifiers-csv.py -i 2021-11-30-add-orcids.csv -db dspace -u dspace -p 'fuuu'
+