March, 2021

Mon Mar 01, 2021 in Notes

2021-03-01

Discuss some OpenRXV issues with Abdullah from CodeObia
- He’s trying to work on the DSpace 6+ metadata schema autoimport using the DSpace 6+ REST API
- Also, we found some issues building and running OpenRXV, currently due to an ecosystem shift in the Node.js dependencies

2021-03-02

I fixed three build and runtime issues in OpenRXV:
- fix highcharts-angular and ngx-tour-core build
- frontend/package.json: Pin @types/ramda at 0.27.34
Then I merged a few fixes that Abdullah had worked on last week

2021-03-03

I fixed another frontend build warning on OpenRXV
Then I updated the frontend container to use Node.js 12 and Ubuntu 20.04
Also, I added a GitHub Actions workflow to build the frontend
I did some testing of Abdullah’s patch for the values mapping search on OpenRXV
- It still doesn’t work with multi-word values, so I recorded a video with wf-recorder and uploaded it to the issue for him to investigate

2021-03-04

Peter has been having issues with the workflow since yesterday
- I looked at the Munin stats and see a high number of database locks since yesterday

[Munin graphs: PostgreSQL locks (week) and PostgreSQL connections (week)]

I checked PostgreSQL directly and the number of locks (joined to their connections) is definitely high again:

$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
1020

I reported it to Atmire to take a look, on the same issue where we had been tracking this before
Abenet asked me to add a new ORCID for ILRI staff member Zoe Campbell
I added it to the controlled vocabulary and then tagged her existing items on CGSpace using my add-orcid-identifiers-csv.py script:

$ cat 2021-03-04-add-zoe-campbell-orcid.csv 
dc.contributor.author,cg.creator.identifier
"Campbell, Zoë","Zoe Campbell: 0000-0002-4759-9976"
"Campbell, Zoe A.","Zoe Campbell: 0000-0002-4759-9976"
$ ./ilri/add-orcid-identifiers-csv.py -i 2021-03-04-add-zoe-campbell-orcid.csv -db dspace -u dspace -p 'fuuu'
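Before running a script like this, it can be worth sanity-checking the CSV. The following is a minimal sketch, not part of add-orcid-identifiers-csv.py: the function names are hypothetical, and it only assumes the two-column layout shown above plus the ISO 7064 MOD 11-2 check character that ORCID iDs use.

```python
import csv
import io
import re
from typing import List, Tuple

# An ORCID iD is four groups of four characters; the last may be "X"
ORCID_RE = re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]")


def orcid_checksum_ok(orcid: str) -> bool:
    """Validate the ISO 7064 MOD 11-2 check character of an ORCID iD."""
    chars = orcid.replace("-", "")
    total = 0
    for digit in chars[:-1]:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return chars[-1] == expected


def check_rows(csv_text: str) -> List[Tuple[str, str, bool]]:
    """Return (author, orcid, checksum_ok) for each row of the CSV text."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        match = ORCID_RE.search(row["cg.creator.identifier"])
        orcid = match.group(0) if match else ""
        rows.append(
            (row["dc.contributor.author"], orcid, bool(match) and orcid_checksum_ok(orcid))
        )
    return rows


# Same content as the CSV shown above
SAMPLE = '''dc.contributor.author,cg.creator.identifier
"Campbell, Zoë","Zoe Campbell: 0000-0002-4759-9976"
"Campbell, Zoe A.","Zoe Campbell: 0000-0002-4759-9976"
'''

for author, orcid, ok in check_rows(SAMPLE):
    print(author, orcid, "OK" if ok else "BAD CHECKSUM")
```

A row with a missing or mistyped identifier is reported as a bad checksum instead of being written to the database.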

I still need to do cleanup on the journal articles metadata
- Peter sent me some cleanups, but I can’t use them in the search/replace format he gave
- I think it’s better to export the metadata values with IDs and import the cleaned-up ones as CSV

localhost/dspace63= > \COPY (SELECT dspace_object_id AS id, text_value as "cg.journal" FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=251) to /tmp/2021-02-24-journals.csv WITH CSV HEADER;
COPY 32087

I used OpenRefine to remove all journal values that didn’t contain at least one of these characters: ; ( )
- Then I cloned the cg.journal field to cg.volume and cg.issue
- I used some GREL expressions like these to extract the journal name, volume, and issue:

value.partition(';')[0].trim() # to get journal names
value.partition(/[0-9]+\([0-9]+\)/)[1].replace(/^(\d+)\(\d+\)/,"$1") # to get journal volumes
value.partition(/[0-9]+\([0-9]+\)/)[1].replace(/^\d+\((\d+)\)/,"$1") # to get journal issues
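For anyone who prefers to do the same split outside OpenRefine, here is a rough Python equivalent of the GREL expressions above. It is only a sketch: the function name and the sample value are made up for illustration, and it assumes values shaped like "name; volume(issue)".

```python
import re
from typing import Dict

# Matches a "volume(issue)" fragment such as 12(3)
VOL_ISSUE_RE = re.compile(r"(\d+)\((\d+)\)")


def split_journal(value: str) -> Dict[str, str]:
    """Split a combined journal string into name, volume and issue."""
    # Everything before the first semicolon is treated as the journal name
    name = value.partition(";")[0].strip()
    # The first volume(issue) fragment found anywhere in the value
    match = VOL_ISSUE_RE.search(value)
    volume, issue = match.groups() if match else ("", "")
    return {"cg.journal": name, "cg.volume": volume, "cg.issue": issue}


# Hypothetical sample value, not taken from the real export
print(split_journal("Example Journal; 12(3): 45-67"))
```

Values without a volume(issue) fragment come back with empty volume and issue strings, which mirrors the empty result GREL's partition gives when the pattern is not found.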

Then I uploaded the changes to CGSpace using dspace metadata-import
Margarita from CCAFS was asking about an error deleting some items that were showing up in Google and should have been private
- The error was “Authorization denied for action OBSOLETE (DELETE) on BITSTREAM:bd157345-448e …”
- I searched the DSpace issue tracker and found several issues reporting this:
- The issue is apparently with non-admin users who are in the admin and submit groups of the owning collection…
- In this case the item was uploaded to the CCAFS Reports collection, and Margarita is a non-admin user who is a member of the collection’s admin and submit groups, exactly as the issue described
- I added a comment about our issue to DS-4297
Yesterday Abenet added me to a WLE collection’s approver/editor steps so we can try to figure out why Niroshini is having issues adding metadata to Udana’s submissions
- I edited Udana’s submission to CGSpace:
  - corrected the title
  - added language English
  - changed the link to the external item page instead of PDF
  - added SDGs from the external item page
  - added AGROVOC subjects from the external item page
  - added pagination (extent)
  - changed the license to “other” because CC-BY-NC-ND is not printed anywhere in the PDF or external item page

2021-03-05

I migrated the Docker bind mount for the AReS Elasticsearch container to a Docker volume:

$ docker-compose -f docker/docker-compose.yml down
$ docker volume create docker_esData_7
$ docker container create --name es_dummy -v docker_esData_7:/usr/share/elasticsearch/data:rw elasticsearch:7.6.2
$ docker cp docker/esData_7/nodes es_dummy:/usr/share/elasticsearch/data
$ docker rm es_dummy
# edit docker/docker-compose.yml to switch from bind mount to volume
$ docker-compose -f docker/docker-compose.yml up -d

The trick is that when you define a volume like “myvolume” in a docker-compose.yml file, Compose prefixes it with the project name, which defaults to the name of the directory containing the file, so here Docker creates it as “docker_myvolume”
- If you create it manually on the command line with docker volume create myvolume then the name is literally “myvolume”
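The naming rule can be sketched in a few lines of Python. This is a simplification with hypothetical function names: it ignores COMPOSE_PROJECT_NAME, the -p flag, an explicit name: on the volume, and most of Compose's project-name normalization (only lowercasing is modelled).

```python
from pathlib import PurePosixPath


def default_project_name(compose_file: str) -> str:
    """Approximate Compose's default project name: the compose file's directory name."""
    return PurePosixPath(compose_file).parent.name.lower()


def compose_volume_name(compose_file: str, volume: str) -> str:
    """Name Docker ends up with for a volume declared in the compose file."""
    return f"{default_project_name(compose_file)}_{volume}"


# The compose file lives in docker/, so esData_7 becomes docker_esData_7,
# which is why the volume was created manually under that exact name
print(compose_volume_name("docker/docker-compose.yml", "esData_7"))
```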
I still need to make the changes to git master and add these notes to the pull request so Moayad and others can benefit
Delete the openrxv-items-temp index to test a fresh harvesting:

$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'

2021-03-06

Check the results of the AReS harvesting from last night:

$ curl -s 'http://localhost:9200/openrxv-items-temp/_count?q=*&pretty'
{
  "count" : 101761,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  }
}

Set the current items index to read only and make a backup:

$ curl -X PUT "localhost:9200/openrxv-items/_settings" -H 'Content-Type: application/json' -d' {"settings": {"index.blocks.write":true}}'
$ curl -s -X POST http://localhost:9200/openrxv-items/_clone/openrxv-items-2021-03-05

Delete the current items index and clone the temp one to it:

$ curl -XDELETE 'http://localhost:9200/openrxv-items'
$ curl -X PUT "localhost:9200/openrxv-items-temp/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
$ curl -s -X POST http://localhost:9200/openrxv-items-temp/_clone/openrxv-items

Then delete the temp and backup:

$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
{"acknowledged":true}
$ curl -XDELETE 'http://localhost:9200/openrxv-items-2021-03-05'
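The whole swap (block writes, back up, delete, clone, clean up) is easy to get out of order, so here is a small sketch that only generates the sequence of requests and prints them as curl commands for review. The function names are hypothetical and nothing is sent to Elasticsearch; it assumes the same host, port and index names as the commands above.

```python
import json
from typing import List, Optional, Tuple

# One step is (HTTP method, path, optional JSON body)
Step = Tuple[str, str, Optional[dict]]


def plan_index_swap(current: str, temp: str, backup: str) -> List[Step]:
    """List the requests needed to replace `current` with `temp`, in order."""
    block_writes = {"settings": {"index.blocks.write": True}}
    return [
        ("PUT", f"/{current}/_settings", block_writes),  # make current read-only
        ("POST", f"/{current}/_clone/{backup}", None),   # back it up
        ("DELETE", f"/{current}", None),                 # drop current
        ("PUT", f"/{temp}/_settings", block_writes),     # make temp read-only
        ("POST", f"/{temp}/_clone/{current}", None),     # clone temp to current
        ("DELETE", f"/{temp}", None),                    # clean up temp
        ("DELETE", f"/{backup}", None),                  # clean up backup
    ]


def as_curl(step: Step, base: str = "http://localhost:9200") -> str:
    """Render one step as a curl command line."""
    method, path, body = step
    cmd = f"curl -s -X {method} '{base}{path}'"
    if body is not None:
        cmd += f" -H 'Content-Type: application/json' -d '{json.dumps(body)}'"
    return cmd


for step in plan_index_swap("openrxv-items", "openrxv-items-temp", "openrxv-items-2021-03-05"):
    print(as_curl(step))
```

Printing the plan instead of executing it keeps the destructive DELETE calls under manual control, which seems safer while the procedure is still being worked out.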

I made some pull requests to OpenRXV:
- docker/docker-compose.yml: Use docker volumes
- docker/docker-compose.yml: Pin Redis to version 5
I deployed the latest changes from the last few days on AReS production