CGSpace Notes

Documenting day-to-day work on the CGSpace repository.

June, 2021

2021-06-01

  • IWMI notified me that AReS was down with an HTTP 502 error
    • Looking at UptimeRobot I see it has been down for 33 hours, but I never got a notification
    • I don’t see anything in the Elasticsearch container logs, or the systemd journal on the host, but I notice that the angular_nginx container isn’t running
    • I simply started it and AReS was running again:
$ docker-compose -f docker/docker-compose.yml start angular_nginx
  • Margarita from CCAFS emailed me to say that workflow alerts haven’t been working lately
    • I guess this is related to the SMTP issues last week
    • I had fixed the config, but didn’t restart Tomcat, so DSpace never loaded the new variables
    • I ran all system updates on CGSpace (linode18) and DSpace Test (linode26) and rebooted the servers

2021-06-03

  • Meeting with AMCOW and IWMI to discuss AMCOW getting IWMI’s content into the new AMCOW Knowledge Hub
    • At first we spent some time talking about DSpace communities/collections and the REST API, but then they said they actually prefer to send queries to sites on the fly and cache them in Redis for some time
    • That’s when I thought they could perhaps use OpenSearch, but I can’t remember if it’s possible to limit by community, or only by collection…
    • Looking now, I see there is a “scope” parameter that can be used for community or collection, for example:
https://cgspace.cgiar.org/open-search/discover?query=subject:water%20scarcity&scope=10568/16814&order=DESC&rpp=100&sort_by=2&start=1
  • That will sort by date issued in descending order (see: webui.itemlist.sort-option.2 in dspace.cfg), give 100 results per page, and start on item 1
  • Another alternative would be to use the IWMI CSV that we already export every week
  • Fill out the “CGIAR-AGROVOC Task Group: Survey on the current CGIAR use of AGROVOC” survey on behalf of CGSpace
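
For a client like the AMCOW Knowledge Hub that wants to query on the fly, the scoped OpenSearch URL from the meeting notes above could be assembled roughly like this. This is only a sketch: build_opensearch_url is a hypothetical helper name of mine, not part of DSpace, and the defaults simply mirror the example URL.

```python
from urllib.parse import quote, urlencode

# Base URL taken from the example query in these notes
OPENSEARCH_BASE = "https://cgspace.cgiar.org/open-search/discover"


def build_opensearch_url(query, scope, rpp=100, sort_by=2, order="DESC", start=1):
    """Build a scoped OpenSearch URL (hypothetical helper, not a DSpace API).

    scope is a community or collection handle such as "10568/16814";
    sort_by=2 corresponds to webui.itemlist.sort-option.2 in dspace.cfg.
    """
    params = {
        "query": query,
        "scope": scope,
        "order": order,
        "rpp": rpp,
        "sort_by": sort_by,
        "start": start,
    }
    # quote_via=quote encodes spaces as %20 instead of "+";
    # safe=":/" leaves the field prefix and the handle readable
    return OPENSEARCH_BASE + "?" + urlencode(params, quote_via=quote, safe=":/")


url = build_opensearch_url("subject:water scarcity", "10568/16814")
print(url)
```

With the defaults this reproduces the example URL above, so the same helper could be reused with a different community or collection handle as the scope.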

2021-06-06

  • The Elasticsearch indexes are messed up so I dumped and re-created them correctly:
curl -XDELETE 'http://localhost:9200/openrxv-items-final'
curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
curl -XPUT 'http://localhost:9200/openrxv-items-final'
curl -XPUT 'http://localhost:9200/openrxv-items-temp'
curl -s -X POST 'http://localhost:9200/_aliases' -H 'Content-Type: application/json' -d'{"actions" : [{"add" : { "index" : "openrxv-items-final", "alias" : "openrxv-items"}}]}'
elasticdump --input=/home/aorth/openrxv-items_mapping.json --output=http://localhost:9200/openrxv-items-final --type=mapping
elasticdump --input=/home/aorth/openrxv-items_data.json --output=http://localhost:9200/openrxv-items-final --type=data --limit=1000
  • Then I started a harvest on AReS

2021-06-07

  • The harvesting on AReS completed successfully
  • Provide feedback to FAO on how we use AGROVOC for their “AGROVOC call for use cases”

2021-06-10

  • Skype with Moayad to discuss AReS harvesting improvements
    • He will work on a plugin that reads the XML sitemap to get all item IDs and checks whether we have them or not
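
To make the sitemap idea concrete, here is a rough sketch of the comparison step. It is not Moayad's plugin: the function names are hypothetical, I am assuming the item IDs are the handles in the sitemap's <loc> URLs, and the set of already-harvested handles is passed in directly rather than read from Elasticsearch.

```python
import xml.etree.ElementTree as ET

# Standard namespace used by sitemaps.org XML sitemaps
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def handles_from_sitemap(xml_text):
    """Return the set of handles found in a sitemap's <loc> URLs.

    Assumption: item URLs look like https://host/handle/<prefix>/<suffix>.
    """
    root = ET.fromstring(xml_text)
    handles = set()
    for loc in root.findall(".//sm:loc", SITEMAP_NS):
        url = (loc.text or "").strip()
        marker = "/handle/"
        if marker in url:
            handles.add(url.split(marker, 1)[1])
    return handles


def missing_handles(sitemap_handles, harvested_handles):
    """Return handles that are in the sitemap but not in the harvested set."""
    return sorted(set(sitemap_handles) - set(harvested_handles))


# Made-up sample data, for illustration only
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://cgspace.cgiar.org/handle/10568/1</loc></url>
  <url><loc>https://cgspace.cgiar.org/handle/10568/2</loc></url>
</urlset>"""

harvested = {"10568/1"}  # hypothetical: handles we already have
print(missing_handles(handles_from_sitemap(sample), harvested))
```

A real sitemap is usually split into an index plus several pages, so each page would need to be fetched and parsed in turn before doing the comparison.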