CGSpace Notes

Documenting day-to-day work on the CGSpace repository.

February, 2021

2021-02-01

  • Abenet said that CIP found more duplicate records in their export from AReS
  • I had a call with CodeObia to discuss the work on OpenRXV
  • Check the results of the AReS harvesting from last night:
$ curl -s 'http://localhost:9200/openrxv-items-temp/_count?q=*&pretty'
{
  "count" : 100875,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  }
}
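The same check can be scripted. This is a minimal sketch, assuming the response body shown above is available as a string; the `harvest_looks_complete` helper and its `min_count` threshold are hypothetical and not part of OpenRXV:

```python
import json

# Response body copied from the _count query above.
RESPONSE = """
{
  "count" : 100875,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  }
}
"""


def harvest_looks_complete(body: str, min_count: int) -> bool:
    """Return True if every shard answered and the count reaches min_count.

    min_count is a hypothetical threshold chosen by the operator.
    """
    data = json.loads(body)
    shards = data["_shards"]
    all_shards_ok = shards["failed"] == 0 and shards["successful"] == shards["total"]
    return all_shards_ok and data["count"] >= min_count


print(harvest_looks_complete(RESPONSE, min_count=100_000))
```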

January, 2021

2021-01-03

  • Peter notified me that some filters on AReS were broken again
    • It’s the same issue as before: the field names get .keyword appended to the end, which I already filed an issue on OpenRXV about last month
    • I fixed the broken filters (careful to not edit any others, lest they break too!)
  • Fix an issue with start page number for the DSpace REST API and statistics API in OpenRXV
    • The start page had been “1” in the UI, but in the backend they were doing some gymnastics to adjust to the zero-based offset/limit/page of the DSpace REST API and the statistics API
    • I adjusted it to default to 0 and added a note to the admin screen
    • I realized that this issue was actually causing the first page of 100 statistics to be missing…
    • For example, this item has 51 views on CGSpace, but 0 on AReS
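To illustrate why a start page of 1 drops the first 100 statistics, here is a minimal sketch of zero-based paging. It assumes the API turns a page number into an offset of page × limit, which is how I read the zero-based offset/limit/page note above; the function name is hypothetical and this is not OpenRXV's actual code:

```python
def page_offsets(start_page: int, limit: int, total: int) -> list[int]:
    """Return the record offsets requested when paging from start_page.

    Assumes a zero-based API where offset = page * limit.
    """
    offsets = []
    page = start_page
    while page * limit < total:
        offsets.append(page * limit)
        page += 1
    return offsets


# Starting at page 0 covers every record.
print(page_offsets(0, limit=100, total=350))  # [0, 100, 200, 300]
# Starting at page 1 never requests offset 0, so records 0-99 are skipped.
print(page_offsets(1, limit=100, total=350))  # [100, 200, 300]
```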

December, 2020

2020-12-01

  • Atmire responded about the issue with duplicate data in our Solr statistics
    • They noticed that some records in the statistics-2015 core haven’t been migrated with the AtomicStatisticsUpdateCLI tool yet and assumed that I haven’t migrated any of the records yet
    • That’s strange, as I checked all ten cores and 2015 is the only one with some unmigrated documents, according to the cua_version field
    • I started processing those (about 411,000 records):

November, 2020

2020-11-01

  • Continue with processing the statistics-2019 Solr core with the AtomicStatisticsUpdateCLI tool on DSpace Test
    • So far we’ve spent at least fifty hours processing the statistics and statistics-2019 cores… wow.

October, 2020

2020-10-06

  • Add tests for the new /items POST handlers to the DSpace 6.x branch of my dspace-statistics-api
  • I tried to test the changes Atmire sent last week, but I had to re-create my local database from a recent CGSpace dump
    • During the FlywayDB migration I got an error:

September, 2020

2020-09-02

  • Replace Marissa van Epp with Rhys Bucknall in the CCAFS groups on CGSpace because Marissa no longer works at CCAFS
  • The AReS Explorer hasn’t updated its index since 2020-08-22 when I last forced it
    • I restarted it again now and told Moayad that the automatic indexing isn’t working
  • Add Alliance of Bioversity International and CIAT to affiliations on CGSpace
  • Abenet told me that the general search text on AReS doesn’t get reset when you use the “Reset Filters” button
  • I filed an issue on OpenRXV to make some minor edits to the admin UI: https://github.com/ilri/OpenRXV/issues/40

August, 2020

2020-08-02

  • I spent a few days working on a Java-based curation task to tag items with ISO 3166-1 Alpha2 country codes based on their cg.coverage.country text values
    • It looks up the names in ISO 3166-1 first, and then in our CGSpace countries mapping (which has five or so of Peter’s preferred “display” country names)
    • It implements a “force” mode too that will clear existing country codes and re-tag everything
    • It is class based so I can easily add support for other vocabularies, and the technique could even be used for organizations with mappings to ROR and Clarisa…
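The lookup order and the “force” mode described above could be sketched as follows. This is an illustration in Python rather than the Java curation task itself. The two dictionaries are tiny hypothetical excerpts, the function name and the representation of an item as plain lists are assumptions, and so is the behaviour of leaving existing codes untouched when force is off:

```python
# Tiny excerpt of ISO 3166-1 short names mapped to Alpha2 codes.
ISO_3166_1 = {
    "Kenya": "KE",
    "Viet Nam": "VN",
    "Tanzania, United Republic of": "TZ",
}

# Hypothetical excerpt of a local mapping of preferred display names.
DISPLAY_NAMES = {
    "Vietnam": "VN",
    "Tanzania": "TZ",
}


def tag_country_codes(countries, existing_codes, force=False):
    """Return Alpha2 codes for an item's country names.

    Looks each name up in ISO 3166-1 first, then in the local mapping.
    Existing codes are kept unless force is True (an assumption), in which
    case they are cleared and everything is re-tagged. Unknown names are
    skipped.
    """
    if existing_codes and not force:
        return list(existing_codes)

    codes = []
    for name in countries:
        code = ISO_3166_1.get(name) or DISPLAY_NAMES.get(name)
        if code and code not in codes:
            codes.append(code)
    return codes


print(tag_country_codes(["Kenya", "Vietnam", "Atlantis"], existing_codes=[]))
print(tag_country_codes(["Kenya"], existing_codes=["XX"], force=True))
```

Keeping the ISO table and the local mapping as separate lookups mirrors the two-step order in the notes, and the same pattern would extend to other vocabularies by swapping in different tables.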

July, 2020

2020-07-01

  • A few users noticed that CGSpace wasn’t loading items today; item pages seem blank
    • I looked at the PostgreSQL locks but they don’t seem unusual
    • I guess this is the same “blank item page” issue that we had a few times in 2019 that we never solved
    • I restarted Tomcat and PostgreSQL and the issue was gone
  • Since I was restarting Tomcat anyway, I decided to redeploy the latest changes from the 5_x-prod branch, and I added a note about COVID-19 items to the CGSpace frontpage at Peter’s request

June, 2020

2020-06-01

  • I tried to run the AtomicStatisticsUpdateCLI CUA migration script on DSpace Test (linode26) again and it is still going very slowly and has tons of errors like I noticed yesterday
    • I sent Atmire the dspace.log from today and told them to log into the server to debug the process
  • In other news, I checked the statistics API on DSpace 6 and it’s working
  • I tried to build the OAI registry on the freshly migrated DSpace 6 on DSpace Test and I get an error: