CGSpace Notes

Documenting day-to-day work on the CGSpace repository.

August, 2020

2020-08-02

  • I spent a few days working on a Java-based curation task to tag items with ISO 3166-1 Alpha2 country codes based on their cg.coverage.country text values
    • It looks up the names in ISO 3166-1 first, and then in our CGSpace countries mapping (which has five or so of Peter’s preferred “display” country names)
    • It implements a “force” mode too that will clear existing country codes and re-tag everything
    • It is class based so I can easily add support for other vocabularies, and the technique could even be used for organizations with mappings to ROR and Clarisa…
Read more →

July, 2020

2020-07-01

  • A few users noticed that CGSpace wasn’t loading items today; item pages appeared blank
    • I looked at the PostgreSQL locks but they don’t seem unusual
    • I guess this is the same “blank item page” issue that we had a few times in 2019 that we never solved
    • I restarted Tomcat and PostgreSQL and the issue was gone
  • Since I was restarting Tomcat anyways I decided to redeploy the latest changes from the 5_x-prod branch and I added a note about COVID-19 items to the CGSpace frontpage at Peter’s request
Read more →

June, 2020

2020-06-01

  • I tried to run the AtomicStatisticsUpdateCLI CUA migration script on DSpace Test (linode26) again and it is still going very slowly and has tons of errors like I noticed yesterday
    • I sent Atmire the dspace.log from today and told them to log into the server to debug the process
  • In other news, I checked the statistics API on DSpace 6 and it’s working
  • I tried to build the OAI registry on the freshly migrated DSpace 6 on DSpace Test and I get an error:
Read more →

May, 2020

2020-05-02

  • Peter said that CTA is having problems submitting an item to CGSpace
    • Looking at the PostgreSQL stats it seems to be the same issue that Tezira was having last week, as I see that the number of connections in the ‘idle in transaction’ and ‘waiting for lock’ states is increasing again
    • I see that CGSpace (linode18) is still using PostgreSQL JDBC driver version 42.2.11, and there were some bugs related to transactions fixed in 42.2.12 (which I had updated in the Ansible playbooks, but not deployed yet)
Read more →

April, 2020

2020-04-02

  • Maria asked me to update Charles Staver’s ORCID iD in the submission template and on CGSpace, as his name was lower case before, and now he has corrected it
    • I updated the fifty-eight existing items on CGSpace
  • Looking into the items Udana had asked about last week that were missing Altmetric donuts:
  • On the same note, the one item Abenet pointed out last week now has a donut with a score of 104 after I tweeted it last week
Read more →

February, 2020

2020-02-02

  • Continue working on porting CGSpace’s DSpace 5 code to DSpace 6.3 that I started yesterday
    • Sign up for an account with MaxMind so I can get the GeoLite2-City.mmdb database
    • I still need to wire up the API credentials and cron job into the Ansible infrastructure playbooks
    • Fix some minor issues in the config and XMLUI themes, like removing Atmire stuff
    • The code finally builds and runs with a fresh install
Read more →

January, 2020

2020-01-06

  • Open a ticket with Atmire to request a quote for the upgrade to DSpace 6
  • Last week Altmetric responded about the item that had a lower score than its DOI
    • The score is now linked to the DOI
    • Another item that had the same problem in 2019 has now also been linked to the score for its DOI
    • Another item that had the same problem in 2019 has also been fixed

2020-01-07

  • Peter Ballantyne highlighted one more WLE item that is missing the Altmetric score that its DOI has
    • The DOI has a score of 259, but the Handle has no score at all
    • I tweeted the CGSpace repository link
Read more →

December, 2019

2019-12-01

  • Upgrade CGSpace (linode18) to Ubuntu 18.04:
    • Check any packages that have residual configs and purge them:
# dpkg -l | grep -E '^rc' | awk '{print $2}' | xargs dpkg -P
    • Make sure all packages and the package manager are up to date, then reboot:
# apt update && apt full-upgrade
# apt-get autoremove && apt-get autoclean
# dpkg -C
# reboot
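Before running the residual-config purge above, it may be worth previewing which packages would be affected. The following is a minimal sketch, not part of the original upgrade notes: `list_residual_configs` is a hypothetical helper name, and it assumes `dpkg -l` style output where the first column is the state and the second is the package name.

```shell
# Hypothetical helper: read `dpkg -l` style output on stdin and print the
# names of packages whose state column is exactly "rc"
# (package removed, configuration files still present).
list_residual_configs() {
    awk '$1 == "rc" { print $2 }'
}
```

It could be used as `dpkg -l | list_residual_configs` to review the list; nothing is purged by the helper itself.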
Read more →

November, 2019

2019-11-04

  • Peter noticed that there were 5.2 million hits on CGSpace in 2019-10 according to the Atmire usage statistics
    • I looked in the nginx logs and saw 4.6 million requests in the access logs and 1.2 million in the API logs:
# zcat --force /var/log/nginx/*access.log.*.gz | grep -cE "[0-9]{1,2}/Oct/2019"
4671942
# zcat --force /var/log/nginx/{rest,oai,statistics}.log.*.gz | grep -cE "[0-9]{1,2}/Oct/2019"
1277694
  • So 4.6 million from XMLUI and another 1.2 million from API requests
  • Let’s see how many of the REST API requests were for bitstreams (because they are counted in Solr stats):
# zcat --force /var/log/nginx/rest.log.*.gz | grep -c -E "[0-9]{1,2}/Oct/2019"
1183456
# zcat --force /var/log/nginx/rest.log.*.gz | grep -E "[0-9]{1,2}/Oct/2019" | grep -c -E "/rest/bitstreams"
106781
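To put the two REST counts above in proportion, the share of bitstream requests can be worked out directly from them. This is only a sketch of the arithmetic: `pct` is a hypothetical helper name, and the two numbers are the counts reported by the commands above.

```shell
# Hypothetical helper: print what percentage `part` is of `total`,
# rounded to two decimal places.
pct() {
    awk -v part="$1" -v total="$2" 'BEGIN { printf "%.2f\n", 100 * part / total }'
}

# Bitstream requests as a share of all REST API requests in 2019-10.
pct 106781 1183456
```

By this calculation roughly 9% of the REST API requests in October were for bitstreams.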
Read more →