CGSpace Notes

Documenting day-to-day work on the CGSpace repository.

March, 2019

2019-03-01

  • I checked IITA’s 259 Feb 14 records from last month for duplicates using Atmire’s Duplicate Checker on a fresh snapshot of CGSpace on my local machine and everything looks good
  • I am now only waiting to hear from her about where the items should go, though I assume Journal Articles go to IITA Journal Articles collection, etc…
  • Looking at the other half of Udana’s WLE records from 2018-11
    • I finished the ones for Restoring Degraded Landscapes (RDL), but these are for Variability, Risks and Competing Uses (VRC)
    • I did the usual cleanups for whitespace, added regions where they made sense for certain countries, cleaned up the DOI link formats, added rights information based on the publications page for a few items
    • Most worryingly, there are encoding errors in the abstracts for eleven items, for example:
    • 68.15% � 9.45 instead of 68.15% ± 9.45
    • 2003�2013 instead of 2003–2013
  • I think I will need to ask Udana to re-copy and paste the abstracts with more care using Google Docs

2019-03-03

  • Trying to finally upload IITA’s 259 Feb 14 items to CGSpace so I exported them from DSpace Test:
$ mkdir 2019-03-03-IITA-Feb14
$ dspace export -i 10568/108684 -t COLLECTION -m -n 0 -d 2019-03-03-IITA-Feb14
  • As I was inspecting the archive I noticed that there were some problems with the bitsreams:
    • First, Sisay didn’t include the bitstream descriptions
    • Second, only five items had bitstreams and I remember in the discussion with IITA that there should have been nine!
    • I had to refer to the original CSV from January to find the file names, then download and add them to the export contents manually!
  • After adding the missing bitstreams and descriptions manually I tested them again locally, then imported them to a temporary collection on CGSpace:
$ dspace import -a -c 10568/99832 -e aorth@stfu.com -m 2019-03-03-IITA-Feb14.map -s /tmp/2019-03-03-IITA-Feb14
  • DSpace’s export function doesn’t include the collections for some reason, so you need to import them somewhere first, then export the collection metadata and re-map the items to proper owning collections based on their types using OpenRefine or something
  • After re-importing to CGSpace to apply the mappings, I deleted the collection on DSpace Test and ran the dspace cleanup script
  • Merge the IITA research theme changes from last month to the 5_x-prod branch (#413)
    • I will deploy to CGSpace soon and then think about how to batch tag all IITA’s existing items with this metadata