2019-03-01
- I checked IITA’s 259 Feb 14 records from last month for duplicates using Atmire’s Duplicate Checker on a fresh snapshot of CGSpace on my local machine and everything looks good
- I am now only waiting to hear from her about where the items should go, though I assume Journal Articles go to IITA Journal Articles collection, etc…
- Looking at the other half of Udana’s WLE records from 2018-11
- I finished the ones for Restoring Degraded Landscapes (RDL), but these are for Variability, Risks and Competing Uses (VRC)
- I did the usual cleanups for whitespace, added regions where they made sense for certain countries, cleaned up the DOI link formats, added rights information based on the publications page for a few items
- Most worryingly, there are encoding errors in the abstracts for eleven items, for example:
- 68.15% � 9.45 instead of 68.15% ± 9.45
- 2003�2013 instead of 2003–2013
- I think I will need to ask Udana to re-copy and paste the abstracts with more care using Google Docs
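A quick way to find the affected abstracts before sending them back is to search the metadata file for U+FFFD, the Unicode replacement character that the broken symbols above were turned into. This is only a minimal sketch: the function name and the demo data are made up for illustration, and because abstracts can contain line breaks, the reported line numbers will not always match CSV row numbers.

```shell
# Hypothetical helper: print the numbered lines of a file that contain
# U+FFFD, the Unicode replacement character (bytes EF BF BD in UTF-8).
find_replacement_chars() {
    # Build the three-byte sequence with octal escapes, which POSIX printf supports
    fffd=$(printf '\357\277\275')
    # LC_ALL=C makes grep compare raw bytes regardless of the current locale
    LC_ALL=C grep -n "$fffd" "$1"
}

# Demo on a throwaway CSV (contents are made up for illustration)
demo=$(mktemp)
printf 'id,abstract\n1,"68.15%% \357\277\275 9.45"\n2,"clean text"\n' > "$demo"
find_replacement_chars "$demo"
rm -f "$demo"
```

Note that grep exits non-zero when nothing matches, so a clean file produces no output and a failing exit status.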
2019-03-03
- Trying to finally upload IITA’s 259 Feb 14 items to CGSpace so I exported them from DSpace Test:
$ mkdir 2019-03-03-IITA-Feb14
$ dspace export -i 10568/108684 -t COLLECTION -m -n 0 -d 2019-03-03-IITA-Feb14
- As I was inspecting the archive I noticed that there were some problems with the bitstreams:
- First, Sisay didn’t include the bitstream descriptions
- Second, only five items had bitstreams and I remember in the discussion with IITA that there should have been nine!
- I had to refer to the original CSV from January to find the file names, then download and add them to the export contents manually!
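The bitstream count can be checked without opening every item by hand. The sketch below assumes the export follows DSpace's Simple Archive Format, where each item directory has a contents file listing one bitstream per line, and it assumes that license entries are marked with bundle:LICENSE. The function name and the demo data are hypothetical.

```shell
# Hypothetical helper: for each item directory in an export, print the item
# name and the number of bitstreams listed in its "contents" file,
# skipping entries that belong to the LICENSE bundle.
count_bitstreams() {
    for item in "$1"/*/; do
        contents="${item}contents"
        [ -f "$contents" ] || continue
        # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true"
        n=$(grep -v 'bundle:LICENSE' "$contents" | grep -c . || true)
        printf '%s\t%s\n' "$(basename "$item")" "$n"
    done
}

# Demo with a made-up export of two items, only one of which has a bitstream
demo=$(mktemp -d)
mkdir "$demo/0" "$demo/1"
printf 'paper.pdf\tbundle:ORIGINAL\nlicense\tbundle:LICENSE\n' > "$demo/0/contents"
printf 'license\tbundle:LICENSE\n' > "$demo/1/contents"
count_bitstreams "$demo"
rm -rf "$demo"
```

Piping the output through awk -F'\t' '$2 > 0' | wc -l would give the number of items that have at least one bitstream, which is the figure to compare against what IITA expected.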
- After adding the missing bitstreams and descriptions manually I tested them again locally, then imported them to a temporary collection on CGSpace:
$ dspace import -a -c 10568/99832 -e aorth@stfu.com -m 2019-03-03-IITA-Feb14.map -s /tmp/2019-03-03-IITA-Feb14
- DSpace’s export function doesn’t include the owning collections for some reason, so you need to import the items somewhere first, then export that collection’s metadata and re-map the items to their proper owning collections based on their types, using OpenRefine or something similar
- After re-importing to CGSpace to apply the mappings, I deleted the collection on DSpace Test and ran the dspace cleanup script
- Merge the IITA research theme changes from last month to the 5_x-prod branch (#413)
- I will deploy to CGSpace soon and then think about how to batch tag all IITA’s existing items with this metadata
- Deploy Tomcat 7.0.93 on CGSpace (linode18) after having tested it on DSpace Test (linode19) for a week
2019-03-06
- Abenet was having problems with a CIP user account; I think the user could not register
- I suspect it’s related to the email issue that ICT hasn’t responded about since last week
- As I thought, I still cannot send emails from CGSpace:
$ dspace test-email
About to send test email:
- To: blah@stfu.com
- Subject: DSpace test email
- Server: smtp.office365.com
Error sending email:
- Error: javax.mail.AuthenticationFailedException
- I will send a follow-up to ICT to ask them to reset the password
2019-03-07
- ICT reset the email password and I confirmed that it is working now
- Generate a controlled vocabulary of 1187 AGROVOC subjects from the top 1500 that I checked last month, dumping the terms themselves using csvcut, then applying the XML controlled vocabulary format in vim, and finally checking with tidy for good measure:
$ csvcut -c name 2019-02-22-subjects.csv > dspace/config/controlled-vocabularies/dc-subject.xml
$ # apply formatting in XML file
$ tidy -xml -utf8 -iq -m -w 0 dspace/config/controlled-vocabularies/dc-subject.xml
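The manual formatting step in vim could also be scripted. The following is a rough sketch rather than what was actually run: it assumes DSpace's controlled vocabulary layout of nested node elements inside isComposedBy, uses a root id and label of dc-subject as a guess, and assumes one plain term per line with no CSV quoting. The function name is hypothetical.

```shell
# Hypothetical helper: read one term per line on stdin and print a
# DSpace-style controlled vocabulary on stdout.
# The root id/label "dc-subject" is an assumption, not taken from the log.
terms_to_vocabulary() {
    printf '<?xml version="1.0" encoding="UTF-8"?>\n'
    printf '<node id="dc-subject" label="dc-subject">\n'
    printf '  <isComposedBy>\n'
    # Drop blank lines, escape &, <, > and " for use in XML attributes,
    # then wrap every remaining term in a <node/> element
    sed -e '/^$/d' \
        -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g' -e 's/"/\&quot;/g' \
        -e 's/.*/    <node id="&" label="&"\/>/'
    printf '  </isComposedBy>\n'
    printf '</node>\n'
}

# Demo with two made-up terms
printf 'MAIZE\nR&D\n' | terms_to_vocabulary
```

With csvkit installed, something like csvcut -c name 2019-02-22-subjects.csv | tail -n +2 | terms_to_vocabulary would skip the header row and feed the terms in, after which tidy can still be used to check the result.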
- Atmire noticed my message about the “solr_update_time_stamp” error on the dspace-tech mailing list and created an issue on their tracker to discuss it with me
- They say the error is harmless, but has nevertheless been fixed in their newer module versions