I checked IITA’s 259 Feb 14 records from last month for duplicates using Atmire’s Duplicate Checker on a fresh snapshot of CGSpace on my local machine and everything looks good
I am now only waiting to hear from her about where the items should go, though I assume Journal Articles go to IITA Journal Articles collection, etc…
Looking at the other half of Udana’s WLE records from 2018-11
I finished the ones for Restoring Degraded Landscapes (RDL), but these are for Variability, Risks and Competing Uses (VRC)
I did the usual cleanups for whitespace, added regions where they made sense for certain countries, cleaned up the DOI link formats, added rights information based on the publications page for a few items
Most worryingly, there are encoding errors in the abstracts for eleven items, for example:
68.15% <20> 9.45 instead of 68.15% ± 9.45
2003<EFBFBD>2013 instead of 2003–2013
I think I will need to ask Udana to re-copy and paste the abstracts with more care using Google Docs
I checked IITA’s 259 Feb 14 records from last month for duplicates using Atmire’s Duplicate Checker on a fresh snapshot of CGSpace on my local machine and everything looks good
I am now only waiting to hear from her about where the items should go, though I assume Journal Articles go to IITA Journal Articles collection, etc…
Looking at the other half of Udana’s WLE records from 2018-11
I finished the ones for Restoring Degraded Landscapes (RDL), but these are for Variability, Risks and Competing Uses (VRC)
I did the usual cleanups for whitespace, added regions where they made sense for certain countries, cleaned up the DOI link formats, added rights information based on the publications page for a few items
Most worryingly, there are encoding errors in the abstracts for eleven items, for example:
68.15% <20> 9.45 instead of 68.15% ± 9.45
2003<EFBFBD>2013 instead of 2003–2013
I think I will need to ask Udana to re-copy and paste the abstracts with more care using Google Docs
<li>I checked IITA’s 259 Feb 14 records from last month for duplicates using Atmire’s Duplicate Checker on a fresh snapshot of CGSpace on my local machine and everything looks good</li>
<li>I am now only waiting to hear from her about where the items should go, though I assume Journal Articles go to IITA Journal Articles collection, etc…</li>
<li>Looking at the other half of Udana’s WLE records from 2018-11
<ul>
<li>I finished the ones for Restoring Degraded Landscapes (RDL), but these are for Variability, Risks and Competing Uses (VRC)</li>
<li>I did the usual cleanups for whitespace, added regions where they made sense for certain countries, cleaned up the DOI link formats, added rights information based on the publications page for a few items</li>
<li>Most worryingly, there are encoding errors in the abstracts for eleven items, for example:</li>
<li>68.15% <20> 9.45 instead of 68.15% ± 9.45</li>
<li>2003<EFBFBD>2013 instead of 2003–2013</li>
</ul></li>
<li>I think I will need to ask Udana to re-copy and paste the abstracts with more care using Google Docs</li>
<li>As I was inspecting the archive I noticed that there were some problems with the bitsreams:
<ul>
<li>First, Sisay didn’t include the bitstream descriptions</li>
<li>Second, only five items had bitstreams and I remember in the discussion with IITA that there should have been nine!</li>
<li>I had to refer to the original CSV from January to find the file names, then download and add them to the export contents manually!</li>
</ul></li>
<li>After adding the missing bitstreams and descriptions manually I tested them again locally, then imported them to a temporary collection on CGSpace:</li>
<li>DSpace’s export function doesn’t include the collections for some reason, so you need to import them somewhere first, then export the collection metadata and re-map the items to proper owning collections based on their types using OpenRefine or something</li>
<li>After re-importing to CGSpace to apply the mappings, I deleted the collection on DSpace Test and ran the <code>dspace cleanup</code> script</li>