CGSpace Notes

Documenting day-to-day work on the CGSpace repository.

July, 2023

2023-07-01

  • Export CGSpace to check for missing Initiative collection mappings
  • Start harvesting on AReS

2023-07-02

  • Minor edits to the crossref_doi_lookup.py script while running some checks from 22,000 CGSpace DOIs

2023-07-03

  • I analyzed the licenses declared by Crossref and found with high confidence that ~400 of ours were incorrect
    • I took the more accurate ones from Crossref and updated the items on CGSpace
    • I took a few hundred ISBNs as well for where we were missing them
    • I also tagged ~4,700 items with missing licenses as “Copyrighted; all rights reserved” based on their Crossref license status being TDM, mostly from Elsevier, Wiley, and Springer
    • Checking a dozen or so manually, I confirmed that if Crossref only has a TDM license then it’s usually copyrighted (could still be open access, but we can’t tell via Crossref)
    • I would be curious to write a script to check the Unpaywall API for open access status… In the past I found that their license status was not very accurate, but the open access status might be more reliable
  • More minor work on the DSpace 7 item views
    • I learned some new Angular template syntax
    • I created a custom component to show Creative Commons licenses on the simple item page
    • I also decided that I don’t like the Impact Area icons as a component because they don’t have any visual meaning
Read more →
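
The Unpaywall check mentioned above could start out roughly like this. It is a minimal sketch and not the script from these notes: the function names are hypothetical, and it assumes the public Unpaywall v2 endpoint, which takes a DOI in the path plus an email parameter and returns JSON with fields such as is_oa, oa_status, and best_oa_location.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

UNPAYWALL_API = "https://api.unpaywall.org/v2/"


def build_unpaywall_url(doi: str, email: str) -> str:
    """Build the lookup URL for one DOI (Unpaywall expects an email parameter)."""
    return f"{UNPAYWALL_API}{quote(doi, safe='/')}?email={quote(email)}"


def summarize_oa(record: dict) -> dict:
    """Reduce an Unpaywall response to the fields of interest.

    The license comes from best_oa_location and may be missing or null.
    """
    location = record.get("best_oa_location") or {}
    return {
        "doi": record.get("doi"),
        "is_oa": bool(record.get("is_oa")),
        "oa_status": record.get("oa_status"),
        "license": location.get("license"),
    }


def fetch_oa_status(doi: str, email: str) -> dict:
    """Query Unpaywall for one DOI (needs network access; no retries or rate limiting)."""
    with urlopen(build_unpaywall_url(doi, email), timeout=30) as response:
        return summarize_oa(json.load(response))
```

Keeping the parsing in summarize_oa separate from the network call makes it easy to test against saved responses before running it over thousands of DOIs.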

June, 2023

2023-06-02

  • Spend some time testing my post_bitstreams.py script to update thumbnails for items on CGSpace
    • Interestingly I found an item with a JFIF thumbnail and another with a WebP thumbnail…
  • Meeting with Valentina, Stefano, and Sara about MODS metadata in CGSpace
    • They have experience with improving the MODS interface in MELSpace’s OAI-PMH for use with AGRIS and were curious if we could do the same in CGSpace
    • From what I can see we need to upgrade the MODS schema from 3.1 to 3.7 and then just add a bunch of our fields to the crosswalk
Read more →
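
One way to spot unexpected thumbnail formats like the JFIF and WebP ones above is to look at the first bytes of each file. This is only a sketch with a hypothetical helper name, not part of post_bitstreams.py; it relies on the well-known file signatures for JPEG/JFIF, WebP (a RIFF container), and PNG.

```python
def sniff_image_format(data: bytes) -> str:
    """Guess an image format from the leading bytes of a file."""
    if data[:3] == b"\xff\xd8\xff":
        # JPEG; JFIF files carry the "JFIF" identifier in the APP0 segment at offset 6
        if data[6:10] == b"JFIF":
            return "jpeg/jfif"
        return "jpeg"
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    if data[:8] == b"\x89PNG\r\n\x1a\n":
        return "png"
    return "unknown"
```

Reading the first twelve bytes of each thumbnail bitstream is enough for these checks, so there is no need to load whole files.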

May, 2023

2023-05-03

  • Alliance’s TIP team emailed me to ask about issues authenticating on CGSpace
    • It seems their password expired, which is annoying
  • I continued looking at the CGSpace subjects for the FAO / AGROVOC exercise that I started last week
    • There are many of our subjects that would match if they added a “-” like “high yielding varieties” or used singular…
    • Also I found at least two spelling mistakes, for example “decison support systems”, which would match if it was spelled correctly
  • Work on cleaning, proofing, and uploading twenty-seven records for IFPRI to CGSpace
Read more →
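
The near-misses described above (hyphenation and singular versus plural) could be surfaced by comparing normalized forms rather than exact strings. The sketch below is illustrative only: the helper names are hypothetical and the singularization is deliberately naive, so any match it produces still needs a manual check against AGROVOC.

```python
import re


def normalize_subject(term: str) -> str:
    """Lowercase a term and treat hyphens and repeated whitespace as single spaces."""
    return re.sub(r"[-\s]+", " ", term.lower().strip())


def candidate_forms(term: str) -> set:
    """Return the normalized term plus a naive singular form, if one applies."""
    base = normalize_subject(term)
    forms = {base}
    if base.endswith("ies"):
        forms.add(base[:-3] + "y")
    elif base.endswith("s") and not base.endswith("ss"):
        forms.add(base[:-1])
    return forms


def loosely_matches(ours: str, theirs: str) -> bool:
    """True if two terms share any candidate form."""
    return bool(candidate_forms(ours) & candidate_forms(theirs))
```

This does nothing for spelling mistakes like “decison”; those would need a fuzzy comparison or a manual pass.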

April, 2023

2023-04-02

  • Run all system updates on CGSpace and reboot it
  • I exported CGSpace to CSV to check for any missing Initiative collection mappings
    • I also did a check for missing country/region mappings with csv-metadata-quality
  • Start a harvest on AReS
Read more →

March, 2023

2023-03-01

  • Remove cg.subject.wle and cg.identifier.wletheme from CGSpace input form after confirming with IWMI colleagues that they no longer need them (WLE closed in 2021)
  • iso-codes 4.13.0 was released, which incorporates my changes to the common names for Iran, Laos, and Syria
  • I finally got through with porting the input form from DSpace 6 to DSpace 7
Read more →

February, 2023

2023-02-01

  • Export CGSpace to cross check the DOI metadata with Crossref
    • I want to try to expand my use of their data to journals, publishers, volumes, issues, etc…
Read more →
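
For the journal-level cross check, the Crossref REST API returns each work as a JSON "message" with keys such as container-title (a list), publisher, volume, and issue. The sketch below pulls those out and reports where a local record disagrees. The function names and the keys of the local record are hypothetical, not taken from crossref_doi_lookup.py.

```python
def extract_journal_fields(message: dict) -> dict:
    """Pick journal-level fields out of a Crossref work message."""
    titles = message.get("container-title") or []
    return {
        "journal": titles[0] if titles else None,
        "publisher": message.get("publisher"),
        "volume": message.get("volume"),
        "issue": message.get("issue"),
    }


def diff_fields(local: dict, remote: dict) -> dict:
    """Map each differing field to (local value, Crossref value).

    Fields that Crossref does not provide are skipped rather than reported.
    """
    return {
        key: (local.get(key), value)
        for key, value in remote.items()
        if value is not None and local.get(key) != value
    }
```

Crossref is not always right either, so the differences are a list to review, not a set of corrections to apply blindly.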

January, 2023

2023-01-01

  • Apply some more ORCID identifiers to items on CGSpace using my 2022-09-22-add-orcids.csv file
    • I want to update all ORCID names and refresh them in the database
    • I see we have some new ones that aren’t in our list if I combine with this file:
Read more →
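
Finding ORCID identifiers that appear in one file but not in another comes down to extracting the iDs and taking a set difference. A minimal sketch, assuming both inputs are plain text containing iDs in the usual 0000-0000-0000-0000 form; the function names are hypothetical and the iDs in the usage note are ORCID's published sample iDs, not values from the CSV above.

```python
import re

# Four groups of four characters; the last character may be the checksum "X"
ORCID_PATTERN = re.compile(r"\b\d{4}-\d{4}-\d{4}-\d{3}[\dX]\b")


def extract_orcids(text: str) -> set:
    """Return every ORCID iD found in a block of text."""
    return set(ORCID_PATTERN.findall(text))


def new_orcids(candidate_text: str, known_text: str) -> set:
    """Return iDs present in candidate_text but absent from known_text."""
    return extract_orcids(candidate_text) - extract_orcids(known_text)
```

For example, new_orcids("A: 0000-0002-1825-0097\nB: 0000-0001-5109-3700", "A: 0000-0002-1825-0097") returns a set containing only 0000-0001-5109-3700.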

December, 2022

2022-12-01

  • Fix some incorrect regions on CGSpace
    • I exported the CCAFS and IITA communities, extracted just the country and region columns, then ran them through csv-metadata-quality to fix the regions
  • Add a few more authors to my CSV with author names and ORCID identifiers and tag 283 items!
  • Replace “East Asia” with “Eastern Asia” region on CGSpace (UN M.49 region)
Read more →
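
A region rename like “East Asia” to “Eastern Asia” has to respect multi-value cells in a CSV export. The sketch below assumes values are joined with "||", the separator DSpace uses for multiple values in its metadata CSVs. The function name is hypothetical and this is not how csv-metadata-quality does it.

```python
def replace_region(cell: str, old: str = "East Asia", new: str = "Eastern Asia", sep: str = "||") -> str:
    """Replace one exact region value inside a multi-value CSV cell.

    Only whole values are compared, so "South-East Asia" is left alone.
    Duplicates created by the replacement are dropped, keeping the original order.
    """
    values = [value.strip() for value in cell.split(sep)]
    replaced = [new if value == old else value for value in values]
    return sep.join(dict.fromkeys(replaced))
```

Matching whole values instead of substrings is the important part, because several region names contain one another.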

November, 2022

2022-11-01

  • Last night I re-synced DSpace 7 Test from CGSpace
    • I also updated all my local 7_x-dev branches on the latest upstreams
  • I spent some time updating the authorizations in Alliance collections
    • I want to make sure they use groups instead of individuals where possible!
  • I reverted the Cocoon autosave change because it was more of a nuisance, in that Peter can’t upload CSVs from the web interface, and it is a very low severity security issue
Read more →