CGSpace Notes

Documenting day-to-day work on the CGSpace repository.

July, 2023

2023-07-01

  • Export CGSpace to check for missing Initiative collection mappings
  • Start harvesting on AReS

2023-07-02

  • Minor edits to the crossref_doi_lookup.py script while running some checks on 22,000 CGSpace DOIs

2023-07-03

  • I analyzed the licenses declared by Crossref and found with high confidence that ~400 of ours were incorrect
    • I took the more accurate ones from Crossref and updated the items on CGSpace
    • I also took a few hundred ISBNs from Crossref for items where ours were missing
    • I also tagged ~4,700 items with missing licenses as “Copyrighted; all rights reserved” based on their Crossref license status being TDM, mostly from Elsevier, Wiley, and Springer
    • Checking a dozen or so manually, I confirmed that if Crossref only has a TDM license then it’s usually copyrighted (could still be open access, but we can’t tell via Crossref)
    • I would be curious to write a script to check the Unpaywall API for open access status…
    • In the past I found that Unpaywall’s license status was not very accurate, but the open access status might be more reliable
  • More minor work on the DSpace 7 item views
    • I learned some new Angular template syntax
    • I created a custom component to show Creative Commons licenses on the simple item page
    • I also decided that I don’t like the Impact Area icons as a component because they don’t have any visual meaning
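
As a rough illustration of the 2023-07-03 license triage, here is a minimal sketch. It is not the actual crossref_doi_lookup.py logic. It assumes each work's licenses arrive as the `license` array of a Crossref REST API work record (entries with `URL` and `content-version` keys), and that the Unpaywall record is the JSON returned for a single DOI (with `is_oa`, `oa_status`, and `best_oa_location`). The function names are hypothetical, and treating any Creative Commons URL as authoritative regardless of content version is a simplification.

```python
from typing import Dict, List, Optional

# Label used on CGSpace for items with no open license.
COPYRIGHTED = "Copyrighted; all rights reserved"


def classify_crossref_licenses(licenses: List[Dict]) -> Optional[str]:
    """Return a rough license label for one work's Crossref license list.

    Hypothetical helper: returns a Creative Commons URL if one is present,
    the copyrighted label if Crossref only lists TDM licenses, and None
    when the result is ambiguous and needs manual review.
    """
    if not licenses:
        return None  # Crossref has no license information for this DOI

    # A Creative Commons URL on any entry is the most specific signal.
    for entry in licenses:
        url = entry.get("URL", "")
        if "creativecommons.org" in url.lower():
            return url

    # Only text-and-data-mining entries: usually copyrighted (heuristic).
    versions = {entry.get("content-version", "").lower() for entry in licenses}
    if versions == {"tdm"}:
        return COPYRIGHTED

    return None  # Mixed or unknown content versions


def unpaywall_summary(record: Dict) -> Dict:
    """Pull the open access fields out of one Unpaywall DOI record."""
    # best_oa_location is null for closed items, so fall back to {}.
    location = record.get("best_oa_location") or {}
    return {
        "is_oa": bool(record.get("is_oa")),
        "oa_status": record.get("oa_status"),
        "license": location.get("license"),
    }
```

A TDM-only result says nothing about open access, so comparing it with `unpaywall_summary()` for the same DOI would be one way to flag items that are copyrighted but still free to read.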