2024-07-01
- A bit of work to clean up duplicate DOIs on CGSpace
- A handful of book chapters, working papers, and journal articles using the wrong DOI
- I tried to delete all users who have been inactive for the past six years (since July 1, 2018):
2024-06-03
- Working on IFPRI datasets
2024-05-01
- I dumped all the CGSpace DOIs and resolved them with my crossref_doi_lookup.py script
- Then I did some work to add missing abstracts (about 900!), volumes, issues, licenses, publishers, types, etc.
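As a rough illustration of this kind of lookup, here is a minimal sketch that queries the public Crossref REST API for one DOI and picks out the fields mentioned above. It is not the crossref_doi_lookup.py script itself; the function names and the returned dictionary layout are assumptions made for this example, and only the API endpoint and the Crossref field names are taken from Crossref.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

CROSSREF_API = "https://api.crossref.org/works/"
DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(doi: str) -> str:
    """Strip a resolver prefix, if any, and lowercase the DOI."""
    doi = doi.strip().lower()
    for prefix in DOI_PREFIXES:
        if doi.startswith(prefix):
            return doi[len(prefix):]
    return doi


def extract_fields(message: dict) -> dict:
    """Pick the fields of interest out of a Crossref work record.

    Missing fields come back as None (or an empty list for licenses).
    """
    licenses = [entry["URL"] for entry in message.get("license", []) if "URL" in entry]
    return {
        "abstract": message.get("abstract"),
        "volume": message.get("volume"),
        "issue": message.get("issue"),
        "publisher": message.get("publisher"),
        "type": message.get("type"),
        "licenses": licenses,
    }


def lookup(doi: str) -> dict:
    """Fetch one DOI from Crossref (requires network access)."""
    url = CROSSREF_API + quote(normalize_doi(doi))
    with urlopen(url, timeout=30) as response:
        return extract_fields(json.load(response)["message"])
```

Keeping `extract_fields` separate from the network call means the parsing can be checked against saved responses without hitting the API.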
2024-04-04
- More work on CGSpace duplicate DOIs
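One way to spot duplicate DOIs in an export is to group items by a normalized DOI and keep only the groups with more than one item. The sketch below assumes the export has already been reduced to (handle, DOI) pairs; the function name and the sample handles are hypothetical, not taken from the actual workflow.

```python
from collections import defaultdict

DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def find_duplicate_dois(items):
    """Return {normalized_doi: [handles]} for every DOI used by more than one item.

    `items` is an iterable of (handle, doi) pairs; empty DOIs are skipped.
    """
    by_doi = defaultdict(list)
    for handle, doi in items:
        if not doi:
            continue
        key = doi.strip().lower()
        # Treat "https://doi.org/10.x/y" and "10.x/y" as the same DOI.
        for prefix in DOI_PREFIXES:
            if key.startswith(prefix):
                key = key[len(prefix):]
                break
        by_doi[key].append(handle)
    return {doi: handles for doi, handles in by_doi.items() if len(handles) > 1}


if __name__ == "__main__":
    # Hypothetical sample rows, for illustration only.
    rows = [
        ("10568/1", "https://doi.org/10.1000/ABC"),
        ("10568/2", "10.1000/abc"),
        ("10568/3", "https://doi.org/10.1000/xyz"),
    ]
    print(find_duplicate_dois(rows))
```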
2024-03-01
- Last week Bizu reported an issue with the “browse by issue date” drop down
2024-02-05
2024-01-02
- Work on preparation of new server for DSpace 7 migration
- I’m not quite sure what we need to do for the Handle server
- For now I just ran the dspace make-handle-config script and diffed it with the one from DSpace 6
- I sent the bundle to the Handle admins to make sure it’s OK before we do the migration
- Continue testing and debugging the cgspace-java-helpers on DSpace 7
- Work on IFPRI ISNAR archive cleanup
2023-12-01
- There is still high load on CGSpace and I don’t know why
- I don’t see a high number of sessions compared to previous days in the last few weeks:
$ for file in dspace.log.2023-11-[23]*; do echo "$file"; grep -a -oE 'session_id=[A-Z0-9]{32}' "$file" | sort | uniq | wc -l; done
dspace.log.2023-11-20
22865
dspace.log.2023-11-21
20296
dspace.log.2023-11-22
19688
dspace.log.2023-11-23
17906
dspace.log.2023-11-24
18453
dspace.log.2023-11-25
17513
dspace.log.2023-11-26
19037
dspace.log.2023-11-27
21103
dspace.log.2023-11-28
23023
dspace.log.2023-11-29
23545
dspace.
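The same per-file count of unique session IDs can be sketched in Python, which may be handy if the numbers need further processing. This is an equivalent of the shell loop above under the same assumptions (log files in the current directory, 32-character uppercase alphanumeric session IDs); the function name is made up for this example.

```python
import re
from pathlib import Path

# Same pattern as the grep above, applied to bytes so that stray
# non-UTF-8 bytes in the log are tolerated (like grep -a).
SESSION_RE = re.compile(rb"session_id=[A-Z0-9]{32}")


def count_unique_sessions(path) -> int:
    """Count the distinct session_id values in one log file."""
    seen = set()
    with open(path, "rb") as handle:
        for line in handle:
            seen.update(SESSION_RE.findall(line))
    return len(seen)


if __name__ == "__main__":
    for log in sorted(Path(".").glob("dspace.log.2023-11-[23]*")):
        print(log.name)
        print(count_unique_sessions(log))
```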
2023-11-01
- Work a bit on the ETL pipeline for the CGIAR Climate Change Synthesis
- I improved the filtering and wrote some Python using pandas to merge my sources more reliably
2023-11-02
- Export CGSpace to check missing Initiative collection mappings
- Start a harvest on AReS
2023-10-02
- Export CGSpace to check DOIs against Crossref
- I found that Crossref’s metadata is in the public domain under the CC0 license
- One interesting exception is the abstracts: they remain copyrighted by their owners, so Crossref cannot waive that copyright under the terms of the CC0 license because it is not theirs to waive
- We can be on the safe side by using only abstracts for items that are licensed under Creative Commons
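The "safe side" rule could be expressed as a small filter that returns an abstract only when the record carries a Creative Commons license. This sketch assumes the license information comes from the `license` list of a Crossref work record, where each entry has a `URL` key; the check could equally be made against the license stored on the CGSpace item. The function names are hypothetical.

```python
from urllib.parse import urlparse


def is_creative_commons(license_url: str) -> bool:
    """True if the license URL is hosted on creativecommons.org."""
    host = urlparse(license_url).netloc.lower()
    return host == "creativecommons.org" or host.endswith(".creativecommons.org")


def usable_abstract(message: dict):
    """Return the abstract only if at least one license is Creative Commons.

    Returns None when there is no abstract or no Creative Commons license.
    """
    urls = [entry.get("URL", "") for entry in message.get("license", [])]
    if any(is_creative_commons(url) for url in urls):
        return message.get("abstract")
    return None
```

Matching on the host name rather than on a list of specific licenses keeps the check simple, at the cost of treating every Creative Commons variant the same way.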