2024-04-04
- Continue working on CGSpace duplicate DOIs
2024-04-08
- Start working on IFPRI’s 2022 batch import
- I ran the duplicate checker against CGSpace and started downloading all linked PDFs
2024-04-09
- Continue working on IFPRI’s 2022 batch import
- I started validating the potential duplicates in OpenRefine
2024-04-12
- Finish working on the 650 IFPRI 2022 records that were not already on CGSpace, then upload them
- I need to merge the metadata for the remaining 212 that are already on CGSpace
- Spend some time looking at duplicate DOIs again…
2024-04-13
- Spend some time looking at duplicate DOIs again…
2024-04-14
- Spend some time looking at duplicate DOIs again…
2024-04-15
$ time curl -s -o /dev/null 'https://cgspace.cgiar.org/server/api/discover/search/objects?query=cg.identifier.project%3AIFPRI*&scope=8f1e9650-fe87-4e6e-889a-1cacfb747408&page=0&size=100&embed=thumbnail,bundles/bitstreams&sort=dcterms.issued,desc'
curl -s -o /dev/null 0.01s user 0.01s system 0% cpu 47.515 total
$ time curl -s -o /dev/null 'https://cgspace.cgiar.org/server/api/discover/search/objects?query=cg.identifier.project%3AIFPRI*&scope=8f1e9650-fe87-4e6e-889a-1cacfb747408&page=0&size=100&sort=dcterms.issued,desc'
curl -s -o /dev/null 0.01s user 0.01s system 0% cpu 4.764 total
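- To put the two timings above side by side, here is a minimal sketch that computes how much slower the request with `embed=thumbnail,bundles/bitstreams` was. The `slowdown` helper is a hypothetical name for illustration, and the numbers are the wall-clock totals from the two `time` outputs above (a single run each, so treat the ratio as indicative only):

```shell
# Hypothetical helper: ratio of two wall-clock times in seconds (slow / fast),
# printed with one decimal place.
slowdown() {
  awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", slow / fast }'
}

# Totals taken from the two curl runs above: with and without the embed parameter
slowdown 47.515 4.764
# prints 10.0
```

In this one measurement, the request with the embeds took roughly ten times longer than the same search without them.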
- Finalize processing the remaining 206 items from the IFPRI 2022 batch set that already existed on CGSpace
- I merged metadata with the existing items
- Six items remain that I identified as duplicates (3x2, i.e. three pairs) within the IFPRI set itself
2024-04-16
- Spend some time looking at duplicate DOIs again…