diff --git a/content/posts/2023-10.md b/content/posts/2023-10.md
index 91c2a9175..af80a48b0 100644
--- a/content/posts/2023-10.md
+++ b/content/posts/2023-10.md
@@ -136,4 +136,15 @@ forEach(value.parseXml().select("jats|p"),i,i.xmlText()).join("")
   - I used my `crossref_doi_lookup.py` script to fetch the metadata for them using their DOIs, then did a bunch of cleanup in OpenRefine
 - Test some LDAP patches for DSpace 7
 
+## 2023-10-30
+
+- Some work on metadata for Aditi's review
+  - I found more preprints grrrr
+
+## 2023-10-31
+
+- Peter got back to me with the cleanups on ILRI journal articles from Altmetric that we didn't have on CGSpace
+  - I did another duplicate check and found four more duplicates that had been uploaded yesterday
+  - Then I did a quick sanity check and uploaded the remaining 19 items to CGSpace
+
diff --git a/content/posts/2023-11.md b/content/posts/2023-11.md
new file mode 100644
index 000000000..800926db0
--- /dev/null
+++ b/content/posts/2023-11.md
@@ -0,0 +1,24 @@
+---
+title: "November, 2023"
+date: 2023-11-02T12:59:36+03:00
+author: "Alan Orth"
+categories: ["Notes"]
+---
+
+## 2023-11-01
+
+- Work a bit on the ETL pipeline for the CGIAR Climate Change Synthesis
+  - I improved the filtering and wrote some Python using pandas to merge my sources more reliably
+
+## 2023-11-02
+
+- Export CGSpace to check missing Initiative collection mappings
+- Start a harvest on AReS
+
+
+
+- IFPRI contacted us about importing their Slideshare presentations to CGSpace
+  - There are ~1,700 of them and date back to as early as 2008
+  - I did a quick cleanup of the metadata export from Slideshare (including tagging with some AGROVOC in OpenRefine) and uploaded to DSpace Test
+
+