diff --git a/content/posts/2023-09.md b/content/posts/2023-09.md index 021a7b688..fc3aa27a4 100644 --- a/content/posts/2023-09.md +++ b/content/posts/2023-09.md @@ -18,4 +18,46 @@ categories: ["Notes"] - It still feels hacky, but using [AfterViewInit](https://stackoverflow.com/questions/41936631/how-to-trigger-the-function-after-dom-markup-is-loaded-in-angular-style-applicat), and importing the Altmetric `embed.js` in the component works - The style on mobile also needs work... +## 2023-09-06 + +- Discussion with Marie about finalizing the output types list on GitHub + - I did some review and cleanup in preparation for publishing the new list + +## 2023-09-07 + +- Export CGSpace to start doing a review of the metadata +- First I will start by extracting all items with DOIs, along with some fields I can compare against Crossref: + +```console +$ csvgrep -c 'cg.identifier.doi[en_US]' -r 'doi.org' ~/Downloads/2023-09-07-cgspace.csv \ + | csvcut -c 'id,dc.title[en_US],dcterms.issued[en_US],dcterms.available[en_US],cg.issn[en_US],cg.isbn[en_US],cg.volume[en_US],cg.issue[en_US],cg.number[en_US],dcterms.extent[en_US],cg.identifier.doi[en_US],cg.reviewStatus[en_US],cg.isijournal[en_US],dcterms.license[en_US],dcterms.accessRights[en_US],dcterms.type[en_US],dc.identifier.uri[en_US]' \ + > /tmp/2023-09-07-cgspace-dois.csv +$ csvgrep -c 'cg.identifier.doi[en_US]' -r 'doi.org' ~/Downloads/2023-09-07-cgspace.csv | csvcut -c 'cg.identifier.doi[en_US]' | sed 1d > /tmp/2023-09-07-cgspace-dois.txt +``` + +- Then I resolved the DOIs from Crossref: + +```console +$ ./ilri/crossref_doi_lookup.py -i /tmp/2023-09-07-cgspace-dois.txt -o /tmp/2023-09-07-cgspace-dois-results.csv -e a.orth@cgiar.org +``` + +- A user emailed to ask about uploading a 180MB PDF to CGSpace + - I used GhostScript to try reducing it using the `screen`, `ebook` and `prepress` presets: + +```console +$ gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/screen -dNOPAUSE -dQUIET -dBATCH -sOutputFile=primer-screen.pdf Primer\ \(digital\)_Climate-\ smart\ and\ regenerative\ agriculture\ in\ climate\ change\ adaptation.pdf +$ gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook -dNOPAUSE -dQUIET -dBATCH -sOutputFile=primer-ebook.pdf Primer\ \(digital\)_Climate-\ smart\ and\ regenerative\ agriculture\ in\ climate\ change\ adaptation.pdf +$ gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/prepress -dNOPAUSE -dQUIET -dBATCH -sOutputFile=primer-prepress.pdf Primer\ \(digital\)_Climate-\ smart\ and\ regenerative\ agriculture\ in\ climate\ change\ adaptation.pdf +``` + +- The `prepress` one is 300DPI and looks visually identical to the original, so I proposed that we use that one + +## 2023-09-08 + +- I did a review of the metadata for our items with DOIs, comparing with data from Crossref + - I spot checked a handful of issue / online dates and licenses, and saw that Crossref's dates are always more accurate than ours when they differ + - I also filled in some missing volumes, issues, ISSNs, and extents + - This results in 14,000 changes to existing items, which will take several days to import unfortunately + - After eight hours the first file is only about 2/3 finished... sigh + diff --git a/docs/2023-09/index.html b/docs/2023-09/index.html index 097171892..6a557e7e0 100644 --- a/docs/2023-09/index.html +++ b/docs/2023-09/index.html @@ -15,7 +15,7 @@ Start a harvest on AReS - + @@ -36,9 +36,9 @@ Start a harvest on AReS "@type": "BlogPosting", "headline": "September, 2023", "url": "https://alanorth.github.io/cgspace-notes/2023-09/", - "wordCount": "54", + "wordCount": "341", "datePublished": "2023-09-02T17:29:36+03:00", - "dateModified": "2023-09-02T17:37:15+03:00", + "dateModified": "2023-09-04T09:16:51+03:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -126,6 +126,51 @@ Start a harvest on AReS +

2023-09-06

+ +

2023-09-07

+ +
$ csvgrep -c 'cg.identifier.doi[en_US]' -r 'doi.org' ~/Downloads/2023-09-07-cgspace.csv \
+    | csvcut -c 'id,dc.title[en_US],dcterms.issued[en_US],dcterms.available[en_US],cg.issn[en_US],cg.isbn[en_US],cg.volume[en_US],cg.issue[en_US],cg.number[en_US],dcterms.extent[en_US],cg.identifier.doi[en_US],cg.reviewStatus[en_US],cg.isijournal[en_US],dcterms.license[en_US],dcterms.accessRights[en_US],dcterms.type[en_US],dc.identifier.uri[en_US]' \
+    > /tmp/2023-09-07-cgspace-dois.csv
+$ csvgrep -c 'cg.identifier.doi[en_US]' -r 'doi.org' ~/Downloads/2023-09-07-cgspace.csv | csvcut -c 'cg.identifier.doi[en_US]' | sed 1d > /tmp/2023-09-07-cgspace-dois.txt
+
+
$ ./ilri/crossref_doi_lookup.py -i /tmp/2023-09-07-cgspace-dois.txt -o /tmp/2023-09-07-cgspace-dois-results.csv -e a.orth@cgiar.org
+
+
$ gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/screen -dNOPAUSE -dQUIET -dBATCH -sOutputFile=primer-screen.pdf Primer\ \(digital\)_Climate-\ smart\ and\ regenerative\ agriculture\ in\ climate\ change\ adaptation.pdf 
+$ gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook -dNOPAUSE -dQUIET -dBATCH -sOutputFile=primer-ebook.pdf Primer\ \(digital\)_Climate-\ smart\ and\ regenerative\ agriculture\ in\ climate\ change\ adaptation.pdf
+$ gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/prepress -dNOPAUSE -dQUIET -dBATCH -sOutputFile=primer-prepress.pdf Primer\ \(digital\)_Climate-\ smart\ and\ regenerative\ agriculture\ in\ climate\ change\ adaptation.pdf
+
+

2023-09-08

+ diff --git a/docs/categories/index.html b/docs/categories/index.html index 02948449e..0fb8e64c9 100644 --- a/docs/categories/index.html +++ b/docs/categories/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/index.html b/docs/categories/notes/index.html index 5e221858a..e8e9b8f2d 100644 --- a/docs/categories/notes/index.html +++ b/docs/categories/notes/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/2/index.html b/docs/categories/notes/page/2/index.html index 70c91acd5..7504a2ca6 100644 --- a/docs/categories/notes/page/2/index.html +++ b/docs/categories/notes/page/2/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/3/index.html b/docs/categories/notes/page/3/index.html index 2f6367ad4..8e07c3dbe 100644 --- a/docs/categories/notes/page/3/index.html +++ b/docs/categories/notes/page/3/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/4/index.html b/docs/categories/notes/page/4/index.html index c91cffa2b..87be774d6 100644 --- a/docs/categories/notes/page/4/index.html +++ b/docs/categories/notes/page/4/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/5/index.html b/docs/categories/notes/page/5/index.html index 369688d85..429e2c6e5 100644 --- a/docs/categories/notes/page/5/index.html +++ b/docs/categories/notes/page/5/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/6/index.html b/docs/categories/notes/page/6/index.html index 19f2fb668..68fe49fa5 100644 --- a/docs/categories/notes/page/6/index.html +++ b/docs/categories/notes/page/6/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/7/index.html b/docs/categories/notes/page/7/index.html index 134d75295..98e30d420 100644 --- a/docs/categories/notes/page/7/index.html +++ b/docs/categories/notes/page/7/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/8/index.html b/docs/categories/notes/page/8/index.html index 511c633bb..59230e31d 100644 --- a/docs/categories/notes/page/8/index.html +++ b/docs/categories/notes/page/8/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/index.html b/docs/index.html index 2b2de2a37..ef961900e 100644 --- a/docs/index.html +++ b/docs/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/10/index.html b/docs/page/10/index.html index f566556f7..6e4c32db9 100644 --- a/docs/page/10/index.html +++ b/docs/page/10/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/2/index.html b/docs/page/2/index.html index eb5470e0b..4720aefec 100644 --- a/docs/page/2/index.html +++ b/docs/page/2/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/3/index.html b/docs/page/3/index.html index a259b7c12..4e8d53d01 100644 --- a/docs/page/3/index.html +++ b/docs/page/3/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/4/index.html b/docs/page/4/index.html index 10818c24d..5f24f1327 100644 --- a/docs/page/4/index.html +++ b/docs/page/4/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/5/index.html b/docs/page/5/index.html index 0169b78b7..b4e8036f8 100644 --- a/docs/page/5/index.html +++ b/docs/page/5/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/6/index.html b/docs/page/6/index.html index dc7bdb819..32eac7c3c 100644 --- a/docs/page/6/index.html +++ b/docs/page/6/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/7/index.html b/docs/page/7/index.html index 490d5b5eb..472d410e1 100644 --- a/docs/page/7/index.html +++ b/docs/page/7/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/8/index.html b/docs/page/8/index.html index 08e5767c2..59ac5dd19 100644 --- a/docs/page/8/index.html +++ b/docs/page/8/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/9/index.html b/docs/page/9/index.html index 1481da10d..9c2a024f8 100644 --- a/docs/page/9/index.html +++ b/docs/page/9/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/index.html b/docs/posts/index.html index dd39cd80f..7bd97e9d3 100644 --- a/docs/posts/index.html +++ b/docs/posts/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/10/index.html b/docs/posts/page/10/index.html index e43056ecf..dea3e8c67 100644 --- a/docs/posts/page/10/index.html +++ b/docs/posts/page/10/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/2/index.html b/docs/posts/page/2/index.html index abd27702d..235c04003 100644 --- a/docs/posts/page/2/index.html +++ b/docs/posts/page/2/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/3/index.html b/docs/posts/page/3/index.html index 7fe05b982..647df5b05 100644 --- a/docs/posts/page/3/index.html +++ b/docs/posts/page/3/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/4/index.html b/docs/posts/page/4/index.html index 567d17720..07ccdd081 100644 --- a/docs/posts/page/4/index.html +++ b/docs/posts/page/4/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/5/index.html b/docs/posts/page/5/index.html index 5519ba01b..676145454 100644 --- a/docs/posts/page/5/index.html +++ b/docs/posts/page/5/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/6/index.html b/docs/posts/page/6/index.html index f771c5999..0c0c645ac 100644 --- a/docs/posts/page/6/index.html +++ b/docs/posts/page/6/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/7/index.html b/docs/posts/page/7/index.html index fab5cee92..597ac748d 100644 --- a/docs/posts/page/7/index.html +++ b/docs/posts/page/7/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/8/index.html b/docs/posts/page/8/index.html index 6bfaff1fb..94b7e9160 100644 --- a/docs/posts/page/8/index.html +++ b/docs/posts/page/8/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/9/index.html b/docs/posts/page/9/index.html index 85d1f2b19..3cc6e6c94 100644 --- a/docs/posts/page/9/index.html +++ b/docs/posts/page/9/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/sitemap.xml b/docs/sitemap.xml index 6d769630b..b043db62b 100644 --- a/docs/sitemap.xml +++ b/docs/sitemap.xml @@ -3,19 +3,19 @@ xmlns:xhtml="http://www.w3.org/1999/xhtml"> https://alanorth.github.io/cgspace-notes/categories/ - 2023-09-02T17:37:15+03:00 + 2023-09-04T09:16:51+03:00 https://alanorth.github.io/cgspace-notes/ - 2023-09-02T17:37:15+03:00 + 2023-09-04T09:16:51+03:00 https://alanorth.github.io/cgspace-notes/categories/notes/ - 2023-09-02T17:37:15+03:00 + 2023-09-04T09:16:51+03:00 https://alanorth.github.io/cgspace-notes/posts/ - 2023-09-02T17:37:15+03:00 + 2023-09-04T09:16:51+03:00 https://alanorth.github.io/cgspace-notes/2023-09/ - 2023-09-02T17:37:15+03:00 + 2023-09-04T09:16:51+03:00 https://alanorth.github.io/cgspace-notes/2023-08/ 2023-09-01T08:10:02+03:00