diff --git a/content/posts/2023-03.md b/content/posts/2023-03.md
index 41f4c17d5..5d76ffec9 100644
--- a/content/posts/2023-03.md
+++ b/content/posts/2023-03.md
@@ -202,6 +202,7 @@ $ ls -lh 10568-126388-*
 - Seems like 600px is 3 to 4 times larger file size, so maybe we should shoot for 400px or 500px
   - I decided on 500px
+  - I started re-generating new thumbnails for the ILRI Publications, CGIAR Initiatives, and other collections
 - On that note, I also re-worked the XMLUI item display to show larger thumbnails (from a max-width of 128px to 200px)
 - And now that I'm looking at thumbnails I am curious what it would take to get DSpace to generate WebP or AVIF thumbnails
 - Peter sent me citations and ILRI subjects for the 350 new ILRI publications
@@ -209,4 +210,34 @@ $ ls -lh 10568-126388-*
 - I merged Peter's citations and subjects with the other metadata, ran one last duplicate check (and found one item!), then ran the items through csv-metadata-quality and uploaded them to CGSpace
   - In the end it was only 348 items for some reason...
+## 2023-03-12
+
+- Start a harvest on AReS
+
+## 2023-03-13
+
+- Extract a list of DOIs from the Creative Commons licensed ILRI journal articles that I uploaded last week, skipping any that are "no derivatives" (ND):
+
+```console
+$ csvgrep -c 'dc.description.provenance[en]' -m 'Made available in DSpace on 2023-03-10' /tmp/ilri-articles.csv \
+    | csvgrep -c 'dcterms.license[en_US]' -r 'CC(0|\-BY)' \
+    | csvgrep -c 'dcterms.license[en_US]' -i -r '\-ND\-' \
+    | csvcut -c 'id,cg.identifier.doi[en_US],dcterms.type[en_US]' > 2023-03-13-journal-articles.csv
+```
+
+- I want to write a script to download the PDFs and create thumbnails for them, then upload to CGSpace
+  - I wrote one based on `post_ciat_pdfs.py` but it seems there is an issue uploading anything other than a PDF
+  - When I upload a JPG or a PNG the file begins with:
+
+```console
+Content-Disposition: form-data; name="file"; filename="10.1017-s0031182013001625.pdf.jpg"
+```
+
+- ... this means it is invalid...
+  - I tried in both the `ORIGINAL` and `THUMBNAIL` bundles, and with different filenames
+  - I tried manually on the command line with `http` and both PDF and PNG work... hmmmm
+  - Hmm, this seems to have been due to some difference in behavior between the `files` and `data` parameters of `requests.post()`
+  - I finalized the `post_bitstreams.py` script and uploaded eighty-five PDF thumbnails
+- It seems Bizu uploaded covers for a handful so I deleted them and ran them through the script to get proper thumbnails
+
diff --git a/docs/2023-03/index.html b/docs/2023-03/index.html
index 0d0f5d3e2..abcf095bc 100644
--- a/docs/2023-03/index.html
+++ b/docs/2023-03/index.html
@@ -16,7 +16,7 @@ I finally got through with porting the input form from DSpace 6 to DSpace 7
-
+
@@ -38,9 +38,9 @@ I finally got through with porting the input form from DSpace 6 to DSpace 7
   "@type": "BlogPosting",
   "headline": "March, 2023",
   "url": "https://alanorth.github.io/cgspace-notes/2023-03/",
-  "wordCount": "1692",
+  "wordCount": "1911",
   "datePublished": "2023-03-01T07:58:36+03:00",
-  "dateModified": "2023-03-09T17:01:50+03:00",
+  "dateModified": "2023-03-10T17:34:05+03:00",
   "author": {
     "@type": "Person",
     "name": "Alan Orth"
@@ -341,6 +341,7 @@ pd.options.mode.nullable_dtypes = True
  • Seems like 600px is 3 to 4 times larger file size, so maybe we should shoot for 400px or 500px
  • On that note, I also re-worked the XMLUI item display to show larger thumbnails (from a max-width of 128px to 200px)
  • @@ -353,6 +354,38 @@ pd.options.mode.nullable_dtypes = True +

    2023-03-12

    + +

    2023-03-13

    + +
    $ csvgrep -c 'dc.description.provenance[en]' -m 'Made available in DSpace on 2023-03-10' /tmp/ilri-articles.csv \
    +    | csvgrep -c 'dcterms.license[en_US]' -r 'CC(0|\-BY)' \
    +    | csvgrep -c 'dcterms.license[en_US]' -i -r '\-ND\-' \
    +    | csvcut -c 'id,cg.identifier.doi[en_US],dcterms.type[en_US]' > 2023-03-13-journal-articles.csv
    +
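For anyone who wants the same selection inside a script rather than a shell pipeline, the `csvgrep` and `csvcut` steps above can be approximated with Python's standard `csv` module. This is only a rough sketch, not the method used in these notes: the function name `filter_articles` and the sample rows (including the DOIs) are made up for illustration, and the match semantics are assumed to be "substring" for `-m` and "regex search" for `-r`.

```python
import csv
import io
import re

PROVENANCE = "dc.description.provenance[en]"
LICENSE = "dcterms.license[en_US]"
KEEP = ["id", "cg.identifier.doi[en_US]", "dcterms.type[en_US]"]

CC_RE = re.compile(r"CC(0|-BY)")  # CC0 or any CC-BY variant
ND_RE = re.compile(r"-ND-")       # "no derivatives" licenses


def filter_articles(rows, date="2023-03-10"):
    """Yield the id, DOI, and type of CC licensed rows deposited on `date`, skipping ND."""
    needle = f"Made available in DSpace on {date}"
    for row in rows:
        license_ = row.get(LICENSE) or ""
        if needle not in (row.get(PROVENANCE) or ""):
            continue  # not deposited on the day of interest
        if not CC_RE.search(license_):
            continue  # not Creative Commons
        if ND_RE.search(license_):
            continue  # same effect as the inverted csvgrep match
        yield {column: row.get(column, "") for column in KEEP}


# Made-up sample rows, for illustration only
SAMPLE = """id,cg.identifier.doi[en_US],dcterms.type[en_US],dcterms.license[en_US],dc.description.provenance[en]
1,https://doi.org/10.0000/a,Journal Article,CC-BY-4.0,Made available in DSpace on 2023-03-10T10:00:00Z
2,https://doi.org/10.0000/b,Journal Article,CC-BY-NC-ND-4.0,Made available in DSpace on 2023-03-10T10:05:00Z
3,https://doi.org/10.0000/c,Journal Article,CC-BY-4.0,Made available in DSpace on 2023-02-01T09:00:00Z
4,https://doi.org/10.0000/d,Journal Article,Copyrighted; all rights reserved,Made available in DSpace on 2023-03-10T11:00:00Z
"""

selected = list(filter_articles(csv.DictReader(io.StringIO(SAMPLE))))
print([row["id"] for row in selected])
```

With the sample above only the first row survives: the second is ND, the third was deposited on another day, and the fourth is not Creative Commons.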
    +
    Content-Disposition: form-data; name="file"; filename="10.1017-s0031182013001625.pdf.jpg"
    +
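A stored file that starts with a `Content-Disposition` header suggests the multipart envelope itself was saved as the bitstream. In `requests`, the `files` parameter encodes the payload as `multipart/form-data`, while raw bytes passed as `data` are sent as the body unchanged. The sketch below uses only the standard library to show why the first kind of body is not a valid image when stored verbatim. The `encode_multipart` helper is hypothetical and hand-rolled for illustration, the JPEG bytes are fake, and whether the server really stores the body verbatim is an assumption, not something the notes confirm.

```python
import uuid

JPEG_MAGIC = b"\xff\xd8\xff"  # every JPEG file starts with these bytes


def encode_multipart(field, filename, payload, content_type):
    """Build a single-part multipart/form-data body (hypothetical helper)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + payload + tail


# Fake thumbnail: the real magic number followed by filler bytes
thumbnail = JPEG_MAGIC + b"\xe0" + b"\x00" * 16

raw_body = thumbnail  # what a raw upload would put on the wire
multipart_body = encode_multipart(
    "file", "10.1017-s0031182013001625.pdf.jpg", thumbnail, "image/jpeg"
)

print(raw_body.startswith(JPEG_MAGIC))        # raw body is still a JPEG
print(multipart_body.startswith(JPEG_MAGIC))  # envelope hides the magic number
print(b"Content-Disposition: form-data" in multipart_body[:200])
```

If that reading is right, sending the image bytes as the raw request body is what keeps the stored bitstream valid.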
    diff --git a/docs/categories/index.html b/docs/categories/index.html index fbef998ef..dc58c1c9c 100644 --- a/docs/categories/index.html +++ b/docs/categories/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/index.html b/docs/categories/notes/index.html index 7a0537c07..be191f98e 100644 --- a/docs/categories/notes/index.html +++ b/docs/categories/notes/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/2/index.html b/docs/categories/notes/page/2/index.html index 63430bcfd..f838eb12c 100644 --- a/docs/categories/notes/page/2/index.html +++ b/docs/categories/notes/page/2/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/3/index.html b/docs/categories/notes/page/3/index.html index 91886e75d..28f1b3ff5 100644 --- a/docs/categories/notes/page/3/index.html +++ b/docs/categories/notes/page/3/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/4/index.html b/docs/categories/notes/page/4/index.html index 3647c1708..b492cd035 100644 --- a/docs/categories/notes/page/4/index.html +++ b/docs/categories/notes/page/4/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/5/index.html b/docs/categories/notes/page/5/index.html index 8c7af35a6..365941a23 100644 --- a/docs/categories/notes/page/5/index.html +++ b/docs/categories/notes/page/5/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/6/index.html b/docs/categories/notes/page/6/index.html index 55291aca0..1e8650e51 100644 --- a/docs/categories/notes/page/6/index.html +++ b/docs/categories/notes/page/6/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/7/index.html b/docs/categories/notes/page/7/index.html index 869424dec..339468c79 100644 --- a/docs/categories/notes/page/7/index.html +++ b/docs/categories/notes/page/7/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/index.html b/docs/index.html index 43ee6e3a6..6174b7587 100644 --- a/docs/index.html +++ b/docs/index.html @@ 
-10,7 +10,7 @@ - + diff --git a/docs/page/10/index.html b/docs/page/10/index.html index 180dc7683..9aaffdb0f 100644 --- a/docs/page/10/index.html +++ b/docs/page/10/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/2/index.html b/docs/page/2/index.html index 71bef9fe1..a8eeb70b5 100644 --- a/docs/page/2/index.html +++ b/docs/page/2/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/3/index.html b/docs/page/3/index.html index 70f6b9730..6e111f191 100644 --- a/docs/page/3/index.html +++ b/docs/page/3/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/4/index.html b/docs/page/4/index.html index ddbc241bc..ce03c7f17 100644 --- a/docs/page/4/index.html +++ b/docs/page/4/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/5/index.html b/docs/page/5/index.html index 56ae57335..0b2b46ce4 100644 --- a/docs/page/5/index.html +++ b/docs/page/5/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/6/index.html b/docs/page/6/index.html index 448e70928..cc7f3436c 100644 --- a/docs/page/6/index.html +++ b/docs/page/6/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/7/index.html b/docs/page/7/index.html index edf425e9b..bcc393617 100644 --- a/docs/page/7/index.html +++ b/docs/page/7/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/8/index.html b/docs/page/8/index.html index a6e8321ab..42730a04c 100644 --- a/docs/page/8/index.html +++ b/docs/page/8/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/9/index.html b/docs/page/9/index.html index 409fb6e1c..567b7c736 100644 --- a/docs/page/9/index.html +++ b/docs/page/9/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/index.html b/docs/posts/index.html index 99ef665d4..0e76b19b4 100644 --- a/docs/posts/index.html +++ b/docs/posts/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/10/index.html b/docs/posts/page/10/index.html index 5e5ca8580..4abf29ba5 100644 --- a/docs/posts/page/10/index.html +++ b/docs/posts/page/10/index.html @@ -10,7 +10,7 @@ - + diff --git 
a/docs/posts/page/2/index.html b/docs/posts/page/2/index.html index 438fc9fbd..1aa3c2b6a 100644 --- a/docs/posts/page/2/index.html +++ b/docs/posts/page/2/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/3/index.html b/docs/posts/page/3/index.html index 9b8917e6e..ce7c547b2 100644 --- a/docs/posts/page/3/index.html +++ b/docs/posts/page/3/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/4/index.html b/docs/posts/page/4/index.html index 1eb6fffe5..2147e96d5 100644 --- a/docs/posts/page/4/index.html +++ b/docs/posts/page/4/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/5/index.html b/docs/posts/page/5/index.html index fb16b7c99..37b55601d 100644 --- a/docs/posts/page/5/index.html +++ b/docs/posts/page/5/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/6/index.html b/docs/posts/page/6/index.html index 32d9cd489..74bf97cbd 100644 --- a/docs/posts/page/6/index.html +++ b/docs/posts/page/6/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/7/index.html b/docs/posts/page/7/index.html index 7ed513435..26fd20379 100644 --- a/docs/posts/page/7/index.html +++ b/docs/posts/page/7/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/8/index.html b/docs/posts/page/8/index.html index 64fbc54dc..5b1112141 100644 --- a/docs/posts/page/8/index.html +++ b/docs/posts/page/8/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/9/index.html b/docs/posts/page/9/index.html index e04e75506..4d841970a 100644 --- a/docs/posts/page/9/index.html +++ b/docs/posts/page/9/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/sitemap.xml b/docs/sitemap.xml index c7ddb7805..543673bfb 100644 --- a/docs/sitemap.xml +++ b/docs/sitemap.xml @@ -3,19 +3,19 @@ xmlns:xhtml="http://www.w3.org/1999/xhtml"> https://alanorth.github.io/cgspace-notes/categories/ - 2023-03-09T17:01:50+03:00 + 2023-03-10T17:34:05+03:00 https://alanorth.github.io/cgspace-notes/ - 2023-03-09T17:01:50+03:00 + 2023-03-10T17:34:05+03:00 
https://alanorth.github.io/cgspace-notes/2023-03/ - 2023-03-09T17:01:50+03:00 + 2023-03-10T17:34:05+03:00 https://alanorth.github.io/cgspace-notes/categories/notes/ - 2023-03-09T17:01:50+03:00 + 2023-03-10T17:34:05+03:00 https://alanorth.github.io/cgspace-notes/posts/ - 2023-03-09T17:01:50+03:00 + 2023-03-10T17:34:05+03:00 https://alanorth.github.io/cgspace-notes/2023-02/ 2023-03-01T08:30:25+03:00