Add notes

Alan Orth 2023-03-13 21:22:25 +03:00
parent 345cd4365b
commit 40fe625083
Signed by: alanorth
GPG Key ID: 0FB860CC9C45B1B9
31 changed files with 100 additions and 36 deletions

View File

@ -202,6 +202,7 @@ $ ls -lh 10568-126388-*
- Seems like 600px is 3 to 4 times larger file size, so maybe we should shoot for 400px or 500px
- I decided on 500px
- I started re-generating new thumbnails for the ILRI Publications, CGIAR Initiatives, and other collections
- On that note, I also re-worked the XMLUI item display to show larger thumbnails (from a max-width of 128px to 200px)
- And now that I'm looking at thumbnails I am curious what it would take to get DSpace to generate WebP or AVIF thumbnails
- Peter sent me citations and ILRI subjects for the 350 new ILRI publications
@ -209,4 +210,34 @@ $ ls -lh 10568-126388-*
- I merged Peter's citations and subjects with the other metadata, ran one last duplicate check (and found one item!), then ran the items through csv-metadata-quality and uploaded them to CGSpace
- In the end it was only 348 items for some reason...
## 2023-03-12
- Start a harvest on AReS
## 2023-03-13
- Extract a list of DOIs from the Creative Commons licensed ILRI journal articles that I uploaded last week, skipping any that are "no derivatives" (ND):
```console
$ csvgrep -c 'dc.description.provenance[en]' -m 'Made available in DSpace on 2023-03-10' /tmp/ilri-articles.csv \
| csvgrep -c 'dcterms.license[en_US]' -r 'CC(0|\-BY)' \
| csvgrep -c 'dcterms.license[en_US]' -i -r '\-ND\-' \
| csvcut -c 'id,cg.identifier.doi[en_US],dcterms.type[en_US]' > 2023-03-13-journal-articles.csv
```
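The same filtering can be sketched in Python with only the standard library, which may be handy if the list feeds directly into an upload script. This is a minimal sketch, not the script used here: the function name `filter_articles`, the sample rows, and the license and DOI values are assumptions for illustration, and it assumes `csvgrep -r` behaves like a regular expression search anywhere in the field.

```python
import re

PROVENANCE = "dc.description.provenance[en]"
LICENSE = "dcterms.license[en_US]"
KEEP = ["id", "cg.identifier.doi[en_US]", "dcterms.type[en_US]"]


def filter_articles(rows, date="2023-03-10"):
    """Yield trimmed rows for CC0/CC-BY items from one day, skipping ND licenses."""
    needle = f"Made available in DSpace on {date}"
    creative_commons = re.compile(r"CC(0|-BY)")
    no_derivatives = re.compile(r"-ND-")
    for row in rows:
        # Same role as: csvgrep -m 'Made available in DSpace on ...'
        if needle not in row.get(PROVENANCE, ""):
            continue
        license_value = row.get(LICENSE, "")
        # Keep Creative Commons licenses, then drop "no derivatives" ones
        if not creative_commons.search(license_value):
            continue
        if no_derivatives.search(license_value):
            continue
        # Same role as: csvcut -c 'id,cg.identifier.doi[en_US],dcterms.type[en_US]'
        yield {column: row.get(column, "") for column in KEEP}


# Hypothetical sample rows; real input would come from csv.DictReader
SAMPLE = [
    {"id": "1", PROVENANCE: "Made available in DSpace on 2023-03-10T10:00:00Z",
     LICENSE: "CC-BY-4.0", "cg.identifier.doi[en_US]": "https://doi.org/10.0000/example-1",
     "dcterms.type[en_US]": "Journal Article"},
    {"id": "2", PROVENANCE: "Made available in DSpace on 2023-03-10T10:05:00Z",
     LICENSE: "CC-BY-NC-ND-4.0", "cg.identifier.doi[en_US]": "https://doi.org/10.0000/example-2",
     "dcterms.type[en_US]": "Journal Article"},
    {"id": "3", PROVENANCE: "Made available in DSpace on 2023-03-09T09:00:00Z",
     LICENSE: "CC0-1.0", "cg.identifier.doi[en_US]": "https://doi.org/10.0000/example-3",
     "dcterms.type[en_US]": "Journal Article"},
]

for article in filter_articles(SAMPLE):
    print(article["id"], article["cg.identifier.doi[en_US]"])
```

With real data, `rows` would be a `csv.DictReader` over `/tmp/ilri-articles.csv` and the results could be written back out with `csv.DictWriter`.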
- I want to write a script to download the PDFs and create thumbnails for them, then upload to CGSpace
- I wrote one based on `post_ciat_pdfs.py` but it seems there is an issue uploading anything other than a PDF
- When I upload a JPG or a PNG the file begins with:
```console
Content-Disposition: form-data; name="file"; filename="10.1017-s0031182013001625.pdf.jpg"
```
- ... this means it is invalid...
- I tried in both the `ORIGINAL` and `THUMBNAIL` bundle, and with different filenames
- I tried manually on the command line with `http` and both PDF and PNG work... hmmmm
- Hmm, this seems to have been due to a difference in behavior between the `files` and `data` parameters of `requests.post()`
- I finalized the `post_bitstreams.py` script and uploaded 85 PDF thumbnails
- It seems Bizu uploaded covers for a handful so I deleted them and ran them through the script to get proper thumbnails
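
The `files` versus `data` difference noted above can be seen without sending anything over the network by preparing a request and inspecting its body. This is a minimal sketch, not the code from `post_bitstreams.py`: the URL, the helper name `build_body`, and the fake image bytes are assumptions for illustration. It suggests why an endpoint that stores the raw request body would end up with a file that begins with a `Content-Disposition` header when `files` is used.

```python
import requests

# Hypothetical endpoint, for illustration only; no request is actually sent.
URL = "https://repository.example.org/rest/items/ITEM-UUID/bitstreams"


def build_body(payload: bytes, filename: str, multipart: bool) -> bytes:
    """Return the body that requests would send for this upload."""
    if multipart:
        # files= wraps the payload in a multipart/form-data envelope
        request = requests.Request("POST", URL, files={"file": (filename, payload)})
    else:
        # data= with bytes sends the payload unchanged as the request body
        request = requests.Request("POST", URL, data=payload)
    return request.prepare().body


fake_jpeg = b"\xff\xd8\xff\xe0 not a real image"
raw = build_body(fake_jpeg, "thumbnail.jpg", multipart=False)
wrapped = build_body(fake_jpeg, "thumbnail.jpg", multipart=True)

# The raw body is byte-for-byte the file
print(raw == fake_jpeg)
# The multipart body starts with a boundary line and a part header
print(wrapped.startswith(b"--"))
print(b'Content-Disposition: form-data; name="file"; filename="thumbnail.jpg"' in wrapped)
```

If the server expects the raw bytes, passing the file contents via `data` (and the filename in the query string or another field the API defines) is presumably the safer choice, though that depends on the API being called.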
<!-- vim: set sw=2 ts=2: -->

View File

@ -16,7 +16,7 @@ I finally got through with porting the input form from DSpace 6 to DSpace 7
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2023-03/" />
<meta property="article:published_time" content="2023-03-01T07:58:36+03:00" />
<meta property="article:modified_time" content="2023-03-09T17:01:50+03:00" />
<meta property="article:modified_time" content="2023-03-10T17:34:05+03:00" />
@ -38,9 +38,9 @@ I finally got through with porting the input form from DSpace 6 to DSpace 7
"@type": "BlogPosting",
"headline": "March, 2023",
"url": "https://alanorth.github.io/cgspace-notes/2023-03/",
"wordCount": "1692",
"wordCount": "1911",
"datePublished": "2023-03-01T07:58:36+03:00",
"dateModified": "2023-03-09T17:01:50+03:00",
"dateModified": "2023-03-10T17:34:05+03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@ -341,6 +341,7 @@ pd.options.mode.nullable_dtypes = True
<li>Seems like 600px is 3 to 4 times larger file size, so maybe we should shoot for 400px or 500px
<ul>
<li>I decided on 500px</li>
<li>I started re-generating new thumbnails for the ILRI Publications, CGIAR Initiatives, and other collections</li>
</ul>
</li>
<li>On that note, I also re-worked the XMLUI item display to show larger thumbnails (from a max-width of 128px to 200px)</li>
@ -353,6 +354,38 @@ pd.options.mode.nullable_dtypes = True
</ul>
</li>
</ul>
<h2 id="2023-03-12">2023-03-12</h2>
<ul>
<li>Start a harvest on AReS</li>
</ul>
<h2 id="2023-03-13">2023-03-13</h2>
<ul>
<li>Extract a list of DOIs from the Creative Commons licensed ILRI journal articles that I uploaded last week, skipping any that are &ldquo;no derivatives&rdquo; (ND):</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ csvgrep -c <span style="color:#e6db74">&#39;dc.description.provenance[en]&#39;</span> -m <span style="color:#e6db74">&#39;Made available in DSpace on 2023-03-10&#39;</span> /tmp/ilri-articles.csv <span style="color:#ae81ff">\
</span></span></span><span style="display:flex;"><span><span style="color:#ae81ff"></span> | csvgrep -c &#39;dcterms.license[en_US]&#39; -r &#39;CC(0|\-BY)&#39; \
</span></span><span style="display:flex;"><span> | csvgrep -c &#39;dcterms.license[en_US]&#39; -i -r &#39;\-ND\-&#39; \
</span></span><span style="display:flex;"><span> | csvcut -c &#39;id,cg.identifier.doi[en_US],dcterms.type[en_US]&#39; &gt; 2023-03-13-journal-articles.csv
</span></span></code></pre></div><ul>
<li>I want to write a script to download the PDFs and create thumbnails for them, then upload to CGSpace
<ul>
<li>I wrote one based on <code>post_ciat_pdfs.py</code> but it seems there is an issue uploading anything other than a PDF</li>
<li>When I upload a JPG or a PNG the file begins with:</li>
</ul>
</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>Content-Disposition: form-data; name=&#34;file&#34;; filename=&#34;10.1017-s0031182013001625.pdf.jpg&#34;
</span></span></code></pre></div><ul>
<li>&hellip; this means it is invalid&hellip;
<ul>
<li>I tried in both the <code>ORIGINAL</code> and <code>THUMBNAIL</code> bundle, and with different filenames</li>
<li>I tried manually on the command line with <code>http</code> and both PDF and PNG work&hellip; hmmmm</li>
<li>Hmm, this seems to have been due to a difference in behavior between the <code>files</code> and <code>data</code> parameters of <code>requests.post()</code></li>
<li>I finalized the <code>post_bitstreams.py</code> script and uploaded 85 PDF thumbnails</li>
</ul>
</li>
<li>It seems Bizu uploaded covers for a handful so I deleted them and ran them through the script to get proper thumbnails</li>
</ul>
<!-- raw HTML omitted -->

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/" />
<meta property="og:updated_time" content="2023-03-09T17:01:50+03:00" />
<meta property="og:updated_time" content="2023-03-10T17:34:05+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2023-03-09T17:01:50+03:00" />
<meta property="og:updated_time" content="2023-03-10T17:34:05+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2023-03-09T17:01:50+03:00" />
<meta property="og:updated_time" content="2023-03-10T17:34:05+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2023-03-09T17:01:50+03:00" />
<meta property="og:updated_time" content="2023-03-10T17:34:05+03:00" />

View File

@ -3,19 +3,19 @@
xmlns:xhtml="http://www.w3.org/1999/xhtml">
<url>
<loc>https://alanorth.github.io/cgspace-notes/categories/</loc>
<lastmod>2023-03-09T17:01:50+03:00</lastmod>
<lastmod>2023-03-10T17:34:05+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/</loc>
<lastmod>2023-03-09T17:01:50+03:00</lastmod>
<lastmod>2023-03-10T17:34:05+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/2023-03/</loc>
<lastmod>2023-03-09T17:01:50+03:00</lastmod>
<lastmod>2023-03-10T17:34:05+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/categories/notes/</loc>
<lastmod>2023-03-09T17:01:50+03:00</lastmod>
<lastmod>2023-03-10T17:34:05+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/posts/</loc>
<lastmod>2023-03-09T17:01:50+03:00</lastmod>
<lastmod>2023-03-10T17:34:05+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/2023-02/</loc>
<lastmod>2023-03-01T08:30:25+03:00</lastmod>