Add notes for 2022-12-08

Alan Orth 2022-12-08 18:59:57 +02:00
parent 4200ae4189
commit 1bafe6ce71
Signed by: alanorth
GPG Key ID: 0FB860CC9C45B1B9
29 changed files with 107 additions and 34 deletions

View File

@ -88,5 +88,40 @@ $ csvgrep -c matched -m true /tmp/cgspace-matches.csv | wc -l
- This means I've added a few thousand UN M.49 regions to the `cg.coverage.subregion` field in the last few days
- I had to extract them from CGSpace and delete them using `delete-metadata-values.py`
- My [DSpace 7.x pull request to tell ImageMagick about the PDF CropBox](https://github.com/DSpace/DSpace/pull/8550) was merged
- Start a harvest on AReS
## 2022-12-08
- While on the plane I decided to fix some ORCID identifiers, as I had seen some poorly formatted ones
- I couldn't remember the XPath syntax so this was a bit of a hack:
```console
$ xmllint --xpath '//node/isComposedBy/node()' dspace/config/controlled-vocabularies/cg-creator-identifier.xml | grep -oE 'label=".*"' | sed -e 's/label="//' -e 's/"$//' > /tmp/orcid-names.txt
$ ./ilri/update-orcids.py -i /tmp/orcid-names.txt -db dspace -u dspace -p 'fuuu' -m 247
```
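The label extraction above can also be sketched without the `grep` and `sed` post-processing. This is a minimal sketch, assuming the vocabulary file has the `node` / `isComposedBy` / `node` layout implied by the XPath above; the sample XML, the names in it, and the `extract_labels` helper are hypothetical and only for illustration:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample that mimics the layout implied by the XPath above;
# the real file is cg-creator-identifier.xml and these names are made up.
SAMPLE = """<node id="Creator Identifier" label="Creator Identifier">
  <isComposedBy>
    <node id="Jane Doe: 0000-0001-2345-6789" label="Jane Doe: 0000-0001-2345-6789"/>
    <node id="John Roe: 0000-0002-3456-789X" label="John Roe: 0000-0002-3456-789X"/>
  </isComposedBy>
</node>"""


def extract_labels(xml_text):
    """Return the label attribute of every node nested under isComposedBy."""
    root = ET.fromstring(xml_text)
    # Read the attribute directly instead of matching label="..." as text
    return [
        node.get("label")
        for node in root.findall(".//isComposedBy/node")
        if node.get("label")
    ]


labels = extract_labels(SAMPLE)
print("\n".join(labels))
```

Reading the attribute through the parser avoids the greedy `label=".*"` match, which could over-capture if another quoted attribute ever followed `label` on the same line.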
- After that there were still some poorly formatted ones that my script didn't fix, so perhaps these are new ones not in our list
  - I dumped them and combined them with the existing ones to resolve later:
```console
localhost/dspace= ☘ \COPY (SELECT dspace_object_id,text_value FROM metadatavalue WHERE metadata_field_id=247 AND text_value LIKE '%http%') to /tmp/orcid-formatting.txt;
COPY 36
```
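To illustrate what "poorly formatted" means for the values dumped above, here is a rough sketch that flags values containing a URL and pulls out the bare identifier. The example values and the `has_url` and `bare_orcid` helpers are assumptions for illustration, not what `update-orcids.py` does:

```python
import re

# ORCID iDs are four dash-separated groups of four characters;
# the final character may be the checksum letter X.
ORCID_RE = re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]")


def has_url(value):
    """Mirror the SQL filter above: text_value LIKE '%http%'."""
    return "http" in value


def bare_orcid(value):
    """Return the bare identifier found in a metadata value, or None."""
    match = ORCID_RE.search(value)
    return match.group(0) if match else None


# Hypothetical metadata values, not taken from CGSpace
values = [
    "Jane Doe: https://orcid.org/0000-0001-2345-6789",
    "John Roe: 0000-0002-3456-789X",
]
for value in values:
    print(has_url(value), bare_orcid(value))
```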
- I think there are really just some new ones...
```console
$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-identifier.xml /tmp/orcid-formatting.txt| grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort -u > /tmp/2022-12-08-orcids.txt
$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-identifier.xml | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort -u | wc -l
1907
$ wc -l /tmp/2022-12-08-orcids.txt
1939 /tmp/2022-12-08-orcids.txt
```
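The counts above differ by 32 (1939 - 1907), which is the number of identifiers in the dump that are missing from the vocabulary. A minimal sketch of listing them as a set difference, reusing the same pattern as the `grep` commands; the `new_identifiers` helper and the sample strings are hypothetical:

```python
import re

# Same pattern as the grep -oE commands above
ORCID_RE = re.compile(r"[A-Z0-9]{4}(?:-[A-Z0-9]{4}){3}")


def new_identifiers(known_text, dumped_text):
    """Return identifiers that appear in the dump but not in the known list."""
    known = set(ORCID_RE.findall(known_text))
    dumped = set(ORCID_RE.findall(dumped_text))
    return sorted(dumped - known)


# Hypothetical stand-ins for the vocabulary XML and the \COPY dump
known_text = 'label="Jane Doe: 0000-0001-2345-6789"'
dumped_text = (
    "Jane Doe: https://orcid.org/0000-0001-2345-6789\n"
    "John Roe: https://orcid.org/0000-0002-3456-789X\n"
)
print(new_identifiers(known_text, dumped_text))
```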
- Then I applied these updates on CGSpace
- Maria mentioned that she was getting a lot more items in her daily subscription emails
  - I had a hunch it was related to me updating the `last_modified` timestamp after updating a bunch of countries, regions, etc. in items
- Then today I noticed this option in `dspace.cfg`: `eperson.subscription.onlynew`
- By default DSpace sends notifications for modified items too! I've disabled it now...
<!-- vim: set sw=2 ts=2: -->

View File

@ -20,7 +20,7 @@ Replace &ldquo;East Asia&rdquo; with &ldquo;Eastern Asia&rdquo; region on CGSpac
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2022-12/" />
<meta property="article:published_time" content="2022-12-01T08:52:36+03:00" />
<meta property="article:modified_time" content="2022-12-04T03:19:49+03:00" />
<meta property="article:modified_time" content="2022-12-07T22:59:37+01:00" />
@ -46,9 +46,9 @@ Replace &ldquo;East Asia&rdquo; with &ldquo;Eastern Asia&rdquo; region on CGSpac
"@type": "BlogPosting",
"headline": "December, 2022",
"url": "https://alanorth.github.io/cgspace-notes/2022-12/",
"wordCount": "617",
"wordCount": "843",
"datePublished": "2022-12-01T08:52:36+03:00",
"dateModified": "2022-12-04T03:19:49+03:00",
"dateModified": "2022-12-07T22:59:37+01:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@ -209,6 +209,44 @@ Replace &ldquo;East Asia&rdquo; with &ldquo;Eastern Asia&rdquo; region on CGSpac
</ul>
</li>
<li>My <a href="https://github.com/DSpace/DSpace/pull/8550">DSpace 7.x pull request to tell ImageMagick about the PDF CropBox</a> was merged</li>
<li>Start a harvest on AReS</li>
</ul>
<h2 id="2022-12-08">2022-12-08</h2>
<ul>
<li>While on the plane I decided to fix some ORCID identifiers, as I had seen some poorly formatted ones
<ul>
<li>I couldn&rsquo;t remember the XPath syntax so this was a bit of a hack:</li>
</ul>
</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ xmllint --xpath <span style="color:#e6db74">&#39;//node/isComposedBy/node()&#39;</span> dspace/config/controlled-vocabularies/cg-creator-identifier.xml | grep -oE <span style="color:#e6db74">&#39;label=&#34;.*&#34;&#39;</span> | sed -e <span style="color:#e6db74">&#39;s/label=&#34;//&#39;</span> -e <span style="color:#e6db74">&#39;s/&#34;$//&#39;</span> &gt; /tmp/orcid-names.txt
</span></span><span style="display:flex;"><span>$ ./ilri/update-orcids.py -i /tmp/orcid-names.txt -db dspace -u dspace -p <span style="color:#e6db74">&#39;fuuu&#39;</span> -m <span style="color:#ae81ff">247</span>
</span></span></code></pre></div><ul>
<li>After that there were still some poorly formatted ones that my script didn&rsquo;t fix, so perhaps these are new ones not in our list
<ul>
<li>I dumped them and combined them with the existing ones to resolve later:</li>
</ul>
</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>localhost/dspace= ☘ \COPY (SELECT dspace_object_id,text_value FROM metadatavalue WHERE metadata_field_id=247 AND text_value LIKE &#39;%http%&#39;) to /tmp/orcid-formatting.txt;
</span></span><span style="display:flex;"><span>COPY 36
</span></span></code></pre></div><ul>
<li>I think there are really just some new ones&hellip;</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-identifier.xml /tmp/orcid-formatting.txt| grep -oE <span style="color:#e6db74">&#39;[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}&#39;</span> | sort -u &gt; /tmp/2022-12-08-orcids.txt
</span></span><span style="display:flex;"><span>$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-identifier.xml | grep -oE <span style="color:#e6db74">&#39;[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}&#39;</span> | sort -u | wc -l
</span></span><span style="display:flex;"><span>1907
</span></span><span style="display:flex;"><span>$ wc -l /tmp/2022-12-08-orcids.txt
</span></span><span style="display:flex;"><span>1939 /tmp/2022-12-08-orcids.txt
</span></span></code></pre></div><ul>
<li>Then I applied these updates on CGSpace</li>
<li>Maria mentioned that she was getting a lot more items in her daily subscription emails
<ul>
<li>I had a hunch it was related to me updating the <code>last_modified</code> timestamp after updating a bunch of countries, regions, etc. in items</li>
<li>Then today I noticed this option in <code>dspace.cfg</code>: <code>eperson.subscription.onlynew</code></li>
<li>By default DSpace sends notifications for modified items too! I&rsquo;ve disabled it now&hellip;</li>
</ul>
</li>
</ul>
<!-- raw HTML omitted -->

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/" />
<meta property="og:updated_time" content="2022-12-04T03:19:49+03:00" />
<meta property="og:updated_time" content="2022-12-07T22:59:37+01:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2022-12-04T03:19:49+03:00" />
<meta property="og:updated_time" content="2022-12-07T22:59:37+01:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2022-12-04T03:19:49+03:00" />
<meta property="og:updated_time" content="2022-12-07T22:59:37+01:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2022-12-04T03:19:49+03:00" />
<meta property="og:updated_time" content="2022-12-07T22:59:37+01:00" />

View File

@ -3,19 +3,19 @@
xmlns:xhtml="http://www.w3.org/1999/xhtml">
<url>
<loc>https://alanorth.github.io/cgspace-notes/categories/</loc>
<lastmod>2022-12-04T03:19:49+03:00</lastmod>
<lastmod>2022-12-07T22:59:37+01:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/</loc>
<lastmod>2022-12-04T03:19:49+03:00</lastmod>
<lastmod>2022-12-07T22:59:37+01:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/2022-12/</loc>
<lastmod>2022-12-04T03:19:49+03:00</lastmod>
<lastmod>2022-12-07T22:59:37+01:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/categories/notes/</loc>
<lastmod>2022-12-04T03:19:49+03:00</lastmod>
<lastmod>2022-12-07T22:59:37+01:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/posts/</loc>
<lastmod>2022-12-04T03:19:49+03:00</lastmod>
<lastmod>2022-12-07T22:59:37+01:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/2022-11/</loc>
<lastmod>2022-12-03T10:46:29+03:00</lastmod>