Add notes for 2022-11-28

Alan Orth 2022-11-28 17:42:46 +03:00
parent f5750dab39
commit 8199de67ad
Signed by: alanorth
GPG Key ID: 0FB860CC9C45B1B9
29 changed files with 151 additions and 34 deletions

@ -367,4 +367,56 @@ java.lang.IndexOutOfBoundsException: 1-based index out of bounds: 2
- I synced DSpace 7 Test with CGSpace
- I had to follow my notes from 2022-03 to delete the missing Atmire migrations
## 2022-11-28
- Update `ilri/fix-metadata-values.py` to update the `last_modified` date for items when it updates metadata
- This should allow us to use the normal `index-discovery` (without `-b`) and have REST API responses show a correct last modified date
- Maria asked me to add some ORCID identifiers for Alliance staff to the controlled vocabulary
- I also updated the `add-orcid-identifiers-csv.py` to update the `last_modified` timestamp of the item
- I refactored my CGSpace Python scripts to use a helper `util.py` module with common functions
- For now it only has the one for updating an item's `last_modified` timestamp but I will gradually add more
- I also ran our list of ORCID identifiers against ORCID's API to see if anyone changed their name format
- Then I ran them on CGSpace with `ilri/update-orcids.py` to fix them
- Normalize the `text_lang` values for CGSpace metadata again:
```console
localhost/dspacetest= ☘ SELECT DISTINCT text_lang, count(text_lang) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) GROUP BY text_lang ORDER BY count DESC;
text_lang │ count
───────────┼─────────
en_US │ 2912429
│ 108387
en │ 12457
fr │ 2
vi │ 2
es │ 1
␀ │ 0
(7 rows)
Time: 624.651 ms
localhost/dspacetest= ☘ BEGIN;
BEGIN
Time: 0.130 ms
localhost/dspacetest= ☘ UPDATE metadatavalue SET text_lang='en_US' WHERE dspace_object_id IN (SELECT uuid FROM item) AND text_lang IN ('en', '');
UPDATE 120844
Time: 4074.879 ms (00:04.075)
localhost/dspacetest= ☘ SELECT DISTINCT text_lang, count(text_lang) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) GROUP BY text_lang ORDER BY count DESC;
text_lang │ count
───────────┼─────────
en_US │ 3033273
fr │ 2
vi │ 2
es │ 1
␀ │ 0
(5 rows)
Time: 346.913 ms
localhost/dspacetest= ☘ COMMIT;
```
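The two ideas in this entry (normalizing `text_lang` and bumping an item's `last_modified` whenever its metadata changes) could be combined in a script roughly as follows. This is only a sketch under stated assumptions: it uses an in-memory SQLite database as a stand-in for PostgreSQL, a two-column simplification of the `item` and `metadatavalue` tables, and a hypothetical function name `normalize_text_lang`; it is not the code in `util.py`. It also counts with `count(*)` rather than `count(text_lang)`, because `count(column)` skips NULLs, which is why the `␀` row above always reports 0.

```python
import sqlite3


def normalize_text_lang(conn, target="en_US", variants=("en", "")):
    """Rewrite variant text_lang values to `target` for items only, then
    bump last_modified on every item that was touched.

    Hypothetical helper for illustration. With psycopg2 against PostgreSQL
    the placeholders would be %s instead of ?.
    Returns the number of metadata rows changed.
    """
    placeholders = ", ".join("?" for _ in variants)
    cur = conn.cursor()

    # Remember which items are affected before their rows are rewritten
    cur.execute(
        "SELECT DISTINCT dspace_object_id FROM metadatavalue "
        "WHERE dspace_object_id IN (SELECT uuid FROM item) "
        f"AND text_lang IN ({placeholders})",
        tuple(variants),
    )
    item_uuids = [row[0] for row in cur.fetchall()]

    # Same UPDATE as in the psql session, parameterized
    cur.execute(
        "UPDATE metadatavalue SET text_lang = ? "
        "WHERE dspace_object_id IN (SELECT uuid FROM item) "
        f"AND text_lang IN ({placeholders})",
        (target, *variants),
    )
    changed = cur.rowcount

    # Touch the affected items so incremental indexing can notice them
    cur.executemany(
        "UPDATE item SET last_modified = CURRENT_TIMESTAMP WHERE uuid = ?",
        [(uuid,) for uuid in item_uuids],
    )
    return changed


def demo():
    """Build a tiny made-up dataset and run the normalization on it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE item (uuid TEXT PRIMARY KEY, last_modified TEXT);
        CREATE TABLE metadatavalue (dspace_object_id TEXT, text_lang TEXT);
        INSERT INTO item VALUES ('a', '2000-01-01 00:00:00');
        INSERT INTO item VALUES ('b', '2000-01-01 00:00:00');
        INSERT INTO metadatavalue VALUES ('a', 'en_US');
        INSERT INTO metadatavalue VALUES ('a', 'en');
        INSERT INTO metadatavalue VALUES ('a', '');
        INSERT INTO metadatavalue VALUES ('b', 'fr');
        INSERT INTO metadatavalue VALUES ('b', NULL);
        INSERT INTO metadatavalue VALUES ('c', 'en');  -- 'c' is not an item
        """
    )
    # The connection context manager commits on success and rolls back on
    # an exception, mirroring the BEGIN ... COMMIT in the session above
    with conn:
        changed = normalize_text_lang(conn)

    counts = dict(
        conn.execute(
            "SELECT text_lang, count(*) FROM metadatavalue "
            "WHERE dspace_object_id IN (SELECT uuid FROM item) "
            "GROUP BY text_lang"
        )
    )
    modified = dict(conn.execute("SELECT uuid, last_modified FROM item"))
    return changed, counts, modified


if __name__ == "__main__":
    print(demo())
```

Collecting the affected item UUIDs before the `UPDATE` matters here: once the rows are rewritten to `en_US` they can no longer be told apart from rows that were already correct.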
- Discussing the UN M.49 regions on CGSpace with Valentina and Abenet
- The PRMS team is confused about our regions, which are mostly UN M.49 with some legacy stuff using different ones
- I think we can fix all the stuff for Initiatives from this year very easily, then work on the legacy stuff later
- Also, I noticed that [country_converter was using the wrong UN M.49 region for Myanmar](https://github.com/konstantinstadler/country_converter/issues/124)
- I submitted a [pull request](https://github.com/konstantinstadler/country_converter/pull/125)
<!-- vim: set sw=2 ts=2: -->

@ -24,7 +24,7 @@ I reverted the Cocoon autosave change because it was more of a nuissance that Pe
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2022-11/" />
<meta property="article:published_time" content="2022-11-01T09:11:36+03:00" />
<meta property="article:modified_time" content="2022-11-27T12:38:48+03:00" />
<meta property="article:modified_time" content="2022-11-27T13:52:43+03:00" />
@ -54,9 +54,9 @@ I reverted the Cocoon autosave change because it was more of a nuissance that Pe
"@type": "BlogPosting",
"headline": "November, 2022",
"url": "https://alanorth.github.io/cgspace-notes/2022-11/",
"wordCount": "2297",
"wordCount": "2640",
"datePublished": "2022-11-01T09:11:36+03:00",
"dateModified": "2022-11-27T12:38:48+03:00",
"dateModified": "2022-11-27T13:52:43+03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@ -544,6 +544,71 @@ I reverted the Cocoon autosave change because it was more of a nuissance that Pe
</ul>
</li>
</ul>
<h2 id="2022-11-28">2022-11-28</h2>
<ul>
<li>Update <code>ilri/fix-metadata-values.py</code> to update the <code>last_modified</code> date for items when it updates metadata
<ul>
<li>This should allow us to use the normal <code>index-discovery</code> (without <code>-b</code>) and have REST API responses show a correct last modified date</li>
</ul>
</li>
<li>Maria asked me to add some ORCID identifiers for Alliance staff to the controlled vocabulary
<ul>
<li>I also updated the <code>add-orcid-identifiers-csv.py</code> to update the <code>last_modified</code> timestamp of the item</li>
</ul>
</li>
<li>I refactored my CGSpace Python scripts to use a helper <code>util.py</code> module with common functions
<ul>
<li>For now it only has the one for updating an item&rsquo;s <code>last_modified</code> timestamp but I will gradually add more</li>
</ul>
</li>
<li>I also ran our list of ORCID identifiers against ORCID&rsquo;s API to see if anyone changed their name format
<ul>
<li>Then I ran them on CGSpace with <code>ilri/update-orcids.py</code> to fix them</li>
</ul>
</li>
<li>Normalize the <code>text_lang</code> values for CGSpace metadata again:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>localhost/dspacetest= ☘ SELECT DISTINCT text_lang, count(text_lang) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) GROUP BY text_lang ORDER BY count DESC;
</span></span><span style="display:flex;"><span> text_lang │ count
</span></span><span style="display:flex;"><span>───────────┼─────────
</span></span><span style="display:flex;"><span> en_US │ 2912429
</span></span><span style="display:flex;"><span> │ 108387
</span></span><span style="display:flex;"><span> en │ 12457
</span></span><span style="display:flex;"><span> fr │ 2
</span></span><span style="display:flex;"><span> vi │ 2
</span></span><span style="display:flex;"><span> es │ 1
</span></span><span style="display:flex;"><span> ␀ │ 0
</span></span><span style="display:flex;"><span>(7 rows)
</span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010">
</span></span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010"></span>Time: 624.651 ms
</span></span><span style="display:flex;"><span>localhost/dspacetest= ☘ BEGIN;
</span></span><span style="display:flex;"><span>BEGIN
</span></span><span style="display:flex;"><span>Time: 0.130 ms
</span></span><span style="display:flex;"><span>localhost/dspacetest= ☘ UPDATE metadatavalue SET text_lang=&#39;en_US&#39; WHERE dspace_object_id IN (SELECT uuid FROM item) AND text_lang IN (&#39;en&#39;, &#39;&#39;);
</span></span><span style="display:flex;"><span>UPDATE 120844
</span></span><span style="display:flex;"><span>Time: 4074.879 ms (00:04.075)
</span></span><span style="display:flex;"><span>localhost/dspacetest= ☘ SELECT DISTINCT text_lang, count(text_lang) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) GROUP BY text_lang ORDER BY count DESC;
</span></span><span style="display:flex;"><span> text_lang │ count
</span></span><span style="display:flex;"><span>───────────┼─────────
</span></span><span style="display:flex;"><span> en_US │ 3033273
</span></span><span style="display:flex;"><span> fr │ 2
</span></span><span style="display:flex;"><span> vi │ 2
</span></span><span style="display:flex;"><span> es │ 1
</span></span><span style="display:flex;"><span> ␀ │ 0
</span></span><span style="display:flex;"><span>(5 rows)
</span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010">
</span></span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010"></span>Time: 346.913 ms
</span></span><span style="display:flex;"><span>localhost/dspacetest= ☘ COMMIT;
</span></span></code></pre></div><ul>
<li>Discussing the UN M.49 regions on CGSpace with Valentina and Abenet
<ul>
<li>The PRMS team is confused about our regions, which are mostly UN M.49 with some legacy stuff using different ones</li>
<li>I think we can fix all the stuff for Initiatives from this year very easily, then work on the legacy stuff later</li>
<li>Also, I noticed that <a href="https://github.com/konstantinstadler/country_converter/issues/124">country_converter was using the wrong UN M.49 region for Myanmar</a></li>
<li>I submitted a <a href="https://github.com/konstantinstadler/country_converter/pull/125">pull request</a></li>
</ul>
</li>
</ul>
<!-- raw HTML omitted -->

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/" />
<meta property="og:updated_time" content="2022-11-27T12:38:48+03:00" />
<meta property="og:updated_time" content="2022-11-27T13:52:43+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2022-11-27T12:38:48+03:00" />
<meta property="og:updated_time" content="2022-11-27T13:52:43+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2022-11-27T12:38:48+03:00" />
<meta property="og:updated_time" content="2022-11-27T13:52:43+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2022-11-27T12:38:48+03:00" />
<meta property="og:updated_time" content="2022-11-27T13:52:43+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2022-11-27T12:38:48+03:00" />
<meta property="og:updated_time" content="2022-11-27T13:52:43+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2022-11-27T12:38:48+03:00" />
<meta property="og:updated_time" content="2022-11-27T13:52:43+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2022-11-27T12:38:48+03:00" />
<meta property="og:updated_time" content="2022-11-27T13:52:43+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2022-11-27T12:38:48+03:00" />
<meta property="og:updated_time" content="2022-11-27T13:52:43+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2022-11-27T12:38:48+03:00" />
<meta property="og:updated_time" content="2022-11-27T13:52:43+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2022-11-27T12:38:48+03:00" />
<meta property="og:updated_time" content="2022-11-27T13:52:43+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2022-11-27T12:38:48+03:00" />
<meta property="og:updated_time" content="2022-11-27T13:52:43+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2022-11-27T12:38:48+03:00" />
<meta property="og:updated_time" content="2022-11-27T13:52:43+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2022-11-27T12:38:48+03:00" />
<meta property="og:updated_time" content="2022-11-27T13:52:43+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2022-11-27T12:38:48+03:00" />
<meta property="og:updated_time" content="2022-11-27T13:52:43+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2022-11-27T12:38:48+03:00" />
<meta property="og:updated_time" content="2022-11-27T13:52:43+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2022-11-27T12:38:48+03:00" />
<meta property="og:updated_time" content="2022-11-27T13:52:43+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2022-11-27T12:38:48+03:00" />
<meta property="og:updated_time" content="2022-11-27T13:52:43+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2022-11-27T12:38:48+03:00" />
<meta property="og:updated_time" content="2022-11-27T13:52:43+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2022-11-27T12:38:48+03:00" />
<meta property="og:updated_time" content="2022-11-27T13:52:43+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2022-11-27T12:38:48+03:00" />
<meta property="og:updated_time" content="2022-11-27T13:52:43+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2022-11-27T12:38:48+03:00" />
<meta property="og:updated_time" content="2022-11-27T13:52:43+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2022-11-27T12:38:48+03:00" />
<meta property="og:updated_time" content="2022-11-27T13:52:43+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2022-11-27T12:38:48+03:00" />
<meta property="og:updated_time" content="2022-11-27T13:52:43+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2022-11-27T12:38:48+03:00" />
<meta property="og:updated_time" content="2022-11-27T13:52:43+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2022-11-27T12:38:48+03:00" />
<meta property="og:updated_time" content="2022-11-27T13:52:43+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2022-11-27T12:38:48+03:00" />
<meta property="og:updated_time" content="2022-11-27T13:52:43+03:00" />

@ -3,19 +3,19 @@
xmlns:xhtml="http://www.w3.org/1999/xhtml">
<url>
<loc>https://alanorth.github.io/cgspace-notes/categories/</loc>
<lastmod>2022-11-27T12:38:48+03:00</lastmod>
<lastmod>2022-11-27T13:52:43+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/</loc>
<lastmod>2022-11-27T12:38:48+03:00</lastmod>
<lastmod>2022-11-27T13:52:43+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/categories/notes/</loc>
<lastmod>2022-11-27T12:38:48+03:00</lastmod>
<lastmod>2022-11-27T13:52:43+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/2022-11/</loc>
<lastmod>2022-11-27T12:38:48+03:00</lastmod>
<lastmod>2022-11-27T13:52:43+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/posts/</loc>
<lastmod>2022-11-27T12:38:48+03:00</lastmod>
<lastmod>2022-11-27T13:52:43+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/2022-10/</loc>
<lastmod>2022-10-31T16:59:47+03:00</lastmod>