Add notes for 2023-02-12

Alan Orth 2023-02-13 10:33:16 +03:00
parent d5214f02e1
commit 0b64999280
Signed by: alanorth
GPG Key ID: 0FB860CC9C45B1B9
31 changed files with 119 additions and 36 deletions

View File

@ -87,4 +87,45 @@ curl -f -H "Content-Type: application/json" -X POST "https://dspacetest.cgiar.or
- Export CGSpace to update Initiative mappings and country/region mappings
- Then start a harvest on AReS
## 2023-02-09
- Do some minor work on the CSS on the DSpace 7 test
## 2023-02-10
- I noticed a large number of PostgreSQL locks from dspaceWeb on CGSpace:
```console
$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | sort | uniq -c
2033 dspaceWeb
```
- Looking at the lock age, I see some already 1 day old, including this curious query:
```console
select nextval ('public.registrationdata_seq')
```
- I killed all locks that were more than a few hours old
- Export CGSpace to update Initiative collection mappings
- Discuss adding `dcterms.available` to the submission form
- I also looked in the `dcterms.description` field on CGSpace and found ~1,500 items where there is an indication of an online published date
- Using some facets in OpenRefine I narrowed down the ones mentioning "online" and then extracted the dates to a new column:
```console
cells['dcterms.description[en_US]'].value.replace(/.*?(\d+{2}) ([a-zA-Z]+) (\d+{2}).*/,"$3-$2-$1")
```
- Then to handle formats like "2022-April-26" and "2021-Nov-11" I used some replacement GRELs (note the order so we don't replace short patterns in longer strings prematurely):
```console
value.replace("January","01").replace("February","02").replace("March","03").replace("April","04").replace("May","05").replace("June","06").replace("July","07").replace("August","08").replace("September","09").replace("October","10").replace("November","11").replace("December","12")
value.replace("Jan","01").replace("Feb","02").replace("Mar","03").replace("Apr","04").replace("May","05").replace("Jun","06").replace("Jul","07").replace("Aug","08").replace("Sep","09").replace("Oct","10").replace("Nov","11").replace("Dec","12")
```
- This covered about 1,300 items, then I did about 100 more messier ones with some more regex wrangling
- I removed the `dcterms.description[en_US]` field from items where I updated the dates
- Then I added `dcterms.available` to the submission form and the item view
- We need to announce this to the editors
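
For reference, the date extraction and month replacement steps above can be sketched outside of OpenRefine. This is only a rough Python illustration, not the GREL that was actually run: the function names, the stricter `DD Month YYYY` pattern, and the sample descriptions are all assumptions made up for the example.

```python
import re

# Full English month names; the three-letter abbreviations are derived from these.
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

# Assumed pattern for dates like "26 April 2022" (stricter than the GREL regex above).
DATE_RE = re.compile(r"(\d{1,2}) ([A-Za-z]+) (\d{4})")


def month_to_number(text):
    """Replace month names with two-digit numbers, long names first.

    Doing the abbreviations first would turn "January" into "01uary",
    which is why the order of the two passes matters.
    """
    for number, name in enumerate(MONTHS, start=1):
        text = text.replace(name, f"{number:02d}")
    for number, name in enumerate(MONTHS, start=1):
        text = text.replace(name[:3], f"{number:02d}")
    return text


def extract_online_date(description):
    """Return YYYY-MM-DD for the first "DD Month YYYY" date, or None."""
    match = DATE_RE.search(description)
    if match is None:
        return None
    day, month, year = match.groups()
    return month_to_number(f"{year}-{month}-{int(day):02d}")


if __name__ == "__main__":
    # Hypothetical description values, not taken from CGSpace.
    for sample in ["Published online 26 April 2022", "First published online: 5 Nov 2021"]:
        print(extract_online_date(sample))
```

Anything this pattern does not match comes back as `None`, which roughly corresponds to the messier items that needed manual regex work.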
<!-- vim: set sw=2 ts=2: -->

View File

@ -18,7 +18,7 @@ I want to try to expand my use of their data to journals, publishers, volumes, i
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2023-02/" />
<meta property="article:published_time" content="2023-02-01T10:57:36+03:00" />
<meta property="article:modified_time" content="2023-02-01T10:57:36+03:00" />
<meta property="article:modified_time" content="2023-02-09T08:50:54+03:00" />
@ -42,9 +42,9 @@ I want to try to expand my use of their data to journals, publishers, volumes, i
"@type": "BlogPosting",
"headline": "February, 2023",
"url": "https://alanorth.github.io/cgspace-notes/2023-02/",
"wordCount": "597",
"wordCount": "821",
"datePublished": "2023-02-01T10:57:36+03:00",
"dateModified": "2023-02-01T10:57:36+03:00",
"dateModified": "2023-02-09T08:50:54+03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@ -210,6 +210,48 @@ I want to try to expand my use of their data to journals, publishers, volumes, i
</ul>
</li>
</ul>
<h2 id="2023-02-09">2023-02-09</h2>
<ul>
<li>Do some minor work on the CSS on the DSpace 7 test</li>
</ul>
<h2 id="2023-02-10">2023-02-10</h2>
<ul>
<li>I noticed a large number of PostgreSQL locks from dspaceWeb on CGSpace:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ psql -c <span style="color:#e6db74">&#39;SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;&#39;</span> | grep -o -E <span style="color:#e6db74">&#39;(dspaceWeb|dspaceApi|dspaceCli)&#39;</span> | sort | uniq -c
</span></span><span style="display:flex;"><span> 2033 dspaceWeb
</span></span></code></pre></div><ul>
<li>Looking at the lock age, I see some already 1 day old, including this curious query:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>select nextval (&#39;public.registrationdata_seq&#39;)
</span></span></code></pre></div><ul>
<li>I killed all locks that were more than a few hours old</li>
<li>Export CGSpace to update Initiative collection mappings</li>
<li>Discuss adding <code>dcterms.available</code> to the submission form
<ul>
<li>I also looked in the <code>dcterms.description</code> field on CGSpace and found ~1,500 items where there is an indication of an online published date</li>
<li>Using some facets in OpenRefine I narrowed down the ones mentioning &ldquo;online&rdquo; and then extracted the dates to a new column:</li>
</ul>
</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>cells[&#39;dcterms.description[en_US]&#39;].value.replace(/.*?(\d+{2}) ([a-zA-Z]+) (\d+{2}).*/,&#34;$3-$2-$1&#34;)
</span></span></code></pre></div><ul>
<li>Then to handle formats like &ldquo;2022-April-26&rdquo; and &ldquo;2021-Nov-11&rdquo; I used some replacement GRELs (note the order so we don&rsquo;t replace short patterns in longer strings prematurely):</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>value.replace(&#34;January&#34;,&#34;01&#34;).replace(&#34;February&#34;,&#34;02&#34;).replace(&#34;March&#34;,&#34;03&#34;).replace(&#34;April&#34;,&#34;04&#34;).replace(&#34;May&#34;,&#34;05&#34;).replace(&#34;June&#34;,&#34;06&#34;).replace(&#34;July&#34;,&#34;07&#34;).replace(&#34;August&#34;,&#34;08&#34;).replace(&#34;September&#34;,&#34;09&#34;).replace(&#34;October&#34;,&#34;10&#34;).replace(&#34;November&#34;,&#34;11&#34;).replace(&#34;December&#34;,&#34;12&#34;)
</span></span><span style="display:flex;"><span>value.replace(&#34;Jan&#34;,&#34;01&#34;).replace(&#34;Feb&#34;,&#34;02&#34;).replace(&#34;Mar&#34;,&#34;03&#34;).replace(&#34;Apr&#34;,&#34;04&#34;).replace(&#34;May&#34;,&#34;05&#34;).replace(&#34;Jun&#34;,&#34;06&#34;).replace(&#34;Jul&#34;,&#34;07&#34;).replace(&#34;Aug&#34;,&#34;08&#34;).replace(&#34;Sep&#34;,&#34;09&#34;).replace(&#34;Oct&#34;,&#34;10&#34;).replace(&#34;Nov&#34;,&#34;11&#34;).replace(&#34;Dec&#34;,&#34;12&#34;)
</span></span></code></pre></div><ul>
<li>This covered about 1,300 items, then I did about 100 more messier ones with some more regex wrangling
<ul>
<li>I removed the <code>dcterms.description[en_US]</code> field from items where I updated the dates</li>
</ul>
</li>
<li>Then I added <code>dcterms.available</code> to the submission form and the item view
<ul>
<li>We need to announce this to the editors</li>
</ul>
</li>
</ul>
<!-- raw HTML omitted -->

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/" />
<meta property="og:updated_time" content="2023-02-01T10:57:36+03:00" />
<meta property="og:updated_time" content="2023-02-09T08:50:54+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2023-02-01T10:57:36+03:00" />
<meta property="og:updated_time" content="2023-02-09T08:50:54+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2023-02-01T10:57:36+03:00" />
<meta property="og:updated_time" content="2023-02-09T08:50:54+03:00" />

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2023-02-01T10:57:36+03:00" />
<meta property="og:updated_time" content="2023-02-09T08:50:54+03:00" />

View File

@ -3,19 +3,19 @@
xmlns:xhtml="http://www.w3.org/1999/xhtml">
<url>
<loc>https://alanorth.github.io/cgspace-notes/categories/</loc>
<lastmod>2023-02-01T10:57:36+03:00</lastmod>
<lastmod>2023-02-09T08:50:54+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/</loc>
<lastmod>2023-02-01T10:57:36+03:00</lastmod>
<lastmod>2023-02-09T08:50:54+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/2023-02/</loc>
<lastmod>2023-02-01T10:57:36+03:00</lastmod>
<lastmod>2023-02-09T08:50:54+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/categories/notes/</loc>
<lastmod>2023-02-01T10:57:36+03:00</lastmod>
<lastmod>2023-02-09T08:50:54+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/posts/</loc>
<lastmod>2023-02-01T10:57:36+03:00</lastmod>
<lastmod>2023-02-09T08:50:54+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/2023-01/</loc>
<lastmod>2023-01-31T22:20:38+03:00</lastmod>