Notes for 2024-01-23

Alan Orth 2024-01-24 08:24:50 +03:00
parent 57fe0587a4
commit 300b2e4271
Signed by: alanorth
GPG Key ID: 0FB860CC9C45B1B9
34 changed files with 140 additions and 39 deletions

@ -372,4 +372,49 @@ tee /tmp/ips.txt | wc -l
- 396982 Google Cloud
- The load on the server *immediately* dropped
## 2024-01-17
- It turns out AS701 (UUNET) is Verizon Business, which is used as an ISP for many staff at IFPRI
- This was causing them to see HTTP 429 "too many requests" errors on CGSpace
- I removed this ASN from the rate limiting
## 2024-01-18
- Start looking at Solr stats again
- I found one statistics record that has 22,000 of the same collection in `owningColl` and 22,000 of the same community in `owningComm`
- The record is from 2015 and I think it would be easier to delete it than fix it:
```console
$ curl http://localhost:8983/solr/statistics/update -H "Content-type: text/xml" --data-binary '<delete><query>uid:3b4eefba-a302-4172-a286-dcb25d70129e</query></delete>'
```
- Looking again, there are at least 1,000 of these records, so I will need to come up with an actual solution to fix them
- I'm noticing we have 1,800+ links to defunct resources on bioversityinternational.org in the `cg.link.permalink` field
- I should ask Alliance if they have any plans to fix those, or upload them to CGSpace
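- A minimal sketch of one possible fix for the bloated statistics records, instead of deleting them: collapse the repeated values and build a Solr atomic-update document per record. The `dedupe` and `build_atomic_update` helpers and the sample values are hypothetical, and it is an assumption that the statistics core accepts atomic updates for these fields:

```python
def dedupe(values):
    """Return values with duplicates removed, keeping first-seen order."""
    return list(dict.fromkeys(values))


def build_atomic_update(doc, fields=("owningColl", "owningComm")):
    """Build an atomic-update document that overwrites only bloated fields.

    Assumption: `uid` is the unique key of the statistics core and the
    listed fields are stored, so a {"set": [...]} update is accepted.
    """
    update = {"uid": doc["uid"]}
    for field in fields:
        values = doc.get(field, [])
        unique = dedupe(values)
        # Only touch fields that actually contain repeated values
        if len(unique) < len(values):
            update[field] = {"set": unique}
    return update


# Hypothetical record shaped like the one described above
record = {
    "uid": "3b4eefba-a302-4172-a286-dcb25d70129e",
    "owningColl": ["coll-1"] * 22000,
    "owningComm": ["comm-1"] * 22000,
}
print(build_atomic_update(record))
```

- The resulting documents could then be posted as JSON to the same `/solr/statistics/update` handler used above, but I have not tested that against this core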
## 2024-01-22
- Meeting with IWMI about ORCID integration on CGSpace now that we've migrated to DSpace 7
- File an issue for the inaccurate DSpace statistics: https://github.com/DSpace/DSpace/issues/9275
## 2024-01-23
- Meeting with IWMI about ORCID integration and the DSpace API for use with WordPress
- IFPRI sent me a list of their author ORCIDs to add to our controlled vocabulary
- I joined them with our current list and resolved their names on ORCID and updated them in our database:
```console
$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-identifier.xml ~/Downloads/IFPRI\ ORCiD\ All.csv | grep -oE '[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}' | sort -u > /tmp/2024-01-23-orcids.txt
$ ./ilri/resolve_orcids.py -i /tmp/2024-01-23-orcids.txt -o /tmp/2024-01-23-orcids-names.txt -d
$ ./ilri/update_orcids.py -i /tmp/2024-01-23-orcids-names.txt -db dspace -u dspace -p fuuu
```
- This adds about 400 new identifiers to the controlled vocabulary
- I consolidated our various project identifier fields for closed programs into one `cg.identifier.project`:
- `cg.identifier.ccafsproject`
- `cg.identifier.ccafsprojectpii`
- `cg.identifier.ciatproject`
- `cg.identifier.cpwfproject`
- I prefixed the existing 2,644 metadata values with "CCAFS", "CIAT", or "CPWF" so we can figure out where they came from if need be, and deleted the old fields from the metadata registry
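- For reference, the ORCID extraction step earlier in this entry (`cat … | grep -oE … | sort -u`) can be sketched in Python with the same regular expression. The `extract_orcids` helper and the sample strings are hypothetical stand-ins for the controlled vocabulary XML and the IFPRI CSV:

```python
import re

# Same pattern as the grep -oE call in the notes
ORCID_PATTERN = re.compile(r"[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}")


def extract_orcids(*texts):
    """Collect every ORCID-shaped string from the inputs, unique and sorted."""
    found = set()
    for text in texts:
        found.update(ORCID_PATTERN.findall(text))
    return sorted(found)


# Hypothetical samples, not real entries from either file
sample_xml = '<node id="Jane Doe: 0000-0002-1825-0097" label="Jane Doe"/>'
sample_csv = "name,orcid\nJohn Roe,0000-0001-5109-3700\nJane Doe,0000-0002-1825-0097\n"

print(extract_orcids(sample_xml, sample_csv))
# → ['0000-0001-5109-3700', '0000-0002-1825-0097']
```

- Like the shell pipeline, this only matches the shape of an identifier and does not check that it resolves on ORCID, which is what `resolve_orcids.py` is for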
<!-- vim: set sw=2 ts=2: -->

@ -22,7 +22,7 @@ Work on IFPRI ISNAR archive cleanup
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2024-01/" />
<meta property="article:published_time" content="2024-01-02T10:08:00+03:00" />
<meta property="article:modified_time" content="2024-01-10T17:21:12+03:00" />
<meta property="article:modified_time" content="2024-01-18T15:59:49+03:00" />
@ -50,9 +50,9 @@ Work on IFPRI ISNAR archive cleanup
"@type": "BlogPosting",
"headline": "January, 2024",
"url": "https://alanorth.github.io/cgspace-notes/2024-01/",
"wordCount": "1847",
"wordCount": "2164",
"datePublished": "2024-01-02T10:08:00+03:00",
"dateModified": "2024-01-10T17:21:12+03:00",
"dateModified": "2024-01-18T15:59:49+03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@ -511,6 +511,62 @@ Work on IFPRI ISNAR archive cleanup
</ul>
</li>
</ul>
<h2 id="2024-01-17">2024-01-17</h2>
<ul>
<li>It turns out AS701 (UUNET) is Verizon Business, which is used as an ISP for many staff at IFPRI
<ul>
<li>This was causing them to see HTTP 429 &ldquo;too many requests&rdquo; errors on CGSpace</li>
<li>I removed this ASN from the rate limiting</li>
</ul>
</li>
</ul>
<h2 id="2024-01-18">2024-01-18</h2>
<ul>
<li>Start looking at Solr stats again
<ul>
<li>I found one statistics record that has 22,000 of the same collection in <code>owningColl</code> and 22,000 of the same community in <code>owningComm</code></li>
<li>The record is from 2015 and I think it would be easier to delete it than fix it:</li>
</ul>
</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ curl http://localhost:8983/solr/statistics/update -H <span style="color:#e6db74">&#34;Content-type: text/xml&#34;</span> --data-binary <span style="color:#e6db74">&#39;&lt;delete&gt;&lt;query&gt;uid:3b4eefba-a302-4172-a286-dcb25d70129e&lt;/query&gt;&lt;/delete&gt;&#39;</span>
</span></span></code></pre></div><ul>
<li>Looking again, there are at least 1,000 of these records, so I will need to come up with an actual solution to fix them</li>
<li>I&rsquo;m noticing we have 1,800+ links to defunct resources on bioversityinternational.org in the <code>cg.link.permalink</code> field
<ul>
<li>I should ask Alliance if they have any plans to fix those, or upload them to CGSpace</li>
</ul>
</li>
</ul>
<h2 id="2024-01-22">2024-01-22</h2>
<ul>
<li>Meeting with IWMI about ORCID integration on CGSpace now that we&rsquo;ve migrated to DSpace 7</li>
<li>File an issue for the inaccurate DSpace statistics: <a href="https://github.com/DSpace/DSpace/issues/9275">https://github.com/DSpace/DSpace/issues/9275</a></li>
</ul>
<h2 id="2024-01-23">2024-01-23</h2>
<ul>
<li>Meeting with IWMI about ORCID integration and the DSpace API for use with WordPress</li>
<li>IFPRI sent me a list of their author ORCIDs to add to our controlled vocabulary
<ul>
<li>I joined them with our current list and resolved their names on ORCID and updated them in our database:</li>
</ul>
</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-identifier.xml ~/Downloads/IFPRI<span style="color:#ae81ff">\ </span>ORCiD<span style="color:#ae81ff">\ </span>All.csv | grep -oE <span style="color:#e6db74">&#39;[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}&#39;</span> | sort -u &gt; /tmp/2024-01-23-orcids.txt
</span></span><span style="display:flex;"><span>$ ./ilri/resolve_orcids.py -i /tmp/2024-01-23-orcids.txt -o /tmp/2024-01-23-orcids-names.txt -d
</span></span><span style="display:flex;"><span>$ ./ilri/update_orcids.py -i /tmp/2024-01-23-orcids-names.txt -db dspace -u dspace -p fuuu
</span></span></code></pre></div><ul>
<li>This adds about 400 new identifiers to the controlled vocabulary</li>
<li>I consolidated our various project identifier fields for closed programs into one <code>cg.identifier.project</code>:
<ul>
<li><code>cg.identifier.ccafsproject</code></li>
<li><code>cg.identifier.ccafsprojectpii</code></li>
<li><code>cg.identifier.ciatproject</code></li>
<li><code>cg.identifier.cpwfproject</code></li>
</ul>
</li>
<li>I prefixed the existing 2,644 metadata values with &ldquo;CCAFS&rdquo;, &ldquo;CIAT&rdquo;, or &ldquo;CPWF&rdquo; so we can figure out where they came from if need be, and deleted the old fields from the metadata registry</li>
</ul>
<!-- raw HTML omitted -->

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/" />
<meta property="og:updated_time" content="2024-01-10T17:21:12+03:00" />
<meta property="og:updated_time" content="2024-01-18T15:59:49+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2024-01-10T17:21:12+03:00" />
<meta property="og:updated_time" content="2024-01-18T15:59:49+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2024-01-10T17:21:12+03:00" />
<meta property="og:updated_time" content="2024-01-18T15:59:49+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2024-01-10T17:21:12+03:00" />
<meta property="og:updated_time" content="2024-01-18T15:59:49+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2024-01-10T17:21:12+03:00" />
<meta property="og:updated_time" content="2024-01-18T15:59:49+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2024-01-10T17:21:12+03:00" />
<meta property="og:updated_time" content="2024-01-18T15:59:49+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2024-01-10T17:21:12+03:00" />
<meta property="og:updated_time" content="2024-01-18T15:59:49+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2024-01-10T17:21:12+03:00" />
<meta property="og:updated_time" content="2024-01-18T15:59:49+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2024-01-10T17:21:12+03:00" />
<meta property="og:updated_time" content="2024-01-18T15:59:49+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-01-10T17:21:12+03:00" />
<meta property="og:updated_time" content="2024-01-18T15:59:49+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-01-10T17:21:12+03:00" />
<meta property="og:updated_time" content="2024-01-18T15:59:49+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-01-10T17:21:12+03:00" />
<meta property="og:updated_time" content="2024-01-18T15:59:49+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-01-10T17:21:12+03:00" />
<meta property="og:updated_time" content="2024-01-18T15:59:49+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-01-10T17:21:12+03:00" />
<meta property="og:updated_time" content="2024-01-18T15:59:49+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-01-10T17:21:12+03:00" />
<meta property="og:updated_time" content="2024-01-18T15:59:49+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-01-10T17:21:12+03:00" />
<meta property="og:updated_time" content="2024-01-18T15:59:49+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-01-10T17:21:12+03:00" />
<meta property="og:updated_time" content="2024-01-18T15:59:49+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-01-10T17:21:12+03:00" />
<meta property="og:updated_time" content="2024-01-18T15:59:49+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-01-10T17:21:12+03:00" />
<meta property="og:updated_time" content="2024-01-18T15:59:49+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2024-01-10T17:21:12+03:00" />
<meta property="og:updated_time" content="2024-01-18T15:59:49+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2024-01-10T17:21:12+03:00" />
<meta property="og:updated_time" content="2024-01-18T15:59:49+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2024-01-10T17:21:12+03:00" />
<meta property="og:updated_time" content="2024-01-18T15:59:49+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2024-01-10T17:21:12+03:00" />
<meta property="og:updated_time" content="2024-01-18T15:59:49+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2024-01-10T17:21:12+03:00" />
<meta property="og:updated_time" content="2024-01-18T15:59:49+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2024-01-10T17:21:12+03:00" />
<meta property="og:updated_time" content="2024-01-18T15:59:49+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2024-01-10T17:21:12+03:00" />
<meta property="og:updated_time" content="2024-01-18T15:59:49+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2024-01-10T17:21:12+03:00" />
<meta property="og:updated_time" content="2024-01-18T15:59:49+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2024-01-10T17:21:12+03:00" />
<meta property="og:updated_time" content="2024-01-18T15:59:49+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2024-01-10T17:21:12+03:00" />
<meta property="og:updated_time" content="2024-01-18T15:59:49+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2024-01-10T17:21:12+03:00" />
<meta property="og:updated_time" content="2024-01-18T15:59:49+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2024-01-10T17:21:12+03:00" />
<meta property="og:updated_time" content="2024-01-18T15:59:49+03:00" />

@ -3,19 +3,19 @@
xmlns:xhtml="http://www.w3.org/1999/xhtml">
<url>
<loc>https://alanorth.github.io/cgspace-notes/categories/</loc>
<lastmod>2024-01-10T17:21:12+03:00</lastmod>
<lastmod>2024-01-18T15:59:49+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/</loc>
<lastmod>2024-01-10T17:21:12+03:00</lastmod>
<lastmod>2024-01-18T15:59:49+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/2024-01/</loc>
<lastmod>2024-01-10T17:21:12+03:00</lastmod>
<lastmod>2024-01-18T15:59:49+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/categories/notes/</loc>
<lastmod>2024-01-10T17:21:12+03:00</lastmod>
<lastmod>2024-01-18T15:59:49+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/posts/</loc>
<lastmod>2024-01-10T17:21:12+03:00</lastmod>
<lastmod>2024-01-18T15:59:49+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/2023-12/</loc>
<lastmod>2023-12-29T12:08:57+03:00</lastmod>