Add notes for 2021-06-25

Alan Orth 2021-06-25 21:32:18 +03:00
parent b36808718c
commit 0f2fe01a42
Signed by: alanorth
GPG Key ID: 0FB860CC9C45B1B9
25 changed files with 120 additions and 30 deletions

@ -324,4 +324,49 @@ $ docker logs api 2>/dev/null | grep dspace_add_missing_items | sort | uniq | wc
- Spent a few hours with Moayad troubleshooting and improving OpenRXV
  - We found a bug in the harvesting code that can occur when you are harvesting DSpace 5 and DSpace 6 instances, as DSpace 5 uses numeric (long) IDs, and DSpace 6 uses UUIDs
## 2021-06-25
- The new OpenRXV code creates almost 200,000 jobs when the plugins start
  - I figured out how to use [bee-queue/arena](https://github.com/bee-queue/arena/tree/master/example) to view our Bull job queue
  - Also, we can see the jobs directly using `redis-cli`:
```console
$ redis-cli
127.0.0.1:6379> SCAN 0 COUNT 5
1) "49152"
2) 1) "bull:plugins:476595"
   2) "bull:plugins:367382"
   3) "bull:plugins:369228"
   4) "bull:plugins:438986"
   5) "bull:plugins:366215"
```
- We can apparently get the names of the jobs in each hash using `hget`:
```console
127.0.0.1:6379> TYPE bull:plugins:401827
hash
127.0.0.1:6379> HGET bull:plugins:401827 name
"dspace_add_missing_items"
```
- I whipped up a one-liner to get the keys for all plugin jobs, convert them to Redis `HGET` commands that extract the value of the name field, and then sort the names by their counts:
```console
$ redis-cli KEYS "bull:plugins:*" \
| sed -e 's/^bull/HGET bull/' -e 's/\([[:digit:]]\)$/\1 name/' \
| ncat -w 3 localhost 6379 \
| grep -v -E '^\$' | sort | uniq -c | sort -h
3 dspace_health_check
4 -ERR wrong number of arguments for 'hget' command
12 mel_downloads_and_views
129 dspace_altmetrics
932 dspace_downloads_and_views
186428 dspace_add_missing_items
```
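- The four `-ERR wrong number of arguments for 'hget' command` lines in the output above are what the `sed` expression yields for any key that does not end in a digit: `name` is never appended, so `HGET` is sent with a single argument. Presumably those are Bull's own bookkeeping keys under the same prefix rather than job hashes, but that is an assumption. A minimal sketch of a stricter filter, using a hypothetical helper called `job_keys_to_hget` that is not part of the original one-liner:

```shell
# Hypothetical helper: read Redis keys on stdin and print one
# "HGET <key> name" command per job key.
# Assumption: job hashes are exactly the keys that end in a numeric job ID;
# every other key under the prefix is dropped instead of producing an error.
job_keys_to_hget() {
  grep -E '^bull:plugins:[0-9]+$' | sed -e 's/^/HGET /' -e 's/$/ name/'
}
```

  - It would slot in where the `sed` step sits, for example `redis-cli KEYS "bull:plugins:*" | job_keys_to_hget | ncat -w 3 localhost 6379`, with the rest of the pipeline unchanged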
- Note that this uses `ncat` to send commands directly to Redis all at once instead of one at a time (`netcat` didn't work here, as it doesn't know when our input is finished and never quits)
  - I thought of using `redis-cli --pipe`, but then you have to construct the commands in the Redis protocol format, with the number of arguments and the length of each argument
- There is clearly something wrong with the new DSpace health check plugin, as it creates WAY too many jobs every time we run the plugins
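- For reference, the protocol format that `redis-cli --pipe` expects can be generated from the shell. A minimal sketch, assuming ASCII-only arguments (`${#arg}` counts characters, which only matches the byte length for ASCII); `resp_encode` is a hypothetical helper name:

```shell
# Encode one command in the Redis protocol (RESP):
#   *<number of arguments>
#   $<length of argument>   followed by the argument itself, once per argument
# Every line is terminated by CRLF.
resp_encode() {
  printf '*%d\r\n' "$#"
  for arg in "$@"; do
    printf '$%d\r\n%s\r\n' "${#arg}" "$arg"
  done
}
```

  - `resp_encode HGET bull:plugins:401827 name` emits `*3`, `$4`, `HGET`, `$19`, `bull:plugins:401827`, `$4` and `name`, each terminated by CRLF
  - As far as I know, `--pipe` reports only a count of replies and errors rather than printing the replies, so it suits bulk writes better than reading the name field back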
<!-- vim: set sw=2 ts=2: -->

@ -20,7 +20,7 @@ I simply started it and AReS was running again:
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2021-06/" />
<meta property="article:published_time" content="2021-06-01T10:51:07+03:00" />
<meta property="article:modified_time" content="2021-06-22T15:22:15+03:00" />
<meta property="article:modified_time" content="2021-06-25T09:34:29+03:00" />
@ -46,9 +46,9 @@ I simply started it and AReS was running again:
"@type": "BlogPosting",
"headline": "June, 2021",
"url": "https://alanorth.github.io/cgspace-notes/2021-06/",
"wordCount": "2396",
"wordCount": "2651",
"datePublished": "2021-06-01T10:51:07+03:00",
"dateModified": "2021-06-22T15:22:15+03:00",
"dateModified": "2021-06-25T09:34:29+03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@ -487,6 +487,51 @@ $ grep -oE '&quot;handle&quot;:&quot;([[:digit:]]|\.)+/[[:digit:]]+&quot;' cgspa
</ul>
</li>
</ul>
<h2 id="2021-06-25">2021-06-25</h2>
<ul>
<li>The new OpenRXV code creates almost 200,000 jobs when the plugins start
<ul>
<li>I figured out how to use <a href="https://github.com/bee-queue/arena/tree/master/example">bee-queue/arena</a> to view our Bull job queue</li>
<li>Also, we can see the jobs directly using <code>redis-cli</code>:</li>
</ul>
</li>
</ul>
<pre><code class="language-console" data-lang="console">$ redis-cli
127.0.0.1:6379&gt; SCAN 0 COUNT 5
1) &quot;49152&quot;
2) 1) &quot;bull:plugins:476595&quot;
   2) &quot;bull:plugins:367382&quot;
   3) &quot;bull:plugins:369228&quot;
   4) &quot;bull:plugins:438986&quot;
   5) &quot;bull:plugins:366215&quot;
</code></pre><ul>
<li>We can apparently get the names of the jobs in each hash using <code>hget</code>:</li>
</ul>
<pre><code class="language-console" data-lang="console">127.0.0.1:6379&gt; TYPE bull:plugins:401827
hash
127.0.0.1:6379&gt; HGET bull:plugins:401827 name
&quot;dspace_add_missing_items&quot;
</code></pre><ul>
<li>I whipped up a one-liner to get the keys for all plugin jobs, convert them to Redis <code>HGET</code> commands that extract the value of the name field, and then sort the names by their counts:</li>
</ul>
<pre><code class="language-console" data-lang="console">$ redis-cli KEYS &quot;bull:plugins:*&quot; \
| sed -e 's/^bull/HGET bull/' -e 's/\([[:digit:]]\)$/\1 name/' \
| ncat -w 3 localhost 6379 \
| grep -v -E '^\$' | sort | uniq -c | sort -h
3 dspace_health_check
4 -ERR wrong number of arguments for 'hget' command
12 mel_downloads_and_views
129 dspace_altmetrics
932 dspace_downloads_and_views
186428 dspace_add_missing_items
</code></pre><ul>
<li>Note that this uses <code>ncat</code> to send commands directly to redis all at once instead of one at a time (<code>netcat</code> didn&rsquo;t work here, as it doesn&rsquo;t know when our input is finished and never quits)
<ul>
<li>I thought of using <code>redis-cli --pipe</code>, but then you have to construct the commands in the Redis protocol format, with the number of arguments and the length of each argument</li>
</ul>
</li>
<li>There is clearly something wrong with the new DSpace health check plugin, as it creates WAY too many jobs every time we run the plugins</li>
</ul>
<!-- raw HTML omitted -->

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/" />
<meta property="og:updated_time" content="2021-06-22T15:22:15+03:00" />
<meta property="og:updated_time" content="2021-06-25T09:34:29+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2021-06-22T15:22:15+03:00" />
<meta property="og:updated_time" content="2021-06-25T09:34:29+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2021-06-22T15:22:15+03:00" />
<meta property="og:updated_time" content="2021-06-25T09:34:29+03:00" />

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2021-06-22T15:22:15+03:00" />
<meta property="og:updated_time" content="2021-06-25T09:34:29+03:00" />

@ -3,19 +3,19 @@
xmlns:xhtml="http://www.w3.org/1999/xhtml">
<url>
<loc>https://alanorth.github.io/cgspace-notes/categories/</loc>
<lastmod>2021-06-22T15:22:15+03:00</lastmod>
<lastmod>2021-06-25T09:34:29+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/</loc>
<lastmod>2021-06-22T15:22:15+03:00</lastmod>
<lastmod>2021-06-25T09:34:29+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/2021-06/</loc>
<lastmod>2021-06-22T15:22:15+03:00</lastmod>
<lastmod>2021-06-25T09:34:29+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/categories/notes/</loc>
<lastmod>2021-06-22T15:22:15+03:00</lastmod>
<lastmod>2021-06-25T09:34:29+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/posts/</loc>
<lastmod>2021-06-22T15:22:15+03:00</lastmod>
<lastmod>2021-06-25T09:34:29+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/2021-05/</loc>
<lastmod>2021-05-30T22:09:06+03:00</lastmod>