mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Add notes for 2022-03-04
@ -32,7 +32,7 @@ First I exported all the 2019 stats from CGSpace:
|
||||
$ ./run.sh -s http://localhost:8081/solr/statistics -f 'time:2019-*' -a export -o statistics-2019.json -k uid
|
||||
$ zstd statistics-2019.json
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.92.2" />
|
||||
<meta name="generator" content="Hugo 0.93.1" />
|
||||
|
||||
|
||||
|
||||
@ -123,16 +123,16 @@ $ zstd statistics-2019.json
|
||||
<li>I experimented with manually sharding the Solr statistics on DSpace Test</li>
|
||||
<li>First I exported all the 2019 stats from CGSpace:</li>
|
||||
</ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-console" data-lang="console">$ ./run.sh -s http://localhost:8081/solr/statistics -f <span style="color:#e6db74">'time:2019-*'</span> -a export -o statistics-2019.json -k uid
|
||||
$ zstd statistics-2019.json
|
||||
</code></pre></div><ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ ./run.sh -s http://localhost:8081/solr/statistics -f <span style="color:#e6db74">'time:2019-*'</span> -a export -o statistics-2019.json -k uid
|
||||
</span></span><span style="display:flex;"><span>$ zstd statistics-2019.json
|
||||
</span></span></code></pre></div><ul>
|
||||
<li>Then on DSpace Test I created a <code>statistics-2019</code> core with the same instance dir as the main <code>statistics</code> core (as <a href="https://wiki.lyrasis.org/display/DSDOC6x/Testing+Solr+Shards">illustrated in the DSpace docs</a>)</li>
|
||||
</ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-console" data-lang="console">$ mkdir -p /home/dspacetest.cgiar.org/solr/statistics-2019/data
|
||||
# create core in Solr admin
|
||||
$ curl -s <span style="color:#e6db74">"http://localhost:8081/solr/statistics/update?softCommit=true"</span> -H <span style="color:#e6db74">"Content-Type: text/xml"</span> --data-binary <span style="color:#e6db74">"<delete><query>time:2019-*</query></delete>"</span>
|
||||
$ ./run.sh -s http://localhost:8081/solr/statistics-2019 -a import -o statistics-2019.json -k uid
|
||||
</code></pre></div><ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ mkdir -p /home/dspacetest.cgiar.org/solr/statistics-2019/data
|
||||
</span></span><span style="display:flex;"><span># create core in Solr admin
|
||||
</span></span><span style="display:flex;"><span>$ curl -s <span style="color:#e6db74">"http://localhost:8081/solr/statistics/update?softCommit=true"</span> -H <span style="color:#e6db74">"Content-Type: text/xml"</span> --data-binary <span style="color:#e6db74">"<delete><query>time:2019-*</query></delete>"</span>
|
||||
</span></span><span style="display:flex;"><span>$ ./run.sh -s http://localhost:8081/solr/statistics-2019 -a import -o statistics-2019.json -k uid
|
||||
</span></span></code></pre></div><ul>
|
||||
<li>The key point above is that you create the core in the Solr admin UI, but the data directory must already exist, so you have to create it on the file system first</li>
|
||||
<li>I restarted the server after the import was done to see if the cores would come back up OK
|
||||
<ul>
|
||||
@ -165,13 +165,13 @@ $ ./run.sh -s http://localhost:8081/solr/statistics-2019 -a import -o statistics
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-console" data-lang="console">91.213.50.11 - - [03/Nov/2021:06:47:20 +0100] "HEAD /bitstream/handle/10568/106239/U19ArtSimonikovaChromosomeInthomNodev.pdf?sequence=1%60%20WHERE%206158%3D6158%20AND%204894%3D4741--%20kIlq&isAllowed=y HTTP/1.1" 200 0 "https://cgspace.cgiar.org:443/bitstream/handle/10568/106239/U19ArtSimonikovaChromosomeInthomNodev.pdf" "Mozilla/5.0 (X11; U; Linux i686; en-CA; rv:1.8.0.10) Gecko/20070223 Fedora/1.5.0.10-1.fc5 Firefox/1.5.0.10"
|
||||
</code></pre></div><ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>91.213.50.11 - - [03/Nov/2021:06:47:20 +0100] "HEAD /bitstream/handle/10568/106239/U19ArtSimonikovaChromosomeInthomNodev.pdf?sequence=1%60%20WHERE%206158%3D6158%20AND%204894%3D4741--%20kIlq&isAllowed=y HTTP/1.1" 200 0 "https://cgspace.cgiar.org:443/bitstream/handle/10568/106239/U19ArtSimonikovaChromosomeInthomNodev.pdf" "Mozilla/5.0 (X11; U; Linux i686; en-CA; rv:1.8.0.10) Gecko/20070223 Fedora/1.5.0.10-1.fc5 Firefox/1.5.0.10"
|
||||
</span></span></code></pre></div><ul>
|
||||
<li>Another is in China, and they grabbed 1,200 PDFs from the REST API in under an hour:</li>
|
||||
</ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-console" data-lang="console"># zgrep 222.129.53.160 /var/log/nginx/rest.log.2.gz | wc -l
|
||||
1178
|
||||
</code></pre></div><ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span># zgrep 222.129.53.160 /var/log/nginx/rest.log.2.gz | wc -l
|
||||
</span></span><span style="display:flex;"><span>1178
|
||||
</span></span></code></pre></div><ul>
|
||||
<li>I will continue to split the Solr statistics back into year-shards on DSpace Test (linode26)
|
||||
<ul>
|
||||
<li>Today I did all 2018 stats…</li>
|
||||
@ -183,9 +183,9 @@ $ ./run.sh -s http://localhost:8081/solr/statistics-2019 -a import -o statistics
|
||||
<ul>
|
||||
<li>Update all Docker containers on AReS and rebuild OpenRXV:</li>
|
||||
</ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-console" data-lang="console">$ docker images | grep -v ^REPO | sed <span style="color:#e6db74">'s/ \+/:/g'</span> | cut -d: -f1,2 | xargs -L1 docker pull
|
||||
$ docker-compose build
|
||||
</code></pre></div><ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ docker images | grep -v ^REPO | sed <span style="color:#e6db74">'s/ \+/:/g'</span> | cut -d: -f1,2 | xargs -L1 docker pull
|
||||
</span></span><span style="display:flex;"><span>$ docker-compose build
|
||||
</span></span></code></pre></div><ul>
|
||||
<li>Then restart the server and start a fresh harvest</li>
|
||||
<li>Continue splitting the Solr statistics into yearly shards on DSpace Test (doing 2017, 2016, 2015, and 2014 today)</li>
|
||||
<li>Several users wrote to me last week to say that workflow emails haven’t been working since 2021-10-21 or so
|
||||
@ -194,33 +194,33 @@ $ docker-compose build
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-console" data-lang="console">$ dspace test-email
|
||||
<span style="color:#960050;background-color:#1e0010">
|
||||
</span><span style="color:#960050;background-color:#1e0010"></span>About to send test email:
|
||||
- To: fuuuu
|
||||
- Subject: DSpace test email
|
||||
- Server: smtp.office365.com
|
||||
<span style="color:#960050;background-color:#1e0010">
|
||||
</span><span style="color:#960050;background-color:#1e0010"></span>Error sending email:
|
||||
- Error: javax.mail.SendFailedException: Send failure (javax.mail.AuthenticationFailedException: 535 5.7.139 Authentication unsuccessful, the user credentials were incorrect. [AM5PR0701CA0005.eurprd07.prod.outlook.com]
|
||||
)
|
||||
<span style="color:#960050;background-color:#1e0010">
|
||||
</span><span style="color:#960050;background-color:#1e0010"></span>Please see the DSpace documentation for assistance.
|
||||
</code></pre></div><ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ dspace test-email
|
||||
</span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010">
|
||||
</span></span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010"></span>About to send test email:
|
||||
</span></span><span style="display:flex;"><span> - To: fuuuu
|
||||
</span></span><span style="display:flex;"><span> - Subject: DSpace test email
|
||||
</span></span><span style="display:flex;"><span> - Server: smtp.office365.com
|
||||
</span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010">
|
||||
</span></span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010"></span>Error sending email:
|
||||
</span></span><span style="display:flex;"><span> - Error: javax.mail.SendFailedException: Send failure (javax.mail.AuthenticationFailedException: 535 5.7.139 Authentication unsuccessful, the user credentials were incorrect. [AM5PR0701CA0005.eurprd07.prod.outlook.com]
|
||||
</span></span><span style="display:flex;"><span>)
|
||||
</span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010">
|
||||
</span></span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010"></span>Please see the DSpace documentation for assistance.
|
||||
</span></span></code></pre></div><ul>
|
||||
<li>I sent a message to ILRI ICT to ask them to check the account/password</li>
|
||||
<li>I want to do one last test of the Elasticsearch updates on OpenRXV so I got a snapshot of the latest Elasticsearch volume used on the production AReS instance:</li>
|
||||
</ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-console" data-lang="console"># tar cJf openrxv_esData_7.tar.xz /var/lib/docker/volumes/openrxv_esData_7
|
||||
</code></pre></div><ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span># tar cJf openrxv_esData_7.tar.xz /var/lib/docker/volumes/openrxv_esData_7
|
||||
</span></span></code></pre></div><ul>
|
||||
<li>Then on my local server:</li>
|
||||
</ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-console" data-lang="console">$ mv ~/.local/share/containers/storage/volumes/openrxv_esData_7/ ~/.local/share/containers/storage/volumes/openrxv_esData_7.2021-11-07.bak
|
||||
$ tar xf /tmp/openrxv_esData_7.tar.xz -C ~/.local/share/containers/storage/volumes --strip-components<span style="color:#f92672">=</span><span style="color:#ae81ff">4</span>
|
||||
$ find ~/.local/share/containers/storage/volumes/openrxv_esData_7 -type f -exec chmod <span style="color:#ae81ff">660</span> <span style="color:#f92672">{}</span> <span style="color:#ae81ff">\;</span>
|
||||
$ find ~/.local/share/containers/storage/volumes/openrxv_esData_7 -type d -exec chmod <span style="color:#ae81ff">770</span> <span style="color:#f92672">{}</span> <span style="color:#ae81ff">\;</span>
|
||||
# copy backend/data to /tmp <span style="color:#66d9ef">for</span> the repository setup/layout
|
||||
$ rsync -av --partial --progress --delete provisioning@ares:/tmp/data/ backend/data
|
||||
</code></pre></div><ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ mv ~/.local/share/containers/storage/volumes/openrxv_esData_7/ ~/.local/share/containers/storage/volumes/openrxv_esData_7.2021-11-07.bak
|
||||
</span></span><span style="display:flex;"><span>$ tar xf /tmp/openrxv_esData_7.tar.xz -C ~/.local/share/containers/storage/volumes --strip-components<span style="color:#f92672">=</span><span style="color:#ae81ff">4</span>
|
||||
</span></span><span style="display:flex;"><span>$ find ~/.local/share/containers/storage/volumes/openrxv_esData_7 -type f -exec chmod <span style="color:#ae81ff">660</span> <span style="color:#f92672">{}</span> <span style="color:#ae81ff">\;</span>
|
||||
</span></span><span style="display:flex;"><span>$ find ~/.local/share/containers/storage/volumes/openrxv_esData_7 -type d -exec chmod <span style="color:#ae81ff">770</span> <span style="color:#f92672">{}</span> <span style="color:#ae81ff">\;</span>
|
||||
</span></span><span style="display:flex;"><span># copy backend/data to /tmp <span style="color:#66d9ef">for</span> the repository setup/layout
|
||||
</span></span><span style="display:flex;"><span>$ rsync -av --partial --progress --delete provisioning@ares:/tmp/data/ backend/data
|
||||
</span></span></code></pre></div><ul>
|
||||
<li>This seems to work: all items, stats, and repository setup/layout are OK</li>
|
||||
<li>I merged my <a href="https://github.com/ilri/OpenRXV/pull/126">Elasticsearch pull request</a> from last month into OpenRXV</li>
|
||||
</ul>
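<p>The <code>--strip-components=4</code> option in the restore step above is easy to get wrong, so here is a minimal Python sketch of what it does to each archive member name. It assumes GNU tar's behaviour (leading <code>/</code> dropped at archive time, members with too few components skipped); the helper name <code>strip_components</code> and the <code>_data/nodes</code> sub-path are hypothetical, for illustration only.</p>

```python
from pathlib import PurePosixPath
from typing import Optional


def strip_components(member: str, count: int) -> Optional[str]:
    """Return the member name with `count` leading path components removed.

    Hypothetical helper that mimics GNU tar's --strip-components for one
    member name. Returns None when nothing is left after stripping, which
    is the case where tar skips the member entirely.
    """
    parts = PurePosixPath(member).parts
    if len(parts) <= count:
        return None
    return str(PurePosixPath(*parts[count:]))


# tar stores /var/lib/docker/volumes/... without the leading slash, so the
# four stripped components are: var, lib, docker, volumes.
# The "_data/nodes" suffix is an assumed example path, not from the notes.
member = "var/lib/docker/volumes/openrxv_esData_7/_data/nodes"
print(strip_components(member, 4))
```

<p>With four components stripped, the volume directory itself becomes the top-level entry, which is why the archive can be extracted directly into the local <code>volumes</code> directory.</p>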
|
||||
@ -245,21 +245,21 @@ $ rsync -av --partial --progress --delete provisioning@ares:/tmp/data/ backend/d
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-console" data-lang="console">RuntimeError
|
||||
<span style="color:#960050;background-color:#1e0010">
|
||||
</span><span style="color:#960050;background-color:#1e0010"></span>Unable to find installation candidates for regex (2021.11.9)
|
||||
<span style="color:#960050;background-color:#1e0010">
|
||||
</span><span style="color:#960050;background-color:#1e0010"></span>at /usr/lib/python3.9/site-packages/poetry/installation/chooser.py:72 in choose_for
|
||||
68│
|
||||
69│ links.append(link)
|
||||
70│
|
||||
71│ if not links:
|
||||
→ 72│ raise RuntimeError(
|
||||
73│ "Unable to find installation candidates for {}".format(package)
|
||||
74│ )
|
||||
75│
|
||||
76│ # Get the best link
|
||||
</code></pre></div><ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>RuntimeError
|
||||
</span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010">
|
||||
</span></span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010"></span>Unable to find installation candidates for regex (2021.11.9)
|
||||
</span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010">
|
||||
</span></span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010"></span>at /usr/lib/python3.9/site-packages/poetry/installation/chooser.py:72 in choose_for
|
||||
</span></span><span style="display:flex;"><span> 68│
|
||||
</span></span><span style="display:flex;"><span> 69│ links.append(link)
|
||||
</span></span><span style="display:flex;"><span> 70│
|
||||
</span></span><span style="display:flex;"><span> 71│ if not links:
|
||||
</span></span><span style="display:flex;"><span> → 72│ raise RuntimeError(
|
||||
</span></span><span style="display:flex;"><span> 73│ "Unable to find installation candidates for {}".format(package)
|
||||
</span></span><span style="display:flex;"><span> 74│ )
|
||||
</span></span><span style="display:flex;"><span> 75│
|
||||
</span></span><span style="display:flex;"><span> 76│ # Get the best link
|
||||
</span></span></code></pre></div><ul>
|
||||
<li>So that’s super annoying… I’m going to try using Pipenv again…</li>
|
||||
</ul>
|
||||
<h2 id="2021-11-10">2021-11-10</h2>
|
||||
@ -280,16 +280,16 @@ $ rsync -av --partial --progress --delete provisioning@ares:/tmp/data/ backend/d
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-console" data-lang="console">$ docker-compose down
|
||||
$ sudo tar cJf openrxv_esData_7-2021-11-14.tar.xz /var/lib/docker/volumes/openrxv_esData_7
|
||||
$ cp -a backend/data backend/data.2021-11-14
|
||||
</code></pre></div><ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ docker-compose down
|
||||
</span></span><span style="display:flex;"><span>$ sudo tar cJf openrxv_esData_7-2021-11-14.tar.xz /var/lib/docker/volumes/openrxv_esData_7
|
||||
</span></span><span style="display:flex;"><span>$ cp -a backend/data backend/data.2021-11-14
|
||||
</span></span></code></pre></div><ul>
|
||||
<li>Then I checked out the latest git commit, updated all images, rebuilt the project:</li>
|
||||
</ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-console" data-lang="console">$ docker images | grep -v ^REPO | sed <span style="color:#e6db74">'s/ \+/:/g'</span> | cut -d: -f1,2 | xargs -L1 docker pull
|
||||
$ docker-compose build
|
||||
$ docker-compose up -d
|
||||
</code></pre></div><ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ docker images | grep -v ^REPO | sed <span style="color:#e6db74">'s/ \+/:/g'</span> | cut -d: -f1,2 | xargs -L1 docker pull
|
||||
</span></span><span style="display:flex;"><span>$ docker-compose build
|
||||
</span></span><span style="display:flex;"><span>$ docker-compose up -d
|
||||
</span></span></code></pre></div><ul>
|
||||
<li>Then I updated the repository configurations and started a fresh harvest</li>
|
||||
<li>Help Francesca from the Alliance with a question about embargos on CGSpace items
|
||||
<ul>
|
||||
@ -315,11 +315,11 @@ $ docker-compose up -d
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-console" data-lang="console">$ ./ilri/check-spider-ip-hits.sh -f /tmp/ips -p
|
||||
Purging 10893 hits from 87.203.87.141 in statistics
|
||||
<span style="color:#960050;background-color:#1e0010">
|
||||
</span><span style="color:#960050;background-color:#1e0010"></span>Total number of bot hits purged: 10893
|
||||
</code></pre></div><ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ ./ilri/check-spider-ip-hits.sh -f /tmp/ips -p
|
||||
</span></span><span style="display:flex;"><span>Purging 10893 hits from 87.203.87.141 in statistics
|
||||
</span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010">
|
||||
</span></span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010"></span>Total number of bot hits purged: 10893
|
||||
</span></span></code></pre></div><ul>
|
||||
<li>I did a bit more work documenting and tweaking the PostgreSQL configuration for CGSpace and DSpace Test in the Ansible infrastructure playbooks
|
||||
<ul>
|
||||
<li>I finally deployed the changes on both servers</li>
|
||||
@ -344,8 +344,8 @@ Purging 10893 hits from 87.203.87.141 in statistics
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-console" data-lang="console">$ vipsthumbnail AR<span style="color:#ae81ff">\ </span>RTB<span style="color:#ae81ff">\ </span>2020.pdf -s <span style="color:#ae81ff">600</span> -o <span style="color:#e6db74">'%s.jpg[Q=85,optimize_coding,strip]'</span>
|
||||
</code></pre></div><ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ vipsthumbnail AR<span style="color:#ae81ff">\ </span>RTB<span style="color:#ae81ff">\ </span>2020.pdf -s <span style="color:#ae81ff">600</span> -o <span style="color:#e6db74">'%s.jpg[Q=85,optimize_coding,strip]'</span>
|
||||
</span></span></code></pre></div><ul>
|
||||
<li>I sent an email to the OpenArchives.org contact to ask for help with the OAI validator
|
||||
<ul>
|
||||
<li>Someone responded to say that there have been a number of complaints about this on the oai-pmh mailing list recently…</li>
|
||||
@ -365,20 +365,20 @@ Purging 10893 hits from 87.203.87.141 in statistics
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-console" data-lang="console">$ ./ilri/check-spider-ip-hits.sh -f /tmp/ips.txt
|
||||
Found 8352 hits from 138.201.49.199 in statistics
|
||||
Found 9374 hits from 78.46.89.18 in statistics
|
||||
Found 2112 hits from 93.179.69.74 in statistics
|
||||
Found 1 hits from 31.6.77.23 in statistics
|
||||
Found 5 hits from 34.209.213.122 in statistics
|
||||
Found 86772 hits from 163.172.68.99 in statistics
|
||||
Found 77 hits from 163.172.70.248 in statistics
|
||||
Found 15842 hits from 163.172.71.24 in statistics
|
||||
Found 172954 hits from 104.154.216.0 in statistics
|
||||
Found 3 hits from 188.134.31.88 in statistics
|
||||
<span style="color:#960050;background-color:#1e0010">
|
||||
</span><span style="color:#960050;background-color:#1e0010"></span>Total number of hits from bots: 295492
|
||||
</code></pre></div><h2 id="2021-11-27">2021-11-27</h2>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ ./ilri/check-spider-ip-hits.sh -f /tmp/ips.txt
|
||||
</span></span><span style="display:flex;"><span>Found 8352 hits from 138.201.49.199 in statistics
|
||||
</span></span><span style="display:flex;"><span>Found 9374 hits from 78.46.89.18 in statistics
|
||||
</span></span><span style="display:flex;"><span>Found 2112 hits from 93.179.69.74 in statistics
|
||||
</span></span><span style="display:flex;"><span>Found 1 hits from 31.6.77.23 in statistics
|
||||
</span></span><span style="display:flex;"><span>Found 5 hits from 34.209.213.122 in statistics
|
||||
</span></span><span style="display:flex;"><span>Found 86772 hits from 163.172.68.99 in statistics
|
||||
</span></span><span style="display:flex;"><span>Found 77 hits from 163.172.70.248 in statistics
|
||||
</span></span><span style="display:flex;"><span>Found 15842 hits from 163.172.71.24 in statistics
|
||||
</span></span><span style="display:flex;"><span>Found 172954 hits from 104.154.216.0 in statistics
|
||||
</span></span><span style="display:flex;"><span>Found 3 hits from 188.134.31.88 in statistics
|
||||
</span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010">
|
||||
</span></span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010"></span>Total number of hits from bots: 295492
|
||||
</span></span></code></pre></div><h2 id="2021-11-27">2021-11-27</h2>
|
||||
<ul>
|
||||
<li>Peter sent me corrections for the authors that I had sent him back in 2021-09
|
||||
<ul>
|
||||
@ -387,16 +387,16 @@ Found 3 hits from 188.134.31.88 in statistics
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-console" data-lang="console">$ ./ilri/fix-metadata-values.py -i /tmp/authors.csv -db dspace -u dspace -p <span style="color:#e6db74">'fuuu'</span> -f dc.contributor.author -t <span style="color:#e6db74">'correct'</span> -m <span style="color:#ae81ff">3</span>
|
||||
</code></pre></div><ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ ./ilri/fix-metadata-values.py -i /tmp/authors.csv -db dspace -u dspace -p <span style="color:#e6db74">'fuuu'</span> -f dc.contributor.author -t <span style="color:#e6db74">'correct'</span> -m <span style="color:#ae81ff">3</span>
|
||||
</span></span></code></pre></div><ul>
|
||||
<li>Then I imported to CGSpace and started a full Discovery re-index:</li>
|
||||
</ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-console" data-lang="console">$ time chrt -b <span style="color:#ae81ff">0</span> ionice -c2 -n7 nice -n19 dspace index-discovery -b
|
||||
<span style="color:#960050;background-color:#1e0010">
|
||||
</span><span style="color:#960050;background-color:#1e0010"></span>real 272m43.818s
|
||||
user 183m4.543s
|
||||
sys 2m47.988
|
||||
</code></pre></div><h2 id="2021-11-28">2021-11-28</h2>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ time chrt -b <span style="color:#ae81ff">0</span> ionice -c2 -n7 nice -n19 dspace index-discovery -b
|
||||
</span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010">
|
||||
</span></span></span><span style="display:flex;"><span><span style="color:#960050;background-color:#1e0010"></span>real 272m43.818s
|
||||
</span></span><span style="display:flex;"><span>user 183m4.543s
|
||||
</span></span><span style="display:flex;"><span>sys 2m47.988
|
||||
</span></span></code></pre></div><h2 id="2021-11-28">2021-11-28</h2>
|
||||
<ul>
|
||||
<li>Run system updates on AReS server (linode20) and update all Docker containers and reboot
|
||||
<ul>
|
||||
@ -405,12 +405,12 @@ sys 2m47.988
|
||||
</li>
|
||||
<li>I am experimenting with pinning npm version 7 on OpenRXV frontend because of these Angular errors:</li>
|
||||
</ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-console" data-lang="console">npm WARN EBADENGINE Unsupported engine {
|
||||
npm WARN EBADENGINE package: '@angular-devkit/architect@0.901.15',
|
||||
npm WARN EBADENGINE required: { node: '>= 10.13.0', npm: '^6.11.0 || ^7.5.6', yarn: '>= 1.13.0' },
|
||||
npm WARN EBADENGINE current: { node: 'v12.22.7', npm: '8.1.3' }
|
||||
npm WARN EBADENGINE }
|
||||
</code></pre></div><h2 id="2021-11-29">2021-11-29</h2>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>npm WARN EBADENGINE Unsupported engine {
|
||||
</span></span><span style="display:flex;"><span>npm WARN EBADENGINE package: '@angular-devkit/architect@0.901.15',
|
||||
</span></span><span style="display:flex;"><span>npm WARN EBADENGINE required: { node: '>= 10.13.0', npm: '^6.11.0 || ^7.5.6', yarn: '>= 1.13.0' },
|
||||
</span></span><span style="display:flex;"><span>npm WARN EBADENGINE current: { node: 'v12.22.7', npm: '8.1.3' }
|
||||
</span></span><span style="display:flex;"><span>npm WARN EBADENGINE }
|
||||
</span></span></code></pre></div><h2 id="2021-11-29">2021-11-29</h2>
|
||||
<ul>
|
||||
<li>Tezira reached out to me to say that submissions on CGSpace are taking forever</li>
|
||||
<li>I see a definite increase in locks in the last few days:</li>
|
||||
@ -419,24 +419,24 @@ npm WARN EBADENGINE }
|
||||
<ul>
|
||||
<li>The locks are all held by dspaceWeb (XMLUI):</li>
|
||||
</ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-console" data-lang="console">$ psql -c <span style="color:#e6db74">"SELECT application_name FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid"</span> | sort | uniq -c | sort -n
|
||||
1
|
||||
1 ------------------
|
||||
1 (1394 rows)
|
||||
1 application_name
|
||||
9 psql
|
||||
1385 dspaceWeb
|
||||
</code></pre></div><ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ psql -c <span style="color:#e6db74">"SELECT application_name FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid"</span> | sort | uniq -c | sort -n
|
||||
</span></span><span style="display:flex;"><span> 1
|
||||
</span></span><span style="display:flex;"><span> 1 ------------------
|
||||
</span></span><span style="display:flex;"><span> 1 (1394 rows)
|
||||
</span></span><span style="display:flex;"><span> 1 application_name
|
||||
</span></span><span style="display:flex;"><span> 9 psql
|
||||
</span></span><span style="display:flex;"><span> 1385 dspaceWeb
|
||||
</span></span></code></pre></div><ul>
|
||||
<li>I restarted PostgreSQL and the locks dropped down:</li>
|
||||
</ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-console" data-lang="console">$ psql -c <span style="color:#e6db74">"SELECT application_name FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid"</span> | sort | uniq -c | sort -n
|
||||
1
|
||||
1 ------------------
|
||||
1 (103 rows)
|
||||
1 application_name
|
||||
9 psql
|
||||
94 dspaceWeb
|
||||
</code></pre></div><h2 id="2021-11-30">2021-11-30</h2>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ psql -c <span style="color:#e6db74">"SELECT application_name FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid"</span> | sort | uniq -c | sort -n
|
||||
</span></span><span style="display:flex;"><span> 1
|
||||
</span></span><span style="display:flex;"><span> 1 ------------------
|
||||
</span></span><span style="display:flex;"><span> 1 (103 rows)
|
||||
</span></span><span style="display:flex;"><span> 1 application_name
|
||||
</span></span><span style="display:flex;"><span> 9 psql
|
||||
</span></span><span style="display:flex;"><span> 94 dspaceWeb
|
||||
</span></span></code></pre></div><h2 id="2021-11-30">2021-11-30</h2>
|
||||
<ul>
|
||||
<li>IWMI sent me ORCID identifiers for some new staff
|
||||
<ul>
|
||||
@ -444,36 +444,36 @@ npm WARN EBADENGINE }
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-console" data-lang="console">$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-identifier.xml /tmp/iwmi-orcids.txt | grep -oE <span style="color:#e6db74">'[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}'</span> | sort | uniq > /tmp/2021-11-30-combined-orcids.txt
|
||||
$ wc -l /tmp/2021-11-30-combined-orcids.txt
|
||||
1348 /tmp/2021-11-30-combined-orcids.txt
|
||||
</code></pre></div><ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ cat ~/src/git/DSpace/dspace/config/controlled-vocabularies/cg-creator-identifier.xml /tmp/iwmi-orcids.txt | grep -oE <span style="color:#e6db74">'[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}'</span> | sort | uniq > /tmp/2021-11-30-combined-orcids.txt
|
||||
</span></span><span style="display:flex;"><span>$ wc -l /tmp/2021-11-30-combined-orcids.txt
|
||||
</span></span><span style="display:flex;"><span>1348 /tmp/2021-11-30-combined-orcids.txt
|
||||
</span></span></code></pre></div><ul>
|
||||
<li>After I combined them and removed duplicates, I resolved all the names using my <code>resolve-orcids.py</code> script:</li>
|
||||
</ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-console" data-lang="console">$ ./ilri/resolve-orcids.py -i /tmp/2021-11-30-combined-orcids.txt -o /tmp/2021-11-30-combined-orcids-names.txt
|
||||
</code></pre></div><ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ ./ilri/resolve-orcids.py -i /tmp/2021-11-30-combined-orcids.txt -o /tmp/2021-11-30-combined-orcids-names.txt
|
||||
</span></span></code></pre></div><ul>
|
||||
<li>Then I updated some ORCID identifiers that had changed in the XML:</li>
|
||||
</ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-console" data-lang="console">$ cat 2021-11-30-fix-orcids.csv
|
||||
cg.creator.identifier,correct
|
||||
"ADEBOWALE AKANDE: 0000-0002-6521-3272","ADEBOWALE AD AKANDE: 0000-0002-6521-3272"
|
||||
"Daniel Ortiz Gonzalo: 0000-0002-5517-1785","Daniel Ortiz-Gonzalo: 0000-0002-5517-1785"
|
||||
"FRIDAY ANETOR: 0000-0003-3137-1958","Friday Osemenshan Anetor: 0000-0003-3137-1958"
|
||||
"Sander Muilerman: 0000-0001-9103-3294","Sander Muilerman-Rodrigo: 0000-0001-9103-3294"
|
||||
$ ./ilri/fix-metadata-values.py -i 2021-11-30-fix-orcids.csv -db dspace -u dspace -p <span style="color:#e6db74">'fuuu'</span> -f cg.creator.identifier -t <span style="color:#e6db74">'correct'</span> -m <span style="color:#ae81ff">247</span>
|
||||
</code></pre></div><ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ cat 2021-11-30-fix-orcids.csv
|
||||
</span></span><span style="display:flex;"><span>cg.creator.identifier,correct
|
||||
</span></span><span style="display:flex;"><span>"ADEBOWALE AKANDE: 0000-0002-6521-3272","ADEBOWALE AD AKANDE: 0000-0002-6521-3272"
|
||||
</span></span><span style="display:flex;"><span>"Daniel Ortiz Gonzalo: 0000-0002-5517-1785","Daniel Ortiz-Gonzalo: 0000-0002-5517-1785"
|
||||
</span></span><span style="display:flex;"><span>"FRIDAY ANETOR: 0000-0003-3137-1958","Friday Osemenshan Anetor: 0000-0003-3137-1958"
|
||||
</span></span><span style="display:flex;"><span>"Sander Muilerman: 0000-0001-9103-3294","Sander Muilerman-Rodrigo: 0000-0001-9103-3294"
|
||||
</span></span><span style="display:flex;"><span>$ ./ilri/fix-metadata-values.py -i 2021-11-30-fix-orcids.csv -db dspace -u dspace -p <span style="color:#e6db74">'fuuu'</span> -f cg.creator.identifier -t <span style="color:#e6db74">'correct'</span> -m <span style="color:#ae81ff">247</span>
|
||||
</span></span></code></pre></div><ul>
|
||||
<li>Tag existing items from IWMI’s new authors with ORCID iDs using <code>add-orcid-identifiers-csv.py</code> (7 new metadata fields added):</li>
|
||||
</ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4"><code class="language-console" data-lang="console">$ cat 2021-11-30-add-orcids.csv
|
||||
dc.contributor.author,cg.creator.identifier
|
||||
"Liaqat, U.W.","Umar Waqas Liaqat: 0000-0001-9027-5232"
|
||||
"Liaqat, Umar Waqas","Umar Waqas Liaqat: 0000-0001-9027-5232"
|
||||
"Munyaradzi, M.","Munyaradzi Junia Mutenje: 0000-0002-7829-9300"
|
||||
"Mutenje, Munyaradzi","Munyaradzi Junia Mutenje: 0000-0002-7829-9300"
|
||||
"Rex, William","William Rex: 0000-0003-4979-5257"
|
||||
"Shrestha, Shisher","Nirman Shrestha: 0000-0002-0996-8611"
|
||||
$ ./ilri/add-orcid-identifiers-csv.py -i 2021-11-30-add-orcids.csv -db dspace -u dspace -p <span style="color:#e6db74">'fuuu'</span>
|
||||
</code></pre></div><!-- raw HTML omitted -->
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ cat 2021-11-30-add-orcids.csv
|
||||
</span></span><span style="display:flex;"><span>dc.contributor.author,cg.creator.identifier
|
||||
</span></span><span style="display:flex;"><span>"Liaqat, U.W.","Umar Waqas Liaqat: 0000-0001-9027-5232"
|
||||
</span></span><span style="display:flex;"><span>"Liaqat, Umar Waqas","Umar Waqas Liaqat: 0000-0001-9027-5232"
|
||||
</span></span><span style="display:flex;"><span>"Munyaradzi, M.","Munyaradzi Junia Mutenje: 0000-0002-7829-9300"
|
||||
</span></span><span style="display:flex;"><span>"Mutenje, Munyaradzi","Munyaradzi Junia Mutenje: 0000-0002-7829-9300"
|
||||
</span></span><span style="display:flex;"><span>"Rex, William","William Rex: 0000-0003-4979-5257"
|
||||
</span></span><span style="display:flex;"><span>"Shrestha, Shisher","Nirman Shrestha: 0000-0002-0996-8611"
|
||||
</span></span><span style="display:flex;"><span>$ ./ilri/add-orcid-identifiers-csv.py -i 2021-11-30-add-orcids.csv -db dspace -u dspace -p <span style="color:#e6db74">'fuuu'</span>
|
||||
</span></span></code></pre></div><!-- raw HTML omitted -->
|
||||