mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Add notes for 2021-09-13
@@ -50,7 +50,7 @@ For example, this item has 51 views on CGSpace, but 0 on AReS


 "/>
-<meta name="generator" content="Hugo 0.87.0" />
+<meta name="generator" content="Hugo 0.88.1" />



@@ -160,12 +160,12 @@ For example, this item has 51 views on CGSpace, but 0 on AReS
 </ul>
 </li>
 </ul>
-<pre><code class="language-console" data-lang="console">$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
+<pre tabindex="0"><code class="language-console" data-lang="console">$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
 # start indexing in AReS
 </code></pre><ul>
 <li>Then, the next morning when it’s done, check the results of the harvesting, backup the current <code>openrxv-items</code> index, and clone the <code>openrxv-items-temp</code> index to <code>openrxv-items</code>:</li>
 </ul>
-<pre><code class="language-console" data-lang="console">$ curl -s 'http://localhost:9200/openrxv-items-temp/_count?q=*&pretty'
+<pre tabindex="0"><code class="language-console" data-lang="console">$ curl -s 'http://localhost:9200/openrxv-items-temp/_count?q=*&pretty'
 {
 "count" : 100278,
 "_shards" : {
@@ -214,7 +214,7 @@ $ curl -XDELETE 'http://localhost:9200/openrxv-items-2021-01-04'
 </ul>
 </li>
 </ul>
-<pre><code class="language-console" data-lang="console">$ ./doi-to-handle.py -db dspace -u dspace -p 'fuuu' -i /tmp/dois.txt -o /tmp/out.csv
+<pre tabindex="0"><code class="language-console" data-lang="console">$ ./doi-to-handle.py -db dspace -u dspace -p 'fuuu' -i /tmp/dois.txt -o /tmp/out.csv
 </code></pre><ul>
 <li>Help Udana export IWMI records from AReS
 <ul>
@@ -261,12 +261,12 @@ $ curl -XDELETE 'http://localhost:9200/openrxv-items-2021-01-04'
 </ul>
 </li>
 </ul>
-<pre><code class="language-console" data-lang="console">2021-01-10 10:03:27,692 WARN com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=MODIFY, SubjectType=ITEM, SubjectID=1e8fb96c-b994-4fe2-8f0c-0a98ab138be0, ObjectType=(Unknown), ObjectID=null, TimeStamp=1610269383279, dispatcher=1544803905, detail=[null], transactionID="TX35636856957739531161091194485578658698")
+<pre tabindex="0"><code class="language-console" data-lang="console">2021-01-10 10:03:27,692 WARN com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=MODIFY, SubjectType=ITEM, SubjectID=1e8fb96c-b994-4fe2-8f0c-0a98ab138be0, ObjectType=(Unknown), ObjectID=null, TimeStamp=1610269383279, dispatcher=1544803905, detail=[null], transactionID="TX35636856957739531161091194485578658698")
 </code></pre><ul>
 <li>I filed <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=907">a bug on Atmire’s issue tracker</a></li>
 <li>Peter asked me to move the CGIAR Gender Platform community to the top level of CGSpace, but I get an error when I use the community-filiator command:</li>
 </ul>
-<pre><code class="language-console" data-lang="console">$ dspace community-filiator --remove --parent=10568/66598 --child=10568/106605
+<pre tabindex="0"><code class="language-console" data-lang="console">$ dspace community-filiator --remove --parent=10568/66598 --child=10568/106605
 Loading @mire database changes for module MQM
 Changes have been processed
 Exception: null
@@ -301,7 +301,7 @@ java.lang.UnsupportedOperationException
 </ul>
 </li>
 </ul>
-<pre><code class="language-console" data-lang="console">$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
+<pre tabindex="0"><code class="language-console" data-lang="console">$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
 # start indexing in AReS
 ... after ten hours
 $ curl -s 'http://localhost:9200/openrxv-items-temp/_count?q=*&pretty'
@@ -331,7 +331,7 @@ $ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
 </ul>
 </li>
 </ul>
-<pre><code class="language-console" data-lang="console">$ cat log/dspace.log.2020-12-2* | grep -E 'session_id=[A-Z0-9]{32}:ip_addr=64.62.202.71' | sort | uniq | wc -l
+<pre tabindex="0"><code class="language-console" data-lang="console">$ cat log/dspace.log.2020-12-2* | grep -E 'session_id=[A-Z0-9]{32}:ip_addr=64.62.202.71' | sort | uniq | wc -l
 0
 </code></pre><ul>
 <li>So now I should really add it to the DSpace spider agent list so it doesn’t create Solr hits
@@ -341,7 +341,7 @@ $ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
 </li>
 <li>I purged the existing hits using my <code>check-spider-ip-hits.sh</code> script:</li>
 </ul>
-<pre><code class="language-console" data-lang="console">$ ./check-spider-ip-hits.sh -d -f /tmp/ips -s http://localhost:8081/solr -s statistics -p
+<pre tabindex="0"><code class="language-console" data-lang="console">$ ./check-spider-ip-hits.sh -d -f /tmp/ips -s http://localhost:8081/solr -s statistics -p
 </code></pre><h2 id="2021-01-11">2021-01-11</h2>
 <ul>
 <li>The AReS indexing finished this morning and I moved the <code>openrxv-items-temp</code> core to <code>openrxv-items</code> (see above)
@@ -351,7 +351,7 @@ $ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
 </li>
 <li>I deployed the community-filiator fix on CGSpace and moved the Gender Platform community to the top level of CGSpace:</li>
 </ul>
-<pre><code class="language-console" data-lang="console">$ dspace community-filiator --remove --parent=10568/66598 --child=10568/106605
+<pre tabindex="0"><code class="language-console" data-lang="console">$ dspace community-filiator --remove --parent=10568/66598 --child=10568/106605
 </code></pre><h2 id="2021-01-12">2021-01-12</h2>
 <ul>
 <li>IWMI is really pressuring us to have a periodic CSV export of their community
@@ -393,12 +393,12 @@ $ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
 </ul>
 </li>
 </ul>
-<pre><code class="language-console" data-lang="console">$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
+<pre tabindex="0"><code class="language-console" data-lang="console">$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
 # start indexing in AReS
 </code></pre><ul>
 <li>Then, the next morning when it’s done, check the results of the harvesting, backup the current <code>openrxv-items</code> index, and clone the <code>openrxv-items-temp</code> index to <code>openrxv-items</code>:</li>
 </ul>
-<pre><code class="language-console" data-lang="console">$ curl -s 'http://localhost:9200/openrxv-items-temp/_count?q=*&pretty'
+<pre tabindex="0"><code class="language-console" data-lang="console">$ curl -s 'http://localhost:9200/openrxv-items-temp/_count?q=*&pretty'
 {
 "count" : 100540,
 "_shards" : {
@@ -445,7 +445,7 @@ $ curl -XDELETE 'http://localhost:9200/openrxv-items-2021-01-18'
 </ul>
 </li>
 </ul>
-<pre><code>localhost/dspace63= > BEGIN;
+<pre tabindex="0"><code>localhost/dspace63= > BEGIN;
 localhost/dspace63= > DELETE FROM metadatavalue WHERE metadata_field_id IN (115, 116, 117, 118);
 DELETE 27
 localhost/dspace63= > COMMIT;
@@ -462,7 +462,7 @@ localhost/dspace63= > COMMIT;
 </ul>
 </li>
 </ul>
-<pre><code class="language-console" data-lang="console">$ docker exec -it api /bin/bash
+<pre tabindex="0"><code class="language-console" data-lang="console">$ docker exec -it api /bin/bash
 # apt update && apt install unoconv
 </code></pre><ul>
 <li>Help Peter get a list of titles and DOIs for CGSpace items that Altmetric does not have an attention score for
@@ -512,12 +512,12 @@ localhost/dspace63= > COMMIT;
 </ul>
 </li>
 </ul>
-<pre><code class="language-console" data-lang="console">$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
+<pre tabindex="0"><code class="language-console" data-lang="console">$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
 # start indexing in AReS
 </code></pre><ul>
 <li>Then, the next morning when it’s done, check the results of the harvesting, backup the current <code>openrxv-items</code> index, and clone the <code>openrxv-items-temp</code> index to <code>openrxv-items</code>:</li>
 </ul>
-<pre><code class="language-console" data-lang="console">$ curl -s 'http://localhost:9200/openrxv-items-temp/_count?q=*&pretty'
+<pre tabindex="0"><code class="language-console" data-lang="console">$ curl -s 'http://localhost:9200/openrxv-items-temp/_count?q=*&pretty'
 {
 "count" : 100699,
 "_shards" : {
@@ -579,7 +579,7 @@ $ curl -XDELETE 'http://localhost:9200/openrxv-items-2021-01-25'
 </ul>
 </li>
 </ul>
-<pre><code>Jan 26, 2021 10:47:23 AM org.apache.coyote.http11.AbstractHttp11Processor process
+<pre tabindex="0"><code>Jan 26, 2021 10:47:23 AM org.apache.coyote.http11.AbstractHttp11Processor process
 INFO: Error parsing HTTP request header
 Note: further occurrences of HTTP request parsing errors will be logged at DEBUG level.
 java.lang.IllegalArgumentException: Invalid character found in the request target [/discover/search/csv?query=*&scope=~&filters=author:(Alan\%20Orth)]. The valid characters are defined in RFC 7230 and RFC 3986
@@ -601,12 +601,12 @@ java.lang.IllegalArgumentException: Invalid character found in the request targe
 <li>I <a href="https://jira.lyrasis.org/browse/DS-4566">filed a bug</a> on DSpace’s issue tracker (though I accidentally hit Enter and submitted it before I finished, and there is no edit function)</li>
 <li>Looking into Linode report that the load outbound traffic rate was high this morning:</li>
 </ul>
-<pre><code class="language-console" data-lang="console"># grep -E '26/Jan/2021:(08|09|10|11|12)' /var/log/nginx/rest.log | goaccess --log-format=COMBINED -
+<pre tabindex="0"><code class="language-console" data-lang="console"># grep -E '26/Jan/2021:(08|09|10|11|12)' /var/log/nginx/rest.log | goaccess --log-format=COMBINED -
 </code></pre><ul>
 <li>The culprit seems to be the ILRI publications importer, so that’s OK</li>
 <li>But I also see an IP in Jordan hitting the REST API 1,100 times today:</li>
 </ul>
-<pre><code>80.10.12.54 - - [26/Jan/2021:09:43:42 +0100] "GET /rest/rest/bitstreams/98309f17-a831-48ed-8f0a-2d3244cc5a1c/retrieve HTTP/2.0" 302 138 "http://wp.local/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36"
+<pre tabindex="0"><code>80.10.12.54 - - [26/Jan/2021:09:43:42 +0100] "GET /rest/rest/bitstreams/98309f17-a831-48ed-8f0a-2d3244cc5a1c/retrieve HTTP/2.0" 302 138 "http://wp.local/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36"
 </code></pre><ul>
 <li>Seems to be someone from CodeObia working on WordPress
 <ul>
@@ -615,7 +615,7 @@ java.lang.IllegalArgumentException: Invalid character found in the request targe
 </li>
 <li>I purged all ~3,000 statistics hits that have the “<a href="http://wp.local/%22">http://wp.local/"</a> referrer:</li>
 </ul>
-<pre><code class="language-console" data-lang="console">$ curl -s "http://localhost:8081/solr/statistics/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>referrer:http\:\/\/wp\.local\/</query></delete>"
+<pre tabindex="0"><code class="language-console" data-lang="console">$ curl -s "http://localhost:8081/solr/statistics/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>referrer:http\:\/\/wp\.local\/</query></delete>"
 </code></pre><ul>
 <li>Tag version 0.4.3 of the csv-metadata-quality tool on GitHub: <a href="https://github.com/ilri/csv-metadata-quality/releases/tag/v0.4.3">https://github.com/ilri/csv-metadata-quality/releases/tag/v0.4.3</a>
 <ul>
@@ -661,7 +661,7 @@ java.lang.IllegalArgumentException: Invalid character found in the request targe
 </ul>
 </li>
 </ul>
-<pre><code class="language-console" data-lang="console">$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
+<pre tabindex="0"><code class="language-console" data-lang="console">$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
 # start indexing in AReS
 </code></pre><ul>
 <li>Sent out emails about CG Core v2 to Macaroni Bros, Fabio, Hector at CCAFS, Dani and Tariku</li>