mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2024-12-22 13:12:19 +01:00
Add notes for 2021-01-26
parent
ce74818085
commit
ea19549164
@ -346,5 +346,57 @@ $ curl -XDELETE 'http://localhost:9200/openrxv-items-2021-01-25'
|
||||
- If you do a free-text search it works properly, but if you try to use the metadata filters it doesn't
|
||||
- I changed the default setting to make it available to any logged in user and will deploy it on CGSpace this week
|
||||
|
||||
## 2021-01-26
|
||||
|
||||
- Email some CIAT users who submitted items with upper case AGROVOC terms
|
||||
- I will do another global replace soon after they reply
|
||||
- Add CGIAR Impact Areas and UN Sustainable Development Goals (SDGs) to the `6x_prod` branch
|
||||
- Looking into the issue with exporting search results in XMLUI again
|
||||
- I notice that there is an HTTP 400 when you try to export search results containing a filter
|
||||
- The Tomcat logs show:
|
||||
|
||||
```
|
||||
Jan 26, 2021 10:47:23 AM org.apache.coyote.http11.AbstractHttp11Processor process
|
||||
INFO: Error parsing HTTP request header
|
||||
Note: further occurrences of HTTP request parsing errors will be logged at DEBUG level.
|
||||
java.lang.IllegalArgumentException: Invalid character found in the request target [/discover/search/csv?query=*&scope=~&filters=author:(Alan\%20Orth)]. The valid characters are defined in RFC 7230 and RFC 3986
|
||||
at org.apache.coyote.http11.InternalInputBuffer.parseRequestLine(InternalInputBuffer.java:213)
|
||||
at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1108)
|
||||
at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:654)
|
||||
at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:317)
|
||||
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
|
||||
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
|
||||
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
|
||||
at java.lang.Thread.run(Thread.java:748)
|
||||
```
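- Tomcat rejects this request target because a literal backslash is not one of the characters RFC 3986 allows in a URI. A rough sketch for spotting such characters, assuming a POSIX shell; the function name `find_invalid_chars` is hypothetical and the character set is only my approximation of the RFC 3986 unreserved, reserved and percent characters, not Tomcat's actual parser:

```sh
# Hypothetical helper: print every character of $1 that falls outside an
# approximation of the RFC 3986 character set (letters, digits,
# unreserved and reserved punctuation, and "%" for percent-encoding).
find_invalid_chars() {
  printf '%s' "$1" | LC_ALL=C tr -d "A-Za-z0-9._~:/?#[]@!\$&'()*+,;=%-"
}

# Request target copied from the Tomcat log above.
find_invalid_chars '/discover/search/csv?query=*&scope=~&filters=author:(Alan\%20Orth)'
echo
```

- Run against the request target from the log, this prints a single backslash; for the same target without the backslash it prints nothing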
|
||||
|
||||
- This actually seems to be a simple issue, as I notice DSpace is adding a backslash before the URL-encoded space (`\%20`) for some reason:
|
||||
- The URL that fails is: `https://dspacetest.cgiar.org/discover/search/csv?query=*&scope=~&filters=author:(Alan\%20Orth)`
|
||||
- The URL that works is: `https://dspacetest.cgiar.org/discover/search/csv?query=*&scope=~&filters=author:(Alan%20Orth)`
|
||||
- I [filed a bug](https://jira.lyrasis.org/browse/DS-4566) on DSpace's issue tracker (though I accidentally hit Enter and submitted it before I finished, and there is no edit function)
|
||||
- Looking into a Linode report that the outbound traffic rate was high this morning:
|
||||
|
||||
```console
|
||||
# grep -E '26/Jan/2021:(08|09|10|11|12)' /var/log/nginx/rest.log | goaccess --log-format=COMBINED -
|
||||
```
|
||||
|
||||
- The culprit seems to be the ILRI publications importer, so that's OK
|
||||
- But I also see an IP in Jordan hitting the REST API 1,100 times today:
|
||||
|
||||
```
|
||||
80.10.12.54 - - [26/Jan/2021:09:43:42 +0100] "GET /rest/rest/bitstreams/98309f17-a831-48ed-8f0a-2d3244cc5a1c/retrieve HTTP/2.0" 302 138 "http://wp.local/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36"
|
||||
```
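- A per-IP tally like the 1,100 figure can be approximated with standard tools. This is a minimal sketch, assuming the combined log format where the client IP is the first field; the function name `top_ips` is hypothetical:

```sh
# Hypothetical helper: read access log lines on stdin and print
# "count ip" pairs, busiest client first.
top_ips() {
  awk '{ print $1 }' | sort | uniq -c | sort -rn | awk '{ print $1, $2 }'
}

# Example usage with the log file from the grep above (not run here):
#   grep -F '26/Jan/2021:' /var/log/nginx/rest.log | top_ips | head -n 10
```

- The final `awk` only normalizes the padding that `uniq -c` adds, so the output is easy to compare or feed into other tools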
|
||||
|
||||
- Seems to be someone from CodeObia working on WordPress
|
||||
- I told them to please use a bot user agent so it doesn't affect our stats, and to use DSpace Test if possible
|
||||
- I purged all ~3,000 statistics hits that have the `http://wp.local/` referrer:
|
||||
|
||||
```console
|
||||
$ curl -s "http://localhost:8081/solr/statistics/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>referrer:http\:\/\/wp\.local\/</query></delete>"
|
||||
```
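- Before a delete-by-query like this, it may be worth checking how many documents match. A sketch, assuming the same Solr core URL as above and Solr's standard `select` handler with `rows=0`; the helper names `solr_escape` and `count_referrer_hits` are hypothetical, and the escaping only covers the three characters escaped in the command above:

```sh
# Hypothetical helper: backslash-escape "/", "." and ":" in a value,
# mirroring the escaping used in the delete query above.
solr_escape() {
  printf '%s' "$1" | sed 's|[/.:]|\\&|g'
}

# Hypothetical helper: ask Solr for the matching documents with rows=0,
# so only the count (numFound) comes back and nothing is modified.
count_referrer_hits() {
  curl -s -G "http://localhost:8081/solr/statistics/select" \
    --data-urlencode "q=referrer:$(solr_escape "$1")" \
    --data-urlencode "rows=0"
}
```

- Calling `count_referrer_hits 'http://wp.local/'` should return a response whose `numFound` is the number of hits the delete would remove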
|
||||
|
||||
- Tag version 0.4.3 of the csv-metadata-quality tool on GitHub: https://github.com/ilri/csv-metadata-quality/releases/tag/v0.4.3
|
||||
- I just realized that I never submitted this to CGSpace as a Big Data Platform output
|
||||
- I used my previous [DSpace Statistics API submission](https://hdl.handle.net/10568/99143) as a reference and submitted it to CGSpace
|
||||
|
||||
<!-- vim: set sw=2 ts=2: -->
|
||||
|
@ -27,7 +27,7 @@ For example, this item has 51 views on CGSpace, but 0 on AReS
|
||||
<meta property="og:type" content="article" />
|
||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2021-01/" />
|
||||
<meta property="article:published_time" content="2021-01-03T10:13:54+02:00" />
|
||||
<meta property="article:modified_time" content="2021-01-24T17:40:56+02:00" />
|
||||
<meta property="article:modified_time" content="2021-01-25T16:37:30+02:00" />
|
||||
|
||||
|
||||
|
||||
@ -60,9 +60,9 @@ For example, this item has 51 views on CGSpace, but 0 on AReS
|
||||
"@type": "BlogPosting",
|
||||
"headline": "January, 2021",
|
||||
"url": "https://alanorth.github.io/cgspace-notes/2021-01/",
|
||||
"wordCount": "2526",
|
||||
"wordCount": "2880",
|
||||
"datePublished": "2021-01-03T10:13:54+02:00",
|
||||
"dateModified": "2021-01-24T17:40:56+02:00",
|
||||
"dateModified": "2021-01-25T16:37:30+02:00",
|
||||
"author": {
|
||||
"@type": "Person",
|
||||
"name": "Alan Orth"
|
||||
@ -564,6 +564,66 @@ $ curl -XDELETE 'http://localhost:9200/openrxv-items-2021-01-25'
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<h2 id="2021-01-26">2021-01-26</h2>
|
||||
<ul>
|
||||
<li>Email some CIAT users who submitted items with upper case AGROVOC terms
|
||||
<ul>
|
||||
<li>I will do another global replace soon after they reply</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>Add CGIAR Impact Areas and UN Sustainable Development Goals (SDGs) to the <code>6x_prod</code> branch</li>
|
||||
<li>Looking into the issue with exporting search results in XMLUI again
|
||||
<ul>
|
||||
<li>I notice that there is an HTTP 400 when you try to export search results containing a filter</li>
|
||||
<li>The Tomcat logs show:</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<pre><code>Jan 26, 2021 10:47:23 AM org.apache.coyote.http11.AbstractHttp11Processor process
|
||||
INFO: Error parsing HTTP request header
|
||||
Note: further occurrences of HTTP request parsing errors will be logged at DEBUG level.
|
||||
java.lang.IllegalArgumentException: Invalid character found in the request target [/discover/search/csv?query=*&scope=~&filters=author:(Alan\%20Orth)]. The valid characters are defined in RFC 7230 and RFC 3986
|
||||
at org.apache.coyote.http11.InternalInputBuffer.parseRequestLine(InternalInputBuffer.java:213)
|
||||
at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1108)
|
||||
at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:654)
|
||||
at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:317)
|
||||
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
|
||||
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
|
||||
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
|
||||
at java.lang.Thread.run(Thread.java:748)
|
||||
</code></pre><ul>
|
||||
<li>This actually seems to be a simple issue, as I notice DSpace is adding a backslash before the URL-encoded space (<code>\%20</code>) for some reason:
|
||||
<ul>
|
||||
<li>The URL that fails is: <code>https://dspacetest.cgiar.org/discover/search/csv?query=*&amp;scope=~&amp;filters=author:(Alan\%20Orth)</code></li>
|
||||
<li>The URL that works is: <code>https://dspacetest.cgiar.org/discover/search/csv?query=*&amp;scope=~&amp;filters=author:(Alan%20Orth)</code></li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>I <a href="https://jira.lyrasis.org/browse/DS-4566">filed a bug</a> on DSpace’s issue tracker (though I accidentally hit Enter and submitted it before I finished, and there is no edit function)</li>
|
||||
<li>Looking into a Linode report that the outbound traffic rate was high this morning:</li>
|
||||
</ul>
|
||||
<pre><code class="language-console" data-lang="console"># grep -E '26/Jan/2021:(08|09|10|11|12)' /var/log/nginx/rest.log | goaccess --log-format=COMBINED -
|
||||
</code></pre><ul>
|
||||
<li>The culprit seems to be the ILRI publications importer, so that’s OK</li>
|
||||
<li>But I also see an IP in Jordan hitting the REST API 1,100 times today:</li>
|
||||
</ul>
|
||||
<pre><code>80.10.12.54 - - [26/Jan/2021:09:43:42 +0100] "GET /rest/rest/bitstreams/98309f17-a831-48ed-8f0a-2d3244cc5a1c/retrieve HTTP/2.0" 302 138 "http://wp.local/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36"
|
||||
</code></pre><ul>
|
||||
<li>Seems to be someone from CodeObia working on WordPress
|
||||
<ul>
|
||||
<li>I told them to please use a bot user agent so it doesn’t affect our stats, and to use DSpace Test if possible</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>I purged all ~3,000 statistics hits that have the <code>http://wp.local/</code> referrer:</li>
|
||||
</ul>
|
||||
<pre><code class="language-console" data-lang="console">$ curl -s "http://localhost:8081/solr/statistics/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>referrer:http\:\/\/wp\.local\/</query></delete>"
|
||||
</code></pre><ul>
|
||||
<li>Tag version 0.4.3 of the csv-metadata-quality tool on GitHub: <a href="https://github.com/ilri/csv-metadata-quality/releases/tag/v0.4.3">https://github.com/ilri/csv-metadata-quality/releases/tag/v0.4.3</a>
|
||||
<ul>
|
||||
<li>I just realized that I never submitted this to CGSpace as a Big Data Platform output</li>
|
||||
<li>I used my previous <a href="https://hdl.handle.net/10568/99143">DSpace Statistics API submission</a> as a reference and submitted it to CGSpace</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<!-- raw HTML omitted -->
|
||||
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||
<meta property="og:type" content="website" />
|
||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/" />
|
||||
<meta property="og:updated_time" content="2021-01-24T17:40:56+02:00" />
|
||||
<meta property="og:updated_time" content="2021-01-25T16:37:30+02:00" />
|
||||
|
||||
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||
<meta property="og:type" content="website" />
|
||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
|
||||
<meta property="og:updated_time" content="2021-01-24T17:40:56+02:00" />
|
||||
<meta property="og:updated_time" content="2021-01-25T16:37:30+02:00" />
|
||||
|
||||
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||
<meta property="og:type" content="website" />
|
||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
|
||||
<meta property="og:updated_time" content="2021-01-24T17:40:56+02:00" />
|
||||
<meta property="og:updated_time" content="2021-01-25T16:37:30+02:00" />
|
||||
|
||||
|
||||
|
||||
|
@ -10,7 +10,7 @@
|
||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||
<meta property="og:type" content="website" />
|
||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
|
||||
<meta property="og:updated_time" content="2021-01-24T17:40:56+02:00" />
|
||||
<meta property="og:updated_time" content="2021-01-25T16:37:30+02:00" />
|
||||
|
||||
|
||||
|
||||
|
@ -4,27 +4,27 @@
|
||||
|
||||
<url>
|
||||
<loc>https://alanorth.github.io/cgspace-notes/categories/</loc>
|
||||
<lastmod>2021-01-24T17:40:56+02:00</lastmod>
|
||||
<lastmod>2021-01-25T16:37:30+02:00</lastmod>
|
||||
</url>
|
||||
|
||||
<url>
|
||||
<loc>https://alanorth.github.io/cgspace-notes/</loc>
|
||||
<lastmod>2021-01-24T17:40:56+02:00</lastmod>
|
||||
<lastmod>2021-01-25T16:37:30+02:00</lastmod>
|
||||
</url>
|
||||
|
||||
<url>
|
||||
<loc>https://alanorth.github.io/cgspace-notes/2021-01/</loc>
|
||||
<lastmod>2021-01-24T17:40:56+02:00</lastmod>
|
||||
<lastmod>2021-01-25T16:37:30+02:00</lastmod>
|
||||
</url>
|
||||
|
||||
<url>
|
||||
<loc>https://alanorth.github.io/cgspace-notes/categories/notes/</loc>
|
||||
<lastmod>2021-01-24T17:40:56+02:00</lastmod>
|
||||
<lastmod>2021-01-25T16:37:30+02:00</lastmod>
|
||||
</url>
|
||||
|
||||
<url>
|
||||
<loc>https://alanorth.github.io/cgspace-notes/posts/</loc>
|
||||
<lastmod>2021-01-24T17:40:56+02:00</lastmod>
|
||||
<lastmod>2021-01-25T16:37:30+02:00</lastmod>
|
||||
</url>
|
||||
|
||||
<url>
|
||||
|