Add notes for 2021-01-26

2021-01-26 15:52:39 +02:00
parent ce74818085
commit ea19549164
23 changed files with 140 additions and 28 deletions

@@ -346,5 +346,57 @@ $ curl -XDELETE 'http://localhost:9200/openrxv-items-2021-01-25'
- If you do a free-text search it works properly, but if you try to use the metadata filters it doesn't
- I changed the default setting to make it available to any logged in user and will deploy it on CGSpace this week
## 2021-01-26
- Email some CIAT users who submitted items with upper case AGROVOC terms
- I will do another global replace soon after they reply
- Add CGIAR Impact Areas and UN Sustainable Development Goals (SDGs) to the `6x_prod` branch
- Looking into the issue with exporting search results in XMLUI again
- I notice that there is an HTTP 400 when you try to export search results containing a filter
- The Tomcat logs show:
```
Jan 26, 2021 10:47:23 AM org.apache.coyote.http11.AbstractHttp11Processor process
INFO: Error parsing HTTP request header
Note: further occurrences of HTTP request parsing errors will be logged at DEBUG level.
java.lang.IllegalArgumentException: Invalid character found in the request target [/discover/search/csv?query=*&scope=~&filters=author:(Alan\%20Orth)]. The valid characters are defined in RFC 7230 and RFC 3986
at org.apache.coyote.http11.InternalInputBuffer.parseRequestLine(InternalInputBuffer.java:213)
at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1108)
at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:654)
at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:317)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:748)
```
- This actually seems to be a simple issue: DSpace is adding a backslash before the URL-encoded space (`\%20`) for some reason, and Tomcat rejects the backslash as an invalid character:
- The URL that fails is: https://dspacetest.cgiar.org/discover/search/csv?query=*&scope=~&filters=author:(Alan\%20Orth)
- The URL that works is: https://dspacetest.cgiar.org/discover/search/csv?query=*&scope=~&filters=author:(Alan%20Orth)
- I [filed a bug](https://jira.lyrasis.org/browse/DS-4566) on DSpace's issue tracker (though I accidentally hit Enter and submitted it before I finished, and there is no edit function)
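- A quick way to see which character Tomcat is objecting to is to strip everything that RFC 3986 allows and look at what is left. This is only a rough sketch: `invalid_chars` is a hypothetical helper of mine, the allowed set is simplified (it omits `[` and `]` and does not validate percent-encoding), and Tomcat's own check may be stricter or looser:

```shell
# Hypothetical helper: print the characters of a request target that fall
# outside a simplified RFC 3986 set (unreserved + reserved + "%").
# Assumption: "[" and "]" are left out and "%" sequences are not validated.
invalid_chars() {
  printf '%s' "$1" | LC_ALL=C tr -d "A-Za-z0-9._~:/?#@\$&'()*+,;!=%-"
}

# The failing request target from the Tomcat log above
invalid_chars '/discover/search/csv?query=*&scope=~&filters=author:(Alan\%20Orth)'
echo
```

- For the failing target the only character left over is the backslash, and for the working URL nothing is left over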
- Looking into a Linode report that the outbound traffic rate was high this morning:
```console
# grep -E '26/Jan/2021:(08|09|10|11|12)' /var/log/nginx/rest.log | goaccess --log-format=COMBINED -
```
- The culprit seems to be the ILRI publications importer, so that's OK
- But I also see an IP in Jordan hitting the REST API 1,100 times today:
```
80.10.12.54 - - [26/Jan/2021:09:43:42 +0100] "GET /rest/rest/bitstreams/98309f17-a831-48ed-8f0a-2d3244cc5a1c/retrieve HTTP/2.0" 302 138 "http://wp.local/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36"
```
- Seems to be someone from CodeObia working on WordPress
- I told them to please use a bot user agent so it doesn't affect our stats, and to use DSpace Test if possible
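- For a quick per-IP count without goaccess, something like the following should also work. It is a minimal sketch that assumes the client IP is the first field of each log line (as in the nginx line above); `top_clients` is a hypothetical helper name:

```shell
# Hypothetical helper: read access log lines on stdin and print the
# busiest client IPs, most requests first (default: top 10).
# Assumption: the client IP is the first whitespace-separated field.
top_clients() {
  awk '{ print $1 }' | sort | uniq -c | sort -rn | head -n "${1:-10}"
}

# Example usage against the same time window as above:
# grep -E '26/Jan/2021:(08|09|10|11|12)' /var/log/nginx/rest.log | top_clients 5
```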
- I purged all ~3,000 statistics hits that have the "http://wp.local/" referrer:
```console
$ curl -s "http://localhost:8081/solr/statistics/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>referrer:http\:\/\/wp\.local\/</query></delete>"
```
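- The backslashes in the delete query are there because `:` and `/` are special characters in the Solr query syntax. A small sketch of a helper that does the escaping, so the value can be checked before it goes into a delete (`solr_escape` is a hypothetical name, and the character list is my reading of the Lucene query parser's special characters, so treat it as an assumption):

```shell
# Hypothetical helper: backslash-escape characters that are special in the
# Solr/Lucene query syntax. The character set is an assumption based on the
# standard query parser; it does not escape "." (Solr does not require it).
solr_escape() {
  printf '%s' "$1" | sed 's#[][+&|!(){}^"~*?:\\/-]#\\&#g'
}

solr_escape 'http://wp.local/'
echo

# Possible usage (needs a running Solr, so it is left commented out):
# count matching hits first, without returning any documents.
# curl -s -G "http://localhost:8081/solr/statistics/select" \
#   --data-urlencode "q=referrer:$(solr_escape 'http://wp.local/')" \
#   -d rows=0
```

- Checking `numFound` in the select response before running the delete would confirm how many hits are about to be purged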
- Tag version 0.4.3 of the csv-metadata-quality tool on GitHub: https://github.com/ilri/csv-metadata-quality/releases/tag/v0.4.3
- I just realized that I never submitted this to CGSpace as a Big Data Platform output
- I used my previous [DSpace Statistics API submission](https://hdl.handle.net/10568/99143) as a reference and submitted it to CGSpace
<!-- vim: set sw=2 ts=2: -->