Add notes for 2021-03-07

2021-03-07 15:51:12 +02:00
parent 24d6440c91
commit 75596fd524
23 changed files with 172 additions and 28 deletions

@@ -162,4 +162,76 @@ $ curl -XDELETE 'http://localhost:9200/openrxv-items-2021-03-05'
- [docker/docker-compose.yml: Pin Redis to version 5](https://github.com/ilri/OpenRXV/pull/87)
- I deployed the latest changes from the last few days on AReS production
## 2021-03-07
- I realized there is something wrong with the Elasticsearch indexes on AReS
- On a new test environment I see `openrxv-items` is correctly an alias of `openrxv-items-final`:
```console
$ curl -s 'http://localhost:9200/_alias/' | python -m json.tool | less
...
"openrxv-items-final": {
"aliases": {
"openrxv-items": {}
}
},
```
- But on AReS production `openrxv-items` has somehow become an index:
```console
$ curl -s 'http://localhost:9200/_alias/' | python -m json.tool | less
...
"openrxv-items": {
"aliases": {}
},
"openrxv-items-final": {
"aliases": {}
},
"openrxv-items-temp": {
"aliases": {}
},
```
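- A minimal sketch for checking this from a script rather than by eye, assuming the alias list is fetched in the two-column text form of the `_cat/aliases` API (`alias index` per line); the function name `alias_target` and the sample input are made up for illustration:

```shell
# alias_target NAME: read "alias index" pairs on stdin and print the index
# behind NAME. Exits non-zero when NAME is not listed as an alias.
# (Hypothetical helper, not part of OpenRXV.)
alias_target() {
  awk -v name="$1" '
    $1 == name { print $2; found = 1 }
    END { exit found ? 0 : 1 }
  '
}

# Sample input mirroring the healthy test environment described above.
sample='openrxv-items openrxv-items-final'

if target=$(printf '%s\n' "$sample" | alias_target openrxv-items); then
  echo "openrxv-items is an alias of $target"
else
  echo "openrxv-items is not an alias (it may be a concrete index)"
fi
```

- Against a live cluster the input would come from something like `curl -s 'http://localhost:9200/_cat/aliases?h=alias,index'`; an empty response, as production would have given here, makes the function exit non-zero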
- I fixed the issue on production by making `openrxv-items` read-only, cloning it to a dated backup, replacing `openrxv-items-final` with a clone of `openrxv-items`, deleting `openrxv-items`, and then re-creating it as an alias:
```console
$ curl -X PUT "localhost:9200/openrxv-items/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
$ curl -s -X POST http://localhost:9200/openrxv-items/_clone/openrxv-items-2021-03-07
$ curl -XDELETE 'http://localhost:9200/openrxv-items-final'
$ curl -s -X POST http://localhost:9200/openrxv-items/_clone/openrxv-items-final
$ curl -XDELETE 'http://localhost:9200/openrxv-items'
$ curl -s -X POST 'http://localhost:9200/_aliases' -H 'Content-Type: application/json' -d'{"actions" : [{"add" : { "index" : "openrxv-items-final", "alias" : "openrxv-items"}}]}'
```
- Then I deleted the backup and removed the read-only mode via `openrxv-items` (which is now an alias):
```console
$ curl -XDELETE 'http://localhost:9200/openrxv-items-2021-03-07'
$ curl -X PUT "localhost:9200/openrxv-items/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": false}}'
```
- Linode sent alerts about the CPU usage on CGSpace yesterday and the day before
- Looking in the logs I see a few IPs making heavy use of the REST API and XMLUI:
```console
# zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/access.log.2.gz /var/log/nginx/access.log.3.gz | grep -E '0[56]/Mar/2021' | goaccess --log-format=COMBINED -
```
- I see the usual IPs for the CCAFS and ILRI importer bots, but also `143.233.242.132`, which appears to belong to GARDIAN:
```console
# zgrep '143.233.242.132' /var/log/nginx/access.log.1 | grep -c Delphi
6237
# zgrep '143.233.242.132' /var/log/nginx/access.log.1 | grep -c -v Delphi
6418
```
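- The two counts above can also be produced in one pass; a minimal sketch, assuming the nginx combined log format with the client address as the first field (the function name `ua_split` and the sample log lines are made up for illustration):

```shell
# ua_split IP PATTERN: read access log lines on stdin and, for requests whose
# first field is exactly IP, count the lines that contain PATTERN and the
# lines that do not. (Hypothetical helper; PATTERN is a fixed string.)
ua_split() {
  awk -v ip="$1" -v pat="$2" '
    $1 == ip { if (index($0, pat)) match_n++; else other_n++ }
    END { printf "%s=%d other=%d\n", pat, match_n, other_n }
  '
}

# Made-up sample lines, not real log data.
ua_split 143.233.242.132 Delphi <<'EOF'
143.233.242.132 - - [05/Mar/2021:10:00:00 +0100] "GET /rest/items HTTP/1.1" 200 512 "-" "Delphi"
143.233.242.132 - - [05/Mar/2021:10:00:01 +0100] "GET /rest/items HTTP/1.1" 200 512 "-" "Mozilla/5.0"
192.0.2.10 - - [05/Mar/2021:10:00:02 +0100] "GET / HTTP/1.1" 200 128 "-" "Mozilla/5.0"
EOF
# prints: Delphi=1 other=1
```

- On the server the input could be piped in with `zcat --force /var/log/nginx/access.log.1 | ua_split 143.233.242.132 Delphi`; unlike the `zgrep` calls, this compares the address field exactly instead of treating the dots in the IP as regex wildcards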
- They seem to make requests twice, once with the Delphi user agent that we know and already mark as a bot, and once with a "normal" user agent
- Looking in Solr I see they have been using this IP for a while, as they have 100,000 hits going back into 2020
- I will add this IP to the list of bots in nginx and purge it from Solr with my `check-spider-ip-hits.sh` script
- I made a few changes to OpenRXV:
  - [Migrated away from links to use networks](https://github.com/ilri/OpenRXV/issues/89)
  - [Converted the backend container to use a custom image that includes `unoconv`](https://github.com/ilri/OpenRXV/issues/68) so we don't have to manually install it anymore
<!-- vim: set sw=2 ts=2: -->