- But on AReS production `openrxv-items` has somehow become a concrete index:

```console
$ curl -s 'http://localhost:9200/_alias/' | python -m json.tool | less
```

- I definitely need to look into that!

## 2021-04-11

- I am trying to resolve the AReS Elasticsearch index issues that happened last week
- I decided to back up the `openrxv-items` index to `openrxv-items-backup` and then delete all the others:

```console
$ curl -X PUT "localhost:9200/openrxv-items/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
$ curl -s -X POST http://localhost:9200/openrxv-items/_clone/openrxv-items-backup
$ curl -X PUT "localhost:9200/openrxv-items/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": false}}'
$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
$ curl -XDELETE 'http://localhost:9200/openrxv-items-final'
```
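- A quick sanity check that the clone captured everything would be the `_count` endpoint (the same one used further down); `openrxv-items-backup` should report the same document count as the source:

```console
$ curl -s 'http://localhost:9200/openrxv-items-backup/_count?q=*&pretty'
```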
- Then I updated all Docker containers and rebooted the server (linode20) so that the correct indexes would be created again:

```console
$ docker images | grep -v ^REPO | sed 's/ \+/:/g' | cut -d: -f1,2 | xargs -L1 docker pull
```
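- Incidentally, the same pull-all-images step could be written with Docker's built-in Go-template output instead of `grep`, `sed`, and `cut`; a sketch that should behave identically (the extra `grep` skips dangling `<none>` images):

```console
$ docker images --format '{{.Repository}}:{{.Tag}}' | grep -v '<none>' | xargs -L1 docker pull
```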
- Then I realized I had to clone the backup index directly to `openrxv-items-final` and re-create the `openrxv-items` alias:

```console
$ curl -XDELETE 'http://localhost:9200/openrxv-items-final'
$ curl -X PUT "localhost:9200/openrxv-items-backup/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
$ curl -s -X POST http://localhost:9200/openrxv-items-backup/_clone/openrxv-items-final
$ curl -s -X POST 'http://localhost:9200/_aliases' -H 'Content-Type: application/json' -d'{"actions" : [{"add" : { "index" : "openrxv-items-final", "alias" : "openrxv-items"}}]}'
```
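- To confirm the alias now points at the right concrete index, the `_alias` endpoint can be queried for just that name (same endpoint as at the top of these notes):

```console
$ curl -s 'http://localhost:9200/_alias/openrxv-items?pretty'
```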
- Now I see both `openrxv-items-final` and `openrxv-items` have the current number of items:

```console
$ curl -s 'http://localhost:9200/openrxv-items/_count?q=*&pretty'
{
  "count" : 103373,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  }
}
$ curl -s 'http://localhost:9200/openrxv-items-final/_count?q=*&pretty'
{
  "count" : 103373,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  }
}
```
- Then I started a fresh harvesting in the AReS Explorer admin dashboard

## 2021-04-12

- The harvesting on AReS finished last night, but the indexes got messed up again
- I will have to fix them manually next time...

## 2021-04-13

- Looking into the logs from 2021-04-06 on CGSpace and DSpace Test to see if there is anything specific that stands out about the activity on those days that would cause the PostgreSQL issues
- Digging into the Munin graphs for the last week I found a few other things happening on that morning

- 13,000 requests in the last two months from a user with user agent `SomeRandomText`, for example:

```console
84.33.2.97 - - [06/Apr/2021:06:25:13 +0200] "GET /bitstream/handle/10568/77776/CROP%20SCIENCE.jpg.jpg HTTP/1.1" 404 10890 "-" "SomeRandomText"
```
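- A rough way to count requests from that agent across the nginx logs would be something like this (log location assumed; `zcat --force` reads compressed and plain files alike):

```console
# zcat --force /var/log/nginx/access.log* | grep -c 'SomeRandomText'
```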
- I purged them:

```console
$ ./ilri/check-spider-hits.sh -f /tmp/agents.txt -p

Purging 13159 hits from SomeRandomText in statistics

Total number of bot hits purged: 13159
```
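- For reference, the `/tmp/agents.txt` passed with `-f` is presumably just a list of user agent patterns, one per line; in this case a single entry:

```console
$ cat /tmp/agents.txt
SomeRandomText
```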
- I noticed there were 78 items submitted in the hour before CGSpace crashed:

```console
# grep -a -E '2021-04-06 0(6|7):' /home/cgspace.cgiar.org/log/dspace.log.2021-04-06 | grep -c -a add_item
78
```
- Of those 78, 77 were from Udana
- Compared to other mornings (midnight to 9 AM) this month that seems pretty high:

```console
# for num in {01..13}; do grep -a -E "2021-04-$num 0" /home/cgspace.cgiar.org/log/dspace.log.2021-04-$num | grep -c -a add_item; done
32
0
0
2
8
108
4
0
29
0
1
1
2
```
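- A small tweak to the same loop would label each count with its date, which makes the list above easier to scan:

```console
# for num in {01..13}; do echo -n "2021-04-$num: "; grep -a -E "2021-04-$num 0" /home/cgspace.cgiar.org/log/dspace.log.2021-04-$num | grep -c -a add_item; done
```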
<!-- vim: set sw=2 ts=2: -->