diff --git a/content/posts/2021-03.md b/content/posts/2021-03.md index 31d053e57..fb2655724 100644 --- a/content/posts/2021-03.md +++ b/content/posts/2021-03.md @@ -177,7 +177,7 @@ $ curl -s 'http://localhost:9200/_alias/' | python -m json.tool | less }, ``` -- But on AReS production `openrxv-items` has somehow become an index: +- But on AReS production `openrxv-items` has somehow become a concrete index: ```console $ curl -s 'http://localhost:9200/_alias/' | python -m json.tool | less diff --git a/content/posts/2021-04.md b/content/posts/2021-04.md index 89ee3c7c6..7030c1794 100644 --- a/content/posts/2021-04.md +++ b/content/posts/2021-04.md @@ -481,4 +481,117 @@ $ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = - I definitely need to look into that! +## 2021-04-11 + +- I am trying to resolve the AReS Elasticsearch index issues that happened last week + - I decided to back up the `openrxv-items` index to `openrxv-items-backup` and then delete all the others: + +```console +$ curl -X PUT "localhost:9200/openrxv-items/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}' +$ curl -s -X POST http://localhost:9200/openrxv-items-temp/_clone/openrxv-items-backup +$ curl -X PUT "localhost:9200/openrxv-items/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": false}}' +$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp' +$ curl -XDELETE 'http://localhost:9200/openrxv-items-final' +$ curl -XDELETE 'http://localhost:9200/openrxv-items-final' +``` + +- Then I updated all Docker containers and rebooted the server (linode20) so that the correct indexes would be created again: + +```console +$ docker images | grep -v ^REPO | sed 's/ \+/:/g' | cut -d: -f1,2 | xargs -L1 docker pull +``` + +- Then I realized I have to clone the backup index directly to `openrxv-items-final`, and re-create the `openrxv-items` alias: + +```console +$ curl -XDELETE 
'http://localhost:9200/openrxv-items-final' +$ curl -X PUT "localhost:9200/openrxv-items-backup/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}' +$ curl -s -X POST http://localhost:9200/openrxv-items-backup/_clone/openrxv-items-final +$ curl -s -X POST 'http://localhost:9200/_aliases' -H 'Content-Type: application/json' -d'{"actions" : [{"add" : { "index" : "openrxv-items-final", "alias" : "openrxv-items"}}]}' +``` + +- Now I see both `openrxv-items-final` and `openrxv-items` have the current number of items: + +```console +$ curl -s 'http://localhost:9200/openrxv-items/_count?q=*&pretty' +{ + "count" : 103373, + "_shards" : { + "total" : 1, + "successful" : 1, + "skipped" : 0, + "failed" : 0 + } +} +$ curl -s 'http://localhost:9200/openrxv-items-final/_count?q=*&pretty' +{ + "count" : 103373, + "_shards" : { + "total" : 1, + "successful" : 1, + "skipped" : 0, + "failed" : 0 + } +} +``` + +- Then I started a fresh harvesting in the AReS Explorer admin dashboard + +## 2021-04-12 + +- The harvesting on AReS finished last night, but the indexes got messed up again + - I will have to fix them manually next time... 
+ +## 2021-04-13 + +- Looking into the logs on 2021-04-06 on CGSpace and DSpace Test to see if there is anything specific that stands out about the activity on those days that would cause the PostgreSQL issues + - Digging into the Munin graphs for the last week I found a few other things happening on that morning: + +![/dev/sda disk latency week](/cgspace-notes/2021/04/sda-week.png) +![JVM classes unloaded week](/cgspace-notes/2021/04/classes_unloaded-week.png) +![Nginx status week](/cgspace-notes/2021/04/nginx_status-week.png) + +- 13,000 requests in the last two months from a user with user agent `SomeRandomText`, for example: + +```console +84.33.2.97 - - [06/Apr/2021:06:25:13 +0200] "GET /bitstream/handle/10568/77776/CROP%20SCIENCE.jpg.jpg HTTP/1.1" 404 10890 "-" "SomeRandomText" +``` + +- I purged them: + +```console +$ ./ilri/check-spider-hits.sh -f /tmp/agents.txt -p +Purging 13159 hits from SomeRandomText in statistics + +Total number of bot hits purged: 13159 +``` + +- I noticed there were 78 items submitted in the hour before CGSpace crashed: + +```console +# grep -a -E '2021-04-06 0(6|7):' /home/cgspace.cgiar.org/log/dspace.log.2021-04-06 | grep -c -a add_item +78 +``` + +- Of those 78, 77 of them were from Udana +- Compared to other mornings (0 to 9 AM) this month that seems to be pretty high: + +```console +# for num in {01..13}; do grep -a -E "2021-04-$num 0" /home/cgspace.cgiar.org/log/dspace.log.2021-04-$num | grep -c -a add_item; done +32 +0 +0 +2 +8 +108 +4 +0 +29 +0 +1 +1 +2 +``` + diff --git a/docs/2021-03/index.html b/docs/2021-03/index.html index 123139ecb..447cb20b2 100644 --- a/docs/2021-03/index.html +++ b/docs/2021-03/index.html @@ -44,7 +44,7 @@ Also, we found some issues building and running OpenRXV currently due to ecosyst "@type": "BlogPosting", "headline": "March, 2021", "url": "https://alanorth.github.io/cgspace-notes/2021-03/", - "wordCount": "4452", + "wordCount": "4453", "datePublished": "2021-03-01T10:13:54+02:00", 
"dateModified": "2021-04-05T19:36:44+03:00", "author": { @@ -306,7 +306,7 @@ $ curl -XDELETE 'http://localhost:9200/openrxv-items-2021-03-05' } },
$ curl -s 'http://localhost:9200/_alias/' | python -m json.tool | less
 ...
diff --git a/docs/2021-04/index.html b/docs/2021-04/index.html
index ee382cbb3..bbb211710 100644
--- a/docs/2021-04/index.html
+++ b/docs/2021-04/index.html
@@ -24,7 +24,7 @@ Perhaps one of the containers crashed, I should have looked closer but I was in
 
 
 
-
+
 
 
 
@@ -54,9 +54,9 @@ Perhaps one of the containers crashed, I should have looked closer but I was in
   "@type": "BlogPosting",
   "headline": "April, 2021",
   "url": "https://alanorth.github.io/cgspace-notes/2021-04/",
-  "wordCount": "2530",
+  "wordCount": "2984",
   "datePublished": "2021-04-01T09:50:54+03:00",
-  "dateModified": "2021-04-06T22:48:44+03:00",
+  "dateModified": "2021-04-13T15:42:35+03:00",
   "author": {
     "@type": "Person",
     "name": "Alan Orth"
@@ -192,7 +192,7 @@ $ curl -X PUT "localhost:9200/openrxv-items-final/_settings" -H 'Conte
 
 
  • Create the CGSpace community and collection structure for the new Accelerating Impacts of CGIAR Climate Research for Africa (AICCRA) and assign all workflow steps
  • -

    2021-04-04

    +

    2021-04-05

    @@ -600,7 +600,112 @@ $ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = - +

    2021-04-11

    + +
    $ curl -X PUT "localhost:9200/openrxv-items/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
    +$ curl -s -X POST http://localhost:9200/openrxv-items-temp/_clone/openrxv-items-backup
    +$ curl -X PUT "localhost:9200/openrxv-items/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": false}}'
    +$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
    +$ curl -XDELETE 'http://localhost:9200/openrxv-items-final'
    +$ curl -XDELETE 'http://localhost:9200/openrxv-items-final'
    +
    +
    $ docker images | grep -v ^REPO | sed 's/ \+/:/g' | cut -d: -f1,2 | xargs -L1 docker pull
    +
    +
    $ curl -XDELETE 'http://localhost:9200/openrxv-items-final'
    +$ curl -X PUT "localhost:9200/openrxv-items-backup/_settings" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
    +$ curl -s -X POST http://localhost:9200/openrxv-items-backup/_clone/openrxv-items-final
    +$ curl -s -X POST 'http://localhost:9200/_aliases' -H 'Content-Type: application/json' -d'{"actions" : [{"add" : { "index" : "openrxv-items-final", "alias" : "openrxv-items"}}]}'
    +
    +
    $ curl -s 'http://localhost:9200/openrxv-items/_count?q=*&pretty'     
    +{
    +  "count" : 103373,
    +  "_shards" : {
    +    "total" : 1,
    +    "successful" : 1,
    +    "skipped" : 0,
    +    "failed" : 0
    +  }
    +}
    +$ curl -s 'http://localhost:9200/openrxv-items-final/_count?q=*&pretty'
    +{
    +  "count" : 103373,
    +  "_shards" : {
    +    "total" : 1,
    +    "successful" : 1,
    +    "skipped" : 0,
    +    "failed" : 0
    +  }
    +}
    +
    +

    2021-04-12

    + +

    2021-04-13

    + +

    /dev/sda disk latency week +JVM classes unloaded week +Nginx status week

    + +
    84.33.2.97 - - [06/Apr/2021:06:25:13 +0200] "GET /bitstream/handle/10568/77776/CROP%20SCIENCE.jpg.jpg HTTP/1.1" 404 10890 "-" "SomeRandomText"
    +
    +
    $ ./ilri/check-spider-hits.sh -f /tmp/agents.txt -p
    +Purging 13159 hits from SomeRandomText in statistics
    +
    +Total number of bot hits purged: 13159
    +
    +
    # grep -a -E '2021-04-06 0(6|7):' /home/cgspace.cgiar.org/log/dspace.log.2021-04-06 | grep -c -a add_item 
    +78
    +
    +
    # for num in {01..13}; do grep -a -E "2021-04-$num 0" /home/cgspace.cgiar.org/log/dspace.log.2021-04-$num | grep -c -a add_item; done
    +32
    +0
    +0
    +2
    +8
    +108
    +4
    +0
    +29
    +0
    +1
    +1
    +2
    +
    diff --git a/docs/2021/04/classes_unloaded-week.png b/docs/2021/04/classes_unloaded-week.png new file mode 100644 index 000000000..4c915179d Binary files /dev/null and b/docs/2021/04/classes_unloaded-week.png differ diff --git a/docs/2021/04/nginx_status-week.png b/docs/2021/04/nginx_status-week.png new file mode 100644 index 000000000..01f988e9e Binary files /dev/null and b/docs/2021/04/nginx_status-week.png differ diff --git a/docs/2021/04/sda-week.png b/docs/2021/04/sda-week.png new file mode 100644 index 000000000..ddc5a7b50 Binary files /dev/null and b/docs/2021/04/sda-week.png differ diff --git a/docs/categories/index.html b/docs/categories/index.html index c2a7ad648..362a0621c 100644 --- a/docs/categories/index.html +++ b/docs/categories/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/index.html b/docs/categories/notes/index.html index 96c462c44..16b73c56b 100644 --- a/docs/categories/notes/index.html +++ b/docs/categories/notes/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/2/index.html b/docs/categories/notes/page/2/index.html index 2792f8c8d..51a226520 100644 --- a/docs/categories/notes/page/2/index.html +++ b/docs/categories/notes/page/2/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/3/index.html b/docs/categories/notes/page/3/index.html index c76e39671..c645fc1e6 100644 --- a/docs/categories/notes/page/3/index.html +++ b/docs/categories/notes/page/3/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/4/index.html b/docs/categories/notes/page/4/index.html index 314219934..fc254dde0 100644 --- a/docs/categories/notes/page/4/index.html +++ b/docs/categories/notes/page/4/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/5/index.html b/docs/categories/notes/page/5/index.html index 4ed4147f7..2bf711f4d 100644 --- a/docs/categories/notes/page/5/index.html +++ b/docs/categories/notes/page/5/index.html @@ -10,7 +10,7 @@ - + diff --git 
a/docs/index.html b/docs/index.html index 031dbf09e..e757df3f7 100644 --- a/docs/index.html +++ b/docs/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/2/index.html b/docs/page/2/index.html index 5a9c964d4..aaf580017 100644 --- a/docs/page/2/index.html +++ b/docs/page/2/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/3/index.html b/docs/page/3/index.html index 67c9d33d6..2fd77eb42 100644 --- a/docs/page/3/index.html +++ b/docs/page/3/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/4/index.html b/docs/page/4/index.html index 25f18443b..fcb134e06 100644 --- a/docs/page/4/index.html +++ b/docs/page/4/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/5/index.html b/docs/page/5/index.html index 30f3653f2..e25faf4fe 100644 --- a/docs/page/5/index.html +++ b/docs/page/5/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/6/index.html b/docs/page/6/index.html index 7c3b08343..00c0e755b 100644 --- a/docs/page/6/index.html +++ b/docs/page/6/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/7/index.html b/docs/page/7/index.html index 08e6b07d8..afa1735bd 100644 --- a/docs/page/7/index.html +++ b/docs/page/7/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/index.html b/docs/posts/index.html index 597884a15..305f51ffe 100644 --- a/docs/posts/index.html +++ b/docs/posts/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/2/index.html b/docs/posts/page/2/index.html index 7197552a4..a668f6803 100644 --- a/docs/posts/page/2/index.html +++ b/docs/posts/page/2/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/3/index.html b/docs/posts/page/3/index.html index d66e1fe6c..d1abbad22 100644 --- a/docs/posts/page/3/index.html +++ b/docs/posts/page/3/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/4/index.html b/docs/posts/page/4/index.html index a3debd9da..b415af2d6 100644 --- a/docs/posts/page/4/index.html +++ b/docs/posts/page/4/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/5/index.html 
b/docs/posts/page/5/index.html index 2ec6b7b37..f67b1e6c6 100644 --- a/docs/posts/page/5/index.html +++ b/docs/posts/page/5/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/6/index.html b/docs/posts/page/6/index.html index f28a424ae..d279c5df5 100644 --- a/docs/posts/page/6/index.html +++ b/docs/posts/page/6/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/7/index.html b/docs/posts/page/7/index.html index af9abeb2d..2105083fb 100644 --- a/docs/posts/page/7/index.html +++ b/docs/posts/page/7/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/sitemap.xml b/docs/sitemap.xml index f1601ee3a..0dd18e0ea 100644 --- a/docs/sitemap.xml +++ b/docs/sitemap.xml @@ -3,19 +3,19 @@ xmlns:xhtml="http://www.w3.org/1999/xhtml"> https://alanorth.github.io/cgspace-notes/2021-04/ - 2021-04-06T22:48:44+03:00 + 2021-04-13T15:42:35+03:00 https://alanorth.github.io/cgspace-notes/categories/ - 2021-04-06T22:48:44+03:00 + 2021-04-13T15:42:35+03:00 https://alanorth.github.io/cgspace-notes/ - 2021-04-06T22:48:44+03:00 + 2021-04-13T15:42:35+03:00 https://alanorth.github.io/cgspace-notes/categories/notes/ - 2021-04-06T22:48:44+03:00 + 2021-04-13T15:42:35+03:00 https://alanorth.github.io/cgspace-notes/posts/ - 2021-04-06T22:48:44+03:00 + 2021-04-13T15:42:35+03:00 https://alanorth.github.io/cgspace-notes/2021-03/ 2021-04-05T19:36:44+03:00 diff --git a/static/2021/04/classes_unloaded-week.png b/static/2021/04/classes_unloaded-week.png new file mode 100644 index 000000000..4c915179d Binary files /dev/null and b/static/2021/04/classes_unloaded-week.png differ diff --git a/static/2021/04/nginx_status-week.png b/static/2021/04/nginx_status-week.png new file mode 100644 index 000000000..01f988e9e Binary files /dev/null and b/static/2021/04/nginx_status-week.png differ diff --git a/static/2021/04/sda-week.png b/static/2021/04/sda-week.png new file mode 100644 index 000000000..ddc5a7b50 Binary files /dev/null and b/static/2021/04/sda-week.png differ