diff --git a/content/posts/2019-05.md b/content/posts/2019-05.md
index 2df3cb04e..b8485f50d 100644
--- a/content/posts/2019-05.md
+++ b/content/posts/2019-05.md
@@ -118,5 +118,105 @@ Caused by: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed
 - Also, I tried to increase the `writeLockTimeout` in `solrconfig.xml` from the default of 1000ms to 10000ms
 - Eventually the Atmire stats started working, despite errors about "Error opening new searcher" in the Solr Admin UI
 - I wrote to the dspace-tech mailing list again on the thread from March, 2019
+- There were a few alerts from UptimeRobot about CGSpace going up and down this morning, along with an alert from Linode about 596% load
+  - Looking at the Munin stats I see an exponential rise in DSpace XMLUI sessions, firewall activity, and PostgreSQL connections this morning:
+
+![CGSpace XMLUI sessions day](/cgspace-notes/2019/05/2019-05-06-jmx_dspace_sessions-day.png)
+
+![linode18 firewall connections day](/cgspace-notes/2019/05/2019-05-06-fw_conntrack-day.png)
+
+![linode18 postgres connections day](/cgspace-notes/2019/05/2019-05-06-postgres_connections_db-day.png)
+
+![linode18 CPU day](/cgspace-notes/2019/05/2019-05-06-cpu-day.png)
+
+- The number of unique sessions today is *ridiculously* high compared to the last few days considering it's only 12:30PM right now:
+
+```
+$ grep -o -E 'session_id=[A-Z0-9]{32}' dspace.log.2019-05-06 | sort | uniq | wc -l
+101108
+$ grep -o -E 'session_id=[A-Z0-9]{32}' dspace.log.2019-05-05 | sort | uniq | wc -l
+14618
+$ grep -o -E 'session_id=[A-Z0-9]{32}' dspace.log.2019-05-04 | sort | uniq | wc -l
+14946
+$ grep -o -E 'session_id=[A-Z0-9]{32}' dspace.log.2019-05-03 | sort | uniq | wc -l
+6410
+$ grep -o -E 'session_id=[A-Z0-9]{32}' dspace.log.2019-05-02 | sort | uniq | wc -l
+7758
+$ grep -o -E 'session_id=[A-Z0-9]{32}' dspace.log.2019-05-01 | sort | uniq | wc -l
+20528
+```
+
+- The number of unique IP addresses from 2 to 6 AM this morning is already several times higher than the average for that time of the morning this past week:
+
+```
+# zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E '06/May/2019:(02|03|04|05|06)' | awk '{print $1}' | sort | uniq | wc -l
+7127
+# zcat --force /var/log/nginx/access.log.1 /var/log/nginx/access.log.2.gz | grep -E '05/May/2019:(02|03|04|05|06)' | awk '{print $1}' | sort | uniq | wc -l
+1231
+# zcat --force /var/log/nginx/access.log.2.gz /var/log/nginx/access.log.3.gz | grep -E '04/May/2019:(02|03|04|05|06)' | awk '{print $1}' | sort | uniq | wc -l
+1255
+# zcat --force /var/log/nginx/access.log.3.gz /var/log/nginx/access.log.4.gz | grep -E '03/May/2019:(02|03|04|05|06)' | awk '{print $1}' | sort | uniq | wc -l
+1736
+# zcat --force /var/log/nginx/access.log.4.gz /var/log/nginx/access.log.5.gz | grep -E '02/May/2019:(02|03|04|05|06)' | awk '{print $1}' | sort | uniq | wc -l
+1573
+# zcat --force /var/log/nginx/access.log.5.gz /var/log/nginx/access.log.6.gz | grep -E '01/May/2019:(02|03|04|05|06)' | awk '{print $1}' | sort | uniq | wc -l
+1410
+```
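+
+- For future reference, the same per-day tally could be done with a quick loop over all of the rotated nginx logs (a rough, untested sketch that assumes the `access.log*` glob still covers the whole week):
+
+```
+# for day in 01 02 03 04 05 06; do echo -n "$day/May/2019: "; zcat --force /var/log/nginx/access.log* | grep -E "$day/May/2019:(02|03|04|05|06)" | awk '{print $1}' | sort | uniq | wc -l; done
+```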
+
+- Just this morning between the hours of 2 and 6 the number of unique sessions was *very* high compared to previous mornings:
+
+```
+$ cat dspace.log.2019-05-06 | grep -E '2019-05-06 (02|03|04|05|06):' | grep -o -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+83650
+$ cat dspace.log.2019-05-05 | grep -E '2019-05-05 (02|03|04|05|06):' | grep -o -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+2547
+$ cat dspace.log.2019-05-04 | grep -E '2019-05-04 (02|03|04|05|06):' | grep -o -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+2574
+$ cat dspace.log.2019-05-03 | grep -E '2019-05-03 (02|03|04|05|06):' | grep -o -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+2911
+$ cat dspace.log.2019-05-02 | grep -E '2019-05-02 (02|03|04|05|06):' | grep -o -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+2704
+$ cat dspace.log.2019-05-01 | grep -E '2019-05-01 (02|03|04|05|06):' | grep -o -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+3699
+```
+
+- Most of the requests were GETs:
+
+```
+# cat /var/log/nginx/{access,library-access}.log /var/log/nginx/{access,library-access}.log.1 | grep -E '06/May/2019:(02|03|04|05|06)' | grep -o -E "(GET|HEAD|POST|PUT)" | sort | uniq -c | sort -n
+      1 PUT
+     98 POST
+   2845 HEAD
+  98121 GET
+```
+
+- I'm not exactly sure what happened this morning, but it looks like some legitimate user traffic—perhaps someone launched a new publication and it got a bunch of hits?
+- Looking again, I see 84,000 requests to `/handle` this morning (not including logs for library.cgiar.org because those get HTTP 301 redirect to CGSpace and appear here in `access.log`):
+
+```
+# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E '06/May/2019:(02|03|04|05|06)' | grep -c -o -E " /handle/[0-9]+/[0-9]+"
+84350
+```
+
+- But it would be difficult to find a pattern for those requests because they cover 78,000 *unique* Handles (ie direct browsing of items, collections, or communities) and only 2,492 discover/browse (total, not unique):
+
+```
+# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E '06/May/2019:(02|03|04|05|06)' | grep -o -E " /handle/[0-9]+/[0-9]+ HTTP" | sort | uniq | wc -l
+78104
+# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E '06/May/2019:(02|03|04|05|06)' | grep -o -E " /handle/[0-9]+/[0-9]+/(discover|browse)" | wc -l
+2492
+```
+
+- In other news, I see some IP is making several requests per second to the exact same REST API endpoints, for example:
+
+```
+# grep /rest/handle/10568/3703?expand=all rest.log | awk '{print $1}' | sort | uniq -c
+      3 2a01:7e00::f03c:91ff:fe0a:d645
+    113 63.32.242.35
+```
+
+- According to [viewdns.info](https://viewdns.info/reverseip/?host=63.32.242.35&t=1) that server belongs to Macaroni Brothers'
+  - The user agent of their non-REST API requests from the same IP is Drupal
+  - This is one very good reason to limit REST API requests, and perhaps to enable caching via nginx
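+  - A rough sketch of what that might look like in the nginx config (not tested; the zone names, sizes, rate, cache path, and the `tomcat_http` upstream name are all placeholders):
+
+```
+# in the http {} block: a rate limit zone keyed on client IP, and a small cache for REST responses
+limit_req_zone $binary_remote_addr zone=rest_limit:10m rate=5r/s;
+proxy_cache_path /var/cache/nginx/rest levels=1:2 keys_zone=rest_cache:10m max_size=1g inactive=60m;
+
+# in the server {} block, on the location that proxies the REST API to Tomcat
+location /rest {
+    limit_req zone=rest_limit burst=10 nodelay;
+    proxy_cache rest_cache;
+    proxy_cache_valid 200 10m;
+    proxy_pass http://tomcat_http;
+}
+```
+
+  - With a cache like that in place, repeated identical requests such as the `?expand=all` ones above could be answered by nginx without touching Tomcat or PostgreSQL at all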
diff --git a/static/2019/05/2019-05-06-cpu-day.png b/static/2019/05/2019-05-06-cpu-day.png
new file mode 100644
index 000000000..3ac07ec51
Binary files /dev/null and b/static/2019/05/2019-05-06-cpu-day.png differ
diff --git a/static/2019/05/2019-05-06-fw_conntrack-day.png b/static/2019/05/2019-05-06-fw_conntrack-day.png
new file mode 100644
index 000000000..ce253ef17
Binary files /dev/null and b/static/2019/05/2019-05-06-fw_conntrack-day.png differ
diff --git a/static/2019/05/2019-05-06-jmx_dspace_sessions-day.png b/static/2019/05/2019-05-06-jmx_dspace_sessions-day.png
new file mode 100644
index 000000000..453a0a4f6
Binary files /dev/null and b/static/2019/05/2019-05-06-jmx_dspace_sessions-day.png differ
diff --git a/static/2019/05/2019-05-06-postgres_connections_db-day.png b/static/2019/05/2019-05-06-postgres_connections_db-day.png
new file mode 100644
index 000000000..d4a7d3e97
Binary files /dev/null and b/static/2019/05/2019-05-06-postgres_connections_db-day.png differ