diff --git a/content/posts/2019-05.md b/content/posts/2019-05.md
index b8485f50d..2c765c809 100644
--- a/content/posts/2019-05.md
+++ b/content/posts/2019-05.md
@@ -219,4 +219,45 @@ $ cat dspace.log.2019-05-01 | grep -E '2019-05-01 (02|03|04|05|06):' | grep -o -
 - The user agent of their non-REST API requests from the same IP is Drupal
 - This is one very good reason to limit REST API requests, and perhaps to enable caching via nginx
+## 2019-05-07
+
+- The total number of unique IPs on CGSpace yesterday was almost 14,000, which is several thousand higher than previous day totals:
+
+```
+# zcat --force /var/log/nginx/access.log.1 /var/log/nginx/access.log.2.gz | grep -E '06/May/2019' | awk '{print $1}' | sort | uniq | wc -l
+13969
+# zcat --force /var/log/nginx/access.log.2.gz /var/log/nginx/access.log.3.gz | grep -E '05/May/2019' | awk '{print $1}' | sort | uniq | wc -l
+5936
+# zcat --force /var/log/nginx/access.log.3.gz /var/log/nginx/access.log.4.gz | grep -E '04/May/2019' | awk '{print $1}' | sort | uniq | wc -l
+6229
+# zcat --force /var/log/nginx/access.log.4.gz /var/log/nginx/access.log.5.gz | grep -E '03/May/2019' | awk '{print $1}' | sort | uniq | wc -l
+8051
+```
+
+- Total number of sessions yesterday was *much* higher compared to days last week:
+
+```
+$ cat dspace.log.2019-05-06 | grep -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+144160
+$ cat dspace.log.2019-05-05 | grep -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+57269
+$ cat dspace.log.2019-05-04 | grep -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+58648
+$ cat dspace.log.2019-05-03 | grep -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+27883
+$ cat dspace.log.2019-05-02 | grep -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+26996
+$ cat dspace.log.2019-05-01 | grep -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+61866
+```
+
+- The usage statistics seem to agree that yesterday was crazy:
+
+![Atmire Usage statistics spike 2019-05-06](/cgspace-notes/2019/05/2019-05-07-atmire-usage-week.png)
+
+- Sarah from RTB asked me about the RSS / XML link for the CGIAR.org website again
+  - Apparently Sam Stacey is trying to add an RSS feed so the items get automatically syndicated to the CGIAR website
+  - I sent her the link to the collection RSS feed
+- Add requests cache to `resolve-addresses.py` script
+
diff --git a/docs/2019-05/index.html b/docs/2019-05/index.html
index d7061ea90..517007c0b 100644
--- a/docs/2019-05/index.html
+++ b/docs/2019-05/index.html
@@ -28,7 +28,7 @@ But after this I tried to delete the item from the XMLUI and it is still present
-
+
@@ -61,9 +61,9 @@ But after this I tried to delete the item from the XMLUI and it is still present
   "@type": "BlogPosting",
   "headline": "May, 2019",
   "url": "https:\/\/alanorth.github.io\/cgspace-notes\/2019-05\/",
-  "wordCount": "1511",
+  "wordCount": "1783",
   "datePublished": "2019-05-01T07:37:43\x2b03:00",
-  "dateModified": "2019-05-06T11:50:57\x2b03:00",
+  "dateModified": "2019-05-06T15:41:40\x2b03:00",
   "author": {
     "@type": "Person",
     "name": "Alan Orth"
@@ -393,6 +393,52 @@ $ cat dspace.log.2019-05-01 | grep -E '2019-05-01 (02|03|04|05|06):' | grep -o -
+
The total number of unique IPs on CGSpace yesterday was almost 14,000, which is several thousand higher than previous day totals:
+ +# zcat --force /var/log/nginx/access.log.1 /var/log/nginx/access.log.2.gz | grep -E '06/May/2019' | awk '{print $1}' | sort | uniq | wc -l
+13969
+# zcat --force /var/log/nginx/access.log.2.gz /var/log/nginx/access.log.3.gz | grep -E '05/May/2019' | awk '{print $1}' | sort | uniq | wc -l
+5936
+# zcat --force /var/log/nginx/access.log.3.gz /var/log/nginx/access.log.4.gz | grep -E '04/May/2019' | awk '{print $1}' | sort | uniq | wc -l
+6229
+# zcat --force /var/log/nginx/access.log.4.gz /var/log/nginx/access.log.5.gz | grep -E '03/May/2019' | awk '{print $1}' | sort | uniq | wc -l
+8051
+
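The `zcat | grep | awk | sort | uniq | wc -l` pipeline counts the distinct client IPs seen on one day. Here is a minimal Python sketch of the same idea, assuming the client IP is the first whitespace-separated field (as the `awk '{print $1}'` step implies). The names `read_log` and `count_unique_ips` and the sample lines are hypothetical illustrations, not part of the original notes:

```python
import gzip
from typing import Iterable, Iterator


def read_log(path: str) -> Iterator[str]:
    """Yield lines from a plain or gzip-compressed log, like `zcat --force`."""
    with open(path, "rb") as raw:
        is_gzip = raw.read(2) == b"\x1f\x8b"  # gzip magic number
    opener = gzip.open if is_gzip else open
    with opener(path, "rt", encoding="utf-8", errors="replace") as handle:
        yield from handle


def count_unique_ips(lines: Iterable[str], day: str) -> int:
    """Count distinct first fields (client IPs) on lines that mention `day`."""
    ips = set()
    for line in lines:
        if day in line:  # same role as grep '06/May/2019'
            fields = line.split()
            if fields:
                ips.add(fields[0])  # same role as awk '{print $1}'
    return len(ips)


# Made-up sample lines in a combined-style access log format (assumption).
sample = [
    '192.0.2.10 - - [06/May/2019:00:00:01 +0000] "GET / HTTP/1.1" 200 512',
    '192.0.2.10 - - [06/May/2019:00:00:02 +0000] "GET /rest/items HTTP/1.1" 200 1024',
    '198.51.100.7 - - [06/May/2019:00:00:03 +0000] "GET / HTTP/1.1" 200 512',
    '203.0.113.5 - - [05/May/2019:23:59:59 +0000] "GET / HTTP/1.1" 200 512',
]

if __name__ == "__main__":
    print(count_unique_ips(sample, "06/May/2019"))
```

Against real logs, the lines from several rotated files could be chained, for example with `itertools.chain(read_log(a), read_log(b))`, since one calendar day can span two rotated files, as in the commands above.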
Total number of sessions yesterday was much higher compared to days last week:
+ +$ cat dspace.log.2019-05-06 | grep -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+144160
+$ cat dspace.log.2019-05-05 | grep -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+57269
+$ cat dspace.log.2019-05-04 | grep -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+58648
+$ cat dspace.log.2019-05-03 | grep -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+27883
+$ cat dspace.log.2019-05-02 | grep -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+26996
+$ cat dspace.log.2019-05-01 | grep -E 'session_id=[A-Z0-9]{32}' | sort | uniq | wc -l
+61866
+
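One caveat when reading these numbers: `grep -E` without `-o` prints whole matching lines, so `sort | uniq | wc -l` counts distinct log lines containing a session ID, not distinct session IDs. The rough Python sketch below reports both counts side by side. The function name `session_counts` and the sample log lines are hypothetical, and the line layout is only an assumption about what a DSpace log line looks like:

```python
import re
from typing import Iterable, Tuple

# Same pattern as in the grep commands above.
SESSION_RE = re.compile(r"session_id=[A-Z0-9]{32}")


def session_counts(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (distinct matching lines, distinct session IDs)."""
    matching_lines = set()
    session_ids = set()
    for line in lines:
        found = SESSION_RE.findall(line)
        if found:
            matching_lines.add(line.rstrip("\n"))  # what grep -E | sort | uniq sees
            session_ids.update(found)              # what grep -o -E | sort | uniq would see
    return len(matching_lines), len(session_ids)


# Made-up sample data: two sessions spread over three log lines.
sid_a = "A" * 32
sid_b = "B" * 32
sample = [
    f"2019-05-06 00:00:01 INFO session_id={sid_a}:ip_addr=192.0.2.10:view_item",
    f"2019-05-06 00:00:02 INFO session_id={sid_a}:ip_addr=192.0.2.10:view_item",
    f"2019-05-06 00:00:03 INFO session_id={sid_b}:ip_addr=198.51.100.7:view_item",
    "2019-05-06 00:00:04 INFO no session here",
]

if __name__ == "__main__":
    line_count, id_count = session_counts(sample)
    print(f"matching lines: {line_count}, unique sessions: {id_count}")
```

Either count works as a day-over-day comparison as long as the same method is used for every day, which is the case in the commands above.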
The usage statistics seem to agree that yesterday was crazy:
Add requests cache to resolve-addresses.py script