diff --git a/content/post/2018-02.md b/content/post/2018-02.md
index 085dbd6c2..9eb3a665c 100644
--- a/content/post/2018-02.md
+++ b/content/post/2018-02.md
@@ -595,3 +595,61 @@ UPDATE 2
 - The one on the bottom left uses a similar format to our author display, and the one in the middle uses the format [recommended by ORCID's branding guidelines](https://orcid.org/trademark-and-id-display-guidelines)
 - Also, I realized that the Academicons font icon set we're using includes an ORCID badge so we don't need to use the PNG image anymore
 - Run system updates on DSpace Test (linode02) and reboot the server
+- Looking back at the system errors on 2018-02-15, I wonder what the fuck caused this:
+
+```
+$ wc -l dspace.log.2018-02-1{0..8}
+ 383483 dspace.log.2018-02-10
+ 275022 dspace.log.2018-02-11
+ 249557 dspace.log.2018-02-12
+ 280142 dspace.log.2018-02-13
+ 615119 dspace.log.2018-02-14
+ 4388259 dspace.log.2018-02-15
+ 243496 dspace.log.2018-02-16
+ 209186 dspace.log.2018-02-17
+ 167432 dspace.log.2018-02-18
+```
+
+- From an average of a few hundred thousand to over four million lines in DSpace log?
+- Using grep's `-B1` I can see the line before the heap space error, which has the time, ie:
+
+```
+2018-02-15 16:02:12,748 ERROR org.dspace.app.xmlui.cocoon.DSpaceCocoonServletFilter @ Serious Error Occurred Processing Request!
+org.springframework.web.util.NestedServletException: Handler processing failed; nested exception is java.lang.OutOfMemoryError: Java heap space
+```
+
+- So these errors happened at hours 16, 18, 19, and 20
+- Let's see what was going on in nginx then:
+
+```
+# zcat --force /var/log/nginx/*.log.{3,4}.gz | wc -l
+168571
+# zcat --force /var/log/nginx/*.log.{3,4}.gz | grep -E "15/Feb/2018:(16|18|19|20)" | wc -l
+8188
+```
+
+- Only 8,000 requests during those four hours, out of 170,000 the whole day!
+- And the usage of XMLUI, REST, and OAI looks SUPER boring:
+
+```
+# zcat --force /var/log/nginx/*.log.{3,4}.gz | grep -E "15/Feb/2018:(16|18|19|20)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+ 111 95.108.181.88
+ 158 45.5.184.221
+ 201 104.196.152.243
+ 205 68.180.228.157
+ 236 40.77.167.131
+ 253 207.46.13.159
+ 293 207.46.13.59
+ 296 63.143.42.242
+ 303 207.46.13.157
+ 416 63.143.42.244
+```
+
+- 63.143.42.244 is Uptime Robot, and 207.46.x.x is Bing!
+- The DSpace sessions, PostgreSQL connections, and JVM memory all look normal
+- I see a lot of AccessShareLock on February 15th...?
+
+![PostgreSQL locks](/cgspace-notes/2018/02/postgresql-locks-week.png)
+
+- I have no idea what caused this crash
+- In other news, I adjusted the ORCID badge size on the XMLUI item display and sent it back to Peter for feedback
diff --git a/docs/2018-02/index.html b/docs/2018-02/index.html
index ac48020ea..ee926cede 100644
--- a/docs/2018-02/index.html
+++ b/docs/2018-02/index.html
@@ -23,7 +23,7 @@ I copied the logic in the jmx_tomcat_dbpools provided by Ubuntu’s munin-pl
-
+
@@ -57,9 +57,9 @@ I copied the logic in the jmx_tomcat_dbpools provided by Ubuntu’s munin-pl
   "@type": "BlogPosting",
   "headline": "February, 2018",
   "url": "https://alanorth.github.io/cgspace-notes/2018-02/",
-  "wordCount": "3914",
+  "wordCount": "4172",
   "datePublished": "2018-02-01T16:28:54+02:00",
-  "dateModified": "2018-02-18T11:21:16+02:00",
+  "dateModified": "2018-02-18T12:02:54+02:00",
   "author": {
     "@type": "Person",
     "name": "Alan Orth"
@@ -793,6 +793,70 @@ UPDATE 2
+$ wc -l dspace.log.2018-02-1{0..8}
+ 383483 dspace.log.2018-02-10
+ 275022 dspace.log.2018-02-11
+ 249557 dspace.log.2018-02-12
+ 280142 dspace.log.2018-02-13
+ 615119 dspace.log.2018-02-14
+ 4388259 dspace.log.2018-02-15
+ 243496 dspace.log.2018-02-16
+ 209186 dspace.log.2018-02-17
+ 167432 dspace.log.2018-02-18
+
+
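The jump in line counts above can also be spotted mechanically. This is only a rough sketch, not something from the original notes: the function name `flag_large_logs` and the "multiple of the median" threshold are my own assumptions. It reads `wc -l` style output on stdin and prints any file whose count exceeds the given multiple of the median.

```shell
# Hypothetical helper: flag log files that are much larger than the median.
# stdin: "count filename" pairs as printed by `wc -l`
# $1:    threshold, as a multiple of the median line count (an assumed heuristic)
flag_large_logs() {
    grep -v ' total$' \
        | sort -n \
        | awk -v k="$1" '
            { count[NR] = $1; file[NR] = $2 }
            END {
                # Input is sorted, so the middle record is the median
                median = count[int((NR + 1) / 2)]
                for (i = 1; i <= NR; i++)
                    if (count[i] > k * median)
                        print file[i], count[i]
            }'
}
```

Something like `wc -l dspace.log.2018-02-1{0..8} | flag_large_logs 5` should then list only the days that stand out.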
+Using grep's -B1 I can see the line before the heap space error, which has the time, ie:
+2018-02-15 16:02:12,748 ERROR org.dspace.app.xmlui.cocoon.DSpaceCocoonServletFilter @ Serious Error Occurred Processing Request!
+org.springframework.web.util.NestedServletException: Handler processing failed; nested exception is java.lang.OutOfMemoryError: Java heap space
+
+
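To get from the `-B1` trick to a list of affected hours, the timestamps can be tallied. This is a minimal sketch and makes some assumptions: the function name `oom_hours` is mine, and it relies on each heap space error being preceded by a timestamped line like the one in the excerpt above. Matches without a timestamp on the preceding line are ignored.

```shell
# Hypothetical helper: count "Java heap space" errors per hour in one DSpace log.
# $1: path to a log file, e.g. dspace.log.2018-02-15
oom_hours() {
    grep -B1 'java.lang.OutOfMemoryError: Java heap space' "$1" \
        | grep -oE '^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}' \
        | awk '{ print $2 }' \
        | sort \
        | uniq -c
}
```

Running `oom_hours dspace.log.2018-02-15` would print one line per hour with the number of errors seen in that hour.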
+# zcat --force /var/log/nginx/*.log.{3,4}.gz | wc -l
+168571
+# zcat --force /var/log/nginx/*.log.{3,4}.gz | grep -E "15/Feb/2018:(16|18|19|20)" | wc -l
+8188
+
+
+# zcat --force /var/log/nginx/*.log.{3,4}.gz | grep -E "15/Feb/2018:(16|18|19|20)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+ 111 95.108.181.88
+ 158 45.5.184.221
+ 201 104.196.152.243
+ 205 68.180.228.157
+ 236 40.77.167.131
+ 253 207.46.13.159
+ 293 207.46.13.59
+ 296 63.143.42.242
+ 303 207.46.13.157
+ 416 63.143.42.244
+
+
+
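Rather than grepping for the four hours one at a time, the nginx requests can be broken down by hour for the whole day. This is a sketch under the assumption that the access logs use nginx's default combined format, where the fourth whitespace-separated field is the timestamp (the `awk '{print $1}'` above suggests the client IP is the first field, which fits). The function name `requests_per_hour` is made up for illustration.

```shell
# Hypothetical helper: count requests per hour for one day.
# stdin: nginx access log lines (combined format assumed)
# $1:    the day as it appears in the timestamp, e.g. 15/Feb/2018
requests_per_hour() {
    grep -F "[$1:" \
        | awk '{ split($4, t, ":"); print t[2] }' \
        | sort \
        | uniq -c
}
```

For example, `zcat --force /var/log/nginx/*.log.{3,4}.gz | requests_per_hour 15/Feb/2018` would show whether hours 16, 18, 19, and 20 really were quieter or busier than the rest of the day.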