Mirror of https://github.com/alanorth/cgspace-notes.git (synced 2025-01-27 05:49:12 +01:00)
Add notes for 2017-11-19
@@ -693,3 +693,46 @@ $ jconsole -J-DsocksProxyHost=localhost -J-DsocksProxyPort=7777 service:jmx:rmi:
- Here is the Jconsole screen after looping `http --print Hh https://dspacetest.cgiar.org/handle/10568/1` for a few minutes:

## 2017-11-19

- Linode sent an alert that CGSpace was using a lot of CPU around 4–6 AM
- Looking in the nginx access logs I see the most active XMLUI users between 4 and 6 AM:

```
# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E "19/Nov/2017:0[456]" | awk '{print $1}' | sort -n | uniq -c | sort -h | tail
    111 66.249.66.155
    171 5.9.6.51
    188 54.162.241.40
    229 207.46.13.23
    233 207.46.13.137
    247 40.77.167.6
    251 207.46.13.36
    275 68.180.229.254
    325 104.196.152.243
   1610 66.249.66.153
```
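- A possible follow-up, sketched here under assumptions rather than taken from the logs above: group the same time window by user agent instead of by IP, which would show whether several addresses belong to the same client. `top_agents` is a hypothetical helper name, and the field position assumes nginx's default `combined` log format, where the user agent is the last quoted string on the line:

```shell
# Hypothetical helper (not part of the original notes).
# $1: extended regex matching the timestamp window, e.g. '19/Nov/2017:0[456]'
# remaining arguments: one or more access log files
top_agents() {
    pattern=$1
    shift
    # Splitting on double quotes puts the user agent in field 6 for the
    # "combined" format: request is $2, referer is $4, user agent is $6.
    cat "$@" \
        | grep -E "$pattern" \
        | awk -F'"' '{print $6}' \
        | sort \
        | uniq -c \
        | sort -n \
        | tail
}

# Example invocation (paths as used in the notes above):
#   top_agents '19/Nov/2017:0[456]' /var/log/nginx/access.log /var/log/nginx/access.log.1
```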

- 66.249.66.153 appears to be Googlebot:

```
66.249.66.153 - - [19/Nov/2017:06:26:01 +0000] "GET /handle/10568/2203 HTTP/1.1" 200 6309 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
```
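- The user agent string alone can be spoofed. Google's documented way to confirm a crawler is a reverse DNS lookup on the IP followed by a forward lookup on the returned name. A minimal sketch of the hostname check only, assuming the PTR name has already been obtained (for example with `host`); `is_google_host` is a hypothetical helper name, and the two accepted domains are an assumption that should be checked against Google's current documentation:

```shell
# Hypothetical helper (not part of the original notes).
# Succeeds if the given hostname ends in a domain Google uses for its
# crawlers. A trailing dot, as printed by `host`, is accepted.
is_google_host() {
    case "$1" in
        *.googlebot.com|*.googlebot.com.|*.google.com|*.google.com.)
            return 0 ;;
        *)
            return 1 ;;
    esac
}

# Example invocation (needs working DNS, so it is left as a comment):
#   name=$(host 66.249.66.153 | awk '{print $NF}')
#   is_google_host "$name" && echo "PTR record is in a Google domain"
# A complete check would also resolve "$name" forward and compare the
# result with the original IP.
```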

- We know Googlebot is persistent but behaves well, so I guess it was just a coincidence that it came at a time when we had other traffic and server activity
- In related news, I see an Atmire update process going for many hours and responsible for hundreds of thousands of log entries (two thirds of all log entries)

```
$ wc -l dspace.log.2017-11-19
388472 dspace.log.2017-11-19
$ grep -c com.atmire.utils.UpdateSolrStatsMetadata dspace.log.2017-11-19
267494
```
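- For reference, 267494 of 388472 lines is about 68.9%. A rough sketch of how the share and the hourly spread of those entries could be computed; `match_share` and `matches_per_hour` are hypothetical helper names, and the hourly breakdown assumes every log line starts with a `YYYY-MM-DD HH:MM:SS` timestamp like the DSpace log lines quoted in these notes:

```shell
# Hypothetical helpers (not part of the original notes).

# Percentage of lines in file $2 that contain the fixed string $1.
match_share() {
    total=$(wc -l < "$2")
    hits=$(grep -cF -- "$1" "$2")
    # Print nothing for an empty file rather than dividing by zero.
    awk -v h="$hits" -v t="$total" 'BEGIN { if (t > 0) printf "%.1f\n", 100 * h / t }'
}

# Number of matching lines per hour of the day.
# Characters 12-13 of "2017-11-19 03:00:32,806 ..." are the hour.
matches_per_hour() {
    grep -F -- "$1" "$2" | cut -c12-13 | sort | uniq -c
}

# Example invocation:
#   match_share com.atmire.utils.UpdateSolrStatsMetadata dspace.log.2017-11-19
#   matches_per_hour com.atmire.utils.UpdateSolrStatsMetadata dspace.log.2017-11-19
```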

- WTF is this process doing every day, and for so many hours?
- In unrelated news, when I was looking at the DSpace logs I saw a bunch of errors like this:

```
2017-11-19 03:00:32,806 INFO org.apache.pdfbox.pdfparser.PDFParser @ Document is encrypted
2017-11-19 03:00:32,807 ERROR org.apache.pdfbox.filter.FlateFilter @ FlateFilter: stop reading corrupt stream due to a DataFormatException
```