Add notes for 2018-12-17

Alan Orth 2018-12-17 22:35:55 +02:00
parent 8443c2c86c
commit 2c7ee40e8d
Signed by: alanorth
GPG Key ID: 0FB860CC9C45B1B9
3 changed files with 68 additions and 8 deletions

@@ -369,4 +369,32 @@ $ grep -o -E 'session_id=[A-Z0-9]{32}:ip_addr=54.70.40.11' dspace.log.2018-12-05
- Also, I notice they ended up registering a Handle (they had been considering taking KnowledgeArc's advice to *not* use Handles!)
- Did some coordination work on the hotel bookings for the January AReS workshop in Amman
## 2018-12-17
- Linode alerted me twice today that the load on CGSpace (linode18) was very high
- Looking at the nginx logs I see a few new IPs in the top 10:
```
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "17/Dec/2018" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
927 157.55.39.81
975 54.70.40.11
2090 50.116.102.77
2121 66.249.66.219
3811 35.237.175.180
4590 205.186.128.185
4590 70.32.83.92
5436 2a01:4f8:173:1e85::2
5438 143.233.227.216
6706 94.71.244.172
```
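The same per-client tally can be sketched in Python for cases where the counts need further processing. This is a minimal sketch, not the command that produced the output above: the function name `top_clients` and the sample log lines are hypothetical, and it assumes the client address is the first whitespace-separated field, as `awk '{print $1}'` does.

```python
from collections import Counter


def top_clients(lines, date, n=10):
    """Return the n busiest client addresses among log lines that mention date."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Like awk's $1: the first whitespace-separated field is the client address.
        if fields and date in line:
            counts[fields[0]] += 1
    # most_common() lists the busiest client first, the reverse of `sort -n | tail`.
    return counts.most_common(n)


# Made-up sample lines in nginx's combined log format (not real CGSpace traffic).
sample = [
    '94.71.244.172 - - [17/Dec/2018:10:00:00 +0100] "GET / HTTP/1.1" 200 512 "-" "-"',
    '143.233.227.216 - - [17/Dec/2018:10:00:01 +0100] "GET / HTTP/1.1" 200 512 "-" "-"',
    '94.71.244.172 - - [17/Dec/2018:10:00:02 +0100] "GET / HTTP/1.1" 200 512 "-" "-"',
    '94.71.244.172 - - [16/Dec/2018:23:59:59 +0100] "GET / HTTP/1.1" 200 512 "-" "-"',
]

print(top_clients(sample, "17/Dec/2018", n=2))
```

Unlike the shell pipeline, this version does not read rotated or compressed logs itself; the caller is expected to pass in an iterable of lines.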
- `94.71.244.172` and `143.233.227.216` are both in Greece and use the following user agent:
```
Mozilla/3.0 (compatible; Indy Library)
```
- I see that I added this bot to the Tomcat Crawler Session Manager valve in 2017-12 so its XMLUI sessions are getting re-used
- `2a01:4f8:173:1e85::2` is some new bot called `BLEXBot/1.0` which should be matching the existing "bot" pattern in the Tomcat Crawler Session Manager regex
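Whether a user agent is caught by a crawler regex depends on case sensitivity, which is easy to check offline. The sketch below is only an illustration: both patterns are hypothetical stand-ins rather than the regex actually deployed on CGSpace, and `re.fullmatch` is used on the assumption that the valve requires the whole user agent string to match.

```python
import re

# Hypothetical patterns for illustration; the real valve configuration may differ.
LOWERCASE_ONLY = re.compile(r".*bot.*")
MIXED_CASE = re.compile(r".*[bB]ot.*|.*Indy Library.*")


def is_crawler(pattern, user_agent):
    """Return True if the entire user agent string matches the pattern."""
    return pattern.fullmatch(user_agent) is not None


for agent in ("BLEXBot/1.0", "Mozilla/3.0 (compatible; Indy Library)"):
    print(agent, is_crawler(LOWERCASE_ONLY, agent), is_crawler(MIXED_CASE, agent))
```

A lowercase-only `bot` pattern would miss `BLEXBot/1.0` because of the capital `B`, so it is worth confirming that the configured pattern covers both cases.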
<!-- vim: set sw=2 ts=2: -->

@@ -21,7 +21,7 @@ I noticed that there is another issue with PDF thumbnails on CGSpace, and I see
" />
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2018-12/" /><meta property="article:published_time" content="2018-12-02T02:09:30&#43;02:00"/>
<meta property="article:modified_time" content="2018-12-11T12:27:53&#43;03:00"/> <meta property="article:modified_time" content="2018-12-13T22:50:17&#43;03:00"/>
<meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="December, 2018"/>
@@ -48,9 +48,9 @@ I noticed that there is another issue with PDF thumbnails on CGSpace, and I see
"@type": "BlogPosting",
"headline": "December, 2018",
"url": "https://alanorth.github.io/cgspace-notes/2018-12/",
"wordCount": "2311", "wordCount": "2448",
"datePublished": "2018-12-02T02:09:30&#43;02:00",
"dateModified": "2018-12-11T12:27:53&#43;03:00", "dateModified": "2018-12-13T22:50:17&#43;03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -535,6 +535,38 @@ $ grep -o -E 'session_id=[A-Z0-9]{32}:ip_addr=54.70.40.11' dspace.log.2018-12-05
<li>Did some coordination work on the hotel bookings for the January AReS workshop in Amman</li>
</ul>
<h2 id="2018-12-17">2018-12-17</h2>
<ul>
<li>Linode alerted me twice today that the load on CGSpace (linode18) was very high</li>
<li>Looking at the nginx logs I see a few new IPs in the top 10:</li>
</ul>
<pre><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &quot;17/Dec/2018&quot; | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
927 157.55.39.81
975 54.70.40.11
2090 50.116.102.77
2121 66.249.66.219
3811 35.237.175.180
4590 205.186.128.185
4590 70.32.83.92
5436 2a01:4f8:173:1e85::2
5438 143.233.227.216
6706 94.71.244.172
</code></pre>
<ul>
<li><code>94.71.244.172</code> and <code>143.233.227.216</code> are both in Greece and use the following user agent:</li>
</ul>
<pre><code>Mozilla/3.0 (compatible; Indy Library)
</code></pre>
<ul>
<li>I see that I added this bot to the Tomcat Crawler Session Manager valve in 2017-12 so its XMLUI sessions are getting re-used</li>
<li><code>2a01:4f8:173:1e85::2</code> is some new bot called <code>BLEXBot/1.0</code> which should be matching the existing &ldquo;bot&rdquo; pattern in the Tomcat Crawler Session Manager regex</li>
</ul>
<!-- vim: set sw=2 ts=2: -->

@@ -4,7 +4,7 @@
<url>
<loc>https://alanorth.github.io/cgspace-notes/2018-12/</loc>
<lastmod>2018-12-11T12:27:53+03:00</lastmod> <lastmod>2018-12-13T22:50:17+03:00</lastmod>
</url>
<url>
@@ -199,7 +199,7 @@
<url>
<loc>https://alanorth.github.io/cgspace-notes/</loc>
<lastmod>2018-12-11T12:27:53+03:00</lastmod> <lastmod>2018-12-13T22:50:17+03:00</lastmod>
<priority>0</priority>
</url>
@@ -210,7 +210,7 @@
<url>
<loc>https://alanorth.github.io/cgspace-notes/tags/notes/</loc>
<lastmod>2018-12-11T12:27:53+03:00</lastmod> <lastmod>2018-12-13T22:50:17+03:00</lastmod>
<priority>0</priority>
</url>
@@ -222,13 +222,13 @@
<url>
<loc>https://alanorth.github.io/cgspace-notes/posts/</loc>
<lastmod>2018-12-11T12:27:53+03:00</lastmod> <lastmod>2018-12-13T22:50:17+03:00</lastmod>
<priority>0</priority>
</url>
<url>
<loc>https://alanorth.github.io/cgspace-notes/tags/</loc>
<lastmod>2018-12-11T12:27:53+03:00</lastmod> <lastmod>2018-12-13T22:50:17+03:00</lastmod>
<priority>0</priority>
</url>