Add notes for 2020-08-13

Alan Orth 2020-08-13 17:56:39 +03:00
parent ccecd63eb0
commit eafe422984
Signed by: alanorth
GPG Key ID: 0FB860CC9C45B1B9
20 changed files with 54 additions and 25 deletions

View File

@ -385,4 +385,17 @@ dspace=# SELECT count(text_value) FROM metadatavalue WHERE metadata_field_id = 2
- I noticed a bunch of user agents with "Crawl" in the Solr stats, which is strange because the DSpace spider agents file has had "crawl" for a long time (and it is case insensitive)
- In any case I will purge them and add them to the Tomcat Crawler Session Manager Valve so that at least their sessions get re-used
## 2020-08-13
- Linode keeps sending mails that the load and outgoing bandwidth are above the threshold
  - I took a look briefly and found two IPs with the "Delphi 2009" user agent
  - Then there is 88.99.115.53 which made 82,000 requests in 2020 so far with no user agent
  - 64.62.202.73 has made 7,000 requests with this user agent `Mozilla/5.0 (compatible; +centuryb.o.t9[at]gmail.com)`
  - I had added it to the Tomcat Crawler Session Manager Valve last week but never purged the hits from Solr
  - 195.54.160.163 is making thousands of requests with user agents like this:
`(CASE WHEN 2850=9474 THEN 2850 ELSE NULL END)`
- I purged 150,000 hits from 2020 and 2020 from these user agents and hosts
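
A purge like the one above could be sketched roughly as follows. This is only an illustration and not necessarily the commands that were actually run: the `ip` and `userAgent` field names, the `statistics` core name, and both helper functions are assumptions, while the host, port, and `softCommit=true` parameter follow the `curl` command used earlier in these notes.

```shell
# Sketch only: the field names (ip, userAgent) and the "statistics" core name
# are assumptions, and both functions are hypothetical helpers.

# Print a Solr delete-by-query XML body for one field/value pair.
# Values containing &, < or " would need escaping first (not handled here).
solr_delete_body() {
    printf '<delete><query>%s:"%s"</query></delete>' "$1" "$2"
}

# Post the delete to one statistics core: purge_hits CORE FIELD VALUE
purge_hits() {
    curl -s "http://localhost:8081/solr/$1/update?softCommit=true" \
        -H "Content-Type: text/xml" \
        --data-binary "$(solr_delete_body "$2" "$3")"
}

# Example (not run here):
#   purge_hits statistics ip 88.99.115.53
#   purge_hits statistics userAgent "Delphi 2009"
```

Keeping the query construction separate from the `curl` call makes it possible to inspect the XML body before anything is deleted. User agents such as the SQL-injection string above contain characters that would need escaping before being used as a query value.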
<!-- vim: set sw=2 ts=2: --> <!-- vim: set sw=2 ts=2: -->

View File

@ -19,7 +19,7 @@ It is class based so I can easily add support for other vocabularies, and the te
<meta property="og:type" content="article" /> <meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2020-08/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2020-08/" />
<meta property="article:published_time" content="2020-08-02T15:35:54+03:00" /> <meta property="article:published_time" content="2020-08-02T15:35:54+03:00" />
<meta property="article:modified_time" content="2020-08-10T15:59:22+03:00" /> <meta property="article:modified_time" content="2020-08-11T11:35:05+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="August, 2020"/> <meta name="twitter:title" content="August, 2020"/>
@ -43,9 +43,9 @@ It is class based so I can easily add support for other vocabularies, and the te
"@type": "BlogPosting", "@type": "BlogPosting",
"headline": "August, 2020", "headline": "August, 2020",
"url": "https://alanorth.github.io/cgspace-notes/2020-08/", "url": "https://alanorth.github.io/cgspace-notes/2020-08/",
"wordCount": "2443", "wordCount": "2554",
"datePublished": "2020-08-02T15:35:54+03:00", "datePublished": "2020-08-02T15:35:54+03:00",
"dateModified": "2020-08-10T15:59:22+03:00", "dateModified": "2020-08-11T11:35:05+03:00",
"author": { "author": {
"@type": "Person", "@type": "Person",
"name": "Alan Orth" "name": "Alan Orth"
@ -550,6 +550,22 @@ $ curl -s &quot;http://localhost:8081/solr/statistics-2010/update?softCommit=tru
</ul> </ul>
</li> </li>
</ul> </ul>
<h2 id="2020-08-13">2020-08-13</h2>
<ul>
<li>Linode keeps sending mails that the load and outgoing bandwidth are above the threshold
<ul>
<li>I took a look briefly and found two IPs with the &ldquo;Delphi 2009&rdquo; user agent</li>
<li>Then there is 88.99.115.53 which made 82,000 requests in 2020 so far with no user agent</li>
<li>64.62.202.73 has made 7,000 requests with this user agent <code>Mozilla/5.0 (compatible; +centuryb.o.t9[at]gmail.com)</code></li>
<li>I had added it to the Tomcat Crawler Session Manager Valve last week but never purged the hits from Solr</li>
<li>195.54.160.163 is making thousands of requests with user agents like this:</li>
</ul>
</li>
</ul>
<p><code>(CASE WHEN 2850=9474 THEN 2850 ELSE NULL END)</code></p>
<ul>
<li>I purged 150,000 hits from 2020 and 2020 from these user agents and hosts</li>
</ul>
<!-- raw HTML omitted --> <!-- raw HTML omitted -->

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/" />
<meta property="og:updated_time" content="2020-08-10T15:59:22+03:00" /> <meta property="og:updated_time" content="2020-08-11T11:35:05+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="Categories"/> <meta name="twitter:title" content="Categories"/>

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2020-08-10T15:59:22+03:00" /> <meta property="og:updated_time" content="2020-08-11T11:35:05+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="Notes"/> <meta name="twitter:title" content="Notes"/>

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2020-08-10T15:59:22+03:00" /> <meta property="og:updated_time" content="2020-08-11T11:35:05+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="Notes"/> <meta name="twitter:title" content="Notes"/>

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2020-08-10T15:59:22+03:00" /> <meta property="og:updated_time" content="2020-08-11T11:35:05+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="Notes"/> <meta name="twitter:title" content="Notes"/>

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2020-08-10T15:59:22+03:00" /> <meta property="og:updated_time" content="2020-08-11T11:35:05+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="Notes"/> <meta name="twitter:title" content="Notes"/>

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2020-08-10T15:59:22+03:00" /> <meta property="og:updated_time" content="2020-08-11T11:35:05+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="CGSpace Notes"/> <meta name="twitter:title" content="CGSpace Notes"/>

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2020-08-10T15:59:22+03:00" /> <meta property="og:updated_time" content="2020-08-11T11:35:05+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="CGSpace Notes"/> <meta name="twitter:title" content="CGSpace Notes"/>

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2020-08-10T15:59:22+03:00" /> <meta property="og:updated_time" content="2020-08-11T11:35:05+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="CGSpace Notes"/> <meta name="twitter:title" content="CGSpace Notes"/>

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2020-08-10T15:59:22+03:00" /> <meta property="og:updated_time" content="2020-08-11T11:35:05+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="CGSpace Notes"/> <meta name="twitter:title" content="CGSpace Notes"/>

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2020-08-10T15:59:22+03:00" /> <meta property="og:updated_time" content="2020-08-11T11:35:05+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="CGSpace Notes"/> <meta name="twitter:title" content="CGSpace Notes"/>

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2020-08-10T15:59:22+03:00" /> <meta property="og:updated_time" content="2020-08-11T11:35:05+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="CGSpace Notes"/> <meta name="twitter:title" content="CGSpace Notes"/>

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2020-08-10T15:59:22+03:00" /> <meta property="og:updated_time" content="2020-08-11T11:35:05+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="Posts"/> <meta name="twitter:title" content="Posts"/>

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2020-08-10T15:59:22+03:00" /> <meta property="og:updated_time" content="2020-08-11T11:35:05+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="Posts"/> <meta name="twitter:title" content="Posts"/>

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2020-08-10T15:59:22+03:00" /> <meta property="og:updated_time" content="2020-08-11T11:35:05+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="Posts"/> <meta name="twitter:title" content="Posts"/>

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2020-08-10T15:59:22+03:00" /> <meta property="og:updated_time" content="2020-08-11T11:35:05+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="Posts"/> <meta name="twitter:title" content="Posts"/>

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2020-08-10T15:59:22+03:00" /> <meta property="og:updated_time" content="2020-08-11T11:35:05+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="Posts"/> <meta name="twitter:title" content="Posts"/>

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2020-08-10T15:59:22+03:00" /> <meta property="og:updated_time" content="2020-08-11T11:35:05+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="Posts"/> <meta name="twitter:title" content="Posts"/>

View File

@ -4,27 +4,27 @@
<url> <url>
<loc>https://alanorth.github.io/cgspace-notes/2020-08/</loc> <loc>https://alanorth.github.io/cgspace-notes/2020-08/</loc>
<lastmod>2020-08-10T15:59:22+03:00</lastmod> <lastmod>2020-08-11T11:35:05+03:00</lastmod>
</url> </url>
<url> <url>
<loc>https://alanorth.github.io/cgspace-notes/categories/</loc> <loc>https://alanorth.github.io/cgspace-notes/categories/</loc>
<lastmod>2020-08-10T15:59:22+03:00</lastmod> <lastmod>2020-08-11T11:35:05+03:00</lastmod>
</url> </url>
<url> <url>
<loc>https://alanorth.github.io/cgspace-notes/</loc> <loc>https://alanorth.github.io/cgspace-notes/</loc>
<lastmod>2020-08-10T15:59:22+03:00</lastmod> <lastmod>2020-08-11T11:35:05+03:00</lastmod>
</url> </url>
<url> <url>
<loc>https://alanorth.github.io/cgspace-notes/categories/notes/</loc> <loc>https://alanorth.github.io/cgspace-notes/categories/notes/</loc>
<lastmod>2020-08-10T15:59:22+03:00</lastmod> <lastmod>2020-08-11T11:35:05+03:00</lastmod>
</url> </url>
<url> <url>
<loc>https://alanorth.github.io/cgspace-notes/posts/</loc> <loc>https://alanorth.github.io/cgspace-notes/posts/</loc>
<lastmod>2020-08-10T15:59:22+03:00</lastmod> <lastmod>2020-08-11T11:35:05+03:00</lastmod>
</url> </url>
<url> <url>