mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2024-12-21 12:42:18 +01:00
Add notes for 2021-11-07
parent
2ca9096495
commit
b3df4ff58f
@@ -36,4 +36,41 @@ $ ./run.sh -s http://localhost:8081/solr/statistics-2019 -a import -o statistics
- I checked on CGSpace's and I can't find them there either, but I see them in Solr when I query in the admin UI
- I need to debug that, but it doesn't seem to be related to the sharding...

## 2021-11-04

- I spent a little bit of time debugging the Solr bug with the statistics-2019 shard but couldn't reproduce it for the few items I tested
  - So that's good, it seems the sharding worked
- Linode alerted me to high CPU usage on CGSpace (linode18) yesterday
  - Looking at the Solr hits from yesterday I see 91.213.50.11 making 2,300 requests
  - According to AbuseIPDB.com this is owned by Registrarus LLC (registrarus.ru) and it has been reported for malicious activity by several users
  - The ASN is 50340 (SELECTEL-MSK, RU)
  - They are attempting SQL injection:

```console
91.213.50.11 - - [03/Nov/2021:06:47:20 +0100] "HEAD /bitstream/handle/10568/106239/U19ArtSimonikovaChromosomeInthomNodev.pdf?sequence=1%60%20WHERE%206158%3D6158%20AND%204894%3D4741--%20kIlq&isAllowed=y HTTP/1.1" 200 0 "https://cgspace.cgiar.org:443/bitstream/handle/10568/106239/U19ArtSimonikovaChromosomeInthomNodev.pdf" "Mozilla/5.0 (X11; U; Linux i686; en-CA; rv:1.8.0.10) Gecko/20070223 Fedora/1.5.0.10-1.fc5 Firefox/1.5.0.10"
```
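
To make the injected payload in the request above readable, the query string can be percent-decoded. This is a minimal sketch, assuming Python 3 with only the standard library. The `looks_like_sqli` helper and its keyword list are hypothetical and only illustrate the idea; they are not how CGSpace or nginx filters requests.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Request path copied from the nginx log line above.
REQUEST_PATH = (
    "/bitstream/handle/10568/106239/U19ArtSimonikovaChromosomeInthomNodev.pdf"
    "?sequence=1%60%20WHERE%206158%3D6158%20AND%204894%3D4741--%20kIlq&isAllowed=y"
)

# Hypothetical keyword list, for illustration only.
SQL_KEYWORDS = re.compile(r"\b(?:WHERE|UNION|SELECT)\b", re.IGNORECASE)


def decode_query(path):
    """Return the percent-decoded query parameters of a request path."""
    query = urlsplit(path).query
    # parse_qs returns a list per key; keep the first value of each.
    return {key: values[0] for key, values in parse_qs(query).items()}


def looks_like_sqli(value):
    """Very rough heuristic: does the decoded value contain an SQL keyword?"""
    return bool(SQL_KEYWORDS.search(value))


params = decode_query(REQUEST_PATH)
print(params["sequence"])
print(looks_like_sqli(params["sequence"]))
```

The decoded `sequence` value shows a boolean condition appended after the expected `1`, which is what marks the request as an injection attempt.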

- Another is in China, and they grabbed 1,200 PDFs from the REST API in under an hour:

```console
# zgrep 222.129.53.160 /var/log/nginx/rest.log.2.gz | wc -l
1178
```
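
The same count can be produced for every client at once rather than one IP at a time. This is a rough sketch, assuming Python 3 and a log format where the client address is the first whitespace-separated field, as in the nginx log line quoted earlier. The function names are made up for illustration.

```python
import gzip
from collections import Counter


def count_by_client(lines):
    """Count log lines per client, taking the first field as the client IP."""
    counts = Counter()
    for line in lines:
        fields = line.split(None, 1)
        if fields:  # skip empty lines
            counts[fields[0]] += 1
    return counts


def count_gzipped_log(path):
    """Count requests per client in a gzip-compressed log file."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        return count_by_client(handle)


# Example, not run here:
#   count_gzipped_log("/var/log/nginx/rest.log.2.gz").most_common(10)
```

Unlike `zgrep`, this matches only on the client field, so an address that happens to appear in a referrer or URL is not counted.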

- I will continue to split the Solr statistics back into year-shards on DSpace Test (linode26)
  - Today I did all 2018 stats...
  - I want to see if there is a noticeable change in JVM memory, Solr response time, etc

## 2021-11-07

- Update all Docker containers on AReS and rebuild OpenRXV:

```console
$ docker images | grep -v ^REPO | sed 's/ \+/:/g' | cut -d: -f1,2 | xargs -L1 docker pull
$ docker-compose build
```
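
The `sed`/`cut` pipeline above turns the first two columns of `docker images` into `repository:tag` references. A sketch of the same parsing in Python follows, assuming the default table output with `REPOSITORY` and `TAG` as the first two columns. The `image_refs` name is hypothetical, and the function only parses text; it does not call Docker itself.

```python
def image_refs(docker_images_output):
    """Build repository:tag references from `docker images` table output."""
    refs = []
    for line in docker_images_output.splitlines():
        fields = line.split()
        # Skip blank lines and the header row.
        if len(fields) < 2 or fields[0] == "REPOSITORY":
            continue
        repository, tag = fields[0], fields[1]
        # Dangling images have no name or tag that could be pulled.
        if "<none>" in (repository, tag):
            continue
        refs.append(f"{repository}:{tag}")
    return refs
```

Each returned reference could then be passed to `docker pull`. Splitting on whitespace rather than on `:` also keeps repositories that include a registry port, such as `localhost:5000/app`, in one piece.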

- Then restart the server and start a fresh harvest
- Continue splitting the Solr statistics into yearly shards on DSpace Test (doing 2017 today)

<!-- vim: set sw=2 ts=2: -->
@@ -18,7 +18,7 @@ $ zstd statistics-2019.json
 <meta property="og:type" content="article" />
 <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2021-11/" />
 <meta property="article:published_time" content="2021-11-02T22:27:07+02:00" />
-<meta property="article:modified_time" content="2021-11-01T10:49:21+02:00" />
+<meta property="article:modified_time" content="2021-11-03T15:56:15+02:00" />
@@ -42,9 +42,9 @@ $ zstd statistics-2019.json
 "@type": "BlogPosting",
 "headline": "November, 2021",
 "url": "https://alanorth.github.io/cgspace-notes/2021-11/",
-"wordCount": "238",
+"wordCount": "468",
 "datePublished": "2021-11-02T22:27:07+02:00",
-"dateModified": "2021-11-01T10:49:21+02:00",
+"dateModified": "2021-11-03T15:56:15+02:00",
 "author": {
 "@type": "Person",
 "name": "Alan Orth"
@@ -149,6 +149,46 @@ $ ./run.sh -s http://localhost:8081/solr/statistics-2019 -a import -o statistics
</ul>
</li>
</ul>
<h2 id="2021-11-04">2021-11-04</h2>
<ul>
<li>I spent a little bit of time debugging the Solr bug with the statistics-2019 shard but couldn’t reproduce it for the few items I tested
<ul>
<li>So that’s good, it seems the sharding worked</li>
</ul>
</li>
<li>Linode alerted me to high CPU usage on CGSpace (linode18) yesterday
<ul>
<li>Looking at the Solr hits from yesterday I see 91.213.50.11 making 2,300 requests</li>
<li>According to AbuseIPDB.com this is owned by Registrarus LLC (registrarus.ru) and it has been reported for malicious activity by several users</li>
<li>The ASN is 50340 (SELECTEL-MSK, RU)</li>
<li>They are attempting SQL injection:</li>
</ul>
</li>
</ul>
<pre tabindex="0"><code class="language-console" data-lang="console">91.213.50.11 - - [03/Nov/2021:06:47:20 +0100] "HEAD /bitstream/handle/10568/106239/U19ArtSimonikovaChromosomeInthomNodev.pdf?sequence=1%60%20WHERE%206158%3D6158%20AND%204894%3D4741--%20kIlq&isAllowed=y HTTP/1.1" 200 0 "https://cgspace.cgiar.org:443/bitstream/handle/10568/106239/U19ArtSimonikovaChromosomeInthomNodev.pdf" "Mozilla/5.0 (X11; U; Linux i686; en-CA; rv:1.8.0.10) Gecko/20070223 Fedora/1.5.0.10-1.fc5 Firefox/1.5.0.10"
</code></pre><ul>
<li>Another is in China, and they grabbed 1,200 PDFs from the REST API in under an hour:</li>
</ul>
<pre tabindex="0"><code class="language-console" data-lang="console"># zgrep 222.129.53.160 /var/log/nginx/rest.log.2.gz | wc -l
1178
</code></pre><ul>
<li>I will continue to split the Solr statistics back into year-shards on DSpace Test (linode26)
<ul>
<li>Today I did all 2018 stats…</li>
<li>I want to see if there is a noticeable change in JVM memory, Solr response time, etc</li>
</ul>
</li>
</ul>
<h2 id="2021-11-07">2021-11-07</h2>
<ul>
<li>Update all Docker containers on AReS and rebuild OpenRXV:</li>
</ul>
<pre tabindex="0"><code class="language-console" data-lang="console">$ docker images | grep -v ^REPO | sed 's/ \+/:/g' | cut -d: -f1,2 | xargs -L1 docker pull
$ docker-compose build
</code></pre><ul>
<li>Then restart the server and start a fresh harvest</li>
<li>Continue splitting the Solr statistics into yearly shards on DSpace Test (doing 2017 today)</li>
</ul>
<!-- raw HTML omitted -->
@@ -10,7 +10,7 @@
 <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
 <meta property="og:type" content="website" />
 <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/" />
-<meta property="og:updated_time" content="2021-11-01T10:49:21+02:00" />
+<meta property="og:updated_time" content="2021-11-03T15:56:15+02:00" />
@@ -10,7 +10,7 @@
 <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
 <meta property="og:type" content="website" />
 <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
-<meta property="og:updated_time" content="2021-11-01T10:49:21+02:00" />
+<meta property="og:updated_time" content="2021-11-03T15:56:15+02:00" />
@@ -10,7 +10,7 @@
 <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
 <meta property="og:type" content="website" />
 <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
-<meta property="og:updated_time" content="2021-11-01T10:49:21+02:00" />
+<meta property="og:updated_time" content="2021-11-03T15:56:15+02:00" />
@@ -10,7 +10,7 @@
 <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
 <meta property="og:type" content="website" />
 <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
-<meta property="og:updated_time" content="2021-11-01T10:49:21+02:00" />
+<meta property="og:updated_time" content="2021-11-03T15:56:15+02:00" />
@@ -3,19 +3,19 @@
 xmlns:xhtml="http://www.w3.org/1999/xhtml">
 <url>
 <loc>https://alanorth.github.io/cgspace-notes/categories/</loc>
-<lastmod>2021-11-01T10:49:21+02:00</lastmod>
+<lastmod>2021-11-03T15:56:15+02:00</lastmod>
 </url><url>
 <loc>https://alanorth.github.io/cgspace-notes/</loc>
-<lastmod>2021-11-01T10:49:21+02:00</lastmod>
+<lastmod>2021-11-03T15:56:15+02:00</lastmod>
 </url><url>
 <loc>https://alanorth.github.io/cgspace-notes/categories/notes/</loc>
-<lastmod>2021-11-01T10:49:21+02:00</lastmod>
+<lastmod>2021-11-03T15:56:15+02:00</lastmod>
 </url><url>
 <loc>https://alanorth.github.io/cgspace-notes/2021-11/</loc>
-<lastmod>2021-11-01T10:49:21+02:00</lastmod>
+<lastmod>2021-11-03T15:56:15+02:00</lastmod>
 </url><url>
 <loc>https://alanorth.github.io/cgspace-notes/posts/</loc>
-<lastmod>2021-11-01T10:49:21+02:00</lastmod>
+<lastmod>2021-11-03T15:56:15+02:00</lastmod>
 </url><url>
 <loc>https://alanorth.github.io/cgspace-notes/2021-10/</loc>
 <lastmod>2021-11-01T10:48:13+02:00</lastmod>