Add notes for 2021-10-05

Alan Orth 2021-10-05 18:54:39 +03:00
parent 4c11bc1c1e
commit 63500a8837
Signed by: alanorth
GPG Key ID: 0FB860CC9C45B1B9
26 changed files with 59 additions and 33 deletions

@@ -87,7 +87,7 @@ $ csvgrep -c asn -m 14618 /tmp/mozilla-4.0-ips.csv | csvcut -c ip | sed 1d | tee
290382 GET /handle/10568/83389
```
- Before I purge all those I will ask someone Samuel Stacey from the System office to hopefully get an insight...
- Before I purge all those I will ask someone Samuel Stacey from the System Office to hopefully get an insight...
- Meeting with Michael Victor, Peter, Jane, and Abenet about the future of repositories in the One CGIAR
- Meeting with Michelle from Altmetric about their new CSV upload system
- I sent her some examples of Handles that have DOIs, but no linked score (yet) to see if an association will be created when she uploads them
@@ -107,4 +107,17 @@ $ ./ilri/agrovoc-lookup.py -i /tmp/agrovoc-sorted.txt -o /tmp/agrovoc-matches.cs
$ csvgrep -c 'number of matches' -m '0' /tmp/agrovoc-matches.csv | csvcut -c 1 > /tmp/invalid-agrovoc.csv
```
## 2021-10-05
- Sam put me in touch with Dodi from the System Office web team and he confirmed that the Amazon requests are not theirs
- I added `Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)` to the list of bad bots in nginx
- I purged all the Amazon IPs using this user agent, as well as the few other IPs I identified yesterday
```console
$ ./ilri/check-spider-ip-hits.sh -f /tmp/robot-ips.txt -p
...
Total number of bot hits purged: 465119
```
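- A minimal sketch (not the script used above) of how an IP list like `/tmp/robot-ips.txt` could be assembled in Python from a CSV with `ip` and `asn` columns, mirroring the earlier `csvgrep -c asn -m 14618 ... | csvcut -c ip | sed 1d` pipeline; the function name and sample rows are hypothetical, and the sample IPs are from the reserved documentation ranges

```python
import csv
import io


def ips_for_asn(csv_text: str, asn: str) -> list[str]:
    """Return the unique IPs whose `asn` column equals `asn`, in first-seen order.

    Assumes the CSV has a header row with `ip` and `asn` columns.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    seen: dict[str, None] = {}  # dict keeps insertion order and drops duplicates
    for row in reader:
        if row["asn"].strip() == asn:
            seen.setdefault(row["ip"].strip(), None)
    return list(seen)


# Hypothetical sample data standing in for the real CSV
sample = (
    "ip,asn\n"
    "192.0.2.10,14618\n"
    "198.51.100.7,16509\n"
    "192.0.2.10,14618\n"
    "192.0.2.11,14618\n"
)

# One IP per line, the shape a plain-text IP list would have
print("\n".join(ips_for_asn(sample, "14618")))
```

- Note that this compares the ASN exactly and de-duplicates the IPs, which is a design choice of the sketch rather than a description of what `csvgrep -m` does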
<!-- vim: set sw=2 ts=2: -->

@@ -25,7 +25,7 @@ So we have 1879/7100 (26.46%) matching already
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2021-10/" />
<meta property="article:published_time" content="2021-10-01T11:14:07+03:00" />
<meta property="article:modified_time" content="2021-10-01T11:14:07+03:00" />
<meta property="article:modified_time" content="2021-10-04T19:40:13+03:00" />
@@ -56,9 +56,9 @@ So we have 1879/7100 (26.46%) matching already
"@type": "BlogPosting",
"headline": "October, 2021",
"url": "https://alanorth.github.io/cgspace-notes/2021-10/",
"wordCount": "697",
"wordCount": "771",
"datePublished": "2021-10-01T11:14:07+03:00",
"dateModified": "2021-10-01T11:14:07+03:00",
"dateModified": "2021-10-04T19:40:13+03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -216,7 +216,7 @@ $ csvgrep -c asn -m 14618 /tmp/mozilla-4.0-ips.csv | csvcut -c ip | sed 1d | tee
1607 GET /handle/10568/103816
290382 GET /handle/10568/83389
</code></pre><ul>
<li>Before I purge all those I will ask someone Samuel Stacey from the System office to hopefully get an insight&hellip;</li>
<li>Before I purge all those I will ask someone Samuel Stacey from the System Office to hopefully get an insight&hellip;</li>
<li>Meeting with Michael Victor, Peter, Jane, and Abenet about the future of repositories in the One CGIAR</li>
<li>Meeting with Michelle from Altmetric about their new CSV upload system
<ul>
@@ -234,6 +234,19 @@ $ csvgrep -c asn -m 14618 /tmp/mozilla-4.0-ips.csv | csvcut -c ip | sed 1d | tee
<pre tabindex="0"><code class="language-console" data-lang="console">$ csvcut -c 'dcterms.subject[en_US]' ~/Downloads/2021-10-03-non-IWMI-publications.csv | sed -e 1d -e 's/||/\n/g' -e 's/&quot;//g' | sort -u &gt; /tmp/agrovoc.txt
$ ./ilri/agrovoc-lookup.py -i /tmp/agrovoc-sorted.txt -o /tmp/agrovoc-matches.csv
$ csvgrep -c 'number of matches' -m '0' /tmp/agrovoc-matches.csv | csvcut -c 1 &gt; /tmp/invalid-agrovoc.csv
</code></pre><h2 id="2021-10-05">2021-10-05</h2>
<ul>
<li>Sam put me in touch with Dodi from the System Office web team and he confirmed that the Amazon requests are not theirs
<ul>
<li>I added <code>Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)</code> to the list of bad bots in nginx</li>
<li>I purged all the Amazon IPs using this user agent, as well as the few other IPs I identified yesterday</li>
</ul>
</li>
</ul>
<pre tabindex="0"><code class="language-console" data-lang="console">$ ./ilri/check-spider-ip-hits.sh -f /tmp/robot-ips.txt -p
...
Total number of bot hits purged: 465119
</code></pre><!-- raw HTML omitted -->

@@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
<meta property="og:updated_time" content="2021-10-04T19:40:13+03:00" />

@@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
<meta property="og:updated_time" content="2021-10-04T19:40:13+03:00" />

@@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
<meta property="og:updated_time" content="2021-10-04T19:40:13+03:00" />

@@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
<meta property="og:updated_time" content="2021-10-04T19:40:13+03:00" />

@@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
<meta property="og:updated_time" content="2021-10-04T19:40:13+03:00" />

@@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
<meta property="og:updated_time" content="2021-10-04T19:40:13+03:00" />

@@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
<meta property="og:updated_time" content="2021-10-04T19:40:13+03:00" />

@@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
<meta property="og:updated_time" content="2021-10-04T19:40:13+03:00" />

@@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
<meta property="og:updated_time" content="2021-10-04T19:40:13+03:00" />

@@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
<meta property="og:updated_time" content="2021-10-04T19:40:13+03:00" />

@@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
<meta property="og:updated_time" content="2021-10-04T19:40:13+03:00" />

@@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
<meta property="og:updated_time" content="2021-10-04T19:40:13+03:00" />

@@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
<meta property="og:updated_time" content="2021-10-04T19:40:13+03:00" />

@@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
<meta property="og:updated_time" content="2021-10-04T19:40:13+03:00" />

@@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
<meta property="og:updated_time" content="2021-10-04T19:40:13+03:00" />

@@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
<meta property="og:updated_time" content="2021-10-04T19:40:13+03:00" />

@@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
<meta property="og:updated_time" content="2021-10-04T19:40:13+03:00" />

@@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
<meta property="og:updated_time" content="2021-10-04T19:40:13+03:00" />

@@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
<meta property="og:updated_time" content="2021-10-04T19:40:13+03:00" />

@@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
<meta property="og:updated_time" content="2021-10-04T19:40:13+03:00" />

@@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
<meta property="og:updated_time" content="2021-10-04T19:40:13+03:00" />

@@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
<meta property="og:updated_time" content="2021-10-04T19:40:13+03:00" />

@@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
<meta property="og:updated_time" content="2021-10-04T19:40:13+03:00" />

@@ -3,19 +3,19 @@
xmlns:xhtml="http://www.w3.org/1999/xhtml">
<url>
<loc>https://alanorth.github.io/cgspace-notes/categories/</loc>
<lastmod>2021-10-04T11:10:54+03:00</lastmod>
<lastmod>2021-10-04T19:40:13+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/</loc>
<lastmod>2021-10-04T11:10:54+03:00</lastmod>
<lastmod>2021-10-04T19:40:13+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/categories/notes/</loc>
<lastmod>2021-10-04T11:10:54+03:00</lastmod>
<lastmod>2021-10-04T19:40:13+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/2021-10/</loc>
<lastmod>2021-10-01T11:14:07+03:00</lastmod>
<lastmod>2021-10-04T19:40:13+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/posts/</loc>
<lastmod>2021-10-04T11:10:54+03:00</lastmod>
<lastmod>2021-10-04T19:40:13+03:00</lastmod>
</url><url>
<loc>https://alanorth.github.io/cgspace-notes/2021-09/</loc>
<lastmod>2021-10-04T11:10:54+03:00</lastmod>