Update notes for 2020-07-24

2020-07-24 23:23:15 +03:00
parent 6b75032413
commit 9e6ff5d999
21 changed files with 223 additions and 30 deletions

@@ -24,7 +24,7 @@ I think I will need to ask Udana to re-copy and paste the abstracts with more ca
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2019-03/" />
<meta property="article:published_time" content="2019-03-01T12:16:30+01:00" />
-<meta property="article:modified_time" content="2020-04-13T15:30:24+03:00" />
+<meta property="article:modified_time" content="2020-07-24T21:57:55+03:00" />
<meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="March, 2019"/>
@@ -55,7 +55,7 @@ I think I will need to ask Udana to re-copy and paste the abstracts with more ca
"url": "https://alanorth.github.io/cgspace-notes/2019-03/",
"wordCount": "7105",
"datePublished": "2019-03-01T12:16:30+01:00",
-"dateModified": "2020-04-13T15:30:24+03:00",
+"dateModified": "2020-07-24T21:57:55+03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -951,7 +951,7 @@ $ grep -o -E 'session_id=[A-Z0-9]{32}' dspace.log.2019-03-22 | sort -u | wc -l
712 35.174.184.209
784 2a01:4f8:13b:1296::2
</code></pre><ul>
-<li>The two IPV6 addresses are something called BLEXBot, which seems to check the robots.txt file and the completely ignore it by making thousands of requests to dynamic pages like Browse and Discovery</li>
+<li>The two IPV6 addresses are something called BLEXBot, which seems to check the robots.txt file and then completely ignore it by making thousands of requests to dynamic pages like Browse and Discovery</li>
<li>Then <code>35.174.184.209</code> is MauiBot, which does the same thing</li>
<li>Also <code>3.91.79.74</code> does, which appears to be CCBot</li>
<li>I will add these three to the &ldquo;bad bot&rdquo; rate limiting that I originally used for Baidu</li>