mirror of https://github.com/alanorth/cgspace-notes.git, synced 2024-11-26 08:28:18 +01:00
Update notes for 2017-10-31
parent db726df881
commit 31dde1c16d
@ -338,3 +338,18 @@ WARNING: [SetPropertiesRule]{Server/Service/Engine/Host/Valve} Setting property
```
# goaccess /var/log/nginx/access.log --log-format=COMBINED
```

- According to Uptime Robot CGSpace went down and up a few times
- I had a look at goaccess and I saw that CORE was actively indexing
- Also, PostgreSQL connections were at 91 (with the max being 60 per web app, hmmm)
- I'm really starting to get annoyed with these guys, and thinking about blocking their IP address for a few days to see if CGSpace becomes more stable
- Actually, come to think of it, they aren't even obeying `robots.txt`, because we actually disallow `/discover` and `/search-filter` URLs but they are hitting those massively:

```
# grep "CORE/0.6" /var/log/nginx/access.log | grep -o -E "GET /(discover|search-filter)" | sort -n | uniq -c | sort -rn
158058 GET /discover
14260 GET /search-filter
```
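
The same tally can be sketched in Python as a rough cross-check. This is only an illustration under stated assumptions: it assumes nginx's default `combined` log format, it matches `CORE/0.6` in the user agent field only (the `grep` above matches it anywhere in the line), and the `count_hits` helper, the regex, and the sample log lines are all made up for the example rather than taken from CGSpace.

```python
import re
from collections import Counter

# Request and user-agent fields of an nginx "combined" log line (assumed format).
LINE_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def count_hits(lines, agent_substring, prefixes):
    """Count GET requests per path prefix for user agents containing agent_substring."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None or match.group("method") != "GET":
            continue
        if agent_substring not in match.group("agent"):
            continue
        for prefix in prefixes:
            if match.group("path").startswith(prefix):
                counts[prefix] += 1
                break
    return counts


# Hypothetical sample lines, NOT real CGSpace log entries.
SAMPLE_LINES = [
    '192.0.2.10 - - [31/Oct/2017:12:00:01 +0200] "GET /discover?query=rice HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; CORE/0.6)"',
    '192.0.2.10 - - [31/Oct/2017:12:00:02 +0200] "GET /discover HTTP/1.1" 200 4096 "-" "Mozilla/5.0 (compatible; CORE/0.6)"',
    '192.0.2.10 - - [31/Oct/2017:12:00:03 +0200] "GET /search-filter?field=subject HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (compatible; CORE/0.6)"',
    '192.0.2.10 - - [31/Oct/2017:12:00:04 +0200] "GET /handle/10568/1 HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (compatible; CORE/0.6)"',
    '198.51.100.7 - - [31/Oct/2017:12:00:05 +0200] "GET /discover HTTP/1.1" 200 4096 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]

hits = count_hits(SAMPLE_LINES, "CORE/0.6", ["/discover", "/search-filter"])
for prefix, count in hits.most_common():
    print(count, "GET", prefix)
```

Against the real log, the lines would come from iterating over `/var/log/nginx/access.log` instead of the sample list.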

- I tested a URL of pattern `/discover` in Google's webmaster tools and it was indeed identified as blocked
- I will send feedback to the CORE bot team
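
The `robots.txt` check can also be reproduced locally with Python's standard `urllib.robotparser`. This is a minimal sketch, not the actual CGSpace configuration: the rules in `ASSUMED_ROBOTS_TXT` are an assumption based only on the statement that `/discover` and `/search-filter` are disallowed, and the `is_allowed` helper, the `CORE` agent name passed to it, and the `/handle/10568/1` path are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumed rules: the notes only say that /discover and /search-filter are
# disallowed; the real robots.txt is not reproduced here.
ASSUMED_ROBOTS_TXT = """\
User-agent: *
Disallow: /discover
Disallow: /search-filter
"""


def is_allowed(robots_txt, user_agent, path):
    """Return True if user_agent may fetch path under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# Hypothetical paths to test against the assumed rules.
for path in ("/discover", "/search-filter?field=subject", "/handle/10568/1"):
    verdict = "allowed" if is_allowed(ASSUMED_ROBOTS_TXT, "CORE", path) else "blocked"
    print(path, verdict)
```

A well-behaved crawler would make this kind of check before each request, so hits on the disallowed paths suggest the bot is skipping it.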
@ -28,7 +28,7 @@ Add Katherine Lutz to the groups for content submission and edit steps of the CGI
<meta property="article:published_time" content="2017-10-01T08:07:54+03:00"/>

<meta property="article:modified_time" content="2017-10-31T13:35:56+02:00"/>
<meta property="article:modified_time" content="2017-10-31T15:38:27+02:00"/>
@ -66,9 +66,9 @@ Add Katherine Lutz to the groups for content submission and edit steps of the CGI
"@type": "BlogPosting",
"headline": "October, 2017",
"url": "https://alanorth.github.io/cgspace-notes/2017-10/",
"wordCount": "2468",
"wordCount": "2613",
"datePublished": "2017-10-01T08:07:54+03:00",
"dateModified": "2017-10-31T13:35:56+02:00",
"dateModified": "2017-10-31T15:38:27+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@ -522,6 +522,24 @@ session_id=6C30F10B4351A4ED83EC6ED50AFD6B6A
<pre><code># goaccess /var/log/nginx/access.log --log-format=COMBINED
</code></pre>

<ul>
<li>According to Uptime Robot CGSpace went down and up a few times</li>
<li>I had a look at goaccess and I saw that CORE was actively indexing</li>
<li>Also, PostgreSQL connections were at 91 (with the max being 60 per web app, hmmm)</li>
<li>I’m really starting to get annoyed with these guys, and thinking about blocking their IP address for a few days to see if CGSpace becomes more stable</li>
<li>Actually, come to think of it, they aren’t even obeying <code>robots.txt</code>, because we actually disallow <code>/discover</code> and <code>/search-filter</code> URLs but they are hitting those massively:</li>
</ul>

<pre><code># grep "CORE/0.6" /var/log/nginx/access.log | grep -o -E "GET /(discover|search-filter)" | sort -n | uniq -c | sort -rn
158058 GET /discover
14260 GET /search-filter
</code></pre>

<ul>
<li>I tested a URL of pattern <code>/discover</code> in Google’s webmaster tools and it was indeed identified as blocked</li>
<li>I will send feedback to the CORE bot team</li>
</ul>
@ -28,7 +28,7 @@ Disallow: /cgspace-notes/2015-12/
Disallow: /cgspace-notes/2015-11/
Disallow: /cgspace-notes/
Disallow: /cgspace-notes/categories/
Disallow: /cgspace-notes/categories/notes/
Disallow: /cgspace-notes/tags/notes/
Disallow: /cgspace-notes/categories/notes/
Disallow: /cgspace-notes/post/
Disallow: /cgspace-notes/tags/
@ -4,7 +4,7 @@
<url>
<loc>https://alanorth.github.io/cgspace-notes/2017-10/</loc>
<lastmod>2017-10-31T13:35:56+02:00</lastmod>
<lastmod>2017-10-31T15:38:27+02:00</lastmod>
</url>

<url>
@ -129,7 +129,7 @@
<url>
<loc>https://alanorth.github.io/cgspace-notes/</loc>
<lastmod>2017-10-31T13:35:56+02:00</lastmod>
<lastmod>2017-10-31T15:38:27+02:00</lastmod>
<priority>0</priority>
</url>
@ -138,27 +138,27 @@
<priority>0</priority>
</url>

<url>
<loc>https://alanorth.github.io/cgspace-notes/tags/notes/</loc>
<lastmod>2017-10-31T15:38:27+02:00</lastmod>
<priority>0</priority>
</url>

<url>
<loc>https://alanorth.github.io/cgspace-notes/categories/notes/</loc>
<lastmod>2017-09-28T12:00:49+03:00</lastmod>
<priority>0</priority>
</url>

<url>
<loc>https://alanorth.github.io/cgspace-notes/tags/notes/</loc>
<lastmod>2017-10-31T13:35:56+02:00</lastmod>
<priority>0</priority>
</url>

<url>
<loc>https://alanorth.github.io/cgspace-notes/post/</loc>
<lastmod>2017-10-31T13:35:56+02:00</lastmod>
<lastmod>2017-10-31T15:38:27+02:00</lastmod>
<priority>0</priority>
</url>

<url>
<loc>https://alanorth.github.io/cgspace-notes/tags/</loc>
<lastmod>2017-10-31T13:35:56+02:00</lastmod>
<lastmod>2017-10-31T15:38:27+02:00</lastmod>
<priority>0</priority>
</url>