cgspace-notes/docs/2019-01/index.html

261 lines
7.2 KiB
HTML

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
<meta property="og:title" content="January, 2019" />
<meta property="og:description" content="2019-01-02
Linode alerted that CGSpace (linode18) had a higher outbound traffic rate than normal early this morning
I don&rsquo;t see anything interesting in the web server logs around that time though:
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &quot;02/Jan/2019:0(1|2|3)&quot; | awk &#39;{print $1}&#39; | sort | uniq -c | sort -n | tail -n 10
92 40.77.167.4
99 210.7.29.100
120 38.126.157.45
177 35.237.175.180
177 40.77.167.32
216 66.249.75.219
225 18.203.76.93
261 46.101.86.248
357 207.46.13.1
903 54.70.40.11
" />
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2019-01/" /><meta property="article:published_time" content="2019-01-02T09:48:30&#43;02:00"/>
<meta property="article:modified_time" content="2019-01-02T09:59:01&#43;02:00"/>
<meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="January, 2019"/>
<meta name="twitter:description" content="2019-01-02
Linode alerted that CGSpace (linode18) had a higher outbound traffic rate than normal early this morning
I don&rsquo;t see anything interesting in the web server logs around that time though:
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &quot;02/Jan/2019:0(1|2|3)&quot; | awk &#39;{print $1}&#39; | sort | uniq -c | sort -n | tail -n 10
92 40.77.167.4
99 210.7.29.100
120 38.126.157.45
177 35.237.175.180
177 40.77.167.32
216 66.249.75.219
225 18.203.76.93
261 46.101.86.248
357 207.46.13.1
903 54.70.40.11
"/>
<meta name="generator" content="Hugo 0.53" />
<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "BlogPosting",
"headline": "January, 2019",
"url": "https://alanorth.github.io/cgspace-notes/2019-01/",
"wordCount": "263",
"datePublished": "2019-01-02T09:48:30&#43;02:00",
"dateModified": "2019-01-02T09:59:01&#43;02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
},
"keywords": "Notes"
}
</script>
<link rel="canonical" href="https://alanorth.github.io/cgspace-notes/2019-01/">
<title>January, 2019 | CGSpace Notes</title>
<!-- combined, minified CSS -->
<link href="https://alanorth.github.io/cgspace-notes/css/style.css" rel="stylesheet" integrity="sha384-6&#43;EGfPoOzk/n2DVJSlglKT8TV1TgIMvVcKI73IZgBswLasPBn94KommV6ilJqCXE" crossorigin="anonymous">
</head>
<body>
<div class="blog-masthead">
<div class="container">
<nav class="nav blog-nav">
<a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>
</nav>
</div>
</div>
<header class="blog-header">
<div class="container">
<h1 class="blog-title"><a href="https://alanorth.github.io/cgspace-notes/" rel="home">CGSpace Notes</a></h1>
<p class="lead blog-description">Documenting day-to-day work on the <a href="https://cgspace.cgiar.org">CGSpace</a> repository.</p>
</div>
</header>
<div class="container">
<div class="row">
<div class="col-sm-8 blog-main">
<article class="blog-post">
<header>
<h2 class="blog-post-title"><a href="https://alanorth.github.io/cgspace-notes/2019-01/">January, 2019</a></h2>
<p class="blog-post-meta"><time datetime="2019-01-02T09:48:30&#43;02:00">Wed Jan 02, 2019</time> by Alan Orth in
<i class="fa fa-tag" aria-hidden="true"></i>&nbsp;<a href="/cgspace-notes/tags/notes" rel="tag">Notes</a>
</p>
</header>
<h2 id="2019-01-02">2019-01-02</h2>
<ul>
<li>Linode alerted that CGSpace (linode18) had a higher outbound traffic rate than normal early this morning</li>
<li>I don&rsquo;t see anything interesting in the web server logs around that time though:</li>
</ul>
<pre><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &quot;02/Jan/2019:0(1|2|3)&quot; | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
92 40.77.167.4
99 210.7.29.100
120 38.126.157.45
177 35.237.175.180
177 40.77.167.32
216 66.249.75.219
225 18.203.76.93
261 46.101.86.248
357 207.46.13.1
903 54.70.40.11
</code></pre>
<ul>
<li>Analyzing the types of requests made by the top few IPs during that time:</li>
</ul>
<pre><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &quot;02/Jan/2019:0(1|2|3)&quot; | grep 54.70.40.11 | grep -o -E &quot;(bitstream|discover|handle)&quot; | sort | uniq -c
30 bitstream
534 discover
352 handle
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &quot;02/Jan/2019:0(1|2|3)&quot; | grep 207.46.13.1 | grep -o -E &quot;(bitstream|discover|handle)&quot; | sort | uniq -c
194 bitstream
345 handle
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &quot;02/Jan/2019:0(1|2|3)&quot; | grep 46.101.86.248 | grep -o -E &quot;(bitstream|discover|handle)&quot; | sort | uniq -c
261 handle
</code></pre>
<ul>
<li>It&rsquo;s not clear to me what was causing the outbound traffic spike</li>
<li>Oh nice! The once-per-year cron job for rotating the Solr statistics actually worked now (for the first time ever!):</li>
</ul>
<pre><code>Moving: 81742 into core statistics-2010
Moving: 1837285 into core statistics-2011
Moving: 3764612 into core statistics-2012
Moving: 4557946 into core statistics-2013
Moving: 5483684 into core statistics-2014
Moving: 2941736 into core statistics-2015
Moving: 5926070 into core statistics-2016
Moving: 10562554 into core statistics-2017
Moving: 18497180 into core statistics-2018
</code></pre>
<ul>
<li>This could by why the outbound traffic rate was high, due to the S3 backup that run at 3:30AM&hellip;</li>
</ul>
<!-- vim: set sw=2 ts=2: -->
</article>
</div> <!-- /.blog-main -->
<aside class="col-sm-3 ml-auto blog-sidebar">
<section class="sidebar-module">
<h4>Recent Posts</h4>
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2019-01/">January, 2019</a></li>
<li><a href="/cgspace-notes/2018-12/">December, 2018</a></li>
<li><a href="/cgspace-notes/2018-11/">November, 2018</a></li>
<li><a href="/cgspace-notes/2018-10/">October, 2018</a></li>
<li><a href="/cgspace-notes/2018-09/">September, 2018</a></li>
</ol>
</section>
<section class="sidebar-module">
<h4>Links</h4>
<ol class="list-unstyled">
<li><a href="https://cgspace.cgiar.org">CGSpace</a></li>
<li><a href="https://dspacetest.cgiar.org">DSpace Test</a></li>
<li><a href="https://github.com/ilri/DSpace">CGSpace @ GitHub</a></li>
</ol>
</section>
</aside>
</div> <!-- /.row -->
</div> <!-- /.container -->
<footer class="blog-footer">
<p>
Blog template created by <a href="https://twitter.com/mdo">@mdo</a>, ported to Hugo by <a href='https://twitter.com/mralanorth'>@mralanorth</a>.
</p>
<p>
<a href="#">Back to top</a>
</p>
</footer>
</body>
</html>