<!DOCTYPE html>
<html lang="en">

  <head>
    <meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">

<meta property="og:title" content="February, 2019" />
<meta property="og:description" content="2019-02-01


Linode has alerted a few times since last night that the CPU usage on CGSpace (linode18) was high, despite my raising the alert threshold from 250% to 275% last week. I might need to raise it again!
The top IPs before, during, and after this latest alert tonight were:


# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &quot;01/Feb/2019:(17|18|19|20|21)&quot; | awk &#39;{print $1}&#39; | sort | uniq -c | sort -n | tail -n 10
    245 207.46.13.5
    332 54.70.40.11
    385 5.143.231.38
    405 207.46.13.173
    405 207.46.13.75
   1117 66.249.66.219
   1121 35.237.175.180
   1546 5.9.6.51
   2474 45.5.186.2
   5490 85.25.237.71



85.25.237.71 is the &ldquo;Linguee Bot&rdquo; that I first saw last month
The Solr statistics for the past few months have been very high, and I was wondering whether the web server logs also showed an increase
There were just over 3 million accesses in the nginx logs last month:


# time zcat --force /var/log/nginx/* | grep -cE &quot;[0-9]{1,2}/Jan/2019&quot;
3018243

real    0m19.873s
user    0m22.203s
sys     0m1.979s
" />
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2019-02/" /><meta property="article:published_time" content="2019-02-01T21:37:30&#43;02:00"/>
<meta property="article:modified_time" content="2019-02-01T21:46:09&#43;02:00"/>

<meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="February, 2019"/>
<meta name="generator" content="Hugo 0.53" />


    
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "BlogPosting",
  "headline": "February, 2019",
  "url": "https://alanorth.github.io/cgspace-notes/2019-02/",
  "wordCount": "227",
  "datePublished": "2019-02-01T21:37:30&#43;02:00",
  "dateModified": "2019-02-01T21:46:09&#43;02:00",
  "author": {
    "@type": "Person",
    "name": "Alan Orth"
  },
  "keywords": "Notes"
}
</script>



    <link rel="canonical" href="https://alanorth.github.io/cgspace-notes/2019-02/">

    <title>February, 2019 | CGSpace Notes</title>

    <!-- combined, minified CSS -->
    <link href="https://alanorth.github.io/cgspace-notes/css/style.css" rel="stylesheet" integrity="sha384-6&#43;EGfPoOzk/n2DVJSlglKT8TV1TgIMvVcKI73IZgBswLasPBn94KommV6ilJqCXE" crossorigin="anonymous">

    

    

    

    

  </head>

  <body>

    
    <div class="blog-masthead">
      <div class="container">
        <nav class="nav blog-nav">
          <a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>
        </nav>
      </div>
    </div>
    

    
    
    <header class="blog-header">
      <div class="container">
        <h1 class="blog-title"><a href="https://alanorth.github.io/cgspace-notes/" rel="home">CGSpace Notes</a></h1>
        <p class="lead blog-description">Documenting day-to-day work on the <a href="https://cgspace.cgiar.org">CGSpace</a> repository.</p>
      </div>
    </header>
    
    

    
    <div class="container">
      <div class="row">
        <div class="col-sm-8 blog-main">

          


<article class="blog-post">
  <header>
    <h2 class="blog-post-title"><a href="https://alanorth.github.io/cgspace-notes/2019-02/">February, 2019</a></h2>
    <p class="blog-post-meta"><time datetime="2019-02-01T21:37:30&#43;02:00">Fri Feb 01, 2019</time> by Alan Orth in 

<i class="fa fa-tag" aria-hidden="true"></i>&nbsp;<a href="/cgspace-notes/tags/notes" rel="tag">Notes</a>

</p>
  </header>
  <h2 id="2019-02-01">2019-02-01</h2>

<ul>
<li>Linode has alerted a few times since last night that the CPU usage on CGSpace (linode18) was high, despite my raising the alert threshold from 250% to 275% last week. I might need to raise it again!</li>
<li>The top IPs before, during, and after this latest alert tonight were:</li>
</ul>

<pre><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &quot;01/Feb/2019:(17|18|19|20|21)&quot; | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
    245 207.46.13.5
    332 54.70.40.11
    385 5.143.231.38
    405 207.46.13.173
    405 207.46.13.75
   1117 66.249.66.219
   1121 35.237.175.180
   1546 5.9.6.51
   2474 45.5.186.2
   5490 85.25.237.71
</code></pre>
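
<ul>
<li>To check which client is behind one of the busy addresses above, one option is to tally the user agents it sent. This is only a minimal sketch, not a command that was run here: <code>agents_for_ip</code> is a hypothetical helper, and it assumes the logs use nginx&rsquo;s <code>combined</code> format:</li>
</ul>

<pre><code># Hypothetical helper: tally the user agents sent by one client IP.
# Assumes nginx "combined" log format; splitting a line on double quotes
# puts the user agent in field 6 and the client IP at the start of field 1.
agents_for_ip() {
    ip="$1"
    shift
    zcat --force "$@" |
        awk -F'"' -v ip="$ip" '{ split($1, head, " "); if (head[1] == ip) print $6 }' |
        sort | uniq -c | sort -rn
}

# Usage: agents_for_ip 85.25.237.71 /var/log/nginx/*.log /var/log/nginx/*.log.1
</code></pre>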

<ul>
<li><code>85.25.237.71</code> is the &ldquo;Linguee Bot&rdquo; that I first saw last month</li>
<li>The Solr statistics for the past few months have been very high, and I was wondering whether the web server logs also showed an increase</li>
<li>There were just over 3 million accesses in the nginx logs last month:</li>
</ul>

<pre><code># time zcat --force /var/log/nginx/* | grep -cE &quot;[0-9]{1,2}/Jan/2019&quot;
3018243

real    0m19.873s
user    0m22.203s
sys     0m1.979s
</code></pre>

<ul>
<li>Normally I&rsquo;d say this was very high, but <a href="/cgspace-notes/2018-02/">about this time last year</a> I remember thinking the same thing when we had 3.1 million&hellip;</li>
<li>I will have to keep an eye on this to see if there is some error in Solr&hellip;</li>
<li>Atmire sent their <a href="https://github.com/ilri/DSpace/pull/407">pull request to re-enable the Metadata Quality Module (MQM) on our <code>5_x-dev</code> branch</a> today

<ul>
<li>I will test it next week and send them feedback</li>
</ul></li>
</ul>
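
<ul>
<li>To compare monthly totals directly rather than from memory, the access count above could be wrapped in a small function. This is a minimal sketch: <code>count_month</code> is a hypothetical helper that reuses the same <code>grep</code> pattern, and it can only count months whose logs are still on disk, which depends on the log rotation settings:</li>
</ul>

<pre><code># Hypothetical helper: count log lines stamped with a given month and year.
# Reuses the grep pattern from the access count above; plain and gzipped
# log files can be mixed because zcat --force passes plain text through.
count_month() {
    month="$1"
    year="$2"
    shift 2
    zcat --force "$@" | grep -cE "[0-9]{1,2}/${month}/${year}"
}

# Usage: count_month Jan 2019 /var/log/nginx/*
</code></pre>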


  

  

</article> 



        </div> <!-- /.blog-main -->

        <aside class="col-sm-3 ml-auto blog-sidebar">
  

  
        <section class="sidebar-module">
    <h4>Recent Posts</h4>
    <ol class="list-unstyled">


<li><a href="/cgspace-notes/2019-02/">February, 2019</a></li>

<li><a href="/cgspace-notes/2019-01/">January, 2019</a></li>

<li><a href="/cgspace-notes/2018-12/">December, 2018</a></li>

<li><a href="/cgspace-notes/2018-11/">November, 2018</a></li>

<li><a href="/cgspace-notes/2018-10/">October, 2018</a></li>

    </ol>
  </section>

  

  
  <section class="sidebar-module">
    <h4>Links</h4>
    <ol class="list-unstyled">
      
      <li><a href="https://cgspace.cgiar.org">CGSpace</a></li>
      
      <li><a href="https://dspacetest.cgiar.org">DSpace Test</a></li>
      
      <li><a href="https://github.com/ilri/DSpace">CGSpace @ GitHub</a></li>
      
    </ol>
  </section>
  
</aside>


      </div> <!-- /.row -->
    </div> <!-- /.container -->
    

    
    <footer class="blog-footer">
      <p>
      
      Blog template created by <a href="https://twitter.com/mdo">@mdo</a>, ported to Hugo by <a href='https://twitter.com/mralanorth'>@mralanorth</a>.
      
      </p>
      <p>
      <a href="#">Back to top</a>
      </p>
    </footer>
    

  </body>

</html>