mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2024-12-18 19:22:18 +01:00
348 lines
12 KiB
HTML
<!DOCTYPE html>
<html lang="en">

<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">

<meta property="og:title" content="September, 2019" />
<meta property="og:description" content="2019-09-01

Linode emailed to say that CGSpace (linode18) had a high rate of outbound traffic for several hours this morning

Here are the top ten IPs in the nginx XMLUI and REST/OAI logs this morning:

# zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E &quot;01/Sep/2019:0&quot; | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
440 17.58.101.255
441 157.55.39.101
485 207.46.13.43
728 169.60.128.125
730 207.46.13.108
758 157.55.39.9
808 66.160.140.179
814 207.46.13.212
2472 163.172.71.23
6092 3.94.211.189
# zcat --force /var/log/nginx/rest.log /var/log/nginx/rest.log.1 /var/log/nginx/oai.log /var/log/nginx/oai.log.1 | grep -E &quot;01/Sep/2019:0&quot; | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
33 2a01:7e00::f03c:91ff:fe16:fcb
57 3.83.192.124
57 3.87.77.25
57 54.82.1.8
822 2a01:9cc0:47:1:1a:4:0:2
1223 45.5.184.72
1633 172.104.229.92
5112 205.186.128.185
7249 2a01:7e00::f03c:91ff:fe18:7396
9124 45.5.186.2

" />
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2019-09/" />
<meta property="article:published_time" content="2019-09-01T10:17:51+03:00" />
<meta property="article:modified_time" content="2019-09-10T17:20:42+03:00" />

<meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="September, 2019"/>
<meta name="twitter:description" content="2019-09-01

Linode emailed to say that CGSpace (linode18) had a high rate of outbound traffic for several hours this morning

Here are the top ten IPs in the nginx XMLUI and REST/OAI logs this morning:

# zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E &quot;01/Sep/2019:0&quot; | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
440 17.58.101.255
441 157.55.39.101
485 207.46.13.43
728 169.60.128.125
730 207.46.13.108
758 157.55.39.9
808 66.160.140.179
814 207.46.13.212
2472 163.172.71.23
6092 3.94.211.189
# zcat --force /var/log/nginx/rest.log /var/log/nginx/rest.log.1 /var/log/nginx/oai.log /var/log/nginx/oai.log.1 | grep -E &quot;01/Sep/2019:0&quot; | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
33 2a01:7e00::f03c:91ff:fe16:fcb
57 3.83.192.124
57 3.87.77.25
57 54.82.1.8
822 2a01:9cc0:47:1:1a:4:0:2
1223 45.5.184.72
1633 172.104.229.92
5112 205.186.128.185
7249 2a01:7e00::f03c:91ff:fe18:7396
9124 45.5.186.2

"/>
<meta name="generator" content="Hugo 0.58.1" />

<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "BlogPosting",
"headline": "September, 2019",
"url": "https:\/\/alanorth.github.io\/cgspace-notes\/2019-09\/",
"wordCount": "685",
"datePublished": "2019-09-01T10:17:51\x2b03:00",
"dateModified": "2019-09-10T17:20:42\x2b03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
},
"keywords": "Notes"
}
</script>

<link rel="canonical" href="https://alanorth.github.io/cgspace-notes/2019-09/">

<title>September, 2019 | CGSpace Notes</title>

<!-- combined, minified CSS -->
<link href="https://alanorth.github.io/cgspace-notes/css/style.css" rel="stylesheet" integrity="sha384-G5B34w7DFTumWTswxYzTX7NWfbvQEg1HbFFEg6ItN03uTAAoS2qkPS/fu3LhuuSA" crossorigin="anonymous">

<!-- RSS 2.0 feed -->

</head>

<body>

<div class="blog-masthead">
<div class="container">
<nav class="nav blog-nav">
<a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>
</nav>
</div>
</div>

<header class="blog-header">
<div class="container">
<h1 class="blog-title"><a href="https://alanorth.github.io/cgspace-notes/" rel="home">CGSpace Notes</a></h1>
<p class="lead blog-description">Documenting day-to-day work on the <a href="https://cgspace.cgiar.org">CGSpace</a> repository.</p>
</div>
</header>

<div class="container">
<div class="row">
<div class="col-sm-8 blog-main">

<article class="blog-post">
<header>
<h2 class="blog-post-title"><a href="https://alanorth.github.io/cgspace-notes/2019-09/">September, 2019</a></h2>
<p class="blog-post-meta"><time datetime="2019-09-01T10:17:51+03:00">Sun Sep 01, 2019</time> by Alan Orth in

<i class="fa fa-tag" aria-hidden="true"></i> <a href="/cgspace-notes/tags/notes" rel="tag">Notes</a>

</p>
</header>
<h2 id="2019-09-01">2019-09-01</h2>

<ul>
<li>Linode emailed to say that CGSpace (linode18) had a high rate of outbound traffic for several hours this morning</li>

<li><p>Here are the top ten IPs in the nginx XMLUI and REST/OAI logs this morning:</p>

<pre><code># zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E "01/Sep/2019:0" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
440 17.58.101.255
441 157.55.39.101
485 207.46.13.43
728 169.60.128.125
730 207.46.13.108
758 157.55.39.9
808 66.160.140.179
814 207.46.13.212
2472 163.172.71.23
6092 3.94.211.189
# zcat --force /var/log/nginx/rest.log /var/log/nginx/rest.log.1 /var/log/nginx/oai.log /var/log/nginx/oai.log.1 | grep -E "01/Sep/2019:0" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
33 2a01:7e00::f03c:91ff:fe16:fcb
57 3.83.192.124
57 3.87.77.25
57 54.82.1.8
822 2a01:9cc0:47:1:1a:4:0:2
1223 45.5.184.72
1633 172.104.229.92
5112 205.186.128.185
7249 2a01:7e00::f03c:91ff:fe18:7396
9124 45.5.186.2
</code></pre></li>
</ul>

<ul>
<li><code>3.94.211.189</code> is MauiBot, and most of its requests are to Discovery and get rate limited with HTTP 503</li>

<li><p><code>163.172.71.23</code> is some IP on Online SAS in France and its user agent is:</p>

<pre><code>Mozilla/5.0 ((Windows; U; Windows NT 6.1; fr; rv:1.9.2) Gecko/20100115 Firefox/3.6)
</code></pre></li>

<li><p>It actually got mostly HTTP 200 responses:</p>

<pre><code># zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E "01/Sep/2019:0" | grep 163.172.71.23 | awk '{print $9}' | sort | uniq -c
1775 200
703 499
72 503
</code></pre></li>

<li><p>And it was mostly requesting Discover pages:</p>

<pre><code># zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E "01/Sep/2019:0" | grep 163.172.71.23 | grep -o -E "(bitstream|discover|handle)" | sort | uniq -c
2350 discover
71 handle
</code></pre></li>

<li><p>I’m not sure why the outbound traffic rate was so high…</p></li>
</ul>
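<ul>
<li><p>One way to dig into the outbound traffic question would be to sum the response sizes per client instead of counting requests. A minimal sketch, assuming these logs use nginx's default <code>combined</code> format, where field 10 is the number of body bytes sent (field 9 being the status code used above); <code>bytes_per_ip</code> is a hypothetical helper name, not something from the original notes:</p>

<pre><code># Sum the bytes sent to each client IP and list the top N (default 10).
# Assumes field 1 is the client IP and field 10 is the body bytes sent.
bytes_per_ip() {
    awk '{ bytes[$1] += $10 } END { for (ip in bytes) print bytes[ip], ip }' | sort -n | tail -n "${1:-10}"
}
</code></pre>

<p>It reads log lines on standard input, so it could stand in for the <code>awk | sort | uniq -c</code> tail of the pipelines above, for example <code>zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E "01/Sep/2019:0" | bytes_per_ip 10</code>.</p></li>
</ul>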

<h2 id="2019-09-02">2019-09-02</h2>

<ul>
<li>Follow up with Carol and Francesca from Bioversity, as they were on holiday during mid-to-late August

<ul>
<li>I told them to check the <a href="https://dspacetest.cgiar.org/handle/10568/103999">temporary collection on DSpace Test</a> where I uploaded the 1,427 items so they can see how it will look</li>
<li>I also asked them to advise me about the strange file extensions (.7z, .zip, .lck)</li>
<li>I also reminded Abenet to check the metadata, as at least the institutional authors will need some modification</li>
</ul></li>
</ul>

<h2 id="2019-09-10">2019-09-10</h2>

<ul>
<li>Altmetric responded to say that they have fixed an issue with their badge code so now research outputs with multiple handles are showing badges!

<ul>
<li>See: <a href="https://hdl.handle.net/handle/10568/97825">https://hdl.handle.net/handle/10568/97825</a></li>
</ul></li>
<li>Follow up with Bosede about the mixup with PDFs in the items uploaded in 2018-12 (aka Daniel1807.xls)

<ul>
<li>These are the same ones that Peter noticed last week, and that Bosede and I discussed earlier this year but never sorted out</li>
<li>It looks like these items were uploaded by Sisay on 2018-12-19 so we can use the <a href="https://cgspace.cgiar.org/handle/10568/68616/discover?filtertype_1=dateAccessioned&filter_relational_operator_1=contains&filter_1=2018-12-19&submit_apply_filter=&query=">accession date as a filter</a> to narrow it down to 230 items (of which only 104 have PDFs, according to the Daniel1807.xls input file)</li>
<li>I just checked a few manually and they are correct in the original input file, so something must have gone wrong when Sisay was processing them for upload</li>
<li>I have asked Sisay to fix them…</li>
</ul></li>
<li>Continue working on CG Core v2 migration, focusing on the crosswalk mappings

<ul>
<li>I think we can skip the MODS crosswalk for now because it is only used in <a href="https://wiki.duraspace.org/display/DSDOC5x/DSpace+AIP+Format#DSpaceAIPFormat-MODSSchema">AIP exports that are meant for non-DSpace systems</a></li>
<li>We should probably do the QDC crosswalk as well as those in <code>xhtml-head-item.properties</code>…</li>
<li>Ouch, there is potentially a lot of work in the OAI metadata formats like DIM, METS, and QDC (see <code>dspace/config/crosswalks/oai/*.xsl</code>)</li>
<li>In general I think I should only modify the left side of the crosswalk mappings (i.e., where the metadata is coming from) so we maintain exactly the same output for search engines, etc</li>
</ul></li>
</ul>
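<ul>
<li><p>The idea of changing only the left side of a crosswalk mapping can be illustrated with a small filter. This is only a sketch, assuming a simple <code>source = target</code> line layout like the one in <code>xhtml-head-item.properties</code>; the helper name <code>rename_source_field</code> and the field names in the usage line are hypothetical, not the actual CG Core v2 mappings:</p>

<pre><code># Rename the source (left-hand) field of "source = target" lines read
# from standard input, leaving the target (right-hand) side untouched.
rename_source_field() {
    awk -v old="$1" -v new="$2" -F' = ' 'BEGIN { OFS = " = " } $1 == old { $1 = new } { print }'
}
</code></pre>

<p>For example, <code>printf 'dc.example.old = DC.example\n' | rename_source_field dc.example.old dcterms.example</code> would print <code>dcterms.example = DC.example</code>, so the name emitted to search engines stays the same.</p></li>
</ul>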

<h2 id="2019-09-11">2019-09-11</h2>

<ul>
<li>Maria Garruccio asked me to add two new Bioversity ORCID identifiers to CGSpace so I created a <a href="https://github.com/ilri/DSpace/pull/431">pull request</a></li>
<li>Marissa Van Epp asked me to add new CCAFS Phase II project tags to CGSpace so I created a <a href="https://github.com/ilri/DSpace/pull/432">pull request</a>

<ul>
<li>I will wait until I hear from her before merging it, because one tag seems to be a duplicate: its name (PII-WA_agrosylvopast) is similar to one that already exists (PII-WA_AgroSylvopastoralSystems)</li>
</ul></li>
<li>More work on the CG Core v2 migrations

<ul>
<li>I have updated my <a href="https://gist.github.com/alanorth/2db39e91f48d116e00a4edffd6ba6409">notes on the possible changes</a> and done more work on the XMLUI replacements</li>
</ul></li>
</ul>

<h2 id="2019-09-12">2019-09-12</h2>

<ul>
<li>Deploy <a href="https://jdbc.postgresql.org/">PostgreSQL JDBC driver</a> version 42.2.7 on DSpace Test and update the <a href="https://github.com/ilri/rmg-ansible-public">Ansible infrastructure scripts</a></li>
</ul>

<!-- vim: set sw=2 ts=2: -->

</article>

</div> <!-- /.blog-main -->

<aside class="col-sm-3 ml-auto blog-sidebar">

<section class="sidebar-module">
<h4>Recent Posts</h4>
<ol class="list-unstyled">

<li><a href="/cgspace-notes/posts/">Posts</a></li>

<li><a href="/cgspace-notes/2019-09/">September, 2019</a></li>

<li><a href="/cgspace-notes/2019-08/">August, 2019</a></li>

<li><a href="/cgspace-notes/2019-07/">July, 2019</a></li>

<li><a href="/cgspace-notes/2019-06/">June, 2019</a></li>

</ol>
</section>

<section class="sidebar-module">
<h4>Links</h4>
<ol class="list-unstyled">

<li><a href="https://cgspace.cgiar.org">CGSpace</a></li>

<li><a href="https://dspacetest.cgiar.org">DSpace Test</a></li>

<li><a href="https://github.com/ilri/DSpace">CGSpace @ GitHub</a></li>

</ol>
</section>

</aside>

</div> <!-- /.row -->
</div> <!-- /.container -->

<footer class="blog-footer">
<p>

Blog template created by <a href="https://twitter.com/mdo">@mdo</a>, ported to Hugo by <a href='https://twitter.com/mralanorth'>@mralanorth</a>.

</p>
<p>
<a href="#">Back to top</a>
</p>
</footer>

</body>

</html>