Add notes for 2020-01-27

2020-01-27 16:20:44 +02:00
parent 207ace0883
commit 8feb93be39
112 changed files with 11466 additions and 5158 deletions

@@ -9,7 +9,7 @@
<meta property="og:description" content="2019-01-02
Linode alerted that CGSpace (linode18) had a higher outbound traffic rate than normal early this morning
I don&#39;t see anything interesting in the web server logs around that time though:
I don&rsquo;t see anything interesting in the web server logs around that time though:
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &quot;02/Jan/2019:0(1|2|3)&quot; | awk &#39;{print $1}&#39; | sort | uniq -c | sort -n | tail -n 10
92 40.77.167.4
@@ -33,7 +33,7 @@ I don&#39;t see anything interesting in the web server logs around that time tho
<meta name="twitter:description" content="2019-01-02
Linode alerted that CGSpace (linode18) had a higher outbound traffic rate than normal early this morning
I don&#39;t see anything interesting in the web server logs around that time though:
I don&rsquo;t see anything interesting in the web server logs around that time though:
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &quot;02/Jan/2019:0(1|2|3)&quot; | awk &#39;{print $1}&#39; | sort | uniq -c | sort -n | tail -n 10
92 40.77.167.4
@@ -47,7 +47,7 @@ I don&#39;t see anything interesting in the web server logs around that time tho
357 207.46.13.1
903 54.70.40.11
"/>
<meta name="generator" content="Hugo 0.62.2" />
<meta name="generator" content="Hugo 0.63.1" />
@@ -77,7 +77,7 @@ I don&#39;t see anything interesting in the web server logs around that time tho
<!-- combined, minified CSS -->
<link href="https://alanorth.github.io/cgspace-notes/css/style.a20c1a4367639632cdb341d23c27ca44fedcc75b0f8b3cbea6203010da153d3c.css" rel="stylesheet" integrity="sha256-ogwaQ2djljLNs0HSPCfKRP7cx1sPizy&#43;piAwENoVPTw=" crossorigin="anonymous">
<link href="https://alanorth.github.io/cgspace-notes/css/style.23e2c3298bcc8c1136c19aba330c211ec94c36f7c4454ea15cf4d3548370042a.css" rel="stylesheet" integrity="sha256-I&#43;LDKYvMjBE2wZq6MwwhHslMNvfERU6hXPTTVINwBCo=" crossorigin="anonymous">
<!-- RSS 2.0 feed -->
@@ -124,7 +124,7 @@ I don&#39;t see anything interesting in the web server logs around that time tho
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-01/">January, 2019</a></h2>
<p class="blog-post-meta"><time datetime="2019-01-02T09:48:30&#43;02:00">Wed Jan 02, 2019</time> by Alan Orth in
<i class="fa fa-folder" aria-hidden="true"></i>&nbsp;<a href="/cgspace-notes/categories/notes" rel="category tag">Notes</a>
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes" rel="category tag">Notes</a>
</p>
@@ -132,7 +132,7 @@ I don&#39;t see anything interesting in the web server logs around that time tho
<h2 id="2019-01-02">2019-01-02</h2>
<ul>
<li>Linode alerted that CGSpace (linode18) had a higher outbound traffic rate than normal early this morning</li>
<li>I don't see anything interesting in the web server logs around that time though:</li>
<li>I don&rsquo;t see anything interesting in the web server logs around that time though:</li>
</ul>
<pre><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &quot;02/Jan/2019:0(1|2|3)&quot; | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
92 40.77.167.4
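</code></pre><ul>
<li>(For reference, a rough Python sketch of the same top-IPs count, since I keep using this one-liner pattern; it assumes the same log locations and date filter as the command above and I have not run it against the real logs:)</li>
</ul>
<pre><code>import glob
import gzip
import re
from collections import Counter

# Mirror: zcat --force ... | grep -E &quot;02/Jan/2019:0(1|2|3)&quot; | awk '{print $1}'
#         | sort | uniq -c | sort -n | tail -n 10
pattern = re.compile(r'02/Jan/2019:0[123]')
counts = Counter()

for path in glob.glob('/var/log/nginx/*.log') + glob.glob('/var/log/nginx/*.log.1'):
    # zcat --force reads plain and gzipped files alike, so do the same here
    opener = gzip.open if path.endswith('.gz') else open
    with opener(path, 'rt', errors='replace') as f:
        for line in f:
            if pattern.search(line):
                counts[line.split()[0]] += 1  # the client IP is the first field

for ip, n in counts.most_common(10):
    print(n, ip)
</code></pre>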
@@ -158,7 +158,7 @@ I don&#39;t see anything interesting in the web server logs around that time tho
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &quot;02/Jan/2019:0(1|2|3)&quot; | grep 46.101.86.248 | grep -o -E &quot;(bitstream|discover|handle)&quot; | sort | uniq -c
261 handle
</code></pre><ul>
<li>It's not clear to me what was causing the outbound traffic spike</li>
<li>It&rsquo;s not clear to me what was causing the outbound traffic spike</li>
<li>Oh nice! The once-per-year cron job for rotating the Solr statistics actually worked now (for the first time ever!):</li>
</ul>
<pre><code>Moving: 81742 into core statistics-2010
@@ -182,7 +182,7 @@ Moving: 18497180 into core statistics-2018
$ sudo docker rm dspacedb
$ sudo docker run --name dspacedb -v /home/aorth/.local/lib/containers/volumes/dspacedb_data:/var/lib/postgresql/data -e POSTGRES_PASSWORD=postgres -p 5432:5432 -d postgres:9.6-alpine
</code></pre><ul>
<li>Testing DSpace 5.9 with Tomcat 8.5.37 on my local machine, I see that Atmire's Listings and Reports still doesn't work
<li>Testing DSpace 5.9 with Tomcat 8.5.37 on my local machine, I see that Atmire&rsquo;s Listings and Reports still doesn&rsquo;t work
<ul>
<li>After logging in via XMLUI and clicking the Listings and Reports link from the sidebar it redirects me to a JSPUI login page</li>
<li>If I log in again there the Listings and Reports work&hellip; hmm.</li>
@@ -264,17 +264,17 @@ org.apache.jasper.JasperException: /home.jsp (line: [214], column: [1]) /discove
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:748)
</code></pre><ul>
<li>I notice that I get different JSESSIONID cookies for <code>/</code> (XMLUI) and <code>/jspui</code> (JSPUI) on Tomcat 8.5.37; I wonder if it's the same on Tomcat 7.0.92&hellip; yes, I do (a quick check is sketched below).</li>
<li>I notice that I get different JSESSIONID cookies for <code>/</code> (XMLUI) and <code>/jspui</code> (JSPUI) on Tomcat 8.5.37; I wonder if it&rsquo;s the same on Tomcat 7.0.92&hellip; yes, I do (a quick check is sketched below).</li>
<li>Hmm, on Tomcat 7.0.92 I see that I get a <code>dspace.current.user.id</code> session cookie after logging into XMLUI, and then when I browse to JSPUI I am still logged in&hellip;
<ul>
<li>I didn't see that cookie being set on Tomcat 8.5.37</li>
<li>I didn&rsquo;t see that cookie being set on Tomcat 8.5.37</li>
</ul>
</li>
<li>I sent a message to the dspace-tech mailing list to ask</li>
</ul>
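<ul>
<li>A quick way to reproduce the different-cookies observation above without clicking around (just a hypothetical sketch using Python's <code>requests</code>; the localhost URL is an assumption based on my local Tomcat setup):</li>
</ul>
<pre><code>import requests

# Use one session so cookies accumulate like they would in a browser
session = requests.Session()
base = 'http://localhost:8080'

# Request the XMLUI and JSPUI contexts
for context in ('/', '/jspui'):
    session.get(base + context)

# JSESSIONID cookies are scoped by path, so print name, path, and value;
# on Tomcat 8.5.37 I would expect two different JSESSIONIDs here
for cookie in session.cookies:
    print(cookie.name, cookie.path, cookie.value)
</code></pre>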
<h2 id="2019-01-04">2019-01-04</h2>
<ul>
<li>Linode sent a message last night that CGSpace (linode18) had high CPU usage, but I don't see anything around that time in the web server logs:</li>
<li>Linode sent a message last night that CGSpace (linode18) had high CPU usage, but I don&rsquo;t see anything around that time in the web server logs:</li>
</ul>
<pre><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &quot;03/Jan/2019:1(7|8|9)&quot; | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
189 207.46.13.192
@@ -288,7 +288,7 @@ org.apache.jasper.JasperException: /home.jsp (line: [214], column: [1]) /discove
1776 66.249.70.27
2099 54.70.40.11
</code></pre><ul>
<li>I'm thinking about trying to validate our <code>dc.subject</code> terms against <a href="http://aims.fao.org/agrovoc/webservices">AGROVOC webservices</a></li>
<li>I&rsquo;m thinking about trying to validate our <code>dc.subject</code> terms against <a href="http://aims.fao.org/agrovoc/webservices">AGROVOC webservices</a></li>
<li>There seem to be a few APIs and the documentation is kinda confusing, but I found this REST endpoint that does work well, for example searching for <code>SOIL</code>:</li>
</ul>
<pre><code>$ http http://agrovoc.uniroma2.it/agrovoc/rest/v1/search?query=SOIL&amp;lang=en
@@ -336,7 +336,7 @@ X-Frame-Options: ALLOW-FROM http://aims.fao.org
}
</code></pre><ul>
<li>The API does not appear to be case sensitive (searches for <code>SOIL</code> and <code>soil</code> return the same thing)</li>
<li>I'm a bit confused that there's no obvious return code or status when a term is not found, for example <code>SOILS</code>:</li>
<li>I&rsquo;m a bit confused that there&rsquo;s no obvious return code or status when a term is not found, for example <code>SOILS</code>:</li>
</ul>
<pre><code>HTTP/1.1 200 OK
Access-Control-Allow-Origin: *
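</code></pre><ul>
<li>So a validation script would have to treat an empty results array as &ldquo;not found&rdquo; rather than rely on the status code; a minimal sketch (assuming the response JSON keeps its matches under a <code>results</code> key, as in the <code>SOIL</code> example above):</li>
</ul>
<pre><code>import requests

def agrovoc_has_term(term, lang='en'):
    # The endpoint returns HTTP 200 whether or not the term exists,
    # so we have to inspect the results array ourselves
    response = requests.get(
        'http://agrovoc.uniroma2.it/agrovoc/rest/v1/search',
        params={'query': term, 'lang': lang},
    )
    response.raise_for_status()
    return len(response.json()['results']) &gt; 0

print(agrovoc_has_term('SOIL'))   # True
print(agrovoc_has_term('SOILS'))  # False, but still HTTP 200
</code></pre>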
@@ -428,8 +428,8 @@ In [14]: for row in result.fetchone():
<ul>
<li>Tim Donohue responded to my thread about the cookies on the dspace-tech mailing list
<ul>
<li>He suspects it's a change of behavior in Tomcat 8.5, and indeed I see a mention of new cookie processing in the <a href="https://tomcat.apache.org/migration-85.html#Cookies">Tomcat 8.5 migration guide</a></li>
<li>I tried to switch my XMLUI and JSPUI contexts to use the <code>LegacyCookieProcessor</code>, but it didn't seem to help</li>
<li>He suspects it&rsquo;s a change of behavior in Tomcat 8.5, and indeed I see a mention of new cookie processing in the <a href="https://tomcat.apache.org/migration-85.html#Cookies">Tomcat 8.5 migration guide</a></li>
<li>I tried to switch my XMLUI and JSPUI contexts to use the <code>LegacyCookieProcessor</code>, but it didn&rsquo;t seem to help</li>
<li>I <a href="https://jira.duraspace.org/browse/DS-4140">filed DS-4140 on the DSpace issue tracker</a></li>
</ul>
</li>
@@ -438,8 +438,8 @@ In [14]: for row in result.fetchone():
<ul>
<li>Tezira wrote to say she has stopped receiving the <code>DSpace Submission Approved and Archived</code> emails from CGSpace as of January 2nd
<ul>
<li>I told her that I haven't done anything to disable it lately, but that I would check</li>
<li>Bizu also says she hasn't received them lately</li>
<li>I told her that I haven&rsquo;t done anything to disable it lately, but that I would check</li>
<li>Bizu also says she hasn&rsquo;t received them lately</li>
</ul>
</li>
</ul>
@@ -452,12 +452,12 @@ In [14]: for row in result.fetchone():
<li>Day two of CGSpace AReS meeting in Amman
<ul>
<li>Discuss possibly extending the <a href="https://github.com/ilri/dspace-statistics-api">dspace-statistics-api</a> to make community and collection statistics available</li>
<li>Discuss the new &ldquo;final&rdquo; CG Core document and some changes that we'll need to make on CGSpace and other repositories</li>
<li>Discuss the new &ldquo;final&rdquo; CG Core document and some changes that we&rsquo;ll need to make on CGSpace and other repositories</li>
<li>We agreed to try to stick to pure Dublin Core where possible, then use fields that exist in standard DSpace, and use the &ldquo;cg&rdquo; namespace for everything else</li>
<li>Major changes are to move <code>dc.contributor.author</code> to <code>dc.creator</code> (which MELSpace and WorldFish are already using in their DSpace repositories)</li>
</ul>
</li>
<li>I am testing the speed of the WorldFish DSpace repository's REST API and it's five to ten times faster than CGSpace as I tested in <a href="/cgspace-notes/2018-10/">2018-10</a>:</li>
<li>I am testing the speed of the WorldFish DSpace repository&rsquo;s REST API and it&rsquo;s five to ten times faster than CGSpace as I tested in <a href="/cgspace-notes/2018-10/">2018-10</a>:</li>
</ul>
<pre><code>$ time http --print h 'https://digitalarchive.worldfishcenter.org/rest/items?expand=metadata,bitstreams,parentCommunityList&amp;limit=100&amp;offset=0'
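</code></pre><ul>
<li>(To compare the two repositories less anecdotally, something like this hypothetical sketch could time the same 100-item page against both REST APIs; the CGSpace URL is my assumption:)</li>
</ul>
<pre><code>import time

import requests

urls = (
    'https://digitalarchive.worldfishcenter.org/rest/items?expand=metadata,bitstreams,parentCommunityList&amp;limit=100&amp;offset=0',
    'https://cgspace.cgiar.org/rest/items?expand=metadata,bitstreams,parentCommunityList&amp;limit=100&amp;offset=0',
)

for url in urls:
    start = time.perf_counter()
    response = requests.get(url)
    elapsed = time.perf_counter() - start
    print(f'HTTP {response.status_code} in {elapsed:.1f}s: {url}')
</code></pre>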
@@ -582,8 +582,8 @@ In [14]: for row in result.fetchone():
</li>
<li>Something happened to the Solr usage statistics on CGSpace
<ul>
<li>I looked on the server and the Solr cores are there (56GB!), and I don't see any obvious errors in dmesg or anything</li>
<li>I see that the server hasn't been rebooted in 26 days so I rebooted it</li>
<li>I looked on the server and the Solr cores are there (56GB!), and I don&rsquo;t see any obvious errors in dmesg or anything</li>
<li>I see that the server hasn&rsquo;t been rebooted in 26 days so I rebooted it</li>
</ul>
</li>
<li>After reboot the Solr stats are still messed up in the Atmire Usage Stats module, it only shows 2019-01!</li>
@@ -712,7 +712,7 @@ Caused by: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed
</li>
<li>Abenet was asking if the Atmire Usage Stats are correct because they are over 2 million for the last few months&hellip;</li>
<li>For 2019-01 alone the Usage Stats are already around 1.2 million</li>
<li>I tried to look in the nginx logs to see how many raw requests there are so far this month and it's about 1.4 million:</li>
<li>I tried to look in the nginx logs to see how many raw requests there are so far this month and it&rsquo;s about 1.4 million:</li>
</ul>
<pre><code># time zcat --force /var/log/nginx/* | grep -cE &quot;[0-9]{1,2}/Jan/2019&quot;
1442874
@@ -724,8 +724,8 @@ sys 0m2.396s
<ul>
<li>Send reminder to Atmire about purchasing the <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=657">MQM module</a></li>
<li>Trying to decide the solid action points for CGSpace on the CG Core 2.0 metadata&hellip;</li>
<li>It's difficult to decide some of these because the current CG Core 2.0 document does not provide guidance or rationale (yet)!</li>
<li>Also, there is not a good Dublin Core reference (or maybe I just don't understand?)</li>
<li>It&rsquo;s difficult to decide some of these because the current CG Core 2.0 document does not provide guidance or rationale (yet)!</li>
<li>Also, there is not a good Dublin Core reference (or maybe I just don&rsquo;t understand?)</li>
<li>Several authoritative documents on Dublin Core appear to be:
<ul>
<li><a href="http://dublincore.org/documents/dces/">Dublin Core Metadata Element Set, Version 1.1: Reference Description</a></li>
@@ -762,7 +762,7 @@ sys 0m2.396s
<h2 id="2019-01-19">2019-01-19</h2>
<ul>
<li>
<p>There's no official set of Dublin Core qualifiers so I can't tell if things like <code>dc.contributor.author</code> that are used by DSpace are official</p>
<p>There&rsquo;s no official set of Dublin Core qualifiers so I can&rsquo;t tell if things like <code>dc.contributor.author</code> that are used by DSpace are official</p>
</li>
<li>
<p>I found a great <a href="https://www.dri.ie/sites/default/files/files/qualified-dublin-core-metadata-guidelines.pdf">presentation from 2015 by the Digital Repository of Ireland</a> that discusses using MARC Relator Terms with Dublin Core elements</p>
@@ -777,12 +777,12 @@ sys 0m2.396s
</ul>
<h2 id="2019-01-20">2019-01-20</h2>
<ul>
<li>That's weird; I logged into DSpace Test (linode19) and it says it has been up for 213 days:</li>
<li>That&rsquo;s weird; I logged into DSpace Test (linode19) and it says it has been up for 213 days:</li>
</ul>
<pre><code># w
04:46:14 up 213 days, 7:25, 4 users, load average: 1.94, 1.50, 1.35
</code></pre><ul>
<li>I've definitely rebooted it several times in the past few months&hellip; according to <code>journalctl -b</code> it was a few weeks ago on 2019-01-02</li>
<li>I&rsquo;ve definitely rebooted it several times in the past few months&hellip; according to <code>journalctl -b</code> it was a few weeks ago on 2019-01-02</li>
<li>I re-ran the Ansible DSpace tag, ran all system updates, and rebooted the host</li>
<li>After rebooting I notice that the Linode kernel went down from 4.19.8 to 4.18.16&hellip;</li>
<li>Atmire sent a quote on our <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=657">ticket about purchasing the Metadata Quality Module (MQM) for DSpace 5.8</a></li>
@@ -793,7 +793,7 @@ sys 0m2.396s
</ul>
<h2 id="2019-01-21">2019-01-21</h2>
<ul>
<li>Investigating running Tomcat 7 on Ubuntu 18.04 with the tarball and a custom systemd package instead of waiting for our DSpace to become compatible with Ubuntu 18.04's Tomcat 8.5</li>
<li>Investigating running Tomcat 7 on Ubuntu 18.04 with the tarball and a custom systemd package instead of waiting for our DSpace to become compatible with Ubuntu 18.04&rsquo;s Tomcat 8.5</li>
<li>I could either run with a simple <code>tomcat7.service</code> like this:</li>
</ul>
<pre><code>[Unit]
@@ -808,7 +808,7 @@ Group=aorth
[Install]
WantedBy=multi-user.target
</code></pre><ul>
<li>Or try to adapt a real systemd service like Arch Linux's:</li>
<li>Or try to adapt a real systemd service like Arch Linux&rsquo;s:</li>
</ul>
<pre><code>[Unit]
Description=Tomcat 7 servlet container
@@ -847,7 +847,7 @@ ExecStop=/usr/bin/jsvc \
WantedBy=multi-user.target
</code></pre><ul>
<li>I see that <code>jsvc</code> and <code>libcommons-daemon-java</code> are both available on Ubuntu so that should be easy to port</li>
<li>We probably don't need the Eclipse Java Bytecode Compiler (ecj)</li>
<li>We probably don&rsquo;t need the Eclipse Java Bytecode Compiler (ecj)</li>
<li>I tested Tomcat 7.0.92 on Arch Linux using the <code>tomcat7.service</code> with <code>jsvc</code> and it works&hellip; nice!</li>
<li>I think I might manage this the same way I do the restic releases in the <a href="https://github.com/ilri/rmg-ansible-public">Ansible infrastructure scripts</a>, where I download a specific version and symlink to some generic location without the version number</li>
<li>I verified that there is indeed an issue with sharded Solr statistics cores on DSpace, which will cause inaccurate results in the dspace-statistics-api:</li>
@@ -858,7 +858,7 @@ $ http 'http://localhost:3000/solr/statistics-2018/select?indent=on&amp;rows=0&a
&lt;result name=&quot;response&quot; numFound=&quot;241&quot; start=&quot;0&quot;&gt;
</code></pre><ul>
<li>I opened an issue on the GitHub issue tracker (<a href="https://github.com/ilri/dspace-statistics-api/issues/10">#10</a>)</li>
<li>I don't think the <a href="https://solrclient.readthedocs.io/en/latest/">SolrClient library</a> we are currently using supports these types of queries, so we might have to just do raw queries with Python's <code>requests</code></li>
<li>I don&rsquo;t think the <a href="https://solrclient.readthedocs.io/en/latest/">SolrClient library</a> we are currently using supports these types of queries, so we might have to just do raw queries with Python&rsquo;s <code>requests</code></li>
<li>The <a href="https://github.com/django-haystack/pysolr">pysolr</a> library says it supports multicore indexes, but I am not sure it does (or at least not with our setup):</li>
</ul>
<pre><code>import pysolr
@@ -899,8 +899,8 @@ $ http 'http://localhost:3000/solr/statistics/select?&amp;shards=localhost:8081/
<li>I implemented a proof of concept to query the Solr STATUS for active cores and to add them with a <code>shards</code> query string (see the sketch below)</li>
<li>A few things I noticed:
<ul>
<li>Solr doesn't mind if you use an empty <code>shards</code> parameter</li>
<li>Solr doesn't mind if you have an extra comma at the end of the <code>shards</code> parameter</li>
<li>Solr doesn&rsquo;t mind if you use an empty <code>shards</code> parameter</li>
<li>Solr doesn&rsquo;t mind if you have an extra comma at the end of the <code>shards</code> parameter</li>
<li>If you are searching multiple cores, you need to include the base core in the <code>shards</code> parameter as well</li>
<li>For example, compare the following two queries, first including the base core and the shard in the <code>shards</code> parameter, and then only including the shard:</li>
</ul>
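<ul>
<li>The proof of concept mentioned above, sketched with plain <code>requests</code> (the port and core names match my local setup; counting with <code>*:*</code> is just to compare totals):</li>
</ul>
<pre><code>import requests

solr = 'http://localhost:8081/solr'

# Ask Solr which cores are loaded (STATUS is the CoreAdmin status action)
status = requests.get(solr + '/admin/cores',
                      params={'action': 'STATUS', 'wt': 'json'}).json()
cores = [name for name in status['status'] if name.startswith('statistics')]

# Include the base core in the shards parameter as well (see above)
shards = ','.join('localhost:8081/solr/' + core for core in cores)

response = requests.get(solr + '/statistics/select', params={
    'q': '*:*',
    'rows': 0,
    'wt': 'json',
    'shards': shards,
}).json()
print(response['response']['numFound'])
</code></pre>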
@@ -930,7 +930,7 @@ $ http 'http://localhost:8081/solr/statistics/select?indent=on&amp;rows=0&amp;q=
915 35.237.175.180
</code></pre><ul>
<li>35.237.175.180 is known to us</li>
<li>I don't think we've seen 196.191.127.37 before. Its user agent is:</li>
<li>I don&rsquo;t think we&rsquo;ve seen 196.191.127.37 before. Its user agent is:</li>
</ul>
<pre><code>Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 UBrowser/7.0.185.1002 Safari/537.36
</code></pre><ul>
@@ -957,7 +957,7 @@ $ http 'http://localhost:8081/solr/statistics/select?indent=on&amp;rows=0&amp;q=
<p>Very interesting discussion of methods for <a href="https://jdebp.eu/FGA/systemd-house-of-horror/tomcat.html">running Tomcat under systemd</a></p>
</li>
<li>
<p>We can set the ulimit options that used to be in <code>/etc/default/tomcat7</code> with systemd's <code>LimitNOFILE</code> and <code>LimitAS</code> (see the <code>systemd.exec</code> man page)</p>
<p>We can set the ulimit options that used to be in <code>/etc/default/tomcat7</code> with systemd&rsquo;s <code>LimitNOFILE</code> and <code>LimitAS</code> (see the <code>systemd.exec</code> man page)</p>
<ul>
<li>Note that we need to use <code>infinity</code> instead of <code>unlimited</code> for the address space</li>
</ul>
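<ul>
<li>For example, a unit fragment like this should be roughly equivalent (the values are illustrative, not what we actually deployed):</li>
</ul>
<pre><code>[Service]
# Replaces the old ulimit -n setting from /etc/default/tomcat7
LimitNOFILE=65535
# Replaces ulimit -v; systemd wants &quot;infinity&quot; rather than &quot;unlimited&quot;
LimitAS=infinity
</code></pre>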
@@ -991,7 +991,7 @@ COPY 1109
9265 45.5.186.2
</code></pre><ul>
<li>
<p>I think it's the usual IPs:</p>
<p>I think it&rsquo;s the usual IPs:</p>
<ul>
<li>45.5.186.2 is CIAT</li>
<li>70.32.83.92 is CCAFS</li>
@@ -1009,7 +1009,7 @@ COPY 1109
</ul>
</li>
<li>
<p>Just to make sure these were not uploaded by the user or something, I manually forced the regeneration of these with DSpace's <code>filter-media</code>:</p>
<p>Just to make sure these were not uploaded by the user or something, I manually forced the regeneration of these with DSpace&rsquo;s <code>filter-media</code>:</p>
</li>
</ul>
<pre><code>$ schedtool -D -e ionice -c2 -n7 nice -n19 /home/cgspace.cgiar.org/bin/dspace filter-media -v -f -i 10568/98390
@@ -1022,9 +1022,9 @@ $ schedtool -D -e ionice -c2 -n7 nice -n19 /home/cgspace.cgiar.org/bin/dspace fi
</ul>
<h2 id="2019-01-24">2019-01-24</h2>
<ul>
<li>I noticed Ubuntu's Ghostscript 9.26 works on some troublesome PDFs where Arch's Ghostscript 9.26 doesn't, so the fix for the first/last page crash is not the patch I found yesterday</li>
<li>Ubuntu's Ghostscript uses another <a href="http://git.ghostscript.com/?p=ghostpdl.git;h=fae21f1668d2b44b18b84cf0923a1d5f3008a696">patch from Ghostscript git</a> (<a href="https://bugs.ghostscript.com/show_bug.cgi?id=700315">upstream bug report</a>)</li>
<li>I re-compiled Arch's ghostscript with the patch and then I was able to generate a thumbnail from one of the <a href="https://cgspace.cgiar.org/handle/10568/98390">troublesome PDFs</a></li>
<li>I noticed Ubuntu&rsquo;s Ghostscript 9.26 works on some troublesome PDFs where Arch&rsquo;s Ghostscript 9.26 doesn&rsquo;t, so the fix for the first/last page crash is not the patch I found yesterday</li>
<li>Ubuntu&rsquo;s Ghostscript uses another <a href="http://git.ghostscript.com/?p=ghostpdl.git;h=fae21f1668d2b44b18b84cf0923a1d5f3008a696">patch from Ghostscript git</a> (<a href="https://bugs.ghostscript.com/show_bug.cgi?id=700315">upstream bug report</a>)</li>
<li>I re-compiled Arch&rsquo;s ghostscript with the patch and then I was able to generate a thumbnail from one of the <a href="https://cgspace.cgiar.org/handle/10568/98390">troublesome PDFs</a></li>
<li>Before and after:</li>
</ul>
<pre><code>$ identify Food\ safety\ Kenya\ fruits.pdf\[0\]
@@ -1068,7 +1068,7 @@ identify: CorruptImageProfile `xmp' @ warning/profile.c/SetImageProfileInternal/
</ul>
<pre><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &quot;24/Jan/2019:&quot; | grep 45.5.186.2 | grep -Eo &quot;GET /(handle|bitstream|rest|oai)/&quot; | sort | uniq -c | sort -n
</code></pre><ul>
<li>CIAT's community currently has 12,000 items in it so this is normal</li>
<li>CIAT&rsquo;s community currently has 12,000 items in it so this is normal</li>
<li>The issue with goo.gl links that we saw yesterday appears to be resolved, as links are working again&hellip;</li>
<li>For example: <a href="https://goo.gl/fb/VRj9Gq">https://goo.gl/fb/VRj9Gq</a></li>
<li>The full <a href="http://id.loc.gov/vocabulary/relators.html">list of MARC Relators on the Library of Congress website</a> linked from the <a href="http://dublincore.org/usage/documents/relators/">DMCI relators page</a> is very confusing</li>
@@ -1085,9 +1085,9 @@ identify: CorruptImageProfile `xmp' @ warning/profile.c/SetImageProfileInternal/
<ul>
<li>I tested by doing a Tomcat 7.0.91 installation, then switching it to 7.0.92 and it worked&hellip; nice!</li>
<li>I refined the tasks so much that I was confident enough to deploy them on DSpace Test and it went very well</li>
<li>Basically I just stopped tomcat7, created a dspace user, removed tomcat7, chown'd everything to the dspace user, then ran the playbook</li>
<li>Basically I just stopped tomcat7, created a dspace user, removed tomcat7, chown&rsquo;d everything to the dspace user, then ran the playbook</li>
<li>So now DSpace Test (linode19) is running Tomcat 7.0.92&hellip; w00t</li>
<li>Now we need to monitor it for a few weeks to see if there is anything we missed, and then I can change CGSpace (linode18) as well, and we're ready for Ubuntu 18.04 too!</li>
<li>Now we need to monitor it for a few weeks to see if there is anything we missed, and then I can change CGSpace (linode18) as well, and we&rsquo;re ready for Ubuntu 18.04 too!</li>
</ul>
</li>
</ul>
@@ -1107,7 +1107,7 @@ identify: CorruptImageProfile `xmp' @ warning/profile.c/SetImageProfileInternal/
4644 205.186.128.185
4644 70.32.83.92
</code></pre><ul>
<li>I think it's the usual IPs:
<li>I think it&rsquo;s the usual IPs:
<ul>
<li>70.32.83.92 is CCAFS</li>
<li>205.186.128.185 is CCAFS or perhaps another Macaroni Bros harvester (new ILRI website?)</li>
@@ -1158,7 +1158,7 @@ identify: CorruptImageProfile `xmp' @ warning/profile.c/SetImageProfileInternal/
2107 199.47.87.140
2540 45.5.186.2
</code></pre><ul>
<li>Of course there is CIAT's <code>45.5.186.2</code>, but also <code>45.5.184.2</code> appears to be CIAT&hellip; I wonder why they have two harvesters?</li>
<li>Of course there is CIAT&rsquo;s <code>45.5.186.2</code>, but also <code>45.5.184.2</code> appears to be CIAT&hellip; I wonder why they have two harvesters?</li>
<li><code>199.47.87.140</code> and <code>199.47.87.141</code> are TurnItIn with the following user agent:</li>
</ul>
<pre><code>TurnitinBot (https://turnitin.com/robot/crawlerinfo.html)
@@ -1181,7 +1181,7 @@ identify: CorruptImageProfile `xmp' @ warning/profile.c/SetImageProfileInternal/
<li><code>45.5.186.2</code> is CIAT as usual&hellip;</li>
<li><code>70.32.83.92</code> and <code>205.186.128.185</code> are CCAFS as usual&hellip;</li>
<li><code>66.249.66.219</code> is Google&hellip;</li>
<li>I'm thinking it might finally be time to increase the threshold of the Linode CPU alerts
<li>I&rsquo;m thinking it might finally be time to increase the threshold of the Linode CPU alerts
<ul>
<li>I adjusted the alert threshold from 250% to 275%</li>
</ul>
@@ -1233,7 +1233,7 @@ identify: CorruptImageProfile `xmp' @ warning/profile.c/SetImageProfileInternal/
9239 45.5.186.2
</code></pre><ul>
<li><code>45.5.186.2</code> and <code>45.5.184.2</code> are CIAT as always</li>
<li><code>85.25.237.71</code> is some new server in Germany that I've never seen before with the user agent:</li>
<li><code>85.25.237.71</code> is some new server in Germany that I&rsquo;ve never seen before with the user agent:</li>
</ul>
<pre><code>Linguee Bot (http://www.linguee.com/bot; bot@linguee.com)
</code></pre><!-- raw HTML omitted -->