<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
<meta property="og:title" content="October, 2017" />
<meta property="og:description" content="2017-10-01
Peter emailed to point out that many items in the ILRI archive collection have multiple handles:
http://hdl.handle.net/10568/78495||http://hdl.handle.net/10568/79336
There appears to be a pattern but I&rsquo;ll have to look a bit closer and try to clean them up automatically, either in SQL or in OpenRefine
Add Katherine Lutz to the groups for content submission and edit steps of the CGIAR System collections
" />
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2017-10/" />
<meta property="article:published_time" content="2017-10-01T08:07:54&#43;03:00"/>
<meta property="article:modified_time" content="2017-10-05T18:36:49&#43;03:00"/>
<meta name="twitter:card" content="summary"/><meta name="twitter:title" content="October, 2017"/>
<meta name="twitter:description" content="2017-10-01
Peter emailed to point out that many items in the ILRI archive collection have multiple handles:
http://hdl.handle.net/10568/78495||http://hdl.handle.net/10568/79336
There appears to be a pattern but I&rsquo;ll have to look a bit closer and try to clean them up automatically, either in SQL or in OpenRefine
Add Katherine Lutz to the groups for content submission and edit steps of the CGIAR System collections
"/>
<meta name="generator" content="Hugo 0.29" />
<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "BlogPosting",
"headline": "October, 2017",
"url": "https://alanorth.github.io/cgspace-notes/2017-10/",
"wordCount": "627",
"datePublished": "2017-10-01T08:07:54&#43;03:00",
"dateModified": "2017-10-05T18:36:49&#43;03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
},
"keywords": "Notes"
}
</script>
<link rel="canonical" href="https://alanorth.github.io/cgspace-notes/2017-10/">
<title>October, 2017 | CGSpace Notes</title>
<!-- combined, minified CSS -->
<link href="https://alanorth.github.io/cgspace-notes/css/style.css" rel="stylesheet" integrity="sha384-zYRhIy0/Yl1e5lW9cimY1AugfdkHChXyCbs2NKFaLTgeQjVfj/CMPIUdjXm/JPWV" crossorigin="anonymous">
</head>
<body>
<div class="blog-masthead">
<div class="container">
<nav class="nav blog-nav">
<a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>
</nav>
</div>
</div>
<header class="blog-header">
<div class="container">
<h1 class="blog-title"><a href="https://alanorth.github.io/cgspace-notes/" rel="home">CGSpace Notes</a></h1>
<p class="lead blog-description">Documenting day-to-day work on the <a href="https://cgspace.cgiar.org">CGSpace</a> repository.</p>
</div>
</header>
<div class="container">
<div class="row">
<div class="col-sm-8 blog-main">
<article class="blog-post">
<header>
<h2 class="blog-post-title"><a href="https://alanorth.github.io/cgspace-notes/2017-10/">October, 2017</a></h2>
<p class="blog-post-meta"><time datetime="2017-10-01T08:07:54&#43;03:00">Sun Oct 01, 2017</time> by Alan Orth in
<i class="fa fa-tag" aria-hidden="true"></i>&nbsp;<a href="/cgspace-notes/tags/notes" rel="tag">Notes</a>
</p>
</header>
<h2 id="2017-10-01">2017-10-01</h2>
<ul>
<li>Peter emailed to point out that many items in the <a href="https://cgspace.cgiar.org/handle/10568/2703">ILRI archive collection</a> have multiple handles:</li>
</ul>
<pre><code>http://hdl.handle.net/10568/78495||http://hdl.handle.net/10568/79336
</code></pre>
<ul>
<li>There appears to be a pattern but I&rsquo;ll have to look a bit closer and try to clean them up automatically, either in SQL or in OpenRefine</li>
<li>Add Katherine Lutz to the groups for content submission and edit steps of the CGIAR System collections</li>
</ul>
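<p>As a starting point for the cleanup, a minimal Python sketch that splits a multi-value cell and flags items carrying more than one handle. The function names are hypothetical, and the <code>||</code> separator is taken from the example above; the pattern for choosing which handle to keep is not covered here.</p>

```python
# Hypothetical helpers; the "||" separator matches the example value above.
def split_handles(value, separator="||"):
    """Return the individual handle URLs stored in one multi-value cell."""
    return [part.strip() for part in value.split(separator) if part.strip()]


def has_multiple_handles(value):
    """True when a cell carries more than one handle."""
    return len(split_handles(value)) > 1


cell = "http://hdl.handle.net/10568/78495||http://hdl.handle.net/10568/79336"
for handle in split_handles(cell):
    print(handle)
```

<p>Running <code>has_multiple_handles</code> over an exported metadata column would give the list of items that need review before any automatic fix.</p>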
<p></p>
<h2 id="2017-10-02">2017-10-02</h2>
<ul>
<li>Peter Ballantyne said he was having problems logging into CGSpace with &ldquo;both&rdquo; of his accounts (CGIAR LDAP and personal, apparently)</li>
<li>I looked in the logs and saw some LDAP lookup failures due to timeout but also strangely a &ldquo;no DN found&rdquo; error:</li>
</ul>
<pre><code>2017-10-01 20:24:57,928 WARN org.dspace.authenticate.LDAPAuthentication @ anonymous:session_id=CA0AA5FEAEA8805645489404CDCE9594:ip_addr=41.204.190.40:ldap_attribute_lookup:type=failed_search javax.naming.CommunicationException: svcgroot2.cgiarad.org:3269 [Root exception is java.net.ConnectException: Connection timed out (Connection timed out)]
2017-10-01 20:22:37,982 INFO org.dspace.authenticate.LDAPAuthentication @ anonymous:session_id=CA0AA5FEAEA8805645489404CDCE9594:ip_addr=41.204.190.40:failed_login:no DN found for user pballantyne
</code></pre>
<ul>
<li>I thought maybe his account had expired (seeing as it was the first of the month) but he says he was finally able to log in today</li>
<li>The logs for yesterday show fourteen errors related to LDAP auth failures:</li>
</ul>
<pre><code>$ grep -c &quot;ldap_authentication:type=failed_auth&quot; dspace.log.2017-10-01
14
</code></pre>
<ul>
<li>For what it&rsquo;s worth, there are no errors on any other recent days, so it must have been some network issue on Linode or CGNET&rsquo;s LDAP server</li>
<li>Linode emailed to say that linode578611 (DSpace Test) needs to migrate to a new host for a security update so I initiated the migration immediately rather than waiting for the scheduled time in two weeks</li>
</ul>
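<p>To compare several days at once, the <code>grep -c</code> check above could be sketched in Python roughly as follows. The function names are hypothetical, and the sample lines are synthetic placeholders rather than real log entries.</p>

```python
MARKER = "ldap_authentication:type=failed_auth"


def count_failed_auth(lines, marker=MARKER):
    """Count lines containing the marker, like `grep -c`."""
    return sum(1 for line in lines if marker in line)


def count_per_log(paths):
    """Map each log file path to its number of matching lines."""
    counts = {}
    for path in paths:
        with open(path, encoding="utf-8", errors="replace") as handle:
            counts[path] = count_failed_auth(handle)
    return counts


# Synthetic placeholder lines, not real log entries.
sample = [
    "placeholder ldap_authentication:type=failed_auth placeholder",
    "placeholder unrelated entry",
    "placeholder ldap_authentication:type=failed_auth placeholder",
]
print(count_failed_auth(sample))
```

<p>Passing a list of daily log paths to <code>count_per_log</code> would show whether the failures are confined to a single day; the other file names are assumed to follow the same pattern as <code>dspace.log.2017-10-01</code>.</p>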
<h2 id="2017-10-04">2017-10-04</h2>
<ul>
<li>Twice in the last twenty-four hours Linode has alerted about high CPU usage on CGSpace (linode2533629)</li>
<li>Communicate with Sam from the CGIAR System Organization about some broken links coming from their CGIAR Library domain to CGSpace</li>
<li>The first is a link to a browse page that should be handled better in nginx:</li>
</ul>
<pre><code>http://library.cgiar.org/browse?value=Intellectual%20Assets%20Reports&amp;type=subject → https://cgspace.cgiar.org/browse?value=Intellectual%20Assets%20Reports&amp;type=subject
</code></pre>
<ul>
<li>We&rsquo;ll need to check for browse links and handle them properly, including swapping the <code>subject</code> parameter for <code>systemsubject</code> (which doesn&rsquo;t exist in Discovery yet, but we&rsquo;ll need to add it) as we have moved their poorly curated subjects from <code>dc.subject</code> to <code>cg.subject.system</code></li>
<li>The second link was a direct link to a bitstream which has broken due to the sequence being updated, so I told him he should link to the handle of the item instead</li>
<li>Help Sisay proof sixty-two IITA records on DSpace Test</li>
<li>Lots of inconsistencies and errors in subjects, <code>dc.format.extent</code>, regions, and countries</li>
<li>Merge the Discovery search changes for ISI Journal (<a href="https://github.com/ilri/DSpace/pull/341">#341</a>)</li>
</ul>
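<p>The intended mapping for browse links can be illustrated with a short Python sketch. This is not the nginx configuration itself, only a model of the rewrite it would need to perform; the function name is hypothetical, and the <code>systemsubject</code> browse type is the one that, as noted above, does not exist in Discovery yet.</p>

```python
from urllib.parse import parse_qsl, quote, urlencode, urlsplit, urlunsplit


def redirect_target(url):
    """Map a library.cgiar.org URL to its assumed CGSpace equivalent."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    if parts.path == "/browse":
        # Swap the old subject browse type for the planned systemsubject one.
        query = [
            (key, "systemsubject" if key == "type" and value == "subject" else value)
            for key, value in query
        ]
    # quote_via=quote keeps spaces encoded as %20 rather than "+".
    new_query = urlencode(query, quote_via=quote)
    return urlunsplit(("https", "cgspace.cgiar.org", parts.path, new_query, ""))


old = "http://library.cgiar.org/browse?value=Intellectual%20Assets%20Reports&type=subject"
print(redirect_target(old))
```

<p>Other paths pass through with only the scheme and host changed, which matches the simple redirect case.</p>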
<h2 id="2017-10-05">2017-10-05</h2>
<ul>
<li>Twice in the past twenty-four hours Linode has warned that CGSpace&rsquo;s outbound traffic rate was exceeding the notification threshold</li>
<li>I had a look at yesterday&rsquo;s OAI and REST logs in <code>/var/log/nginx</code> but didn&rsquo;t see anything unusual:</li>
</ul>
<pre><code># awk '{print $1}' /var/log/nginx/rest.log.1 | sort -n | uniq -c | sort -h | tail -n 10
141 157.55.39.240
145 40.77.167.85
162 66.249.66.92
181 66.249.66.95
211 66.249.66.91
312 66.249.66.94
384 66.249.66.90
1495 50.116.102.77
3904 70.32.83.92
9904 45.5.184.196
# awk '{print $1}' /var/log/nginx/oai.log.1 | sort -n | uniq -c | sort -h | tail -n 10
5 66.249.66.71
6 66.249.66.67
6 68.180.229.31
8 41.84.227.85
8 66.249.66.92
17 66.249.66.65
24 66.249.66.91
38 66.249.66.95
69 66.249.66.90
148 66.249.66.94
</code></pre>
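<p>For reference, the same per-client tally as the <code>awk</code> pipelines above could be sketched in Python. The function name is hypothetical, the sample lines are synthetic and use documentation-range IP addresses, and the result is sorted busiest first rather than busiest last.</p>

```python
from collections import Counter


def top_clients(lines, n=10):
    """Count requests per client, taking the first field of each line as the IP."""
    counts = Counter(line.split()[0] for line in lines if line.strip())
    return counts.most_common(n)


# Synthetic access-log lines, not taken from the real logs.
sample = [
    "192.0.2.1 - - placeholder request",
    "192.0.2.2 - - placeholder request",
    "192.0.2.1 - - placeholder request",
]
for ip, hits in top_clients(sample):
    print(hits, ip)
```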
<ul>
<li>Working on the nginx redirects for CGIAR Library</li>
<li>We should start using 301 redirects and also allow for <code>/sitemap</code> to work on the library.cgiar.org domain so the CGIAR System Organization people can update their Google Search Console and allow Google to find their content in a structured way</li>
<li>Remove eleven occurrences of <code>ACP</code> in IITA&rsquo;s <code>cg.coverage.region</code> using the Atmire batch edit module from Discovery</li>
<li>Need to investigate how we can verify the library.cgiar.org domain using the HTML or DNS methods</li>
<li>Run corrections on 143 ILRI Archive items that had two <code>dc.identifier.uri</code> values (Handle) that Peter had pointed out earlier this week</li>
<li>I used OpenRefine to isolate them and then fixed and re-imported them into CGSpace</li>
</ul>
</article>
</div> <!-- /.blog-main -->
<aside class="col-sm-3 ml-auto blog-sidebar">
<section class="sidebar-module">
<h4>Recent Posts</h4>
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2017-10/">October, 2017</a></li>
<li><a href="/cgspace-notes/cgiar-library-migration/">CGIAR Library Migration</a></li>
<li><a href="/cgspace-notes/2017-09/">September, 2017</a></li>
<li><a href="/cgspace-notes/2017-08/">August, 2017</a></li>
<li><a href="/cgspace-notes/2017-07/">July, 2017</a></li>
</ol>
</section>
<section class="sidebar-module">
<h4>Links</h4>
<ol class="list-unstyled">
<li><a href="https://cgspace.cgiar.org">CGSpace</a></li>
<li><a href="https://dspacetest.cgiar.org">DSpace Test</a></li>
<li><a href="https://github.com/ilri/DSpace">CGSpace @ GitHub</a></li>
</ol>
</section>
</aside>
</div> <!-- /.row -->
</div> <!-- /.container -->
<footer class="blog-footer">
<p>
Blog template created by <a href="https://twitter.com/mdo">@mdo</a>, ported to Hugo by <a href='https://twitter.com/mralanorth'>@mralanorth</a>.
</p>
<p>
<a href="#">Back to top</a>
</p>
</footer>
</body>
</html>