mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2024-12-23 13:34:32 +01:00
<!DOCTYPE html>
<html lang="en">

<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">

<meta property="og:title" content="October, 2017" />
<meta property="og:description" content="2017-10-01

Peter emailed to point out that many items in the ILRI archive collection have multiple handles:

http://hdl.handle.net/10568/78495||http://hdl.handle.net/10568/79336

There appears to be a pattern but I’ll have to look a bit closer and try to clean them up automatically, either in SQL or in OpenRefine
Add Katherine Lutz to the groups for content submission and edit steps of the CGIAR System collections
" />
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2017-10/" />

<meta property="article:published_time" content="2017-10-01T08:07:54+03:00"/>
<meta property="article:modified_time" content="2017-10-06T19:27:58+03:00"/>

<meta name="twitter:card" content="summary"/><meta name="twitter:title" content="October, 2017"/>
<meta name="twitter:description" content="2017-10-01

Peter emailed to point out that many items in the ILRI archive collection have multiple handles:

http://hdl.handle.net/10568/78495||http://hdl.handle.net/10568/79336

There appears to be a pattern but I’ll have to look a bit closer and try to clean them up automatically, either in SQL or in OpenRefine
Add Katherine Lutz to the groups for content submission and edit steps of the CGIAR System collections
"/>
<meta name="generator" content="Hugo 0.29" />

<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "BlogPosting",
  "headline": "October, 2017",
  "url": "https://alanorth.github.io/cgspace-notes/2017-10/",
  "wordCount": "857",
  "datePublished": "2017-10-01T08:07:54+03:00",
  "dateModified": "2017-10-06T19:27:58+03:00",
  "author": {
    "@type": "Person",
    "name": "Alan Orth"
  },
  "keywords": "Notes"
}
</script>

<link rel="canonical" href="https://alanorth.github.io/cgspace-notes/2017-10/">

<title>October, 2017 | CGSpace Notes</title>

<!-- combined, minified CSS -->
<link href="https://alanorth.github.io/cgspace-notes/css/style.css" rel="stylesheet" integrity="sha384-zYRhIy0/Yl1e5lW9cimY1AugfdkHChXyCbs2NKFaLTgeQjVfj/CMPIUdjXm/JPWV" crossorigin="anonymous">

</head>

<body>

<div class="blog-masthead">
<div class="container">
<nav class="nav blog-nav">
<a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>
</nav>
</div>
</div>

<header class="blog-header">
<div class="container">
<h1 class="blog-title"><a href="https://alanorth.github.io/cgspace-notes/" rel="home">CGSpace Notes</a></h1>
<p class="lead blog-description">Documenting day-to-day work on the <a href="https://cgspace.cgiar.org">CGSpace</a> repository.</p>
</div>
</header>

<div class="container">
<div class="row">
<div class="col-sm-8 blog-main">

<article class="blog-post">
<header>
<h2 class="blog-post-title"><a href="https://alanorth.github.io/cgspace-notes/2017-10/">October, 2017</a></h2>
<p class="blog-post-meta"><time datetime="2017-10-01T08:07:54+03:00">Sun Oct 01, 2017</time> by Alan Orth in

<i class="fa fa-tag" aria-hidden="true"></i> <a href="/cgspace-notes/tags/notes" rel="tag">Notes</a>

</p>
</header>
<h2 id="2017-10-01">2017-10-01</h2>

<ul>
<li>Peter emailed to point out that many items in the <a href="https://cgspace.cgiar.org/handle/10568/2703">ILRI archive collection</a> have multiple handles:</li>
</ul>

<pre><code>http://hdl.handle.net/10568/78495||http://hdl.handle.net/10568/79336
</code></pre>

<ul>
<li>There appears to be a pattern but I’ll have to look a bit closer and try to clean them up automatically, either in SQL or in OpenRefine</li>
<li>Add Katherine Lutz to the groups for content submission and edit steps of the CGIAR System collections</li>
</ul>

<p></p>

<h2 id="2017-10-02">2017-10-02</h2>

<ul>
<li>Peter Ballantyne said he was having problems logging into CGSpace with “both” of his accounts (CGIAR LDAP and personal, apparently)</li>
<li>I looked in the logs and saw some LDAP lookup failures due to timeout but also strangely a “no DN found” error:</li>
</ul>

<pre><code>2017-10-01 20:24:57,928 WARN org.dspace.authenticate.LDAPAuthentication @ anonymous:session_id=CA0AA5FEAEA8805645489404CDCE9594:ip_addr=41.204.190.40:ldap_attribute_lookup:type=failed_search javax.naming.CommunicationException: svcgroot2.cgiarad.org:3269 [Root exception is java.net.ConnectException: Connection timed out (Connection timed out)]
2017-10-01 20:22:37,982 INFO org.dspace.authenticate.LDAPAuthentication @ anonymous:session_id=CA0AA5FEAEA8805645489404CDCE9594:ip_addr=41.204.190.40:failed_login:no DN found for user pballantyne
</code></pre>

<ul>
<li>I thought maybe his account had expired (seeing as it was the first of the month) but he says he was finally able to log in today</li>
<li>The logs for yesterday show fourteen errors related to LDAP auth failures:</li>
</ul>

<pre><code>$ grep -c "ldap_authentication:type=failed_auth" dspace.log.2017-10-01
14
</code></pre>

<ul>
<li>For what it’s worth, there are no errors on any other recent days, so it must have been some network issue on Linode or CGNET’s LDAP server</li>
<li>Linode emailed to say that linode578611 (DSpace Test) needs to migrate to a new host for a security update so I initiated the migration immediately rather than waiting for the scheduled time in two weeks</li>
</ul>
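<p>To compare the failure count across several days instead of running <code>grep -c</code> once per file, something like the following could work. This is only a rough sketch: the marker string comes from the grep command above, but the <code>dspace.log.*</code> file pattern and the function names are my own assumptions, not anything that ships with DSpace.</p>

```python
from pathlib import Path

# Marker string taken from the grep command in these notes.
MARKER = "ldap_authentication:type=failed_auth"


def count_matching_lines(lines, marker=MARKER):
    """Count the lines that contain the marker, like `grep -c`."""
    return sum(1 for line in lines if marker in line)


def count_failures_per_file(log_dir, pattern="dspace.log.*"):
    """Map each log file name in log_dir to its number of failed-auth lines.

    The glob pattern is an assumption about how the daily logs are named.
    """
    counts = {}
    for path in sorted(Path(log_dir).glob(pattern)):
        # Tolerate odd bytes in the logs rather than aborting the count.
        with path.open(errors="replace") as handle:
            counts[path.name] = count_matching_lines(handle)
    return counts
```

<p>Calling <code>count_failures_per_file</code> on the DSpace log directory would return one count per daily log, which makes it easy to confirm that only one day had failures.</p>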

<h2 id="2017-10-04">2017-10-04</h2>

<ul>
<li>Twice in the last twenty-four hours Linode has alerted about high CPU usage on CGSpace (linode2533629)</li>
<li>Communicate with Sam from the CGIAR System Organization about some broken links coming from their CGIAR Library domain to CGSpace</li>
<li>The first is a link to a browse page that should be handled better in nginx:</li>
</ul>

<pre><code>http://library.cgiar.org/browse?value=Intellectual%20Assets%20Reports&amp;type=subject → https://cgspace.cgiar.org/browse?value=Intellectual%20Assets%20Reports&amp;type=subject
</code></pre>

<ul>
<li>We’ll need to check for browse links and handle them properly, including swapping the <code>subject</code> parameter for <code>systemsubject</code> (which doesn’t exist in Discovery yet, but we’ll need to add it) as we have moved their poorly curated subjects from <code>dc.subject</code> to <code>cg.subject.system</code></li>
<li>The second link was a direct link to a bitstream which has broken due to the sequence being updated, so I told him he should link to the handle of the item instead</li>
<li>Help Sisay proof sixty-two IITA records on DSpace Test</li>
<li>Lots of inconsistencies and errors in subjects, dc.format.extent, regions, countries</li>
<li>Merge the Discovery search changes for ISI Journal (<a href="https://github.com/ilri/DSpace/pull/341">#341</a>)</li>
</ul>
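<p>The browse rewrite described above can be prototyped outside nginx before writing the real redirect rules. The sketch below is only an illustration of the intended mapping: the function name and the <code>BROWSE_TYPE_MAP</code> table are hypothetical, and the only mapping taken from these notes is <code>subject</code> to <code>systemsubject</code>.</p>

```python
from urllib.parse import parse_qsl, quote, urlencode, urlsplit, urlunsplit

# Hypothetical lookup table; only the "subject" entry comes from the notes.
BROWSE_TYPE_MAP = {"subject": "systemsubject"}


def rewrite_browse_url(old_url, new_host="cgspace.cgiar.org"):
    """Rewrite a CGIAR Library browse URL to its CGSpace equivalent.

    Returns None for anything that is not a /browse link.
    """
    parts = urlsplit(old_url)
    if parts.path != "/browse":
        return None
    # Keep every parameter, swapping only the value of "type" when mapped.
    params = [
        (key, BROWSE_TYPE_MAP.get(value, value) if key == "type" else value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    # quote_via=quote keeps spaces as %20 instead of "+".
    query = urlencode(params, quote_via=quote)
    return urlunsplit(("https", new_host, parts.path, query, ""))
```

<p>Feeding it the example link above yields the CGSpace URL with <code>type=systemsubject</code>, which is a handy reference when checking that the nginx rules behave the same way.</p>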

<h2 id="2017-10-05">2017-10-05</h2>

<ul>
<li>Twice in the past twenty-four hours Linode has warned that CGSpace’s outbound traffic rate was exceeding the notification threshold</li>
<li>I had a look at yesterday’s OAI and REST logs in <code>/var/log/nginx</code> but didn’t see anything unusual:</li>
</ul>

<pre><code># awk '{print $1}' /var/log/nginx/rest.log.1 | sort -n | uniq -c | sort -h | tail -n 10
141 157.55.39.240
145 40.77.167.85
162 66.249.66.92
181 66.249.66.95
211 66.249.66.91
312 66.249.66.94
384 66.249.66.90
1495 50.116.102.77
3904 70.32.83.92
9904 45.5.184.196
# awk '{print $1}' /var/log/nginx/oai.log.1 | sort -n | uniq -c | sort -h | tail -n 10
5 66.249.66.71
6 66.249.66.67
6 68.180.229.31
8 41.84.227.85
8 66.249.66.92
17 66.249.66.65
24 66.249.66.91
38 66.249.66.95
69 66.249.66.90
148 66.249.66.94
</code></pre>
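<p>The same per-IP tally could also be scripted, which is handy when the counts need further processing. A minimal sketch, assuming (as the awk command above does) that the client IP is the first whitespace-separated field of each log line; the function name is just for illustration.</p>

```python
from collections import Counter


def top_client_ips(lines, limit=10):
    """Return the most frequent first fields (client IPs) with their counts.

    Results are ordered from most to least frequent, the reverse of the
    `sort -h | tail` output in the notes.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts.most_common(limit)
```

<p>Passing an open file handle for <code>/var/log/nginx/rest.log.1</code> as <code>lines</code> should give the same top ten as the shell pipeline, just listed in descending order.</p>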

<ul>
<li>Working on the nginx redirects for CGIAR Library</li>
<li>We should start using 301 redirects and also allow for <code>/sitemap</code> to work on the library.cgiar.org domain so the CGIAR System Organization people can update their Google Search Console and allow Google to find their content in a structured way</li>
<li>Remove eleven occurrences of <code>ACP</code> in IITA’s <code>cg.coverage.region</code> using the Atmire batch edit module from Discovery</li>
<li>Need to investigate how we can verify the library.cgiar.org domain using the HTML or DNS methods</li>
<li>Run corrections on 143 ILRI Archive items that had two <code>dc.identifier.uri</code> values (Handle) that Peter had pointed out earlier this week</li>
<li>I used OpenRefine to isolate them and then fixed and re-imported them into CGSpace</li>
<li>I manually checked a dozen of them and it appeared that the correct handle was always the second one, so I just deleted the first one</li>
</ul>
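<p>The handle cleanup above was done in OpenRefine, but the rule itself is simple enough to sketch in code. The following is a hedged illustration, not the procedure I actually ran: it assumes the two values are joined with <code>||</code> as in the example from 2017-10-01, and the function names are made up for this example.</p>

```python
# Multi-value separator as seen in the example value in these notes.
SEPARATOR = "||"


def split_values(value, separator=SEPARATOR):
    """Split a multi-value cell into its non-empty, stripped parts."""
    return [part.strip() for part in value.split(separator) if part.strip()]


def keep_second_handle(value, separator=SEPARATOR):
    """Keep only the second handle when a cell holds exactly two values.

    Cells with any other number of values are returned unchanged so they
    can be reviewed by hand.
    """
    parts = split_values(value, separator)
    if len(parts) == 2:
        return parts[1]
    return value
```

<p>Leaving cells with one value, or with more than two, untouched is deliberate: the "second one is correct" observation was only spot-checked on items with exactly two handles.</p>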

<h2 id="2017-10-06">2017-10-06</h2>

<ul>
<li>I saw a nice tweak to thumbnail presentation on the Cardiff Metropolitan University DSpace: <a href="https://repository.cardiffmet.ac.uk/handle/10369/8780">https://repository.cardiffmet.ac.uk/handle/10369/8780</a></li>
<li>It adds a subtle border and box shadow, before and after:</li>
</ul>

<p><img src="/cgspace-notes/2017/10/dspace-thumbnail-original.png" alt="Original flat thumbnails" />
<img src="/cgspace-notes/2017/10/dspace-thumbnail-box-shadow.png" alt="Tweaked with border and box shadow" /></p>

<ul>
<li>I’ll post it to the Yammer group to see what people think</li>
<li>I figured out a way to do the HTML verification for Google Search Console for library.cgiar.org</li>
<li>We can drop the HTML file in their XMLUI theme folder and it will get copied to the webapps directory during build/install</li>
<li>Then we add an nginx alias for that URL in the library.cgiar.org vhost</li>
<li>This method is kinda a hack but at least we can put all the pieces into git to be reproducible</li>
<li>I will tell Tunji to send me the verification file</li>
</ul>

<h2 id="2017-10-10">2017-10-10</h2>

<ul>
<li>Deploy logic to allow verification of the library.cgiar.org domain in the Google Search Console (<a href="https://github.com/ilri/DSpace/pull/343">#343</a>)</li>
<li>After verifying both the HTTP and HTTPS domains and submitting a sitemap it will be interesting to see how the stats in the console as well as the search results change (currently 28,500 results):</li>
</ul>

<p><img src="/cgspace-notes/2017/10/google-search-console.png" alt="Google Search Console" />
<img src="/cgspace-notes/2017/10/google-search-console-2.png" alt="Google Search Console 2" />
<img src="/cgspace-notes/2017/10/google-search-results.png" alt="Google Search results" /></p>

<ul>
<li>I tried to submit a “Change of Address” request in the Google Search Console but I need to be an owner on CGSpace’s console (currently I’m just a user) in order to do that</li>
</ul>

</article>

</div> <!-- /.blog-main -->

<aside class="col-sm-3 ml-auto blog-sidebar">

<section class="sidebar-module">
<h4>Recent Posts</h4>
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2017-10/">October, 2017</a></li>
<li><a href="/cgspace-notes/cgiar-library-migration/">CGIAR Library Migration</a></li>
<li><a href="/cgspace-notes/2017-09/">September, 2017</a></li>
<li><a href="/cgspace-notes/2017-08/">August, 2017</a></li>
<li><a href="/cgspace-notes/2017-07/">July, 2017</a></li>
</ol>
</section>

<section class="sidebar-module">
<h4>Links</h4>
<ol class="list-unstyled">
<li><a href="https://cgspace.cgiar.org">CGSpace</a></li>
<li><a href="https://dspacetest.cgiar.org">DSpace Test</a></li>
<li><a href="https://github.com/ilri/DSpace">CGSpace @ GitHub</a></li>
</ol>
</section>

</aside>

</div> <!-- /.row -->
</div> <!-- /.container -->

<footer class="blog-footer">
<p>
Blog template created by <a href="https://twitter.com/mdo">@mdo</a>, ported to Hugo by <a href='https://twitter.com/mralanorth'>@mralanorth</a>.
</p>
<p>
<a href="#">Back to top</a>
</p>
</footer>

</body>

</html>