Add notes for 2023-04-02

This commit is contained in:
2023-04-02 09:16:25 +03:00
parent a2875a3811
commit 5a0b3aaec1
135 changed files with 1383 additions and 1012 deletions

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2023-03-28T23:38:38+03:00" />
<meta property="og:updated_time" content="2023-04-02T08:19:36+03:00" />
@ -81,6 +81,33 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2023-04/">April, 2023</a></h2>
<p class="blog-post-meta"><time datetime="2023-04-02T08:19:36+03:00">Sun Apr 02, 2023</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2023-04-02">2023-04-02</h2>
<ul>
<li>Run all system updates on CGSpace and reboot it</li>
<li>I exported CGSpace to CSV to check for any missing Initiative collection mappings
<ul>
<li>I also did a check for missing country/region mappings with csv-metadata-quality</li>
</ul>
</li>
<li>Start a harvest on AReS</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2023-04/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2023-03/">March, 2023</a></h2>
@ -318,39 +345,6 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2022-06/">June, 2022</a></h2>
<p class="blog-post-meta"><time datetime="2022-06-06T09:01:36+03:00">Mon Jun 06, 2022</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2022-06-06">2022-06-06</h2>
<ul>
<li>Look at the Solr statistics on CGSpace
<ul>
<li>I see 167,000 hits from a bunch of Microsoft IPs with reverse DNS &ldquo;msnbot-&rdquo; using the Solr query <code>dns:*msnbot* AND dns:*.msn.com</code></li>
<li>I purged these first so I could see the other &ldquo;real&rdquo; IPs in the Solr facets</li>
</ul>
</li>
<li>I see 47,500 hits from 80.248.237.167 on a data center ISP in Sweden, using a normal user agent</li>
<li>I see 13,000 hits from 163.237.216.11 on a data center ISP in Australia, using a normal user agent</li>
<li>I see 7,300 hits from 208.185.238.57 from Britanica, using a normal user agent
<ul>
<li>There seem to be many more of these:</li>
</ul>
</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2022-06/'>Read more →</a>
</article>
<nav class="blog-pagination">
@ -375,6 +369,8 @@
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2023-04/">April, 2023</a></li>
<li><a href="/cgspace-notes/2023-03/">March, 2023</a></li>
<li><a href="/cgspace-notes/2023-02/">February, 2023</a></li>
@ -383,8 +379,6 @@
<li><a href="/cgspace-notes/2022-12/">December, 2022</a></li>
<li><a href="/cgspace-notes/2022-11/">November, 2022</a></li>
</ol>
</section>

View File

@ -6,7 +6,25 @@
<description>Recent content in Notes on CGSpace Notes</description>
<generator>Hugo -- gohugo.io</generator>
<language>en-us</language>
<lastBuildDate>Wed, 01 Mar 2023 07:58:36 +0300</lastBuildDate><atom:link href="https://alanorth.github.io/cgspace-notes/categories/notes/index.xml" rel="self" type="application/rss+xml" />
<lastBuildDate>Sun, 02 Apr 2023 08:19:36 +0300</lastBuildDate><atom:link href="https://alanorth.github.io/cgspace-notes/categories/notes/index.xml" rel="self" type="application/rss+xml" />
<item>
<title>April, 2023</title>
<link>https://alanorth.github.io/cgspace-notes/2023-04/</link>
<pubDate>Sun, 02 Apr 2023 08:19:36 +0300</pubDate>
<guid>https://alanorth.github.io/cgspace-notes/2023-04/</guid>
<description>&lt;h2 id=&#34;2023-04-02&#34;&gt;2023-04-02&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Run all system updates on CGSpace and reboot it&lt;/li&gt;
&lt;li&gt;I exported CGSpace to CSV to check for any missing Initiative collection mappings
&lt;ul&gt;
&lt;li&gt;I also did a check for missing country/region mappings with csv-metadata-quality&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Start a harvest on AReS&lt;/li&gt;
&lt;/ul&gt;</description>
</item>
<item>
<title>March, 2023</title>
<link>https://alanorth.github.io/cgspace-notes/2023-03/</link>

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2023-03-28T23:38:38+03:00" />
<meta property="og:updated_time" content="2023-04-02T08:19:36+03:00" />
@ -81,6 +81,39 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2022-06/">June, 2022</a></h2>
<p class="blog-post-meta"><time datetime="2022-06-06T09:01:36+03:00">Mon Jun 06, 2022</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2022-06-06">2022-06-06</h2>
<ul>
<li>Look at the Solr statistics on CGSpace
<ul>
<li>I see 167,000 hits from a bunch of Microsoft IPs with reverse DNS &ldquo;msnbot-&rdquo; using the Solr query <code>dns:*msnbot* AND dns:*.msn.com</code></li>
<li>I purged these first so I could see the other &ldquo;real&rdquo; IPs in the Solr facets</li>
</ul>
</li>
<li>I see 47,500 hits from 80.248.237.167 on a data center ISP in Sweden, using a normal user agent</li>
<li>I see 13,000 hits from 163.237.216.11 on a data center ISP in Australia, using a normal user agent</li>
<li>I see 7,300 hits from 208.185.238.57 from Britanica, using a normal user agent
<ul>
<li>There seem to be many more of these:</li>
</ul>
</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2022-06/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2022-05/">May, 2022</a></h2>
@ -334,31 +367,6 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2021-08/">August, 2021</a></h2>
<p class="blog-post-meta"><time datetime="2021-08-01T09:01:07+03:00">Sun Aug 01, 2021</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2021-08-01">2021-08-01</h2>
<ul>
<li>Update Docker images on AReS server (linode20) and reboot the server:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span># docker images | grep -v ^REPO | sed <span style="color:#e6db74">&#39;s/ \+/:/g&#39;</span> | cut -d: -f1,2 | grep -v none | xargs -L1 docker pull
</span></span></code></pre></div><ul>
<li>I decided to upgrade linode20 from Ubuntu 18.04 to 20.04</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2021-08/'>Read more →</a>
</article>
<nav class="blog-pagination">
<a class="btn btn-outline-primary" href="/cgspace-notes/categories/notes/" rel="prev" role="button">Previous page</a>
@ -383,6 +391,8 @@
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2023-04/">April, 2023</a></li>
<li><a href="/cgspace-notes/2023-03/">March, 2023</a></li>
<li><a href="/cgspace-notes/2023-02/">February, 2023</a></li>
@ -391,8 +401,6 @@
<li><a href="/cgspace-notes/2022-12/">December, 2022</a></li>
<li><a href="/cgspace-notes/2022-11/">November, 2022</a></li>
</ol>
</section>

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2023-03-28T23:38:38+03:00" />
<meta property="og:updated_time" content="2023-04-02T08:19:36+03:00" />
@ -81,6 +81,31 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2021-08/">August, 2021</a></h2>
<p class="blog-post-meta"><time datetime="2021-08-01T09:01:07+03:00">Sun Aug 01, 2021</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2021-08-01">2021-08-01</h2>
<ul>
<li>Update Docker images on AReS server (linode20) and reboot the server:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span># docker images | grep -v ^REPO | sed <span style="color:#e6db74">&#39;s/ \+/:/g&#39;</span> | cut -d: -f1,2 | grep -v none | xargs -L1 docker pull
</span></span></code></pre></div><ul>
<li>I decided to upgrade linode20 from Ubuntu 18.04 to 20.04</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2021-08/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2021-07/">July, 2021</a></h2>
@ -336,26 +361,6 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/cgspace-dspace6-upgrade/">CGSpace DSpace 6 Upgrade</a></h2>
<p class="blog-post-meta"><time datetime="2020-11-15T13:27:35+02:00">Sun Nov 15, 2020</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/categories/notes/" rel="category tag">Notes</a>
<span class="fas fa-tag" aria-hidden="true"></span>&nbsp;<a href="/tags/migration/" rel="tag">Migration</a>
</p>
</header>
<p>Notes about the DSpace 6 upgrade on CGSpace in 2020-11.</p>
<a href='https://alanorth.github.io/cgspace-notes/cgspace-dspace6-upgrade/'>Read more →</a>
</article>
<nav class="blog-pagination">
<a class="btn btn-outline-primary" href="/cgspace-notes/categories/notes/page/2/" rel="prev" role="button">Previous page</a>
@ -380,6 +385,8 @@
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2023-04/">April, 2023</a></li>
<li><a href="/cgspace-notes/2023-03/">March, 2023</a></li>
<li><a href="/cgspace-notes/2023-02/">February, 2023</a></li>
@ -388,8 +395,6 @@
<li><a href="/cgspace-notes/2022-12/">December, 2022</a></li>
<li><a href="/cgspace-notes/2022-11/">November, 2022</a></li>
</ol>
</section>

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2023-03-28T23:38:38+03:00" />
<meta property="og:updated_time" content="2023-04-02T08:19:36+03:00" />
@ -81,6 +81,26 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/cgspace-dspace6-upgrade/">CGSpace DSpace 6 Upgrade</a></h2>
<p class="blog-post-meta"><time datetime="2020-11-15T13:27:35+02:00">Sun Nov 15, 2020</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/categories/notes/" rel="category tag">Notes</a>
<span class="fas fa-tag" aria-hidden="true"></span>&nbsp;<a href="/tags/migration/" rel="tag">Migration</a>
</p>
</header>
<p>Notes about the DSpace 6 upgrade on CGSpace in 2020-11.</p>
<a href='https://alanorth.github.io/cgspace-notes/cgspace-dspace6-upgrade/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-11/">November, 2020</a></h2>
@ -340,34 +360,6 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-02/">February, 2020</a></h2>
<p class="blog-post-meta"><time datetime="2020-02-02T11:56:30+02:00">Sun Feb 02, 2020</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2020-02-02">2020-02-02</h2>
<ul>
<li>Continue working on porting CGSpace&rsquo;s DSpace 5 code to DSpace 6.3 that I started yesterday
<ul>
<li>Sign up for an account with MaxMind so I can get the GeoLite2-City.mmdb database</li>
<li>I still need to wire up the API credentials and cron job into the Ansible infrastructure playbooks</li>
<li>Fix some minor issues in the config and XMLUI themes, like removing Atmire stuff</li>
<li>The code finally builds and runs with a fresh install</li>
</ul>
</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2020-02/'>Read more →</a>
</article>
<nav class="blog-pagination">
<a class="btn btn-outline-primary" href="/cgspace-notes/categories/notes/page/3/" rel="prev" role="button">Previous page</a>
@ -392,6 +384,8 @@
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2023-04/">April, 2023</a></li>
<li><a href="/cgspace-notes/2023-03/">March, 2023</a></li>
<li><a href="/cgspace-notes/2023-02/">February, 2023</a></li>
@ -400,8 +394,6 @@
<li><a href="/cgspace-notes/2022-12/">December, 2022</a></li>
<li><a href="/cgspace-notes/2022-11/">November, 2022</a></li>
</ol>
</section>

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2023-03-28T23:38:38+03:00" />
<meta property="og:updated_time" content="2023-04-02T08:19:36+03:00" />
@ -81,6 +81,34 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-02/">February, 2020</a></h2>
<p class="blog-post-meta"><time datetime="2020-02-02T11:56:30+02:00">Sun Feb 02, 2020</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2020-02-02">2020-02-02</h2>
<ul>
<li>Continue working on porting CGSpace&rsquo;s DSpace 5 code to DSpace 6.3 that I started yesterday
<ul>
<li>Sign up for an account with MaxMind so I can get the GeoLite2-City.mmdb database</li>
<li>I still need to wire up the API credentials and cron job into the Ansible infrastructure playbooks</li>
<li>Fix some minor issues in the config and XMLUI themes, like removing Atmire stuff</li>
<li>The code finally builds and runs with a fresh install</li>
</ul>
</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2020-02/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-01/">January, 2020</a></h2>
@ -369,47 +397,6 @@ DELETE 1
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-04/">April, 2019</a></h2>
<p class="blog-post-meta"><time datetime="2019-04-01T09:00:43+03:00">Mon Apr 01, 2019</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2019-04-01">2019-04-01</h2>
<ul>
<li>Meeting with AgroKnow to discuss CGSpace, ILRI data, AReS, GARDIAN, etc
<ul>
<li>They asked if we had plans to enable RDF support in CGSpace</li>
</ul>
</li>
<li>There have been 4,400 more downloads of the CTA Spore publication from those strange Amazon IP addresses today
<ul>
<li>I suspected that some might not be successful, because the stats show less, but today they were all HTTP 200!</li>
</ul>
</li>
</ul>
<pre tabindex="0"><code># cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep &#39;Spore-192-EN-web.pdf&#39; | grep -E &#39;(18.196.196.108|18.195.78.144|18.195.218.6)&#39; | awk &#39;{print $9}&#39; | sort | uniq -c | sort -n | tail -n 5
4432 200
</code></pre><ul>
<li>In the last two weeks there have been 47,000 downloads of this <em>same exact PDF</em> by these three IP addresses</li>
<li>Apply country and region corrections and deletions on DSpace Test and CGSpace:</li>
</ul>
<pre tabindex="0"><code>$ ./fix-metadata-values.py -i /tmp/2019-02-21-fix-9-countries.csv -db dspace -u dspace -p &#39;fuuu&#39; -f cg.coverage.country -m 228 -t ACTION -d
$ ./fix-metadata-values.py -i /tmp/2019-02-21-fix-4-regions.csv -db dspace -u dspace -p &#39;fuuu&#39; -f cg.coverage.region -m 231 -t action -d
$ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-2-countries.csv -db dspace -u dspace -p &#39;fuuu&#39; -m 228 -f cg.coverage.country -d
$ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-1-region.csv -db dspace -u dspace -p &#39;fuuu&#39; -m 231 -f cg.coverage.region -d
</code></pre>
<a href='https://alanorth.github.io/cgspace-notes/2019-04/'>Read more →</a>
</article>
<nav class="blog-pagination">
<a class="btn btn-outline-primary" href="/cgspace-notes/categories/notes/page/4/" rel="prev" role="button">Previous page</a>
@ -434,6 +421,8 @@ $ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-1-region.csv -db dspace
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2023-04/">April, 2023</a></li>
<li><a href="/cgspace-notes/2023-03/">March, 2023</a></li>
<li><a href="/cgspace-notes/2023-02/">February, 2023</a></li>
@ -442,8 +431,6 @@ $ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-1-region.csv -db dspace
<li><a href="/cgspace-notes/2022-12/">December, 2022</a></li>
<li><a href="/cgspace-notes/2022-11/">November, 2022</a></li>
</ol>
</section>

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2023-03-28T23:38:38+03:00" />
<meta property="og:updated_time" content="2023-04-02T08:19:36+03:00" />
@ -81,6 +81,47 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-04/">April, 2019</a></h2>
<p class="blog-post-meta"><time datetime="2019-04-01T09:00:43+03:00">Mon Apr 01, 2019</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2019-04-01">2019-04-01</h2>
<ul>
<li>Meeting with AgroKnow to discuss CGSpace, ILRI data, AReS, GARDIAN, etc
<ul>
<li>They asked if we had plans to enable RDF support in CGSpace</li>
</ul>
</li>
<li>There have been 4,400 more downloads of the CTA Spore publication from those strange Amazon IP addresses today
<ul>
<li>I suspected that some might not be successful, because the stats show less, but today they were all HTTP 200!</li>
</ul>
</li>
</ul>
<pre tabindex="0"><code># cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep &#39;Spore-192-EN-web.pdf&#39; | grep -E &#39;(18.196.196.108|18.195.78.144|18.195.218.6)&#39; | awk &#39;{print $9}&#39; | sort | uniq -c | sort -n | tail -n 5
4432 200
</code></pre><ul>
<li>In the last two weeks there have been 47,000 downloads of this <em>same exact PDF</em> by these three IP addresses</li>
<li>Apply country and region corrections and deletions on DSpace Test and CGSpace:</li>
</ul>
<pre tabindex="0"><code>$ ./fix-metadata-values.py -i /tmp/2019-02-21-fix-9-countries.csv -db dspace -u dspace -p &#39;fuuu&#39; -f cg.coverage.country -m 228 -t ACTION -d
$ ./fix-metadata-values.py -i /tmp/2019-02-21-fix-4-regions.csv -db dspace -u dspace -p &#39;fuuu&#39; -f cg.coverage.region -m 231 -t action -d
$ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-2-countries.csv -db dspace -u dspace -p &#39;fuuu&#39; -m 228 -f cg.coverage.country -d
$ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-1-region.csv -db dspace -u dspace -p &#39;fuuu&#39; -m 231 -f cg.coverage.region -d
</code></pre>
<a href='https://alanorth.github.io/cgspace-notes/2019-04/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-03/">March, 2019</a></h2>
@ -350,44 +391,6 @@ sys 0m1.979s
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2018-06/">June, 2018</a></h2>
<p class="blog-post-meta"><time datetime="2018-06-04T19:49:54-07:00">Mon Jun 04, 2018</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2018-06-04">2018-06-04</h2>
<ul>
<li>Test the <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=560">DSpace 5.8 module upgrades from Atmire</a> (<a href="https://github.com/ilri/DSpace/pull/378">#378</a>)
<ul>
<li>There seems to be a problem with the CUA and L&amp;R versions in <code>pom.xml</code> because they are using SNAPSHOT and it doesn&rsquo;t build</li>
</ul>
</li>
<li>I added the new CCAFS Phase II Project Tag <code>PII-FP1_PACCA2</code> and merged it into the <code>5_x-prod</code> branch (<a href="https://github.com/ilri/DSpace/pull/379">#379</a>)</li>
<li>I proofed and tested the ILRI author corrections that Peter sent back to me this week:</li>
</ul>
<pre tabindex="0"><code>$ ./fix-metadata-values.py -i /tmp/2018-05-30-Correct-660-authors.csv -db dspace -u dspace -p &#39;fuuu&#39; -f dc.contributor.author -t correct -m 3 -n
</code></pre><ul>
<li>I think a sane proofing workflow in OpenRefine is to apply the custom text facets for check/delete/remove and illegal characters that I developed in <a href="/cgspace-notes/2018-03/">March, 2018</a></li>
<li>Time to index ~70,000 items on CGSpace:</li>
</ul>
<pre tabindex="0"><code>$ time schedtool -D -e ionice -c2 -n7 nice -n19 [dspace]/bin/dspace index-discovery -b
real 74m42.646s
user 8m5.056s
sys 2m7.289s
</code></pre>
<a href='https://alanorth.github.io/cgspace-notes/2018-06/'>Read more →</a>
</article>
<nav class="blog-pagination">
<a class="btn btn-outline-primary" href="/cgspace-notes/categories/notes/page/5/" rel="prev" role="button">Previous page</a>
@ -412,6 +415,8 @@ sys 2m7.289s
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2023-04/">April, 2023</a></li>
<li><a href="/cgspace-notes/2023-03/">March, 2023</a></li>
<li><a href="/cgspace-notes/2023-02/">February, 2023</a></li>
@ -420,8 +425,6 @@ sys 2m7.289s
<li><a href="/cgspace-notes/2022-12/">December, 2022</a></li>
<li><a href="/cgspace-notes/2022-11/">November, 2022</a></li>
</ol>
</section>

View File

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2023-03-28T23:38:38+03:00" />
<meta property="og:updated_time" content="2023-04-02T08:19:36+03:00" />
@ -81,6 +81,44 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2018-06/">June, 2018</a></h2>
<p class="blog-post-meta"><time datetime="2018-06-04T19:49:54-07:00">Mon Jun 04, 2018</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2018-06-04">2018-06-04</h2>
<ul>
<li>Test the <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=560">DSpace 5.8 module upgrades from Atmire</a> (<a href="https://github.com/ilri/DSpace/pull/378">#378</a>)
<ul>
<li>There seems to be a problem with the CUA and L&amp;R versions in <code>pom.xml</code> because they are using SNAPSHOT and it doesn&rsquo;t build</li>
</ul>
</li>
<li>I added the new CCAFS Phase II Project Tag <code>PII-FP1_PACCA2</code> and merged it into the <code>5_x-prod</code> branch (<a href="https://github.com/ilri/DSpace/pull/379">#379</a>)</li>
<li>I proofed and tested the ILRI author corrections that Peter sent back to me this week:</li>
</ul>
<pre tabindex="0"><code>$ ./fix-metadata-values.py -i /tmp/2018-05-30-Correct-660-authors.csv -db dspace -u dspace -p &#39;fuuu&#39; -f dc.contributor.author -t correct -m 3 -n
</code></pre><ul>
<li>I think a sane proofing workflow in OpenRefine is to apply the custom text facets for check/delete/remove and illegal characters that I developed in <a href="/cgspace-notes/2018-03/">March, 2018</a></li>
<li>Time to index ~70,000 items on CGSpace:</li>
</ul>
<pre tabindex="0"><code>$ time schedtool -D -e ionice -c2 -n7 nice -n19 [dspace]/bin/dspace index-discovery -b
real 74m42.646s
user 8m5.056s
sys 2m7.289s
</code></pre>
<a href='https://alanorth.github.io/cgspace-notes/2018-06/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2018-05/">May, 2018</a></h2>
@ -386,6 +424,8 @@ COPY 54701
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2023-04/">April, 2023</a></li>
<li><a href="/cgspace-notes/2023-03/">March, 2023</a></li>
<li><a href="/cgspace-notes/2023-02/">February, 2023</a></li>
@ -394,8 +434,6 @@ COPY 54701
<li><a href="/cgspace-notes/2022-12/">December, 2022</a></li>
<li><a href="/cgspace-notes/2022-11/">November, 2022</a></li>
</ol>
</section>