mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Add notes for 2022-05-04
This commit is contained in:
@ -10,14 +10,14 @@
|
||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||
<meta property="og:type" content="website" />
|
||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/" />
|
||||
<meta property="og:updated_time" content="2022-04-27T09:58:45+03:00" />
|
||||
<meta property="og:updated_time" content="2022-05-04T09:13:39+03:00" />
|
||||
|
||||
|
||||
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="Categories"/>
|
||||
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
|
||||
<meta name="generator" content="Hugo 0.97.3" />
|
||||
<meta name="generator" content="Hugo 0.98.0" />
|
||||
|
||||
|
||||
|
||||
@ -84,7 +84,7 @@
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/categories/notes/">Notes</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2022-04-01T10:53:39+03:00">Fri Apr 01, 2022</time> by Alan Orth</p>
|
||||
<p class="blog-post-meta"><time datetime="2022-05-04T09:13:39+03:00">Wed May 04, 2022</time> by Alan Orth</p>
|
||||
</header>
|
||||
|
||||
<a href='https://alanorth.github.io/cgspace-notes/categories/notes/'>Read more →</a>
|
||||
@ -108,6 +108,8 @@
|
||||
<ol class="list-unstyled">
|
||||
|
||||
|
||||
<li><a href="/cgspace-notes/2022-05/">May, 2022</a></li>
|
||||
|
||||
<li><a href="/cgspace-notes/2022-04/">April, 2022</a></li>
|
||||
|
||||
<li><a href="/cgspace-notes/2022-03/">March, 2022</a></li>
|
||||
@ -116,8 +118,6 @@
|
||||
|
||||
<li><a href="/cgspace-notes/2022-01/">January, 2022</a></li>
|
||||
|
||||
<li><a href="/cgspace-notes/2021-12/">December, 2021</a></li>
|
||||
|
||||
</ol>
|
||||
</section>
|
||||
|
||||
|
@ -6,11 +6,11 @@
|
||||
<description>Recent content in Categories on CGSpace Notes</description>
|
||||
<generator>Hugo -- gohugo.io</generator>
|
||||
<language>en-us</language>
|
||||
<lastBuildDate>Fri, 01 Apr 2022 10:53:39 +0300</lastBuildDate><atom:link href="https://alanorth.github.io/cgspace-notes/categories/index.xml" rel="self" type="application/rss+xml" />
|
||||
<lastBuildDate>Wed, 04 May 2022 09:13:39 +0300</lastBuildDate><atom:link href="https://alanorth.github.io/cgspace-notes/categories/index.xml" rel="self" type="application/rss+xml" />
|
||||
<item>
|
||||
<title>Notes</title>
|
||||
<link>https://alanorth.github.io/cgspace-notes/categories/notes/</link>
|
||||
<pubDate>Fri, 01 Apr 2022 10:53:39 +0300</pubDate>
|
||||
<pubDate>Wed, 04 May 2022 09:13:39 +0300</pubDate>
|
||||
|
||||
<guid>https://alanorth.github.io/cgspace-notes/categories/notes/</guid>
|
||||
<description></description>
|
||||
|
@ -10,14 +10,14 @@
|
||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||
<meta property="og:type" content="website" />
|
||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
|
||||
<meta property="og:updated_time" content="2022-04-27T09:58:45+03:00" />
|
||||
<meta property="og:updated_time" content="2022-05-04T09:13:39+03:00" />
|
||||
|
||||
|
||||
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="Notes"/>
|
||||
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
|
||||
<meta name="generator" content="Hugo 0.97.3" />
|
||||
<meta name="generator" content="Hugo 0.98.0" />
|
||||
|
||||
|
||||
|
||||
@ -81,6 +81,48 @@
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2022-05/">May, 2022</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2022-05-04T09:13:39+03:00">Wed May 04, 2022</time> by Alan Orth in
|
||||
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
|
||||
|
||||
|
||||
</p>
|
||||
</header>
|
||||
<h2 id="2022-05-04">2022-05-04</h2>
|
||||
<ul>
|
||||
<li>I found a few more IPs making requests using the shady Chrome 44 user agent in the last few days so I will add them to the block list too:
|
||||
<ul>
|
||||
<li>18.207.136.176</li>
|
||||
<li>185.189.36.248</li>
|
||||
<li>50.118.223.78</li>
|
||||
<li>52.70.76.123</li>
|
||||
<li>3.236.10.11</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>Looking at the Solr statistics for 2022-04
|
||||
<ul>
|
||||
<li>52.191.137.59 is owned by Microsoft, but it is using a normal user agent and making tens of thousands of requests</li>
|
||||
<li>64.39.98.62 is owned by Qualys, and all their requests are probing for /etc/passwd etc</li>
|
||||
<li>185.192.69.15 is in the Netherlands and is using a normal user agent, but making excessive automated HTTP requests to paths forbidden in robots.txt</li>
|
||||
<li>157.55.39.159 is owned by Microsoft and identifies as bingbot so I don’t know why its requests were logged in Solr</li>
|
||||
<li>52.233.67.176 is owned by Microsoft and uses a normal user agent, but is making excessive automated HTTP requests</li>
|
||||
<li>157.55.39.144 is owned by Microsoft and uses a normal user agent, but is making excessive automated HTTP requests</li>
|
||||
<li>207.46.13.177 is owned by Microsoft and identifies as bingbot so I don’t know why its requests were logged in Solr</li>
|
||||
<li>If I query Solr for <code>time:2022-04* AND dns:*msnbot* AND dns:*.msn.com.</code> I see a handful of IPs that made 41,000 requests</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>I purged 93,974 hits from these IPs using my <code>check-spider-ip-hits.sh</code> script</li>
|
||||
</ul>
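The Solr query quoted above can be assembled programmatically. The following is a minimal sketch, not the author's `check-spider-ip-hits.sh` script: the helper names are hypothetical, and only the `time` and `dns` field names and the wildcard patterns come from the notes.

```python
from urllib.parse import urlencode


def build_bot_query(month: str, dns_patterns: list[str]) -> str:
    """Combine a month prefix on `time` with wildcard matches on `dns`."""
    clauses = [f"time:{month}*"] + [f"dns:{pattern}" for pattern in dns_patterns]
    return " AND ".join(clauses)


def build_select_params(query: str) -> str:
    """URL-encode the query; rows=0 asks for the match count only (assumed usage)."""
    return urlencode({"q": query, "rows": 0})


query = build_bot_query("2022-04", ["*msnbot*", "*.msn.com."])
params = build_select_params(query)
print(query)
print(params)
```

The encoded parameters could be appended to a statistics core's select endpoint to count matching hits before deciding whether to purge them.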
|
||||
<a href='https://alanorth.github.io/cgspace-notes/2022-05/'>Read more →</a>
|
||||
</article>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2022-04/">April, 2022</a></h2>
|
||||
@ -317,30 +359,6 @@
|
||||
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2021-07/">July, 2021</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2021-07-01T08:53:07+03:00">Thu Jul 01, 2021</time> by Alan Orth in
|
||||
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
|
||||
|
||||
|
||||
</p>
|
||||
</header>
|
||||
<h2 id="2021-07-01">2021-07-01</h2>
|
||||
<ul>
|
||||
<li>Export another list of ALL subjects on CGSpace, including AGROVOC and non-AGROVOC for Enrico:</li>
|
||||
</ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>localhost/dspace63= > \COPY (SELECT DISTINCT LOWER(text_value) AS subject, count(*) FROM metadatavalue WHERE dspace_object_id in (SELECT dspace_object_id FROM item) AND metadata_field_id IN (119, 120, 127, 122, 128, 125, 135, 203, 208, 210, 215, 123, 236, 242, 187) GROUP BY subject ORDER BY count DESC) to /tmp/2021-07-01-all-subjects.csv WITH CSV HEADER;
|
||||
</span></span><span style="display:flex;"><span>COPY 20994
|
||||
</span></span></code></pre></div>
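The aggregation in the query above (lowercase each value, then count occurrences and sort by count) can be illustrated outside the database. This is a sketch with made-up sample values, not data from CGSpace, and Python's `str.lower` may differ from PostgreSQL's `LOWER` for some non-ASCII text.

```python
from collections import Counter


def count_subjects(values):
    """Lowercase each subject and count occurrences, most frequent first."""
    counts = Counter(value.lower() for value in values)
    return counts.most_common()


# Hypothetical sample standing in for metadatavalue.text_value rows
sample = ["Maize", "MAIZE", "gender", "maize", "Gender", "livestock"]
result = count_subjects(sample)
for subject, count in result:
    print(f"{subject},{count}")
```

Because the SQL groups by the lowercased value, each subject already appears once per group, so the `DISTINCT` keyword in the query is likely redundant.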
|
||||
<a href='https://alanorth.github.io/cgspace-notes/2021-07/'>Read more →</a>
|
||||
</article>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<nav class="blog-pagination">
|
||||
|
||||
|
||||
@ -365,6 +383,8 @@
|
||||
<ol class="list-unstyled">
|
||||
|
||||
|
||||
<li><a href="/cgspace-notes/2022-05/">May, 2022</a></li>
|
||||
|
||||
<li><a href="/cgspace-notes/2022-04/">April, 2022</a></li>
|
||||
|
||||
<li><a href="/cgspace-notes/2022-03/">March, 2022</a></li>
|
||||
@ -373,8 +393,6 @@
|
||||
|
||||
<li><a href="/cgspace-notes/2022-01/">January, 2022</a></li>
|
||||
|
||||
<li><a href="/cgspace-notes/2021-12/">December, 2021</a></li>
|
||||
|
||||
</ol>
|
||||
</section>
|
||||
|
||||
|
@ -6,7 +6,40 @@
|
||||
<description>Recent content in Notes on CGSpace Notes</description>
|
||||
<generator>Hugo -- gohugo.io</generator>
|
||||
<language>en-us</language>
|
||||
<lastBuildDate>Fri, 01 Apr 2022 10:53:39 +0300</lastBuildDate><atom:link href="https://alanorth.github.io/cgspace-notes/categories/notes/index.xml" rel="self" type="application/rss+xml" />
|
||||
<lastBuildDate>Wed, 04 May 2022 09:13:39 +0300</lastBuildDate><atom:link href="https://alanorth.github.io/cgspace-notes/categories/notes/index.xml" rel="self" type="application/rss+xml" />
|
||||
<item>
|
||||
<title>May, 2022</title>
|
||||
<link>https://alanorth.github.io/cgspace-notes/2022-05/</link>
|
||||
<pubDate>Wed, 04 May 2022 09:13:39 +0300</pubDate>
|
||||
|
||||
<guid>https://alanorth.github.io/cgspace-notes/2022-05/</guid>
|
||||
<description><h2 id="2022-05-04">2022-05-04</h2>
|
||||
<ul>
|
||||
<li>I found a few more IPs making requests using the shady Chrome 44 user agent in the last few days so I will add them to the block list too:
|
||||
<ul>
|
||||
<li>18.207.136.176</li>
|
||||
<li>185.189.36.248</li>
|
||||
<li>50.118.223.78</li>
|
||||
<li>52.70.76.123</li>
|
||||
<li>3.236.10.11</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>Looking at the Solr statistics for 2022-04
|
||||
<ul>
|
||||
<li>52.191.137.59 is Microsoft, but they are using a normal user agent and making tens of thousands of requests</li>
|
||||
<li>64.39.98.62 is owned by Qualys, and all their requests are probing for /etc/passwd etc</li>
|
||||
<li>185.192.69.15 is in the Netherlands and is using a normal user agent, but making excessive automated HTTP requests to paths forbidden in robots.txt</li>
|
||||
<li>157.55.39.159 is owned by Microsoft and identifies as bingbot so I don&rsquo;t know why its requests were logged in Solr</li>
|
||||
<li>52.233.67.176 is owned by Microsoft and uses a normal user agent, but is making excessive automated HTTP requests</li>
|
||||
<li>157.55.39.144 is owned by Microsoft and uses a normal user agent, but is making excessive automated HTTP requests</li>
|
||||
<li>207.46.13.177 is owned by Microsoft and identifies as bingbot so I don&rsquo;t know why its requests were logged in Solr</li>
|
||||
<li>If I query Solr for <code>time:2022-04* AND dns:*msnbot* AND dns:*.msn.com.</code> I see a handful of IPs that made 41,000 requests</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>I purged 93,974 hits from these IPs using my <code>check-spider-ip-hits.sh</code> script</li>
|
||||
</ul></description>
|
||||
</item>
|
||||
|
||||
<item>
|
||||
<title>April, 2022</title>
|
||||
<link>https://alanorth.github.io/cgspace-notes/2022-04/</link>
|
||||
|
@ -10,14 +10,14 @@
|
||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||
<meta property="og:type" content="website" />
|
||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
|
||||
<meta property="og:updated_time" content="2022-04-27T09:58:45+03:00" />
|
||||
<meta property="og:updated_time" content="2022-05-04T09:13:39+03:00" />
|
||||
|
||||
|
||||
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="Notes"/>
|
||||
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
|
||||
<meta name="generator" content="Hugo 0.97.3" />
|
||||
<meta name="generator" content="Hugo 0.98.0" />
|
||||
|
||||
|
||||
|
||||
@ -81,6 +81,30 @@
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2021-07/">July, 2021</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2021-07-01T08:53:07+03:00">Thu Jul 01, 2021</time> by Alan Orth in
|
||||
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
|
||||
|
||||
|
||||
</p>
|
||||
</header>
|
||||
<h2 id="2021-07-01">2021-07-01</h2>
|
||||
<ul>
|
||||
<li>Export another list of ALL subjects on CGSpace, including AGROVOC and non-AGROVOC for Enrico:</li>
|
||||
</ul>
|
||||
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>localhost/dspace63= > \COPY (SELECT DISTINCT LOWER(text_value) AS subject, count(*) FROM metadatavalue WHERE dspace_object_id in (SELECT dspace_object_id FROM item) AND metadata_field_id IN (119, 120, 127, 122, 128, 125, 135, 203, 208, 210, 215, 123, 236, 242, 187) GROUP BY subject ORDER BY count DESC) to /tmp/2021-07-01-all-subjects.csv WITH CSV HEADER;
|
||||
</span></span><span style="display:flex;"><span>COPY 20994
|
||||
</span></span></code></pre></div>
|
||||
<a href='https://alanorth.github.io/cgspace-notes/2021-07/'>Read more →</a>
|
||||
</article>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2021-06/">June, 2021</a></h2>
|
||||
@ -332,31 +356,6 @@
|
||||
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-11/">November, 2020</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2020-11-01T13:11:54+02:00">Sun Nov 01, 2020</time> by Alan Orth in
|
||||
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
|
||||
|
||||
|
||||
</p>
|
||||
</header>
|
||||
<h2 id="2020-11-01">2020-11-01</h2>
|
||||
<ul>
|
||||
<li>Continue with processing the statistics-2019 Solr core with the AtomicStatisticsUpdateCLI tool on DSpace Test
|
||||
<ul>
|
||||
<li>So far we’ve spent at least fifty hours to process the statistics and statistics-2019 core… wow.</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<a href='https://alanorth.github.io/cgspace-notes/2020-11/'>Read more →</a>
|
||||
</article>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<nav class="blog-pagination">
|
||||
|
||||
<a class="btn btn-outline-primary" href="/cgspace-notes/categories/notes/" rel="prev" role="button">Previous page</a>
|
||||
@ -381,6 +380,8 @@
|
||||
<ol class="list-unstyled">
|
||||
|
||||
|
||||
<li><a href="/cgspace-notes/2022-05/">May, 2022</a></li>
|
||||
|
||||
<li><a href="/cgspace-notes/2022-04/">April, 2022</a></li>
|
||||
|
||||
<li><a href="/cgspace-notes/2022-03/">March, 2022</a></li>
|
||||
@ -389,8 +390,6 @@
|
||||
|
||||
<li><a href="/cgspace-notes/2022-01/">January, 2022</a></li>
|
||||
|
||||
<li><a href="/cgspace-notes/2021-12/">December, 2021</a></li>
|
||||
|
||||
</ol>
|
||||
</section>
|
||||
|
||||
|
@ -10,14 +10,14 @@
|
||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||
<meta property="og:type" content="website" />
|
||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
|
||||
<meta property="og:updated_time" content="2022-04-27T09:58:45+03:00" />
|
||||
<meta property="og:updated_time" content="2022-05-04T09:13:39+03:00" />
|
||||
|
||||
|
||||
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="Notes"/>
|
||||
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
|
||||
<meta name="generator" content="Hugo 0.97.3" />
|
||||
<meta name="generator" content="Hugo 0.98.0" />
|
||||
|
||||
|
||||
|
||||
@ -81,6 +81,31 @@
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-11/">November, 2020</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2020-11-01T13:11:54+02:00">Sun Nov 01, 2020</time> by Alan Orth in
|
||||
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
|
||||
|
||||
|
||||
</p>
|
||||
</header>
|
||||
<h2 id="2020-11-01">2020-11-01</h2>
|
||||
<ul>
|
||||
<li>Continue with processing the statistics-2019 Solr core with the AtomicStatisticsUpdateCLI tool on DSpace Test
|
||||
<ul>
|
||||
<li>So far we’ve spent at least fifty hours to process the statistics and statistics-2019 core… wow.</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<a href='https://alanorth.github.io/cgspace-notes/2020-11/'>Read more →</a>
|
||||
</article>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-10/">October, 2020</a></h2>
|
||||
@ -343,43 +368,6 @@
|
||||
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-01/">January, 2020</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2020-01-06T10:48:30+02:00">Mon Jan 06, 2020</time> by Alan Orth in
|
||||
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
|
||||
|
||||
|
||||
</p>
|
||||
</header>
|
||||
<h2 id="2020-01-06">2020-01-06</h2>
|
||||
<ul>
|
||||
<li>Open <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=706">a ticket</a> with Atmire to request a quote for the upgrade to DSpace 6</li>
|
||||
<li>Last week Altmetric responded about the <a href="https://hdl.handle.net/10568/97087">item</a> that had a lower score than its DOI
|
||||
<ul>
|
||||
<li>The score is now linked to the DOI</li>
|
||||
<li>Another <a href="https://hdl.handle.net/10568/91278">item</a> that had the same problem in 2019 is now also linked to the score for its DOI</li>
|
||||
<li>Another <a href="https://hdl.handle.net/10568/81236">item</a> that had the same problem in 2019 has also been fixed</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<h2 id="2020-01-07">2020-01-07</h2>
|
||||
<ul>
|
||||
<li>Peter Ballantyne highlighted one more WLE <a href="https://hdl.handle.net/10568/101286">item</a> that is missing the Altmetric score that its DOI has
|
||||
<ul>
|
||||
<li>The DOI has a score of 259, but the Handle has no score at all</li>
|
||||
<li>I <a href="https://twitter.com/mralanorth/status/1214471427157626881">tweeted</a> the CGSpace repository link</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<a href='https://alanorth.github.io/cgspace-notes/2020-01/'>Read more →</a>
|
||||
</article>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<nav class="blog-pagination">
|
||||
|
||||
<a class="btn btn-outline-primary" href="/cgspace-notes/categories/notes/page/2/" rel="prev" role="button">Previous page</a>
|
||||
@ -404,6 +392,8 @@
|
||||
<ol class="list-unstyled">
|
||||
|
||||
|
||||
<li><a href="/cgspace-notes/2022-05/">May, 2022</a></li>
|
||||
|
||||
<li><a href="/cgspace-notes/2022-04/">April, 2022</a></li>
|
||||
|
||||
<li><a href="/cgspace-notes/2022-03/">March, 2022</a></li>
|
||||
@ -412,8 +402,6 @@
|
||||
|
||||
<li><a href="/cgspace-notes/2022-01/">January, 2022</a></li>
|
||||
|
||||
<li><a href="/cgspace-notes/2021-12/">December, 2021</a></li>
|
||||
|
||||
</ol>
|
||||
</section>
|
||||
|
||||
|
@ -10,14 +10,14 @@
|
||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||
<meta property="og:type" content="website" />
|
||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
|
||||
<meta property="og:updated_time" content="2022-04-27T09:58:45+03:00" />
|
||||
<meta property="og:updated_time" content="2022-05-04T09:13:39+03:00" />
|
||||
|
||||
|
||||
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="Notes"/>
|
||||
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
|
||||
<meta name="generator" content="Hugo 0.97.3" />
|
||||
<meta name="generator" content="Hugo 0.98.0" />
|
||||
|
||||
|
||||
|
||||
@ -81,6 +81,43 @@
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-01/">January, 2020</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2020-01-06T10:48:30+02:00">Mon Jan 06, 2020</time> by Alan Orth in
|
||||
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
|
||||
|
||||
|
||||
</p>
|
||||
</header>
|
||||
<h2 id="2020-01-06">2020-01-06</h2>
|
||||
<ul>
|
||||
<li>Open <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=706">a ticket</a> with Atmire to request a quote for the upgrade to DSpace 6</li>
|
||||
<li>Last week Altmetric responded about the <a href="https://hdl.handle.net/10568/97087">item</a> that had a lower score than its DOI
|
||||
<ul>
|
||||
<li>The score is now linked to the DOI</li>
|
||||
<li>Another <a href="https://hdl.handle.net/10568/91278">item</a> that had the same problem in 2019 is now also linked to the score for its DOI</li>
|
||||
<li>Another <a href="https://hdl.handle.net/10568/81236">item</a> that had the same problem in 2019 has also been fixed</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<h2 id="2020-01-07">2020-01-07</h2>
|
||||
<ul>
|
||||
<li>Peter Ballantyne highlighted one more WLE <a href="https://hdl.handle.net/10568/101286">item</a> that is missing the Altmetric score that its DOI has
|
||||
<ul>
|
||||
<li>The DOI has a score of 259, but the Handle has no score at all</li>
|
||||
<li>I <a href="https://twitter.com/mralanorth/status/1214471427157626881">tweeted</a> the CGSpace repository link</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<a href='https://alanorth.github.io/cgspace-notes/2020-01/'>Read more →</a>
|
||||
</article>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-12/">December, 2019</a></h2>
|
||||
@ -373,38 +410,6 @@ $ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-1-region.csv -db dspace
|
||||
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-03/">March, 2019</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2019-03-01T12:16:30+01:00">Fri Mar 01, 2019</time> by Alan Orth in
|
||||
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
|
||||
|
||||
|
||||
</p>
|
||||
</header>
|
||||
<h2 id="2019-03-01">2019-03-01</h2>
|
||||
<ul>
|
||||
<li>I checked IITA’s 259 Feb 14 records from last month for duplicates using Atmire’s Duplicate Checker on a fresh snapshot of CGSpace on my local machine and everything looks good</li>
|
||||
<li>I am now only waiting to hear from her about where the items should go, though I assume Journal Articles go to IITA Journal Articles collection, etc…</li>
|
||||
<li>Looking at the other half of Udana’s WLE records from 2018-11
|
||||
<ul>
|
||||
<li>I finished the ones for Restoring Degraded Landscapes (RDL), but these are for Variability, Risks and Competing Uses (VRC)</li>
|
||||
<li>I did the usual cleanups for whitespace, added regions where they made sense for certain countries, cleaned up the DOI link formats, added rights information based on the publications page for a few items</li>
|
||||
<li>Most worryingly, there are encoding errors in the abstracts for eleven items, for example:</li>
|
||||
<li>68.15% <20> 9.45 instead of 68.15% ± 9.45</li>
|
||||
<li>2003<EFBFBD>2013 instead of 2003–2013</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>I think I will need to ask Udana to re-copy and paste the abstracts with more care using Google Docs</li>
|
||||
</ul>
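One common way this kind of damage arises is when text saved in a legacy single-byte encoding is later decoded as UTF-8. The sketch below assumes that cause purely for illustration; the notes do not say how the abstracts were actually corrupted, and the function name is hypothetical.

```python
def mangle(text: str, source_encoding: str) -> str:
    """Encode text in a legacy encoding, then wrongly decode the bytes as UTF-8."""
    raw = text.encode(source_encoding)
    # Bytes that are invalid in UTF-8 become U+FFFD, the replacement character
    return raw.decode("utf-8", errors="replace")


plus_minus = mangle("68.15% ± 9.45", "latin-1")
en_dash = mangle("2003–2013", "cp1252")
print(plus_minus)
print(en_dash)
```

Once a character has been replaced with U+FFFD the original byte is lost, which is why the abstracts would need to be copied again from the source rather than repaired in place.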
|
||||
<a href='https://alanorth.github.io/cgspace-notes/2019-03/'>Read more →</a>
|
||||
</article>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<nav class="blog-pagination">
|
||||
|
||||
<a class="btn btn-outline-primary" href="/cgspace-notes/categories/notes/page/3/" rel="prev" role="button">Previous page</a>
|
||||
@ -429,6 +434,8 @@ $ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-1-region.csv -db dspace
|
||||
<ol class="list-unstyled">
|
||||
|
||||
|
||||
<li><a href="/cgspace-notes/2022-05/">May, 2022</a></li>
|
||||
|
||||
<li><a href="/cgspace-notes/2022-04/">April, 2022</a></li>
|
||||
|
||||
<li><a href="/cgspace-notes/2022-03/">March, 2022</a></li>
|
||||
@ -437,8 +444,6 @@ $ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-1-region.csv -db dspace
|
||||
|
||||
<li><a href="/cgspace-notes/2022-01/">January, 2022</a></li>
|
||||
|
||||
<li><a href="/cgspace-notes/2021-12/">December, 2021</a></li>
|
||||
|
||||
</ol>
|
||||
</section>
|
||||
|
||||
|
@ -10,14 +10,14 @@
|
||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||
<meta property="og:type" content="website" />
|
||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
|
||||
<meta property="og:updated_time" content="2022-04-27T09:58:45+03:00" />
|
||||
<meta property="og:updated_time" content="2022-05-04T09:13:39+03:00" />
|
||||
|
||||
|
||||
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="Notes"/>
|
||||
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
|
||||
<meta name="generator" content="Hugo 0.97.3" />
|
||||
<meta name="generator" content="Hugo 0.98.0" />
|
||||
|
||||
|
||||
|
||||
@ -81,6 +81,38 @@
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-03/">March, 2019</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2019-03-01T12:16:30+01:00">Fri Mar 01, 2019</time> by Alan Orth in
|
||||
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
|
||||
|
||||
|
||||
</p>
|
||||
</header>
|
||||
<h2 id="2019-03-01">2019-03-01</h2>
|
||||
<ul>
|
||||
<li>I checked IITA’s 259 Feb 14 records from last month for duplicates using Atmire’s Duplicate Checker on a fresh snapshot of CGSpace on my local machine and everything looks good</li>
|
||||
<li>I am now only waiting to hear from her about where the items should go, though I assume Journal Articles go to IITA Journal Articles collection, etc…</li>
|
||||
<li>Looking at the other half of Udana’s WLE records from 2018-11
|
||||
<ul>
|
||||
<li>I finished the ones for Restoring Degraded Landscapes (RDL), but these are for Variability, Risks and Competing Uses (VRC)</li>
|
||||
<li>I did the usual cleanups for whitespace, added regions where they made sense for certain countries, cleaned up the DOI link formats, added rights information based on the publications page for a few items</li>
|
||||
<li>Most worryingly, there are encoding errors in the abstracts for eleven items, for example:</li>
|
||||
<li>68.15% <20> 9.45 instead of 68.15% ± 9.45</li>
|
||||
<li>2003<EFBFBD>2013 instead of 2003–2013</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>I think I will need to ask Udana to re-copy and paste the abstracts with more care using Google Docs</li>
|
||||
</ul>
|
||||
<a href='https://alanorth.github.io/cgspace-notes/2019-03/'>Read more →</a>
|
||||
</article>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-02/">February, 2019</a></h2>
|
||||
@ -356,34 +388,6 @@ sys 2m7.289s
|
||||
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2018-05/">May, 2018</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2018-05-01T16:43:54+03:00">Tue May 01, 2018</time> by Alan Orth in
|
||||
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
|
||||
|
||||
|
||||
</p>
|
||||
</header>
|
||||
<h2 id="2018-05-01">2018-05-01</h2>
|
||||
<ul>
|
||||
<li>I cleared the Solr statistics core on DSpace Test by issuing two commands directly to the Solr admin interface:
|
||||
<ul>
|
||||
<li>http://localhost:3000/solr/statistics/update?stream.body=%3Cdelete%3E%3Cquery%3E*:*%3C/query%3E%3C/delete%3E</li>
|
||||
<li>http://localhost:3000/solr/statistics/update?stream.body=%3Ccommit/%3E</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>Then I reduced the JVM heap size from 6144m back to 5120m</li>
|
||||
<li>Also, I switched it to use OpenJDK instead of Oracle Java, as well as re-worked the <a href="https://github.com/ilri/rmg-ansible-public">Ansible infrastructure scripts</a> to support hosts choosing which distribution they want to use</li>
|
||||
</ul>
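The two URLs above carry percent-encoded XML in the `stream.body` parameter. As a minimal sketch, they can be rebuilt from the readable XML with Python's standard library; the helper name is hypothetical, and the base URL is taken from the notes.

```python
from urllib.parse import quote, unquote

SOLR_UPDATE = "http://localhost:3000/solr/statistics/update"


def update_url(body: str) -> str:
    """Build an update URL, leaving '*', ':' and '/' literal as in the notes."""
    return f"{SOLR_UPDATE}?stream.body={quote(body, safe='*:/')}"


# Delete every document in the core, then commit the change
delete_all = update_url("<delete><query>*:*</query></delete>")
commit = update_url("<commit/>")
print(delete_all)
print(commit)

# Decoding shows the XML that Solr receives
print(unquote(delete_all.split("stream.body=", 1)[1]))
```

The delete only becomes visible after the commit, which is why the notes issue the two requests in that order. Note that `*:*` matches every document, so this empties the whole core.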
|
||||
<a href='https://alanorth.github.io/cgspace-notes/2018-05/'>Read more →</a>
|
||||
</article>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<nav class="blog-pagination">
|
||||
|
||||
<a class="btn btn-outline-primary" href="/cgspace-notes/categories/notes/page/4/" rel="prev" role="button">Previous page</a>
|
||||
@ -408,6 +412,8 @@ sys 2m7.289s
|
||||
<ol class="list-unstyled">
|
||||
|
||||
|
||||
<li><a href="/cgspace-notes/2022-05/">May, 2022</a></li>
|
||||
|
||||
<li><a href="/cgspace-notes/2022-04/">April, 2022</a></li>
|
||||
|
||||
<li><a href="/cgspace-notes/2022-03/">March, 2022</a></li>
|
||||
@ -416,8 +422,6 @@ sys 2m7.289s
|
||||
|
||||
<li><a href="/cgspace-notes/2022-01/">January, 2022</a></li>
|
||||
|
||||
<li><a href="/cgspace-notes/2021-12/">December, 2021</a></li>
|
||||
|
||||
</ol>
|
||||
</section>
|
||||
|
||||
|
@ -10,14 +10,14 @@
|
||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||
<meta property="og:type" content="website" />
|
||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
|
||||
<meta property="og:updated_time" content="2022-04-27T09:58:45+03:00" />
|
||||
<meta property="og:updated_time" content="2022-05-04T09:13:39+03:00" />
|
||||
|
||||
|
||||
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="Notes"/>
|
||||
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
|
||||
<meta name="generator" content="Hugo 0.97.3" />
|
||||
<meta name="generator" content="Hugo 0.98.0" />
|
||||
|
||||
|
||||
|
||||
@ -81,6 +81,34 @@
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2018-05/">May, 2018</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2018-05-01T16:43:54+03:00">Tue May 01, 2018</time> by Alan Orth in
|
||||
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
|
||||
|
||||
|
||||
</p>
|
||||
</header>
|
||||
<h2 id="2018-05-01">2018-05-01</h2>
|
||||
<ul>
|
||||
<li>I cleared the Solr statistics core on DSpace Test by issuing two commands directly to the Solr admin interface:
|
||||
<ul>
|
||||
<li>http://localhost:3000/solr/statistics/update?stream.body=%3Cdelete%3E%3Cquery%3E*:*%3C/query%3E%3C/delete%3E</li>
|
||||
<li>http://localhost:3000/solr/statistics/update?stream.body=%3Ccommit/%3E</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>Then I reduced the JVM heap size from 6144m back to 5120m</li>
|
||||
<li>Also, I switched it to use OpenJDK instead of Oracle Java, as well as re-worked the <a href="https://github.com/ilri/rmg-ansible-public">Ansible infrastructure scripts</a> to support hosts choosing which distribution they want to use</li>
|
||||
</ul>
|
||||
<a href='https://alanorth.github.io/cgspace-notes/2018-05/'>Read more →</a>
|
||||
</article>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2018-04/">April, 2018</a></h2>
|
||||
@ -358,6 +386,8 @@ COPY 54701
|
||||
<ol class="list-unstyled">
|
||||
|
||||
|
||||
<li><a href="/cgspace-notes/2022-05/">May, 2022</a></li>
|
||||
|
||||
<li><a href="/cgspace-notes/2022-04/">April, 2022</a></li>
|
||||
|
||||
<li><a href="/cgspace-notes/2022-03/">March, 2022</a></li>
|
||||
@ -366,8 +396,6 @@ COPY 54701
|
||||
|
||||
<li><a href="/cgspace-notes/2022-01/">January, 2022</a></li>
|
||||
|
||||
<li><a href="/cgspace-notes/2021-12/">December, 2021</a></li>
|
||||
|
||||
</ol>
|
||||
</section>
|
||||
|
||||
|