Add notes for 2020-01-27

2020-01-27 16:20:44 +02:00
parent 207ace0883
commit 8feb93be39
112 changed files with 11466 additions and 5158 deletions


@ -9,8 +9,8 @@
<meta property="og:description" content="2017-01-02
I checked to see if the Solr sharding task that is supposed to run on January 1st had run and saw there was an error
I tested on DSpace Test as well and it doesn&#39;t work there either
I asked on the dspace-tech mailing list because it seems to be broken, and actually now I&#39;m not sure if we&#39;ve ever had the sharding task run successfully over all these years
I tested on DSpace Test as well and it doesn&rsquo;t work there either
I asked on the dspace-tech mailing list because it seems to be broken, and actually now I&rsquo;m not sure if we&rsquo;ve ever had the sharding task run successfully over all these years
" />
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2017-01/" />
@ -22,10 +22,10 @@ I asked on the dspace-tech mailing list because it seems to be broken, and actua
<meta name="twitter:description" content="2017-01-02
I checked to see if the Solr sharding task that is supposed to run on January 1st had run and saw there was an error
I tested on DSpace Test as well and it doesn&#39;t work there either
I asked on the dspace-tech mailing list because it seems to be broken, and actually now I&#39;m not sure if we&#39;ve ever had the sharding task run successfully over all these years
I tested on DSpace Test as well and it doesn&rsquo;t work there either
I asked on the dspace-tech mailing list because it seems to be broken, and actually now I&rsquo;m not sure if we&rsquo;ve ever had the sharding task run successfully over all these years
"/>
<meta name="generator" content="Hugo 0.62.2" />
<meta name="generator" content="Hugo 0.63.1" />
@ -55,7 +55,7 @@ I asked on the dspace-tech mailing list because it seems to be broken, and actua
<!-- combined, minified CSS -->
<link href="https://alanorth.github.io/cgspace-notes/css/style.a20c1a4367639632cdb341d23c27ca44fedcc75b0f8b3cbea6203010da153d3c.css" rel="stylesheet" integrity="sha256-ogwaQ2djljLNs0HSPCfKRP7cx1sPizy&#43;piAwENoVPTw=" crossorigin="anonymous">
<link href="https://alanorth.github.io/cgspace-notes/css/style.23e2c3298bcc8c1136c19aba330c211ec94c36f7c4454ea15cf4d3548370042a.css" rel="stylesheet" integrity="sha256-I&#43;LDKYvMjBE2wZq6MwwhHslMNvfERU6hXPTTVINwBCo=" crossorigin="anonymous">
<!-- RSS 2.0 feed -->
@ -103,15 +103,15 @@ I asked on the dspace-tech mailing list because it seems to be broken, and actua
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2017-01/">January, 2017</a></h2>
<p class="blog-post-meta"><time datetime="2017-01-02T10:43:00&#43;03:00">Mon Jan 02, 2017</time> by Alan Orth in
<i class="fa fa-tag" aria-hidden="true"></i>&nbsp;<a href="/cgspace-notes/tags/notes" rel="tag">Notes</a>
<span class="fas fa-tag" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/tags/notes" rel="tag">Notes</a>
</p>
</header>
<h2 id="2017-01-02">2017-01-02</h2>
<ul>
<li>I checked to see if the Solr sharding task that is supposed to run on January 1st had run and saw there was an error</li>
<li>I tested on DSpace Test as well and it doesn't work there either</li>
<li>I asked on the dspace-tech mailing list because it seems to be broken, and actually now I'm not sure if we've ever had the sharding task run successfully over all these years</li>
<li>I tested on DSpace Test as well and it doesn&rsquo;t work there either</li>
<li>I asked on the dspace-tech mailing list because it seems to be broken, and actually now I&rsquo;m not sure if we&rsquo;ve ever had the sharding task run successfully over all these years</li>
</ul>
<h2 id="2017-01-04">2017-01-04</h2>
<ul>
@ -186,18 +186,18 @@ Caused by: java.net.SocketException: Broken pipe (Write failed)
</ul>
<h2 id="2017-01-08">2017-01-08</h2>
<ul>
<li>Put Sisay's <code>item-view.xsl</code> code to show mapped collections on CGSpace (<a href="https://github.com/ilri/DSpace/pull/295">#295</a>)</li>
<li>Put Sisay&rsquo;s <code>item-view.xsl</code> code to show mapped collections on CGSpace (<a href="https://github.com/ilri/DSpace/pull/295">#295</a>)</li>
</ul>
<h2 id="2017-01-09">2017-01-09</h2>
<ul>
<li>A user wrote to tell me that the new display of an item's mappings had a crazy bug for at least one item: <a href="https://cgspace.cgiar.org/handle/10568/78596">https://cgspace.cgiar.org/handle/10568/78596</a></li>
<li>A user wrote to tell me that the new display of an item&rsquo;s mappings had a crazy bug for at least one item: <a href="https://cgspace.cgiar.org/handle/10568/78596">https://cgspace.cgiar.org/handle/10568/78596</a></li>
<li>She said she only mapped it once, but it appears to be mapped 184 times</li>
</ul>
<p><img src="/cgspace-notes/2017/01/mapping-crazy-duplicate.png" alt="Crazy item mapping"></p>
<h2 id="2017-01-10">2017-01-10</h2>
<ul>
<li>I tried to clean up the duplicate mappings by exporting the item's metadata to CSV, editing, and re-importing, but DSpace said &ldquo;no changes were detected&rdquo;</li>
<li>I've asked on the dspace-tech mailing list to see if anyone can help</li>
<li>I tried to clean up the duplicate mappings by exporting the item&rsquo;s metadata to CSV, editing, and re-importing, but DSpace said &ldquo;no changes were detected&rdquo;</li>
<li>I&rsquo;ve asked on the dspace-tech mailing list to see if anyone can help</li>
<li>I found an old post on the mailing list discussing a similar issue, and listing some SQL commands that might help</li>
<li>For example, this shows 186 mappings for the item, the first three of which are real:</li>
</ul>
@ -226,7 +226,7 @@ UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 15:
<pre><code>print(&quot;Fixing {} occurences of: {}&quot;.format(records_to_fix, record[0].encode('utf-8')))
</code></pre><ul>
<li>See: <a href="http://stackoverflow.com/a/36427358/487333">http://stackoverflow.com/a/36427358/487333</a></li>
<li>I'm actually not sure if we need to encode() the strings to UTF-8 before writing them to the database&hellip; I've never had this issue before</li>
<li>I&rsquo;m actually not sure if we need to encode() the strings to UTF-8 before writing them to the database&hellip; I&rsquo;ve never had this issue before</li>
<li>Now back to cleaning up some journal titles so we can make the controlled vocabulary:</li>
</ul>
<pre><code>$ ./fix-metadata-values.py -i /tmp/fix-27-journal-titles.csv -f dc.source -t correct -m 55 -d dspace -u dspace -p 'fuuu'
@ -237,7 +237,7 @@ UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 15:
</code></pre><ul>
<li>The values are a bit dirty and outdated, since the file I had given to Abenet and Peter was from November</li>
<li>I will have to go through these and fix some more before making the controlled vocabulary</li>
<li>Added 30 more corrections or so, now there are 49 total and I'll have to get the top 500 after applying them</li>
<li>Added 30 more corrections or so, now there are 49 total and I&rsquo;ll have to get the top 500 after applying them</li>
</ul>
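<p>On the earlier question of whether strings need an explicit encode(): the following is a minimal sketch in Python 3 (an assumption; the <code>u'\xe4'</code> in the error message suggests the original script ran under Python 2). It shows why encoding a character such as ä to ASCII fails while UTF-8 succeeds. The function names and the sample title are hypothetical and are not taken from <code>fix-metadata-values.py</code>.</p>

```python
def to_ascii_or_none(text):
    """Return text encoded as ASCII bytes, or None if it has non-ASCII characters."""
    try:
        return text.encode("ascii")
    except UnicodeEncodeError:
        # Same class of error as in the traceback: ASCII cannot represent U+00E4
        return None


def to_utf8(text):
    """Return text encoded as UTF-8 bytes; this works for any Unicode string."""
    return text.encode("utf-8")


# Hypothetical journal title containing "ä" (U+00E4)
title = "J\u00e4ger Journal"

print(to_ascii_or_none(title))                  # ASCII cannot hold the character
print(to_utf8(title))                           # UTF-8 uses two bytes for it
print(to_utf8(title).decode("utf-8") == title)  # the round trip is lossless
```

<p>In Python 3, <code>str</code> values are already Unicode, so an explicit encode() is typically only needed where text is turned into bytes.</p>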
<h2 id="2017-01-13">2017-01-13</h2>
<ul>
@ -256,12 +256,12 @@ delete from collection2item where id = '91082';
<li>Helping clean up some file names in the 232 CIAT records that Sisay worked on last week</li>
<li>There are about 30 files with <code>%20</code> (space) and Spanish accents in the file name</li>
<li>At first I thought we should fix these, but actually the <a href="https://www.w3.org/TR/html4/appendix/notes.html#h-B.2.1">W3C HTML 4 specification recommends converting these to UTF-8 and URL-encoding them</a>!</li>
<li>And the file names don't really matter either, as long as the SAF Builder tool can read them—after that DSpace renames them with a hash in the assetstore</li>
<li>And the file names don&rsquo;t really matter either, as long as the SAF Builder tool can read them—after that DSpace renames them with a hash in the assetstore</li>
<li>Seems like the only ones I should replace are the <code>'</code> apostrophe characters, as <code>%27</code>:</li>
</ul>
<pre><code>value.replace(&quot;'&quot;,'%27')
</code></pre><ul>
<li>Add the item's Type to the filename column as a hint to SAF Builder so it can set a more useful description field:</li>
<li>Add the item&rsquo;s Type to the filename column as a hint to SAF Builder so it can set a more useful description field:</li>
</ul>
<pre><code>value + &quot;__description:&quot; + cells[&quot;dc.type&quot;].value
</code></pre><ul>
@ -279,18 +279,18 @@ $ gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook -dNOPAUSE -
</ul>
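<p>For the file names discussed earlier (spaces as <code>%20</code>, accented characters, apostrophes as <code>%27</code>), the encoding can be sketched in standalone Python 3 with the standard library. This is an illustration only, not the expression used in the clean-up itself; <code>encode_filename</code> and the sample file name are hypothetical.</p>

```python
from urllib.parse import quote, unquote


def encode_filename(name):
    """Percent-encode a file name using its UTF-8 bytes."""
    # safe="" means reserved characters such as "/" are encoded too;
    # letters, digits and "_.-~" are always left as they are.
    return quote(name, safe="")


# Hypothetical file name with an accent, an apostrophe and spaces
name = "Evaluaci\u00f3n d'impact 2016.pdf"
encoded = encode_filename(name)

print(encoded)
print(unquote(encoded) == name)  # decoding restores the original name
```

<p>Unlike a single <code>replace()</code> for the apostrophe, this encodes every character outside the unreserved set in one pass, which may or may not be what a given import tool expects.</p>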
<h2 id="2017-01-19">2017-01-19</h2>
<ul>
<li>In testing a random sample of CIAT's PDFs for compressibility, it looks like all of these methods generally increase the file size so we will just import them as they are</li>
<li>In testing a random sample of CIAT&rsquo;s PDFs for compressibility, it looks like all of these methods generally increase the file size so we will just import them as they are</li>
<li>Import 232 CIAT records into CGSpace:</li>
</ul>
<pre><code>$ JAVA_OPTS=&quot;-Xmx512m -Dfile.encoding=UTF-8&quot; /home/cgspace.cgiar.org/bin/dspace import --add --eperson=aorth@mjanja.ch --collection=10568/68704 --source /home/aorth/CIAT_232/SimpleArchiveFormat/ --mapfile=/tmp/ciat.map &amp;&gt; /tmp/ciat.log
</code></pre><h2 id="2017-01-22">2017-01-22</h2>
<ul>
<li>Looking at some records that Sisay is having problems importing into DSpace Test (seems to be because of copious whitespace return characters from Excel's CSV exporter)</li>
<li>Looking at some records that Sisay is having problems importing into DSpace Test (seems to be because of copious whitespace return characters from Excel&rsquo;s CSV exporter)</li>
<li>There were also some issues with an invalid dc.date.issued field, and I trimmed leading / trailing whitespace and cleaned up some URLs with unneeded parameters like ?show=full</li>
</ul>
<h2 id="2017-01-23">2017-01-23</h2>
<ul>
<li>I merged Atmire's pull request into the development branch so they can deploy it on DSpace Test</li>
<li>I merged Atmire&rsquo;s pull request into the development branch so they can deploy it on DSpace Test</li>
<li>Move some old ILRI Program communities to a new subcommunity for former programs (10568/79164):</li>
</ul>
<pre><code>$ for community in 10568/171 10568/27868 10568/231 10568/27869 10568/150 10568/230 10568/32724 10568/172; do /home/cgspace.cgiar.org/bin/dspace community-filiator --remove --parent=10568/27866 --child=&quot;$community&quot; &amp;&amp; /home/cgspace.cgiar.org/bin/dspace community-filiator --set --parent=10568/79164 --child=&quot;$community&quot;; done