Add notes for 2020-01-27

This commit is contained in:
2020-01-27 16:20:44 +02:00
parent 207ace0883
commit 8feb93be39
112 changed files with 11466 additions and 5158 deletions

View File

@ -20,7 +20,7 @@ Need to send Peter and Michael some notes about this in a few days
Also, need to consider talking to Atmire about hiring them to bring ORCiD metadata to REST / OAI
Filed an issue on DSpace issue tracker for the filter-media bug that causes it to process JPGs even when limiting to the PDF thumbnail plugin: DS-3516
Discovered that the ImageMagic filter-media plugin creates JPG thumbnails with the CMYK colorspace when the source PDF is using CMYK
Interestingly, it seems DSpace 4.x's thumbnails were sRGB, but forcing regeneration using DSpace 5.x's ImageMagick plugin creates CMYK JPGs if the source PDF was CMYK (see 10568/51999):
Interestingly, it seems DSpace 4.x’s thumbnails were sRGB, but forcing regeneration using DSpace 5.x’s ImageMagick plugin creates CMYK JPGs if the source PDF was CMYK (see 10568/51999):
$ identify ~/Desktop/alc_contrastes_desafios.jpg
/Users/aorth/Desktop/alc_contrastes_desafios.jpg JPEG 464x600 464x600+0+0 8-bit CMYK 168KB 0.000u 0:00.000
@ -46,12 +46,12 @@ Need to send Peter and Michael some notes about this in a few days
Also, need to consider talking to Atmire about hiring them to bring ORCiD metadata to REST / OAI
Filed an issue on DSpace issue tracker for the filter-media bug that causes it to process JPGs even when limiting to the PDF thumbnail plugin: DS-3516
Discovered that the ImageMagic filter-media plugin creates JPG thumbnails with the CMYK colorspace when the source PDF is using CMYK
Interestingly, it seems DSpace 4.x's thumbnails were sRGB, but forcing regeneration using DSpace 5.x's ImageMagick plugin creates CMYK JPGs if the source PDF was CMYK (see 10568/51999):
Interestingly, it seems DSpace 4.x’s thumbnails were sRGB, but forcing regeneration using DSpace 5.x’s ImageMagick plugin creates CMYK JPGs if the source PDF was CMYK (see 10568/51999):
$ identify ~/Desktop/alc_contrastes_desafios.jpg
/Users/aorth/Desktop/alc_contrastes_desafios.jpg JPEG 464x600 464x600+0+0 8-bit CMYK 168KB 0.000u 0:00.000
"/>
<meta name="generator" content="Hugo 0.62.2" />
<meta name="generator" content="Hugo 0.63.1" />
@ -81,7 +81,7 @@ $ identify ~/Desktop/alc_contrastes_desafios.jpg
<!-- combined, minified CSS -->
<link href="https://alanorth.github.io/cgspace-notes/css/style.a20c1a4367639632cdb341d23c27ca44fedcc75b0f8b3cbea6203010da153d3c.css" rel="stylesheet" integrity="sha256-ogwaQ2djljLNs0HSPCfKRP7cx1sPizy&#43;piAwENoVPTw=" crossorigin="anonymous">
<link href="https://alanorth.github.io/cgspace-notes/css/style.23e2c3298bcc8c1136c19aba330c211ec94c36f7c4454ea15cf4d3548370042a.css" rel="stylesheet" integrity="sha256-I&#43;LDKYvMjBE2wZq6MwwhHslMNvfERU6hXPTTVINwBCo=" crossorigin="anonymous">
<!-- RSS 2.0 feed -->
@ -129,7 +129,7 @@ $ identify ~/Desktop/alc_contrastes_desafios.jpg
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2017-03/">March, 2017</a></h2>
<p class="blog-post-meta"><time datetime="2017-03-01T17:08:52&#43;02:00">Wed Mar 01, 2017</time> by Alan Orth in
<i class="fa fa-tag" aria-hidden="true"></i>&nbsp;<a href="/cgspace-notes/tags/notes" rel="tag">Notes</a>
<span class="fas fa-tag" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/tags/notes" rel="tag">Notes</a>
</p>
</header>
@ -147,7 +147,7 @@ $ identify ~/Desktop/alc_contrastes_desafios.jpg
<li>Also, need to consider talking to Atmire about hiring them to bring ORCiD metadata to REST / OAI</li>
<li>Filed an issue on DSpace issue tracker for the <code>filter-media</code> bug that causes it to process JPGs even when limiting to the PDF thumbnail plugin: <a href="https://jira.duraspace.org/browse/DS-3516">DS-3516</a></li>
<li>Discovered that the ImageMagic <code>filter-media</code> plugin creates JPG thumbnails with the CMYK colorspace when the source PDF is using CMYK</li>
<li>Interestingly, it seems DSpace 4.x's thumbnails were sRGB, but forcing regeneration using DSpace 5.x's ImageMagick plugin creates CMYK JPGs if the source PDF was CMYK (see <a href="https://cgspace.cgiar.org/handle/10568/51999">10568/51999</a>):</li>
<li>Interestingly, it seems DSpace 4.x&rsquo;s thumbnails were sRGB, but forcing regeneration using DSpace 5.x&rsquo;s ImageMagick plugin creates CMYK JPGs if the source PDF was CMYK (see <a href="https://cgspace.cgiar.org/handle/10568/51999">10568/51999</a>):</li>
</ul>
<pre><code>$ identify ~/Desktop/alc_contrastes_desafios.jpg
/Users/aorth/Desktop/alc_contrastes_desafios.jpg JPEG 464x600 464x600+0+0 8-bit CMYK 168KB 0.000u 0:00.000
@ -162,7 +162,7 @@ $ identify ~/Desktop/alc_contrastes_desafios.jpg
<h2 id="2017-03-03">2017-03-03</h2>
<ul>
<li>I created a patch for DS-3517 and made a pull request against upstream <code>dspace-5_x</code>: <a href="https://github.com/DSpace/DSpace/pull/1669">https://github.com/DSpace/DSpace/pull/1669</a></li>
<li>Looks like <code>-colorspace sRGB</code> alone isn't enough, we need to use profiles:</li>
<li>Looks like <code>-colorspace sRGB</code> alone isn&rsquo;t enough, we need to use profiles:</li>
</ul>
<pre><code>$ convert alc_contrastes_desafios.pdf\[0\] -profile /opt/brew/Cellar/ghostscript/9.20/share/ghostscript/9.20/iccprofiles/default_cmyk.icc -thumbnail 300x300 -flatten -profile /opt/brew/Cellar/ghostscript/9.20/share/ghostscript/9.20/iccprofiles/default_rgb.icc alc_contrastes_desafios.pdf.jpg
</code></pre><ul>
@ -180,7 +180,7 @@ DirectClass sRGB Alpha
</code></pre><h2 id="2017-03-04">2017-03-04</h2>
<ul>
<li>Spent more time looking at the ImageMagick CMYK issue</li>
<li>The <code>default_cmyk.icc</code> and <code>default_rgb.icc</code> files are both part of the Ghostscript GPL distribution, but according to DSpace's <code>LICENSES_THIRD_PARTY</code> file, DSpace doesn't allow distribution of dependencies that are licensed solely under the GPL</li>
<li>The <code>default_cmyk.icc</code> and <code>default_rgb.icc</code> files are both part of the Ghostscript GPL distribution, but according to DSpace&rsquo;s <code>LICENSES_THIRD_PARTY</code> file, DSpace doesn&rsquo;t allow distribution of dependencies that are licensed solely under the GPL</li>
<li>So this issue is kinda pointless now, as the ICC profiles are absolutely necessary to make a meaningful CMYK→sRGB conversion</li>
</ul>
<h2 id="2017-03-05">2017-03-05</h2>
@ -191,10 +191,10 @@ DirectClass sRGB Alpha
</ul>
<pre><code>$ curl -s -H &quot;accept: application/json&quot; -H &quot;Content-Type: application/json&quot; -X POST &quot;https://dspacetest.cgiar.org/rest/items/find-by-metadata-field&quot; -d '{&quot;key&quot;: &quot;cg.subject.ilri&quot;,&quot;value&quot;: &quot;LAND REFORM&quot;, &quot;language&quot;: null}' | json_pp
</code></pre><ul>
<li>But there are hundreds of combinations of fields and values (like <code>dc.subject</code> and all the center subjects), and we can't use wildcards in REST!</li>
<li>But there are hundreds of combinations of fields and values (like <code>dc.subject</code> and all the center subjects), and we can&rsquo;t use wildcards in REST!</li>
<li>Reading about enabling multiple handle prefixes in DSpace</li>
<li>There is a mailing list thread from 2011 about it: <a href="http://dspace.2283337.n4.nabble.com/Multiple-handle-prefixes-merged-DSpace-instances-td3427192.html">http://dspace.2283337.n4.nabble.com/Multiple-handle-prefixes-merged-DSpace-instances-td3427192.html</a></li>
<li>And a comment from Atmire's Bram about it on the DSpace wiki: <a href="https://wiki.duraspace.org/display/DSDOC5x/Installing+DSpace?focusedCommentId=78163296#comment-78163296">https://wiki.duraspace.org/display/DSDOC5x/Installing+DSpace?focusedCommentId=78163296#comment-78163296</a></li>
<li>And a comment from Atmire&rsquo;s Bram about it on the DSpace wiki: <a href="https://wiki.duraspace.org/display/DSDOC5x/Installing+DSpace?focusedCommentId=78163296#comment-78163296">https://wiki.duraspace.org/display/DSDOC5x/Installing+DSpace?focusedCommentId=78163296#comment-78163296</a></li>
<li>Bram mentions an undocumented configuration option <code>handle.plugin.checknameauthority</code>, but I noticed another one in <code>dspace.cfg</code>:</li>
</ul>
<pre><code># List any additional prefixes that need to be managed by this handle server
@ -202,12 +202,12 @@ DirectClass sRGB Alpha
# that repository)
# handle.additional.prefixes = prefix1[, prefix2]
</code></pre><ul>
<li>Because of this I noticed that our Handle server's <code>config.dct</code> was potentially misconfigured!</li>
<li>Because of this I noticed that our Handle server&rsquo;s <code>config.dct</code> was potentially misconfigured!</li>
<li>We had some default values still present:</li>
</ul>
<pre><code>&quot;300:0.NA/YOUR_NAMING_AUTHORITY&quot;
</code></pre><ul>
<li>I've changed them to the following and restarted the handle server:</li>
<li>I&rsquo;ve changed them to the following and restarted the handle server:</li>
</ul>
<pre><code>&quot;300:0.NA/10568&quot;
</code></pre><ul>
@ -226,7 +226,7 @@ DirectClass sRGB Alpha
</ul>
<h2 id="2017-03-06">2017-03-06</h2>
<ul>
<li>Someone on the mailing list said that <code>handle.plugin.checknameauthority</code> should be false if we're using multiple handle prefixes</li>
<li>Someone on the mailing list said that <code>handle.plugin.checknameauthority</code> should be false if we&rsquo;re using multiple handle prefixes</li>
</ul>
<h2 id="2017-03-07">2017-03-07</h2>
<ul>
@ -240,14 +240,14 @@ DirectClass sRGB Alpha
</ul>
</li>
<li>I need to talk to Michael and Peter to share the news, and discuss the structure of their community(s) and try some actual test data</li>
<li>We'll need to do some data cleaning to make sure they are using the same fields we are, like <code>dc.type</code> and <code>cg.identifier.status</code></li>
<li>We&rsquo;ll need to do some data cleaning to make sure they are using the same fields we are, like <code>dc.type</code> and <code>cg.identifier.status</code></li>
<li>Another thing is that the import process creates new <code>dc.date.accessioned</code> and <code>dc.date.available</code> fields, so we end up with duplicates (is it important to preserve the originals for these?)</li>
<li>Report DS-3520 issue to Atmire</li>
</ul>
<h2 id="2017-03-08">2017-03-08</h2>
<ul>
<li>Merge the author separator changes to <code>5_x-prod</code>, as everyone has responded positively about it, and it's the default in Mirage2 afterall!</li>
<li>Cherry pick the <code>commons-collections</code> patch from DSpace's <code>dspace-5_x</code> branch to address DS-3520: <a href="https://jira.duraspace.org/browse/DS-3520">https://jira.duraspace.org/browse/DS-3520</a></li>
<li>Merge the author separator changes to <code>5_x-prod</code>, as everyone has responded positively about it, and it&rsquo;s the default in Mirage2 afterall!</li>
<li>Cherry pick the <code>commons-collections</code> patch from DSpace&rsquo;s <code>dspace-5_x</code> branch to address DS-3520: <a href="https://jira.duraspace.org/browse/DS-3520">https://jira.duraspace.org/browse/DS-3520</a></li>
</ul>
<h2 id="2017-03-09">2017-03-09</h2>
<ul>
@ -267,7 +267,7 @@ $ ./delete-metadata-values.py -i Investors-Delete-121.csv -f dc.description.spon
<pre><code>dspace=# \copy (select distinct text_value from metadatavalue where resource_type_id=2 and metadata_field_id IN (select metadata_field_id from metadatafieldregistry where element = 'description' and qualifier = 'sponsorship')) to /tmp/sponsorship.csv with csv;
</code></pre><ul>
<li>Pull request for controlled vocabulary if Peter approves: <a href="https://github.com/ilri/DSpace/pull/308">https://github.com/ilri/DSpace/pull/308</a></li>
<li>Review Sisay's roots, tubers, and bananas (RTB) theme, which still needs some fixes to work properly: <a href="https://github.com/ilri/DSpace/pull/307">https://github.com/ilri/DSpace/pull/307</a></li>
<li>Review Sisay&rsquo;s roots, tubers, and bananas (RTB) theme, which still needs some fixes to work properly: <a href="https://github.com/ilri/DSpace/pull/307">https://github.com/ilri/DSpace/pull/307</a></li>
<li>Created an issue to track the progress on the Livestock CRP theme: <a href="https://github.com/ilri/DSpace/issues/309">https://github.com/ilri/DSpace/issues/309</a></li>
<li>Created a basic theme for the Livestock CRP community</li>
</ul>
@ -306,13 +306,13 @@ $ ./delete-metadata-values.py -i Investors-Delete-121.csv -f dc.description.spon
</ul>
<pre><code>$ ./fix-metadata-values.py -i ccafs-flagships-feb7.csv -f cg.subject.ccafs -t correct -m 210 -d dspace -u dspace -p fuuu
</code></pre><ul>
<li>We've been waiting since February to run these</li>
<li>We&rsquo;ve been waiting since February to run these</li>
<li>Also, I generated a list of all CCAFS flagships because there are a dozen or so more than there should be:</li>
</ul>
<pre><code>dspace=# \copy (select distinct text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=210 group by text_value order by count desc) to /tmp/ccafs.csv with csv;
</code></pre><ul>
<li>I sent a list to CCAFS people so they can tell me if some should be deleted or moved, etc</li>
<li>Test, squash, and merge Sisay's RTB theme into <code>5_x-prod</code>: <a href="https://github.com/ilri/DSpace/pull/316">https://github.com/ilri/DSpace/pull/316</a></li>
<li>Test, squash, and merge Sisay&rsquo;s RTB theme into <code>5_x-prod</code>: <a href="https://github.com/ilri/DSpace/pull/316">https://github.com/ilri/DSpace/pull/316</a></li>
</ul>
<h2 id="2017-03-29">2017-03-29</h2>
<ul>