Add notes for 2017-02-28

2017-02-28 18:57:31 +02:00
parent ff4dca769e
commit a3f0d88945
5 changed files with 107 additions and 1 deletions


@@ -90,7 +90,7 @@ Looks like we’ll be using cg.identifier.ccafsprojectpii as the field name
"headline": "February, 2017",
"url": "https://alanorth.github.io/cgspace-notes/2017-02/",
"wordCount": "1862",
"wordCount": "2019",
"datePublished": "2017-02-07T07:04:52-08:00",
@@ -498,11 +498,33 @@ Certificate chain
<li>Regarding the <code>filter-media</code> issue I found earlier, it seems that the ImageMagick PDF plugin will also process JPGs if they are in the &ldquo;Content Files&rdquo; (aka <code>ORIGINAL</code>) bundle</li>
<li>The problem likely lies in the logic of <code>ImageMagickThumbnailFilter.java</code>, as <code>ImageMagickPdfThumbnailFilter.java</code> extends it</li>
<li>Run CIAT corrections on CGSpace</li>
</ul>
<pre><code>dspace=# update metadatavalue set authority='3026b1de-9302-4f3e-85ab-ef48da024eb2', confidence=600 where resource_type_id=2 and metadata_field_id=3 and text_value = 'International Center for Tropical Agriculture';
</code></pre>
<ul>
<li>CGNET has fixed the certificate chain on their LDAP server</li>
<li>Redeploy CGSpace and DSpace Test on the latest <code>5_x-prod</code> branch with fixes for LDAP bind user</li>
<li>Run all system updates on CGSpace server and reboot</li>
</ul>
<h2 id="2017-02-28">2017-02-28</h2>
<ul>
<li>After running the CIAT corrections and updating the Discovery and authority indexes, there is still no change in the number of items listed for CIAT in Discovery</li>
<li>Ah, this is probably because some items have the <code>International Center for Tropical Agriculture</code> author twice, which I first noticed in 2016-12 but couldn&rsquo;t figure out how to fix</li>
<li>I think I can do it by first exporting all metadatavalues that have the author <code>International Center for Tropical Agriculture</code></li>
</ul>
<pre><code>dspace=# \copy (select resource_id, metadata_value_id from metadatavalue where resource_type_id=2 and metadata_field_id=3 and text_value='International Center for Tropical Agriculture') to /tmp/ciat.csv with csv;
COPY 1968
</code></pre>
<ul>
<li>And then using awk or uniq to either remove or print the lines that have a duplicate <code>resource_id</code> (meaning they belong to the same item in DSpace and are therefore duplicates), and then using the <code>metadata_value_id</code> to delete them</li>
</ul>
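<ul>
<li>A minimal sketch of the awk approach, assuming the two-column layout (<code>resource_id,metadata_value_id</code>) produced by the <code>\copy</code> above; the function name <code>duplicate_value_ids</code> is hypothetical and only for illustration. It prints the <code>metadata_value_id</code> of every second or later row for a given <code>resource_id</code>, so the first row seen for each item is kept</li>
</ul>

```shell
# Hypothetical helper, not part of DSpace: list the metadata_value_id of each
# row whose resource_id has already appeared earlier in the CSV file.
# Assumes unquoted integer columns: resource_id,metadata_value_id
duplicate_value_ids() {
    # seen[$1]++ is 0 (false) the first time a resource_id appears and
    # non-zero (true) on every later occurrence, so only repeats are printed.
    awk -F, 'seen[$1]++ { print $2 }' "$1"
}
```

<ul>
<li>Running something like <code>duplicate_value_ids /tmp/ciat.csv</code> would give a list of IDs to review before deleting the matching rows from <code>metadatavalue</code>. Which row counts as the &ldquo;first&rdquo; depends on the file order, and the export query has no <code>ORDER BY</code>, so that choice is arbitrary</li>
</ul>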