<li>Switch CGSpace (linode18) to use OpenJDK instead of Oracle JDK</li>
<li>I manually installed OpenJDK, then removed Oracle JDK, then re-ran the <a href="http://github.com/ilri/rmg-ansible-public">Ansible playbook</a> to update all configuration files, etc.</li>
<li>Then I ran all system updates and restarted the server</li>
</ul>
<h2 id="2018-12-02">2018-12-02</h2>
<ul>
<li>I noticed that there is another issue with PDF thumbnails on CGSpace, and I see there was another <a href="https://usn.ubuntu.com/3831-1/">Ghostscript vulnerability last week</a></li>
</ul>
<ul>
<li>The error when I try to manually run the media filter for one item from the command line:</li>
</ul>
<pre><code>at org.im4java.core.Info.getBaseInfo(Info.java:360)
at org.im4java.core.Info.&lt;init&gt;(Info.java:151)
at org.dspace.app.mediafilter.ImageMagickThumbnailFilter.getImageFile(ImageMagickThumbnailFilter.java:142)
at org.dspace.app.mediafilter.ImageMagickPdfThumbnailFilter.getDestinationStream(ImageMagickPdfThumbnailFilter.java:24)
at org.dspace.app.mediafilter.FormatFilter.processBitstream(FormatFilter.java:170)
at org.dspace.app.mediafilter.MediaFilterManager.filterBitstream(MediaFilterManager.java:475)
at org.dspace.app.mediafilter.MediaFilterManager.filterItem(MediaFilterManager.java:429)
at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersItem(MediaFilterManager.java:401)
at org.dspace.app.mediafilter.MediaFilterManager.main(MediaFilterManager.java:237)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:226)
at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:78)
</code></pre>
<ul>
<li>A comment on a <a href="https://stackoverflow.com/questions/53560755/ghostscript-9-26-update-breaks-imagick-readimage-for-multipage-pdf">Stack Overflow question</a> from yesterday suggests it might be a bug with the <code>pngalpha</code> device in Ghostscript and <a href="https://bugs.ghostscript.com/show_bug.cgi?id=699815">links to an upstream bug</a></li>
<li>I think we need to wait for a fix from Ubuntu</li>
<li>Start proofing the latest round of 226 IITA archive records that Bosede sent last week and Sisay uploaded to DSpace Test this weekend (<a href="https://dspacetest.cgiar.org/handle/10568/108298">IITA_Dec_1_1997 aka Daniel1807</a>)
<ul>
<li>One item missing the authorship type</li>
<li>Some invalid countries (smart quotes, misspellings)</li>
<li>Added countries to some items that mentioned research in particular countries in their abstracts</li>
<li>One item had “MADAGASCAR” for ISI Journal</li>
<li>Minor corrections in IITA subject (LIVELIHOOD→LIVELIHOODS)</li>
<li>Trim whitespace in abstract field</li>
<li>Fix some sponsors (though I’m not sure why some, such as “Governments of Canada”, are plural)</li>
<li>Eighteen items had <code>en||fr</code> for the language, but the content was only in French so changed them to just <code>fr</code></li>
<li>Six items had encoding errors in French text so I will ask Bosede to re-do them carefully</li>
<li>Correct and normalize a few AGROVOC subjects</li>
</ul></li>
<li>Expand my “encoding error” detection GREL to include <code>~</code> as I saw a lot of that in some copy-pasted French text recently:</li>
<li>I looked at the DSpace Ghostscript issue more and it seems to only affect certain PDFs…</li>
<li>I can successfully generate a thumbnail for another recent item (<a href="https://hdl.handle.net/10568/98394"><sup>10568</sup>⁄<sub>98394</sub></a>), but not for <a href="https://hdl.handle.net/10568/98390"><sup>10568</sup>⁄<sub>98930</sub></a></li>
<li>Even manually on my Arch Linux desktop with ghostscript 9.26-1 and the <code>pngalpha</code> device, I can generate a thumbnail for the first one (<sup>10568</sup>⁄<sub>98394</sub>):</li>
</ul>
<pre><code>Info Note Mainstreaming gender and social differentiation into CCAFS research activities in West Africa-converted.pdf[0]=&gt;Info Note Mainstreaming gender and social differentiation into CCAFS research activities in West Africa-converted.pdf PDF 595x841 595x841+0+0 16-bit sRGB 107443B 0.000u 0:00.000
</code></pre>
<ul>
<li>And wow, I can’t even run ImageMagick’s <code>identify</code> on the first page of the second item (<sup>10568</sup>⁄<sub>98930</sub>):</li>
</ul>
<pre><code>Food safety Kenya fruits.pdf PDF 612x792+0+0 DirectClass 8-bit 1.4Mi 0.000u 0m:0.000002s
</code></pre>
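<p>As an aside on the “encoding error” check mentioned earlier: the idea of flagging values that contain suspicious characters can be sketched outside OpenRefine as well. This is only a rough Python illustration, not the GREL expression itself. Only <code>~</code> comes from the notes above; the other characters in the set, and the names <code>has_encoding_error</code> and <code>flag_records</code>, are assumptions made for the example.</p>

```python
import re

# Characters treated as suspicious. Only "~" comes from the notes above;
# U+FFFD (replacement character) and U+00A0 (no-break space) are
# illustrative assumptions, not the author's actual list.
SUSPICIOUS = re.compile("[\uFFFD\u00A0~]")


def has_encoding_error(value):
    """Return True if the text contains any suspicious character."""
    return bool(SUSPICIOUS.search(value or ""))


def flag_records(records, field="abstract"):
    """Return the records whose given field looks mis-encoded."""
    return [r for r in records if has_encoding_error(r.get(field))]
```

<p>A record whose abstract contains a stray <code>~</code> would be returned by <code>flag_records</code> for manual review; a legitimate tilde would be flagged too, so the result is a list of candidates rather than confirmed errors.</p>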
<ul>
<li>Interesting that ImageMagick’s <code>identify</code> <em>does</em> work if you do not specify a page, perhaps as <a href="https://bugs.ghostscript.com/show_bug.cgi?id=699815">alluded to in the recent Ghostscript bug report</a>:</li>
<li>I inspected the troublesome PDF using <a href="http://jhove.openpreservation.org/">jhove</a> and noticed that it is using <code>ISO PDF/A-1, Level B</code> and the other one doesn’t list a profile, though I don’t think this is relevant</li>