Add notes for 2022-10-28

2022-10-28 13:17:35 +03:00
parent 189f33e1ce
commit 3633377854
29 changed files with 205 additions and 34 deletions

@ -20,7 +20,7 @@ I filed an issue to ask about Java 11+ support
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2022-10/" />
<meta property="article:published_time" content="2022-10-01T19:45:36+03:00" />
<meta property="article:modified_time" content="2022-10-26T09:15:29+03:00" />
<meta property="article:modified_time" content="2022-10-26T17:50:40+03:00" />
@ -46,9 +46,9 @@ I filed an issue to ask about Java 11&#43; support
"@type": "BlogPosting",
"headline": "October, 2022",
"url": "https://alanorth.github.io/cgspace-notes/2022-10/",
"wordCount": "3320",
"wordCount": "3650",
"datePublished": "2022-10-01T19:45:36+03:00",
"dateModified": "2022-10-26T09:15:29+03:00",
"dateModified": "2022-10-26T17:50:40+03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@ -849,6 +849,92 @@ I filed an issue to ask about Java 11&#43; support
</ul>
</li>
</ul>
<h2 id="2022-10-27">2022-10-27</h2>
<ul>
<li>I found out that we can use <a href="https://pdfcpu.io/boxes/boxes_remove.html#examples">pdfcpu to remove the CropBox from a PDF</a> for testing:</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ pdfcpu box rem -- <span style="color:#e6db74">&#34;crop&#34;</span> in.pdf out.pdf
</span></span></code></pre></div><ul>
<li>I filed <a href="https://github.com/DSpace/DSpace/issues/8549">an issue on DSpace</a> for the ImageMagick <code>CropBox</code> problem
<ul>
<li>I decided that this is a bug that should be fixed separately from the &ldquo;improving thumbnail quality&rdquo; issue</li>
<li>I made <a href="https://github.com/DSpace/DSpace/pull/8550">a pull request</a> to fix the <code>CropBox</code> issue</li>
</ul>
</li>
<li>I did more work on my <a href="https://github.com/alanorth/improved-dspace-thumbnails/">improved-dspace-thumbnails</a> microsite to complement the DSpace thumbnail pull requests
<ul>
<li>I am updating it to recommend using the PDF <code>CropBox</code> and &ldquo;supersampling&rdquo; with a density higher than 72</li>
<li>I measured execution time of ImageMagick with <code>time</code> and found that the higher-density mode takes about five times longer on average</li>
<li>I measured the <a href="https://stackoverflow.com/a/131346">maximum heap memory of ImageMagick with Valgrind and Massif</a>:</li>
</ul>
</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ valgrind --tool<span style="color:#f92672">=</span>massif magick convert ...
</span></span></code></pre></div><ul>
<li>Then I checked the results for each set of default DSpace thumbnail runs and &ldquo;improved&rdquo; thumbnail runs using <code>ms_print</code> (hacky way to get the max heap, I know):</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ <span style="color:#66d9ef">for</span> file in memory-dspace/massif.out.49*; <span style="color:#66d9ef">do</span> ms_print <span style="color:#e6db74">&#34;</span>$file<span style="color:#e6db74">&#34;</span> | grep -A1 <span style="color:#e6db74">&#34; MB&#34;</span> | tail -n1 | sed <span style="color:#e6db74">&#39;s/\^.*//&#39;</span>; <span style="color:#66d9ef">done</span>
</span></span><span style="display:flex;"><span>15.87
</span></span><span style="display:flex;"><span>16.06
</span></span><span style="display:flex;"><span>21.26
</span></span><span style="display:flex;"><span>15.88
</span></span><span style="display:flex;"><span>20.01
</span></span><span style="display:flex;"><span>15.85
</span></span><span style="display:flex;"><span>20.06
</span></span><span style="display:flex;"><span>16.04
</span></span><span style="display:flex;"><span>15.87
</span></span><span style="display:flex;"><span>15.87
</span></span><span style="display:flex;"><span>20.02
</span></span><span style="display:flex;"><span>15.87
</span></span><span style="display:flex;"><span>15.86
</span></span><span style="display:flex;"><span>19.92
</span></span><span style="display:flex;"><span>10.89
</span></span><span style="display:flex;"><span>$ <span style="color:#66d9ef">for</span> file in memory-improved/massif.out.5*; <span style="color:#66d9ef">do</span> ms_print <span style="color:#e6db74">&#34;</span>$file<span style="color:#e6db74">&#34;</span> | grep -A1 <span style="color:#e6db74">&#34; MB&#34;</span> | tail -n1 | sed <span style="color:#e6db74">&#39;s/\^.*//&#39;</span>; <span style="color:#66d9ef">done</span>
</span></span><span style="display:flex;"><span>245.3
</span></span><span style="display:flex;"><span>245.5
</span></span><span style="display:flex;"><span>298.6
</span></span><span style="display:flex;"><span>245.3
</span></span><span style="display:flex;"><span>306.8
</span></span><span style="display:flex;"><span>245.2
</span></span><span style="display:flex;"><span>306.9
</span></span><span style="display:flex;"><span>245.5
</span></span><span style="display:flex;"><span>245.2
</span></span><span style="display:flex;"><span>245.3
</span></span><span style="display:flex;"><span>306.8
</span></span><span style="display:flex;"><span>245.3
</span></span><span style="display:flex;"><span>244.9
</span></span><span style="display:flex;"><span>306.3
</span></span><span style="display:flex;"><span>165.6
</span></span></code></pre></div><ul>
<li>Ouch, this shows that ImageMagick uses about <em>fifteen times</em> more memory at the &ldquo;4x&rdquo; density of 288!
<ul>
<li>It seems more reasonable to use a &ldquo;2x&rdquo; density of 144:</li>
</ul>
</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ <span style="color:#66d9ef">for</span> file in memory-improved-144/*; <span style="color:#66d9ef">do</span> ms_print <span style="color:#e6db74">&#34;</span>$file<span style="color:#e6db74">&#34;</span> | grep -A1 <span style="color:#e6db74">&#34; MB&#34;</span> | tail -n1 | sed <span style="color:#e6db74">&#39;s/\^.*//&#39;</span>; <span style="color:#66d9ef">done</span>
</span></span><span style="display:flex;"><span>61.80
</span></span><span style="display:flex;"><span>62.00
</span></span><span style="display:flex;"><span>76.76
</span></span><span style="display:flex;"><span>61.82
</span></span><span style="display:flex;"><span>77.43
</span></span><span style="display:flex;"><span>61.77
</span></span><span style="display:flex;"><span>77.48
</span></span><span style="display:flex;"><span>61.98
</span></span><span style="display:flex;"><span>61.76
</span></span><span style="display:flex;"><span>61.81
</span></span><span style="display:flex;"><span>77.44
</span></span><span style="display:flex;"><span>61.81
</span></span><span style="display:flex;"><span>61.69
</span></span><span style="display:flex;"><span>77.16
</span></span><span style="display:flex;"><span>41.84
</span></span></code></pre></div><ul>
<li>There&rsquo;s a really cool visualizer called massif-visualizer, but its output isn&rsquo;t easy to interpret</li>
</ul>
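<ul>
<li>To sanity check the &ldquo;fifteen times&rdquo; estimate, here is a minimal Python sketch that averages the three sets of peak heap values listed above and compares each improved set to the default. It assumes every value is in MB (as the <code>grep " MB"</code> filter suggests), and the list and function names are made up for illustration:</li>
</ul>

```python
from statistics import mean

# Peak heap values copied from the ms_print output above.
# Assumption: all of them are in MB, since the pipeline greps for " MB".
DEFAULT_72 = [15.87, 16.06, 21.26, 15.88, 20.01, 15.85, 20.06, 16.04,
              15.87, 15.87, 20.02, 15.87, 15.86, 19.92, 10.89]
IMPROVED_288 = [245.3, 245.5, 298.6, 245.3, 306.8, 245.2, 306.9, 245.5,
                245.2, 245.3, 306.8, 245.3, 244.9, 306.3, 165.6]
IMPROVED_144 = [61.80, 62.00, 76.76, 61.82, 77.43, 61.77, 77.48, 61.98,
                61.76, 61.81, 77.44, 61.81, 61.69, 77.16, 41.84]


def ratio_to_default(values, baseline=DEFAULT_72):
    """Return the mean of `values` divided by the mean of `baseline`."""
    return mean(values) / mean(baseline)


if __name__ == "__main__":
    for label, values in (("288", IMPROVED_288), ("144", IMPROVED_144)):
        print(f"density {label}: mean {mean(values):.1f} MB, "
              f"{ratio_to_default(values):.1f}x the default")
```

<ul>
<li>With these numbers the mean for density 288 comes out at roughly fifteen times the default, and the mean for density 144 at just under four times</li>
</ul>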
<h2 id="2022-10-28">2022-10-28</h2>
<ul>
<li>I finalized the code for the ImageMagick density change and made a <a href="https://github.com/DSpace/DSpace/pull/8553">pull request</a> against DSpace 7.x</li>
</ul>