Add notes for 2022-03-04

2022-03-04 15:30:06 +03:00
parent 7453499827
commit 27acbac859
115 changed files with 6550 additions and 6444 deletions
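
The hunks shown in this diff appear to differ mainly in how quotes are escaped in the generated HTML: `&quot;` becomes `&#34;` and a literal `'` becomes `&#39;`, alongside the generator tag moving from Hugo 0.92.2 to 0.93.1. Below is a minimal Python sketch, assuming only the standard-library `html` module, for checking that such old/new pairs decode to the same text. The `same_after_unescape` helper and the sample pairs are illustrative and not part of the site's tooling.

```python
import html

# Illustrative (old, new) fragments modelled on the changed lines in the hunks.
PAIRS = [
    ("&quot;gs&quot; -q -dQUIET", "&#34;gs&#34; -q -dQUIET"),
    ("DEBUG: FC_WEIGHT didn't match", "DEBUG: FC_WEIGHT didn&#39;t match"),
]


def same_after_unescape(old: str, new: str) -> bool:
    """Return True if both HTML fragments decode to identical text."""
    return html.unescape(old) == html.unescape(new)


if __name__ == "__main__":
    for old, new in PAIRS:
        # Print each comparison so escaping-only changes are easy to spot.
        print(same_after_unescape(old, new), "|", new)
```

If every changed pair in a hunk passes this check, the change is likely rendering churn rather than an edit to the notes themselves.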


@ -36,7 +36,7 @@ Then I ran all system updates and restarted the server
I noticed that there is another issue with PDF thumbnails on CGSpace, and I see there was another Ghostscript vulnerability last week
"/>
<meta name="generator" content="Hugo 0.92.2" />
<meta name="generator" content="Hugo 0.93.1" />
@ -135,8 +135,8 @@ I noticed that there is another issue with PDF thumbnails on CGSpace, and I see
<ul>
<li>The error when I try to manually run the media filter for one item from the command line:</li>
</ul>
<pre tabindex="0"><code>org.im4java.core.InfoException: org.im4java.core.CommandException: org.im4java.core.CommandException: identify: FailedToExecuteCommand `&quot;gs&quot; -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 &quot;-sDEVICE=pngalpha&quot; -dTextAlphaBits=4 -dGraphicsAlphaBits=4 &quot;-r72x72&quot; -dFirstPage=1 -dLastPage=1 &quot;-sOutputFile=/tmp/magick-12989PcFN0DnJOej7%d&quot; &quot;-f/tmp/magick-129895Bmp44lvUfxo&quot; &quot;-f/tmp/magick-12989C0QFG51fktLF&quot;' (-1) @ error/delegate.c/ExternalDelegateCommand/461.
org.im4java.core.InfoException: org.im4java.core.CommandException: org.im4java.core.CommandException: identify: FailedToExecuteCommand `&quot;gs&quot; -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 &quot;-sDEVICE=pngalpha&quot; -dTextAlphaBits=4 -dGraphicsAlphaBits=4 &quot;-r72x72&quot; -dFirstPage=1 -dLastPage=1 &quot;-sOutputFile=/tmp/magick-12989PcFN0DnJOej7%d&quot; &quot;-f/tmp/magick-129895Bmp44lvUfxo&quot; &quot;-f/tmp/magick-12989C0QFG51fktLF&quot;' (-1) @ error/delegate.c/ExternalDelegateCommand/461.
<pre tabindex="0"><code>org.im4java.core.InfoException: org.im4java.core.CommandException: org.im4java.core.CommandException: identify: FailedToExecuteCommand `&#34;gs&#34; -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 &#34;-sDEVICE=pngalpha&#34; -dTextAlphaBits=4 -dGraphicsAlphaBits=4 &#34;-r72x72&#34; -dFirstPage=1 -dLastPage=1 &#34;-sOutputFile=/tmp/magick-12989PcFN0DnJOej7%d&#34; &#34;-f/tmp/magick-129895Bmp44lvUfxo&#34; &#34;-f/tmp/magick-12989C0QFG51fktLF&#34;&#39; (-1) @ error/delegate.c/ExternalDelegateCommand/461.
org.im4java.core.InfoException: org.im4java.core.CommandException: org.im4java.core.CommandException: identify: FailedToExecuteCommand `&#34;gs&#34; -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 &#34;-sDEVICE=pngalpha&#34; -dTextAlphaBits=4 -dGraphicsAlphaBits=4 &#34;-r72x72&#34; -dFirstPage=1 -dLastPage=1 &#34;-sOutputFile=/tmp/magick-12989PcFN0DnJOej7%d&#34; &#34;-f/tmp/magick-129895Bmp44lvUfxo&#34; &#34;-f/tmp/magick-12989C0QFG51fktLF&#34;&#39; (-1) @ error/delegate.c/ExternalDelegateCommand/461.
at org.im4java.core.Info.getBaseInfo(Info.java:360)
at org.im4java.core.Info.&lt;init&gt;(Info.java:151)
at org.dspace.app.mediafilter.ImageMagickThumbnailFilter.getImageFile(ImageMagickThumbnailFilter.java:142)
@ -158,13 +158,13 @@ org.im4java.core.InfoException: org.im4java.core.CommandException: org.im4java.c
<li>For what it&rsquo;s worth, I get the same error on my local Arch Linux environment with Ghostscript 9.26:</li>
</ul>
<pre tabindex="0"><code>$ gs -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 -sDEVICE=pngalpha -dTextAlphaBits=4 -dGraphicsAlphaBits=4 -r72x72 -dFirstPage=1 -dLastPage=1 -sOutputFile=/tmp/out%d -f/home/aorth/Desktop/Food\ safety\ Kenya\ fruits.pdf
DEBUG: FC_WEIGHT didn't match
DEBUG: FC_WEIGHT didn&#39;t match
zsh: segmentation fault (core dumped) gs -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000
</code></pre><ul>
<li>When I replace the <code>pngalpha</code> device with <code>png16m</code> as suggested in the StackOverflow comments it works:</li>
</ul>
<pre tabindex="0"><code>$ gs -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 -sDEVICE=png16m -dTextAlphaBits=4 -dGraphicsAlphaBits=4 -r72x72 -dFirstPage=1 -dLastPage=1 -sOutputFile=/tmp/out%d -f/home/aorth/Desktop/Food\ safety\ Kenya\ fruits.pdf
DEBUG: FC_WEIGHT didn't match
DEBUG: FC_WEIGHT didn&#39;t match
</code></pre><ul>
<li>Start proofing the latest round of 226 IITA archive records that Bosede sent last week and Sisay uploaded to DSpace Test this weekend (<a href="https://dspacetest.cgiar.org/handle/10568/108298">IITA_Dec_1_1997 aka Daniel1807</a>)
<ul>
@ -203,7 +203,7 @@ DEBUG: FC_WEIGHT didn't match
</ul>
<pre tabindex="0"><code>$ identify Info\ Note\ Mainstreaming\ gender\ and\ social\ differentiation\ into\ CCAFS\ research\ activities\ in\ West\ Africa-converted.pdf\[0\]
Info Note Mainstreaming gender and social differentiation into CCAFS research activities in West Africa-converted.pdf[0]=&gt;Info Note Mainstreaming gender and social differentiation into CCAFS research activities in West Africa-converted.pdf PDF 595x841 595x841+0+0 16-bit sRGB 107443B 0.000u 0:00.000
identify: CorruptImageProfile `xmp' @ warning/profile.c/SetImageProfileInternal/1746.
identify: CorruptImageProfile `xmp&#39; @ warning/profile.c/SetImageProfileInternal/1746.
</code></pre><ul>
<li>And wow, I can&rsquo;t even run ImageMagick&rsquo;s <code>identify</code> on the first page of the second item (10568/98930):</li>
</ul>
@ -213,7 +213,7 @@ zsh: abort (core dumped) identify Food\ safety\ Kenya\ fruits.pdf\[0\]
<li>But with GraphicsMagick&rsquo;s <code>identify</code> it works:</li>
</ul>
<pre tabindex="0"><code>$ gm identify Food\ safety\ Kenya\ fruits.pdf\[0\]
DEBUG: FC_WEIGHT didn't match
DEBUG: FC_WEIGHT didn&#39;t match
Food safety Kenya fruits.pdf PDF 612x792+0+0 DirectClass 8-bit 1.4Mi 0.000u 0m:0.000002s
</code></pre><ul>
<li>Interesting that ImageMagick&rsquo;s <code>identify</code> <em>does</em> work if you do not specify a page, perhaps as <a href="https://bugs.ghostscript.com/show_bug.cgi?id=699815">alluded to in the recent Ghostscript bug report</a>:</li>
@ -224,20 +224,20 @@ Food safety Kenya fruits.pdf[1] PDF 612x792 612x792+0+0 16-bit sRGB 64626B 0.010
Food safety Kenya fruits.pdf[2] PDF 612x792 612x792+0+0 16-bit sRGB 64626B 0.010u 0:00.009
Food safety Kenya fruits.pdf[3] PDF 612x792 612x792+0+0 16-bit sRGB 64626B 0.010u 0:00.009
Food safety Kenya fruits.pdf[4] PDF 612x792 612x792+0+0 16-bit sRGB 64626B 0.010u 0:00.009
identify: CorruptImageProfile `xmp' @ warning/profile.c/SetImageProfileInternal/1746.
identify: CorruptImageProfile `xmp&#39; @ warning/profile.c/SetImageProfileInternal/1746.
</code></pre><ul>
<li>As I expected, ImageMagick cannot generate a thumbnail, but GraphicsMagick can (though it looks like crap):</li>
</ul>
<pre tabindex="0"><code>$ convert Food\ safety\ Kenya\ fruits.pdf\[0\] -thumbnail 600x600 -flatten Food\ safety\ Kenya\ fruits.pdf.jpg
zsh: abort (core dumped) convert Food\ safety\ Kenya\ fruits.pdf\[0\] -thumbnail 600x600 -flatten
$ gm convert Food\ safety\ Kenya\ fruits.pdf\[0\] -thumbnail 600x600 -flatten Food\ safety\ Kenya\ fruits.pdf.jpg
DEBUG: FC_WEIGHT didn't match
DEBUG: FC_WEIGHT didn&#39;t match
</code></pre><ul>
<li>I inspected the troublesome PDF using <a href="http://jhove.openpreservation.org/">jhove</a> and noticed that it is using <code>ISO PDF/A-1, Level B</code> and the other one doesn&rsquo;t list a profile, though I don&rsquo;t think this is relevant</li>
<li>I found another item that fails when generating a thumbnail (<a href="https://hdl.handle.net/10568/98391">10568/98391</a>, DSpace complains:</li>
</ul>
<pre tabindex="0"><code>org.im4java.core.InfoException: org.im4java.core.CommandException: org.im4java.core.CommandException: identify: FailedToExecuteCommand `&quot;gs&quot; -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 &quot;-sDEVICE=pngalpha&quot; -dTextAlphaBits=4 -dGraphicsAlphaBits=4 &quot;-r72x72&quot; -dFirstPage=1 -dLastPage=1 &quot;-sOutputFile=/tmp/magick-142966vQs5Di64ntH%d&quot; &quot;-f/tmp/magick-14296Q0rJjfCeIj3w&quot; &quot;-f/tmp/magick-14296k_K6MWqwvpDm&quot;' (-1) @ error/delegate.c/ExternalDelegateCommand/461.
org.im4java.core.InfoException: org.im4java.core.CommandException: org.im4java.core.CommandException: identify: FailedToExecuteCommand `&quot;gs&quot; -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 &quot;-sDEVICE=pngalpha&quot; -dTextAlphaBits=4 -dGraphicsAlphaBits=4 &quot;-r72x72&quot; -dFirstPage=1 -dLastPage=1 &quot;-sOutputFile=/tmp/magick-142966vQs5Di64ntH%d&quot; &quot;-f/tmp/magick-14296Q0rJjfCeIj3w&quot; &quot;-f/tmp/magick-14296k_K6MWqwvpDm&quot;' (-1) @ error/delegate.c/ExternalDelegateCommand/461.
<pre tabindex="0"><code>org.im4java.core.InfoException: org.im4java.core.CommandException: org.im4java.core.CommandException: identify: FailedToExecuteCommand `&#34;gs&#34; -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 &#34;-sDEVICE=pngalpha&#34; -dTextAlphaBits=4 -dGraphicsAlphaBits=4 &#34;-r72x72&#34; -dFirstPage=1 -dLastPage=1 &#34;-sOutputFile=/tmp/magick-142966vQs5Di64ntH%d&#34; &#34;-f/tmp/magick-14296Q0rJjfCeIj3w&#34; &#34;-f/tmp/magick-14296k_K6MWqwvpDm&#34;&#39; (-1) @ error/delegate.c/ExternalDelegateCommand/461.
org.im4java.core.InfoException: org.im4java.core.CommandException: org.im4java.core.CommandException: identify: FailedToExecuteCommand `&#34;gs&#34; -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 &#34;-sDEVICE=pngalpha&#34; -dTextAlphaBits=4 -dGraphicsAlphaBits=4 &#34;-r72x72&#34; -dFirstPage=1 -dLastPage=1 &#34;-sOutputFile=/tmp/magick-142966vQs5Di64ntH%d&#34; &#34;-f/tmp/magick-14296Q0rJjfCeIj3w&#34; &#34;-f/tmp/magick-14296k_K6MWqwvpDm&#34;&#39; (-1) @ error/delegate.c/ExternalDelegateCommand/461.
at org.im4java.core.Info.getBaseInfo(Info.java:360)
at org.im4java.core.Info.&lt;init&gt;(Info.java:151)
at org.dspace.app.mediafilter.ImageMagickThumbnailFilter.getImageFile(ImageMagickThumbnailFilter.java:142)
@ -253,11 +253,11 @@ org.im4java.core.InfoException: org.im4java.core.CommandException: org.im4java.c
at java.lang.reflect.Method.invoke(Method.java:498)
at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:226)
at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:78)
Caused by: org.im4java.core.CommandException: org.im4java.core.CommandException: identify: FailedToExecuteCommand `&quot;gs&quot; -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 &quot;-sDEVICE=pngalpha&quot; -dTextAlphaBits=4 -dGraphicsAlphaBits=4 &quot;-r72x72&quot; -dFirstPage=1 -dLastPage=1 &quot;-sOutputFile=/tmp/magick-142966vQs5Di64ntH%d&quot; &quot;-f/tmp/magick-14296Q0rJjfCeIj3w&quot; &quot;-f/tmp/magick-14296k_K6MWqwvpDm&quot;' (-1) @ error/delegate.c/ExternalDelegateCommand/461.
Caused by: org.im4java.core.CommandException: org.im4java.core.CommandException: identify: FailedToExecuteCommand `&#34;gs&#34; -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 &#34;-sDEVICE=pngalpha&#34; -dTextAlphaBits=4 -dGraphicsAlphaBits=4 &#34;-r72x72&#34; -dFirstPage=1 -dLastPage=1 &#34;-sOutputFile=/tmp/magick-142966vQs5Di64ntH%d&#34; &#34;-f/tmp/magick-14296Q0rJjfCeIj3w&#34; &#34;-f/tmp/magick-14296k_K6MWqwvpDm&#34;&#39; (-1) @ error/delegate.c/ExternalDelegateCommand/461.
at org.im4java.core.ImageCommand.run(ImageCommand.java:219)
at org.im4java.core.Info.getBaseInfo(Info.java:342)
... 14 more
Caused by: org.im4java.core.CommandException: identify: FailedToExecuteCommand `&quot;gs&quot; -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 &quot;-sDEVICE=pngalpha&quot; -dTextAlphaBits=4 -dGraphicsAlphaBits=4 &quot;-r72x72&quot; -dFirstPage=1 -dLastPage=1 &quot;-sOutputFile=/tmp/magick-142966vQs5Di64ntH%d&quot; &quot;-f/tmp/magick-14296Q0rJjfCeIj3w&quot; &quot;-f/tmp/magick-14296k_K6MWqwvpDm&quot;' (-1) @ error/delegate.c/ExternalDelegateCommand/461.
Caused by: org.im4java.core.CommandException: identify: FailedToExecuteCommand `&#34;gs&#34; -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 &#34;-sDEVICE=pngalpha&#34; -dTextAlphaBits=4 -dGraphicsAlphaBits=4 &#34;-r72x72&#34; -dFirstPage=1 -dLastPage=1 &#34;-sOutputFile=/tmp/magick-142966vQs5Di64ntH%d&#34; &#34;-f/tmp/magick-14296Q0rJjfCeIj3w&#34; &#34;-f/tmp/magick-14296k_K6MWqwvpDm&#34;&#39; (-1) @ error/delegate.c/ExternalDelegateCommand/461.
at org.im4java.core.ImageCommand.finished(ImageCommand.java:253)
at org.im4java.process.ProcessStarter.run(ProcessStarter.java:314)
at org.im4java.core.ImageCommand.run(ImageCommand.java:215)
@ -274,22 +274,22 @@ zsh: abort (core dumped) convert bnfb_biofortification\ Module_Participants\ Gu
</code></pre><ul>
<li>So far the only thing that stands out is that the two files that don&rsquo;t work were created with Microsoft Office 2016:</li>
</ul>
<pre tabindex="0"><code>$ pdfinfo bnfb_biofortification\ Module_Participants\ Guide\ 2018.pdf | grep -E '^(Creator|Producer)'
<pre tabindex="0"><code>$ pdfinfo bnfb_biofortification\ Module_Participants\ Guide\ 2018.pdf | grep -E &#39;^(Creator|Producer)&#39;
Creator: Microsoft® Word 2016
Producer: Microsoft® Word 2016
$ pdfinfo Food\ safety\ Kenya\ fruits.pdf | grep -E '^(Creator|Producer)'
$ pdfinfo Food\ safety\ Kenya\ fruits.pdf | grep -E &#39;^(Creator|Producer)&#39;
Creator: Microsoft® Word 2016
Producer: Microsoft® Word 2016
</code></pre><ul>
<li>And the one that works was created with Office 365:</li>
</ul>
<pre tabindex="0"><code>$ pdfinfo Info\ Note\ Mainstreaming\ gender\ and\ social\ differentiation\ into\ CCAFS\ research\ activities\ in\ West\ Africa-converted.pdf | grep -E '^(Creator|Producer)'
<pre tabindex="0"><code>$ pdfinfo Info\ Note\ Mainstreaming\ gender\ and\ social\ differentiation\ into\ CCAFS\ research\ activities\ in\ West\ Africa-converted.pdf | grep -E &#39;^(Creator|Producer)&#39;
Creator: Microsoft® Word for Office 365
Producer: Microsoft® Word for Office 365
</code></pre><ul>
<li>I remembered an old technique I was using to generate thumbnails in 2015 using Inkscape followed by ImageMagick or GraphicsMagick:</li>
</ul>
<pre tabindex="0"><code>$ inkscape Food\ safety\ Kenya\ fruits.pdf -z --export-dpi=72 --export-area-drawing --export-png='cover.png'
<pre tabindex="0"><code>$ inkscape Food\ safety\ Kenya\ fruits.pdf -z --export-dpi=72 --export-area-drawing --export-png=&#39;cover.png&#39;
$ gm convert -resize x600 -flatten -quality 85 cover.png cover.jpg
</code></pre><ul>
<li>I&rsquo;ve tried a few times this week to register for the <a href="https://www.evisa.gov.et/">Ethiopian eVisa website</a>, but it is never successful</li>
@ -320,7 +320,7 @@ $ gm convert -resize x600 -flatten -quality 85 cover.png cover.jpg
<ul>
<li>Last night Linode sent a message that the load on CGSpace (linode18) was too high, here&rsquo;s a list of the top users at the time and throughout the day:</li>
</ul>
<pre tabindex="0"><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &quot;03/Dec/2018:1(5|6|7|8)&quot; | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
<pre tabindex="0"><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &#34;03/Dec/2018:1(5|6|7|8)&#34; | awk &#39;{print $1}&#39; | sort | uniq -c | sort -n | tail -n 10
225 40.77.167.142
226 66.249.64.63
232 46.101.86.248
@ -331,7 +331,7 @@ $ gm convert -resize x600 -flatten -quality 85 cover.png cover.jpg
962 66.249.70.27
1193 35.237.175.180
1450 2a01:4f8:140:3192::2
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &quot;03/Dec/2018&quot; | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &#34;03/Dec/2018&#34; | awk &#39;{print $1}&#39; | sort | uniq -c | sort -n | tail -n 10
1141 207.46.13.57
1299 197.210.168.174
1341 54.70.40.11
@ -345,9 +345,9 @@ $ gm convert -resize x600 -flatten -quality 85 cover.png cover.jpg
</code></pre><ul>
<li><code>35.237.175.180</code> is known to us (CCAFS?), and I&rsquo;ve already added it to the list of bot IPs in nginx, which appears to be working:</li>
</ul>
<pre tabindex="0"><code>$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=35.237.175.180' dspace.log.2018-12-03
<pre tabindex="0"><code>$ grep -c -E &#39;session_id=[A-Z0-9]{32}:ip_addr=35.237.175.180&#39; dspace.log.2018-12-03
4772
$ grep -o -E 'session_id=[A-Z0-9]{32}:ip_addr=35.237.175.180' dspace.log.2018-12-03 | sort | uniq | wc -l
$ grep -o -E &#39;session_id=[A-Z0-9]{32}:ip_addr=35.237.175.180&#39; dspace.log.2018-12-03 | sort | uniq | wc -l
630
</code></pre><ul>
<li>I haven&rsquo;t seen <code>2a01:4f8:140:3192::2</code> before. Its user agent is some new bot:</li>
@ -356,9 +356,9 @@ $ grep -o -E 'session_id=[A-Z0-9]{32}:ip_addr=35.237.175.180' dspace.log.2018-12
</code></pre><ul>
<li>At least it seems the Tomcat Crawler Session Manager Valve is working to re-use the common bot XMLUI sessions:</li>
</ul>
<pre tabindex="0"><code>$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=2a01:4f8:140:3192::2' dspace.log.2018-12-03
<pre tabindex="0"><code>$ grep -c -E &#39;session_id=[A-Z0-9]{32}:ip_addr=2a01:4f8:140:3192::2&#39; dspace.log.2018-12-03
5111
$ grep -o -E 'session_id=[A-Z0-9]{32}:ip_addr=2a01:4f8:140:3192::2' dspace.log.2018-12-03 | sort | uniq | wc -l
$ grep -o -E &#39;session_id=[A-Z0-9]{32}:ip_addr=2a01:4f8:140:3192::2&#39; dspace.log.2018-12-03 | sort | uniq | wc -l
419
</code></pre><ul>
<li><code>78.46.79.71</code> is another host on Hetzner with the following user agent:</li>
@ -368,9 +368,9 @@ $ grep -o -E 'session_id=[A-Z0-9]{32}:ip_addr=2a01:4f8:140:3192::2' dspace.log.2
<li>This is not the first time a host on Hetzner has used a &ldquo;normal&rdquo; user agent to make thousands of requests</li>
<li>At least it is re-using its Tomcat sessions somehow:</li>
</ul>
<pre tabindex="0"><code>$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=78.46.79.71' dspace.log.2018-12-03
<pre tabindex="0"><code>$ grep -c -E &#39;session_id=[A-Z0-9]{32}:ip_addr=78.46.79.71&#39; dspace.log.2018-12-03
2044
$ grep -o -E 'session_id=[A-Z0-9]{32}:ip_addr=78.46.79.71' dspace.log.2018-12-03 | sort | uniq | wc -l
$ grep -o -E &#39;session_id=[A-Z0-9]{32}:ip_addr=78.46.79.71&#39; dspace.log.2018-12-03 | sort | uniq | wc -l
1
</code></pre><ul>
<li>In other news, it&rsquo;s good to see my re-work of the database connectivity in the <a href="https://github.com/ilri/dspace-statistics-api">dspace-statistics-api</a> actually caused a reduction of persistent database connections (from 1 to 0, but still!):</li>
@ -385,7 +385,7 @@ $ grep -o -E 'session_id=[A-Z0-9]{32}:ip_addr=78.46.79.71' dspace.log.2018-12-03
<li>Linode sent a message that the CPU usage of CGSpace (linode18) is too high last night</li>
<li>I looked in the logs and there&rsquo;s nothing particular going on:</li>
</ul>
<pre tabindex="0"><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &quot;05/Dec/2018&quot; | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
<pre tabindex="0"><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &#34;05/Dec/2018&#34; | awk &#39;{print $1}&#39; | sort | uniq -c | sort -n | tail -n 10
1225 157.55.39.177
1240 207.46.13.12
1261 207.46.13.101
@ -403,9 +403,9 @@ $ grep -o -E 'session_id=[A-Z0-9]{32}:ip_addr=78.46.79.71' dspace.log.2018-12-03
</code></pre><ul>
<li>But Tomcat is forcing them to re-use their Tomcat sessions with the Crawler Session Manager valve:</li>
</ul>
<pre tabindex="0"><code>$ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=54.70.40.11' dspace.log.2018-12-05
<pre tabindex="0"><code>$ grep -c -E &#39;session_id=[A-Z0-9]{32}:ip_addr=54.70.40.11&#39; dspace.log.2018-12-05
6980
$ grep -o -E 'session_id=[A-Z0-9]{32}:ip_addr=54.70.40.11' dspace.log.2018-12-05 | sort | uniq | wc -l
$ grep -o -E &#39;session_id=[A-Z0-9]{32}:ip_addr=54.70.40.11&#39; dspace.log.2018-12-05 | sort | uniq | wc -l
1156
</code></pre><ul>
<li><code>2a01:7e00::f03c:91ff:fe0a:d645</code> appears to be the CKM dev server where Danny is testing harvesting via Drupal</li>
@ -446,7 +446,7 @@ $ grep -o -E 'session_id=[A-Z0-9]{32}:ip_addr=54.70.40.11' dspace.log.2018-12-05
<li>Linode alerted me twice today that the load on CGSpace (linode18) was very high</li>
<li>Looking at the nginx logs I see a few new IPs in the top 10:</li>
</ul>
<pre tabindex="0"><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &quot;17/Dec/2018&quot; | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
<pre tabindex="0"><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &#34;17/Dec/2018&#34; | awk &#39;{print $1}&#39; | sort | uniq -c | sort -n | tail -n 10
927 157.55.39.81
975 54.70.40.11
2090 50.116.102.77
@ -505,7 +505,7 @@ $ ls -lh cgspace_2018-12-19.backup*
</code></pre><ul>
<li>Update usage rights on CGSpace as we agreed with Maria Garruccio and Peter last month:</li>
</ul>
<pre tabindex="0"><code>$ ./fix-metadata-values.py -i /tmp/2018-11-27-update-rights.csv -f dc.rights -t correct -m 53 -db dspace -u dspace -p 'fuu' -d
<pre tabindex="0"><code>$ ./fix-metadata-values.py -i /tmp/2018-11-27-update-rights.csv -f dc.rights -t correct -m 53 -db dspace -u dspace -p &#39;fuu&#39; -d
Connected to database.
Fixed 466 occurences of: Copyrighted; Any re-use allowed
</code></pre><ul>
@ -519,7 +519,7 @@ Fixed 466 occurences of: Copyrighted; Any re-use allowed
# pg_dropcluster 9.6 main
# pg_upgradecluster 9.5 main
# pg_dropcluster 9.5 main
# dpkg -l | grep postgresql | grep 9.5 | awk '{print $2}' | xargs dpkg -r
# dpkg -l | grep postgresql | grep 9.5 | awk &#39;{print $2}&#39; | xargs dpkg -r
</code></pre><ul>
<li>I&rsquo;ve been running PostgreSQL 9.6 for months on my local development and public DSpace Test (linode19) environments</li>
<li>Run all system updates on CGSpace (linode18) and restart the server</li>
@ -528,13 +528,13 @@ Fixed 466 occurences of: Copyrighted; Any re-use allowed
<pre tabindex="0"><code>$ dspace cleanup -v
- Deleting bitstream information (ID: 158227)
- Deleting bitstream record from database (ID: 158227)
Error: ERROR: update or delete on table &quot;bitstream&quot; violates foreign key constraint &quot;bundle_primary_bitstream_id_fkey&quot; on table &quot;bundle&quot;
Detail: Key (bitstream_id)=(158227) is still referenced from table &quot;bundle&quot;.
Error: ERROR: update or delete on table &#34;bitstream&#34; violates foreign key constraint &#34;bundle_primary_bitstream_id_fkey&#34; on table &#34;bundle&#34;
Detail: Key (bitstream_id)=(158227) is still referenced from table &#34;bundle&#34;.
...
</code></pre><ul>
<li>As always, the solution is to delete those IDs manually in PostgreSQL:</li>
</ul>
<pre tabindex="0"><code>$ psql dspace -c 'update bundle set primary_bitstream_id=NULL where primary_bitstream_id in (158227, 158251);'
<pre tabindex="0"><code>$ psql dspace -c &#39;update bundle set primary_bitstream_id=NULL where primary_bitstream_id in (158227, 158251);&#39;
UPDATE 1
</code></pre><ul>
<li>After all that I started a full Discovery reindex to get the index name changes and rights updates</li>
@ -544,7 +544,7 @@ UPDATE 1
<li>CGSpace went down today for a few minutes while I was at dinner and I quickly restarted Tomcat</li>
<li>The top IP addresses as of this evening are:</li>
</ul>
<pre tabindex="0"><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &quot;29/Dec/2018&quot; | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
<pre tabindex="0"><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E &#34;29/Dec/2018&#34; | awk &#39;{print $1}&#39; | sort | uniq -c | sort -n | tail -n 10
963 40.77.167.152
987 35.237.175.180
1062 40.77.167.55
@ -558,7 +558,7 @@ UPDATE 1
</code></pre><ul>
<li>And just around the time of the alert:</li>
</ul>
<pre tabindex="0"><code># zcat --force /var/log/nginx/*.log.1 /var/log/nginx/*.log.2.gz | grep -E &quot;29/Dec/2018:1(6|7|8)&quot; | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
<pre tabindex="0"><code># zcat --force /var/log/nginx/*.log.1 /var/log/nginx/*.log.2.gz | grep -E &#34;29/Dec/2018:1(6|7|8)&#34; | awk &#39;{print $1}&#39; | sort | uniq -c | sort -n | tail -n 10
115 66.249.66.223
118 207.46.13.14
123 34.218.226.147