diff --git a/content/posts/2020-01.md b/content/posts/2020-01.md
index 8e3909eb0..c50881974 100644
--- a/content/posts/2020-01.md
+++ b/content/posts/2020-01.md
@@ -219,4 +219,30 @@ $ wc -l hung-nguyen-a*handles.txt
 - Comparing the lists of items, I see that nine of the ten missing items were added less than twenty-four hours ago, and the other was added last week, so they apparently just haven't been indexed yet
 - I am curious to check tomorrow to see if they are there
+## 2020-01-23
+
+- I checked AReS and I see that there are now 55 items for author "Hung Nguyen-Viet"
+- Linode sent an alert that the outbound traffic rate of CGSpace (linode18) was high for several hours this morning around 5AM UTC+1
+  - I checked the nginx logs this morning for the few hours before and after that using goaccess:
+
+```
+# cat /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "23/Jan/2020:0[12345678]" | goaccess --log-format=COMBINED -
+```
+
+- The top two hosts according to the amount of data transferred are:
+  - 2a01:7e00::f03c:91ff:fe9a:3a37
+  - 2a01:7e00::f03c:91ff:fe18:7396
+- Both are on Linode, and appear to be the new and old ilri.org servers
+  - I will ask the web team
+  - Judging from the [ILRI publications site](https://www.ilri.org/publications/trade-offs-related-agricultural-use-antimicrobials-and-synergies-emanating-efforts) it seems they are downloading the PDFs so they can generate higher-quality thumbnails:
+  - They are apparently using this Drupal module to generate the thumbnails: `sites/all/modules/contrib/pdf_to_imagefield`
+  - I see some excellent suggestions in this [ImageMagick thread from 2012](https://www.imagemagick.org/discourse-server/viewtopic.php?t=21589) that led me to some nice thumbnails (default PDF density is 72, so supersample to 4X and then resize back to 25%) as well as [this blog post](https://duncanlock.net/blog/2013/11/18/how-to-create-thumbnails-for-pdfs-with-imagemagick-on-linux/):
+
+```
+$ convert -density 288 -filter lagrange -thumbnail 25% -background white -alpha remove -sampling-factor 1:1 -colorspace sRGB 10568-97925.pdf\[0\] 10568-97925.jpg
+```
+
+- Here I'm also explicitly setting the background to white and removing any alpha layers, but I could probably also just keep using `-flatten` like DSpace already does
+- I wonder if I could hack this into DSpace code to get better thumbnails...
+
diff --git a/docs/2020-01/index.html b/docs/2020-01/index.html
index 45be6d635..cd9269f86 100644
--- a/docs/2020-01/index.html
+++ b/docs/2020-01/index.html
@@ -29,7 +29,7 @@ I tweeted the CGSpace repository link
-
+
@@ -63,9 +63,9 @@ I tweeted the CGSpace repository link
   "@type": "BlogPosting",
   "headline": "January, 2020",
   "url": "https:\/\/alanorth.github.io\/cgspace-notes\/2020-01\/",
-  "wordCount": "1674",
+  "wordCount": "1905",
   "datePublished": "2020-01-06T10:48:30+02:00",
-  "dateModified": "2020-01-22T10:35:46+02:00",
+  "dateModified": "2020-01-22T14:16:08+02:00",
   "author": {
   "@type": "Person",
   "name": "Alan Orth"
@@ -357,6 +357,34 @@ $ wc -l hung-nguyen-a*handles.txt
+

+2020-01-23
+
+# cat /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "23/Jan/2020:0[12345678]" | goaccess --log-format=COMBINED -
+
+$ convert -density 288 -filter lagrange -thumbnail 25% -background white -alpha remove -sampling-factor 1:1 -colorspace sRGB 10568-97925.pdf\[0\] 10568-97925.jpg
+
diff --git a/docs/sitemap.xml b/docs/sitemap.xml
index 5ed9a1f90..0e7b23b10 100644
--- a/docs/sitemap.xml
+++ b/docs/sitemap.xml
@@ -4,27 +4,27 @@
 https://alanorth.github.io/cgspace-notes/categories/
-2020-01-22T10:35:46+02:00
+2020-01-22T14:16:08+02:00
 https://alanorth.github.io/cgspace-notes/
-2020-01-22T10:35:46+02:00
+2020-01-22T14:16:08+02:00
 https://alanorth.github.io/cgspace-notes/2020-01/
-2020-01-22T10:35:46+02:00
+2020-01-22T14:16:08+02:00
 https://alanorth.github.io/cgspace-notes/categories/notes/
-2020-01-22T10:35:46+02:00
+2020-01-22T14:16:08+02:00
 https://alanorth.github.io/cgspace-notes/posts/
-2020-01-22T10:35:46+02:00
+2020-01-22T14:16:08+02:00
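
Review note on the 2020-01-23 entry: the supersampling arithmetic behind `-density 288` and `-thumbnail 25%` can be sketched as below. This is only a minimal illustration, assuming the default PDF density of 72 stated in the notes. The helper names `supersample_density` and `resize_percent` are hypothetical and not part of ImageMagick or DSpace. The script prints the `convert` command instead of running it, so it works without ImageMagick installed.

```shell
#!/bin/sh
# Sketch of the supersampling arithmetic from the notes.
# Function names are hypothetical; only the convert flags come from the notes.

base_density=72   # default PDF density, as stated in the notes
factor=4          # supersampling factor used in the notes

supersample_density() {
    # Density to render at: base density multiplied by the factor
    echo $(( $1 * $2 ))
}

resize_percent() {
    # Percentage that scales the supersampled render back to base size.
    # Integer division, so factors that do not divide 100 are truncated.
    echo $(( 100 / $1 ))
}

density=$(supersample_density "$base_density" "$factor")
percent=$(resize_percent "$factor")

# Print the resulting command; the file names are the ones used in the notes
echo "convert -density $density -filter lagrange -thumbnail ${percent}% -background white -alpha remove -sampling-factor 1:1 -colorspace sRGB '10568-97925.pdf[0]' 10568-97925.jpg"
```

With a factor of 4 this reproduces the values in the notes: a render density of 288 and a resize to 25%. Keeping the two numbers derived from one factor avoids them drifting apart if the factor is changed later.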