diff --git a/content/posts/2023-03.md b/content/posts/2023-03.md
index 5a6a9cef4..8d07faac1 100644
--- a/content/posts/2023-03.md
+++ b/content/posts/2023-03.md
@@ -517,4 +517,35 @@ colly | awk '{print $1}' | sort | uniq -c | sort -h
- I exported CGSpace to check for missing Initiative collection mappings
- Start a harvest on AReS
+## 2023-03-27
+
+- The harvest on AReS was incredibly slow and I stopped it about halfway through, twelve hours later
+ - Then I relied on the plugins to get missing items, which caused a high load on the server but actually worked fine
+- Continue working on thumbnails on DSpace
+
+## 2023-03-28
+
+- Regarding ImageMagick there are a few things I've learned
+ - The `-quality` setting does different things for different output formats, see: https://imagemagick.org/script/command-line-options.php#quality
+ - The `-compress` setting controls the compression algorithm for image data, and is unrelated to lossless/lossy
+ - On that note, `-compress lossless` for JPEGs refers to Lossless JPEG, which is not well defined or supported and should be avoided
+ - See: https://imagemagick.org/script/command-line-options.php#compress
+ - The way DSpace currently does its supersampling by exporting to a JPEG, then making a thumbnail of the JPEG, is a double lossy operation
+ - We should be exporting to something lossless like PNG, PPM, or MIFF, then making a thumbnail from that
+ - The PNG format is always lossless so the `-quality` setting controls compression and filtering, but has no effect on the appearance or signature of PNG images
+ - You can use `-quality n` with WebP's `-define webp:lossless=true`, but I'm not sure about the interaction between ImageMagick quality and WebP lossless...
+ - Also, if converting from a lossless format to WebP lossless in the same command, ImageMagick will ignore quality settings
+ - The MIFF format is useful for piping between ImageMagick commands, but it is also lossless and the quality setting is ignored
+ - You can use a format specifier when piping between ImageMagick commands without writing a file
+ - For example, I want to create a lossless PNG from a distorted JPEG for comparison:
+
+```console
+$ magick convert reference.jpg -quality 85 jpg:- | magick convert - distorted-lossless.png
+```
+
+- If I convert the JPEG to PNG directly it will ignore the quality setting, so I set the quality and the output format, then pipe it to ImageMagick again to convert to lossless PNG
+- In an attempt to quantify the generation loss from DSpace's "JPG JPG" method of creating thumbnails I wrote a script called `generation-loss.sh` to test against a new "PNG JPG" method
+ - With my sample set of seventeen PDFs from CGSpace I found that _the "JPG JPG" method of thumbnailing produces scores that are on average 1.6% lower than the "PNG JPG" method_.
+ - The average file size with _the "PNG JPG" method was only 200 bytes larger_.
+
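The two thumbnailing methods compared in the 2023-03-28 notes can be sketched as a pair of shell functions. This is only an illustration, not the contents of `generation-loss.sh`: the function names, the density of 288, and the 600x600 thumbnail box are assumptions, the quality of 85 is borrowed from the piping example in the notes, and scoring is left out because the notes do not say which metric was used.

```shell
#!/bin/sh
# Sketch of the two thumbnailing pipelines (assumes ImageMagick 7's `magick`).
# DENSITY and SIZE are illustrative values, not taken from the notes.
DENSITY=288      # supersampling density used when rasterizing the PDF
SIZE=600x600     # bounding box for the final thumbnail
QUALITY=85       # JPEG quality, as in the piping example in the notes

# "JPG JPG": rasterize the first PDF page to a JPEG, then thumbnail that JPEG.
# JPEG compression is applied twice, so this is a double lossy operation.
jpg_jpg_thumbnail() {
    magick -density "$DENSITY" "${1}[0]" -flatten -quality "$QUALITY" jpg:- |
        magick - -thumbnail "$SIZE" -quality "$QUALITY" "$2"
}

# "PNG JPG": rasterize to a lossless PNG, then thumbnail to JPEG.
# JPEG compression is applied only once, at the final step.
png_jpg_thumbnail() {
    magick -density "$DENSITY" "${1}[0]" -flatten png:- |
        magick - -thumbnail "$SIZE" -quality "$QUALITY" "$2"
}
```

After sourcing the file, `jpg_jpg_thumbnail sample.pdf a.jpg` and `png_jpg_thumbnail sample.pdf b.jpg` (hypothetical file names) would write two thumbnails of the first page that can then be compared with whichever quality metric is preferred. The `jpg:-` and `png:-` format specifiers keep the intermediate image in the pipe instead of writing it to disk.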
diff --git a/docs/2023-03/index.html b/docs/2023-03/index.html
index 4af119410..a5d4cbdfe 100644
--- a/docs/2023-03/index.html
+++ b/docs/2023-03/index.html
@@ -16,7 +16,7 @@ I finally got through with porting the input form from DSpace 6 to DSpace 7
-
+
@@ -38,9 +38,9 @@ I finally got through with porting the input form from DSpace 6 to DSpace 7
"@type": "BlogPosting",
"headline": "March, 2023",
"url": "https://alanorth.github.io/cgspace-notes/2023-03/",
- "wordCount": "3600",
+ "wordCount": "3988",
"datePublished": "2023-03-01T07:58:36+03:00",
- "dateModified": "2023-03-24T13:19:13+03:00",
+ "dateModified": "2023-03-27T10:03:45+03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -687,6 +687,53 @@ RL: performed 0 reads and 16 write i/o operations
I exported CGSpace to check for missing Initiative collection mappings
Start a harvest on AReS
+2023-03-27
+
+- The harvest on AReS was incredibly slow and I stopped it about halfway through, twelve hours later
+
+- Then I relied on the plugins to get missing items, which caused a high load on the server but actually worked fine
+
+
+- Continue working on thumbnails on DSpace
+
+2023-03-28
+
+- Regarding ImageMagick there are a few things I’ve learned
+
+- The `-quality` setting does different things for different output formats, see: https://imagemagick.org/script/command-line-options.php#quality
+- The `-compress` setting controls the compression algorithm for image data, and is unrelated to lossless/lossy
+
+
+- The way DSpace currently does its supersampling by exporting to a JPEG, then making a thumbnail of the JPEG, is a double lossy operation
+
+- We should be exporting to something lossless like PNG, PPM, or MIFF, then making a thumbnail from that
+
+
+- The PNG format is always lossless so the `-quality` setting controls compression and filtering, but has no effect on the appearance or signature of PNG images
+- You can use `-quality n` with WebP’s `-define webp:lossless=true`, but I’m not sure about the interaction between ImageMagick quality and WebP lossless…
+
+- Also, if converting from a lossless format to WebP lossless in the same command, ImageMagick will ignore quality settings
+
+
+- The MIFF format is useful for piping between ImageMagick commands, but it is also lossless and the quality setting is ignored
+- You can use a format specifier when piping between ImageMagick commands without writing a file
+- For example, I want to create a lossless PNG from a distorted JPEG for comparison:
+
+
+
+$ magick convert reference.jpg -quality 85 jpg:- | magick convert - distorted-lossless.png
+
+- If I convert the JPEG to PNG directly it will ignore the quality setting, so I set the quality and the output format, then pipe it to ImageMagick again to convert to lossless PNG
+- In an attempt to quantify the generation loss from DSpace’s “JPG JPG” method of creating thumbnails I wrote a script called `generation-loss.sh` to test against a new “PNG JPG” method
+
+- With my sample set of seventeen PDFs from CGSpace I found that the “JPG JPG” method of thumbnailing produces scores that are on average 1.6% lower than the “PNG JPG” method.
+- The average file size with the “PNG JPG” method was only 200 bytes larger.
+
+
+
diff --git a/docs/categories/index.html b/docs/categories/index.html
index e8519d906..a2cac1a6f 100644
--- a/docs/categories/index.html
+++ b/docs/categories/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/categories/notes/index.html b/docs/categories/notes/index.html
index fe08644da..a6039462a 100644
--- a/docs/categories/notes/index.html
+++ b/docs/categories/notes/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/categories/notes/page/2/index.html b/docs/categories/notes/page/2/index.html
index 88882c86d..e691c69d3 100644
--- a/docs/categories/notes/page/2/index.html
+++ b/docs/categories/notes/page/2/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/categories/notes/page/3/index.html b/docs/categories/notes/page/3/index.html
index 05ebcc00b..520147be6 100644
--- a/docs/categories/notes/page/3/index.html
+++ b/docs/categories/notes/page/3/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/categories/notes/page/4/index.html b/docs/categories/notes/page/4/index.html
index 9e46d5625..473378225 100644
--- a/docs/categories/notes/page/4/index.html
+++ b/docs/categories/notes/page/4/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/categories/notes/page/5/index.html b/docs/categories/notes/page/5/index.html
index 32ebc43bf..a134efdc4 100644
--- a/docs/categories/notes/page/5/index.html
+++ b/docs/categories/notes/page/5/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/categories/notes/page/6/index.html b/docs/categories/notes/page/6/index.html
index dea2c31fa..53b907376 100644
--- a/docs/categories/notes/page/6/index.html
+++ b/docs/categories/notes/page/6/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/categories/notes/page/7/index.html b/docs/categories/notes/page/7/index.html
index ae79fb496..b8b4310ad 100644
--- a/docs/categories/notes/page/7/index.html
+++ b/docs/categories/notes/page/7/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/index.html b/docs/index.html
index 1e1f8c7ab..ab8e621c3 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/page/10/index.html b/docs/page/10/index.html
index 128886604..32aba162c 100644
--- a/docs/page/10/index.html
+++ b/docs/page/10/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/page/2/index.html b/docs/page/2/index.html
index 1cf13fd6f..58d59cc10 100644
--- a/docs/page/2/index.html
+++ b/docs/page/2/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/page/3/index.html b/docs/page/3/index.html
index bdf0fcac1..2a3a33947 100644
--- a/docs/page/3/index.html
+++ b/docs/page/3/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/page/4/index.html b/docs/page/4/index.html
index f3f871bdf..e4d85d5a3 100644
--- a/docs/page/4/index.html
+++ b/docs/page/4/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/page/5/index.html b/docs/page/5/index.html
index fece6c27b..f21ed59d3 100644
--- a/docs/page/5/index.html
+++ b/docs/page/5/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/page/6/index.html b/docs/page/6/index.html
index 5fa5309eb..0c1a3bc96 100644
--- a/docs/page/6/index.html
+++ b/docs/page/6/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/page/7/index.html b/docs/page/7/index.html
index 62a7c51fd..00068aa47 100644
--- a/docs/page/7/index.html
+++ b/docs/page/7/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/page/8/index.html b/docs/page/8/index.html
index c20a0a7f6..b3eea0c20 100644
--- a/docs/page/8/index.html
+++ b/docs/page/8/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/page/9/index.html b/docs/page/9/index.html
index d021b762e..7c520da45 100644
--- a/docs/page/9/index.html
+++ b/docs/page/9/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/posts/index.html b/docs/posts/index.html
index fc680e952..683b42030 100644
--- a/docs/posts/index.html
+++ b/docs/posts/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/posts/page/10/index.html b/docs/posts/page/10/index.html
index ccee9ae96..a3353755e 100644
--- a/docs/posts/page/10/index.html
+++ b/docs/posts/page/10/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/posts/page/2/index.html b/docs/posts/page/2/index.html
index b8495f933..4ffcf6ca9 100644
--- a/docs/posts/page/2/index.html
+++ b/docs/posts/page/2/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/posts/page/3/index.html b/docs/posts/page/3/index.html
index e8cc13af4..91ea23dd3 100644
--- a/docs/posts/page/3/index.html
+++ b/docs/posts/page/3/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/posts/page/4/index.html b/docs/posts/page/4/index.html
index ee100a6f5..53b33cf16 100644
--- a/docs/posts/page/4/index.html
+++ b/docs/posts/page/4/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/posts/page/5/index.html b/docs/posts/page/5/index.html
index 4ce4be95e..999d99e88 100644
--- a/docs/posts/page/5/index.html
+++ b/docs/posts/page/5/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/posts/page/6/index.html b/docs/posts/page/6/index.html
index 2378351e4..7d585f1c2 100644
--- a/docs/posts/page/6/index.html
+++ b/docs/posts/page/6/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/posts/page/7/index.html b/docs/posts/page/7/index.html
index 86201e88d..cf529fc3a 100644
--- a/docs/posts/page/7/index.html
+++ b/docs/posts/page/7/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/posts/page/8/index.html b/docs/posts/page/8/index.html
index 82bbe8521..9b812f79f 100644
--- a/docs/posts/page/8/index.html
+++ b/docs/posts/page/8/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/posts/page/9/index.html b/docs/posts/page/9/index.html
index 00cc9c5fc..0cc1ad17f 100644
--- a/docs/posts/page/9/index.html
+++ b/docs/posts/page/9/index.html
@@ -10,7 +10,7 @@
-
+
diff --git a/docs/sitemap.xml b/docs/sitemap.xml
index 1e1866172..012e19413 100644
--- a/docs/sitemap.xml
+++ b/docs/sitemap.xml
@@ -3,19 +3,19 @@
xmlns:xhtml="http://www.w3.org/1999/xhtml">
https://alanorth.github.io/cgspace-notes/categories/
- 2023-03-24T13:19:13+03:00
+ 2023-03-27T10:03:45+03:00
https://alanorth.github.io/cgspace-notes/
- 2023-03-24T13:19:13+03:00
+ 2023-03-27T10:03:45+03:00
https://alanorth.github.io/cgspace-notes/2023-03/
- 2023-03-24T13:19:13+03:00
+ 2023-03-27T10:03:45+03:00
https://alanorth.github.io/cgspace-notes/categories/notes/
- 2023-03-24T13:19:13+03:00
+ 2023-03-27T10:03:45+03:00
https://alanorth.github.io/cgspace-notes/posts/
- 2023-03-24T13:19:13+03:00
+ 2023-03-27T10:03:45+03:00
https://alanorth.github.io/cgspace-notes/2023-02/
2023-03-01T08:30:25+03:00