mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Add notes for 2023-04-20
This commit is contained in:
@ -438,9 +438,11 @@ $ psql < locks-age.sql | grep -E "[[:digit:]] days" | awk -F\| '{print $10}' | s
- I ended up with a long list of UUIDs to fix before the script would complete:

```console
$ psql -d dspace -c "update bundle set primary_bitstream_id=NULL where primary_bitstream_id in ('a7ddf477-1c04-4de0-9c7a-4d3c84a875bc', '9582b661-9c2d-4c86-be22-c3b0942b646a', '210a4d5d-3af9-46f0-84cc-682dd1431762')"
$ psql -d dspace -c "update bundle set primary_bitstream_id=NULL where primary_bitstream_id in ('a7ddf477-1c04-4de0-9c7a-4d3c84a875bc', '9582b661-9c2d-4c86-be22-c3b0942b646a', '210a4d5d-3af9-46f0-84cc-682dd1431762', '51115f07-0a60-4988-8536-b9ebd2a5e15e', '0fc5021d-3264-413a-b2e2-74bda38a394e', '4704fa62-b8ab-4dfe-b7aa-0e4905f8412a')"
```

- This process ended up taking a few days because each iteration ran for over four hours before failing on the next UUID, sighhhhh
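
The `update` statement above gets longer by hand each time the cleanup fails on another UUID. As a rough sketch only, the statement could be assembled from a list of UUIDs in Python. The function name `build_primary_bitstream_reset` is hypothetical and is not part of DSpace or of any script mentioned in these notes; the table and column names are copied from the commands above.

```python
import uuid


def build_primary_bitstream_reset(uuids):
    """Return an UPDATE statement that nulls primary_bitstream_id for the given UUIDs.

    Hypothetical helper for illustration; it only builds the SQL text and
    does not connect to any database.
    """
    # uuid.UUID() raises ValueError on malformed input, so only well-formed
    # UUIDs are ever interpolated into the statement.
    validated = [str(uuid.UUID(u)) for u in uuids]
    if not validated:
        raise ValueError("need at least one UUID")
    quoted = ", ".join(f"'{u}'" for u in validated)
    return (
        "update bundle set primary_bitstream_id=NULL "
        f"where primary_bitstream_id in ({quoted})"
    )


# UUIDs taken from the first psql command above
failed = [
    "a7ddf477-1c04-4de0-9c7a-4d3c84a875bc",
    "9582b661-9c2d-4c86-be22-c3b0942b646a",
    "210a4d5d-3af9-46f0-84cc-682dd1431762",
]
print(build_primary_bitstream_reset(failed))
```

The printed statement could then be passed to `psql -d dspace -c "..."` as in the commands above, appending each newly failing UUID to the list between runs.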

## 2023-04-18

- Regarding the item Abenet noticed yesterday that has a blank page and a NullPointerException
@ -448,4 +450,37 @@ $ psql -d dspace -c "update bundle set primary_bitstream_id=NULL where primary_b
- And according to the REST API on CGSpace the item was modified on 2023-04-11, so last week...
- According to the DSpace logs it was Francesca who edited the item last week, so I asked her for more information before I troubleshoot more

## 2023-04-19

- I fixed the Bioversity item by deleting the `9781138781276.jpg` bitstream via the REST API
- I *think* Francesca might have changed the "format" of it?
- Anyway, this item has a PDF so we have a proper thumbnail and don't need that other journal cover one
- I noticed a URL for this [Bioversity item](https://hdl.handle.net/10568/89049) redirects incorrectly
- I had mentioned this to Maria and Francesca a few months ago but it seems to never have been resolved
- The `dspace cleanup -v` finally finished after a few days of running and stopping...
- I decided to update the thumbnails in the Bioversity books collection because I saw a few old ones suffering from the CropBox issue
- Also, all day there's been a high load on CGSpace, with lots of locks in PostgreSQL
- I had been waiting until the bitstream cleanup finished... now I might need to restart PostgreSQL to kill some old locks as something needs to give
- I restarted PostgreSQL, but DSpace was still hanging on simple XMLUI options so I ended up restarting Tomcat
- Tag 544 ORCID identifiers with my script
- I updated my `generation-loss.sh` and `improved-dspace-thumbnails` scripts to include thirty-five PDFs from CGSpace (up from twenty-four) to get a larger sample
- Now starting to get some numbers comparing JPEG, WebP, and AVIF
- First, out of curiosity, I checked the average ssimulacra2 scores at Q75, Q80, and Q92 for each format:

| Format | Q75 | Q80 | Q92 |
|--------|-----|-----|-----|
| JPEG   | 70  | 73  | 88  |
| WebP   | 73  | 76  | 82  |
| AVIF   | 82  | 83  | 92  |

- Then I checked the quality and file size (bytes) needed to hit an average ssimulacra2 score of 80 with each format:
- **JPEG**: Q89, 124596 bytes
- **WebP**: Q88, 84935 bytes (32% smaller than JPEG size)
- **AVIF**: Q62, 60347 bytes (52% smaller than JPEG size)
- [Google's original WebP study](https://developers.google.com/speed/webp/docs/webp_study) uses this technique to compare WebP to JPEG too
- As the quality settings are not comparable between formats, we need to compare the formats at matching perceptual scores (ssimulacra2 in this case)
- I used a ssimulacra2 score of 80 because that's about the highest score I see with WebP using my samples, though JPEG and AVIF do go higher
- Also, according to current ssimulacra2 (v2.1), a score of 70 is "high quality" and a score of 90 is "very high quality", so 80 should be reasonably high enough...
- Export CGSpace to check for missing Initiatives mappings
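
The "smaller than JPEG" percentages in the format comparison above can be reproduced from the byte counts reported there. A minimal Python sketch of that arithmetic follows; the helper name `percent_smaller` is made up for illustration and is not taken from `generation-loss.sh` or any other script in these notes.

```python
def percent_smaller(baseline_bytes, candidate_bytes):
    """Return how much smaller candidate is than baseline, as a whole percent."""
    return round((1 - candidate_bytes / baseline_bytes) * 100)


# File sizes (bytes) reported above for an average ssimulacra2 score of 80
sizes = {"JPEG": 124596, "WebP": 84935, "AVIF": 60347}

for name in ("WebP", "AVIF"):
    saving = percent_smaller(sizes["JPEG"], sizes[name])
    print(f"{name}: {saving}% smaller than JPEG")
```

Comparing sizes at a matched perceptual score, rather than at a matched quality setting, is what makes the two percentages meaningful, since Q values are not comparable across encoders.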

<!-- vim: set sw=2 ts=2: -->