Add notes

2023-03-13 21:22:25 +03:00
parent 345cd4365b
commit 40fe625083
31 changed files with 100 additions and 36 deletions

@@ -202,6 +202,7 @@ $ ls -lh 10568-126388-*
- Seems like 600px is 3 to 4 times larger file size, so maybe we should shoot for 400px or 500px
- I decided on 500px
- I started re-generating new thumbnails for the ILRI Publications, CGIAR Initiatives, and other collections
- On that note, I also re-worked the XMLUI item display to show larger thumbnails (from a max-width of 128px to 200px)
- And now that I'm looking at thumbnails I am curious what it would take to get DSpace to generate WebP or AVIF thumbnails
- Peter sent me citations and ILRI subjects for the 350 new ILRI publications
@@ -209,4 +210,34 @@ $ ls -lh 10568-126388-*
- I merged Peter's citations and subjects with the other metadata, ran one last duplicate check (and found one item!), then ran the items through csv-metadata-quality and uploaded them to CGSpace
- In the end it was only 348 items for some reason...
## 2023-03-12
- Start a harvest on AReS
## 2023-03-13
- Extract a list of DOIs from the Creative Commons licensed ILRI journal articles that I uploaded last week, skipping any that are "no derivatives" (ND):
```console
$ csvgrep -c 'dc.description.provenance[en]' -m 'Made available in DSpace on 2023-03-10' /tmp/ilri-articles.csv \
    | csvgrep -c 'dcterms.license[en_US]' -r 'CC(0|\-BY)' \
    | csvgrep -c 'dcterms.license[en_US]' -i -r '\-ND\-' \
| csvcut -c 'id,cg.identifier.doi[en_US],dcterms.type[en_US]' > 2023-03-13-journal-articles.csv
```
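For reference, roughly the same filter can be sketched in Python with only the standard library. This is a minimal sketch, not the command I actually ran: the column names come from the pipeline above, while `filter_rows()` and `filter_csv()` are hypothetical helper names, and I am assuming that a plain substring test and `re.search()` are close enough to what `csvgrep -m` and `csvgrep -r` do.

```python
import csv
import io
import re

# Column names taken from the csvkit pipeline above
LICENSE_COL = "dcterms.license[en_US]"
PROVENANCE_COL = "dc.description.provenance[en]"
KEEP_COLS = ["id", "cg.identifier.doi[en_US]", "dcterms.type[en_US]"]


def filter_rows(rows, date="2023-03-10"):
    """Yield the id, DOI, and type of CC-licensed rows, skipping ND licenses.

    Hypothetical helper: `rows` is any iterable of dicts keyed by column name.
    """
    needle = f"Made available in DSpace on {date}"
    for row in rows:
        license_ = row.get(LICENSE_COL) or ""
        provenance = row.get(PROVENANCE_COL) or ""
        # Like `csvgrep -m`: keep rows whose provenance mentions the date
        if needle not in provenance:
            continue
        # Like `csvgrep -r 'CC(0|\-BY)'`: keep CC0 and CC-BY variants
        if not re.search(r"CC(0|-BY)", license_):
            continue
        # Like `csvgrep -i -r '\-ND\-'`: drop "no derivatives" licenses
        if re.search(r"-ND-", license_):
            continue
        yield {col: row.get(col, "") for col in KEEP_COLS}


def filter_csv(text):
    """Parse CSV text and return the filtered rows as a list of dicts."""
    return list(filter_rows(csv.DictReader(io.StringIO(text))))
```

To use it on the real export, something like `filter_csv(open("/tmp/ilri-articles.csv", encoding="utf-8").read())` should work, with the result written back out using `csv.DictWriter`.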
- I want to write a script to download the PDFs and create thumbnails for them, then upload to CGSpace
- I wrote one based on `post_ciat_pdfs.py` but it seems there is an issue uploading anything other than a PDF
- When I upload a JPG or a PNG, the resulting file begins with a multipart form-data header instead of image data:
```console
Content-Disposition: form-data; name="file"; filename="10.1017-s0031182013001625.pdf.jpg"
```
- ... which means the file is not a valid image
- I tried in both the `ORIGINAL` and `THUMBNAIL` bundle, and with different filenames
- I tried manually on the command line with `http` and both PDF and PNG work... hmmmm
- Hmm, this seems to have been due to some difference in behavior between the `files` and `data` parameters of `requests.post()`
- I finalized the `post_bitstreams.py` script and uploaded eighty-five PDF thumbnails
- It seems Bizu uploaded covers for a handful so I deleted them and ran them through the script to get proper thumbnails
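
A quick sanity check like the one below would have caught the invalid uploads earlier. This is only a sketch and not part of `post_bitstreams.py`: `sniff_format()` and `looks_like_multipart_residue()` are hypothetical helpers that inspect the first bytes of a file for the well-known PNG, JPEG, and PDF signatures, and for the stray `Content-Disposition` header I saw at the start of the broken files.

```python
# Well-known file signatures ("magic numbers") for the formats in question
MAGIC_NUMBERS = {
    "png": b"\x89PNG\r\n\x1a\n",
    "jpeg": b"\xff\xd8\xff",
    "pdf": b"%PDF-",
}


def sniff_format(data: bytes):
    """Return "png", "jpeg", or "pdf" based on the leading bytes, else None."""
    for name, magic in MAGIC_NUMBERS.items():
        if data.startswith(magic):
            return name
    return None


def looks_like_multipart_residue(data: bytes) -> bool:
    """Return True if a multipart form-data header appears near the start.

    A real image or PDF should start with its own signature, so finding
    this header in the first bytes suggests the request body was stored
    as-is instead of the file it was supposed to carry.
    """
    return b"Content-Disposition: form-data" in data[:512]
```

Calling `sniff_format(open(path, "rb").read(16))` on a thumbnail before uploading it, or on a bitstream downloaded afterwards, is enough to tell a real image from one of the broken files.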
<!-- vim: set sw=2 ts=2: -->