+++
date = "2017-03-01T17:08:52+02:00"
author = "Alan Orth"
title = "March, 2017"
tags = ["Notes"]
+++
## 2017-03-01
- Run the 279 CIAT author corrections on CGSpace
## 2017-03-02
- Skype with Michael and Peter, discussing moving the CGIAR Library to CGSpace
- CGIAR people possibly open to moving content, redirecting library.cgiar.org to CGSpace and letting CGSpace resolve their handles
- They might come in at the top level in one "CGIAR System" community, or with several communities
- I need to spend a bit of time looking at the multiple handle support in DSpace and see if new content can be minted in both handles, or just one?
- Need to send Peter and Michael some notes about this in a few days
- Also, need to consider talking to Atmire about hiring them to bring ORCID metadata to REST / OAI
- Filed an issue on DSpace issue tracker for the `filter-media` bug that causes it to process JPGs even when limiting to the PDF thumbnail plugin: [DS-3516](https://jira.duraspace.org/browse/DS-3516)
- Discovered that the ImageMagick `filter-media` plugin creates JPG thumbnails with the CMYK colorspace when the source PDF is using CMYK
- Interestingly, it seems DSpace 4.x's thumbnails were sRGB, but forcing regeneration using DSpace 5.x's ImageMagick plugin creates CMYK JPGs if the source PDF was CMYK (see [10568/51999](https://cgspace.cgiar.org/handle/10568/51999)):
```
$ identify ~/Desktop/alc_contrastes_desafios.jpg
/Users/aorth/Desktop/alc_contrastes_desafios.jpg JPEG 464x600 464x600+0+0 8-bit CMYK 168KB 0.000u 0:00.000
```
<!--more-->
- This results in discolored thumbnails when compared to the original PDF, for example sRGB and CMYK:
![Thumbnail in sRGB colorspace](/cgspace-notes/2017/03/thumbnail-srgb.jpg)
![Thumbnail in CMYK colorspace](/cgspace-notes/2017/03/thumbnail-cmyk.jpg)
- I filed an issue for the color space thing: [DS-3517](https://jira.duraspace.org/browse/DS-3517)
## 2017-03-03
- I created a patch for DS-3517 and made a pull request against upstream `dspace-5_x`: https://github.com/DSpace/DSpace/pull/1669
- Looks like `-colorspace sRGB` alone isn't enough; we need to use profiles:
```
$ convert alc_contrastes_desafios.pdf\[0\] -profile /opt/brew/Cellar/ghostscript/9.20/share/ghostscript/9.20/iccprofiles/default_cmyk.icc -thumbnail 300x300 -flatten -profile /opt/brew/Cellar/ghostscript/9.20/share/ghostscript/9.20/iccprofiles/default_rgb.icc alc_contrastes_desafios.pdf.jpg
```
- This reads the first page of the input file, applies the CMYK profile, generates and flattens the thumbnail, applies the RGB profile, then writes the file
- Note that you should set the first profile immediately after the input file
- Also, it is better to use profiles than setting `-colorspace`
- This is a great resource describing the color stuff: http://www.imagemagick.org/Usage/formats/#profiles
- Somehow we need to detect the color system being used by the input file and handle each case differently (with profiles)
- This is trivial with `identify` (even via the [Java ImageMagick API](http://im4java.sourceforge.net/api/org/im4java/core/IMOps.html#identify)):
```
$ identify -format '%r\n' alc_contrastes_desafios.pdf\[0\]
DirectClass CMYK
$ identify -format '%r\n' Africa\ group\ of\ negotiators.pdf\[0\]
DirectClass sRGB Alpha
```
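- A rough sketch of how that detection could drive the profile handling, assuming CMYK input needs the two-profile conversion and sRGB input can be thumbnailed as-is (the `colorspace_action` helper and its labels are hypothetical, not what the DSpace patch does):

```shell
#!/bin/sh
# Hypothetical helper: classify the string printed by `identify -format '%r'`
# so a caller can decide whether ICC profiles are needed.
colorspace_action() {
    case "$1" in
        *CMYK*) echo "cmyk-to-srgb" ;;  # apply the CMYK profile, then the RGB profile
        *sRGB*) echo "keep" ;;          # assumption: already sRGB, no conversion
        *)      echo "unknown" ;;       # anything else needs a closer look
    esac
}

# Usage (requires ImageMagick; not run here):
#   colorspace_action "$(identify -format '%r' input.pdf\[0\])"
```

- Matching on a substring rather than the full string means variants like `DirectClass sRGB Alpha` still land in the right branch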
## 2017-03-05
- Look into helping developers from landportal.info with a query for items related to LAND on the REST API
- They want something like the items that are returned by the general "LAND" query in the search interface, but we cannot do that
- We can only return specific results for metadata fields, like:
```
$ curl -s -H "accept: application/json" -H "Content-Type: application/json" -X POST "https://dspacetest.cgiar.org/rest/items/find-by-metadata-field" -d '{"key": "cg.subject.ilri","value": "LAND REFORM", "language": null}' | json_pp
```
- But there are hundreds of combinations of fields and values (like `dc.subject` and all the center subjects), and we can't use wildcards in REST!
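- One brute-force workaround might be to loop over a short list of field/value pairs and send one query per pair. A minimal sketch, where `build_payload`, `query_pairs`, and the `dc.subject` / `LAND` pair are hypothetical, and values containing double quotes would need escaping first:

```shell
#!/bin/sh
# Hypothetical helper: build the JSON body for find-by-metadata-field.
# Assumes key and value contain no double quotes or backslashes.
build_payload() {
    printf '{"key": "%s","value": "%s", "language": null}' "$1" "$2"
}

# Hypothetical loop: reads "field|value" lines on stdin, one query per line.
query_pairs() {
    while IFS='|' read -r field value; do
        curl -s -H "accept: application/json" -H "Content-Type: application/json" \
            -X POST "https://dspacetest.cgiar.org/rest/items/find-by-metadata-field" \
            -d "$(build_payload "$field" "$value")"
    done
}

# Usage (hits the network; not run here):
#   printf '%s\n' 'cg.subject.ilri|LAND REFORM' 'dc.subject|LAND' | query_pairs
```

- This still only returns exact matches per field, so it does not replace the general search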