mirror of https://github.com/ilri/cgspace-java-helpers.git, synced 2025-11-03 22:29:10 +01:00

Compare commits: f0754ab419...dspace5 (25 Commits)
	
| Author | SHA1 | Date | |
|---|---|---|---|
| | d5cf51c464 | | |
| | 98c7cfb3a5 | | |
| | 58365cdfda | | |
| | 7190b751e1 | | |
| | 34acc351a5 | | |
| | ec293b3b28 | | |
| | 31cd979b61 | | |
| | fce81c6003 | | |
| | 26d3cbd778 | | |
| | fdc910f93b | | |
| | e0d514e797 | | |
| | fd893d8c4e | | |
| | 2263ac27e8 | | |
| | cf7012d698 | | |
| | 7edc60e6ca | | |
| | fe2abc86c6 | | |
| | e1d92ef2c7 | | |
| | 3e3c544cfa | | |
| | db9881faf6 | | |
| | fa5fb60b5b | | |
| | 44fb9a9f4d | | |
| | b790d5e4db | | |
| | 08e7546a87 | | |
| | ff076ecf50 | | |
| | 7a5dd1c094 | | |
					
					
						
							
								
								
									
CHANGELOG.md (19 lines changed, Normal file)
@@ -0,0 +1,19 @@
+# Changelog
+All notable changes to this project will be documented in this file.
+
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
+and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+
+## [5.3] - 2020-08-07
+### Changed
+- Make sure `FixJpgJpgThumbnails` only replaces thumbnails where the original is less than ~100KiB
+- Make sure `FixJpgJpgThumbnails` only replaces thumbnails if the item type is not `Infographic` (because the JPG in the ORIGINAL bundle is the "real" file and it's OK that the thumbnail is ".jpg.jpg")
+
+## [5.2] - 2020-08-06
+### Changed
+- Make `FixJpgJpgThumbnails` helper check for files named "JPG" as well as "jpg" (case insensitive)
+- Make `FixJpgJpgThumbnails` helper replace thumbnails with description `IM Thumbnail` as well as `Generated Thumbnail`
+
+## [5.1] - 2020-08-06
+### Added
+- Add `FixJpgJpgThumbnails` helper to replace ".jpg.jpg" thumbnails with their originals
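The ~100KiB cut-off in the 5.3 entry appears to come from figures quoted in a comment in `FixJpgJpgThumbnails`: 4022 items with ".jpg.jpg" thumbnails totalling 394549249 bytes. The sketch below only reproduces that arithmetic. The class and method names are hypothetical and not part of the project; the threshold constant mirrors the `100000` used in the size check.

```java
// Hypothetical helper, not part of cgspace-java-helpers: it reproduces the
// average-size arithmetic quoted in the FixJpgJpgThumbnails comment.
class ThumbnailSizeStats {
    // Same number as the size check in the diff (originalBitstreamBytes < 100000).
    static final long THRESHOLD_BYTES = 100_000L;

    // Integer average, rounded down.
    static long averageBytes(long totalBytes, long itemCount) {
        if (itemCount <= 0) {
            throw new IllegalArgumentException("itemCount must be positive");
        }
        return totalBytes / itemCount;
    }

    // Converts bytes to binary kibibytes (1 KiB = 1024 bytes).
    static double toKiB(long bytes) {
        return bytes / 1024.0;
    }

    public static void main(String[] args) {
        long average = averageBytes(394_549_249L, 4_022L);
        System.out.println("average bytes:   " + average);
        System.out.println("average KiB:     " + toKiB(average));
        System.out.println("threshold KiB:   " + toKiB(THRESHOLD_BYTES));
        System.out.println("below threshold: " + (average < THRESHOLD_BYTES));
    }
}
```

The average works out to 98097 bytes, which is about 98.1 kB in decimal units or about 95.8 KiB in binary units, so the "98KiB" and "~100KiB" figures seem to be meant as decimal kilobytes. Either way the average sits just under the 100000-byte threshold.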
							
								
								
									
README.md (49 lines changed)
@@ -1,7 +1,8 @@
-# DSpace Curation Tasks [](https://travis-ci.org/ilri/dspace-curation-tasks)
-Metadata curation tasks used on the [CGSpace](https://cgspace.cgiar.org) institutional repository:
+# CGSpace Java Helpers [](https://travis-ci.org/ilri/dspace-curation-tasks)
+DSpace curation tasks and other Java-based helpers used on the [CGSpace](https://cgspace.cgiar.org) institutional repository:
 
 - **CountryCodeTagger**: add ISO 3166-1 Alpha2 country codes to items based on their existing country metadata
+- **FixJpgJpgThumbnails**: Fix low-quality ".jpg.jpg" thumbnails by replacing them with their originals
 
 Tested on DSpace 5.8. Read more about the [DSpace curation system](https://wiki.lyrasis.org/display/DSDOC5x/Curation+System).
 
@@ -13,8 +14,8 @@ To use these curation tasks in a DSpace project add the following dependency to
 ```
 <dependency>
   <groupId>io.github.ilri.cgspace</groupId>
-  <artifactId>dspace-curation-tasks</artifactId>
-  <version>1.0-SNAPSHOT</version>
+  <artifactId>cgspace-java-helpers</artifactId>
+  <version>5.4-SNAPSHOT</version>
 </dependency>
 ```
 
@@ -30,55 +31,25 @@ $ mvn package
 Copy the resulting jar to the DSpace `lib` directory:
 
 ```
-$ cp target/dspace-curation-tasks-1.0-SNAPSHOT.jar ~/dspace/lib/dspace-curation-tasks-1.0-SNAPSHOT.jar
+$ cp target/cgspace-java-helpers-5.4-SNAPSHOT.jar ~/dspace/lib
 ```
 
 ## Configuration
-Add the curation task to DSpace's `config/modules/curate.cfg`:
+Please refer to the appropriate README.md file:
 
-```
-plugin.named.org.dspace.curate.CurationTask = \
-...
-    io.github.ilri.cgspace.ctasks.CountryCodeTagger = countrycodetagger \
-    io.github.ilri.cgspace.ctasks.CountryCodeTagger = countrycodetagger.force
-```
-
-And then add a configuration file for the task in `config/modules/countrycodetagger.cfg`:
-
-```
-# name of the field containing ISO 3166-1 country names
-iso3166.field = cg.coverage.country
-
-# name of the field containing ISO 3166-1 Alpha2 country codes
-iso3166-alpha2.field = cg.coverage.iso3166-alpha2
-
-# only add country codes if an item doesn't have any (default false)
-#forceupdate = false
-```
-
-*Note*: DSpace's curation system supports "profiles" where you can use the same task with different options, for example above I have a normal country code tagger and a "force" variant. To use the "force" variant you create a new configuration file with the overridden options in `config/modules/countrycodetagger.force.cfg`. The "force" profile clears all existing country codes and updates everything.
-
-## Invocation
-Once the jar is installed and you have added appropriate configuration in `~/dspace/config/modules`:
-
-```
-$ ~/dspace/bin/dspace curate -t countrycodetagger -i 10568/3 -r - -l 500 -s object
-```
-
-*Note*: it is very important to set the cache limit (`-l`) and the database transaction scope to something sensible (`object`) if you're curating a community or collection with more than a few hundred items.
+- Curation Tasks: [src/main/java/io/github/ilri/cgspace/ctasks/README.md](https://github.com/ilri/cgspace-java-helpers/blob/dspace5/src/main/java/io/github/ilri/cgspace/ctasks/README.md)
+- Scripts: [src/main/java/io/github/ilri/cgspace/scripts/README.md](https://github.com/ilri/cgspace-java-helpers/blob/dspace5/src/main/java/io/github/ilri/cgspace/scripts/README.md)
 
 ## Notes
 This project was initially created according to the [Maven Getting Started Guide](https://maven.apache.org/guides/getting-started/):
 
 ```console
-$ mvn -B archetype:generate -DgroupId=io.github.ilri.cgspace -DartifactId=dspace-curation-tasks -DarchetypeArtifactId=maven-archetype-quickstart -DarchetypeVersion=1.4
+$ mvn -B archetype:generate -DgroupId=io.github.ilri.cgspace -DartifactId=cgspace-java-helpers -DarchetypeArtifactId=maven-archetype-quickstart -DarchetypeVersion=1.4
 ```
 
 ## TODO
 
 - Make sure this doesn't work on items in the workflow
-- Port to DSpace 6
-  - Remember to bump Gson version!
 - Check for existence of metadata field before trying to add metadata
 - Add tests
 
							
								
								
									
pom.xml (14 lines changed)
@@ -5,11 +5,11 @@
   <modelVersion>4.0.0</modelVersion>
 
   <groupId>io.github.ilri.cgspace</groupId>
-  <artifactId>dspace-curation-tasks</artifactId>
-  <version>1.0-SNAPSHOT</version>
+  <artifactId>cgspace-java-helpers</artifactId>
+  <version>5.4-SNAPSHOT</version>
 
-  <name>dspace-curation-tasks</name>
-  <url>https://github.com/ilri/dspace-curation-tasks</url>
+  <name>cgspace-java-helpers</name>
+  <url>https://github.com/ilri/cgspace-java-helpers</url>
 
   <licenses>
     <license>
@@ -53,9 +53,9 @@
   </dependencies>
 
   <scm>
-      <connection>scm:git:git://github.com/ilri/dspace-curation-tasks.git</connection>
-      <developerConnection>scm:git:ssh://github.com:nanosai/dspace-curation-tasks.git</developerConnection>
-      <url>http://github.com/ilri/dspace-curation-tasks</url>
+      <connection>scm:git:git://github.com/ilri/cgspace-java-helpers.git</connection>
+      <developerConnection>scm:git:ssh://github.com:nanosai/cgspace-java-helpers.git</developerConnection>
+      <url>http://github.com/ilri/cgspace-java-helpers</url>
   </scm>
 
   <distributionManagement>
@@ -35,6 +35,11 @@ import java.sql.SQLException;
 import java.util.ArrayList;
 import java.util.List;
 
+/**
+ * @author Alan Orth for the International Livestock Research Institute
+ * @version 5.1
+ * @since 1.0
+*/
 public class CountryCodeTagger extends AbstractCurationTask
 {
     public class CountryCodeTaggerConfig {
							
								
								
									
src/main/java/io/github/ilri/cgspace/ctasks/README.md (74 lines changed, Normal file)
@@ -0,0 +1,74 @@
+# Curation Tasks
+DSpace curation tasks used on the [CGSpace](https://cgspace.cgiar.org) institutional repository:
+
+- **CountryCodeTagger**: add ISO 3166-1 Alpha2 country codes to items based on their existing country metadata
+
+Tested on DSpace 5.8. Read more about the [DSpace curation system](https://wiki.lyrasis.org/display/DSDOC5x/Curation+System).
+
+## Build and Install
+
+### Integrate into DSpace Build
+To use these curation tasks in a DSpace project add the following dependency to `dspace/modules/additions/pom.xml`:
+
+```
+<dependency>
+  <groupId>io.github.ilri.cgspace</groupId>
+  <artifactId>cgspace-java-helpers</artifactId>
+  <version>5.3</version>
+</dependency>
+```
+
+The jar will be copied to all DSpace applications.
+
+### Manual Build and Install
+To build the standalone jar:
+
+```
+$ mvn package
+```
+
+Copy the resulting jar to the DSpace `lib` directory:
+
+```
+$ cp target/cgspace-java-helpers-5.3.jar ~/dspace/lib
+```
+
+## Configuration
+Add the curation task to DSpace's `config/modules/curate.cfg`:
+
+```
+plugin.named.org.dspace.curate.CurationTask = \
+...
+    io.github.ilri.cgspace.ctasks.CountryCodeTagger = countrycodetagger \
+    io.github.ilri.cgspace.ctasks.CountryCodeTagger = countrycodetagger.force
+```
+
+And then add a configuration file for the task in `config/modules/countrycodetagger.cfg`:
+
+```
+# name of the field containing ISO 3166-1 country names
+iso3166.field = cg.coverage.country
+
+# name of the field containing ISO 3166-1 Alpha2 country codes
+iso3166-alpha2.field = cg.coverage.iso3166-alpha2
+
+# only add country codes if an item doesn't have any (default false)
+#forceupdate = false
+```
+
+*Note*: DSpace's curation system supports "profiles" where you can use the same task with different options, for example above I have a normal country code tagger and a "force" variant. To use the "force" variant you create a new configuration file with the overridden options in `config/modules/countrycodetagger.force.cfg`. The "force" profile clears all existing country codes and updates everything.
+
+## Invocation
+Once the jar is installed and you have added appropriate configuration in `~/dspace/config/modules`:
+
+```
+$ ~/dspace/bin/dspace curate -t countrycodetagger -i 10568/3 -r - -l 500 -s object
+```
+
+*Note*: it is very important to set the cache limit (`-l`) and the database transaction scope to something sensible (`object`) if you're curating a community or collection with more than a few hundred items.
+
+## TODO
+
+- Make sure this doesn't work on items in the workflow
+- Check for existence of metadata field before trying to add metadata
+- Add tests
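The README above describes `CountryCodeTagger` as deriving ISO 3166-1 Alpha2 codes from country names, with a `forceupdate` option that decides whether existing codes are kept. The diff does not show how the task resolves names, so the following is only a minimal sketch of the idea, assuming the JDK's built-in `Locale` country data as the lookup source. The class `CountryCodeSketch` and its methods are hypothetical, and the names CGSpace stores in `cg.coverage.country` may not match the JDK's English display names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch, not the project's CountryCodeTagger implementation.
class CountryCodeSketch {
    // English display name (lower-cased) -> ISO 3166-1 alpha-2 code.
    private static final Map<String, String> CODE_BY_NAME = buildIndex();

    private static Map<String, String> buildIndex() {
        Map<String, String> index = new HashMap<>();
        for (String code : Locale.getISOCountries()) {
            Locale country = new Locale.Builder().setRegion(code).build();
            index.put(normalise(country.getDisplayCountry(Locale.ENGLISH)), code);
        }
        return index;
    }

    private static String normalise(String name) {
        return name.trim().toLowerCase(Locale.ROOT);
    }

    // Looks up the alpha-2 code for a country name, ignoring case and padding.
    static Optional<String> alpha2For(String countryName) {
        if (countryName == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(CODE_BY_NAME.get(normalise(countryName)));
    }

    // One reading of the forceupdate option: without it, items that already
    // have codes are left alone; with it, codes are rebuilt from the names.
    static List<String> resolveCodes(List<String> existingCodes,
                                     List<String> countryNames,
                                     boolean forceUpdate) {
        if (!forceUpdate && !existingCodes.isEmpty()) {
            return new ArrayList<>(existingCodes);
        }
        List<String> codes = new ArrayList<>();
        for (String name : countryNames) {
            alpha2For(name).ifPresent(codes::add);
        }
        return codes;
    }

    public static void main(String[] args) {
        System.out.println(alpha2For("Kenya"));
        System.out.println(alpha2For("Atlantis"));
    }
}
```

Names that cannot be resolved are skipped rather than guessed, which keeps the sketch from writing wrong codes. A real task would also need to write the result back to the field configured as `iso3166-alpha2.field`.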
@@ -0,0 +1,132 @@
+package io.github.ilri.cgspace.scripts;
+
+import org.apache.commons.lang.StringUtils;
+import org.dspace.authorize.AuthorizeException;
+import org.dspace.content.*;
+import org.dspace.core.Constants;
+import org.dspace.core.Context;
+import org.dspace.handle.HandleManager;
+
+import java.io.IOException;
+import java.sql.SQLException;
+
+/**
+ * @author Andrea Schweer schweer@waikato.ac.nz for the LCoNZ Institutional Research Repositories
+ * @author Alan Orth for the International Livestock Research Institute
+ * @version 5.4
+ * @since 5.1
+ */
+public class FixJpgJpgThumbnails {
+
+	public static void main(String[] args) {
+		String parentHandle = null;
+		if (args.length >= 1) {
+			parentHandle = args[0];
+		}
+
+		Context context = null;
+		try {
+			context = new Context();
+			context.turnOffAuthorisationSystem();
+
+			if (StringUtils.isBlank(parentHandle)) {
+				process(context, Item.findAll(context));
+			} else {
+				DSpaceObject parent = HandleManager.resolveToObject(context, parentHandle);
+				if (parent != null) {
+					switch (parent.getType()) {
+						case Constants.COLLECTION:
+							process(context, ((Collection) parent).getAllItems()); // getAllItems because we want to work on non-archived ones as well
+							break;
+						case Constants.COMMUNITY:
+							Collection[] collections = ((Community) parent).getCollections();
+							for (Collection collection : collections) {
+								process(context, collection.getAllItems()); // getAllItems because we want to work on non-archived ones as well
+							}
+							break;
+						case Constants.SITE:
+							process(context, Item.findAll(context));
+							break;
+						case Constants.ITEM:
+							processItem((Item) parent);
+							context.commit();
+							break;
+					}
+				}
+			}
+		} catch (SQLException | AuthorizeException | IOException e) {
+			e.printStackTrace(System.err);
+		} finally {
+			if (context != null && context.isValid()) {
+				context.abort();
+			}
+		}
+	}
+
+	private static void process(Context context, ItemIterator items) throws SQLException, IOException, AuthorizeException {
+		while (items.hasNext()) {
+			Item item = items.next();
+			processItem(item);
+			context.commit();
+			item.decache();
+		}
+	}
+
+	private static void processItem(Item item) throws SQLException, AuthorizeException, IOException {
+		// Some bitstreams like Infographics are large JPGs and put in the ORIGINAL bundle on purpose so we shouldn't
+		// swap them.
+		Metadatum[] itemTypes = item.getMetadataByMetadataString("dc.type");
+		boolean itemHasInfographic = false;
+		for (Metadatum itemType: itemTypes) {
+			if (itemType.value.equals("Infographic")) {
+				itemHasInfographic = true;
+				break;
+			}
+		}
+
+		Bundle[] thumbnailBundles = item.getBundles("THUMBNAIL");
+		for (Bundle thumbnailBundle : thumbnailBundles) {
+			Bitstream[] thumbnailBundleBitstreams = thumbnailBundle.getBitstreams();
+			for (Bitstream thumbnailBitstream : thumbnailBundleBitstreams) {
+				String thumbnailName = thumbnailBitstream.getName();
+
+				if (thumbnailName.toLowerCase().contains(".jpg.jpg")) {
+					Bundle[] originalBundles = item.getBundles("ORIGINAL");
+					for (Bundle originalBundle : originalBundles) {
+						Bitstream[] originalBundleBitstreams = originalBundle.getBitstreams();
+
+						for (Bitstream originalBitstream : originalBundleBitstreams) {
+							String originalName = originalBitstream.getName();
+
+							long originalBitstreamBytes = originalBitstream.getSize();
+
+							/*
+							- check if the original file name is the same as the thumbnail name minus the extra ".jpg"
+							- check if the thumbnail description indicates it was automatically generated
+							- check if the item has dc.type Infographic (JPG could be the "real" item!)
+							- check if the original bitstream is less than ~100KiB
+							    - Note: in my tests there were 4022 items with ".jpg.jpg" thumbnails totaling 394549249
+							      bytes for an average of about 98KiB so ~100KiB seems like a good cut off
+							*/
+							if (
+									originalName.equalsIgnoreCase(StringUtils.removeEndIgnoreCase(thumbnailName, ".jpg"))
+									&& ("Generated Thumbnail".equals(thumbnailBitstream.getDescription()) || "IM Thumbnail".equals(thumbnailBitstream.getDescription()))
+									&& !itemHasInfographic
+									&& originalBitstreamBytes < 100000
+							) {
+								System.out.println(item.getHandle() + ": replacing " + thumbnailName + " with " + originalName);
+
+								//add the original bitstream to the THUMBNAIL bundle
+								thumbnailBundle.addBitstream(originalBitstream);
+								//remove the original bitstream from the ORIGINAL bundle
+								originalBundle.removeBitstream(originalBitstream);
+								//remove the JpgJpg bitstream from the THUMBNAIL bundle
+								thumbnailBundle.removeBitstream(thumbnailBitstream);
+							}
+						}
+					}
+				}
+			}
+		}
+	}
+}
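The four checks listed in the comment inside `processItem` can be read as a single predicate over plain values. The sketch below restates them without any DSpace dependency so the rule can be exercised in isolation. `ThumbnailReplacementCheck` and its method names are hypothetical, plain strings and longs stand in for the `Bitstream` and `Item` objects, and `stripTrailingJpg` is a stand-in for the `StringUtils.removeEndIgnoreCase` call used in the diff.

```java
import java.util.Locale;

// Hypothetical stand-alone sketch: plain values stand in for DSpace's
// Bitstream and Item objects, so this is not the project's actual API.
class ThumbnailReplacementCheck {
    // Same numeric cut-off as the size check in the diff.
    static final long MAX_ORIGINAL_BYTES = 100_000L;

    // Drops one trailing ".jpg" in any letter case, e.g. "a.JPG.jpg" -> "a.JPG".
    static String stripTrailingJpg(String name) {
        String suffix = ".jpg";
        int start = name.length() - suffix.length();
        if (start >= 0 && name.regionMatches(true, start, suffix, 0, suffix.length())) {
            return name.substring(0, start);
        }
        return name;
    }

    // True only when every condition from the processItem comment holds.
    static boolean shouldReplace(String thumbnailName, String thumbnailDescription,
                                 String originalName, long originalBytes,
                                 boolean itemIsInfographic) {
        boolean doubledExtension =
                thumbnailName.toLowerCase(Locale.ROOT).contains(".jpg.jpg");
        boolean namesMatch =
                originalName.equalsIgnoreCase(stripTrailingJpg(thumbnailName));
        boolean autoGenerated = "Generated Thumbnail".equals(thumbnailDescription)
                || "IM Thumbnail".equals(thumbnailDescription);
        return doubledExtension
                && namesMatch
                && autoGenerated
                && !itemIsInfographic
                && originalBytes < MAX_ORIGINAL_BYTES;
    }

    public static void main(String[] args) {
        // Small auto-generated thumbnail of a small JPG: candidate for replacement.
        System.out.println(shouldReplace("cover.JPG.jpg", "IM Thumbnail", "cover.jpg", 45_000L, false));
        // Same names, but the original is too large to serve as a thumbnail.
        System.out.println(shouldReplace("cover.jpg.jpg", "Generated Thumbnail", "cover.jpg", 250_000L, false));
    }
}
```

Pulling the rule into a pure function like this would also make it easy to unit test, which the READMEs list under TODO ("Add tests").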
							
								
								
									
src/main/java/io/github/ilri/cgspace/scripts/README.md (41 lines changed, Normal file)
@@ -0,0 +1,41 @@
+# Scripts
+Java-based helpers used on the [CGSpace](https://cgspace.cgiar.org) institutional repository:
+
+- **FixJpgJpgThumbnails**: Fix low-quality ".jpg.jpg" thumbnails by replacing them with their originals
+
+Tested on DSpace 5.8. Read more about the [DSpace curation system](https://wiki.lyrasis.org/display/DSDOC5x/Curation+System).
+
+## Build and Install
+
+### Integrate into DSpace Build
+To use these curation tasks in a DSpace project add the following dependency to `dspace/modules/additions/pom.xml`:
+
+```
+<dependency>
+  <groupId>io.github.ilri.cgspace</groupId>
+  <artifactId>cgspace-java-helpers</artifactId>
+  <version>5.3</version>
+</dependency>
+```
+
+The jar will be copied to all DSpace applications.
+
+### Manual Build and Install
+To build the standalone jar:
+
+```
+$ mvn package
+```
+
+Copy the resulting jar to the DSpace `lib` directory:
+
+```
+$ cp target/cgspace-java-helpers-5.3.jar ~/dspace/lib
+```
+
+### Invocation
+The script only takes one argument, which is a community, collection, or item:
+
+```
+$ dspace dsrun io.github.ilri.cgspace.scripts.FixJpgJpgThumbnails 10568/83389
+```