31 Commits

Author SHA1 Message Date
b996b1c0b7 .github/workflows/maven.yml: use latest actions 2025-11-05 10:47:02 +03:00
f5a336ae78 .github/workflows/maven.yml: run on ubuntu-24.04 2025-11-05 10:46:12 +03:00
1d46bea610 .github/workflows/maven.yml: only run on dspace8 branch 2025-11-05 10:45:38 +03:00
8519d5ec04 Version 8.2.0
Tested on DSpace 8.2.
2025-11-05 10:40:26 +03:00
1973ecb85e Add new RemoveGeneratedThumbnails script 2025-11-05 10:39:29 +03:00
e16988aad7 pom.xml: migrate from ossrh to central
Here we use central-publishing-maven-plugin, which uses the correct
server for releases and snapshots automatically without the use of
the <distributionManagement> configuration.

Also, for some reason the deploy was still trying to use an older
maven-deploy-plugin until I removed the <pluginManagement> section.

See: https://central.sonatype.org/pages/ossrh-eol/
2025-11-05 10:27:55 +03:00
fe11add9f2 pom.xml: set Java version to 17 2025-11-05 10:27:55 +03:00
1336610d57 pom.xml: update for DSpace 8.2 2025-11-05 10:27:54 +03:00
13c6612c7f Update gson to version used by dspace-api 2025-04-12 20:19:50 +03:00
813517c789 README.md: bump tested version 2025-02-13 09:51:09 +03:00
5f9490e4e5 Use dspace-api 7.6.3 2025-02-12 15:16:45 +03:00
9a46416331 Use gson 2.10.1
Prevent dependency convergence.
2025-01-28 16:19:50 +03:00
2be5c62d92 CHANGELOG.md: add changes 2025-01-27 16:03:52 +03:00
2bd7d5e679 src/main: update DSDOC links 2025-01-27 16:03:12 +03:00
70cf68b8bc Update tested on versions 2025-01-27 16:01:39 +03:00
4f81e1e17e pom.xml: use gson >= 2.10
This is used by dspace-api 7.6.2+.
2025-01-27 13:24:46 +03:00
5113a91257 pom.xml: use dspace-api 7.6.2 2025-01-27 13:24:17 +03:00
3c36452891 Update "tested on" versions. 2024-06-26 16:45:11 +03:00
3a860dabe4 Update install instructions 2024-06-26 16:42:30 +03:00
5f44c9ea8a README.md: remove TODO about migrating to nexus-staging-maven-plugin 2024-06-26 16:40:47 +03:00
32a14c0ea5 pom.xml: replace maven-deploy-plugin
The nexus-staging-maven-plugin replaces maven-deploy-plugin. I am
not sure if my configuration is correct yet.

See: https://github.com/sonatype/nexus-maven-plugins/tree/main/staging/maven-plugin
2024-06-26 16:29:35 +03:00
13d3dfb885 pom.xml: add more information
Add description and developers section to satisfy requirements.

See: https://central.sonatype.org/publish/requirements/
2024-06-26 16:10:36 +03:00
1e7df1ce46 Remove use of oss-parent
This is boilerplate that came from setting up the project and has
been deprecated for several years.
2024-06-26 16:04:14 +03:00
443e5576ab Bump version to 7.6.1.4-SNAPSHOT 2024-06-26 15:02:07 +03:00
8531992412 Version 7.6.1.3 2024-06-26 15:00:25 +03:00
27016f5f77 CHANGELOG.md: add unreleased notes 2024-06-26 14:12:37 +03:00
3a583c4f86 src/main/java: more DOI normalization
Normalize %2f to /.
2024-06-26 12:46:08 +03:00
28668f76c9 src/main: remove numbered comments in NormalizeDOIs 2024-06-25 11:55:36 +03:00
e0153fd38a src/main: add more DOI formats to NormalizeDOIs
I saw some DOIs like "www.doi.org" in our repository recently.
2024-06-25 11:42:37 +03:00
12a606ac61 pom.xml: bump version to 7.6.1.3-SNAPSHOT 2024-05-14 12:47:47 +03:00
692a62b454 src/main/java: update curation tasks README.md
Add eperson ID to curation invocation. DSpace 7 requires this.
2024-04-29 09:33:39 +03:00
8 changed files with 257 additions and 94 deletions


@@ -5,19 +5,19 @@ name: Build
 on:
   push:
-    branches: [ dspace7 ]
+    branches: [ dspace8 ]
   pull_request:
-    branches: [ dspace7 ]
+    branches: [ dspace8 ]
 jobs:
   build:
-    runs-on: ubuntu-22.04
+    runs-on: ubuntu-24.04
     steps:
-    - uses: actions/checkout@v4
+    - uses: actions/checkout@v5
     - name: Set up JDK 17
-      uses: actions/setup-java@v4
+      uses: actions/setup-java@v5
       with:
         java-version: 17
         distribution: 'temurin'


@@ -4,6 +4,19 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+## [8.2.0] - 2025-09-16
+### Added
+- New `RemoveGeneratedThumbnails` script
+### Updated
+- Update dspace-api dependency to 8.2
+- Update gson dependency to 2.13.1 to match dspace-api
+- Publish to Maven Central instead of OSSRH
+## [7.6.1.3] - 2024-06-26
+### Updated
+- Add more formats to `NormalizeDOIs` curation task
 ## [7.6.1.2] - 2024-04-25
 ### Changed
 - Remove reporting from curation tasks since "results" are enough


@@ -4,9 +4,10 @@ DSpace curation tasks and other Java-based helpers used on the [CGSpace](https:/
 - **CountryCodeTagger**: add ISO 3166-1 Alpha2 country codes to items based on their existing country metadata
 - **FixJpgJpgThumbnails**: fix low-quality ".jpg.jpg" thumbnails by replacing them with their originals
 - **FixLowQualityThumbnails**: remove low-quality thumbnails when PDF bitstreams are present
+- **RemoveGeneratedThumbnails**: remove generated thumbnails (in preparation for re-generating)
 - **NormalizeDOIs**: normalize DOIs by stripping whitespace, lowercasing, and converting to https://doi.org/ format
-Tested on DSpace 7.6. Read more about the [DSpace curation system](https://wiki.lyrasis.org/display/DSDOC7x/Curation+System).
+Tested on DSpace 8.2. Read more about the [DSpace curation system](https://wiki.lyrasis.org/display/DSDOC8x/Curation+System).
 ## Build and Install
@@ -17,7 +18,7 @@ To use these curation tasks in a DSpace project add the following dependency to
 <dependency>
   <groupId>io.github.ilri.cgspace</groupId>
   <artifactId>cgspace-java-helpers</artifactId>
-  <version>7.6.1.2-SNAPSHOT</version>
+  <version>8.2.0-SNAPSHOT</version>
 </dependency>
 ```
@@ -33,7 +34,7 @@ $ mvn package
 Copy the resulting jar to the DSpace `lib` directory:
 ```console
-$ cp target/cgspace-java-helpers-7.6.1.2-SNAPSHOT.jar ~/dspace/lib/
+$ cp target/cgspace-java-helpers-8.2.0-SNAPSHOT.jar ~/dspace/lib/
 ```
 ## Configuration
@@ -42,11 +43,6 @@ Please refer to the appropriate README.md file:
 - Curation Tasks: [src/main/java/io/github/ilri/cgspace/ctasks/README.md](https://github.com/ilri/cgspace-java-helpers/blob/dspace7/src/main/java/io/github/ilri/cgspace/ctasks/README.md)
 - Scripts: [src/main/java/io/github/ilri/cgspace/scripts/README.md](https://github.com/ilri/cgspace-java-helpers/blob/dspace7/src/main/java/io/github/ilri/cgspace/scripts/README.md)
-## TODO
-- Migrate from maven-deploy-plugin to nexus-staging-maven-plugin, see: https://central.sonatype.org/publish/publish-maven/#nexus-staging-maven-plugin-for-deployment-and-release
-- Stop using oss-parent, see: https://central.sonatype.org/publish/publish-maven/#create-a-ticket-with-sonatype
 ## Notes
 This project was initially created according to the [Maven Getting Started Guide](https://maven.apache.org/guides/getting-started/):

pom.xml

@@ -6,10 +6,19 @@
   <groupId>io.github.ilri.cgspace</groupId>
   <artifactId>cgspace-java-helpers</artifactId>
-  <version>7.6.1.2-SNAPSHOT</version>
+  <version>8.2.0-SNAPSHOT</version>
   <name>cgspace-java-helpers</name>
   <url>https://github.com/ilri/cgspace-java-helpers</url>
+  <description>Curation tasks and helper scripts for the CGSpace institutional repository</description>
+  <developers>
+    <developer>
+      <name>Alan Orth</name>
+      <email>maven@mjanja.mozmail.com</email>
+      <organizationUrl>https://mjanja.ch</organizationUrl>
+    </developer>
+  </developers>
   <licenses>
     <license>
@@ -18,29 +27,28 @@
     </license>
   </licenses>
-  <!-- brings the sonatype snapshot repository and signing requirement on board -->
-  <parent>
-    <groupId>org.sonatype.oss</groupId>
-    <artifactId>oss-parent</artifactId>
-    <version>9</version>
-    <relativePath />
-  </parent>
   <properties>
     <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
-    <maven.compiler.release>11</maven.compiler.release>
+    <maven.compiler.release>17</maven.compiler.release>
   </properties>
   <dependencies>
     <dependency>
       <groupId>com.google.code.gson</groupId>
       <artifactId>gson</artifactId>
-      <version>2.9.0</version>
+      <version>2.13.1</version>
+      <!-- Ignore gson's dependency on error_prone_annotations because it causes dependency convergence with something pulled in by dspace-api -->
+      <exclusions>
+        <exclusion>
+          <groupId>com.google.errorprone</groupId>
+          <artifactId>error_prone_annotations</artifactId>
+        </exclusion>
+      </exclusions>
     </dependency>
     <dependency>
       <groupId>org.dspace</groupId>
       <artifactId>dspace-api</artifactId>
-      <version>7.6.1</version>
+      <version>8.2</version>
       <scope>provided</scope>
     </dependency>
   </dependencies>
@@ -51,19 +59,7 @@
     <url>https://github.com/ilri/cgspace-java-helpers</url>
   </scm>
-  <distributionManagement>
-    <snapshotRepository>
-      <id>ossrh</id>
-      <url>https://oss.sonatype.org/content/repositories/snapshots</url>
-    </snapshotRepository>
-    <repository>
-      <id>ossrh</id>
-      <url>https://oss.sonatype.org/service/local/staging/deploy/maven2</url>
-    </repository>
-  </distributionManagement>
   <build>
-    <pluginManagement><!-- lock down plugins versions to avoid using Maven defaults (may be moved to parent pom) -->
     <plugins>
       <!-- clean lifecycle, see https://maven.apache.org/ref/current/maven-core/lifecycles.html#clean_Lifecycle -->
       <plugin>
@@ -91,10 +87,6 @@
         <artifactId>maven-install-plugin</artifactId>
         <version>3.1.1</version>
       </plugin>
-      <plugin>
-        <artifactId>maven-deploy-plugin</artifactId>
-        <version>3.1.1</version>
-      </plugin>
       <!-- site lifecycle, see https://maven.apache.org/ref/current/maven-core/lifecycles.html#site_Lifecycle -->
       <plugin>
         <artifactId>maven-site-plugin</artifactId>
@@ -104,8 +96,16 @@
         <artifactId>maven-project-info-reports-plugin</artifactId>
         <version>3.5.0</version>
       </plugin>
+      <plugin>
+        <groupId>org.sonatype.central</groupId>
+        <artifactId>central-publishing-maven-plugin</artifactId>
+        <version>0.8.0</version>
+        <extensions>true</extensions>
+        <configuration>
+          <publishingServerId>central</publishingServerId>
+        </configuration>
+      </plugin>
     </plugins>
-    </pluginManagement>
   </build>
   <repositories>


@@ -29,7 +29,7 @@ import java.util.List;
  * TODO: allow operation on communities and collections (currently only works on items)
  *
  * @author Alan Orth for the International Livestock Research Institute
- * @version 7.6.1.2
+ * @version 7.6.1.3
  * @since 7.6.1.1
  */
 @Suspendable
@@ -78,17 +78,21 @@ public class NormalizeDOIs extends AbstractCurationTask {
     }
     private static String getNormalizedDOI(MetadataValue itemDOI) {
-        // 1. Convert to lowercase
+        // Convert to lowercase
         String newDOI = itemDOI.getValue().toLowerCase();
-        // 2. Strip leading and trailing whitespace
+        // Strip leading and trailing whitespace
         newDOI = newDOI.strip();
-        // 3. Convert to HTTPS
+        // Convert to HTTPS
         newDOI = newDOI.replace("http://", "https://");
-        // 4. Prefer doi.org to dx.doi.org
+        // Prefer doi.org to dx.doi.org
         newDOI = newDOI.replace("dx.doi.org", "doi.org");
-        // 5. Replace values like doi: 10.11648/j.jps.20140201.14
+        // Prefer doi.org to www.doi.org
+        newDOI = newDOI.replace("www.doi.org", "doi.org");
+        // Fix URL encoded slashes (%2f)
+        newDOI = newDOI.replace("%2f", "/");
+        // Replace values like doi: 10.11648/j.jps.20140201.14
         newDOI = newDOI.replaceAll("^doi: 10\\.", "https://doi.org/10.");
-        // 6. Replace values like 10.3390/foods12010115
+        // Replace values like 10.3390/foods12010115
         newDOI = newDOI.replaceAll("^10\\.", "https://doi.org/10.");
         return newDOI;
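
The normalization chain in this hunk can be tried outside DSpace. The following is a minimal sketch, assuming the same steps in the same order but taking a plain `String` instead of a `MetadataValue`; the class name `DoiNormalizerSketch` and the sample inputs are hypothetical and not part of the repository.

```java
// Hypothetical standalone sketch; not part of cgspace-java-helpers.
class DoiNormalizerSketch {

    // Mirrors the steps of getNormalizedDOI() shown in the hunk above,
    // operating on a plain String rather than a DSpace MetadataValue.
    static String normalize(String doi) {
        // Convert to lowercase, then strip leading and trailing whitespace
        String newDOI = doi.toLowerCase().strip();
        // Convert to HTTPS
        newDOI = newDOI.replace("http://", "https://");
        // Prefer doi.org to dx.doi.org and www.doi.org
        newDOI = newDOI.replace("dx.doi.org", "doi.org");
        newDOI = newDOI.replace("www.doi.org", "doi.org");
        // Fix URL encoded slashes (already lowercased, so %2F is covered too)
        newDOI = newDOI.replace("%2f", "/");
        // Replace values like "doi: 10.11648/j.jps.20140201.14"
        newDOI = newDOI.replaceAll("^doi: 10\\.", "https://doi.org/10.");
        // Replace bare values like "10.3390/foods12010115"
        newDOI = newDOI.replaceAll("^10\\.", "https://doi.org/10.");
        return newDOI;
    }

    public static void main(String[] args) {
        String[] samples = {
            "  HTTP://DX.DOI.ORG/10.1234/ABC ",
            "https://www.doi.org/10.3390%2Ffoods12010115",
            "doi: 10.11648/j.jps.20140201.14",
            "10.3390/foods12010115"
        };
        for (String sample : samples) {
            System.out.println("[" + sample + "] -> " + normalize(sample));
        }
    }
}
```

Under this chain, a value with no scheme such as `www.doi.org/10.1/x` comes out as `doi.org/10.1/x`, because no step adds `https://` in that case.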


@@ -4,7 +4,7 @@ DSpace curation tasks used on the [CGSpace](https://cgspace.cgiar.org) instituti
 - **CountryCodeTagger**: add ISO 3166-1 Alpha2 country codes to items based on their existing country metadata
 - **NormalizeDOIs**: normalize DOIs by stripping whitespace, lowercasing, and converting to https://doi.org/ format
-Tested on DSpace 7.6. Read more about the [DSpace curation system](https://wiki.lyrasis.org/display/DSDOC5x/Curation+System).
+Tested on DSpace 8.2. Read more about the [DSpace curation system](https://wiki.lyrasis.org/display/DSDOC8x/Curation+System).
 ## Build and Install
@@ -15,7 +15,7 @@ To use these curation tasks in a DSpace project add the following dependency to
 <dependency>
   <groupId>io.github.ilri.cgspace</groupId>
   <artifactId>cgspace-java-helpers</artifactId>
-  <version>7.6.1.2-SNAPSHOT</version>
+  <version>8.2.0-SNAPSHOT</version>
 </dependency>
 ```
@@ -31,7 +31,7 @@ $ mvn package
 Copy the resulting jar to the DSpace `lib` directory:
 ```
-$ cp target/cgspace-java-helpers-7.6.1.2-SNAPSHOT.jar ~/dspace/lib/
+$ cp target/cgspace-java-helpers-8.2.0-SNAPSHOT.jar ~/dspace/lib/
 ```
 ## Configuration
@@ -43,7 +43,7 @@ plugin.named.org.dspace.curate.CurationTask = io.github.ilri.cgspace.ctasks.Coun
 plugin.named.org.dspace.curate.CurationTask = io.github.ilri.cgspace.ctasks.NormalizeDOIs = normalizedois
 ```
-And then add the following variables to your `local.cfg` or some other [configuration file that is included](https://wiki.lyrasis.org/display/DSDOC6x/Configuration+Reference#ConfigurationReference-IncludingotherPropertyFiles):
+And then add the following variables to your `local.cfg` or some other [configuration file that is included](https://wiki.lyrasis.org/display/DSDOC7x/Configuration+Reference#ConfigurationReference-IncludingotherPropertyFiles):
 ```
 # name of the field containing ISO 3166-1 country names
@@ -62,7 +62,7 @@ countrycodetagger.iso3166-alpha2.field = cg.coverage.iso3166-alpha2
 Once the jar is installed and you have added appropriate configuration in `~/dspace/config/modules`:
 ```
-$ ~/dspace/bin/dspace curate -t countrycodetagger -i 10568/3 -r - -s object
+$ ~/dspace/bin/dspace curate -e eperson@repo.org -t countrycodetagger -i 10568/3 -r - -s object
 ```
 *Note*: it is very important to set the database transaction scope to something sensible (`object`) if you're curating a community or collection with more than a few hundred items.


@@ -3,8 +3,9 @@ Java-based helpers used on the [CGSpace](https://cgspace.cgiar.org) institutiona
 - **FixJpgJpgThumbnails**: fix low-quality ".jpg.jpg" thumbnails by replacing them with their originals
 - **FixLowQualityThumbnails**: remove low-quality thumbnails when PDF bitstreams are present
+- **RemoveGeneratedThumbnails**: remove generated thumbnails (in preparation for re-generating)
-Tested on DSpace 7.6. Read more about the [DSpace curation system](https://wiki.lyrasis.org/display/DSDOC6x/Curation+System).
+Tested on DSpace 8.2. Read more about the [DSpace curation system](https://wiki.lyrasis.org/display/DSDOC8x/Curation+System).
 ## Build and Install
@@ -15,7 +16,7 @@ To use these curation tasks in a DSpace project add the following dependency to
 <dependency>
   <groupId>io.github.ilri.cgspace</groupId>
   <artifactId>cgspace-java-helpers</artifactId>
-  <version>7.6.1.2-SNAPSHOT</version>
+  <version>8.2.0-SNAPSHOT</version>
 </dependency>
 ```
@@ -31,7 +32,7 @@ $ mvn package
 Copy the resulting jar to the DSpace `lib` directory:
 ```console
-$ cp target/cgspace-java-helpers-7.6.1.2-SNAPSHOT.jar ~/dspace/lib/
+$ cp target/cgspace-java-helpers-8.2.0-SNAPSHOT.jar ~/dspace/lib/
 ```
 ## Invocation


@@ -0,0 +1,149 @@
+/*
+ * Copyright (C) 2025 Alan Orth
+ *
+ * SPDX-License-Identifier: GPL-3.0-only
+ */
+package io.github.ilri.cgspace.scripts;
+import org.apache.commons.lang.StringUtils;
+import org.dspace.authorize.AuthorizeException;
+import org.dspace.content.Bitstream;
+import org.dspace.content.Bundle;
+import org.dspace.content.Collection;
+import org.dspace.content.Community;
+import org.dspace.content.DSpaceObject;
+import org.dspace.content.Item;
+import org.dspace.content.MetadataValue;
+import org.dspace.content.factory.ContentServiceFactory;
+import org.dspace.content.service.BundleService;
+import org.dspace.content.service.ItemService;
+import org.dspace.core.Constants;
+import org.dspace.core.Context;
+import org.dspace.handle.factory.HandleServiceFactory;
+import org.dspace.handle.service.HandleService;
+import java.io.IOException;
+import java.sql.SQLException;
+import java.util.Iterator;
+import java.util.List;
+/**
+ * @author Andrea Schweer schweer@waikato.ac.nz for the LCoNZ Institutional Research Repositories
+ * @author Alan Orth for the International Livestock Research Institute
+ * @version 8.2.0
+ * @since 8.2.0
+ */
+public class RemoveGeneratedThumbnails {
+    // note: static members belong to the class itself, not any one instance
+    public static ItemService itemService = ContentServiceFactory.getInstance().getItemService();
+    public static HandleService handleService =
+            HandleServiceFactory.getInstance().getHandleService();
+    public static BundleService bundleService =
+            ContentServiceFactory.getInstance().getBundleService();
+    public static void main(String[] args) {
+        String parentHandle = null;
+        if (args.length >= 1) {
+            parentHandle = args[0];
+        }
+        Context context = null;
+        try {
+            context = new Context();
+            context.turnOffAuthorisationSystem();
+            if (StringUtils.isBlank(parentHandle)) {
+                process(context, itemService.findAll(context));
+                // commit here as well; otherwise the abort() in the finally block rolls back the removals
+                context.commit();
+            } else {
+                DSpaceObject parent = handleService.resolveToObject(context, parentHandle);
+                if (parent != null) {
+                    switch (parent.getType()) {
+                        case Constants.SITE:
+                            process(context, itemService.findAll(context));
+                            context.commit();
+                            break;
+                        case Constants.COMMUNITY:
+                            List<Collection> collections = ((Community) parent).getCollections();
+                            for (Collection collection : collections) {
+                                process(
+                                        context,
+                                        itemService.findAllByCollection(context, collection));
+                            }
+                            context.commit();
+                            break;
+                        case Constants.COLLECTION:
+                            process(
+                                    context,
+                                    itemService.findByCollection(context, (Collection) parent));
+                            context.commit();
+                            break;
+                        case Constants.ITEM:
+                            processItem(context, (Item) parent);
+                            context.commit();
+                            break;
+                    }
+                }
+            }
+        } catch (SQLException | AuthorizeException | IOException e) {
+            e.printStackTrace(System.err);
+        } finally {
+            // roll back anything left uncommitted and release the context
+            if (context != null && context.isValid()) {
+                context.abort();
+            }
+        }
+    }
+    private static void process(Context context, Iterator<Item> items)
+            throws SQLException, IOException, AuthorizeException {
+        while (items.hasNext()) {
+            Item item = items.next();
+            processItem(context, item);
+            itemService.update(context, item);
+        }
+    }
+    private static void processItem(Context context, Item item)
+            throws SQLException, AuthorizeException, IOException {
+        List<Bundle> thumbnailBundles = item.getBundles("THUMBNAIL");
+        for (Bundle thumbnailBundle : thumbnailBundles) {
+            List<Bitstream> thumbnailBundleBitstreams = thumbnailBundle.getBitstreams();
+            for (Bitstream thumbnailBitstream : thumbnailBundleBitstreams) {
+                String thumbnailName = thumbnailBitstream.getName();
+                String thumbnailDescription = thumbnailBitstream.getDescription();
+                // There is no point continuing if the thumbnail's description is empty or null
+                if (StringUtils.isEmpty(thumbnailDescription)) {
+                    continue;
+                }
+                if (thumbnailName.toLowerCase().endsWith(".pdf.jpg")) {
+                    List<Bundle> originalBundles = item.getBundles("ORIGINAL");
+                    for (Bundle originalBundle : originalBundles) {
+                        List<Bitstream> originalBundleBitstreams = originalBundle.getBitstreams();
+                        for (Bitstream originalBitstream : originalBundleBitstreams) {
+                            String originalName = originalBitstream.getName();
+                            /*
+                            - check if the original file name is the same as the thumbnail name minus the extra ".jpg"
+                            - check if the thumbnail description indicates it was automatically generated
+                            */
+                            if (originalName.equalsIgnoreCase(
+                                            StringUtils.removeEndIgnoreCase(thumbnailName, ".jpg"))
+                                    && ("Generated Thumbnail".equals(thumbnailDescription)
+                                            || "IM Thumbnail".equals(thumbnailDescription))) {
+                                System.out.println(
+                                        item.getHandle()
+                                                + ": removing "
+                                                + thumbnailName);
+                                thumbnailBundle.removeBitstream(thumbnailBitstream);
+                            }
+                        }
+                    }
+                }
+            }
+        }
+    }
+}
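
The removal condition in `processItem` comes down to a check on three strings: the thumbnail name, the thumbnail description, and the name of a bitstream in the ORIGINAL bundle. The following is a minimal sketch of that check on plain strings, with no DSpace classes involved. The class and method names are hypothetical, the null guard on the two names is an added assumption (the script itself only guards the description), and `Locale.ROOT` is used for lowercasing where the script calls `toLowerCase()` without a locale.

```java
import java.util.Locale;

// Hypothetical standalone sketch; not part of cgspace-java-helpers.
class GeneratedThumbnailCheckSketch {

    // Returns true when a thumbnail would be removed by the condition in processItem().
    static boolean isGeneratedThumbnail(
            String thumbnailName, String thumbnailDescription, String originalName) {
        // An empty or null description can never mark a generated thumbnail
        if (thumbnailDescription == null || thumbnailDescription.isEmpty()) {
            return false;
        }
        // Assumption: treat missing names as "no match" instead of throwing
        if (thumbnailName == null || originalName == null) {
            return false;
        }
        // Only thumbnails derived from PDFs are considered
        if (!thumbnailName.toLowerCase(Locale.ROOT).endsWith(".pdf.jpg")) {
            return false;
        }
        // The original's name must equal the thumbnail name minus the trailing ".jpg"
        String withoutJpg =
                thumbnailName.substring(0, thumbnailName.length() - ".jpg".length());
        boolean namesMatch = originalName.equalsIgnoreCase(withoutJpg);
        // The description must be one of the two values checked by the script
        boolean generated =
                "Generated Thumbnail".equals(thumbnailDescription)
                        || "IM Thumbnail".equals(thumbnailDescription);
        return namesMatch && generated;
    }

    public static void main(String[] args) {
        System.out.println(
                isGeneratedThumbnail("report.pdf.jpg", "Generated Thumbnail", "report.pdf"));
        System.out.println(
                isGeneratedThumbnail("report.pdf.jpg", "Uploaded by hand", "report.pdf"));
    }
}
```

Keeping the check as a pure function like this would make it possible to test the matching rules without a DSpace database.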