Mirror of https://github.com/ilri/cgspace-java-helpers.git (synced 2024-11-26 00:28:20 +01:00)
Compare commits: 12a606ac61...3c36452891 (12 commits)
| Author | SHA1 | Date |
|---|---|---|
| | 3c36452891 | |
| | 3a860dabe4 | |
| | 5f44c9ea8a | |
| | 32a14c0ea5 | |
| | 13d3dfb885 | |
| | 1e7df1ce46 | |
| | 443e5576ab | |
| | 8531992412 | |
| | 27016f5f77 | |
| | 3a583c4f86 | |
| | 28668f76c9 | |
| | e0153fd38a | |
@@ -4,6 +4,12 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## Unreleased
+
+## [7.6.1.3] - 2024-06-26
+### Updated
+- Add more formats to `NormalizeDOIs` curation task
+
 ## [7.6.1.2] - 2024-04-25
 ### Changed
 - Remove reporting from curation tasks since "results" are enough
README.md (11 lines changed)
@@ -6,7 +6,7 @@ DSpace curation tasks and other Java-based helpers used on the [CGSpace](https:/
 - **FixLowQualityThumbnails**: remove low-quality thumbnails when PDF bitstreams are present
 - **NormalizeDOIs**: normalize DOIs by stripping whitespace, lowercasing, and converting to https://doi.org/ format
 
-Tested on DSpace 7.6. Read more about the [DSpace curation system](https://wiki.lyrasis.org/display/DSDOC7x/Curation+System).
+Tested on DSpace 7.6.1. Read more about the [DSpace curation system](https://wiki.lyrasis.org/display/DSDOC7x/Curation+System).
 
 ## Build and Install
 
@@ -17,7 +17,7 @@ To use these curation tasks in a DSpace project add the following dependency to
 <dependency>
 <groupId>io.github.ilri.cgspace</groupId>
 <artifactId>cgspace-java-helpers</artifactId>
-<version>7.6.1.2-SNAPSHOT</version>
+<version>7.6.1.4-SNAPSHOT</version>
 </dependency>
 ```
 
@@ -33,7 +33,7 @@ $ mvn package
 Copy the resulting jar to the DSpace `lib` directory:
 
 ```console
-$ cp target/cgspace-java-helpers-7.6.1.2-SNAPSHOT.jar ~/dspace/lib/
+$ cp target/cgspace-java-helpers-7.6.1.4-SNAPSHOT.jar ~/dspace/lib/
 ```
 
 ## Configuration
@@ -42,11 +42,6 @@ Please refer to the appropriate README.md file:
 - Curation Tasks: [src/main/java/io/github/ilri/cgspace/ctasks/README.md](https://github.com/ilri/cgspace-java-helpers/blob/dspace7/src/main/java/io/github/ilri/cgspace/ctasks/README.md)
 - Scripts: [src/main/java/io/github/ilri/cgspace/scripts/README.md](https://github.com/ilri/cgspace-java-helpers/blob/dspace7/src/main/java/io/github/ilri/cgspace/scripts/README.md)
 
-## TODO
-
-- Migrate from maven-deploy-plugin to nexus-staging-maven-plugin, see: https://central.sonatype.org/publish/publish-maven/#nexus-staging-maven-plugin-for-deployment-and-release
-- Stop using oss-parent, see: https://central.sonatype.org/publish/publish-maven/#create-a-ticket-with-sonatype
-
 ## Notes
 This project was initially created according to the [Maven Getting Started Guide](https://maven.apache.org/guides/getting-started/):
 
pom.xml (34 lines changed)
@@ -6,10 +6,19 @@
 
 <groupId>io.github.ilri.cgspace</groupId>
 <artifactId>cgspace-java-helpers</artifactId>
-<version>7.6.1.3-SNAPSHOT</version>
+<version>7.6.1.4-SNAPSHOT</version>
 
 <name>cgspace-java-helpers</name>
 <url>https://github.com/ilri/cgspace-java-helpers</url>
+<description>Curation tasks and helper scripts for the CGSpace institutional repository</description>
 
+<developers>
+<developer>
+<name>Alan Orth</name>
+<email>maven@mjanja.mozmail.com</email>
+<organizationUrl>https://mjanja.ch</organizationUrl>
+</developer>
+</developers>
+
 <licenses>
 <license>
@@ -18,14 +27,6 @@
 </license>
 </licenses>
 
-<!-- brings the sonatype snapshot repository and signing requirement on board -->
-<parent>
-<groupId>org.sonatype.oss</groupId>
-<artifactId>oss-parent</artifactId>
-<version>9</version>
-<relativePath />
-</parent>
-
 <properties>
 <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
 <maven.compiler.release>11</maven.compiler.release>
@@ -91,10 +92,6 @@
 <artifactId>maven-install-plugin</artifactId>
 <version>3.1.1</version>
 </plugin>
-<plugin>
-<artifactId>maven-deploy-plugin</artifactId>
-<version>3.1.1</version>
-</plugin>
 <!-- site lifecycle, see https://maven.apache.org/ref/current/maven-core/lifecycles.html#site_Lifecycle -->
 <plugin>
 <artifactId>maven-site-plugin</artifactId>
@@ -104,6 +101,17 @@
 <artifactId>maven-project-info-reports-plugin</artifactId>
 <version>3.5.0</version>
 </plugin>
+<plugin>
+<groupId>org.sonatype.plugins</groupId>
+<artifactId>nexus-staging-maven-plugin</artifactId>
+<version>1.7.0</version>
+<extensions>true</extensions>
+<configuration>
+<serverId>ossrh</serverId>
+<nexusUrl>https://oss.sonatype.org/</nexusUrl>
+<autoReleaseAfterClose>true</autoReleaseAfterClose>
+</configuration>
+</plugin>
 </plugins>
 </pluginManagement>
 </build>
@@ -29,7 +29,7 @@ import java.util.List;
 * TODO: allow operation on communities and collections (currently only works on items)
 *
 * @author Alan Orth for the International Livestock Research Institute
-* @version 7.6.1.2
+* @version 7.6.1.3
 * @since 7.6.1.1
 */
 @Suspendable
@@ -78,17 +78,21 @@ public class NormalizeDOIs extends AbstractCurationTask {
 }
 
 private static String getNormalizedDOI(MetadataValue itemDOI) {
-// 1. Convert to lowercase
+// Convert to lowercase
 String newDOI = itemDOI.getValue().toLowerCase();
-// 2. Strip leading and trailing whitespace
+// Strip leading and trailing whitespace
 newDOI = newDOI.strip();
-// 3. Convert to HTTPS
+// Convert to HTTPS
 newDOI = newDOI.replace("http://", "https://");
-// 4. Prefer doi.org to dx.doi.org
+// Prefer doi.org to dx.doi.org
 newDOI = newDOI.replace("dx.doi.org", "doi.org");
-// 5. Replace values like doi: 10.11648/j.jps.20140201.14
+// Prefer doi.org to www.doi.org
+newDOI = newDOI.replace("www.doi.org", "doi.org");
+// Fix URL encoded slashes (%2f)
+newDOI = newDOI.replace("%2f", "/");
+// Replace values like doi: 10.11648/j.jps.20140201.14
 newDOI = newDOI.replaceAll("^doi: 10\\.", "https://doi.org/10.");
-// 6. Replace values like 10.3390/foods12010115
+// Replace values like 10.3390/foods12010115
 newDOI = newDOI.replaceAll("^10\\.", "https://doi.org/10.");
 
 return newDOI;
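The normalization order in the `getNormalizedDOI` hunk can be tried outside DSpace. What follows is a minimal sketch under stated assumptions, not the project's code: it takes a plain `String` instead of DSpace's `MetadataValue`, the class name `DoiNormalizerSketch` and method name `normalize` are hypothetical, and `Locale.ROOT` is an addition that the diff does not contain.

```java
import java.util.Locale;

// Hypothetical stand-alone sketch of the steps shown in the diff.
final class DoiNormalizerSketch {

    // Takes a plain String; the real task reads the value from a MetadataValue.
    static String normalize(String rawDoi) {
        // Convert to lowercase (Locale.ROOT is an assumption added here so the
        // result does not depend on the JVM's default locale)
        String doi = rawDoi.toLowerCase(Locale.ROOT);
        // Strip leading and trailing whitespace (String.strip needs Java 11+)
        doi = doi.strip();
        // Convert to HTTPS
        doi = doi.replace("http://", "https://");
        // Prefer doi.org to dx.doi.org and www.doi.org
        doi = doi.replace("dx.doi.org", "doi.org");
        doi = doi.replace("www.doi.org", "doi.org");
        // Fix URL encoded slashes; input is already lowercase, so %2F is covered
        doi = doi.replace("%2f", "/");
        // Replace values like "doi: 10.11648/j.jps.20140201.14"
        doi = doi.replaceAll("^doi: 10\\.", "https://doi.org/10.");
        // Replace bare values like "10.3390/foods12010115"
        doi = doi.replaceAll("^10\\.", "https://doi.org/10.");
        return doi;
    }

    public static void main(String[] args) {
        String[] samples = {
            "  HTTP://DX.DOI.ORG/10.3390/Foods12010115 ",
            "doi: 10.11648/j.jps.20140201.14",
            "https://www.doi.org/10.1000%2Fabc"
        };
        for (String sample : samples) {
            System.out.println("[" + sample + "] -> " + normalize(sample));
        }
    }
}
```

Because lowercasing runs first, the later literal replacements only need to match lowercase forms, which is why a single `%2f` replacement also handles `%2F`.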
@@ -4,7 +4,7 @@ DSpace curation tasks used on the [CGSpace](https://cgspace.cgiar.org) instituti
 - **CountryCodeTagger**: add ISO 3166-1 Alpha2 country codes to items based on their existing country metadata
 - **NormalizeDOIs**: normalize DOIs by stripping whitespace, lowercasing, and converting to https://doi.org/ format
 
-Tested on DSpace 7.6. Read more about the [DSpace curation system](https://wiki.lyrasis.org/display/DSDOC5x/Curation+System).
+Tested on DSpace 7.6.1. Read more about the [DSpace curation system](https://wiki.lyrasis.org/display/DSDOC5x/Curation+System).
 
 ## Build and Install
 
@@ -15,7 +15,7 @@ To use these curation tasks in a DSpace project add the following dependency to
 <dependency>
 <groupId>io.github.ilri.cgspace</groupId>
 <artifactId>cgspace-java-helpers</artifactId>
-<version>7.6.1.2-SNAPSHOT</version>
+<version>7.6.1.4-SNAPSHOT</version>
 </dependency>
 ```
 
@@ -31,7 +31,7 @@ $ mvn package
 Copy the resulting jar to the DSpace `lib` directory:
 
 ```
-$ cp target/cgspace-java-helpers-7.6.1.2-SNAPSHOT.jar ~/dspace/lib/
+$ cp target/cgspace-java-helpers-7.6.1.4-SNAPSHOT.jar ~/dspace/lib/
 ```
 
 ## Configuration
@@ -4,7 +4,7 @@ Java-based helpers used on the [CGSpace](https://cgspace.cgiar.org) institutiona
 - **FixJpgJpgThumbnails**: fix low-quality ".jpg.jpg" thumbnails by replacing them with their originals
 - **FixLowQualityThumbnails**: remove low-quality thumbnails when PDF bitstreams are present
 
-Tested on DSpace 7.6. Read more about the [DSpace curation system](https://wiki.lyrasis.org/display/DSDOC6x/Curation+System).
+Tested on DSpace 7.6.1. Read more about the [DSpace curation system](https://wiki.lyrasis.org/display/DSDOC6x/Curation+System).
 
 ## Build and Install
 
@@ -15,7 +15,7 @@ To use these curation tasks in a DSpace project add the following dependency to
 <dependency>
 <groupId>io.github.ilri.cgspace</groupId>
 <artifactId>cgspace-java-helpers</artifactId>
-<version>7.6.1.2-SNAPSHOT</version>
+<version>7.6.1.4-SNAPSHOT</version>
 </dependency>
 ```
 
@@ -31,7 +31,7 @@ $ mvn package
 Copy the resulting jar to the DSpace `lib` directory:
 
 ```console
-$ cp target/cgspace-java-helpers-7.6.1.2-SNAPSHOT.jar ~/dspace/lib/
+$ cp target/cgspace-java-helpers-7.6.1.4-SNAPSHOT.jar ~/dspace/lib/
 ```
 
 ## Invocation