7 Commits

Author SHA1 Message Date
8531992412 Version 7.6.1.3 2024-06-26 15:00:25 +03:00
27016f5f77 CHANGELOG.md: add unreleased notes 2024-06-26 14:12:37 +03:00
3a583c4f86 src/main/java: more DOI normalization
Normalize %2f to /.
2024-06-26 12:46:08 +03:00
28668f76c9 src/main: remove numbered comments in NormalizeDOIs 2024-06-25 11:55:36 +03:00
e0153fd38a src/main: add more DOI formats to NormalizeDOIs
I saw some DOIs like "www.doi.org" in our repository recently.
2024-06-25 11:42:37 +03:00
12a606ac61 pom.xml: bump version to 7.6.1.3-SNAPSHOT 2024-05-14 12:47:47 +03:00
692a62b454 src/main/java: update curation tasks README.md
Add eperson ID to curation invocation. DSpace 7 requires this.
2024-04-29 09:33:39 +03:00
6 changed files with 23 additions and 15 deletions


@ -4,6 +4,10 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+ ## [7.6.1.3] - 2024-06-26
+ ### Updated
+ - Add more formats to `NormalizeDOIs` curation task
## [7.6.1.2] - 2024-04-25
### Changed
- Remove reporting from curation tasks since "results" are enough


@ -17,7 +17,7 @@ To use these curation tasks in a DSpace project add the following dependency to
<dependency>
<groupId>io.github.ilri.cgspace</groupId>
<artifactId>cgspace-java-helpers</artifactId>
- <version>7.6.1.2-SNAPSHOT</version>
+ <version>7.6.1.3-SNAPSHOT</version>
</dependency>
```
@ -33,7 +33,7 @@ $ mvn package
Copy the resulting jar to the DSpace `lib` directory:
```console
- $ cp target/cgspace-java-helpers-7.6.1.2-SNAPSHOT.jar ~/dspace/lib/
+ $ cp target/cgspace-java-helpers-7.6.1.3-SNAPSHOT.jar ~/dspace/lib/
```
## Configuration


@ -6,7 +6,7 @@
<groupId>io.github.ilri.cgspace</groupId>
<artifactId>cgspace-java-helpers</artifactId>
- <version>7.6.1.2-SNAPSHOT</version>
+ <version>7.6.1.3-SNAPSHOT</version>
<name>cgspace-java-helpers</name>
<url>https://github.com/ilri/cgspace-java-helpers</url>


@ -29,7 +29,7 @@ import java.util.List;
* TODO: allow operation on communities and collections (currently only works on items)
*
* @author Alan Orth for the International Livestock Research Institute
- * @version 7.6.1.2
+ * @version 7.6.1.3
* @since 7.6.1.1
*/
@Suspendable
@ -78,17 +78,21 @@ public class NormalizeDOIs extends AbstractCurationTask {
}
private static String getNormalizedDOI(MetadataValue itemDOI) {
- // 1. Convert to lowercase
+ // Convert to lowercase
String newDOI = itemDOI.getValue().toLowerCase();
- // 2. Strip leading and trailing whitespace
+ // Strip leading and trailing whitespace
newDOI = newDOI.strip();
- // 3. Convert to HTTPS
+ // Convert to HTTPS
newDOI = newDOI.replace("http://", "https://");
- // 4. Prefer doi.org to dx.doi.org
+ // Prefer doi.org to dx.doi.org
newDOI = newDOI.replace("dx.doi.org", "doi.org");
- // 5. Replace values like doi: 10.11648/j.jps.20140201.14
+ // Prefer doi.org to www.doi.org
+ newDOI = newDOI.replace("www.doi.org", "doi.org");
+ // Fix URL encoded slashes (%2f)
+ newDOI = newDOI.replace("%2f", "/");
+ // Replace values like doi: 10.11648/j.jps.20140201.14
newDOI = newDOI.replaceAll("^doi: 10\\.", "https://doi.org/10.");
- // 6. Replace values like 10.3390/foods12010115
+ // Replace values like 10.3390/foods12010115
newDOI = newDOI.replaceAll("^10\\.", "https://doi.org/10.");
newDOI = newDOI.replaceAll("^10\\.", "https://doi.org/10.");
return newDOI;
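
The normalization steps in this hunk can be tried outside DSpace with a minimal sketch like the one below. It is an illustration under stated assumptions, not the task's actual code: the class name `DoiNormalizerSketch` and the `normalize` method are hypothetical, the input is a plain `String` rather than a DSpace `MetadataValue`, and `Locale.ROOT` is this sketch's own choice (the hunk calls `toLowerCase()` with no locale). The sample DOIs come from the comments in the hunk plus made-up variants.

```java
import java.util.Locale;

public class DoiNormalizerSketch {
    // Hypothetical stand-in for getNormalizedDOI(): works on a raw String
    // instead of a DSpace MetadataValue so it can run without DSpace.
    public static String normalize(String rawDoi) {
        // Convert to lowercase (Locale.ROOT is an assumption of this sketch)
        String doi = rawDoi.toLowerCase(Locale.ROOT);
        // Strip leading and trailing whitespace (String.strip() needs Java 11+)
        doi = doi.strip();
        // Convert to HTTPS
        doi = doi.replace("http://", "https://");
        // Prefer doi.org to dx.doi.org
        doi = doi.replace("dx.doi.org", "doi.org");
        // Prefer doi.org to www.doi.org
        doi = doi.replace("www.doi.org", "doi.org");
        // Fix URL encoded slashes; input is already lowercase, so %2F is covered too
        doi = doi.replace("%2f", "/");
        // Replace values like "doi: 10.11648/j.jps.20140201.14"
        doi = doi.replaceAll("^doi: 10\\.", "https://doi.org/10.");
        // Replace bare values like "10.3390/foods12010115"
        doi = doi.replaceAll("^10\\.", "https://doi.org/10.");
        return doi;
    }

    public static void main(String[] args) {
        String[] samples = {
            " HTTP://DX.DOI.ORG/10.1234/ABC ",
            "https://www.doi.org/10.3390%2Ffoods12010115",
            "doi: 10.11648/j.jps.20140201.14",
            "10.3390/foods12010115"
        };
        for (String sample : samples) {
            System.out.println("[" + sample + "] -> " + normalize(sample));
        }
    }
}
```

Because lowercasing runs first, a single `replace("%2f", "/")` also handles the uppercase `%2F` form. In this sketch the `^doi: 10\.` pattern requires the space after the colon, so a value written as `doi:10.` would pass through unchanged.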


@ -15,7 +15,7 @@ To use these curation tasks in a DSpace project add the following dependency to
<dependency>
<groupId>io.github.ilri.cgspace</groupId>
<artifactId>cgspace-java-helpers</artifactId>
- <version>7.6.1.2-SNAPSHOT</version>
+ <version>7.6.1.3-SNAPSHOT</version>
</dependency>
```
@ -31,7 +31,7 @@ $ mvn package
Copy the resulting jar to the DSpace `lib` directory:
```
- $ cp target/cgspace-java-helpers-7.6.1.2-SNAPSHOT.jar ~/dspace/lib/
+ $ cp target/cgspace-java-helpers-7.6.1.3-SNAPSHOT.jar ~/dspace/lib/
```
## Configuration
@ -62,7 +62,7 @@ countrycodetagger.iso3166-alpha2.field = cg.coverage.iso3166-alpha2
Once the jar is installed and you have added appropriate configuration in `~/dspace/config/modules`:
```
- $ ~/dspace/bin/dspace curate -t countrycodetagger -i 10568/3 -r - -s object
+ $ ~/dspace/bin/dspace curate -e eperson@repo.org -t countrycodetagger -i 10568/3 -r - -s object
```
*Note*: it is very important to set the database transaction scope to something sensible (`object`) if you're curating a community or collection with more than a few hundred items.


@ -15,7 +15,7 @@ To use these curation tasks in a DSpace project add the following dependency to
<dependency>
<groupId>io.github.ilri.cgspace</groupId>
<artifactId>cgspace-java-helpers</artifactId>
- <version>7.6.1.2-SNAPSHOT</version>
+ <version>7.6.1.3-SNAPSHOT</version>
</dependency>
```
@ -31,7 +31,7 @@ $ mvn package
Copy the resulting jar to the DSpace `lib` directory:
```console
- $ cp target/cgspace-java-helpers-7.6.1.2-SNAPSHOT.jar ~/dspace/lib/
+ $ cp target/cgspace-java-helpers-7.6.1.3-SNAPSHOT.jar ~/dspace/lib/
```
## Invocation