cgspace-java-helpers

mirror of https://github.com/ilri/cgspace-java-helpers.git synced 2025-08-23 13:21:54 +02:00

Author	SHA1	Message	Date
Alan Orth	692a62b454	src/main/java: update curation tasks README.md Add eperson ID to curation invocation. DSpace 7 requires this.	2024-04-29 09:33:39 +03:00
Alan Orth	d4ca92066a	Version 7.6.1.2	2024-04-25 12:58:07 +03:00
Alan Orth	5ad8c556e9	src/main/java: simplify curation task results We don't need to print the Handle because some items can be in the workflow still so this will be null, but also because DSpace will already show the Handle in the log before printing the result.	2024-04-25 12:53:15 +03:00
Alan Orth	77425c13bf	src/main/java: remove report() from curation tasks Results are a single-line status that shows the result of the task, but reports are like a running log of changes to the item and have more complicated use cases and configuration requirements. For now I will disable reports since I'm not using them.	2024-04-25 12:51:30 +03:00
Alan Orth	9050caf37f	Version 7.6.1.1 Unsure of the versioning, but something tells me I should follow the upstream DSpace versioning to keep things simple.	2024-04-23 13:11:12 +03:00
Alan Orth	639148dc19	src/main/java: minor update to ctasks README.md	2024-04-23 13:08:52 +03:00
Alan Orth	7a91305742	Add new NormalizeDOIs curation task	2024-04-23 13:07:55 +03:00
Alan Orth	0cb533b2c4	Fix license headers I meant to use GPL-3.0-only.	2024-04-22 16:59:12 +03:00
Alan Orth	ee6518035e	Bump version to 7.6.1	2024-01-02 20:34:14 +03:00
Alan Orth	9faf657c59	Bump version to 7.6-SNAPSHOT	2024-01-02 19:54:46 +03:00
Alan Orth	7fb78c2722	src/main/java: minor refactoring Suggested by IntelliJ.	2024-01-02 19:34:51 +03:00
Alan Orth	6ef9f521bf	src/main/resources: fix trailing comma in JSON	2024-01-02 18:03:52 +03:00
Alan Orth	f9d7e5f6a2	src/main/java: minor refactor Use isEmpty() instead of checking size.	2023-12-28 10:26:11 +03:00
Alan Orth	9e965afdb7	src/main/java: change getSize() to getSizeBytes() Apparently this changed in DSpace 7. Untested, but it compiles now.	2023-12-28 10:18:40 +03:00
Alan Orth	408a0e1c19	src/main/java: update log4j usage Untested, but compiles.	2023-12-28 10:17:24 +03:00
Alan Orth	0a7cf7bf59	Import iso-codes snapshot After my merge request to Debian's iso-codes package was merged we now no longer need to maintain local overrides for Iran, Laos, and Syria, as those are officially in iso-codes. See: https://salsa.debian.org/iso-codes-team/iso-codes/-/merge_requests/32	2023-02-26 21:13:44 +03:00
Alan Orth	8c0a8fbcd1	Bump version to 6.2-SNAPSHOT I can't figure out how to get non-snapshot releases on Central.	2023-02-21 10:59:54 +03:00
Alan Orth	c05a2e4f96	Version 6.2	2023-02-20 20:37:40 +03:00
Alan Orth	1f6ba4af67	src: import iso-codes 4.12.0 This updates the name for TR from "Turkey" to "Türkiye". See: https://salsa.debian.org/iso-codes-team/iso-codes/-/blob/main/CHANGELOG.md#4120-2022-11-06	2022-11-07 12:21:39 +03:00
Alan Orth	dfaa234a90	src/main/resources: sync cgspace-countries.json with iso-codes Not sure this is needed, but we copy the JSON object from iso-codes so we should keep it in sync when there are changes to countries we override.	2022-10-14 20:49:23 +03:00
Alan Orth	f46e81b8cd	src/main/resources: import iso-codes 4.11.0 This is a bit old by now even, but there are two changes: - South Korea - North Korea	2022-10-14 20:47:26 +03:00
Alan Orth	dbd8721579	src: add better status messages to FixLowQualityThumbnails	2022-10-07 15:33:13 +03:00
Alan Orth	80a336f94d	src: fix context commit in scripts I was wondering why the same bitstreams appeared to be getting deleted on every single run. It turns out that the only mode we were committing the context in was in single item mode. If the argument was a site, community, or collection we were updating the item but not actually committing the changes!	2022-10-07 14:49:58 +03:00
Alan Orth	5ebf4930cf	src: re-organize switch statements in scripts It makes more sense to me to start from the top level of the hierarchy.	2022-10-07 13:11:03 +03:00
Alan Orth	b396fba043	src: format Java files with google-java-format Using AOSP format so we get four spaces instead of two.	2022-10-06 14:27:51 +03:00
Alan Orth	38a9cc5188	src: organize imports in VS Code	2022-10-06 14:26:44 +03:00
Alan Orth	16db38967b	src: handle null descriptions in FixJpgJpgThumbnails	2022-10-06 14:17:41 +03:00
Alan Orth	2604dc3cce	src: skip Infographics and Maps in FixJpgJpgThumbnails Instead of checking whether they exist and then skipping them just at the moment when we want to swap the bitstreams let's bail early when we know an item is an Infographic or a Map.	2022-10-06 14:15:58 +03:00
Alan Orth	f0754ab419	src: fix npe on null description In FixLowQualityThumbnails we need to make sure that bitstream descriptions are not null or empty before trying to evaluate them.	2022-10-05 21:00:14 +03:00
Alan Orth	6772145bec	src: fix SPDX license header Use GPL-3.0-or-later instead of GPL-3.0-only. I had specified this in pom.xml already.	2022-10-05 16:53:00 +03:00
Alan Orth	095f843067	src: add SPDX license headers	2022-10-05 15:48:57 +03:00
Alan Orth	922e3892a7	Update README.md files	2022-10-05 15:24:08 +03:00
Alan Orth	6b648c2c85	src: add FixLowQualityThumbnails.java This adds another script to detect and remove more low-quality thumbnails. For example: - If an item has an "IM Thumbnail" and a "Generated Thumbnail" in the THUMBNAIL bundle, remove the "Generated Thumbnail" - If an item has a PDF bitstream and a JPEG bitstream with a name or description "thumbnail" in the ORIGINAL bundle, remove the "thumbnail" bitstream in the ORIGINAL bundle and try to remove the "thumbnail.jpg" bitstream in the THUMBNAIL bundle The idea is that we should always prefer thumbnails generated by ImageMagick from PDFs in the ORIGINAL bundle and should remove any other manually uploaded thumbnails.	2022-10-05 15:07:56 +03:00
Alan Orth	3aa1503163	src: bump version of FixJpgJpgThumbnails.java	2022-10-04 21:13:24 +03:00
Alan Orth	26597e2f8f	Use dcterms.type in FixJpgJpgThumbnails script We are now using dcterms.type instead of dc.type.	2022-10-04 16:16:43 +03:00
Alan Orth	2e779efb14	src/main/java: Adjust curation README DSpace 6 doesn't have the `-l` option to limit the cache size.	2020-08-10 20:04:46 +03:00
Alan Orth	735e759033	Adjust READMEs again...	2020-08-10 17:16:14 +03:00
Alan Orth	271a9ce970	Adjust README.md files	2020-08-10 15:55:11 +03:00
Alan Orth	4bc7971ecb	src/main/java: Remove debug comment	2020-08-07 22:55:35 +03:00
Alan Orth	da1ecad238	src/main/java: DSpace 6 port of FixJpgJpgThumbnails.java Need to use the new DSpace 6 service model in most places. Not sure why addBitstream is no longer public, but removeBitstream is...	2020-08-07 22:45:07 +03:00
Alan Orth	f3ab89f7a1	CountryCodeTagger.java: Port to DSpace 6 We need to use the new DSpace 6 service API. Also, the way we read task properties changes because of the configuration changes. See: https://wiki.lyrasis.org/display/DSDOC6x/Curation+System See: https://wiki.lyrasis.org/display/DSDOC6x/Configuration+Reference	2020-08-05 12:28:37 +03:00
Alan Orth	7251b85436	cgspace-countries.json: Remove Palestine It's the same in the ISO 3166-1 list.	2020-08-04 14:52:36 +03:00
Alan Orth	dcb0532be2	Change groupId to prepare for upload to Central It's much easier to get your package verified on Central if it uses a GitHub groupId. Otherwise you need to use DNS verification! This changes the groupId: - from: org.cgiar.cgspace.ctask - to: io.github.ilri.cgspace Also the package changed as well. See: https://central.sonatype.org/pages/producers.html	2020-08-02 23:48:13 +03:00
Alan Orth	ca7deaac8f	CountryCodeTagger.java: Remove unused variable Some of the other curation tasks use an array of results.	2020-08-02 22:03:10 +03:00
Alan Orth	e158e4bc98	CountryCodeTagger.java: Refactor adding of alpha2 codes We can append the codes we will add to a List of Strings and then actually apply them later in one addMetadata call, and update the item with one item.update() call. This reduces identical code and is more efficient. Note that when testing this on a collection with thousands of items I realized that it is really important to limit both the cache size as well as set the database transaction model to be per object/item or else you will crash due to Java heap issues. For example: $ ~/dspace/bin/dspace curate -t countrycodetagger -i 10568/3 -r - -l 500 -s object See: https://wiki.lyrasis.org/display/DSPACE/Curation+Task+Cookbook	2020-08-02 18:33:32 +03:00
Alan Orth	1c866bdf64	src/main/java: Remove unnecessary comments and prints	2020-08-02 18:32:04 +03:00
Alan Orth	e5d45e62be	src/main/java: Refactor CountryCodeTagger.java Now is much more modular and can easily, cleanly be extended to do ISO 3166-1 Alpha3, numeric, etc...	2020-08-02 15:51:18 +03:00
Alan Orth	6228f337e9	src/main/java: Skip items that have country codes Originally I wasn't sure if I was going to try to parse each code, check them against the mapping, and possibly correct them, but it's easier to just skip items with codes unless we're in "force" mode.	2020-08-01 23:14:19 +03:00
Alan Orth	4b553676dd	src/main/java: Implement task "profiles" The DSpace curation system has task properties that can be used to create "profiles" of sorts. For example, if you set a custom task name in curate.cfg: plugin.named.org.dspace.curate.CurationTask = \ org.cgiar.cgspace.ctasks.CountryCodeTagger = countrycodetagger \ org.cgiar.cgspace.ctasks.CountryCodeTagger = countrycodetagger.force ... then DSpace will look for countrycodetagger.cfg by default, and countrycodetagger.force.cfg for the second task. We can set different properties in each one, for example "force=true", and then operate accordingly in the task when we check the value using taskProperty(). I will use this to force all country tags to be cleared and updated, where by default we only tag if there are no existing country tags. See: https://wiki.lyrasis.org/display/DSDOC5x/Curation+System	2020-08-01 23:04:35 +03:00
Alan Orth	d4cd5bfd61	src/main/java: Optimize imports	2020-08-01 23:03:51 +03:00
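
Commit e158e4bc98 describes appending the alpha2 codes to a List of Strings and then applying them in one addMetadata call, followed by one item.update() call. A minimal sketch of that pattern in plain Java, without the DSpace API, might look like the following. The `FakeItem` class, its `addAll` method (standing in for addMetadata), and the two-entry country map are hypothetical and exist only for illustration; this is not the repository's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class BatchedTaggingSketch {
    // Hypothetical stand-in for a DSpace item; it counts writes so the batching is visible.
    static class FakeItem {
        final List<String> codes = new ArrayList<>();
        int metadataCalls = 0;
        int updateCalls = 0;

        // Stands in for a single addMetadata call (assumption, not the DSpace signature).
        void addAll(List<String> values) {
            codes.addAll(values);
            metadataCalls++;
        }

        // Stands in for item.update().
        void update() {
            updateCalls++;
        }
    }

    // Illustrative mapping only.
    static final Map<String, String> ALPHA2 = Map.of("Kenya", "KE", "Ethiopia", "ET");

    static void tag(FakeItem item, List<String> countries) {
        List<String> toAdd = new ArrayList<>();
        for (String country : countries) {
            String code = ALPHA2.get(country);
            if (code != null) {
                toAdd.add(code); // collect now, write later
            }
        }
        if (!toAdd.isEmpty()) {
            item.addAll(toAdd); // one metadata write for all codes
            item.update();      // one update per item
        }
    }

    public static void main(String[] args) {
        FakeItem item = new FakeItem();
        tag(item, List.of("Kenya", "Atlantis", "Ethiopia"));
        System.out.println(item.codes + " writes=" + item.metadataCalls + " updates=" + item.updateCalls);
    }
}
```

Unknown country names are skipped here, and an item with no matching codes is never written to; whether the real task behaves the same way is not stated in the commit message.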

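The first rule from commit 6b648c2c85 (remove the "Generated Thumbnail" when an "IM Thumbnail" also exists in the THUMBNAIL bundle), combined with the null-or-empty description guard from commit f0754ab419, could be sketched roughly as below. This is only an illustration under stated assumptions: it operates on plain description strings rather than DSpace bitstreams, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class ThumbnailRuleSketch {
    // Returns the descriptions that the rule would remove from a THUMBNAIL bundle.
    static List<String> toRemove(List<String> descriptions) {
        boolean hasImThumbnail = false;
        List<String> generated = new ArrayList<>();
        for (String description : descriptions) {
            if (description == null || description.isEmpty()) {
                continue; // guard: descriptions may be null or empty
            }
            if (description.equals("IM Thumbnail")) {
                hasImThumbnail = true;
            } else if (description.equals("Generated Thumbnail")) {
                generated.add(description);
            }
        }
        // Only drop generated thumbnails when an ImageMagick one exists to replace them.
        if (hasImThumbnail) {
            return generated;
        }
        return new ArrayList<>();
    }

    public static void main(String[] args) {
        List<String> bundle = new ArrayList<>();
        bundle.add("IM Thumbnail");
        bundle.add(null);
        bundle.add("Generated Thumbnail");
        System.out.println(toRemove(bundle));
    }
}
```
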

70 Commits