Add notes for 2020-08-07

This commit is contained in:
Alan Orth 2020-08-07 19:55:21 +03:00
parent 811a12cb5e
commit dbea57cf8e
Signed by: alanorth
GPG Key ID: 0FB860CC9C45B1B9
20 changed files with 42 additions and 27 deletions

View File

@ -201,6 +201,12 @@ on_id=[A-Z0-9]{32}' | sort | uniq | wc -l
- I developed a small Java class called `FixJpgJpgThumbnails` to remove ".jpg.jpg" thumbnails from the `THUMBNAIL` bundle and replace them with their originals from the `ORIGINAL` bundle - I developed a small Java class called `FixJpgJpgThumbnails` to remove ".jpg.jpg" thumbnails from the `THUMBNAIL` bundle and replace them with their originals from the `ORIGINAL` bundle
- The code is based on [RemovePNGThumbnailsForPDFs.java](https://github.com/UoW-IRRs/DSpace-Scripts/blob/master/src/main/java/nz/ac/waikato/its/irr/scripts/RemovePNGThumbnailsForPDFs.java) by Andrea Schweer - The code is based on [RemovePNGThumbnailsForPDFs.java](https://github.com/UoW-IRRs/DSpace-Scripts/blob/master/src/main/java/nz/ac/waikato/its/irr/scripts/RemovePNGThumbnailsForPDFs.java) by Andrea Schweer
- I incorporated it into my dspace-curation-tasks repository, then renamed it to [cgspace-java-helpers](https://github.com/ilri/cgspace-java-helpers) - I incorporated it into my dspace-curation-tasks repository, then renamed it to [cgspace-java-helpers](https://github.com/ilri/cgspace-java-helpers)
- In testing I found that I can replace ~3,500 thumbnails on CGSpace! - In testing I found that I can replace ~4,000 thumbnails on CGSpace!
## 2020-08-07
- I improved the `RemovePNGThumbnailsForPDFs.java` a bit more to exclude infographics and original bitstreams larger than 100KiB
- I ran it on CGSpace and it cleaned up 3,769 thumbnails!
- Afterwards I ran `dspace cleanup -v` to remove the deleted thumbnails
<!-- vim: set sw=2 ts=2: --> <!-- vim: set sw=2 ts=2: -->

View File

@ -19,7 +19,7 @@ It is class based so I can easily add support for other vocabularies, and the te
<meta property="og:type" content="article" /> <meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2020-08/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2020-08/" />
<meta property="article:published_time" content="2020-08-02T15:35:54+03:00" /> <meta property="article:published_time" content="2020-08-02T15:35:54+03:00" />
<meta property="article:modified_time" content="2020-08-06T10:56:13+03:00" /> <meta property="article:modified_time" content="2020-08-06T16:24:01+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="August, 2020"/> <meta name="twitter:title" content="August, 2020"/>
@ -43,9 +43,9 @@ It is class based so I can easily add support for other vocabularies, and the te
"@type": "BlogPosting", "@type": "BlogPosting",
"headline": "August, 2020", "headline": "August, 2020",
"url": "https://alanorth.github.io/cgspace-notes/2020-08/", "url": "https://alanorth.github.io/cgspace-notes/2020-08/",
"wordCount": "1382", "wordCount": "1421",
"datePublished": "2020-08-02T15:35:54+03:00", "datePublished": "2020-08-02T15:35:54+03:00",
"dateModified": "2020-08-06T10:56:13+03:00", "dateModified": "2020-08-06T16:24:01+03:00",
"author": { "author": {
"@type": "Person", "@type": "Person",
"name": "Alan Orth" "name": "Alan Orth"
@ -357,7 +357,16 @@ on_id=[A-Z0-9]{32}' | sort | uniq | wc -l
<ul> <ul>
<li>The code is based on <a href="https://github.com/UoW-IRRs/DSpace-Scripts/blob/master/src/main/java/nz/ac/waikato/its/irr/scripts/RemovePNGThumbnailsForPDFs.java">RemovePNGThumbnailsForPDFs.java</a> by Andrea Schweer</li> <li>The code is based on <a href="https://github.com/UoW-IRRs/DSpace-Scripts/blob/master/src/main/java/nz/ac/waikato/its/irr/scripts/RemovePNGThumbnailsForPDFs.java">RemovePNGThumbnailsForPDFs.java</a> by Andrea Schweer</li>
<li>I incorporated it into my dspace-curation-tasks repository, then renamed it to <a href="https://github.com/ilri/cgspace-java-helpers">cgspace-java-helpers</a></li> <li>I incorporated it into my dspace-curation-tasks repository, then renamed it to <a href="https://github.com/ilri/cgspace-java-helpers">cgspace-java-helpers</a></li>
<li>In testing I found that I can replace ~3,500 thumbnails on CGSpace!</li> <li>In testing I found that I can replace ~4,000 thumbnails on CGSpace!</li>
</ul>
</li>
</ul>
<h2 id="2020-08-07">2020-08-07</h2>
<ul>
<li>I improved the <code>RemovePNGThumbnailsForPDFs.java</code> a bit more to exclude infographics and original bitstreams larger than 100KiB
<ul>
<li>I ran it on CGSpace and it cleaned up 3,769 thumbnails!</li>
<li>Afterwards I ran <code>dspace cleanup -v</code> to remove the deleted thumbnails</li>
</ul> </ul>
</li> </li>
</ul> </ul>

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/" />
<meta property="og:updated_time" content="2020-08-06T10:56:13+03:00" /> <meta property="og:updated_time" content="2020-08-06T16:24:01+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="Categories"/> <meta name="twitter:title" content="Categories"/>

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2020-08-06T10:56:13+03:00" /> <meta property="og:updated_time" content="2020-08-06T16:24:01+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="Notes"/> <meta name="twitter:title" content="Notes"/>

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2020-08-06T10:56:13+03:00" /> <meta property="og:updated_time" content="2020-08-06T16:24:01+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="Notes"/> <meta name="twitter:title" content="Notes"/>

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2020-08-06T10:56:13+03:00" /> <meta property="og:updated_time" content="2020-08-06T16:24:01+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="Notes"/> <meta name="twitter:title" content="Notes"/>

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2020-08-06T10:56:13+03:00" /> <meta property="og:updated_time" content="2020-08-06T16:24:01+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="Notes"/> <meta name="twitter:title" content="Notes"/>

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2020-08-06T10:56:13+03:00" /> <meta property="og:updated_time" content="2020-08-06T16:24:01+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="CGSpace Notes"/> <meta name="twitter:title" content="CGSpace Notes"/>

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2020-08-06T10:56:13+03:00" /> <meta property="og:updated_time" content="2020-08-06T16:24:01+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="CGSpace Notes"/> <meta name="twitter:title" content="CGSpace Notes"/>

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2020-08-06T10:56:13+03:00" /> <meta property="og:updated_time" content="2020-08-06T16:24:01+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="CGSpace Notes"/> <meta name="twitter:title" content="CGSpace Notes"/>

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2020-08-06T10:56:13+03:00" /> <meta property="og:updated_time" content="2020-08-06T16:24:01+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="CGSpace Notes"/> <meta name="twitter:title" content="CGSpace Notes"/>

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2020-08-06T10:56:13+03:00" /> <meta property="og:updated_time" content="2020-08-06T16:24:01+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="CGSpace Notes"/> <meta name="twitter:title" content="CGSpace Notes"/>

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2020-08-06T10:56:13+03:00" /> <meta property="og:updated_time" content="2020-08-06T16:24:01+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="CGSpace Notes"/> <meta name="twitter:title" content="CGSpace Notes"/>

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2020-08-06T10:56:13+03:00" /> <meta property="og:updated_time" content="2020-08-06T16:24:01+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="Posts"/> <meta name="twitter:title" content="Posts"/>

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2020-08-06T10:56:13+03:00" /> <meta property="og:updated_time" content="2020-08-06T16:24:01+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="Posts"/> <meta name="twitter:title" content="Posts"/>

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2020-08-06T10:56:13+03:00" /> <meta property="og:updated_time" content="2020-08-06T16:24:01+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="Posts"/> <meta name="twitter:title" content="Posts"/>

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2020-08-06T10:56:13+03:00" /> <meta property="og:updated_time" content="2020-08-06T16:24:01+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="Posts"/> <meta name="twitter:title" content="Posts"/>

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2020-08-06T10:56:13+03:00" /> <meta property="og:updated_time" content="2020-08-06T16:24:01+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="Posts"/> <meta name="twitter:title" content="Posts"/>

View File

@ -9,7 +9,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." /> <meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" /> <meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" /> <meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2020-08-06T10:56:13+03:00" /> <meta property="og:updated_time" content="2020-08-06T16:24:01+03:00" />
<meta name="twitter:card" content="summary"/> <meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="Posts"/> <meta name="twitter:title" content="Posts"/>

View File

@ -4,27 +4,27 @@
<url> <url>
<loc>https://alanorth.github.io/cgspace-notes/2020-08/</loc> <loc>https://alanorth.github.io/cgspace-notes/2020-08/</loc>
<lastmod>2020-08-06T10:56:13+03:00</lastmod> <lastmod>2020-08-06T16:24:01+03:00</lastmod>
</url> </url>
<url> <url>
<loc>https://alanorth.github.io/cgspace-notes/categories/</loc> <loc>https://alanorth.github.io/cgspace-notes/categories/</loc>
<lastmod>2020-08-06T10:56:13+03:00</lastmod> <lastmod>2020-08-06T16:24:01+03:00</lastmod>
</url> </url>
<url> <url>
<loc>https://alanorth.github.io/cgspace-notes/</loc> <loc>https://alanorth.github.io/cgspace-notes/</loc>
<lastmod>2020-08-06T10:56:13+03:00</lastmod> <lastmod>2020-08-06T16:24:01+03:00</lastmod>
</url> </url>
<url> <url>
<loc>https://alanorth.github.io/cgspace-notes/categories/notes/</loc> <loc>https://alanorth.github.io/cgspace-notes/categories/notes/</loc>
<lastmod>2020-08-06T10:56:13+03:00</lastmod> <lastmod>2020-08-06T16:24:01+03:00</lastmod>
</url> </url>
<url> <url>
<loc>https://alanorth.github.io/cgspace-notes/posts/</loc> <loc>https://alanorth.github.io/cgspace-notes/posts/</loc>
<lastmod>2020-08-06T10:56:13+03:00</lastmod> <lastmod>2020-08-06T16:24:01+03:00</lastmod>
</url> </url>
<url> <url>