diff --git a/content/posts/2018-03.md b/content/posts/2018-03.md
index fc41be33c..ff9362d78 100644
--- a/content/posts/2018-03.md
+++ b/content/posts/2018-03.md
@@ -414,4 +414,24 @@ java.lang.OutOfMemoryError: Java heap space
 - Update [Ansible playbooks](https://github.com/ilri/rmg-ansible-public) to use [PostgreSQL JDBC driver](https://jdbc.postgresql.org/) 42.2.2
 - Deploy the new JDBC driver on DSpace Test
 - I'm also curious to see how long the `dspace index-discovery -b` takes on DSpace Test where the DSpace installation directory is on one of Linode's new block storage volumes
+
+```
+$ time schedtool -D -e ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
+real 208m19.155s
+user 8m39.138s
+sys 2m45.135s
+```
+
+- So that's about three times as long as it took on CGSpace this morning
 - I should also check the raw read speed with `hdparm -tT /dev/sdc`
+- Looking at Peter's author corrections there are some mistakes due to Windows-1252 encoding
+- I need to find a way to filter these easily with OpenRefine
+- For example, Peter has inadvertently introduced Unicode character 0xfffd into several fields
+- I can search for Unicode values by their hex code in OpenRefine using the following GREL expression:
+
+```
+isNotNull(value.match(/.*\ufffd.*/))
+```
+
+- I need to be able to add many common characters though so that it is useful to copy and paste into a new project to find issues
diff --git a/docs/2018-03/index.html b/docs/2018-03/index.html
index 2bdee3269..4f235958b 100644
--- a/docs/2018-03/index.html
+++ b/docs/2018-03/index.html
@@ -20,7 +20,7 @@ Export a CSV of the IITA community metadata for Martin Mueller
-
+
@@ -51,9 +51,9 @@ Export a CSV of the IITA community metadata for Martin Mueller
  "@type": "BlogPosting",
  "headline": "March, 2018",
  "url": "https://alanorth.github.io/cgspace-notes/2018-03/",
- "wordCount": "2343",
+ "wordCount": "2459",
  "datePublished": "2018-03-02T16:07:54+02:00",
- "dateModified": "2018-03-21T11:44:06+02:00",
+ "dateModified": "2018-03-21T18:11:22+02:00",
  "author": {
  "@type": "Person",
  "name": "Alan Orth"
@@ -587,7 +587,29 @@ java.lang.OutOfMemoryError: Java heap space
 dspace index-discovery -b takes on DSpace Test where the DSpace installation directory is on one of Linode’s new block storage volumes
+$ time schedtool -D -e ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
+real 208m19.155s
+user 8m39.138s
+sys 2m45.135s
+hdparm -tT /dev/sdc
+isNotNull(value.match(/.*\ufffd.*/))
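
The notes added in this diff mention wanting a check that covers more characters than U+FFFD alone. As a rough illustration outside OpenRefine, here is a minimal Python sketch of that idea. The function names `build_damage_pattern` and `find_damaged` and the sample values are hypothetical. Including the C1 control range U+0080 to U+009F is an assumption: those code points tend to appear when Windows-1252 bytes are decoded as Latin-1, but the notes do not say which characters Peter's corrections actually contain.

```python
import re


def build_damage_pattern(extra_chars=""):
    """Compile a character class of code points that suggest encoding damage.

    U+FFFD is the replacement character matched by the GREL expression in the
    notes. The C1 control range U+0080-U+009F is an added assumption, not
    something the notes confirm. extra_chars lets a caller add more characters.
    """
    char_class = "\ufffd\u0080-\u009f" + re.escape(extra_chars)
    return re.compile("[" + char_class + "]")


def find_damaged(values, pattern=None):
    """Return (index, value) pairs for values containing a suspicious character."""
    pattern = pattern or build_damage_pattern()
    return [(i, v) for i, v in enumerate(values) if pattern.search(v)]


if __name__ == "__main__":
    # Made-up sample values for illustration only.
    sample = ["Doe, Jane", "Garc\ufffda, J.", "Smith\u0092s"]
    for index, value in find_damaged(sample):
        print(index, repr(value))
```

Keeping the character list in one function means it can be extended in a single place, which is the same goal as a GREL expression that can be pasted into a new project.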