diff --git a/content/posts/2018-03.md b/content/posts/2018-03.md
index fc41be33c..ff9362d78 100644
--- a/content/posts/2018-03.md
+++ b/content/posts/2018-03.md
@@ -414,4 +414,24 @@ java.lang.OutOfMemoryError: Java heap space
 - Update [Ansible playbooks](https://github.com/ilri/rmg-ansible-public) to use [PostgreSQL JDBC driver](https://jdbc.postgresql.org/) 42.2.2
 - Deploy the new JDBC driver on DSpace Test
 - I'm also curious to see how long the `dspace index-discovery -b` takes on DSpace Test where the DSpace installation directory is on one of Linode's new block storage volumes
+
+```
+$ time schedtool -D -e ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
+real 208m19.155s
+user 8m39.138s
+sys 2m45.135s
+```
+
+- So that's about three times as long as it took on CGSpace this morning - I should also check the raw read speed with `hdparm -tT /dev/sdc`
+- Looking at Peter's author corrections there are some mistakes due to Windows 1252 encoding
+- I need to find a way to filter these easily with OpenRefine
+- For example, Peter has inadvertently introduced Unicode character 0xfffd into several fields
+- I can search for Unicode values by their hex code in OpenRefine using the following GREL expression:
+
+```
+isNotNull(value.match(/.*\ufffd.*/))
+```
+
+- I need to be able to add many common characters though so that it is useful to copy and paste into a new project to find issues
diff --git a/docs/2018-03/index.html b/docs/2018-03/index.html
index 2bdee3269..4f235958b 100644
--- a/docs/2018-03/index.html
+++ b/docs/2018-03/index.html
@@ -20,7 +20,7 @@ Export a CSV of the IITA community metadata for Martin Mueller
-
+
@@ -51,9 +51,9 @@ Export a CSV of the IITA community metadata for Martin Mueller
 "@type": "BlogPosting",
 "headline": "March, 2018",
 "url": "https://alanorth.github.io/cgspace-notes/2018-03/",
- "wordCount": "2343",
+ "wordCount": "2459",
 "datePublished": "2018-03-02T16:07:54+02:00",
- "dateModified": "2018-03-21T11:44:06+02:00",
+ "dateModified": "2018-03-21T18:11:22+02:00",
 "author": {
 "@type": "Person",
 "name": "Alan Orth"
@@ -587,7 +587,29 @@ java.lang.OutOfMemoryError: Java heap space
  • Update Ansible playbooks to use PostgreSQL JDBC driver 42.2.2
  • Deploy the new JDBC driver on DSpace Test
  • I’m also curious to see how long the dspace index-discovery -b takes on DSpace Test where the DSpace installation directory is on one of Linode’s new block storage volumes
  • + + +
    $ time schedtool -D -e ionice -c2 -n7 nice -n19 dspace index-discovery -b
    +
    +real    208m19.155s
    +user    8m39.138s
    +sys     2m45.135s
    +
    + + + +
    isNotNull(value.match(/.*\ufffd.*/))
    +
    +
    +
diff --git a/docs/sitemap.xml b/docs/sitemap.xml
index 7746a2c58..156ad26c8 100644
--- a/docs/sitemap.xml
+++ b/docs/sitemap.xml
@@ -4,7 +4,7 @@
 https://alanorth.github.io/cgspace-notes/2018-03/
- 2018-03-21T11:44:06+02:00
+ 2018-03-21T18:11:22+02:00
@@ -154,7 +154,7 @@
 https://alanorth.github.io/cgspace-notes/
- 2018-03-21T11:44:06+02:00
+ 2018-03-21T18:11:22+02:00
 0
@@ -165,7 +165,7 @@
 https://alanorth.github.io/cgspace-notes/tags/notes/
- 2018-03-21T11:44:06+02:00
+ 2018-03-21T18:11:22+02:00
 0
@@ -177,13 +177,13 @@
 https://alanorth.github.io/cgspace-notes/posts/
- 2018-03-21T11:44:06+02:00
+ 2018-03-21T18:11:22+02:00
 0
 https://alanorth.github.io/cgspace-notes/tags/
- 2018-03-21T11:44:06+02:00
+ 2018-03-21T18:11:22+02:00
 0
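
The notes in this commit mention wanting to extend the `\ufffd` check to "many common characters". A minimal Python sketch of that idea follows, for use outside OpenRefine. Only U+FFFD comes from the notes; the other entries in `SUSPECT_CHARS`, and the names `SUSPECT_CHARS`, `SUSPECT_RE`, `find_suspect_chars` and `has_suspect_chars`, are assumptions made for illustration, not part of the original workflow.

```python
import re

# Code points to flag. Only U+FFFD is taken from the notes; the other
# two are illustrative assumptions and should be adjusted to the data.
SUSPECT_CHARS = {
    "\ufffd": "REPLACEMENT CHARACTER",
    "\u00a0": "NO-BREAK SPACE",
    "\u200b": "ZERO WIDTH SPACE",
}

# Build one character class from the keys so that new characters only
# need to be added to the dictionary above.
SUSPECT_RE = re.compile("[" + "".join(re.escape(c) for c in SUSPECT_CHARS) + "]")


def find_suspect_chars(value):
    """Return (index, code point) pairs for each suspect character in value."""
    return [
        (match.start(), "U+{:04X}".format(ord(match.group())))
        for match in SUSPECT_RE.finditer(value)
    ]


def has_suspect_chars(value):
    """Return True if value contains at least one suspect character."""
    return SUSPECT_RE.search(value) is not None


if __name__ == "__main__":
    # Hypothetical author value with a replacement character in it.
    print(find_suspect_chars("Orth, Al\ufffdn"))
```

Keeping the characters in a single dictionary means the same list could also be turned into a character class such as `[\ufffd\u00a0\u200b]` and pasted into a regular expression elsewhere, though whether that matches the intended OpenRefine usage is an assumption.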