+++
date = "2016-02-05T13:18:00+03:00"
author = "Alan Orth"
title = "February, 2016"
tags = ["notes"]
image = "../images/bg.jpg"

+++
## 2016-02-05

- Looking at some DAGRIS data for Abenet Yabowork
- Lots of issues with spaces, newlines, etc causing the import to fail
- I noticed we have a very *interesting* list of countries on CGSpace:

![CGSpace country list](../images/2016/02/cgspace-countries.png)

- Not only are there 49,000 countries, we have some blanks (25)...
- Also, lots of things like "COTE D`LVOIRE" and "COTE D IVOIRE"

## 2016-02-06

- Found a way to get items with null/empty metadata values from SQL
- First, find the `metadata_field_id` for the field you want from the `metadatafieldregistry` table:

```
dspacetest=# select * from metadatafieldregistry;
```

- In this case our country field is 78
- Now find all resources with type 2 (item) that have null/empty values for that field:

```
dspacetest=# select resource_id from metadatavalue where resource_type_id=2 and metadata_field_id=78 and (text_value='' OR text_value IS NULL);
```

- Then you can find the handle that owns it from its `resource_id`:

```
dspacetest=# select handle from item, handle where handle.resource_id = item.item_id AND item.item_id = '22678';
```

- It's 25 items so editing in the web UI is annoying, let's try SQL!

```
dspacetest=# delete from metadatavalue where metadata_field_id=78 and text_value='';
DELETE 25
```

- After that perhaps a regular `dspace index-discovery` (no -b) *should* suffice...
- Hmm, I indexed, cleared the Cocoon cache, and restarted Tomcat but the 25 "|||" countries are still there
- Maybe I need to do a full re-index...
- Yep! The full re-index seems to work.
- Process the empty countries on CGSpace

## 2016-02-07

- Working on cleaning up Abenet's DAGRIS data with OpenRefine
- I discovered two really nice functions in OpenRefine: `value.trim()` and `value.escape("javascript")` which shows whitespace characters like `\r\n`!
- For some reason when you import an Excel file into OpenRefine it exports dates like 1949 to 1949.0 in the CSV
- I re-import the resulting CSV and run a GREL on the date issued column: `value.replace("\.0", "")`
- I need to start running DSpace in Mac OS X instead of a Linux VM
- Install PostgreSQL from homebrew, then configure and import CGSpace database dump:

```
$ postgres -D /opt/brew/var/postgres
$ createuser --superuser postgres
$ createuser --pwprompt dspacetest
$ createdb -O dspacetest --encoding=UNICODE dspacetest
$ psql postgres
postgres=# alter user dspacetest createuser;
postgres=# \q
$ pg_restore -O -U dspacetest -d dspacetest ~/Downloads/cgspace_2016-02-07.backup
$ psql postgres
postgres=# alter user dspacetest nocreateuser;
postgres=# \q
$ vacuumdb dspacetest
$ psql -U dspacetest -f ~/src/git/DSpace/dspace/etc/postgres/update-sequences.sql dspacetest -h localhost
```

- After building and running a `fresh_install` I symlinked the webapps into Tomcat's webapps folder:

```
$ mv /opt/brew/Cellar/tomcat/8.0.30/libexec/webapps/ROOT /opt/brew/Cellar/tomcat/8.0.30/libexec/webapps/ROOT.orig
$ ln -sfv ~/dspace/webapps/xmlui /opt/brew/Cellar/tomcat/8.0.30/libexec/webapps/ROOT
$ ln -sfv ~/dspace/webapps/rest /opt/brew/Cellar/tomcat/8.0.30/libexec/webapps/rest
$ ln -sfv ~/dspace/webapps/jspui /opt/brew/Cellar/tomcat/8.0.30/libexec/webapps/jspui
$ ln -sfv ~/dspace/webapps/oai /opt/brew/Cellar/tomcat/8.0.30/libexec/webapps/oai
$ ln -sfv ~/dspace/webapps/solr /opt/brew/Cellar/tomcat/8.0.30/libexec/webapps/solr
$ /opt/brew/Cellar/tomcat/8.0.30/bin/catalina start
```

- Add CATALINA_OPTS in `/opt/brew/Cellar/tomcat/8.0.30/libexec/bin/setenv.sh`, as this script is sourced by the `catalina` startup script
- For example:

```
CATALINA_OPTS="-Djava.awt.headless=true -Xms2048m -Xmx2048m -XX:MaxPermSize=256m -XX:+UseConcMarkSweepGC -Dfile.encoding=UTF-8"
```

- After verifying that the site is working, start a full index:

```
$ ~/dspace/bin/dspace index-discovery -b
```

## 2016-02-08

- Finish cleaning up and importing ~400 DAGRIS items into CGSpace
- Whip up some quick CSS to make the button in the submission workflow use the XMLUI theme's brand colors ([#154](https://github.com/ilri/DSpace/issues/154))

![ILRI submission buttons](../images/2016/02/submit-button-ilri.png)
![Drylands submission buttons](../images/2016/02/submit-button-drylands.png)
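
For future imports like the DAGRIS one, the OpenRefine cleanup from 2016-02-07 (trimming whitespace, getting rid of stray `\r\n`, and turning years like 1949.0 back into 1949) could probably be scripted. This is only a rough Python sketch and not what was actually run: the function names and the `date issued` column name are made up for illustration, and it assumes the data has already been exported to CSV.

```python
import csv
import io
import re


def clean_cell(value):
    """Collapse embedded line breaks into one space and trim the ends."""
    # Similar in spirit to OpenRefine's value.trim(), plus newline removal
    return re.sub(r"\s*[\r\n]+\s*", " ", value).strip()


def fix_year(value):
    """Turn a float-looking year such as '1949.0' back into '1949'."""
    # Only touches values that are exactly four digits followed by '.0'
    return re.sub(r"^(\d{4})\.0$", r"\1", value)


def clean_csv(text, date_column="date issued"):
    """Clean every cell of a CSV string and fix the (assumed) date column."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for row in reader:
        cleaned = {key: clean_cell(val) for key, val in row.items()}
        cleaned[date_column] = fix_year(cleaned[date_column])
        rows.append(cleaned)
    return rows


if __name__ == "__main__":
    # Hypothetical sample row with a padded, multi-line title and a float year
    sample = 'title,date issued\r\n" Some breed \r\nnotes ",1949.0\r\n'
    for row in clean_csv(sample):
        print(row)
```

Anchoring the year pattern to the whole value means something like `10.05` is left alone, which a plain search and replace on `.0` would not guarantee.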