March, 2016

2016-03-02 Looking at issues with author authorities on CGSpace For some reason we still have the index-lucene-update cron job active on CGSpace, but I’m pretty sure we don’t need it as of the latest few versions of Atmire’s Listings and Reports module Reinstall my local (Mac OS X) DSpace stack with Tomcat 7, PostgreSQL 9.3, and Java JDK 1.7 to match environment on CGSpace server

February, 2016

2016-02-05 Looking at some DAGRIS data for Abenet Yabowork Lots of issues with spaces, newlines, etc causing the import to fail I noticed we have a very interesting list of countries on CGSpace: Not only are there 49,000 countries, we have some blanks (25)… Also, lots of things like “COTE D`LVOIRE” and “COTE D IVOIRE” 2016-02-06 Found a way to get items with null/empty metadata values from SQL First, find the metadata_field_id for the field you want from the metadatafieldregistry table: dspacetest=# select * from metadatafieldregistry; In this case our country field is 78 Now find all resources with type 2 (item) that have null/empty values for that field: dspacetest=# select resource_id from metadatavalue where resource_type_id=2 and metadata_field_id=78 and (text_value='' OR text_value IS NULL); Then you can find the handle that owns it from its resource_id: dspacetest=# select handle from item, handle where handle.resource_id = item.item_id AND item.item_id = '22678'; It’s 25 items so editing in the web UI is annoying, let’s try SQL!

January, 2016

2016-01-13 Move ILRI collection 10568/12503 from 10568/27869 to 10568/27629 using the move_collections.sh script I wrote last year. I realized it is only necessary to clear the Cocoon cache after moving collections—rather than reindexing—as no metadata has changed, and therefore no search or browse indexes need to be updated. Update GitHub wiki for documentation of maintenance tasks. 2016-01-14 Update CCAFS project identifiers in input-forms.xml Run system updates and restart the server 2016-01-18 Change “Extension material” to “Extension Material” in input-forms.xml (a mistake that fell through the cracks when we fixed the others in DSpace 4 era) 2016-01-19 Work on tweaks and updates for the social sharing icons on item pages: add Delicious and Mendeley (from Academicons), make links open in new windows, and set the icon color to the theme’s primary color (#157) Tweak date-based facets to show more values in drill-down ranges (#162) Need to remember to clear the Cocoon cache after deployment or else you don’t see the new ranges immediately Set up recipe on IFTTT to tweet new items from the CGSpace Atom feed to my twitter account Altmetrics’ support for Handles is kinda weak, so they can’t associate our items with DOIs until they are tweeted or blogged, etc first.

December, 2015

2015-12-02 Replace lzop with xz in log compression cron jobs on DSpace Test—it uses less space: # cd /home/dspacetest.cgiar.org/log # ls -lh dspace.log.2015-11-18* -rw-rw-r-- 1 tomcat7 tomcat7 2.0M Nov 18 23:59 dspace.log.2015-11-18 -rw-rw-r-- 1 tomcat7 tomcat7 387K Nov 18 23:59 dspace.log.2015-11-18.lzo -rw-rw-r-- 1 tomcat7 tomcat7 169K Nov 18 23:59 dspace.log.2015-11-18.xz I had used lrzip once, but it needs more memory and is harder to use as it requires the lrztar

November, 2015

2015-11-22 CGSpace went down Looks like DSpace exhausted its PostgreSQL connection pool Last week I had increased the limit from 30 to 60, which seemed to help, but now there are many more idle connections: $ psql -c 'SELECT * from pg_stat_activity;' | grep idle | grep -c cgspace 78 For now I have increased the limit from 60 to 90, run updates, and rebooted the server 2015-11-24 CGSpace went down again Getting emails from uptimeRobot and uptimeButler that it’s down, and Google Webmaster Tools is sending emails that there is an increase in crawl errors Looks like there are still a bunch of idle PostgreSQL connections: $ psql -c 'SELECT * from pg_stat_activity;' | grep idle | grep -c cgspace 96 For some reason the number of idle connections is very high since we upgraded to DSpace 5 2015-11-25 Troubleshoot the DSpace 5 OAI breakage caused by nginx routing config The OAI application requests stylesheets and javascript files with the path /oai/static/css, which gets matched here: # static assets we can load from the file system directly with nginx location ~ /(themes|static|aspects/ReportingSuite) { try_files $uri @tomcat; ...