diff --git a/content/posts/2018-03.md b/content/posts/2018-03.md index b23da319c..7b395cec6 100644 --- a/content/posts/2018-03.md +++ b/content/posts/2018-03.md @@ -139,3 +139,8 @@ dc.contributor.author,cg.creator.id - I didn't integrate the ORCID API lookup for author names in this script for now because I was only interested in "tagging" old items for a few given authors - I added ORCID identifiers for 187 items by CIAT's Hernan Ceballos, because that is what Elizabeth was trying to do manually! - Also, I decided to add ORCID identifiers for all records from Peter, Abenet, and Sisay as well + +## 2018-03-09 + +- Give James Stapleton input on Sisay's KRAs +- Create a pull request to disable ORCID authority integration for `dc.contributor.author` in the submission forms and XMLUI display ([#363](https://github.com/ilri/DSpace/pull/363)) diff --git a/docs/2015-11/index.html b/docs/2015-11/index.html index 2b5acd728..4b4498325 100644 --- a/docs/2015-11/index.html +++ b/docs/2015-11/index.html @@ -26,7 +26,7 @@ $ psql -c 'SELECT * from pg_stat_activity;' | grep idle | grep -c cgspac - + @@ -65,7 +65,7 @@ $ psql -c 'SELECT * from pg_stat_activity;' | grep idle | grep -c cgspac "url": "https://alanorth.github.io/cgspace-notes/2015-11/", "wordCount": "798", "datePublished": "2015-11-23T17:00:57+03:00", - "dateModified": "2016-09-28T17:02:30+03:00", + "dateModified": "2018-03-09T22:10:33+02:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -100,8 +100,6 @@ $ psql -c 'SELECT * from pg_stat_activity;' | grep idle | grep -c cgspac
diff --git a/docs/2015-12/index.html b/docs/2015-12/index.html index 1069f8694..5660dc8c9 100644 --- a/docs/2015-12/index.html +++ b/docs/2015-12/index.html @@ -27,7 +27,7 @@ Replace lzop with xz in log compression cron jobs on DSpace Test—it uses less - + @@ -67,7 +67,7 @@ Replace lzop with xz in log compression cron jobs on DSpace Test—it uses less "url": "https://alanorth.github.io/cgspace-notes/2015-12/", "wordCount": "753", "datePublished": "2015-12-02T13:18:00+03:00", - "dateModified": "2017-01-09T16:18:07+02:00", + "dateModified": "2018-03-09T22:10:33+02:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -102,8 +102,6 @@ Replace lzop with xz in log compression cron jobs on DSpace Test—it uses less
diff --git a/docs/2016-01/index.html b/docs/2016-01/index.html index 66d92918c..4bd1241c0 100644 --- a/docs/2016-01/index.html +++ b/docs/2016-01/index.html @@ -22,7 +22,7 @@ Update GitHub wiki for documentation of maintenance tasks. - + @@ -57,7 +57,7 @@ Update GitHub wiki for documentation of maintenance tasks. "url": "https://alanorth.github.io/cgspace-notes/2016-01/", "wordCount": "466", "datePublished": "2016-01-13T13:18:00+03:00", - "dateModified": "2017-01-09T16:18:07+02:00", + "dateModified": "2018-03-09T22:10:33+02:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -92,8 +92,6 @@ Update GitHub wiki for documentation of maintenance tasks.
diff --git a/docs/2016-02/index.html b/docs/2016-02/index.html index 6f90458ff..036aa22c3 100644 --- a/docs/2016-02/index.html +++ b/docs/2016-02/index.html @@ -29,7 +29,7 @@ Also, lots of things like “COTE D`LVOIRE” and “COTE D IVOIRE&r - + @@ -71,7 +71,7 @@ Also, lots of things like “COTE D`LVOIRE” and “COTE D IVOIRE&r "url": "https://alanorth.github.io/cgspace-notes/2016-02/", "wordCount": "1657", "datePublished": "2016-02-05T13:18:00+03:00", - "dateModified": "2017-01-09T16:18:07+02:00", + "dateModified": "2018-03-09T22:10:33+02:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -106,8 +106,6 @@ Also, lots of things like “COTE D`LVOIRE” and “COTE D IVOIRE&r
diff --git a/docs/2016-03/index.html b/docs/2016-03/index.html index 6905d62b8..35e719a7b 100644 --- a/docs/2016-03/index.html +++ b/docs/2016-03/index.html @@ -22,7 +22,7 @@ Reinstall my local (Mac OS X) DSpace stack with Tomcat 7, PostgreSQL 9.3, and Ja - + @@ -57,7 +57,7 @@ Reinstall my local (Mac OS X) DSpace stack with Tomcat 7, PostgreSQL 9.3, and Ja "url": "https://alanorth.github.io/cgspace-notes/2016-03/", "wordCount": "1581", "datePublished": "2016-03-02T16:50:00+03:00", - "dateModified": "2017-01-09T16:18:07+02:00", + "dateModified": "2018-03-09T22:10:33+02:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -92,8 +92,6 @@ Reinstall my local (Mac OS X) DSpace stack with Tomcat 7, PostgreSQL 9.3, and Ja
diff --git a/docs/2016-04/index.html b/docs/2016-04/index.html index 6e758bd45..81572d635 100644 --- a/docs/2016-04/index.html +++ b/docs/2016-04/index.html @@ -24,7 +24,7 @@ Also, I noticed the checker log has some errors we should pay attention to: - + @@ -61,7 +61,7 @@ Also, I noticed the checker log has some errors we should pay attention to: "url": "https://alanorth.github.io/cgspace-notes/2016-04/", "wordCount": "2006", "datePublished": "2016-04-04T11:06:00+03:00", - "dateModified": "2016-09-28T17:02:30+03:00", + "dateModified": "2018-03-09T22:10:33+02:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -96,8 +96,6 @@ Also, I noticed the checker log has some errors we should pay attention to:
diff --git a/docs/2016-05/index.html b/docs/2016-05/index.html index f77c58e26..e0cbd311f 100644 --- a/docs/2016-05/index.html +++ b/docs/2016-05/index.html @@ -26,7 +26,7 @@ There are 3,000 IPs accessing the REST API in a 24-hour period! - + @@ -65,7 +65,7 @@ There are 3,000 IPs accessing the REST API in a 24-hour period! "url": "https://alanorth.github.io/cgspace-notes/2016-05/", "wordCount": "1349", "datePublished": "2016-05-01T23:06:00+03:00", - "dateModified": "2017-01-09T16:18:07+02:00", + "dateModified": "2018-03-09T22:10:33+02:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -100,8 +100,6 @@ There are 3,000 IPs accessing the REST API in a 24-hour period!
diff --git a/docs/2016-06/index.html b/docs/2016-06/index.html index b9ef84c33..225d6c3f1 100644 --- a/docs/2016-06/index.html +++ b/docs/2016-06/index.html @@ -25,7 +25,7 @@ Working on second phase of metadata migration, looks like this will work for mov - + @@ -63,7 +63,7 @@ Working on second phase of metadata migration, looks like this will work for mov "url": "https://alanorth.github.io/cgspace-notes/2016-06/", "wordCount": "1549", "datePublished": "2016-06-01T10:53:00+03:00", - "dateModified": "2017-01-09T16:18:07+02:00", + "dateModified": "2018-03-09T22:10:33+02:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -98,8 +98,6 @@ Working on second phase of metadata migration, looks like this will work for mov
diff --git a/docs/2016-07/index.html b/docs/2016-07/index.html index 6c6a1adf0..e1706b84e 100644 --- a/docs/2016-07/index.html +++ b/docs/2016-07/index.html @@ -33,7 +33,7 @@ In this case the select query was showing 95 results before the update - + @@ -79,7 +79,7 @@ In this case the select query was showing 95 results before the update "url": "https://alanorth.github.io/cgspace-notes/2016-07/", "wordCount": "866", "datePublished": "2016-07-01T10:53:00+03:00", - "dateModified": "2017-01-09T16:18:07+02:00", + "dateModified": "2018-03-09T22:10:33+02:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -114,8 +114,6 @@ In this case the select query was showing 95 results before the update
diff --git a/docs/2016-08/index.html b/docs/2016-08/index.html index f56fe76fd..06d87f04d 100644 --- a/docs/2016-08/index.html +++ b/docs/2016-08/index.html @@ -30,7 +30,7 @@ $ git rebase -i dspace-5.5 - + @@ -73,7 +73,7 @@ $ git rebase -i dspace-5.5 "url": "https://alanorth.github.io/cgspace-notes/2016-08/", "wordCount": "1514", "datePublished": "2016-08-01T15:53:00+03:00", - "dateModified": "2017-01-09T16:18:07+02:00", + "dateModified": "2018-03-09T22:10:33+02:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -108,8 +108,6 @@ $ git rebase -i dspace-5.5
diff --git a/docs/2016-09/index.html b/docs/2016-09/index.html index 48d87959f..5001e02ff 100644 --- a/docs/2016-09/index.html +++ b/docs/2016-09/index.html @@ -26,7 +26,7 @@ $ ldapsearch -x -H ldaps://svcgroot2.cgiarad.org:3269/ -b "dc=cgiarad,dc=or - + @@ -65,7 +65,7 @@ $ ldapsearch -x -H ldaps://svcgroot2.cgiarad.org:3269/ -b "dc=cgiarad,dc=or "url": "https://alanorth.github.io/cgspace-notes/2016-09/", "wordCount": "3298", "datePublished": "2016-09-01T15:53:00+03:00", - "dateModified": "2017-01-09T16:18:07+02:00", + "dateModified": "2018-03-09T22:10:33+02:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -100,8 +100,6 @@ $ ldapsearch -x -H ldaps://svcgroot2.cgiarad.org:3269/ -b "dc=cgiarad,dc=or
diff --git a/docs/2016-10/index.html b/docs/2016-10/index.html index 29469e47f..714f3e69a 100644 --- a/docs/2016-10/index.html +++ b/docs/2016-10/index.html @@ -30,7 +30,7 @@ I exported a random item’s metadata as CSV, deleted all columns except id - + @@ -73,7 +73,7 @@ I exported a random item’s metadata as CSV, deleted all columns except id "url": "https://alanorth.github.io/cgspace-notes/2016-10/", "wordCount": "1828", "datePublished": "2016-10-03T15:53:00+03:00", - "dateModified": "2017-01-10T16:21:47+02:00", + "dateModified": "2018-03-09T22:10:33+02:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -108,8 +108,6 @@ I exported a random item’s metadata as CSV, deleted all columns except id
diff --git a/docs/2016-11/index.html b/docs/2016-11/index.html index 49bb16f13..f4a5a8431 100644 --- a/docs/2016-11/index.html +++ b/docs/2016-11/index.html @@ -22,7 +22,7 @@ Add dc.type to the output options for Atmire’s Listings and Reports module - + @@ -57,7 +57,7 @@ Add dc.type to the output options for Atmire’s Listings and Reports module "url": "https://alanorth.github.io/cgspace-notes/2016-11/", "wordCount": "2825", "datePublished": "2016-11-01T09:21:00+03:00", - "dateModified": "2017-01-10T16:21:47+02:00", + "dateModified": "2018-03-09T22:10:33+02:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -92,8 +92,6 @@ Add dc.type to the output options for Atmire’s Listings and Reports module
diff --git a/docs/2016-12/index.html b/docs/2016-12/index.html index efb0cc21b..23738c2a7 100644 --- a/docs/2016-12/index.html +++ b/docs/2016-12/index.html @@ -34,7 +34,7 @@ Another worrying error from dspace.log is: - + @@ -81,7 +81,7 @@ Another worrying error from dspace.log is: "url": "https://alanorth.github.io/cgspace-notes/2016-12/", "wordCount": "4078", "datePublished": "2016-12-02T10:43:00+03:00", - "dateModified": "2017-09-19T16:07:20+03:00", + "dateModified": "2018-03-09T22:10:33+02:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -116,8 +116,6 @@ Another worrying error from dspace.log is:
diff --git a/docs/2017-01/index.html b/docs/2017-01/index.html index 4a15db77c..8e9df4dba 100644 --- a/docs/2017-01/index.html +++ b/docs/2017-01/index.html @@ -22,7 +22,7 @@ I asked on the dspace-tech mailing list because it seems to be broken, and actua - + @@ -57,7 +57,7 @@ I asked on the dspace-tech mailing list because it seems to be broken, and actua "url": "https://alanorth.github.io/cgspace-notes/2017-01/", "wordCount": "1594", "datePublished": "2017-01-02T10:43:00+03:00", - "dateModified": "2017-01-29T13:18:32+02:00", + "dateModified": "2018-03-09T22:10:33+02:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -92,8 +92,6 @@ I asked on the dspace-tech mailing list because it seems to be broken, and actua
diff --git a/docs/2017-02/index.html b/docs/2017-02/index.html index f70f26e33..61e3a0b3d 100644 --- a/docs/2017-02/index.html +++ b/docs/2017-02/index.html @@ -36,7 +36,7 @@ Looks like we’ll be using cg.identifier.ccafsprojectpii as the field name - + @@ -85,7 +85,7 @@ Looks like we’ll be using cg.identifier.ccafsprojectpii as the field name "url": "https://alanorth.github.io/cgspace-notes/2017-02/", "wordCount": "2028", "datePublished": "2017-02-07T07:04:52-08:00", - "dateModified": "2017-02-28T22:58:29+02:00", + "dateModified": "2018-03-09T22:10:33+02:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -120,8 +120,6 @@ Looks like we’ll be using cg.identifier.ccafsprojectpii as the field name
diff --git a/docs/2017-03/index.html b/docs/2017-03/index.html index 24f1746c0..d0db27984 100644 --- a/docs/2017-03/index.html +++ b/docs/2017-03/index.html @@ -38,7 +38,7 @@ $ identify ~/Desktop/alc_contrastes_desafios.jpg - + @@ -89,7 +89,7 @@ $ identify ~/Desktop/alc_contrastes_desafios.jpg "url": "https://alanorth.github.io/cgspace-notes/2017-03/", "wordCount": "1538", "datePublished": "2017-03-01T17:08:52+02:00", - "dateModified": "2017-03-31T05:36:10+03:00", + "dateModified": "2018-03-09T22:10:33+02:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -124,8 +124,6 @@ $ identify ~/Desktop/alc_contrastes_desafios.jpg
diff --git a/docs/2017-04/index.html b/docs/2017-04/index.html index 6cfd67a06..00f9160e4 100644 --- a/docs/2017-04/index.html +++ b/docs/2017-04/index.html @@ -31,7 +31,7 @@ $ [dspace]/bin/dspace filter-media -f -i 10568/16498 -p "ImageMagick PDF Th - + @@ -75,7 +75,7 @@ $ [dspace]/bin/dspace filter-media -f -i 10568/16498 -p "ImageMagick PDF Th "url": "https://alanorth.github.io/cgspace-notes/2017-04/", "wordCount": "2917", "datePublished": "2017-04-02T17:08:52+02:00", - "dateModified": "2017-04-26T13:35:10+03:00", + "dateModified": "2018-03-09T22:10:33+02:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -110,8 +110,6 @@ $ [dspace]/bin/dspace filter-media -f -i 10568/16498 -p "ImageMagick PDF Th
diff --git a/docs/2017-05/index.html b/docs/2017-05/index.html index ac7b80eb6..bc1a9839b 100644 --- a/docs/2017-05/index.html +++ b/docs/2017-05/index.html @@ -14,7 +14,7 @@ - + @@ -41,7 +41,7 @@ "url": "https://alanorth.github.io/cgspace-notes/2017-05/", "wordCount": "2398", "datePublished": "2017-05-01T16:21:52+02:00", - "dateModified": "2017-09-10T17:46:54+03:00", + "dateModified": "2018-03-09T22:10:33+02:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -76,8 +76,6 @@
diff --git a/docs/2017-06/index.html b/docs/2017-06/index.html index 50b3d195b..72d82f386 100644 --- a/docs/2017-06/index.html +++ b/docs/2017-06/index.html @@ -14,7 +14,7 @@ - + @@ -41,7 +41,7 @@ "url": "https://alanorth.github.io/cgspace-notes/2017-06/", "wordCount": "1261", "datePublished": "2017-06-01T10:14:52+03:00", - "dateModified": "2017-06-30T18:34:51+03:00", + "dateModified": "2018-03-09T22:10:33+02:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -76,8 +76,6 @@
diff --git a/docs/2017-07/index.html b/docs/2017-07/index.html index dfe94047e..55b6471d4 100644 --- a/docs/2017-07/index.html +++ b/docs/2017-07/index.html @@ -28,7 +28,7 @@ We can use PostgreSQL’s extended output format (-x) plus sed to format the - + @@ -69,7 +69,7 @@ We can use PostgreSQL’s extended output format (-x) plus sed to format the "url": "https://alanorth.github.io/cgspace-notes/2017-07/", "wordCount": "1151", "datePublished": "2017-07-01T18:03:52+03:00", - "dateModified": "2017-08-01T08:55:37+03:00", + "dateModified": "2018-03-09T22:10:33+02:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -104,8 +104,6 @@ We can use PostgreSQL’s extended output format (-x) plus sed to format the
diff --git a/docs/2017-08/index.html b/docs/2017-08/index.html index 2f554f35c..98433bebf 100644 --- a/docs/2017-08/index.html +++ b/docs/2017-08/index.html @@ -38,7 +38,7 @@ Then I cleaned up the author authorities and HTML characters in OpenRefine and s - + @@ -89,7 +89,7 @@ Then I cleaned up the author authorities and HTML characters in OpenRefine and s "url": "https://alanorth.github.io/cgspace-notes/2017-08/", "wordCount": "3542", "datePublished": "2017-08-01T11:51:52+03:00", - "dateModified": "2017-09-10T19:18:52+03:00", + "dateModified": "2018-03-09T22:10:33+02:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -124,8 +124,6 @@ Then I cleaned up the author authorities and HTML characters in OpenRefine and s
diff --git a/docs/2017-09/index.html b/docs/2017-09/index.html index 2d856be57..0d7710ce4 100644 --- a/docs/2017-09/index.html +++ b/docs/2017-09/index.html @@ -26,7 +26,7 @@ Ask Sisay to clean up the WLE approvers a bit, as Marianne’s user account - + @@ -65,7 +65,7 @@ Ask Sisay to clean up the WLE approvers a bit, as Marianne’s user account "url": "https://alanorth.github.io/cgspace-notes/2017-09/", "wordCount": "4199", "datePublished": "2017-09-07T16:54:52+07:00", - "dateModified": "2017-09-28T07:56:11+03:00", + "dateModified": "2018-03-09T22:10:33+02:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -100,8 +100,6 @@ Ask Sisay to clean up the WLE approvers a bit, as Marianne’s user account
diff --git a/docs/2017-10/index.html b/docs/2017-10/index.html index 439088f69..cdd852b70 100644 --- a/docs/2017-10/index.html +++ b/docs/2017-10/index.html @@ -28,7 +28,7 @@ Add Katherine Lutz to the groups for content submission and edit steps of the CG - + @@ -69,7 +69,7 @@ Add Katherine Lutz to the groups for content submission and edit steps of the CG "url": "https://alanorth.github.io/cgspace-notes/2017-10/", "wordCount": "2613", "datePublished": "2017-10-01T08:07:54+03:00", - "dateModified": "2017-11-02T16:13:10+02:00", + "dateModified": "2018-03-09T22:10:33+02:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -104,8 +104,6 @@ Add Katherine Lutz to the groups for content submission and edit steps of the CG
diff --git a/docs/2017-11/index.html b/docs/2017-11/index.html index 89ee23d57..25b857254 100644 --- a/docs/2017-11/index.html +++ b/docs/2017-11/index.html @@ -38,7 +38,7 @@ COPY 54701 - + @@ -89,7 +89,7 @@ COPY 54701 "url": "https://alanorth.github.io/cgspace-notes/2017-11/", "wordCount": "5428", "datePublished": "2017-11-02T09:37:54+02:00", - "dateModified": "2018-01-12T06:07:03+02:00", + "dateModified": "2018-03-09T22:10:33+02:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -124,8 +124,6 @@ COPY 54701
diff --git a/docs/2017-12/index.html b/docs/2017-12/index.html index 8dc6ae775..616963470 100644 --- a/docs/2017-12/index.html +++ b/docs/2017-12/index.html @@ -23,7 +23,7 @@ The list of connections to XMLUI and REST API for today: - + @@ -59,7 +59,7 @@ The list of connections to XMLUI and REST API for today: "url": "https://alanorth.github.io/cgspace-notes/2017-12/", "wordCount": "4088", "datePublished": "2017-12-01T13:53:54+03:00", - "dateModified": "2017-12-31T10:42:16-08:00", + "dateModified": "2018-03-09T22:10:33+02:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -94,8 +94,6 @@ The list of connections to XMLUI and REST API for today:
diff --git a/docs/2018-01/index.html b/docs/2018-01/index.html index 87eb16ce6..ee17cf851 100644 --- a/docs/2018-01/index.html +++ b/docs/2018-01/index.html @@ -92,7 +92,7 @@ Danny wrote to ask for help renewing the wildcard ilri.org certificate and I adv - + @@ -197,7 +197,7 @@ Danny wrote to ask for help renewing the wildcard ilri.org certificate and I adv "url": "https://alanorth.github.io/cgspace-notes/2018-01/", "wordCount": "7940", "datePublished": "2018-01-02T08:35:54-08:00", - "dateModified": "2018-01-31T16:17:39+02:00", + "dateModified": "2018-03-09T22:10:33+02:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -232,8 +232,6 @@ Danny wrote to ask for help renewing the wildcard ilri.org certificate and I adv
diff --git a/docs/2018-02/index.html b/docs/2018-02/index.html index 7796e8445..a7f7bba5a 100644 --- a/docs/2018-02/index.html +++ b/docs/2018-02/index.html @@ -23,7 +23,7 @@ I copied the logic in the jmx_tomcat_dbpools provided by Ubuntu’s munin-pl - + @@ -59,7 +59,7 @@ I copied the logic in the jmx_tomcat_dbpools provided by Ubuntu’s munin-pl "url": "https://alanorth.github.io/cgspace-notes/2018-02/", "wordCount": "6400", "datePublished": "2018-02-01T16:28:54+02:00", - "dateModified": "2018-02-28T17:30:16+02:00", + "dateModified": "2018-03-09T22:10:33+02:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -94,8 +94,6 @@ I copied the logic in the jmx_tomcat_dbpools provided by Ubuntu’s munin-pl
diff --git a/docs/2018-03/index.html b/docs/2018-03/index.html index ee6628f1e..b764511ef 100644 --- a/docs/2018-03/index.html +++ b/docs/2018-03/index.html @@ -20,7 +20,7 @@ Export a CSV of the IITA community metadata for Martin Mueller - + @@ -51,9 +51,9 @@ Export a CSV of the IITA community metadata for Martin Mueller "@type": "BlogPosting", "headline": "March, 2018", "url": "https://alanorth.github.io/cgspace-notes/2018-03/", - "wordCount": "780", + "wordCount": "807", "datePublished": "2018-03-02T16:07:54+02:00", - "dateModified": "2018-03-08T21:29:37+02:00", + "dateModified": "2018-03-09T22:10:33+02:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -88,8 +88,6 @@ Export a CSV of the IITA community metadata for Martin Mueller
@@ -272,6 +270,13 @@ UPDATE 2309
  • Also, I decided to add ORCID identifiers for all records from Peter, Abenet, and Sisay as well
  • +

    2018-03-09

    + + + diff --git a/docs/404.html b/docs/404.html index f780e0010..52e71fce4 100644 --- a/docs/404.html +++ b/docs/404.html @@ -60,8 +60,6 @@
    diff --git a/docs/categories/index.html b/docs/categories/index.html index 50b361e2f..a0fc920b5 100644 --- a/docs/categories/index.html +++ b/docs/categories/index.html @@ -73,8 +73,6 @@
    @@ -102,6 +100,387 @@ +
    +
    +

    March, 2018

    + +
    +

    2018-03-02

    + + + +

    + Read more → +
    + + + + + + +
    +
    +

    February, 2018

    + +
    +

    2018-02-01

    + + + +

    + Read more → +
    + + + + + + +
    +
    +

    January, 2018

    + +
    +

    2018-01-02

    + + + +
    Caused by: org.apache.tomcat.jdbc.pool.PoolExhaustedException: [http-bio-127.0.0.1-8443-exec-980] Timeout: Pool empty. Unable to fetch a connection in 5 seconds, none available[size:50; busy:50; idle:0; lastwait:5000].
    +
    + + + +
    2018-01-02 01:21:19,137 ERROR org.dspace.app.xmlui.aspect.discovery.SidebarFacetsTransformer @ Error while searching for sidebar facets
    +org.dspace.discovery.SearchServiceException: org.apache.solr.search.SyntaxError: Cannot parse 'dateIssued_keyword:[1976+TO+1979]': Encountered " "]" "] "" at line 1, column 32.
    +
    + + + +
    $ grep -c "Error while searching for sidebar facets" dspace.log.*
    +dspace.log.2017-11-21:4
    +dspace.log.2017-11-22:1
    +dspace.log.2017-11-23:4
    +dspace.log.2017-11-24:11
    +dspace.log.2017-11-25:0
    +dspace.log.2017-11-26:1
    +dspace.log.2017-11-27:7
    +dspace.log.2017-11-28:21
    +dspace.log.2017-11-29:31
    +dspace.log.2017-11-30:15
    +dspace.log.2017-12-01:15
    +dspace.log.2017-12-02:20
    +dspace.log.2017-12-03:38
    +dspace.log.2017-12-04:65
    +dspace.log.2017-12-05:43
    +dspace.log.2017-12-06:72
    +dspace.log.2017-12-07:27
    +dspace.log.2017-12-08:15
    +dspace.log.2017-12-09:29
    +dspace.log.2017-12-10:35
    +dspace.log.2017-12-11:20
    +dspace.log.2017-12-12:44
    +dspace.log.2017-12-13:36
    +dspace.log.2017-12-14:59
    +dspace.log.2017-12-15:104
    +dspace.log.2017-12-16:53
    +dspace.log.2017-12-17:66
    +dspace.log.2017-12-18:83
    +dspace.log.2017-12-19:101
    +dspace.log.2017-12-20:74
    +dspace.log.2017-12-21:55
    +dspace.log.2017-12-22:66
    +dspace.log.2017-12-23:50
    +dspace.log.2017-12-24:85
    +dspace.log.2017-12-25:62
    +dspace.log.2017-12-26:49
    +dspace.log.2017-12-27:30
    +dspace.log.2017-12-28:54
    +dspace.log.2017-12-29:68
    +dspace.log.2017-12-30:89
    +dspace.log.2017-12-31:53
    +dspace.log.2018-01-01:45
    +dspace.log.2018-01-02:34
    +
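The per-day counts above come from `grep -c`. A minimal Python sketch of the same tally follows, in case the counts need sorting or further processing. The function names and the `top` parameter are hypothetical and not part of the original notes; the search string and the `dspace.log.*` file pattern are taken from the command above.

```python
from pathlib import Path

# Search string copied from the grep command in the notes.
NEEDLE = "Error while searching for sidebar facets"


def count_matching_lines(log_dir, pattern="dspace.log.*", needle=NEEDLE):
    """Return {file name: number of lines containing needle}, like `grep -c`."""
    counts = {}
    for path in sorted(Path(log_dir).glob(pattern)):
        # errors="replace" keeps the scan going if a log has stray bytes.
        with path.open(encoding="utf-8", errors="replace") as handle:
            counts[path.name] = sum(1 for line in handle if needle in line)
    return counts


def busiest_days(counts, top=3):
    """Return the `top` (file name, count) pairs with the highest counts."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)[:top]
```

Calling `busiest_days(count_matching_lines("/path/to/logs"))` would list the days with the most errors first; the directory path here is a placeholder.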
    + + + +

    + Read more → +
    + + + + + + +
    +
    +

    December, 2017

    + +
    +

    2017-12-01

    + + + +

    + Read more → +
    + + + + + + +
    +
    +

    November, 2017

    + +
    +

    2017-11-01

    + + + +

    2017-11-02

    + + + +
    # grep -c "CORE" /var/log/nginx/access.log
    +0
    +
    + + + +
    dspace=# \copy (select distinct text_value, count(*) as count from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 group by text_value order by count desc) to /tmp/authors.csv with csv;
    +COPY 54701
    +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    October, 2017

    + +
    +

    2017-10-01

    + + + +
    http://hdl.handle.net/10568/78495||http://hdl.handle.net/10568/79336
    +
    + + + +

    + Read more → +
    + + + + + + +
    +
    +

    CGIAR Library Migration

    + +
    +

    Rough notes for importing the CGIAR Library content. It was decided that this content would go to a new top-level community called CGIAR System Organization.

    + +

    + Read more → +
    + + + + + + +
    +
    +

    September, 2017

    + +
    +

    2017-09-06

    + + + +

    2017-09-07

    + + + +

    + Read more → +
    + + + + + + +
    +
    +

    August, 2017

    + +
    +

    2017-08-01

    + + + +

    + Read more → +
    + + + + + + +
    +
    +

    July, 2017

    + +
    +

    2017-07-01

    + + + +

    2017-07-04

    + + + +

    + Read more → +
    + + + + + + + + diff --git a/docs/categories/notes/index.html b/docs/categories/notes/index.html index bf7121b6b..fd9865156 100644 --- a/docs/categories/notes/index.html +++ b/docs/categories/notes/index.html @@ -74,8 +74,6 @@
    @@ -103,6 +101,254 @@ +
    +
    +

    March, 2018

    + +
    +

    2018-03-02

    + + + +

    + Read more → +
    + + + + + + +
    +
    +

    February, 2018

    + +
    +

    2018-02-01

    + + + +

    + Read more → +
    + + + + + + +
    +
    +

    January, 2018

    + +
    +

    2018-01-02

    + + + +
    Caused by: org.apache.tomcat.jdbc.pool.PoolExhaustedException: [http-bio-127.0.0.1-8443-exec-980] Timeout: Pool empty. Unable to fetch a connection in 5 seconds, none available[size:50; busy:50; idle:0; lastwait:5000].
    +
    + + + +
    2018-01-02 01:21:19,137 ERROR org.dspace.app.xmlui.aspect.discovery.SidebarFacetsTransformer @ Error while searching for sidebar facets
    +org.dspace.discovery.SearchServiceException: org.apache.solr.search.SyntaxError: Cannot parse 'dateIssued_keyword:[1976+TO+1979]': Encountered " "]" "] "" at line 1, column 32.
    +
    + + + +
    $ grep -c "Error while searching for sidebar facets" dspace.log.*
    +dspace.log.2017-11-21:4
    +dspace.log.2017-11-22:1
    +dspace.log.2017-11-23:4
    +dspace.log.2017-11-24:11
    +dspace.log.2017-11-25:0
    +dspace.log.2017-11-26:1
    +dspace.log.2017-11-27:7
    +dspace.log.2017-11-28:21
    +dspace.log.2017-11-29:31
    +dspace.log.2017-11-30:15
    +dspace.log.2017-12-01:15
    +dspace.log.2017-12-02:20
    +dspace.log.2017-12-03:38
    +dspace.log.2017-12-04:65
    +dspace.log.2017-12-05:43
    +dspace.log.2017-12-06:72
    +dspace.log.2017-12-07:27
    +dspace.log.2017-12-08:15
    +dspace.log.2017-12-09:29
    +dspace.log.2017-12-10:35
    +dspace.log.2017-12-11:20
    +dspace.log.2017-12-12:44
    +dspace.log.2017-12-13:36
    +dspace.log.2017-12-14:59
    +dspace.log.2017-12-15:104
    +dspace.log.2017-12-16:53
    +dspace.log.2017-12-17:66
    +dspace.log.2017-12-18:83
    +dspace.log.2017-12-19:101
    +dspace.log.2017-12-20:74
    +dspace.log.2017-12-21:55
    +dspace.log.2017-12-22:66
    +dspace.log.2017-12-23:50
    +dspace.log.2017-12-24:85
    +dspace.log.2017-12-25:62
    +dspace.log.2017-12-26:49
    +dspace.log.2017-12-27:30
    +dspace.log.2017-12-28:54
    +dspace.log.2017-12-29:68
    +dspace.log.2017-12-30:89
    +dspace.log.2017-12-31:53
    +dspace.log.2018-01-01:45
    +dspace.log.2018-01-02:34
    +
    + + + +

    + Read more → +
    + + + + + + +
    +
    +

    December, 2017

    + +
    +

    2017-12-01

    + + + +

    + Read more → +
    + + + + + + +
    +
    +

    November, 2017

    + +
    +

    2017-11-01

    + + + +

    2017-11-02

    + + + +
    # grep -c "CORE" /var/log/nginx/access.log
    +0
    +
    + + + +
    dspace=# \copy (select distinct text_value, count(*) as count from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 group by text_value order by count desc) to /tmp/authors.csv with csv;
    +COPY 54701
    +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    October, 2017

    + +
    +

    2017-10-01

    + + + +
    http://hdl.handle.net/10568/78495||http://hdl.handle.net/10568/79336
    +
    + + + +

    + Read more → +
    + + + + + +

    CGIAR Library Migration

    @@ -123,6 +369,119 @@ +
    +
    +

    September, 2017

    + +
    +

    2017-09-06

    + +
      +
    • Linode sent an alert that CGSpace (linode18) was using 261% CPU for the past two hours
    • +
    + +

    2017-09-07

    + +
      +
    • Ask Sisay to clean up the WLE approvers a bit, as Marianne’s user account is both in the approvers step as well as the group
    • +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    August, 2017

    + +
    +

    2017-08-01

    + +
      +
    • Linode sent an alert that CGSpace (linode18) was using 350% CPU for the past two hours
    • +
    • I looked in the Activity pane of the Admin Control Panel and it seems that Google, Baidu, Yahoo, and Bing are all crawling with massive numbers of bots concurrently (~100 total, mostly Baidu and Google)
    • +
    • The good thing is that, according to dspace.log.2017-08-01, they are all using the same Tomcat session
    • +
    • This means our Tomcat Crawler Session Valve is working
    • +
    • But many of the bots are browsing dynamic URLs like: + +
        +
      • /handle/10568/3353/discover
      • +
      • /handle/10568/16510/browse
      • +
    • +
    • The robots.txt only blocks the top-level /discover and /browse URLs… we will need to find a way to forbid them from accessing these!
    • +
    • Relevant issue from DSpace Jira (semi resolved in DSpace 6.0): https://jira.duraspace.org/browse/DS-2962
    • +
    • It turns out that we’re already adding the X-Robots-Tag "none" HTTP header, but this only forbids the search engine from indexing the page, not crawling it!
    • +
    • Also, the bot has to successfully browse the page first so it can receive the HTTP header…
    • +
    • We might actually have to block these requests with HTTP 403 depending on the user agent
    • +
    • Abenet pointed out that the CGIAR Library Historical Archive collection I sent July 20th only had ~100 entries, instead of 2415
    • +
    • This was due to newline characters in the dc.description.abstract column, which caused OpenRefine to choke when exporting the CSV
    • +
    • I exported a new CSV from the collection on DSpace Test and then manually removed the characters in vim using g/^$/d
    • +
    • Then I cleaned up the author authorities and HTML characters in OpenRefine and sent the file back to Abenet
    • +
    + +
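The idea above of answering bots with HTTP 403 on the per-handle `/discover` and `/browse` URLs can be sketched as a small decision function. This is only an illustration of the logic, not the configuration actually deployed on CGSpace: the user-agent pattern, the path pattern, and the function name are all assumptions, and in practice the rule would more likely live in the web server configuration.

```python
import re

# Hypothetical patterns for illustration; a real deployment would tune both.
BOT_PATTERN = re.compile(r"bot|crawl|spider|slurp", re.IGNORECASE)
DYNAMIC_PATH = re.compile(r"^/handle/\d+/\d+/(discover|browse)(/|$)")


def status_for(path, user_agent):
    """Return 403 for bot requests to per-handle discover/browse pages, else 200."""
    path_only = path.split("?", 1)[0]  # ignore any query string
    if DYNAMIC_PATH.match(path_only) and BOT_PATTERN.search(user_agent or ""):
        return 403
    return 200
```

Unlike the `X-Robots-Tag` header, a check like this rejects the request before the dynamic page is rendered, so the bot never needs to load the page to learn it is unwelcome.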

    + Read more → +
    + + + + + + +
    +
    +

    July, 2017

    + +
    +

    2017-07-01

    + +
      +
    • Run system updates and reboot DSpace Test
    • +
    + +

    2017-07-04

    + +
      +
    • Merge changes for WLE Phase II theme rename (#329)
    • +
    • Looking at extracting the metadata registries from ICARDA’s MEL DSpace database so we can compare fields with CGSpace
    • +
    • We can use PostgreSQL’s extended output format (-x) plus sed to format the output into quasi XML:
    • +
    + +
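The command the note refers to is not shown in this excerpt, and the sketch below is not a reconstruction of it. It is a hypothetical Python equivalent of the same idea, assuming the text comes from `psql -x`, where each record starts with a `-[ RECORD n ]` line followed by `name | value` lines. The function name and the `record` tag are made up for illustration, and values wrapped across several lines are not handled.

```python
from xml.sax.saxutils import escape


def expanded_to_xml(text, record_tag="record"):
    """Convert psql expanded (-x) output into quasi XML, one element per column."""
    records = []
    current = None
    for line in text.splitlines():
        if line.startswith("-[ RECORD"):
            # A record header line opens a new record.
            current = []
            records.append(current)
        elif "|" in line and current is not None:
            # Split on the first "|" only, so values may contain "|".
            name, value = line.split("|", 1)
            current.append((name.strip(), value.strip()))
    out = []
    for fields in records:
        out.append(f"<{record_tag}>")
        for name, value in fields:
            out.append(f"  <{name}>{escape(value)}</{name}>")
        out.append(f"</{record_tag}>")
    return "\n".join(out)
```

Column names are used as element names without validation, which is fine for plain identifiers such as `element` or `qualifier` but would need checking for anything else.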

    + Read more → +
    + + + + + + + + diff --git a/docs/categories/notes/page/2/index.html b/docs/categories/notes/page/2/index.html new file mode 100644 index 000000000..9aba01cbf --- /dev/null +++ b/docs/categories/notes/page/2/index.html @@ -0,0 +1,484 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + +
    +
    + +
    +
    + + + +
    +
    +

    CGSpace Notes

    +

    Documenting day-to-day work on the CGSpace repository.

    +
    +
    + + + +
    +
    +
    + + + + + + + + + +
    +
    +

    June, 2017

    + +
    + 2017-06-01 After discussion with WLE and CGSpace content people, we decided to just add one metadata field for the WLE Research Themes The cg.identifier.wletheme field will be used for both Phase I and Phase II Research Themes Then we’ll create a new sub-community for Phase II and create collections for the research themes there The current “Research Themes” community will be renamed to “WLE Phase I Research Themes” Tagged all items in the current Phase I collections with their appropriate themes Create pull request to add Phase II research themes to the submission form: #328 Add cg. + Read more → +
    + + + + + + +
    +
    +

    May, 2017

    + +
    + 2017-05-01 ICARDA apparently started working on CG Core on their MEL repository They have done a few cg.* fields, but not very consistently, and they even copied some CGSpace items: https://mel.cgiar.org/xmlui/handle/20.500.11766/6911?show=full https://cgspace.cgiar.org/handle/10568/73683 2017-05-02 Atmire got back about the Workflow Statistics issue, and apparently it’s a bug in the CUA module so they will send us a pull request 2017-05-04 Sync DSpace Test with database and assetstore from CGSpace Re-deploy DSpace Test with Atmire’s CUA patch for workflow statistics, run system updates, and restart the server Now I can see the workflow statistics and am able to select users, but everything returns 0 items Megan says there are still some mapped items that are not appearing since last week, so I forced a full index-discovery -b Need to remember to check if the collection has more items (currently 39 on CGSpace, but 118 on the freshly reindexed DSpace Test) tomorrow: https://cgspace. + Read more → +
    + + + + + + +
    +
    +

    April, 2017

    + +
    +

    2017-04-02

    + +
      +
    • Merge one change to CCAFS flagships that I had forgotten to remove last month (“MANAGING CLIMATE RISK”): https://github.com/ilri/DSpace/pull/317
    • +
    • Quick proof-of-concept hack to add dc.rights to the input form, including some inline instructions/hints:
    • +
    + +

    dc.rights in the submission form

    + +
      +
    • Remove redundant/duplicate text in the DSpace submission license
    • +
    • Testing the CMYK patch on a collection with 650 items:
    • +
    + +
    $ [dspace]/bin/dspace filter-media -f -i 10568/16498 -p "ImageMagick PDF Thumbnail" -v >& /tmp/filter-media-cmyk.txt
    +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    March, 2017

    + +
    +

    2017-03-01

    + +
      +
    • Run the 279 CIAT author corrections on CGSpace
    • +
    + +

    2017-03-02

    + +
      +
    • Skype with Michael and Peter, discussing moving the CGIAR Library to CGSpace
    • +
    • CGIAR people possibly open to moving content, redirecting library.cgiar.org to CGSpace and letting CGSpace resolve their handles
    • +
    • They might come in at the top level in one “CGIAR System” community, or with several communities
    • +
    • I need to spend a bit of time looking at the multiple handle support in DSpace and see if new content can be minted in both handles, or just one?
    • +
    • Need to send Peter and Michael some notes about this in a few days
    • +
    • Also, need to consider talking to Atmire about hiring them to bring ORCiD metadata to REST / OAI
    • +
    • Filed an issue on DSpace issue tracker for the filter-media bug that causes it to process JPGs even when limiting to the PDF thumbnail plugin: DS-3516
    • +
    • Discovered that the ImageMagick filter-media plugin creates JPG thumbnails with the CMYK colorspace when the source PDF is using CMYK
    • +
    • Interestingly, it seems DSpace 4.x’s thumbnails were sRGB, but forcing regeneration using DSpace 5.x’s ImageMagick plugin creates CMYK JPGs if the source PDF was CMYK (see 10568/51999):
    • +
    + +
    $ identify ~/Desktop/alc_contrastes_desafios.jpg
    +/Users/aorth/Desktop/alc_contrastes_desafios.jpg JPEG 464x600 464x600+0+0 8-bit CMYK 168KB 0.000u 0:00.000
    +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    February, 2017

    + +
    +

    2017-02-07

    + +
      +
    • An item was mapped twice erroneously again, so I had to remove one of the mappings manually:
    • +
    + +
    dspace=# select * from collection2item where item_id = '80278';
    +  id   | collection_id | item_id
    +-------+---------------+---------
    + 92551 |           313 |   80278
    + 92550 |           313 |   80278
    + 90774 |          1051 |   80278
    +(3 rows)
    +dspace=# delete from collection2item where id = 92551 and item_id = 80278;
    +DELETE 1
    +
    + +
      +
    • Create issue on GitHub to track the addition of CCAFS Phase II project tags (#301)
    • +
    • Looks like we’ll be using cg.identifier.ccafsprojectpii as the field name
    • +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    January, 2017

    + +
    +

    2017-01-02

    + +
      +
    • I checked to see if the Solr sharding task that is supposed to run on January 1st had run and saw there was an error
    • +
    • I tested on DSpace Test as well and it doesn’t work there either
    • +
    • I asked on the dspace-tech mailing list because it seems to be broken, and actually now I’m not sure if we’ve ever had the sharding task run successfully over all these years
    • +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    December, 2016

    + +
    +

    2016-12-02

    + +
      +
    • CGSpace was down for five hours in the morning while I was sleeping
    • +
    • While looking in the logs for errors, I see tons of warnings about Atmire MQM:
    • +
    + +
    2016-12-02 03:00:32,352 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=CREATE, SubjectType=BUNDLE, SubjectID=70316, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632305, dispatcher=1544803905, detail=[null], transactionID="TX157907838689377964651674089851855413607")
    +2016-12-02 03:00:32,353 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=MODIFY_METADATA, SubjectType=BUNDLE, SubjectID=70316, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632309, dispatcher=1544803905, detail="dc.title", transactionID="TX157907838689377964651674089851855413607")
    +2016-12-02 03:00:32,353 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=ADD, SubjectType=ITEM, SubjectID=80044, ObjectType=BUNDLE, ObjectID=70316, TimeStamp=1480647632311, dispatcher=1544803905, detail="THUMBNAIL", transactionID="TX157907838689377964651674089851855413607")
    +2016-12-02 03:00:32,353 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=ADD, SubjectType=BUNDLE, SubjectID=70316, ObjectType=BITSTREAM, ObjectID=86715, TimeStamp=1480647632318, dispatcher=1544803905, detail="-1", transactionID="TX157907838689377964651674089851855413607")
    +2016-12-02 03:00:32,353 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=MODIFY, SubjectType=ITEM, SubjectID=80044, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632351, dispatcher=1544803905, detail=[null], transactionID="TX157907838689377964651674089851855413607")
    +
    + +
      +
    • I see thousands of them in the logs for the last few months, so it’s not related to the DSpace 5.5 upgrade
    • +
    • I’ve raised a ticket with Atmire to ask
    • +
    • Another worrying error from dspace.log is:
    • +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    November, 2016

    + +
    +

    2016-11-01

    + +
      +
    • Add dc.type to the output options for Atmire’s Listings and Reports module (#286)
    • +
    + +

    Listings and Reports with output type

    + +

    + Read more → +
    + + + + + + +
    +
    +

    October, 2016

    + +
    +

    2016-10-03

    + +
      +
    • Testing adding ORCIDs to a CSV file for a single item to see if the author orders get messed up
    • +
    • Need to test the following scenarios to see how author order is affected: + +
        +
      • ORCIDs only
      • +
      • ORCIDs plus normal authors
      • +
    • +
    • I exported a random item’s metadata as CSV, deleted all columns except id and collection, and made a new column called ORCID:dc.contributor.author with the following random ORCIDs from the ORCID registry:
    • +
    + +
    0000-0002-6115-0956||0000-0002-3812-8793||0000-0001-7462-405X
    +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    September, 2016

    + +
    +

    2016-09-01

    + +
      +
    • Discuss helping CCAFS with some batch tagging of ORCID IDs for their authors
    • +
    • Discuss how the migration of CGIAR’s Active Directory to a flat structure will break our LDAP groups in DSpace
    • +
    • We had been using DC=ILRI to determine whether a user was ILRI or not
    • +
    • It looks like we might be able to use OUs now, instead of DCs:
    • +
    + +
    $ ldapsearch -x -H ldaps://svcgroot2.cgiarad.org:3269/ -b "dc=cgiarad,dc=org" -D "admigration1@cgiarad.org" -W "(sAMAccountName=admigration1)"
    +
    + +

    + Read more → +
    + + + + + + + + + + +
    + + + + +
    +
    + + + + + + + + + diff --git a/docs/categories/notes/page/3/index.html b/docs/categories/notes/page/3/index.html new file mode 100644 index 000000000..1885802fa --- /dev/null +++ b/docs/categories/notes/page/3/index.html @@ -0,0 +1,481 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + +
    +
    + +
    +
    + + + +
    +
    +

    CGSpace Notes

    +

    Documenting day-to-day work on the CGSpace repository.

    +
    +
    + + + +
    +
    +
    + + + + + + + + + +
    +
    +

    August, 2016

    + +
    +

    2016-08-01

    + +
      +
    • Add updated distribution license from Sisay (#259)
    • +
    • Play with upgrading Mirage 2 dependencies in bower.json because most are several versions out of date
    • +
    • Bootstrap is at 3.3.0 but upstream is at 3.3.7, and upgrading to anything beyond 3.3.1 breaks glyphicons and probably more
    • +
    • bower stuff is a dead end, waste of time, too many issues
    • +
    • Anything after Bootstrap 3.3.1 makes glyphicons disappear (HTTP 404 trying to access from incorrect path of fonts)
    • +
    • Start working on DSpace 5.1 → 5.5 port:
    • +
    + +
    $ git checkout -b 55new 5_x-prod
    +$ git reset --hard ilri/5_x-prod
    +$ git rebase -i dspace-5.5
    +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    July, 2016

    + +
    +

    2016-07-01

    + +
      +
    • Add dc.description.sponsorship to Discovery sidebar facets and make investors clickable in item view (#232)
    • +
    • I think this query should find and replace all authors that have “,” at the end of their names:
    • +
    + +
    dspacetest=# update metadatavalue set text_value = regexp_replace(text_value, '(^.+?),$', '\1') where metadata_field_id=3 and resource_type_id=2 and text_value ~ '^.+?,$';
    +UPDATE 95
    +dspacetest=# select text_value from  metadatavalue where metadata_field_id=3 and resource_type_id=2 and text_value ~ '^.+?,$';
    + text_value
    +------------
    +(0 rows)
    +
    + +
      +
    • In this case the select query was showing 95 results before the update
    • +
    + +
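The trailing-comma pattern can be tried on sample values before touching the database. This is a minimal sketch using sed, not the PostgreSQL call itself; the helper name `strip_trailing_comma` is hypothetical, and the expression is a simplified greedy form of the regex in the query above:

```shell
# Hypothetical helper: remove a single trailing comma from each input line.
# Lines without a trailing comma pass through unchanged.
strip_trailing_comma() {
    sed -E 's/^(.+),$/\1/'
}

# Example: printf 'Orth, A.,\n' | strip_trailing_comma
```

Because the pattern is anchored at both ends, only the final comma is dropped and commas inside the name are left alone.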

    + Read more → +
    + + + + + + +
    +
    +

    June, 2016

    + +
    +

    2016-06-01

    + + + +

    + Read more → +
    + + + + + + +
    +
    +

    May, 2016

    + +
    +

    2016-05-01

    + +
      +
    • Since yesterday there have been 10,000 REST errors and the site has been unstable again
    • +
    • I have blocked access to the API now
    • +
    • There are 3,000 IPs accessing the REST API in a 24-hour period!
    • +
    + +
    # awk '{print $1}' /var/log/nginx/rest.log  | uniq | wc -l
    +3168
    +
    + +
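Note that `uniq` only collapses adjacent duplicate lines, so on an unsorted access log the count above may overstate the number of distinct addresses. A minimal sketch of a stricter count, assuming the client IP is the first whitespace-separated field of each line; the `count_unique_ips` helper name is hypothetical:

```shell
# Hypothetical helper: count distinct client IPs in an access log.
# Assumes the IP is the first whitespace-separated field on each line.
count_unique_ips() {
    # sort -u deduplicates across the whole file, not just adjacent lines
    awk '{print $1}' "$1" | sort -u | wc -l | tr -d ' '
}

# Example: count_unique_ips /var/log/nginx/rest.log
```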

    + Read more → +
    + + + + + + +
    +
    +

    April, 2016

    + +
    +

    2016-04-04

    + +
      +
    • Looking at log file use on CGSpace, I noticed that we need to work on our cron setup a bit
    • +
    • We are backing up all logs in the log folder, including useless stuff like solr, cocoon, handle-plugin, etc
    • +
    • After running DSpace for over five years I’ve never needed to look in any other log file than dspace.log, let alone one from last year!
    • +
    • This will save us a few gigs of backup space we’re paying for on S3
    • +
    • Also, I noticed the checker log has some errors we should pay attention to:
    • +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    March, 2016

    + +
    +

    2016-03-02

    + +
      +
    • Looking at issues with author authorities on CGSpace
    • +
    • For some reason we still have the index-lucene-update cron job active on CGSpace, but I’m pretty sure we don’t need it as of the latest few versions of Atmire’s Listings and Reports module
    • +
    • Reinstall my local (Mac OS X) DSpace stack with Tomcat 7, PostgreSQL 9.3, and Java JDK 1.7 to match environment on CGSpace server
    • +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    February, 2016

    + +
    +

    2016-02-05

    + +
      +
    • Looking at some DAGRIS data for Abenet Yabowork
    • +
    • Lots of issues with spaces, newlines, etc causing the import to fail
    • +
    • I noticed we have a very interesting list of countries on CGSpace:
    • +
    + +

    CGSpace country list

    + +
      +
    • Not only are there 49,000 countries, we have some blanks (25)…
    • +
    • Also, lots of things like “COTE D`LVOIRE” and “COTE D IVOIRE”
    • +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    January, 2016

    + +
    +

    2016-01-13

    + +
      +
    • Move ILRI collection 10568/12503 from 10568/27869 to 10568/27629 using the move_collections.sh script I wrote last year.
    • +
    • I realized it is only necessary to clear the Cocoon cache after moving collections—rather than reindexing—as no metadata has changed, and therefore no search or browse indexes need to be updated.
    • +
    • Update GitHub wiki for documentation of maintenance tasks.
    • +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    December, 2015

    + +
    +

    2015-12-02

    + +
      +
    • Replace lzop with xz in log compression cron jobs on DSpace Test—it uses less space:
    • +
    + +
    # cd /home/dspacetest.cgiar.org/log
    +# ls -lh dspace.log.2015-11-18*
    +-rw-rw-r-- 1 tomcat7 tomcat7 2.0M Nov 18 23:59 dspace.log.2015-11-18
    +-rw-rw-r-- 1 tomcat7 tomcat7 387K Nov 18 23:59 dspace.log.2015-11-18.lzo
    +-rw-rw-r-- 1 tomcat7 tomcat7 169K Nov 18 23:59 dspace.log.2015-11-18.xz
    +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    November, 2015

    + +
    +

    2015-11-22

    + +
      +
    • CGSpace went down
    • +
    • Looks like DSpace exhausted its PostgreSQL connection pool
    • +
    • Last week I had increased the limit from 30 to 60, which seemed to help, but now there are many more idle connections:
    • +
    + +
    $ psql -c 'SELECT * from pg_stat_activity;' | grep idle | grep -c cgspace
    +78
    +
    + +

    + Read more → +
    + + + + + + + + + + +
    + + + + +
    +
    + + + + + + + + + diff --git a/docs/categories/notes/page/4/index.html b/docs/categories/notes/page/4/index.html new file mode 100644 index 000000000..faf203b7d --- /dev/null +++ b/docs/categories/notes/page/4/index.html @@ -0,0 +1,191 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + +
    +
    + +
    +
    + + + +
    +
    +

    CGSpace Notes

    +

    Documenting day-to-day work on the CGSpace repository.

    +
    +
    + + + +
    +
    +
    + + + + + + + + + + + + + + + + + + + + +
    + + + + +
    +
    + + + + + + + + + diff --git a/docs/categories/page/2/index.html b/docs/categories/page/2/index.html new file mode 100644 index 000000000..c86e25129 --- /dev/null +++ b/docs/categories/page/2/index.html @@ -0,0 +1,484 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + +
    +
    + +
    +
    + + + +
    +
    +

    CGSpace Notes

    +

    Documenting day-to-day work on the CGSpace repository.

    +
    +
    + + + +
    +
    +
    + + + + + + + + + +
    +
    +

    June, 2017

    + +
    + 2017-06-01 After discussion with WLE and CGSpace content people, we decided to just add one metadata field for the WLE Research Themes The cg.identifier.wletheme field will be used for both Phase I and Phase II Research Themes Then we’ll create a new sub-community for Phase II and create collections for the research themes there The current “Research Themes” community will be renamed to “WLE Phase I Research Themes” Tagged all items in the current Phase I collections with their appropriate themes Create pull request to add Phase II research themes to the submission form: #328 Add cg. + Read more → +
    + + + + + + +
    +
    +

    May, 2017

    + +
    + 2017-05-01 ICARDA apparently started working on CG Core on their MEL repository They have done a few cg.* fields, but not very consistently, and they even copied some CGSpace items: https://mel.cgiar.org/xmlui/handle/20.500.11766/6911?show=full https://cgspace.cgiar.org/handle/10568/73683 2017-05-02 Atmire got back about the Workflow Statistics issue, and apparently it’s a bug in the CUA module so they will send us a pull request 2017-05-04 Sync DSpace Test with database and assetstore from CGSpace Re-deploy DSpace Test with Atmire’s CUA patch for workflow statistics, run system updates, and restart the server Now I can see the workflow statistics and am able to select users, but everything returns 0 items Megan says there are still some mapped items that are not appearing since last week, so I forced a full index-discovery -b Need to remember to check if the collection has more items (currently 39 on CGSpace, but 118 on the freshly reindexed DSpace Test) tomorrow: https://cgspace. + Read more → +
    + + + + + + +
    +
    +

    April, 2017

    + +
    +

    2017-04-02

    + +
      +
    • Merge one change to CCAFS flagships that I had forgotten to remove last month (“MANAGING CLIMATE RISK”): https://github.com/ilri/DSpace/pull/317
    • +
    • Quick proof-of-concept hack to add dc.rights to the input form, including some inline instructions/hints:
    • +
    + +

    dc.rights in the submission form

    + +
      +
    • Remove redundant/duplicate text in the DSpace submission license
    • +
    • Testing the CMYK patch on a collection with 650 items:
    • +
    + +
    $ [dspace]/bin/dspace filter-media -f -i 10568/16498 -p "ImageMagick PDF Thumbnail" -v >& /tmp/filter-media-cmyk.txt
    +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    March, 2017

    + +
    +

    2017-03-01

    + +
      +
    • Run the 279 CIAT author corrections on CGSpace
    • +
    + +

    2017-03-02

    + +
      +
    • Skype with Michael and Peter, discussing moving the CGIAR Library to CGSpace
    • +
    • CGIAR people possibly open to moving content, redirecting library.cgiar.org to CGSpace and letting CGSpace resolve their handles
    • +
    • They might come in at the top level in one “CGIAR System” community, or with several communities
    • +
    • I need to spend a bit of time looking at the multiple handle support in DSpace and see if new content can be minted in both handles, or just one?
    • +
    • Need to send Peter and Michael some notes about this in a few days
    • +
    • Also, need to consider talking to Atmire about hiring them to bring ORCiD metadata to REST / OAI
    • +
    • Filed an issue on DSpace issue tracker for the filter-media bug that causes it to process JPGs even when limiting to the PDF thumbnail plugin: DS-3516
    • +
    • Discovered that the ImageMagick filter-media plugin creates JPG thumbnails with the CMYK colorspace when the source PDF is using CMYK
    • +
    • Interestingly, it seems DSpace 4.x’s thumbnails were sRGB, but forcing regeneration using DSpace 5.x’s ImageMagick plugin creates CMYK JPGs if the source PDF was CMYK (see 10568/51999):
    • +
    + +
    $ identify ~/Desktop/alc_contrastes_desafios.jpg
    +/Users/aorth/Desktop/alc_contrastes_desafios.jpg JPEG 464x600 464x600+0+0 8-bit CMYK 168KB 0.000u 0:00.000
    +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    February, 2017

    + +
    +

    2017-02-07

    + +
      +
    • An item was mapped twice erroneously again, so I had to remove one of the mappings manually:
    • +
    + +
    dspace=# select * from collection2item where item_id = '80278';
    +  id   | collection_id | item_id
    +-------+---------------+---------
    + 92551 |           313 |   80278
    + 92550 |           313 |   80278
    + 90774 |          1051 |   80278
    +(3 rows)
    +dspace=# delete from collection2item where id = 92551 and item_id = 80278;
    +DELETE 1
    +
    + +
      +
    • Create issue on GitHub to track the addition of CCAFS Phase II project tags (#301)
    • +
    • Looks like we’ll be using cg.identifier.ccafsprojectpii as the field name
    • +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    January, 2017

    + +
    +

    2017-01-02

    + +
      +
    • I checked to see if the Solr sharding task that is supposed to run on January 1st had run and saw there was an error
    • +
    • I tested on DSpace Test as well and it doesn’t work there either
    • +
    • I asked on the dspace-tech mailing list because it seems to be broken, and actually now I’m not sure if we’ve ever had the sharding task run successfully over all these years
    • +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    December, 2016

    + +
    +

    2016-12-02

    + +
      +
    • CGSpace was down for five hours in the morning while I was sleeping
    • +
    • While looking in the logs for errors, I see tons of warnings about Atmire MQM:
    • +
    + +
    2016-12-02 03:00:32,352 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=CREATE, SubjectType=BUNDLE, SubjectID=70316, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632305, dispatcher=1544803905, detail=[null], transactionID="TX157907838689377964651674089851855413607")
    +2016-12-02 03:00:32,353 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=MODIFY_METADATA, SubjectType=BUNDLE, SubjectID=70316, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632309, dispatcher=1544803905, detail="dc.title", transactionID="TX157907838689377964651674089851855413607")
    +2016-12-02 03:00:32,353 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=ADD, SubjectType=ITEM, SubjectID=80044, ObjectType=BUNDLE, ObjectID=70316, TimeStamp=1480647632311, dispatcher=1544803905, detail="THUMBNAIL", transactionID="TX157907838689377964651674089851855413607")
    +2016-12-02 03:00:32,353 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=ADD, SubjectType=BUNDLE, SubjectID=70316, ObjectType=BITSTREAM, ObjectID=86715, TimeStamp=1480647632318, dispatcher=1544803905, detail="-1", transactionID="TX157907838689377964651674089851855413607")
    +2016-12-02 03:00:32,353 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=MODIFY, SubjectType=ITEM, SubjectID=80044, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632351, dispatcher=1544803905, detail=[null], transactionID="TX157907838689377964651674089851855413607")
    +
    + +
      +
    • I see thousands of them in the logs for the last few months, so it’s not related to the DSpace 5.5 upgrade
    • +
    • I’ve raised a ticket with Atmire to ask
    • +
    • Another worrying error from dspace.log is:
    • +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    November, 2016

    + +
    +

    2016-11-01

    + +
      +
    • Add dc.type to the output options for Atmire’s Listings and Reports module (#286)
    • +
    + +

    Listings and Reports with output type

    + +

    + Read more → +
    + + + + + + +
    +
    +

    October, 2016

    + +
    +

    2016-10-03

    + +
      +
    • Testing adding ORCIDs to a CSV file for a single item to see if the author orders get messed up
    • +
    • Need to test the following scenarios to see how author order is affected: + +
        +
      • ORCIDs only
      • +
      • ORCIDs plus normal authors
      • +
    • +
    • I exported a random item’s metadata as CSV, deleted all columns except id and collection, and made a new column called ORCID:dc.contributor.author with the following random ORCIDs from the ORCID registry:
    • +
    + +
    0000-0002-6115-0956||0000-0002-3812-8793||0000-0001-7462-405X
    +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    September, 2016

    + +
    +

    2016-09-01

    + +
      +
    • Discuss helping CCAFS with some batch tagging of ORCID IDs for their authors
    • +
    • Discuss how the migration of CGIAR’s Active Directory to a flat structure will break our LDAP groups in DSpace
    • +
    • We had been using DC=ILRI to determine whether a user was ILRI or not
    • +
    • It looks like we might be able to use OUs now, instead of DCs:
    • +
    + +
    $ ldapsearch -x -H ldaps://svcgroot2.cgiarad.org:3269/ -b "dc=cgiarad,dc=org" -D "admigration1@cgiarad.org" -W "(sAMAccountName=admigration1)"
    +
    + +

    + Read more → +
    + + + + + + + + + + +
    + + + + +
    +
    + + + + + + + + + diff --git a/docs/categories/page/3/index.html b/docs/categories/page/3/index.html new file mode 100644 index 000000000..66191a251 --- /dev/null +++ b/docs/categories/page/3/index.html @@ -0,0 +1,481 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + +
    +
    + +
    +
    + + + +
    +
    +

    CGSpace Notes

    +

    Documenting day-to-day work on the CGSpace repository.

    +
    +
    + + + +
    +
    +
    + + + + + + + + + +
    +
    +

    August, 2016

    + +
    +

    2016-08-01

    + +
      +
    • Add updated distribution license from Sisay (#259)
    • +
    • Play with upgrading Mirage 2 dependencies in bower.json because most are several versions out of date
    • +
    • Bootstrap is at 3.3.0 but upstream is at 3.3.7, and upgrading to anything beyond 3.3.1 breaks glyphicons and probably more
    • +
    • bower stuff is a dead end, waste of time, too many issues
    • +
    • Anything after Bootstrap 3.3.1 makes glyphicons disappear (HTTP 404 trying to access from incorrect path of fonts)
    • +
    • Start working on DSpace 5.1 → 5.5 port:
    • +
    + +
    $ git checkout -b 55new 5_x-prod
    +$ git reset --hard ilri/5_x-prod
    +$ git rebase -i dspace-5.5
    +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    July, 2016

    + +
    +

    2016-07-01

    + +
      +
    • Add dc.description.sponsorship to Discovery sidebar facets and make investors clickable in item view (#232)
    • +
    • I think this query should find and replace all authors that have “,” at the end of their names:
    • +
    + +
    dspacetest=# update metadatavalue set text_value = regexp_replace(text_value, '(^.+?),$', '\1') where metadata_field_id=3 and resource_type_id=2 and text_value ~ '^.+?,$';
    +UPDATE 95
    +dspacetest=# select text_value from  metadatavalue where metadata_field_id=3 and resource_type_id=2 and text_value ~ '^.+?,$';
    + text_value
    +------------
    +(0 rows)
    +
    + +
      +
    • In this case the select query was showing 95 results before the update
    • +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    June, 2016

    + +
    +

    2016-06-01

    + + + +

    + Read more → +
    + + + + + + +
    +
    +

    May, 2016

    + +
    +

    2016-05-01

    + +
      +
    • Since yesterday there have been 10,000 REST errors and the site has been unstable again
    • +
    • I have blocked access to the API now
    • +
    • There are 3,000 IPs accessing the REST API in a 24-hour period!
    • +
    + +
    # awk '{print $1}' /var/log/nginx/rest.log  | uniq | wc -l
    +3168
    +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    April, 2016

    + +
    +

    2016-04-04

    + +
      +
    • Looking at log file use on CGSpace, I noticed that we need to work on our cron setup a bit
    • +
    • We are backing up all logs in the log folder, including useless stuff like solr, cocoon, handle-plugin, etc
    • +
    • After running DSpace for over five years I’ve never needed to look in any other log file than dspace.log, let alone one from last year!
    • +
    • This will save us a few gigs of backup space we’re paying for on S3
    • +
    • Also, I noticed the checker log has some errors we should pay attention to:
    • +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    March, 2016

    + +
    +

    2016-03-02

    + +
      +
    • Looking at issues with author authorities on CGSpace
    • +
    • For some reason we still have the index-lucene-update cron job active on CGSpace, but I’m pretty sure we don’t need it as of the latest few versions of Atmire’s Listings and Reports module
    • +
    • Reinstall my local (Mac OS X) DSpace stack with Tomcat 7, PostgreSQL 9.3, and Java JDK 1.7 to match environment on CGSpace server
    • +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    February, 2016

    + +
    +

    2016-02-05

    + +
      +
    • Looking at some DAGRIS data for Abenet Yabowork
    • +
    • Lots of issues with spaces, newlines, etc causing the import to fail
    • +
    • I noticed we have a very interesting list of countries on CGSpace:
    • +
    + +

    CGSpace country list

    + +
      +
    • Not only are there 49,000 countries, we have some blanks (25)…
    • +
    • Also, lots of things like “COTE D`LVOIRE” and “COTE D IVOIRE”
    • +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    January, 2016

    + +
    +

    2016-01-13

    + +
      +
    • Move ILRI collection 10568/12503 from 10568/27869 to 10568/27629 using the move_collections.sh script I wrote last year.
    • +
    • I realized it is only necessary to clear the Cocoon cache after moving collections—rather than reindexing—as no metadata has changed, and therefore no search or browse indexes need to be updated.
    • +
    • Update GitHub wiki for documentation of maintenance tasks.
    • +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    December, 2015

    + +
    +

    2015-12-02

    + +
      +
    • Replace lzop with xz in log compression cron jobs on DSpace Test—it uses less space:
    • +
    + +
    # cd /home/dspacetest.cgiar.org/log
    +# ls -lh dspace.log.2015-11-18*
    +-rw-rw-r-- 1 tomcat7 tomcat7 2.0M Nov 18 23:59 dspace.log.2015-11-18
    +-rw-rw-r-- 1 tomcat7 tomcat7 387K Nov 18 23:59 dspace.log.2015-11-18.lzo
    +-rw-rw-r-- 1 tomcat7 tomcat7 169K Nov 18 23:59 dspace.log.2015-11-18.xz
    +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    November, 2015

    + +
    +

    2015-11-22

    + +
      +
    • CGSpace went down
    • +
    • Looks like DSpace exhausted its PostgreSQL connection pool
    • +
    • Last week I had increased the limit from 30 to 60, which seemed to help, but now there are many more idle connections:
    • +
    + +
    $ psql -c 'SELECT * from pg_stat_activity;' | grep idle | grep -c cgspace
    +78
    +
    + +
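Grepping the full `pg_stat_activity` output can also match the word "idle" inside query text, so a per-database count may be more reliable. The following is a minimal sketch, assuming PostgreSQL 9.2 or later (where `pg_stat_activity` has a `state` column) and a database named `cgspace`; the helper name `count_idle` is hypothetical.

```shell
# Count idle connections for one database.
# Expects "datname|state" rows on stdin, as produced by psql -At.
# Note: the /^idle/ match also counts "idle in transaction" states.
count_idle() {
    awk -F'|' -v db="$1" '$1 == db && $2 ~ /^idle/ { n++ } END { print n + 0 }'
}

# Assumed usage against a live server:
# psql -At -c 'SELECT datname, state FROM pg_stat_activity' | count_idle cgspace
```

Keeping the counting step separate from the `psql` call makes it possible to try it on saved output before pointing it at the production database.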

    + Read more → +
    + + + + + + + + + + +
    + + + + +
    +
    + + + + + + + + + diff --git a/docs/categories/page/4/index.html b/docs/categories/page/4/index.html new file mode 100644 index 000000000..6edb57e97 --- /dev/null +++ b/docs/categories/page/4/index.html @@ -0,0 +1,191 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + +
    +
    + +
    +
    + + + +
    +
    +

    CGSpace Notes

    +

    Documenting day-to-day work on the CGSpace repository.

    +
    +
    + + + +
    +
    +
    + + + + + + + + + + + + + + + + + + + + +
    + + + + +
    +
    + + + + + + + + + diff --git a/docs/cgiar-library-migration/index.html b/docs/cgiar-library-migration/index.html index 6de53c8c1..a989f1009 100644 --- a/docs/cgiar-library-migration/index.html +++ b/docs/cgiar-library-migration/index.html @@ -14,7 +14,7 @@ - + @@ -41,7 +41,7 @@ "url": "https://alanorth.github.io/cgspace-notes/cgiar-library-migration/", "wordCount": "1278", "datePublished": "2017-09-18T16:38:35+03:00", - "dateModified": "2017-09-28T12:00:49+03:00", + "dateModified": "2018-03-09T22:10:33+02:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -77,8 +77,6 @@
    diff --git a/docs/index.html b/docs/index.html index 7f639c1ce..73d57e48f 100644 --- a/docs/index.html +++ b/docs/index.html @@ -74,8 +74,6 @@
    diff --git a/docs/page/2/index.html b/docs/page/2/index.html index dbe19ecbc..37ce9032c 100644 --- a/docs/page/2/index.html +++ b/docs/page/2/index.html @@ -74,8 +74,6 @@
    diff --git a/docs/page/3/index.html b/docs/page/3/index.html index 61342c854..691d8d892 100644 --- a/docs/page/3/index.html +++ b/docs/page/3/index.html @@ -74,8 +74,6 @@
    @@ -408,9 +406,9 @@ dspacetest=# select text_value from metadatavalue where metadata_field_id=3 and diff --git a/docs/page/4/index.html b/docs/page/4/index.html new file mode 100644 index 000000000..b2eee5186 --- /dev/null +++ b/docs/page/4/index.html @@ -0,0 +1,191 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + +
    +
    + +
    +
    + + + +
    +
    +

    CGSpace Notes

    +

    Documenting day-to-day work on the CGSpace repository.

    +
    +
    + + + +
    +
    +
    + + + + + + + + + + + + + + + + + + + + +
    + + + + +
    +
    + + + + + + + + + diff --git a/docs/post/page/1/index.html b/docs/post/page/1/index.html deleted file mode 100644 index c4d1d9390..000000000 --- a/docs/post/page/1/index.html +++ /dev/null @@ -1 +0,0 @@ -https://alanorth.github.io/cgspace-notes/post/ \ No newline at end of file diff --git a/docs/post/index.html b/docs/posts/index.html similarity index 97% rename from docs/post/index.html rename to docs/posts/index.html index f2451dfd7..e7272f99e 100644 --- a/docs/post/index.html +++ b/docs/posts/index.html @@ -8,7 +8,7 @@ - + @@ -35,7 +35,7 @@ "@context": "http://schema.org", "@type": "Blog", "headline": "CGSpace Notes", - "url" : "https://alanorth.github.io/cgspace-notes/post/", + "url" : "https://alanorth.github.io/cgspace-notes/posts/", "author": { "@type": "Person", "name": "Alan Orth" @@ -47,7 +47,7 @@ - + CGSpace Notes @@ -56,7 +56,7 @@ - + @@ -74,8 +74,6 @@
    @@ -479,7 +477,7 @@ COPY 54701 Previous page - + diff --git a/docs/post/index.xml b/docs/posts/index.xml similarity index 99% rename from docs/post/index.xml rename to docs/posts/index.xml index 52555a50a..382e0341a 100644 --- a/docs/post/index.xml +++ b/docs/posts/index.xml @@ -2,13 +2,13 @@ Posts on CGSpace Notes - https://alanorth.github.io/cgspace-notes/post/ + https://alanorth.github.io/cgspace-notes/posts/ Recent content in Posts on CGSpace Notes Hugo -- gohugo.io en-us Fri, 02 Mar 2018 16:07:54 +0200 - + diff --git a/docs/posts/page/1/index.html b/docs/posts/page/1/index.html new file mode 100644 index 000000000..2f6a3af7c --- /dev/null +++ b/docs/posts/page/1/index.html @@ -0,0 +1 @@ +https://alanorth.github.io/cgspace-notes/posts/ \ No newline at end of file diff --git a/docs/posts/page/2/index.html b/docs/posts/page/2/index.html new file mode 100644 index 000000000..f83db2f3f --- /dev/null +++ b/docs/posts/page/2/index.html @@ -0,0 +1,484 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + +
    +
    + +
    +
    + + + +
    +
    +

    CGSpace Notes

    +

    Documenting day-to-day work on the CGSpace repository.

    +
    +
    + + + +
    +
    +
    + + + + + + + + + +
    +
    +

    June, 2017

    + +
    + 2017-06-01: After discussion with WLE and CGSpace content people, we decided to just add one metadata field for the WLE Research Themes. The cg.identifier.wletheme field will be used for both Phase I and Phase II Research Themes. Then we’ll create a new sub-community for Phase II and create collections for the research themes there. The current “Research Themes” community will be renamed to “WLE Phase I Research Themes”. Tagged all items in the current Phase I collections with their appropriate themes. Create pull request to add Phase II research themes to the submission form: #328. Add cg. + Read more → +
    + + + + + + +
    +
    +

    May, 2017

    + +
    + 2017-05-01: ICARDA apparently started working on CG Core on their MEL repository. They have done a few cg.* fields, but not very consistently, and even copy some of CGSpace’s items: https://mel.cgiar.org/xmlui/handle/20.500.11766/6911?show=full https://cgspace.cgiar.org/handle/10568/73683 2017-05-02: Atmire got back about the Workflow Statistics issue, and apparently it’s a bug in the CUA module, so they will send us a pull request. 2017-05-04: Sync DSpace Test with database and assetstore from CGSpace. Re-deploy DSpace Test with Atmire’s CUA patch for workflow statistics, run system updates, and restart the server. Now I can see the workflow statistics and am able to select users, but everything returns 0 items. Megan says there are still some mapped items that have not been appearing since last week, so I forced a full index-discovery -b. Need to remember to check if the collection has more items (currently 39 on CGSpace, but 118 on the freshly reindexed DSpace Test) tomorrow: https://cgspace. + Read more → +
    + + + + + + +
    +
    +

    April, 2017

    + +
    +

    2017-04-02

    + +
      +
    • Merge one change to CCAFS flagships that I had forgotten to remove last month (“MANAGING CLIMATE RISK”): https://github.com/ilri/DSpace/pull/317
    • +
    • Quick proof-of-concept hack to add dc.rights to the input form, including some inline instructions/hints:
    • +
    + +

    dc.rights in the submission form

    + +
      +
    • Remove redundant/duplicate text in the DSpace submission license
    • +
    • Testing the CMYK patch on a collection with 650 items:
    • +
    + +
    $ [dspace]/bin/dspace filter-media -f -i 10568/16498 -p "ImageMagick PDF Thumbnail" -v >& /tmp/filter-media-cmyk.txt
    +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    March, 2017

    + +
    +

    2017-03-01

    + +
      +
    • Run the 279 CIAT author corrections on CGSpace
    • +
    + +

    2017-03-02

    + +
      +
    • Skype with Michael and Peter, discussing moving the CGIAR Library to CGSpace
    • +
    • CGIAR people possibly open to moving content, redirecting library.cgiar.org to CGSpace and letting CGSpace resolve their handles
    • +
    • They might come in at the top level in one “CGIAR System” community, or with several communities
    • +
    • I need to spend a bit of time looking at the multiple handle support in DSpace and see if new content can be minted in both handles, or just one?
    • +
    • Need to send Peter and Michael some notes about this in a few days
    • +
    • Also, need to consider talking to Atmire about hiring them to bring ORCiD metadata to REST / OAI
    • +
    • Filed an issue on DSpace issue tracker for the filter-media bug that causes it to process JPGs even when limiting to the PDF thumbnail plugin: DS-3516
    • +
    • Discovered that the ImageMagick filter-media plugin creates JPG thumbnails with the CMYK colorspace when the source PDF is using CMYK
    • +
    • Interestingly, it seems DSpace 4.x’s thumbnails were sRGB, but forcing regeneration using DSpace 5.x’s ImageMagick plugin creates CMYK JPGs if the source PDF was CMYK (see 10568/51999):
    • +
    + +
    $ identify ~/Desktop/alc_contrastes_desafios.jpg
    +/Users/aorth/Desktop/alc_contrastes_desafios.jpg JPEG 464x600 464x600+0+0 8-bit CMYK 168KB 0.000u 0:00.000
    +
    + +
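To check many thumbnails at once rather than one file at a time, the colorspace can be printed per file and filtered. This is only a sketch: `list_cmyk` is a hypothetical helper, the `thumbnails/*.jpg` path is a placeholder, and it assumes ImageMagick's `identify` is installed and that paths contain no newlines.

```shell
# Print the paths of images whose colorspace is CMYK.
# Expects "<colorspace> <path>" lines on stdin.
list_cmyk() {
    awk '$1 == "CMYK" { sub(/^CMYK /, ""); print }'
}

# Assumed usage (placeholder path):
# identify -format '%[colorspace] %i\n' thumbnails/*.jpg | list_cmyk
```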

    + Read more → +
    + + + + + + +
    +
    +

    February, 2017

    + +
    +

    2017-02-07

    + +
      +
    • An item was mapped twice erroneously again, so I had to remove one of the mappings manually:
    • +
    + +
    dspace=# select * from collection2item where item_id = '80278';
    +  id   | collection_id | item_id
    +-------+---------------+---------
    + 92551 |           313 |   80278
    + 92550 |           313 |   80278
    + 90774 |          1051 |   80278
    +(3 rows)
    +dspace=# delete from collection2item where id = 92551 and item_id = 80278;
    +DELETE 1
    +
    + +
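Since this kind of double mapping has happened more than once, it may be worth scanning for all duplicates instead of fixing them item by item. A minimal sketch, assuming the `collection2item` columns shown in the session above; `find_duplicate_mappings` is a hypothetical helper name.

```shell
# Print "collection_id|item_id" pairs that occur more than once.
# Expects "id|collection_id|item_id" rows on stdin, as produced by psql -At.
find_duplicate_mappings() {
    awk -F'|' '{ seen[$2 "|" $3]++ }
               END { for (k in seen) if (seen[k] > 1) print k }' | sort
}

# Assumed usage:
# psql -At -d dspace -c 'SELECT id, collection_id, item_id FROM collection2item' | find_duplicate_mappings
```

The same check could be expressed directly in SQL with `GROUP BY collection_id, item_id HAVING count(*) > 1`; the shell version is shown only because it can be tried on exported rows without touching the database.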
      +
    • Create issue on GitHub to track the addition of CCAFS Phase II project tags (#301)
    • +
    • Looks like we’ll be using cg.identifier.ccafsprojectpii as the field name
    • +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    January, 2017

    + +
    +

    2017-01-02

    + +
      +
    • I checked to see if the Solr sharding task that is supposed to run on January 1st had run and saw there was an error
    • +
    • I tested on DSpace Test as well and it doesn’t work there either
    • +
    • I asked on the dspace-tech mailing list because it seems to be broken, and actually now I’m not sure if we’ve ever had the sharding task run successfully over all these years
    • +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    December, 2016

    + +
    +

    2016-12-02

    + +
      +
    • CGSpace was down for five hours in the morning while I was sleeping
    • +
    • While looking in the logs for errors, I see tons of warnings about Atmire MQM:
    • +
    + +
    2016-12-02 03:00:32,352 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=CREATE, SubjectType=BUNDLE, SubjectID=70316, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632305, dispatcher=1544803905, detail=[null], transactionID="TX157907838689377964651674089851855413607")
    +2016-12-02 03:00:32,353 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=MODIFY_METADATA, SubjectType=BUNDLE, SubjectID=70316, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632309, dispatcher=1544803905, detail="dc.title", transactionID="TX157907838689377964651674089851855413607")
    +2016-12-02 03:00:32,353 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=ADD, SubjectType=ITEM, SubjectID=80044, ObjectType=BUNDLE, ObjectID=70316, TimeStamp=1480647632311, dispatcher=1544803905, detail="THUMBNAIL", transactionID="TX157907838689377964651674089851855413607")
    +2016-12-02 03:00:32,353 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=ADD, SubjectType=BUNDLE, SubjectID=70316, ObjectType=BITSTREAM, ObjectID=86715, TimeStamp=1480647632318, dispatcher=1544803905, detail="-1", transactionID="TX157907838689377964651674089851855413607")
    +2016-12-02 03:00:32,353 WARN  com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=MODIFY, SubjectType=ITEM, SubjectID=80044, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632351, dispatcher=1544803905, detail=[null], transactionID="TX157907838689377964651674089851855413607")
    +
    + +
      +
    • I see thousands of them in the logs for the last few months, so it’s not related to the DSpace 5.5 upgrade
    • +
    • I’ve raised a ticket with Atmire to ask
    • +
    • Another worrying error from dspace.log is:
    • +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    November, 2016

    + +
    +

    2016-11-01

    + +
      +
    • Add dc.type to the output options for Atmire’s Listings and Reports module (#286)
    • +
    + +

    Listings and Reports with output type

    + +

    + Read more → +
    + + + + + + +
    +
    +

    October, 2016

    + +
    +

    2016-10-03

    + +
      +
    • Testing adding ORCIDs to a CSV file for a single item to see if the author orders get messed up
    • +
    • Need to test the following scenarios to see how author order is affected: + +
        +
      • ORCIDs only
      • +
      • ORCIDs plus normal authors
      • +
    • +
    • I exported a random item’s metadata as CSV, deleted all columns except id and collection, and made a new column called ORCID:dc.contributor.author with the following random ORCIDs from the ORCID registry:
    • +
    + +
    0000-0002-6115-0956||0000-0002-3812-8793||0000-0001-7462-405X
    +
    + +
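Before importing a `||`-separated list like the one above, it can help to confirm that each identifier is well formed. ORCID iDs end in an ISO 7064 MOD 11-2 check character, which the sketch below recomputes; `check_orcids` is a hypothetical helper, not part of DSpace.

```shell
# Check the ISO 7064 MOD 11-2 check character of each ORCID iD in a
# "||"-separated list read from stdin; prints "<id> ok" or "<id> bad".
check_orcids() {
    tr -s '|' '\n' | awk '{
        id = $0
        gsub(/-/, "", id)                      # drop the hyphens
        total = 0
        for (i = 1; i <= 15; i++)              # first 15 digits feed the checksum
            total = (total + substr(id, i, 1)) * 2
        check = (12 - total % 11) % 11
        if (check == 10) check = "X"
        verdict = (length(id) == 16 && substr(id, 16, 1) == check "") ? "ok" : "bad"
        print $0, verdict
    }'
}

# Assumed usage:
# echo '0000-0002-6115-0956||0000-0002-3812-8793||0000-0001-7462-405X' | check_orcids
```

All three identifiers listed above pass this check. It only catches typos; it does not confirm that an iD is registered to the intended author.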

    + Read more → +
    + + + + + + +
    +
    +

    September, 2016

    + +
    +

    2016-09-01

    + +
      +
    • Discuss helping CCAFS with some batch tagging of ORCID IDs for their authors
    • +
    • Discuss how the migration of CGIAR’s Active Directory to a flat structure will break our LDAP groups in DSpace
    • +
    • We had been using DC=ILRI to determine whether a user was ILRI or not
    • +
    • It looks like we might be able to use OUs now, instead of DCs:
    • +
    + +
    $ ldapsearch -x -H ldaps://svcgroot2.cgiarad.org:3269/ -b "dc=cgiarad,dc=org" -D "admigration1@cgiarad.org" -W "(sAMAccountName=admigration1)"
    +
    + +

    + Read more → +
    + + + + + + + + + + +
    + + + + +
    +
    + + + + + + + + + diff --git a/docs/posts/page/3/index.html b/docs/posts/page/3/index.html new file mode 100644 index 000000000..6cf4917ae --- /dev/null +++ b/docs/posts/page/3/index.html @@ -0,0 +1,481 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + +
    +
    + +
    +
    + + + +
    +
    +

    CGSpace Notes

    +

    Documenting day-to-day work on the CGSpace repository.

    +
    +
    + + + +
    +
    +
    + + + + + + + + + +
    +
    +

    August, 2016

    + +
    +

    2016-08-01

    + +
      +
    • Add updated distribution license from Sisay (#259)
    • +
    • Play with upgrading Mirage 2 dependencies in bower.json because most are several versions out of date
    • +
    • Bootstrap is at 3.3.0 but upstream is at 3.3.7, and upgrading to anything beyond 3.3.1 breaks glyphicons and probably more
    • +
    • bower stuff is a dead end, waste of time, too many issues
    • +
    • Anything after Bootstrap 3.3.1 makes glyphicons disappear (HTTP 404 trying to access from incorrect path of fonts)
    • +
    • Start working on DSpace 5.1 → 5.5 port:
    • +
    + +
    $ git checkout -b 55new 5_x-prod
    +$ git reset --hard ilri/5_x-prod
    +$ git rebase -i dspace-5.5
    +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    July, 2016

    + +
    +

    2016-07-01

    + +
      +
    • Add dc.description.sponsorship to Discovery sidebar facets and make investors clickable in item view (#232)
    • +
    • I think this query should find and replace all authors that have “,” at the end of their names:
    • +
    + +
    dspacetest=# update metadatavalue set text_value = regexp_replace(text_value, '(^.+?),$', '\1') where metadata_field_id=3 and resource_type_id=2 and text_value ~ '^.+?,$';
    +UPDATE 95
    +dspacetest=# select text_value from  metadatavalue where metadata_field_id=3 and resource_type_id=2 and text_value ~ '^.+?,$';
    + text_value
    +------------
    +(0 rows)
    +
    + +
      +
    • In this case the select query was showing 95 results before the update
    • +
    + +
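The trailing-comma pattern from the query above can be tried on an exported list of names before running the update. A minimal sketch, assuming one name per line on stdin; `strip_trailing_comma` is a hypothetical helper and the sample names in the usage line are made up.

```shell
# Remove a single trailing comma from each line, leaving other lines untouched.
# Mirrors the '^.+?,$' pattern: at least one character must precede the comma.
strip_trailing_comma() {
    sed -E 's/^(.+),$/\1/'
}

# Assumed usage with made-up names:
# printf 'Orth, A.,\nSmith, J.\n' | strip_trailing_comma
```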

    + Read more → +
    + + + + + + +
    +
    +

    June, 2016

    + +
    +

    2016-06-01

    + + + +

    + Read more → +
    + + + + + + +
    +
    +

    May, 2016

    + +
    +

    2016-05-01

    + +
      +
    • Since yesterday there have been 10,000 REST errors and the site has been unstable again
    • +
    • I have blocked access to the API now
    • +
    • There are 3,000 IPs accessing the REST API in a 24-hour period!
    • +
    + +
    # awk '{print $1}' /var/log/nginx/rest.log  | uniq | wc -l
    +3168
    +
    + +
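Note that `uniq` only collapses adjacent duplicate lines, so on a log that is not sorted by address the figure above may count the same IP more than once. A sketch of a count that does not depend on ordering, assuming the client address is the first field of each log line; `count_unique_ips` is a hypothetical helper.

```shell
# Count distinct client addresses (first field) in access-log lines on stdin.
count_unique_ips() {
    awk '!seen[$1]++ { n++ } END { print n + 0 }'
}

# Assumed usage:
# count_unique_ips < /var/log/nginx/rest.log
```

An equivalent pipeline would be `awk '{print $1}' | sort -u | wc -l`; the single `awk` pass just avoids sorting a large log.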

    + Read more → +
    + + + + + + +
    +
    +

    April, 2016

    + +
    +

    2016-04-04

    + +
      +
    • Looking at log file use on CGSpace, I noticed that we need to work on our cron setup a bit
    • +
    • We are backing up all logs in the log folder, including useless stuff like solr, cocoon, handle-plugin, etc
    • +
    • After running DSpace for over five years I’ve never needed to look in any other log file than dspace.log, let alone one from last year!
    • +
    • This will save us a few gigs of backup space we’re paying for on S3
    • +
    • Also, I noticed the checker log has some errors we should pay attention to:
    • +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    March, 2016

    + +
    +

    2016-03-02

    + +
      +
    • Looking at issues with author authorities on CGSpace
    • +
    • For some reason we still have the index-lucene-update cron job active on CGSpace, but I’m pretty sure we don’t need it as of the latest few versions of Atmire’s Listings and Reports module
    • +
    • Reinstall my local (Mac OS X) DSpace stack with Tomcat 7, PostgreSQL 9.3, and Java JDK 1.7 to match environment on CGSpace server
    • +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    February, 2016

    + +
    +

    2016-02-05

    + +
      +
    • Looking at some DAGRIS data for Abenet Yabowork
    • +
    • Lots of issues with spaces, newlines, etc causing the import to fail
    • +
    • I noticed we have a very interesting list of countries on CGSpace:
    • +
    + +

    CGSpace country list

    + +
      +
    • Not only are there 49,000 countries, we have some blanks (25)…
    • +
    • Also, lots of things like “COTE D`LVOIRE” and “COTE D IVOIRE”
    • +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    January, 2016

    + +
    +

    2016-01-13

    + +
      +
    • Move ILRI collection 10568/12503 from 10568/27869 to 10568/27629 using the move_collections.sh script I wrote last year.
    • +
    • I realized it is only necessary to clear the Cocoon cache after moving collections—rather than reindexing—as no metadata has changed, and therefore no search or browse indexes need to be updated.
    • +
    • Update GitHub wiki for documentation of maintenance tasks.
    • +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    December, 2015

    + +
    +

    2015-12-02

    + +
      +
    • Replace lzop with xz in log compression cron jobs on DSpace Test—it uses less space:
    • +
    + +
    # cd /home/dspacetest.cgiar.org/log
    +# ls -lh dspace.log.2015-11-18*
    +-rw-rw-r-- 1 tomcat7 tomcat7 2.0M Nov 18 23:59 dspace.log.2015-11-18
    +-rw-rw-r-- 1 tomcat7 tomcat7 387K Nov 18 23:59 dspace.log.2015-11-18.lzo
    +-rw-rw-r-- 1 tomcat7 tomcat7 169K Nov 18 23:59 dspace.log.2015-11-18.xz
    +
    + +
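The cron job itself is not shown in these notes, so the following is only a guess at its shape: a sketch that compresses rotated `dspace.log.*` files with `xz`. The function name and the `LOG_COMPRESSOR` override are hypothetical and exist only to make the sketch easy to try with another tool.

```shell
# Compress rotated DSpace logs last modified more than N days ago
# (subject to find's -mtime rounding). Already-compressed files are skipped.
# LOG_COMPRESSOR is a hypothetical override; the default is xz.
compress_old_logs() {
    log_dir=$1
    min_age_days=$2
    find "$log_dir" -type f -name 'dspace.log.*' \
        ! -name '*.xz' ! -name '*.lzo' \
        -mtime +"$min_age_days" \
        -exec "${LOG_COMPRESSOR:-xz}" {} +
}

# Assumed usage from cron:
# compress_old_logs /home/dspacetest.cgiar.org/log 1
```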

    + Read more → +
    + + + + + + +
    +
    +

    November, 2015

    + +
    +

    2015-11-22

    + +
      +
    • CGSpace went down
    • +
    • Looks like DSpace exhausted its PostgreSQL connection pool
    • +
    • Last week I had increased the limit from 30 to 60, which seemed to help, but now there are many more idle connections:
    • +
    + +
    $ psql -c 'SELECT * from pg_stat_activity;' | grep idle | grep -c cgspace
    +78
    +
    + +

    + Read more → +
    + + + + + + + + + + +
    + + + + +
    +
    + + + + + + + + + diff --git a/docs/posts/page/4/index.html b/docs/posts/page/4/index.html new file mode 100644 index 000000000..164dbd742 --- /dev/null +++ b/docs/posts/page/4/index.html @@ -0,0 +1,191 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + CGSpace Notes + + + + + + + + + + + + + + + + + + + + +
    +
    + +
    +
    + + + +
    +
    +

    CGSpace Notes

    +

    Documenting day-to-day work on the CGSpace repository.

    +
    +
    + + + +
    +
    +
    + + + + + + + + + + + + + + + + + + + + +
    + + + + +
    +
    + + + + + + + + + diff --git a/docs/robots.txt b/docs/robots.txt index 9d63922ff..118e7de64 100644 --- a/docs/robots.txt +++ b/docs/robots.txt @@ -35,5 +35,5 @@ Disallow: /cgspace-notes/ Disallow: /cgspace-notes/categories/ Disallow: /cgspace-notes/tags/notes/ Disallow: /cgspace-notes/categories/notes/ -Disallow: /cgspace-notes/post/ +Disallow: /cgspace-notes/posts/ Disallow: /cgspace-notes/tags/ diff --git a/docs/sitemap.xml b/docs/sitemap.xml index 7ab32ef56..3e290afd7 100644 --- a/docs/sitemap.xml +++ b/docs/sitemap.xml @@ -4,157 +4,157 @@ https://alanorth.github.io/cgspace-notes/2018-03/ - 2018-03-08T21:29:37+02:00 + 2018-03-09T22:10:33+02:00 https://alanorth.github.io/cgspace-notes/2018-02/ - 2018-02-28T17:30:16+02:00 + 2018-03-09T22:10:33+02:00 https://alanorth.github.io/cgspace-notes/2018-01/ - 2018-01-31T16:17:39+02:00 + 2018-03-09T22:10:33+02:00 https://alanorth.github.io/cgspace-notes/2017-12/ - 2017-12-31T10:42:16-08:00 + 2018-03-09T22:10:33+02:00 https://alanorth.github.io/cgspace-notes/2017-11/ - 2018-01-12T06:07:03+02:00 + 2018-03-09T22:10:33+02:00 https://alanorth.github.io/cgspace-notes/2017-10/ - 2017-11-02T16:13:10+02:00 + 2018-03-09T22:10:33+02:00 https://alanorth.github.io/cgspace-notes/cgiar-library-migration/ - 2017-09-28T12:00:49+03:00 + 2018-03-09T22:10:33+02:00 https://alanorth.github.io/cgspace-notes/2017-09/ - 2017-09-28T07:56:11+03:00 + 2018-03-09T22:10:33+02:00 https://alanorth.github.io/cgspace-notes/2017-08/ - 2017-09-10T19:18:52+03:00 + 2018-03-09T22:10:33+02:00 https://alanorth.github.io/cgspace-notes/2017-07/ - 2017-08-01T08:55:37+03:00 + 2018-03-09T22:10:33+02:00 https://alanorth.github.io/cgspace-notes/2017-06/ - 2017-06-30T18:34:51+03:00 + 2018-03-09T22:10:33+02:00 https://alanorth.github.io/cgspace-notes/2017-05/ - 2017-09-10T17:46:54+03:00 + 2018-03-09T22:10:33+02:00 https://alanorth.github.io/cgspace-notes/2017-04/ - 2017-04-26T13:35:10+03:00 + 2018-03-09T22:10:33+02:00 https://alanorth.github.io/cgspace-notes/2017-03/ - 
2017-03-31T05:36:10+03:00 + 2018-03-09T22:10:33+02:00 https://alanorth.github.io/cgspace-notes/2017-02/ - 2017-02-28T22:58:29+02:00 + 2018-03-09T22:10:33+02:00 https://alanorth.github.io/cgspace-notes/2017-01/ - 2017-01-29T13:18:32+02:00 + 2018-03-09T22:10:33+02:00 https://alanorth.github.io/cgspace-notes/2016-12/ - 2017-09-19T16:07:20+03:00 + 2018-03-09T22:10:33+02:00 https://alanorth.github.io/cgspace-notes/2016-11/ - 2017-01-10T16:21:47+02:00 + 2018-03-09T22:10:33+02:00 https://alanorth.github.io/cgspace-notes/2016-10/ - 2017-01-10T16:21:47+02:00 + 2018-03-09T22:10:33+02:00 https://alanorth.github.io/cgspace-notes/2016-09/ - 2017-01-09T16:18:07+02:00 + 2018-03-09T22:10:33+02:00 https://alanorth.github.io/cgspace-notes/2016-08/ - 2017-01-09T16:18:07+02:00 + 2018-03-09T22:10:33+02:00 https://alanorth.github.io/cgspace-notes/2016-07/ - 2017-01-09T16:18:07+02:00 + 2018-03-09T22:10:33+02:00 https://alanorth.github.io/cgspace-notes/2016-06/ - 2017-01-09T16:18:07+02:00 + 2018-03-09T22:10:33+02:00 https://alanorth.github.io/cgspace-notes/2016-05/ - 2017-01-09T16:18:07+02:00 + 2018-03-09T22:10:33+02:00 https://alanorth.github.io/cgspace-notes/2016-04/ - 2016-09-28T17:02:30+03:00 + 2018-03-09T22:10:33+02:00 https://alanorth.github.io/cgspace-notes/2016-03/ - 2017-01-09T16:18:07+02:00 + 2018-03-09T22:10:33+02:00 https://alanorth.github.io/cgspace-notes/2016-02/ - 2017-01-09T16:18:07+02:00 + 2018-03-09T22:10:33+02:00 https://alanorth.github.io/cgspace-notes/2016-01/ - 2017-01-09T16:18:07+02:00 + 2018-03-09T22:10:33+02:00 https://alanorth.github.io/cgspace-notes/2015-12/ - 2017-01-09T16:18:07+02:00 + 2018-03-09T22:10:33+02:00 https://alanorth.github.io/cgspace-notes/2015-11/ - 2016-09-28T17:02:30+03:00 + 2018-03-09T22:10:33+02:00 https://alanorth.github.io/cgspace-notes/ - 2018-03-08T21:29:37+02:00 + 2018-03-09T22:10:33+02:00 0 @@ -165,25 +165,25 @@ https://alanorth.github.io/cgspace-notes/tags/notes/ - 2018-03-08T21:29:37+02:00 + 2018-03-09T22:10:33+02:00 0 
https://alanorth.github.io/cgspace-notes/categories/notes/ - 2017-09-28T12:00:49+03:00 + 2018-03-09T22:10:33+02:00 0 - https://alanorth.github.io/cgspace-notes/post/ - 2018-03-08T21:29:37+02:00 + https://alanorth.github.io/cgspace-notes/posts/ + 2018-03-09T22:10:33+02:00 0 https://alanorth.github.io/cgspace-notes/tags/ - 2018-03-08T21:29:37+02:00 + 2018-03-09T22:10:33+02:00 0 diff --git a/docs/tags/index.html b/docs/tags/index.html index 9261691c8..978fa172d 100644 --- a/docs/tags/index.html +++ b/docs/tags/index.html @@ -74,8 +74,6 @@
    @@ -103,6 +101,387 @@ +
    +
    +

    March, 2018

    + +
    +

    2018-03-02

    + +
      +
    • Export a CSV of the IITA community metadata for Martin Mueller
    • +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    February, 2018

    + +
    +

    2018-02-01

    + +
      +
    • Peter gave feedback on the dc.rights proof of concept that I had sent him last week
    • +
    • We don’t need to distinguish between internal and external works, so that makes it just a simple list
    • +
    • Yesterday I figured out how to monitor DSpace sessions using JMX
    • +
    • I copied the logic in the jmx_tomcat_dbpools provided by Ubuntu’s munin-plugins-java package and used the stuff I discovered about JMX in 2018-01
    • +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    January, 2018

    + +
    +

    2018-01-02

    + +
      +
    • Uptime Robot noticed that CGSpace went down and up a few times last night, for a few minutes each time
    • +
    • I didn’t get any load alerts from Linode and the REST and XMLUI logs don’t show anything out of the ordinary
    • +
    • The nginx logs show HTTP 200s until 02/Jan/2018:11:27:17 +0000 when Uptime Robot got an HTTP 500
    • +
    • In dspace.log around that time I see many errors like “Client closed the connection before file download was complete”
    • +
    • And just before that I see this:
    • +
    + +
    Caused by: org.apache.tomcat.jdbc.pool.PoolExhaustedException: [http-bio-127.0.0.1-8443-exec-980] Timeout: Pool empty. Unable to fetch a connection in 5 seconds, none available[size:50; busy:50; idle:0; lastwait:5000].
    +
    + +
      +
    • Ah hah! So the pool was actually empty!
    • +
    • I need to increase that, let’s try to bump it up from 50 to 75
    • +
    • After that one client got an HTTP 499 but then the rest were HTTP 200, so I don’t know what the hell Uptime Robot saw
    • +
    • I notice this error quite a few times in dspace.log:
    • +
    + +
    2018-01-02 01:21:19,137 ERROR org.dspace.app.xmlui.aspect.discovery.SidebarFacetsTransformer @ Error while searching for sidebar facets
    +org.dspace.discovery.SearchServiceException: org.apache.solr.search.SyntaxError: Cannot parse 'dateIssued_keyword:[1976+TO+1979]': Encountered " "]" "] "" at line 1, column 32.
    +
    + +
      +
    • And there are many of these errors every day for the past month:
    • +
    + +
    $ grep -c "Error while searching for sidebar facets" dspace.log.*
    +dspace.log.2017-11-21:4
    +dspace.log.2017-11-22:1
    +dspace.log.2017-11-23:4
    +dspace.log.2017-11-24:11
    +dspace.log.2017-11-25:0
    +dspace.log.2017-11-26:1
    +dspace.log.2017-11-27:7
    +dspace.log.2017-11-28:21
    +dspace.log.2017-11-29:31
    +dspace.log.2017-11-30:15
    +dspace.log.2017-12-01:15
    +dspace.log.2017-12-02:20
    +dspace.log.2017-12-03:38
    +dspace.log.2017-12-04:65
    +dspace.log.2017-12-05:43
    +dspace.log.2017-12-06:72
    +dspace.log.2017-12-07:27
    +dspace.log.2017-12-08:15
    +dspace.log.2017-12-09:29
    +dspace.log.2017-12-10:35
    +dspace.log.2017-12-11:20
    +dspace.log.2017-12-12:44
    +dspace.log.2017-12-13:36
    +dspace.log.2017-12-14:59
    +dspace.log.2017-12-15:104
    +dspace.log.2017-12-16:53
    +dspace.log.2017-12-17:66
    +dspace.log.2017-12-18:83
    +dspace.log.2017-12-19:101
    +dspace.log.2017-12-20:74
    +dspace.log.2017-12-21:55
    +dspace.log.2017-12-22:66
    +dspace.log.2017-12-23:50
    +dspace.log.2017-12-24:85
    +dspace.log.2017-12-25:62
    +dspace.log.2017-12-26:49
    +dspace.log.2017-12-27:30
    +dspace.log.2017-12-28:54
    +dspace.log.2017-12-29:68
    +dspace.log.2017-12-30:89
    +dspace.log.2017-12-31:53
    +dspace.log.2018-01-01:45
    +dspace.log.2018-01-02:34
    +
    + +
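With this many days in the listing, sorting by count makes the worst ones easier to spot. A small sketch that works on the `filename:count` lines printed by `grep -c`; `busiest_days` is a hypothetical helper name.

```shell
# Print the N entries with the highest counts, busiest first (default N = 5).
# Expects "filename:count" lines on stdin, as printed by grep -c over several files.
busiest_days() {
    sort -t: -k2,2nr | head -n "${1:-5}"
}

# Assumed usage:
# grep -c "Error while searching for sidebar facets" dspace.log.* | busiest_days 3
```

On the counts listed above, the top three would be 2017-12-15 (104), 2017-12-19 (101), and 2017-12-30 (89).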
      +
    • Danny wrote to ask for help renewing the wildcard ilri.org certificate and I advised that we should probably use Let’s Encrypt if it’s just a handful of domains
    • +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    December, 2017

    + +
    +

    2017-12-01

    + +
      +
    • Uptime Robot noticed that CGSpace went down
    • +
    • The logs say “Timeout waiting for idle object”
    • +
    • PostgreSQL activity says there are 115 connections currently
    • +
    • The list of connections to XMLUI and REST API for today:
    • +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    November, 2017

    + +
    +

    2017-11-01

    + +
      +
    • The CORE developers responded to say they are looking into their bot not respecting our robots.txt
    • +
    + +

    2017-11-02

    + +
      +
    • Today there have been no hits by CORE and no alerts from Linode (coincidence?)
    • +
    + +
    # grep -c "CORE" /var/log/nginx/access.log
    +0
    +
    + +
      +
    • Generate list of authors on CGSpace for Peter to go through and correct:
    • +
    + +
    dspace=# \copy (select distinct text_value, count(*) as count from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 group by text_value order by count desc) to /tmp/authors.csv with csv;
    +COPY 54701
    +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    October, 2017

    + +
    +

    2017-10-01

    + + + +
    http://hdl.handle.net/10568/78495||http://hdl.handle.net/10568/79336
    +
    + +
      +
    • There appears to be a pattern but I’ll have to look a bit closer and try to clean them up automatically, either in SQL or in OpenRefine
    • +
    • Add Katherine Lutz to the groups for content submission and edit steps of the CGIAR System collections
    • +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    CGIAR Library Migration

    + +
    +

    Rough notes for importing the CGIAR Library content. It was decided that this content would go to a new top-level community called CGIAR System Organization.

    + +

    + Read more → +
    + + + + + + +
    +
    +

    September, 2017

    + +
    +

    2017-09-06

    + +
      +
    • Linode sent an alert that CGSpace (linode18) was using 261% CPU for the past two hours
    • +
    + +

    2017-09-07

    + +
      +
    • Ask Sisay to clean up the WLE approvers a bit, as Marianne’s user account is both in the approvers step as well as the group
    • +
    + +

    + Read more → +
    + + + + + + +
    +
    +

    August, 2017

    + +
    +

    2017-08-01

    + +
      +
    • Linode sent an alert that CGSpace (linode18) was using 350% CPU for the past two hours
    • +
    • I looked in the Activity pane of the Admin Control Panel and it seems that Google, Baidu, Yahoo, and Bing are all crawling with massive numbers of bots concurrently (~100 total, mostly Baidu and Google)
    • +
    • The good thing is that, according to dspace.log.2017-08-01, they are all using the same Tomcat session
    • +
    • This means our Tomcat Crawler Session Valve is working
    • +
    • But many of the bots are browsing dynamic URLs like: + +
        +
      • /handle/10568/3353/discover
      • +
      • /handle/10568/16510/browse
      • +
    • +
    • The robots.txt only blocks the top-level /discover and /browse URLs… we will need to find a way to forbid them from accessing these!
    • +
    • Relevant issue from DSpace Jira (semi-resolved in DSpace 6.0): https://jira.duraspace.org/browse/DS-2962
    • +
    • It turns out that we’re already adding the X-Robots-Tag "none" HTTP header, but this only forbids the search engine from indexing the page, not crawling it!
    • +
    • Also, the bot has to successfully browse the page first so it can receive the HTTP header…
    • +
    • We might actually have to block these requests with HTTP 403 depending on the user agent
    • +
    • Abenet pointed out that the CGIAR Library Historical Archive collection I sent on July 20th only had ~100 entries, instead of 2415
    • +
    • This was due to newline characters in the dc.description.abstract column, which caused OpenRefine to choke when exporting the CSV
    • +
    • I exported a new CSV from the collection on DSpace Test and then manually removed the characters in vim using g/^$/d
    • +
    • Then I cleaned up the author authorities and HTML characters in OpenRefine and sent the file back to Abenet
    • +
    + +
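The idea of answering bots with HTTP 403 on the nested dynamic URLs can be sketched as a small decision function. This is only an illustration of the matching logic, not the configuration used on CGSpace: in practice the check would more likely live in the web server, and the list of user agent substrings below is an assumed, incomplete example.

```python
import re

# Assumption: a hypothetical, incomplete list of crawler user agent substrings.
BOT_PATTERN = re.compile(r"googlebot|baiduspider|bingbot|slurp", re.IGNORECASE)

# Dynamic pages nested under a handle, e.g. /handle/10568/3353/discover,
# which a robots.txt rule for the top-level /discover and /browse does not cover.
DYNAMIC_PATTERN = re.compile(r"^/handle/\d+/\d+/(discover|browse)(/|\?|$)")


def status_for(path: str, user_agent: str) -> int:
    """Return 403 for known bots on dynamic handle URLs, otherwise 200."""
    if DYNAMIC_PATTERN.search(path) and BOT_PATTERN.search(user_agent):
        return 403
    return 200
```

Unlike the X-Robots-Tag header, this refuses the request before the page is rendered, so the bot never triggers the expensive Discovery or browse query. Item pages themselves stay reachable to crawlers.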

    + Read more → +
    + + + + + + +
    +
    +

    July, 2017

    + +
    +

    2017-07-01

    + +
      +
    • Run system updates and reboot DSpace Test
    • +
    + +

    2017-07-04

    + +
      +
    • Merge changes for WLE Phase II theme rename (#329)
    • +
    • Looking at extracting the metadata registries from ICARDA’s MEL DSpace database so we can compare fields with CGSpace
    • +
    • We can use PostgreSQL’s extended output format (-x) plus sed to format the output into quasi XML:
    • +
    + +
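The actual psql and sed command is not shown in this excerpt. As a rough stand-in, the same transformation can be sketched in Python: take the expanded (`-x`) output of psql and wrap each record and field in tags. The function name, the `record` tag, and the sample rows are assumptions made for illustration; only the `element` and `qualifier` column names come from the metadata field registry queries in these notes.

```python
import re
from xml.sax.saxutils import escape

# psql -x starts each row with a header such as "-[ RECORD 1 ]----+-----"
RECORD_HEADER = re.compile(r"^-\[ RECORD \d+ \]")

# Hypothetical sample of expanded output, for illustration only.
SAMPLE = """-[ RECORD 1 ]------+------------
element            | contributor
qualifier          | author
-[ RECORD 2 ]------+------------
element            | title
qualifier          |
"""


def expanded_to_xml(text: str, record_tag: str = "record") -> str:
    """Convert psql expanded (-x) output into a simple XML-like string."""
    out = []
    in_record = False
    for line in text.splitlines():
        if RECORD_HEADER.match(line):
            # Close the previous record before opening the next one.
            if in_record:
                out.append(f"</{record_tag}>")
            out.append(f"<{record_tag}>")
            in_record = True
        elif "|" in line and in_record:
            # Each field line looks like "column_name | value".
            name, value = line.split("|", 1)
            name = name.strip()
            out.append(f"  <{name}>{escape(value.strip())}</{name}>")
    if in_record:
        out.append(f"</{record_tag}>")
    return "\n".join(out)


if __name__ == "__main__":
    print(expanded_to_xml(SAMPLE))
```

The result is "quasi" XML in the same sense as the note: there is no root element or declaration, and values containing newlines would need extra handling.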

    + Read more → +
    + + + + + + + +
    diff --git a/docs/tags/notes/index.html b/docs/tags/notes/index.html
    index 982d49795..44995573f 100644
    --- a/docs/tags/notes/index.html
    +++ b/docs/tags/notes/index.html
    @@ -74,8 +74,6 @@
    @@ -351,6 +349,26 @@ COPY 54701 +
    +
    +

    CGIAR Library Migration

    + +
    +

    Rough notes for importing the CGIAR Library content. It was decided that this content would go to a new top-level community called CGIAR System Organization.

    + +

    + Read more → +
    + + + + + +

    September, 2017

    @@ -454,24 +472,6 @@ COPY 54701 - -
    -
    -

    June, 2017

    - -
    - 2017-06-01 After discussion with WLE and CGSpace content people, we decided to just add one metadata field for the WLE Research Themes The cg.identifier.wletheme field will be used for both Phase I and Phase II Research Themes Then we’ll create a new sub-community for Phase II and create collections for the research themes there The current “Research Themes” community will be renamed to “WLE Phase I Research Themes” Tagged all items in the current Phase I collections with their appropriate themes Create pull request to add Phase II research themes to the submission form: #328 Add cg. - Read more → -
    - - - - -