CGSpace CG Core v2 Migration

Possible changes to CGSpace metadata fields to align more with DC, QDC, and DCTERMS as well as CG Core v2. With reference to the CG Core v2 draft standard by Marie-Angélique as well as DCMI DCTERMS.

From 20539a394a5efd2de537fc5b9902e2cc3b34faaa Mon Sep 17 00:00:00 2001
From: Alan Orth
+
diff --git a/docs/2015-12/index.html b/docs/2015-12/index.html
index 58517f714..3687720e4 100644
--- a/docs/2015-12/index.html
+++ b/docs/2015-12/index.html
@@ -296,6 +296,8 @@ $ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle
+
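The curl invocation in the hunk context above times a full request via curl's `%{time_total}` write-out variable; a minimal offline sketch of the same trick (the `file://` URL and probe file are invented so it runs without network access):

```shell
# -s silences the progress meter, -o /dev/null discards the body, and
# -w prints curl's %{time_total} write-out variable after the transfer.
printf 'hello\n' > /tmp/probe.txt
curl -o /dev/null -s -w '%{time_total}\n' file:///tmp/probe.txt
```

Against the real REST API you would substitute the `https://cgspace.cgiar.org/rest/handle/…` URL; the printed number is the total transfer time in seconds.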
diff --git a/docs/2016-01/index.html b/docs/2016-01/index.html
index ea6650395..752ec90f6 100644
--- a/docs/2016-01/index.html
+++ b/docs/2016-01/index.html
@@ -212,6 +212,8 @@ $ find SimpleArchiveForBio/ -iname “*.pdf” -exec basename {} \; | so
+
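The `find … -exec basename` pipeline in the hunk context above lists bare PDF filenames regardless of extension case; a sketch on a made-up `SimpleArchiveForBio` tree:

```shell
# Build a throwaway SAF-style tree with mixed-case PDF extensions
mkdir -p /tmp/SimpleArchiveForBio/item_1 /tmp/SimpleArchiveForBio/item_2
touch /tmp/SimpleArchiveForBio/item_1/beans.PDF \
      /tmp/SimpleArchiveForBio/item_2/agronomy.pdf

# -iname matches case-insensitively; basename strips the directory part
find /tmp/SimpleArchiveForBio/ -iname "*.pdf" -exec basename {} \; | sort
```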
diff --git a/docs/2016-02/index.html b/docs/2016-02/index.html
index b3120155a..cfe8291fb 100644
--- a/docs/2016-02/index.html
+++ b/docs/2016-02/index.html
@@ -457,6 +457,8 @@ Bitstream: tést señora alimentación.pdf
+
diff --git a/docs/2016-03/index.html b/docs/2016-03/index.html
index f92390013..6eb870352 100644
--- a/docs/2016-03/index.html
+++ b/docs/2016-03/index.html
@@ -363,6 +363,8 @@ metadata_value_id | resource_id | metadata_field_id | text_value | text_lang | p
+
diff --git a/docs/2016-04/index.html b/docs/2016-04/index.html
index 3ffa1a0a9..64e0b9ea2 100644
--- a/docs/2016-04/index.html
+++ b/docs/2016-04/index.html
@@ -579,6 +579,8 @@ dspace.log.2016-04-27:7271
+
diff --git a/docs/2016-05/index.html b/docs/2016-05/index.html
index d414ca635..9d0a9dbb7 100644
--- a/docs/2016-05/index.html
+++ b/docs/2016-05/index.html
@@ -437,6 +437,8 @@ sys 0m20.540s
+
diff --git a/docs/2016-06/index.html b/docs/2016-06/index.html
index 70dff3cba..3efb28160 100644
--- a/docs/2016-06/index.html
+++ b/docs/2016-06/index.html
@@ -472,6 +472,8 @@ $ ./delete-metadata-values.py -f dc.contributor.corporate -i Corporate-Authors-D
+
diff --git a/docs/2016-07/index.html b/docs/2016-07/index.html
index 13868e547..5980bf046 100644
--- a/docs/2016-07/index.html
+++ b/docs/2016-07/index.html
@@ -383,6 +383,8 @@ discovery.index.authority.ignore-variants=true
+
diff --git a/docs/2016-08/index.html b/docs/2016-08/index.html
index 981e5e26f..f2c80b249 100644
--- a/docs/2016-08/index.html
+++ b/docs/2016-08/index.html
@@ -462,6 +462,8 @@ $ JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx512m" /home/cgspace.cgiar.org/b
+
diff --git a/docs/2016-09/index.html b/docs/2016-09/index.html
index 630de8664..e1e610ae3 100644
--- a/docs/2016-09/index.html
+++ b/docs/2016-09/index.html
@@ -736,6 +736,8 @@ $ ./delete-metadata-values.py -i ilrisubjects-delete-13.csv -f cg.subject.ilri -
+
diff --git a/docs/2016-10/index.html b/docs/2016-10/index.html
index a297f0b10..cceeeee55 100644
--- a/docs/2016-10/index.html
+++ b/docs/2016-10/index.html
@@ -441,6 +441,8 @@ dspace=# update metadatavalue set text_value = regexp_replace(text_value, 'http:
+
diff --git a/docs/2016-11/index.html b/docs/2016-11/index.html
index 1ced59b34..45c4e037b 100644
--- a/docs/2016-11/index.html
+++ b/docs/2016-11/index.html
@@ -642,6 +642,8 @@ org.dspace.discovery.SearchServiceException: Error executing query
+
diff --git a/docs/2016-12/index.html b/docs/2016-12/index.html
index c5b4019cd..b2c259f4e 100644
--- a/docs/2016-12/index.html
+++ b/docs/2016-12/index.html
@@ -909,6 +909,8 @@ $ exit
+
diff --git a/docs/2017-01/index.html b/docs/2017-01/index.html
index 1ef9c6e22..42234f7d4 100644
--- a/docs/2017-01/index.html
+++ b/docs/2017-01/index.html
@@ -434,6 +434,8 @@ $ gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook -dNOPAUSE -
+
diff --git a/docs/2017-02/index.html b/docs/2017-02/index.html
index 62ab0915b..1c12face8 100644
--- a/docs/2017-02/index.html
+++ b/docs/2017-02/index.html
@@ -499,6 +499,8 @@ COPY 1968
+
diff --git a/docs/2017-03/index.html b/docs/2017-03/index.html
index db752e630..bc0220b54 100644
--- a/docs/2017-03/index.html
+++ b/docs/2017-03/index.html
@@ -436,6 +436,8 @@ $ ./delete-metadata-values.py -i Investors-Delete-121.csv -f dc.description.spon
+
diff --git a/docs/2017-04/index.html b/docs/2017-04/index.html
index 9b3c09b2c..302ff58df 100644
--- a/docs/2017-04/index.html
+++ b/docs/2017-04/index.html
@@ -704,6 +704,8 @@ $ gem install compass -v 1.0.3
+
diff --git a/docs/2017-05/index.html b/docs/2017-05/index.html
index 49bd57e61..f293d94ba 100644
--- a/docs/2017-05/index.html
+++ b/docs/2017-05/index.html
@@ -464,6 +464,8 @@ UPDATE 187
+
diff --git a/docs/2017-06/index.html b/docs/2017-06/index.html
index 44efd74d2..9d807f35c 100644
--- a/docs/2017-06/index.html
+++ b/docs/2017-06/index.html
@@ -297,6 +297,8 @@ text_value
+
diff --git a/docs/2017-07/index.html b/docs/2017-07/index.html
index 918d8edb6..a1ac77e64 100644
--- a/docs/2017-07/index.html
+++ b/docs/2017-07/index.html
@@ -317,6 +317,8 @@ delete from metadatavalue where resource_type_id=2 and metadata_field_id=235 and
+
diff --git a/docs/2017-08/index.html b/docs/2017-08/index.html
index d987326b8..88eeea989 100644
--- a/docs/2017-08/index.html
+++ b/docs/2017-08/index.html
@@ -613,6 +613,8 @@ org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error
+
diff --git a/docs/2017-09/index.html b/docs/2017-09/index.html
index f5214db1e..65bdf09ca 100644
--- a/docs/2017-09/index.html
+++ b/docs/2017-09/index.html
@@ -783,6 +783,8 @@ Cert Status: good
+
diff --git a/docs/2017-10/index.html b/docs/2017-10/index.html
index 038dfc929..b68d9c572 100644
--- a/docs/2017-10/index.html
+++ b/docs/2017-10/index.html
@@ -21,7 +21,7 @@ Add Katherine Lutz to the groups for content submission and edit steps of the CG
-
+
@@ -49,7 +49,7 @@ Add Katherine Lutz to the groups for content submission and edit steps of the CG
"url": "https:\/\/alanorth.github.io\/cgspace-notes\/2017-10\/",
"wordCount": "2613",
"datePublished": "2017-10-01T08:07:54+03:00",
- "dateModified": "2018-03-09T22:10:33+02:00",
+ "dateModified": "2019-10-28T13:39:25+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -113,8 +113,8 @@ Add Katherine Lutz to the groups for content submission and edit steps of the CG
October, 2017
+
diff --git a/docs/2017-11/index.html b/docs/2017-11/index.html
index 59b0aba72..44497c6ea 100644
--- a/docs/2017-11/index.html
+++ b/docs/2017-11/index.html
@@ -30,7 +30,7 @@ COPY 54701
-
+
@@ -67,7 +67,7 @@ COPY 54701
"url": "https:\/\/alanorth.github.io\/cgspace-notes\/2017-11\/",
"wordCount": "5428",
"datePublished": "2017-11-02T09:37:54+02:00",
- "dateModified": "2018-04-10T08:27:55+03:00",
+ "dateModified": "2019-10-28T13:39:25+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -131,8 +131,8 @@ COPY 54701
November, 2017
+
diff --git a/docs/2017-12/index.html b/docs/2017-12/index.html
index ef930c789..c0f9a678b 100644
--- a/docs/2017-12/index.html
+++ b/docs/2017-12/index.html
@@ -17,7 +17,7 @@ The list of connections to XMLUI and REST API for today:
-
+
@@ -41,7 +41,7 @@ The list of connections to XMLUI and REST API for today:
"url": "https:\/\/alanorth.github.io\/cgspace-notes\/2017-12\/",
"wordCount": "4088",
"datePublished": "2017-12-01T13:53:54+03:00",
- "dateModified": "2018-03-09T22:10:33+02:00",
+ "dateModified": "2019-10-28T13:39:25+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -105,8 +105,8 @@ The list of connections to XMLUI and REST API for today:
December, 2017
+
diff --git a/docs/2018-01/index.html b/docs/2018-01/index.html
index 167d7ade7..93ea87129 100644
--- a/docs/2018-01/index.html
+++ b/docs/2018-01/index.html
@@ -84,7 +84,7 @@ Danny wrote to ask for help renewing the wildcard ilri.org certificate and I adv
-
+
@@ -175,7 +175,7 @@ Danny wrote to ask for help renewing the wildcard ilri.org certificate and I adv
"url": "https:\/\/alanorth.github.io\/cgspace-notes\/2018-01\/",
"wordCount": "7940",
"datePublished": "2018-01-02T08:35:54-08:00",
- "dateModified": "2018-03-28T09:48:08+03:00",
+ "dateModified": "2019-10-28T13:39:25+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -239,8 +239,8 @@ Danny wrote to ask for help renewing the wildcard ilri.org certificate and I adv
January, 2018
+
diff --git a/docs/2018-02/index.html b/docs/2018-02/index.html
index 74bb3f00f..8c3d7418f 100644
--- a/docs/2018-02/index.html
+++ b/docs/2018-02/index.html
@@ -17,7 +17,7 @@ I copied the logic in the jmx_tomcat_dbpools provided by Ubuntu’s munin-pl
-
+
@@ -41,7 +41,7 @@ I copied the logic in the jmx_tomcat_dbpools provided by Ubuntu’s munin-pl
"url": "https:\/\/alanorth.github.io\/cgspace-notes\/2018-02\/",
"wordCount": "6410",
"datePublished": "2018-02-01T16:28:54+02:00",
- "dateModified": "2018-08-19T18:42:55+03:00",
+ "dateModified": "2019-10-28T13:39:25+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -105,8 +105,8 @@ I copied the logic in the jmx_tomcat_dbpools provided by Ubuntu’s munin-pl
February, 2018
+
diff --git a/docs/2018-03/index.html b/docs/2018-03/index.html
index 8640d4d12..989fc592d 100644
--- a/docs/2018-03/index.html
+++ b/docs/2018-03/index.html
@@ -14,7 +14,7 @@ Export a CSV of the IITA community metadata for Martin Mueller
-
+
@@ -35,7 +35,7 @@ Export a CSV of the IITA community metadata for Martin Mueller
"url": "https:\/\/alanorth.github.io\/cgspace-notes\/2018-03\/",
"wordCount": "2960",
"datePublished": "2018-03-02T16:07:54+02:00",
- "dateModified": "2019-04-26T12:13:02+03:00",
+ "dateModified": "2019-10-28T13:39:25+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -99,8 +99,8 @@ Export a CSV of the IITA community metadata for Martin Mueller
March, 2018
+
diff --git a/docs/2018-04/index.html b/docs/2018-04/index.html
index ce0e0feff..9733823db 100644
--- a/docs/2018-04/index.html
+++ b/docs/2018-04/index.html
@@ -15,7 +15,7 @@ Catalina logs at least show some memory errors yesterday:
-
+
@@ -37,7 +37,7 @@ Catalina logs at least show some memory errors yesterday:
"url": "https:\/\/alanorth.github.io\/cgspace-notes\/2018-04\/",
"wordCount": "3016",
"datePublished": "2018-04-01T16:13:54+02:00",
- "dateModified": "2018-04-30T18:45:30+03:00",
+ "dateModified": "2019-10-28T13:39:25+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -101,8 +101,8 @@ Catalina logs at least show some memory errors yesterday:
April, 2018
+
diff --git a/docs/2018-05/index.html b/docs/2018-05/index.html
index ada9f258f..b854f18b3 100644
--- a/docs/2018-05/index.html
+++ b/docs/2018-05/index.html
@@ -21,7 +21,7 @@ Also, I switched it to use OpenJDK instead of Oracle Java, as well as re-worked
-
+
@@ -49,7 +49,7 @@ Also, I switched it to use OpenJDK instead of Oracle Java, as well as re-worked
"url": "https:\/\/alanorth.github.io\/cgspace-notes\/2018-05\/",
"wordCount": "3503",
"datePublished": "2018-05-01T16:43:54+03:00",
- "dateModified": "2018-09-04T16:15:26+03:00",
+ "dateModified": "2019-10-28T13:39:25+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -113,8 +113,8 @@ Also, I switched it to use OpenJDK instead of Oracle Java, as well as re-worked
May, 2018
+
diff --git a/docs/2018-06/index.html b/docs/2018-06/index.html
index 03adf9ba4..f141c0775 100644
--- a/docs/2018-06/index.html
+++ b/docs/2018-06/index.html
@@ -35,7 +35,7 @@ sys 2m7.289s
-
+
@@ -77,7 +77,7 @@ sys 2m7.289s
"url": "https:\/\/alanorth.github.io\/cgspace-notes\/2018-06\/",
"wordCount": "2894",
"datePublished": "2018-06-04T19:49:54-07:00",
- "dateModified": "2018-06-28T13:37:35+03:00",
+ "dateModified": "2019-10-28T13:39:25+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -141,8 +141,8 @@ sys 2m7.289s
June, 2018
+
diff --git a/docs/2018-07/index.html b/docs/2018-07/index.html
index 7a564cfc3..8e8c7ca56 100644
--- a/docs/2018-07/index.html
+++ b/docs/2018-07/index.html
@@ -22,7 +22,7 @@ There is insufficient memory for the Java Runtime Environment to continue.
-
+
@@ -51,7 +51,7 @@ There is insufficient memory for the Java Runtime Environment to continue.
"url": "https:\/\/alanorth.github.io\/cgspace-notes\/2018-07\/",
"wordCount": "3376",
"datePublished": "2018-07-01T12:56:54+03:00",
- "dateModified": "2018-07-28T12:06:56+03:00",
+ "dateModified": "2019-10-28T13:39:25+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -115,8 +115,8 @@ There is insufficient memory for the Java Runtime Environment to continue.
July, 2018
+
diff --git a/docs/2018-08/index.html b/docs/2018-08/index.html
index 021e4bd81..32701a894 100644
--- a/docs/2018-08/index.html
+++ b/docs/2018-08/index.html
@@ -31,7 +31,7 @@ I ran all system updates on DSpace Test and rebooted it
-
+
@@ -69,7 +69,7 @@ I ran all system updates on DSpace Test and rebooted it
"url": "https:\/\/alanorth.github.io\/cgspace-notes\/2018-08\/",
"wordCount": "2748",
"datePublished": "2018-08-01T11:52:54+03:00",
- "dateModified": "2018-09-10T23:35:46+03:00",
+ "dateModified": "2019-10-28T13:39:25+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -133,8 +133,8 @@ I ran all system updates on DSpace Test and rebooted it
August, 2018
+
diff --git a/docs/2018-09/index.html b/docs/2018-09/index.html
index 7f0e52459..2839914d8 100644
--- a/docs/2018-09/index.html
+++ b/docs/2018-09/index.html
@@ -17,7 +17,7 @@ I’m testing the new DSpace 5.8 branch in my Ubuntu 18.04 environment and I
-
+
@@ -41,7 +41,7 @@ I’m testing the new DSpace 5.8 branch in my Ubuntu 18.04 environment and I
"url": "https:\/\/alanorth.github.io\/cgspace-notes\/2018-09\/",
"wordCount": "5245",
"datePublished": "2018-09-02T09:55:54+03:00",
- "dateModified": "2018-09-30T08:23:48+03:00",
+ "dateModified": "2019-10-28T13:39:25+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -105,8 +105,8 @@ I’m testing the new DSpace 5.8 branch in my Ubuntu 18.04 environment and I
September, 2018
+
diff --git a/docs/2018-10/index.html b/docs/2018-10/index.html
index b1c4c5447..57d2ba6b9 100644
--- a/docs/2018-10/index.html
+++ b/docs/2018-10/index.html
@@ -15,7 +15,7 @@ I created a GitHub issue to track this #389, because I’m super busy in Nai
-
+
@@ -37,7 +37,7 @@ I created a GitHub issue to track this #389, because I’m super busy in Nai
"url": "https:\/\/alanorth.github.io\/cgspace-notes\/2018-10\/",
"wordCount": "4519",
"datePublished": "2018-10-01T22:31:54+03:00",
- "dateModified": "2018-11-01T16:42:20+02:00",
+ "dateModified": "2019-10-28T13:39:25+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -101,8 +101,8 @@ I created a GitHub issue to track this #389, because I’m super busy in Nai
October, 2018
+
diff --git a/docs/2018-11/index.html b/docs/2018-11/index.html
index 711becc73..fb77ea2d1 100644
--- a/docs/2018-11/index.html
+++ b/docs/2018-11/index.html
@@ -22,7 +22,7 @@ Today these are the top 10 IPs:
-
+
@@ -51,7 +51,7 @@ Today these are the top 10 IPs:
"url": "https:\/\/alanorth.github.io\/cgspace-notes\/2018-11\/",
"wordCount": "2823",
"datePublished": "2018-11-01T16:41:30+02:00",
- "dateModified": "2019-02-07T10:52:23+02:00",
+ "dateModified": "2019-10-28T13:39:25+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -115,8 +115,8 @@ Today these are the top 10 IPs:
November, 2018
+
diff --git a/docs/2018-12/index.html b/docs/2018-12/index.html
index d468eae0a..64f8d20a7 100644
--- a/docs/2018-12/index.html
+++ b/docs/2018-12/index.html
@@ -22,7 +22,7 @@ I noticed that there is another issue with PDF thumbnails on CGSpace, and I see
-
+
@@ -51,7 +51,7 @@ I noticed that there is another issue with PDF thumbnails on CGSpace, and I see
"url": "https:\/\/alanorth.github.io\/cgspace-notes\/2018-12\/",
"wordCount": "3096",
"datePublished": "2018-12-02T02:09:30+02:00",
- "dateModified": "2018-12-30T18:12:18+02:00",
+ "dateModified": "2019-10-28T13:39:25+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -115,8 +115,8 @@ I noticed that there is another issue with PDF thumbnails on CGSpace, and I see
December, 2018
+
diff --git a/docs/2019-01/index.html b/docs/2019-01/index.html
index 117fc387b..bbdae2740 100644
--- a/docs/2019-01/index.html
+++ b/docs/2019-01/index.html
@@ -29,7 +29,7 @@ I don’t see anything interesting in the web server logs around that time t
-
+
@@ -65,7 +65,7 @@ I don’t see anything interesting in the web server logs around that time t
"url": "https:\/\/alanorth.github.io\/cgspace-notes\/2019-01\/",
"wordCount": "5532",
"datePublished": "2019-01-02T09:48:30+02:00",
- "dateModified": "2019-02-01T21:45:50+02:00",
+ "dateModified": "2019-10-28T13:39:25+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -129,8 +129,8 @@ I don’t see anything interesting in the web server logs around that time t
January, 2019
+
diff --git a/docs/2019-02/index.html b/docs/2019-02/index.html
index 5f98cea82..f706a7679 100644
--- a/docs/2019-02/index.html
+++ b/docs/2019-02/index.html
@@ -43,7 +43,7 @@ sys 0m1.979s
-
+
@@ -93,7 +93,7 @@ sys 0m1.979s
"url": "https:\/\/alanorth.github.io\/cgspace-notes\/2019-02\/",
"wordCount": "7700",
"datePublished": "2019-02-01T21:37:30+02:00",
- "dateModified": "2019-03-18T15:25:49+02:00",
+ "dateModified": "2019-10-28T13:39:25+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -157,8 +157,8 @@ sys 0m1.979s
February, 2019
+
diff --git a/docs/2019-03/index.html b/docs/2019-03/index.html
index 11bcbc747..91e522d89 100644
--- a/docs/2019-03/index.html
+++ b/docs/2019-03/index.html
@@ -25,7 +25,7 @@ I think I will need to ask Udana to re-copy and paste the abstracts with more ca
-
+
@@ -57,7 +57,7 @@ I think I will need to ask Udana to re-copy and paste the abstracts with more ca
"url": "https:\/\/alanorth.github.io\/cgspace-notes\/2019-03\/",
"wordCount": "7105",
"datePublished": "2019-03-01T12:16:30+01:00",
- "dateModified": "2019-07-12T14:05:21+03:00",
+ "dateModified": "2019-10-28T13:39:25+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -121,8 +121,8 @@ I think I will need to ask Udana to re-copy and paste the abstracts with more ca
March, 2019
+
diff --git a/docs/2019-04/index.html b/docs/2019-04/index.html
index 20698b121..b652ad62c 100644
--- a/docs/2019-04/index.html
+++ b/docs/2019-04/index.html
@@ -38,7 +38,7 @@ $ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-1-region.csv -db dspace
-
+
@@ -83,7 +83,7 @@ $ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-1-region.csv -db dspace
"url": "https:\/\/alanorth.github.io\/cgspace-notes\/2019-04\/",
"wordCount": "6799",
"datePublished": "2019-04-01T09:00:43+03:00",
- "dateModified": "2019-07-04T19:37:10+03:00",
+ "dateModified": "2019-10-28T13:39:25+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -147,8 +147,8 @@ $ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-1-region.csv -db dspace
April, 2019
+
diff --git a/docs/2019-05/index.html b/docs/2019-05/index.html
index a552b7195..3bbf83aa1 100644
--- a/docs/2019-05/index.html
+++ b/docs/2019-05/index.html
@@ -28,7 +28,7 @@ But after this I tried to delete the item from the XMLUI and it is still present
-
+
@@ -63,7 +63,7 @@ But after this I tried to delete the item from the XMLUI and it is still present
"url": "https:\/\/alanorth.github.io\/cgspace-notes\/2019-05\/",
"wordCount": "3215",
"datePublished": "2019-05-01T07:37:43+03:00",
- "dateModified": "2019-07-08T11:18:51+03:00",
+ "dateModified": "2019-10-28T13:39:25+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -127,8 +127,8 @@ But after this I tried to delete the item from the XMLUI and it is still present
May, 2019
+
diff --git a/docs/2019-06/index.html b/docs/2019-06/index.html
index df592b38c..e31f9dafd 100644
--- a/docs/2019-06/index.html
+++ b/docs/2019-06/index.html
@@ -21,7 +21,7 @@ Skype with Marie-Angélique and Abenet about CG Core v2
-
+
@@ -49,7 +49,7 @@ Skype with Marie-Angélique and Abenet about CG Core v2
"url": "https:\/\/alanorth.github.io\/cgspace-notes\/2019-06\/",
"wordCount": "1057",
"datePublished": "2019-06-02T10:57:51+03:00",
- "dateModified": "2019-07-01T12:14:35+03:00",
+ "dateModified": "2019-10-28T13:39:25+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -113,8 +113,8 @@ Skype with Marie-Angélique and Abenet about CG Core v2
June, 2019
+
diff --git a/docs/2019-07/index.html b/docs/2019-07/index.html
index 44d2e9eb8..45ff142c9 100644
--- a/docs/2019-07/index.html
+++ b/docs/2019-07/index.html
@@ -21,7 +21,7 @@ Abenet had another similar issue a few days ago when trying to find the stats fo
-
+
@@ -49,7 +49,7 @@ Abenet had another similar issue a few days ago when trying to find the stats fo
"url": "https:\/\/alanorth.github.io\/cgspace-notes\/2019-07\/",
"wordCount": "2330",
"datePublished": "2019-07-01T12:13:51+03:00",
- "dateModified": "2019-07-30T20:15:21+03:00",
+ "dateModified": "2019-10-28T13:39:25+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -113,8 +113,8 @@ Abenet had another similar issue a few days ago when trying to find the stats fo
July, 2019
+
diff --git a/docs/2019-08/index.html b/docs/2019-08/index.html
index 4fa3c1e6f..dc5cd68db 100644
--- a/docs/2019-08/index.html
+++ b/docs/2019-08/index.html
@@ -27,7 +27,7 @@ Run system updates on DSpace Test (linode19) and reboot it
-
+
@@ -61,7 +61,7 @@ Run system updates on DSpace Test (linode19) and reboot it
"url": "https:\/\/alanorth.github.io\/cgspace-notes\/2019-08\/",
"wordCount": "2703",
"datePublished": "2019-08-03T12:39:51+03:00",
- "dateModified": "2019-09-27T17:53:18+03:00",
+ "dateModified": "2019-10-28T13:39:25+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -125,8 +125,8 @@ Run system updates on DSpace Test (linode19) and reboot it
August, 2019
+
diff --git a/docs/2019-09/index.html b/docs/2019-09/index.html
index c913fda36..41b90b6b4 100644
--- a/docs/2019-09/index.html
+++ b/docs/2019-09/index.html
@@ -40,7 +40,7 @@ Here are the top ten IPs in the nginx XMLUI and REST/OAI logs this morning:
-
+
@@ -87,7 +87,7 @@ Here are the top ten IPs in the nginx XMLUI and REST/OAI logs this morning:
"url": "https:\/\/alanorth.github.io\/cgspace-notes\/2019-09\/",
"wordCount": "2870",
"datePublished": "2019-09-01T10:17:51+03:00",
- "dateModified": "2019-09-27T17:53:18+03:00",
+ "dateModified": "2019-10-28T13:39:25+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -151,8 +151,8 @@ Here are the top ten IPs in the nginx XMLUI and REST/OAI logs this morning:
September, 2019
+
diff --git a/docs/2019-10/index.html b/docs/2019-10/index.html
index 812b5a036..1309ea574 100644
--- a/docs/2019-10/index.html
+++ b/docs/2019-10/index.html
@@ -11,7 +11,7 @@
-
+
@@ -29,7 +29,7 @@
"url": "https:\/\/alanorth.github.io\/cgspace-notes\/2019-10\/",
"wordCount": "1535",
"datePublished": "2019-10-01T13:20:51+03:00",
- "dateModified": "2019-10-21T22:09:15+03:00",
+ "dateModified": "2019-10-28T13:41:08+02:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -93,8 +93,8 @@
October, 2019
+
diff --git a/docs/404.html b/docs/404.html
index 25559957d..804c86255 100644
--- a/docs/404.html
+++ b/docs/404.html
@@ -89,6 +89,8 @@
+
diff --git a/docs/categories/index.html b/docs/categories/index.html
index 763ad3f42..a4506800e 100644
--- a/docs/categories/index.html
+++ b/docs/categories/index.html
@@ -10,7 +10,7 @@
-
+
@@ -29,8 +29,8 @@
"@type": "Person",
"name": "Alan Orth"
},
- "dateModified": "2017-09-18T16:38:35+03:00",
- "keywords": "notes,notes,",
+ "dateModified": "2019-10-28T13:27:35+02:00",
+ "keywords": "notes,migration,notes,",
"description": "Documenting day-to-day work on the [CGSpace](https:\/\/cgspace.cgiar.org) repository."
}
@@ -91,12 +91,34 @@
+CGSpace CG Core v2 Migration
+
+ October, 2019
September, 2019
August, 2019
July, 2019
June, 2019
May, 2019
April, 2019
March, 2019
February, 2019
January, 2019
-
- 2019-01-02
-
-
-
- Read more →
-# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "02/Jan/2019:0(1|2|3)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
- 92 40.77.167.4
- 99 210.7.29.100
-120 38.126.157.45
-177 35.237.175.180
-177 40.77.167.32
-216 66.249.75.219
-225 18.203.76.93
-261 46.101.86.248
-357 207.46.13.1
-903 54.70.40.11
-
Rough notes for importing the CGIAR Library content. It was decided that this content would go to a new top-level community called CGIAR System Organization.
@@ -415,24 +439,6 @@ COPY 54701
-bower.json because most are several versions out of date
-Start working on DSpace 5.1 → 5.5 port:
- -$ git checkout -b 55new 5_x-prod
-$ git reset --hard ilri/5_x-prod
-$ git rebase -i dspace-5.5
-
bower.json because most are several versions out of date
Start working on DSpace 5.1 → 5.5 port:
+ +$ git checkout -b 55new 5_x-prod
+$ git reset --hard ilri/5_x-prod
+$ git rebase -i dspace-5.5
+
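The three commands above create a fresh working branch from the production branch and replay its local commits on top of the upstream release. A self-contained sketch of the same flow on a scratch repository (branch names mirror the notes, commit contents are invented; the real port used `rebase -i` to resolve conflicts interactively):

```shell
set -e
cd "$(mktemp -d)"
git init -q repo && cd repo
git config user.email you@example.com
git config user.name 'You'

git checkout -qb base                # explicit name, so we don't rely on the default branch
echo 5.1 > pom.xml && git add pom.xml && git commit -qm 'base (5.1)'

git checkout -qb dspace-5.5          # stands in for the upstream 5.5 release branch/tag
echo 5.5 > pom.xml && git commit -qam 'upstream 5.5 release'

git checkout -qb 5_x-prod base       # production branch: local customizations on the old base
echo ilri > themes.txt && git add themes.txt && git commit -qm 'ILRI customizations'

git checkout -qb 55new 5_x-prod      # new port branch from prod...
git rebase -q dspace-5.5             # ...replayed on top of the 5.5 release

git log --format=%s                  # customizations now sit atop the 5.5 history
```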
Rough notes for importing the CGIAR Library content. It was decided that this content would go to a new top-level community called CGIAR System Organization.
@@ -327,6 +329,8 @@ dspace=# select setval('handle_seq',86873);Possible changes to CGSpace metadata fields to align more with DC, QDC, and DCTERMS as well as CG Core v2.
+ +With reference to CG Core v2 draft standard by Marie-Angélique as well as DCMI DCTERMS.
+ Read more → +I don’t see anything interesting in the web server logs around that time though:
- -# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "02/Jan/2019:0(1|2|3)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
- 92 40.77.167.4
- 99 210.7.29.100
-120 38.126.157.45
-177 35.237.175.180
-177 40.77.167.32
-216 66.249.75.219
-225 18.203.76.93
-261 46.101.86.248
-357 207.46.13.1
-903 54.70.40.11
-
I don’t see anything interesting in the web server logs around that time though:
+ +# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "02/Jan/2019:0(1|2|3)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+ 92 40.77.167.4
+ 99 210.7.29.100
+120 38.126.157.45
+177 35.237.175.180
+177 40.77.167.32
+216 66.249.75.219
+225 18.203.76.93
+261 46.101.86.248
+357 207.46.13.1
+903 54.70.40.11
+
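The pipeline above is the standard top-talkers idiom: field 1 of the combined log format is the client IP, so restricting to an hour window and counting unique values gives hits per IP. A runnable sketch on a canned log (the IPs and timestamps are invented):

```shell
# A tiny fake nginx access log; field 1 is the client IP
cat > /tmp/access.log <<'EOF'
203.0.113.5 - - [02/Jan/2019:01:00:00 +0200] "GET / HTTP/1.1" 200 1024
203.0.113.5 - - [02/Jan/2019:02:15:00 +0200] "GET /rest/items HTTP/1.1" 200 512
198.51.100.7 - - [02/Jan/2019:03:30:00 +0200] "GET /browse HTTP/1.1" 200 2048
203.0.113.5 - - [02/Jan/2019:09:00:00 +0200] "GET / HTTP/1.1" 200 1024
EOF

# zcat --force passes plain files through, so one pipeline handles both
# rotated (.gz) and current logs; grep keeps only the 01:00-03:59 window
zcat --force /tmp/access.log \
    | grep -E "02/Jan/2019:0(1|2|3)" \
    | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
```

The 09:00 request falls outside the window, so 203.0.113.5 counts twice, not three times.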
dspace.log.2017-08-01, they are all using the same Tomcat session
Here are the top ten IPs in the nginx XMLUI and REST/OAI logs this morning:
+robots.txt only blocks the top-level /discover and /browse URLs… we will need to find a way to forbid them from accessing these!
X-Robots-Tag "none" HTTP header, but this only forbids the search engine from indexing the page, not crawling it!
dc.description.abstract column, which caused OpenRefine to choke when exporting the CSV
g/^$/d
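The g/^$/d above is vim's global command for deleting every empty line; the same cleanup can be scripted with sed (sample input is made up):

```shell
# A small file with blank lines scattered through it
printf 'title\n\nabstract\n\n\nnotes\n' > /tmp/messy.txt

# /^$/ matches empty lines and d deletes them -- vim's :g/^$/d, non-interactively
sed '/^$/d' /tmp/messy.txt
```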
# zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E "01/Sep/2019:0" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
-440 17.58.101.255
-441 157.55.39.101
-485 207.46.13.43
-728 169.60.128.125
-730 207.46.13.108
-758 157.55.39.9
-808 66.160.140.179
-814 207.46.13.212
-2472 163.172.71.23
-6092 3.94.211.189
-# zcat --force /var/log/nginx/rest.log /var/log/nginx/rest.log.1 /var/log/nginx/oai.log /var/log/nginx/oai.log.1 | grep -E "01/Sep/2019:0" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
- 33 2a01:7e00::f03c:91ff:fe16:fcb
- 57 3.83.192.124
- 57 3.87.77.25
- 57 54.82.1.8
-822 2a01:9cc0:47:1:1a:4:0:2
-1223 45.5.184.72
-1633 172.104.229.92
-5112 205.186.128.185
-7249 2a01:7e00::f03c:91ff:fe18:7396
-9124 45.5.186.2
+
+
+
+
+
+
+
+ July, 2017
+
+
+ 2017-07-01
+
+
+- Run system updates and reboot DSpace Test
+
+
+2017-07-04
+
+
+- Merge changes for WLE Phase II theme rename (#329)
+- Looking at extracting the metadata registries from ICARDA’s MEL DSpace database so we can compare fields with CGSpace
+- We can use PostgreSQL’s extended output format (-x) plus sed to format the output into quasi XML:
+
+ Read more →
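psql's expanded mode (-x) prints one column | value pair per line, which is what makes a sed rewrite into quasi XML possible. A sketch on canned output, since no live MEL database is assumed here (the field names are illustrative):

```shell
# What psql -x style expanded output looks like, captured to a file
cat > /tmp/expanded.txt <<'EOF'
metadata_field_id | 8
element           | contributor
qualifier         | author
EOF

# Wrap each "name | value" pair in <name>value</name> tags
sed -E 's/^([a-z_]+)[[:space:]]*\|[[:space:]]*(.*)$/<\1>\2<\/\1>/' /tmp/expanded.txt
```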
+
+
+
+
+
+
+
+
+
+ June, 2017
+
+
+ 2017-06-01 After discussion with WLE and CGSpace content people, we decided to just add one metadata field for the WLE Research Themes The cg.identifier.wletheme field will be used for both Phase I and Phase II Research Themes Then we’ll create a new sub-community for Phase II and create collections for the research themes there The current “Research Themes” community will be renamed to “WLE Phase I Research Themes” Tagged all items in the current Phase I collections with their appropriate themes Create pull request to add Phase II research themes to the submission form: #328 Add cg.
+ Read more →
+
+
+
+
+
+
+
+
+
+ May, 2017
+
+
  2017-05-01 ICARDA apparently started working on CG Core on their MEL repository They have done a few cg.* fields, but not very consistently, and even copy some of CGSpace’s items: https://mel.cgiar.org/xmlui/handle/20.500.11766/6911?show=full https://cgspace.cgiar.org/handle/10568/73683 2017-05-02 Atmire got back about the Workflow Statistics issue, and apparently it’s a bug in the CUA module so they will send us a pull request 2017-05-04 Sync DSpace Test with database and assetstore from CGSpace Re-deploy DSpace Test with Atmire’s CUA patch for workflow statistics, run system updates, and restart the server Now I can see the workflow statistics and am able to select users, but everything returns 0 items Megan says there are still some mapped items that are not appearing since last week, so I forced a full index-discovery -b Need to remember to check if the collection has more items (currently 39 on CGSpace, but 118 on the freshly reindexed DSpace Test) tomorrow: https://cgspace.
+ Read more →
+
+
+
+
+
+
+
+
+
+ April, 2017
+
+
+ 2017-04-02
+
+
+- Merge one change to CCAFS flagships that I had forgotten to remove last month (“MANAGING CLIMATE RISK”): https://github.com/ilri/DSpace/pull/317
+- Quick proof-of-concept hack to add dc.rights to the input form, including some inline instructions/hints:
+
+
+
+
+
+- Remove redundant/duplicate text in the DSpace submission license
+
+Testing the CMYK patch on a collection with 650 items:
+
+$ [dspace]/bin/dspace filter-media -f -i 10568/16498 -p "ImageMagick PDF Thumbnail" -v >& /tmp/filter-media-cmyk.txt
- Read more →
+ Read more →
@@ -145,32 +246,38 @@
- August, 2019
- March, 2017
+
- 2019-08-03
+ 2017-03-01
-- Look at Bioversity’s latest migration CSV and now I see that Francesco has cleaned up the extra columns and the newline at the end of the file, but many of the column headers have an extra space in the name…
+- Run the 279 CIAT author corrections on CGSpace
-2019-08-04
+2017-03-02
-- Deploy ORCID identifier updates requested by Bioversity to CGSpace
-- Run system updates on CGSpace (linode18) and reboot it
+
- Skype with Michael and Peter, discussing moving the CGIAR Library to CGSpace
+- CGIAR people possibly open to moving content, redirecting library.cgiar.org to CGSpace and letting CGSpace resolve their handles
+- They might come in at the top level in one “CGIAR System” community, or with several communities
+- I need to spend a bit of time looking at the multiple handle support in DSpace and see if new content can be minted in both handles, or just one?
+- Need to send Peter and Michael some notes about this in a few days
+- Also, need to consider talking to Atmire about hiring them to bring ORCiD metadata to REST / OAI
+- Filed an issue on DSpace issue tracker for the filter-media bug that causes it to process JPGs even when limiting to the PDF thumbnail plugin: DS-3516
+- Discovered that the ImageMagick filter-media plugin creates JPG thumbnails with the CMYK colorspace when the source PDF is using CMYK
-
-- Before updating it I checked Solr and verified that all statistics cores were loaded properly…
-- After rebooting, all statistics cores were loaded… wow, that’s lucky.
-
-- Run system updates on DSpace Test (linode19) and reboot it
+Interestingly, it seems DSpace 4.x’s thumbnails were sRGB, but forcing regeneration using DSpace 5.x’s ImageMagick plugin creates CMYK JPGs if the source PDF was CMYK (see 10568/51999):
+
+$ identify ~/Desktop/alc_contrastes_desafios.jpg
+/Users/aorth/Desktop/alc_contrastes_desafios.jpg JPEG 464x600 464x600+0+0 8-bit CMYK 168KB 0.000u 0:00.000
+
- Read more →
+ Read more →
@@ -180,91 +287,34 @@
- July, 2019
- February, 2017
+
- 2019-07-01
+ 2017-02-07
-- Create an “AfricaRice books and book chapters” collection on CGSpace for AfricaRice
-- Last month Sisay asked why the following “most popular” statistics link for a range of months in 2018 works for the CIAT community on DSpace Test, but not on CGSpace:
+
An item was mapped twice erroneously again, so I had to remove one of the mappings manually:
-
-- DSpace Test
-- CGSpace
-
-- Abenet had another similar issue a few days ago when trying to find the stats for 2018 in the RTB community
-
- Read more →
-
-
-
-
-
-
-
-
-
- June, 2019
-
-
- 2019-06-02
-
-
-- Merge the Solr filterCache and XMLUI ISI journal changes to the 5_x-prod branch and deploy on CGSpace
-- Run system updates on CGSpace (linode18) and reboot it
-
-
-2019-06-03
-
-
-- Skype with Marie-Angélique and Abenet about CG Core v2
-
- Read more →
-
-
-
-
-
-
-
-
-
- May, 2019
-
-
- 2019-05-01
-
-
-- Help CCAFS with regenerating some item thumbnails after they uploaded new PDFs to some items on CGSpace
-- A user on the dspace-tech mailing list offered some suggestions for troubleshooting the problem with the inability to delete certain items
-
-
-- Apparently if the item is in the workflowitem table it is submitted to a workflow
-- And if it is in the workspaceitem table it is in the pre-submitted state
-
-
-The item seems to be in a pre-submitted state, so I tried to delete it from there:
-
-dspace=# DELETE FROM workspaceitem WHERE item_id=74648;
+dspace=# select * from collection2item where item_id = '80278';
+id | collection_id | item_id
+-------+---------------+---------
+92551 | 313 | 80278
+92550 | 313 | 80278
+90774 | 1051 | 80278
+(3 rows)
+dspace=# delete from collection2item where id = 92551 and item_id = 80278;
DELETE 1
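A generic way to spot such duplicates, rather than checking one item at a time, is to look for repeated (collection_id, item_id) pairs; a hedged sketch on the three rows shown above, using sort/uniq instead of a live database:

```shell
# Sketch: list (collection_id, item_id) pairs that occur more than once,
# i.e. duplicate mappings. The three sample rows are the ones shown above;
# against PostgreSQL this would be a GROUP BY ... HAVING COUNT(*) > 1 query.
printf '%s\n' '92551 313 80278' '92550 313 80278' '90774 1051 80278' \
  | awk '{print $2, $3}' | sort | uniq -d
# → 313 80278
```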
-But after this I tried to delete the item from the XMLUI and it is still present…
+Create issue on GitHub to track the addition of CCAFS Phase II project tags (#301)
+
+Looks like we’ll be using cg.identifier.ccafsprojectpii as the field name
- Read more →
+ Read more →
@@ -274,43 +324,21 @@ DELETE 1
- April, 2019
- January, 2017
+
- 2019-04-01
+ 2017-01-02
-- Meeting with AgroKnow to discuss CGSpace, ILRI data, AReS, GARDIAN, etc
-
-
-- They asked if we had plans to enable RDF support in CGSpace
-
-
-There have been 4,400 more downloads of the CTA Spore publication from those strange Amazon IP addresses today
-
-
-I suspected that some might not be successful, because the stats show less, but today they were all HTTP 200!
-
-# cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep 'Spore-192-EN-web.pdf' | grep -E '(18.196.196.108|18.195.78.144|18.195.218.6)' | awk '{print $9}' | sort | uniq -c | sort -n | tail -n 5
-4432 200
-
-
-
-In the last two weeks there have been 47,000 downloads of this same exact PDF by these three IP addresses
-
-Apply country and region corrections and deletions on DSpace Test and CGSpace:
-
-$ ./fix-metadata-values.py -i /tmp/2019-02-21-fix-9-countries.csv -db dspace -u dspace -p 'fuuu' -f cg.coverage.country -m 228 -t ACTION -d
-$ ./fix-metadata-values.py -i /tmp/2019-02-21-fix-4-regions.csv -db dspace -u dspace -p 'fuuu' -f cg.coverage.region -m 231 -t action -d
-$ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-2-countries.csv -db dspace -u dspace -p 'fuuu' -m 228 -f cg.coverage.country -d
-$ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-1-region.csv -db dspace -u dspace -p 'fuuu' -m 231 -f cg.coverage.region -d
-
+- I checked to see if the Solr sharding task that is supposed to run on January 1st had run and saw there was an error
+- I tested on DSpace Test as well and it doesn’t work there either
+- I asked on the dspace-tech mailing list because it seems to be broken, and actually now I’m not sure if we’ve ever had the sharding task run successfully over all these years
- Read more →
+ Read more →
@@ -320,118 +348,34 @@ $ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-1-region.csv -db dspace
- March, 2019
- December, 2016
+
- 2019-03-01
+ 2016-12-02
-- I checked IITA’s 259 Feb 14 records from last month for duplicates using Atmire’s Duplicate Checker on a fresh snapshot of CGSpace on my local machine and everything looks good
-- I am now only waiting to hear from her about where the items should go, though I assume Journal Articles go to IITA Journal Articles collection, etc…
-- Looking at the other half of Udana’s WLE records from 2018-11
+
- CGSpace was down for five hours in the morning while I was sleeping
-
-- I finished the ones for Restoring Degraded Landscapes (RDL), but these are for Variability, Risks and Competing Uses (VRC)
-- I did the usual cleanups for whitespace, added regions where they made sense for certain countries, cleaned up the DOI link formats, added rights information based on the publications page for a few items
-- Most worryingly, there are encoding errors in the abstracts for eleven items, for example:
-- 68.15% � 9.45 instead of 68.15% ± 9.45
-- 2003�2013 instead of 2003–2013
-
-- I think I will need to ask Udana to re-copy and paste the abstracts with more care using Google Docs
-
- Read more →
-
+While looking in the logs for errors, I see tons of warnings about Atmire MQM:
-
-
-
-
-
-
-
- February, 2019
-
-
- 2019-02-01
-
-
-- Linode has alerted a few times since last night that the CPU usage on CGSpace (linode18) was high despite me increasing the alert threshold last week from 250% to 275%—I might need to increase it again!
-
-The top IPs before, during, and after this latest alert tonight were:
-
-# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "01/Feb/2019:(17|18|19|20|21)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
-245 207.46.13.5
-332 54.70.40.11
-385 5.143.231.38
-405 207.46.13.173
-405 207.46.13.75
-1117 66.249.66.219
-1121 35.237.175.180
-1546 5.9.6.51
-2474 45.5.186.2
-5490 85.25.237.71
+2016-12-02 03:00:32,352 WARN com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=CREATE, SubjectType=BUNDLE, SubjectID=70316, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632305, dispatcher=1544803905, detail=[null], transactionID="TX157907838689377964651674089851855413607")
+2016-12-02 03:00:32,353 WARN com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=MODIFY_METADATA, SubjectType=BUNDLE, SubjectID =70316, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632309, dispatcher=1544803905, detail="dc.title", transactionID="TX157907838689377964651674089851855413607")
+2016-12-02 03:00:32,353 WARN com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=ADD, SubjectType=ITEM, SubjectID=80044, Object Type=BUNDLE, ObjectID=70316, TimeStamp=1480647632311, dispatcher=1544803905, detail="THUMBNAIL", transactionID="TX157907838689377964651674089851855413607")
+2016-12-02 03:00:32,353 WARN com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=ADD, SubjectType=BUNDLE, SubjectID=70316, Obje ctType=BITSTREAM, ObjectID=86715, TimeStamp=1480647632318, dispatcher=1544803905, detail="-1", transactionID="TX157907838689377964651674089851855413607")
+2016-12-02 03:00:32,353 WARN com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=MODIFY, SubjectType=ITEM, SubjectID=80044, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632351, dispatcher=1544803905, detail=[null], transactionID="TX157907838689377964651674089851855413607")
-85.25.237.71 is the “Linguee Bot” that I first saw last month
+I see thousands of them in the logs for the last few months, so it’s not related to the DSpace 5.5 upgrade
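One quick way to verify that claim is to count the warnings per log file; a hedged sketch using a synthetic log written to a temp file, since the real dspace.log paths aren't available here:

```shell
# Sketch: count BatchEditConsumer warnings to gauge their volume. Against
# the real logs this would be something like:
#   grep -c 'BatchEditConsumer should not have been given' dspace.log.2016-*
printf '%s\n' \
  'WARN BatchEditConsumer should not have been given this kind of Subject in an event, skipping' \
  'INFO org.dspace.core.Context @ something unrelated' > /tmp/sample-dspace.log
grep -c 'BatchEditConsumer should not have been given' /tmp/sample-dspace.log
# → 1
```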
-The Solr statistics the past few months have been very high and I was wondering if the web server logs also showed an increase
+I’ve raised a ticket with Atmire to ask
-There were just over 3 million accesses in the nginx logs last month:
-
-# time zcat --force /var/log/nginx/* | grep -cE "[0-9]{1,2}/Jan/2019"
-3018243
-
-real 0m19.873s
-user 0m22.203s
-sys 0m1.979s
-
+Another worrying error from dspace.log is:
- Read more →
-
-
-
-
-
-
-
-
-
- January, 2019
-
-
- 2019-01-02
-
-
-- Linode alerted that CGSpace (linode18) had a higher outbound traffic rate than normal early this morning
-
-I don’t see anything interesting in the web server logs around that time though:
-
-# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "02/Jan/2019:0(1|2|3)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
- 92 40.77.167.4
- 99 210.7.29.100
-120 38.126.157.45
-177 35.237.175.180
-177 40.77.167.32
-216 66.249.75.219
-225 18.203.76.93
-261 46.101.86.248
-357 207.46.13.1
-903 54.70.40.11
-
-
- Read more →
+ Read more →
@@ -462,6 +406,8 @@ sys 0m1.979s
+- CGSpace CG Core v2 Migration
+
- October, 2019
- September, 2019
@@ -470,8 +416,6 @@ sys 0m1.979s
- July, 2019
-- June, 2019
-
diff --git a/docs/tags/notes/index.xml b/docs/tags/notes/index.xml
index 64c6e162e..d13eb8756 100644
--- a/docs/tags/notes/index.xml
+++ b/docs/tags/notes/index.xml
@@ -6,652 +6,11 @@
Recent content in Notes on CGSpace Notes
Hugo -- gohugo.io
en-us
- Tue, 01 Oct 2019 13:20:51 +0300
+ Thu, 07 Sep 2017 16:54:52 +0700
- -
-
October, 2019
- https://alanorth.github.io/cgspace-notes/2019-10/
- Tue, 01 Oct 2019 13:20:51 +0300
-
- https://alanorth.github.io/cgspace-notes/2019-10/
- 2019-10-01 Udana from IWMI asked me for a CSV export of their community on CGSpace
- I exported it, but a quick run through the csv-metadata-quality tool shows that there are some low-hanging fruits we can fix before I send him the data I will limit the scope to the titles, regions, subregions, and river basins for now to manually fix some non-breaking spaces (U+00A0) there that would otherwise be removed by the csv-metadata-quality script’s “unneccesary Unicode” fix:
-
-
- -
-
September, 2019
- https://alanorth.github.io/cgspace-notes/2019-09/
- Sun, 01 Sep 2019 10:17:51 +0300
-
- https://alanorth.github.io/cgspace-notes/2019-09/
- <h2 id="2019-09-01">2019-09-01</h2>
-
-<ul>
-<li>Linode emailed to say that CGSpace (linode18) had a high rate of outbound traffic for several hours this morning</li>
-
-<li><p>Here are the top ten IPs in the nginx XMLUI and REST/OAI logs this morning:</p>
-
-<pre><code># zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E "01/Sep/2019:0" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
-440 17.58.101.255
-441 157.55.39.101
-485 207.46.13.43
-728 169.60.128.125
-730 207.46.13.108
-758 157.55.39.9
-808 66.160.140.179
-814 207.46.13.212
-2472 163.172.71.23
-6092 3.94.211.189
-# zcat --force /var/log/nginx/rest.log /var/log/nginx/rest.log.1 /var/log/nginx/oai.log /var/log/nginx/oai.log.1 | grep -E "01/Sep/2019:0" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
- 33 2a01:7e00::f03c:91ff:fe16:fcb
- 57 3.83.192.124
- 57 3.87.77.25
- 57 54.82.1.8
-822 2a01:9cc0:47:1:1a:4:0:2
-1223 45.5.184.72
-1633 172.104.229.92
-5112 205.186.128.185
-7249 2a01:7e00::f03c:91ff:fe18:7396
-9124 45.5.186.2
-</code></pre></li>
-</ul>
-
-
- -
-
August, 2019
- https://alanorth.github.io/cgspace-notes/2019-08/
- Sat, 03 Aug 2019 12:39:51 +0300
-
- https://alanorth.github.io/cgspace-notes/2019-08/
- <h2 id="2019-08-03">2019-08-03</h2>
-
-<ul>
-<li>Look at Bioversity’s latest migration CSV and now I see that Francesco has cleaned up the extra columns and the newline at the end of the file, but many of the column headers have an extra space in the name…</li>
-</ul>
-
-<h2 id="2019-08-04">2019-08-04</h2>
-
-<ul>
-<li>Deploy ORCID identifier updates requested by Bioversity to CGSpace</li>
-<li>Run system updates on CGSpace (linode18) and reboot it
-
-<ul>
-<li>Before updating it I checked Solr and verified that all statistics cores were loaded properly…</li>
-<li>After rebooting, all statistics cores were loaded… wow, that’s lucky.</li>
-</ul></li>
-<li>Run system updates on DSpace Test (linode19) and reboot it</li>
-</ul>
-
-
- -
-
July, 2019
- https://alanorth.github.io/cgspace-notes/2019-07/
- Mon, 01 Jul 2019 12:13:51 +0300
-
- https://alanorth.github.io/cgspace-notes/2019-07/
- <h2 id="2019-07-01">2019-07-01</h2>
-
-<ul>
-<li>Create an “AfricaRice books and book chapters” collection on CGSpace for AfricaRice</li>
-<li>Last month Sisay asked why the following “most popular” statistics link for a range of months in 2018 works for the CIAT community on DSpace Test, but not on CGSpace:
-
-<ul>
-<li><a href="https://dspacetest.cgiar.org/handle/10568/35697/most-popular/item#simplefilter=custom&time_filter_end_date=01%2F12%2F2018">DSpace Test</a></li>
-<li><a href="https://cgspace.cgiar.org/handle/10568/35697/most-popular/item#simplefilter=custom&time_filter_end_date=01%2F12%2F2018">CGSpace</a></li>
-</ul></li>
-<li>Abenet had another similar issue a few days ago when trying to find the stats for 2018 in the RTB community</li>
-</ul>
-
-
- -
-
June, 2019
- https://alanorth.github.io/cgspace-notes/2019-06/
- Sun, 02 Jun 2019 10:57:51 +0300
-
- https://alanorth.github.io/cgspace-notes/2019-06/
- <h2 id="2019-06-02">2019-06-02</h2>
-
-<ul>
-<li>Merge the <a href="https://github.com/ilri/DSpace/pull/425">Solr filterCache</a> and <a href="https://github.com/ilri/DSpace/pull/426">XMLUI ISI journal</a> changes to the <code>5_x-prod</code> branch and deploy on CGSpace</li>
-<li>Run system updates on CGSpace (linode18) and reboot it</li>
-</ul>
-
-<h2 id="2019-06-03">2019-06-03</h2>
-
-<ul>
-<li>Skype with Marie-Angélique and Abenet about <a href="https://agriculturalsemantics.github.io/cg-core/cgcore.html">CG Core v2</a></li>
-</ul>
-
-
- -
-
May, 2019
- https://alanorth.github.io/cgspace-notes/2019-05/
- Wed, 01 May 2019 07:37:43 +0300
-
- https://alanorth.github.io/cgspace-notes/2019-05/
- <h2 id="2019-05-01">2019-05-01</h2>
-
-<ul>
-<li>Help CCAFS with regenerating some item thumbnails after they uploaded new PDFs to some items on CGSpace</li>
-<li>A user on the dspace-tech mailing list offered some suggestions for troubleshooting the problem with the inability to delete certain items
-
-<ul>
-<li>Apparently if the item is in the <code>workflowitem</code> table it is submitted to a workflow</li>
-<li>And if it is in the <code>workspaceitem</code> table it is in the pre-submitted state</li>
-</ul></li>
-
-<li><p>The item seems to be in a pre-submitted state, so I tried to delete it from there:</p>
-
-<pre><code>dspace=# DELETE FROM workspaceitem WHERE item_id=74648;
-DELETE 1
-</code></pre></li>
-
-<li><p>But after this I tried to delete the item from the XMLUI and it is <em>still</em> present…</p></li>
-</ul>
-
-
- -
-
April, 2019
- https://alanorth.github.io/cgspace-notes/2019-04/
- Mon, 01 Apr 2019 09:00:43 +0300
-
- https://alanorth.github.io/cgspace-notes/2019-04/
- <h2 id="2019-04-01">2019-04-01</h2>
-
-<ul>
-<li>Meeting with AgroKnow to discuss CGSpace, ILRI data, AReS, GARDIAN, etc
-
-<ul>
-<li>They asked if we had plans to enable RDF support in CGSpace</li>
-</ul></li>
-
-<li><p>There have been 4,400 more downloads of the CTA Spore publication from those strange Amazon IP addresses today</p>
-
-<ul>
-<li><p>I suspected that some might not be successful, because the stats show less, but today they were all HTTP 200!</p>
-
-<pre><code># cat /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep 'Spore-192-EN-web.pdf' | grep -E '(18.196.196.108|18.195.78.144|18.195.218.6)' | awk '{print $9}' | sort | uniq -c | sort -n | tail -n 5
-4432 200
-</code></pre></li>
-</ul></li>
-
-<li><p>In the last two weeks there have been 47,000 downloads of this <em>same exact PDF</em> by these three IP addresses</p></li>
-
-<li><p>Apply country and region corrections and deletions on DSpace Test and CGSpace:</p>
-
-<pre><code>$ ./fix-metadata-values.py -i /tmp/2019-02-21-fix-9-countries.csv -db dspace -u dspace -p 'fuuu' -f cg.coverage.country -m 228 -t ACTION -d
-$ ./fix-metadata-values.py -i /tmp/2019-02-21-fix-4-regions.csv -db dspace -u dspace -p 'fuuu' -f cg.coverage.region -m 231 -t action -d
-$ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-2-countries.csv -db dspace -u dspace -p 'fuuu' -m 228 -f cg.coverage.country -d
-$ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-1-region.csv -db dspace -u dspace -p 'fuuu' -m 231 -f cg.coverage.region -d
-</code></pre></li>
-</ul>
-
-
- -
-
March, 2019
- https://alanorth.github.io/cgspace-notes/2019-03/
- Fri, 01 Mar 2019 12:16:30 +0100
-
- https://alanorth.github.io/cgspace-notes/2019-03/
- <h2 id="2019-03-01">2019-03-01</h2>
-
-<ul>
-<li>I checked IITA’s 259 Feb 14 records from last month for duplicates using Atmire’s Duplicate Checker on a fresh snapshot of CGSpace on my local machine and everything looks good</li>
-<li>I am now only waiting to hear from her about where the items should go, though I assume Journal Articles go to IITA Journal Articles collection, etc…</li>
-<li>Looking at the other half of Udana’s WLE records from 2018-11
-
-<ul>
-<li>I finished the ones for Restoring Degraded Landscapes (RDL), but these are for Variability, Risks and Competing Uses (VRC)</li>
-<li>I did the usual cleanups for whitespace, added regions where they made sense for certain countries, cleaned up the DOI link formats, added rights information based on the publications page for a few items</li>
-<li>Most worryingly, there are encoding errors in the abstracts for eleven items, for example:</li>
-<li>68.15% � 9.45 instead of 68.15% ± 9.45</li>
-<li>2003�2013 instead of 2003–2013</li>
-</ul></li>
-<li>I think I will need to ask Udana to re-copy and paste the abstracts with more care using Google Docs</li>
-</ul>
-
-
- -
-
February, 2019
- https://alanorth.github.io/cgspace-notes/2019-02/
- Fri, 01 Feb 2019 21:37:30 +0200
-
- https://alanorth.github.io/cgspace-notes/2019-02/
- <h2 id="2019-02-01">2019-02-01</h2>
-
-<ul>
-<li>Linode has alerted a few times since last night that the CPU usage on CGSpace (linode18) was high despite me increasing the alert threshold last week from 250% to 275%—I might need to increase it again!</li>
-
-<li><p>The top IPs before, during, and after this latest alert tonight were:</p>
-
-<pre><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "01/Feb/2019:(17|18|19|20|21)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
-245 207.46.13.5
-332 54.70.40.11
-385 5.143.231.38
-405 207.46.13.173
-405 207.46.13.75
-1117 66.249.66.219
-1121 35.237.175.180
-1546 5.9.6.51
-2474 45.5.186.2
-5490 85.25.237.71
-</code></pre></li>
-
-<li><p><code>85.25.237.71</code> is the “Linguee Bot” that I first saw last month</p></li>
-
-<li><p>The Solr statistics the past few months have been very high and I was wondering if the web server logs also showed an increase</p></li>
-
-<li><p>There were just over 3 million accesses in the nginx logs last month:</p>
-
-<pre><code># time zcat --force /var/log/nginx/* | grep -cE "[0-9]{1,2}/Jan/2019"
-3018243
-
-real 0m19.873s
-user 0m22.203s
-sys 0m1.979s
-</code></pre></li>
-</ul>
-
-
- -
-
January, 2019
- https://alanorth.github.io/cgspace-notes/2019-01/
- Wed, 02 Jan 2019 09:48:30 +0200
-
- https://alanorth.github.io/cgspace-notes/2019-01/
- <h2 id="2019-01-02">2019-01-02</h2>
-
-<ul>
-<li>Linode alerted that CGSpace (linode18) had a higher outbound traffic rate than normal early this morning</li>
-
-<li><p>I don’t see anything interesting in the web server logs around that time though:</p>
-
-<pre><code># zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "02/Jan/2019:0(1|2|3)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
- 92 40.77.167.4
- 99 210.7.29.100
-120 38.126.157.45
-177 35.237.175.180
-177 40.77.167.32
-216 66.249.75.219
-225 18.203.76.93
-261 46.101.86.248
-357 207.46.13.1
-903 54.70.40.11
-</code></pre></li>
-</ul>
-
-
- -
-
December, 2018
- https://alanorth.github.io/cgspace-notes/2018-12/
- Sun, 02 Dec 2018 02:09:30 +0200
-
- https://alanorth.github.io/cgspace-notes/2018-12/
- <h2 id="2018-12-01">2018-12-01</h2>
-
-<ul>
-<li>Switch CGSpace (linode18) to use OpenJDK instead of Oracle JDK</li>
-<li>I manually installed OpenJDK, then removed Oracle JDK, then re-ran the <a href="http://github.com/ilri/rmg-ansible-public">Ansible playbook</a> to update all configuration files, etc</li>
-<li>Then I ran all system updates and restarted the server</li>
-</ul>
-
-<h2 id="2018-12-02">2018-12-02</h2>
-
-<ul>
-<li>I noticed that there is another issue with PDF thumbnails on CGSpace, and I see there was another <a href="https://usn.ubuntu.com/3831-1/">Ghostscript vulnerability last week</a></li>
-</ul>
-
-
- -
-
November, 2018
- https://alanorth.github.io/cgspace-notes/2018-11/
- Thu, 01 Nov 2018 16:41:30 +0200
-
- https://alanorth.github.io/cgspace-notes/2018-11/
- <h2 id="2018-11-01">2018-11-01</h2>
-
-<ul>
-<li>Finalize AReS Phase I and Phase II ToRs</li>
-<li>Send a note about my <a href="https://github.com/ilri/dspace-statistics-api">dspace-statistics-api</a> to the dspace-tech mailing list</li>
-</ul>
-
-<h2 id="2018-11-03">2018-11-03</h2>
-
-<ul>
-<li>Linode has been sending mails a few times a day recently that CGSpace (linode18) has had high CPU usage</li>
-<li>Today these are the top 10 IPs:</li>
-</ul>
-
-
- -
-
October, 2018
- https://alanorth.github.io/cgspace-notes/2018-10/
- Mon, 01 Oct 2018 22:31:54 +0300
-
- https://alanorth.github.io/cgspace-notes/2018-10/
- <h2 id="2018-10-01">2018-10-01</h2>
-
-<ul>
-<li>Phil Thornton got an ORCID identifier so we need to add it to the list on CGSpace and tag his existing items</li>
-<li>I created a GitHub issue to track this <a href="https://github.com/ilri/DSpace/issues/389">#389</a>, because I’m super busy in Nairobi right now</li>
-</ul>
-
-
- -
-
September, 2018
- https://alanorth.github.io/cgspace-notes/2018-09/
- Sun, 02 Sep 2018 09:55:54 +0300
-
- https://alanorth.github.io/cgspace-notes/2018-09/
- <h2 id="2018-09-02">2018-09-02</h2>
-
-<ul>
-<li>New <a href="https://jdbc.postgresql.org/documentation/changelog.html#version_42.2.5">PostgreSQL JDBC driver version 42.2.5</a></li>
-<li>I’ll update the DSpace role in our <a href="https://github.com/ilri/rmg-ansible-public">Ansible infrastructure playbooks</a> and run the updated playbooks on CGSpace and DSpace Test</li>
-<li>Also, I’ll re-run the <code>postgresql</code> tasks because the custom PostgreSQL variables are dynamic according to the system’s RAM, and we never re-ran them after migrating to larger Linodes last month</li>
-<li>I’m testing the new DSpace 5.8 branch in my Ubuntu 18.04 environment and I’m getting those autowire errors in Tomcat 8.5.30 again:</li>
-</ul>
-
-
- -
-
August, 2018
- https://alanorth.github.io/cgspace-notes/2018-08/
- Wed, 01 Aug 2018 11:52:54 +0300
-
- https://alanorth.github.io/cgspace-notes/2018-08/
- <h2 id="2018-08-01">2018-08-01</h2>
-
-<ul>
-<li><p>DSpace Test had crashed at some point yesterday morning and I see the following in <code>dmesg</code>:</p>
-
-<pre><code>[Tue Jul 31 00:00:41 2018] Out of memory: Kill process 1394 (java) score 668 or sacrifice child
-[Tue Jul 31 00:00:41 2018] Killed process 1394 (java) total-vm:15601860kB, anon-rss:5355528kB, file-rss:0kB, shmem-rss:0kB
-[Tue Jul 31 00:00:41 2018] oom_reaper: reaped process 1394 (java), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
-</code></pre></li>
-
-<li><p>Judging from the time of the crash it was probably related to the Discovery indexing that starts at midnight</p></li>
-
-<li><p>From the DSpace log I see that eventually Solr stopped responding, so I guess the <code>java</code> process that was OOM killed above was Tomcat’s</p></li>
-
-<li><p>I’m not sure why Tomcat didn’t crash with an OutOfMemoryError…</p></li>
-
-<li><p>Anyways, perhaps I should increase the JVM heap from 5120m to 6144m like we did a few months ago when we tried to run the whole CGSpace Solr core</p></li>
-
-<li><p>The server only has 8GB of RAM so we’ll eventually need to upgrade to a larger one because we’ll start starving the OS, PostgreSQL, and command line batch processes</p></li>
-
-<li><p>I ran all system updates on DSpace Test and rebooted it</p></li>
-</ul>
-
-
- -
-
July, 2018
- https://alanorth.github.io/cgspace-notes/2018-07/
- Sun, 01 Jul 2018 12:56:54 +0300
-
- https://alanorth.github.io/cgspace-notes/2018-07/
- <h2 id="2018-07-01">2018-07-01</h2>
-
-<ul>
-<li><p>I want to upgrade DSpace Test to DSpace 5.8 so I took a backup of its current database just in case:</p>
-
-<pre><code>$ pg_dump -b -v -o --format=custom -U dspace -f dspace-2018-07-01.backup dspace
-</code></pre></li>
-
-<li><p>During the <code>mvn package</code> stage on the 5.8 branch I kept getting issues with java running out of memory:</p>
-
-<pre><code>There is insufficient memory for the Java Runtime Environment to continue.
-</code></pre></li>
-</ul>
-
-
- -
-
June, 2018
- https://alanorth.github.io/cgspace-notes/2018-06/
- Mon, 04 Jun 2018 19:49:54 -0700
-
- https://alanorth.github.io/cgspace-notes/2018-06/
- <h2 id="2018-06-04">2018-06-04</h2>
-
-<ul>
-<li>Test the <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=560">DSpace 5.8 module upgrades from Atmire</a> (<a href="https://github.com/ilri/DSpace/pull/378">#378</a>)
-
-<ul>
-<li>There seems to be a problem with the CUA and L&R versions in <code>pom.xml</code> because they are using SNAPSHOT and it doesn’t build</li>
-</ul></li>
-<li>I added the new CCAFS Phase II Project Tag <code>PII-FP1_PACCA2</code> and merged it into the <code>5_x-prod</code> branch (<a href="https://github.com/ilri/DSpace/pull/379">#379</a>)</li>
-
-<li><p>I proofed and tested the ILRI author corrections that Peter sent back to me this week:</p>
-
-<pre><code>$ ./fix-metadata-values.py -i /tmp/2018-05-30-Correct-660-authors.csv -db dspace -u dspace -p 'fuuu' -f dc.contributor.author -t correct -m 3 -n
-</code></pre></li>
-
-<li><p>I think a sane proofing workflow in OpenRefine is to apply the custom text facets for check/delete/remove and illegal characters that I developed in <a href="https://alanorth.github.io/cgspace-notes/cgspace-notes/2018-03/">March, 2018</a></p></li>
-
-<li><p>Time to index ~70,000 items on CGSpace:</p>
-
-<pre><code>$ time schedtool -D -e ionice -c2 -n7 nice -n19 [dspace]/bin/dspace index-discovery -b
-
-real 74m42.646s
-user 8m5.056s
-sys 2m7.289s
-</code></pre></li>
-</ul>
-
-
- -
-
May, 2018
- https://alanorth.github.io/cgspace-notes/2018-05/
- Tue, 01 May 2018 16:43:54 +0300
-
- https://alanorth.github.io/cgspace-notes/2018-05/
- <h2 id="2018-05-01">2018-05-01</h2>
-
-<ul>
-<li>I cleared the Solr statistics core on DSpace Test by issuing two commands directly to the Solr admin interface:
-
-<ul>
-<li><a href="http://localhost:3000/solr/statistics/update?stream.body=%3Cdelete%3E%3Cquery%3E*:*%3C/query%3E%3C/delete%3E">http://localhost:3000/solr/statistics/update?stream.body=%3Cdelete%3E%3Cquery%3E*:*%3C/query%3E%3C/delete%3E</a></li>
-<li><a href="http://localhost:3000/solr/statistics/update?stream.body=%3Ccommit/%3E">http://localhost:3000/solr/statistics/update?stream.body=%3Ccommit/%3E</a></li>
-</ul></li>
-<li>Then I reduced the JVM heap size from 6144 back to 5120m</li>
-<li>Also, I switched it to use OpenJDK instead of Oracle Java, as well as re-worked the <a href="https://github.com/ilri/rmg-ansible-public">Ansible infrastructure scripts</a> to support hosts choosing which distribution they want to use</li>
-</ul>
-
-
- -
-
April, 2018
- https://alanorth.github.io/cgspace-notes/2018-04/
- Sun, 01 Apr 2018 16:13:54 +0200
-
- https://alanorth.github.io/cgspace-notes/2018-04/
- <h2 id="2018-04-01">2018-04-01</h2>
-
-<ul>
-<li>I tried to test something on DSpace Test but noticed that it’s down since god knows when</li>
-<li>Catalina logs at least show some memory errors yesterday:</li>
-</ul>
-
-
- -
-
March, 2018
- https://alanorth.github.io/cgspace-notes/2018-03/
- Fri, 02 Mar 2018 16:07:54 +0200
-
- https://alanorth.github.io/cgspace-notes/2018-03/
- <h2 id="2018-03-02">2018-03-02</h2>
-
-<ul>
-<li>Export a CSV of the IITA community metadata for Martin Mueller</li>
-</ul>
-
-
- -
-
February, 2018
- https://alanorth.github.io/cgspace-notes/2018-02/
- Thu, 01 Feb 2018 16:28:54 +0200
-
- https://alanorth.github.io/cgspace-notes/2018-02/
- <h2 id="2018-02-01">2018-02-01</h2>
-
-<ul>
-<li>Peter gave feedback on the <code>dc.rights</code> proof of concept that I had sent him last week</li>
-<li>We don’t need to distinguish between internal and external works, so that makes it just a simple list</li>
-<li>Yesterday I figured out how to monitor DSpace sessions using JMX</li>
-<li>I copied the logic in the <code>jmx_tomcat_dbpools</code> provided by Ubuntu’s <code>munin-plugins-java</code> package and used the stuff I discovered about JMX <a href="https://alanorth.github.io/cgspace-notes/cgspace-notes/2018-01/">in 2018-01</a></li>
-</ul>
-
-
- -
-
January, 2018
- https://alanorth.github.io/cgspace-notes/2018-01/
- Tue, 02 Jan 2018 08:35:54 -0800
-
- https://alanorth.github.io/cgspace-notes/2018-01/
- <h2 id="2018-01-02">2018-01-02</h2>
-
-<ul>
-<li>Uptime Robot noticed that CGSpace went down and up a few times last night, for a few minutes each time</li>
-<li>I didn’t get any load alerts from Linode and the REST and XMLUI logs don’t show anything out of the ordinary</li>
-<li>The nginx logs show HTTP 200s until <code>02/Jan/2018:11:27:17 +0000</code> when Uptime Robot got an HTTP 500</li>
-<li>In dspace.log around that time I see many errors like “Client closed the connection before file download was complete”</li>
-
-<li><p>And just before that I see this:</p>
-
-<pre><code>Caused by: org.apache.tomcat.jdbc.pool.PoolExhaustedException: [http-bio-127.0.0.1-8443-exec-980] Timeout: Pool empty. Unable to fetch a connection in 5 seconds, none available[size:50; busy:50; idle:0; lastwait:5000].
-</code></pre></li>
-
-<li><p>Ah hah! So the pool was actually empty!</p></li>
-
-<li><p>I need to increase that, let’s try to bump it up from 50 to 75</p></li>
-
-<li><p>After that one client got an HTTP 499 but then the rest were HTTP 200, so I don’t know what the hell Uptime Robot saw</p></li>
-
-<li><p>I notice this error quite a few times in dspace.log:</p>
-
-<pre><code>2018-01-02 01:21:19,137 ERROR org.dspace.app.xmlui.aspect.discovery.SidebarFacetsTransformer @ Error while searching for sidebar facets
-org.dspace.discovery.SearchServiceException: org.apache.solr.search.SyntaxError: Cannot parse 'dateIssued_keyword:[1976+TO+1979]': Encountered " "]" "] "" at line 1, column 32.
-</code></pre></li>
-
-<li><p>And there are many of these errors every day for the past month:</p>
-
-<pre><code>$ grep -c "Error while searching for sidebar facets" dspace.log.*
-dspace.log.2017-11-21:4
-dspace.log.2017-11-22:1
-dspace.log.2017-11-23:4
-dspace.log.2017-11-24:11
-dspace.log.2017-11-25:0
-dspace.log.2017-11-26:1
-dspace.log.2017-11-27:7
-dspace.log.2017-11-28:21
-dspace.log.2017-11-29:31
-dspace.log.2017-11-30:15
-dspace.log.2017-12-01:15
-dspace.log.2017-12-02:20
-dspace.log.2017-12-03:38
-dspace.log.2017-12-04:65
-dspace.log.2017-12-05:43
-dspace.log.2017-12-06:72
-dspace.log.2017-12-07:27
-dspace.log.2017-12-08:15
-dspace.log.2017-12-09:29
-dspace.log.2017-12-10:35
-dspace.log.2017-12-11:20
-dspace.log.2017-12-12:44
-dspace.log.2017-12-13:36
-dspace.log.2017-12-14:59
-dspace.log.2017-12-15:104
-dspace.log.2017-12-16:53
-dspace.log.2017-12-17:66
-dspace.log.2017-12-18:83
-dspace.log.2017-12-19:101
-dspace.log.2017-12-20:74
-dspace.log.2017-12-21:55
-dspace.log.2017-12-22:66
-dspace.log.2017-12-23:50
-dspace.log.2017-12-24:85
-dspace.log.2017-12-25:62
-dspace.log.2017-12-26:49
-dspace.log.2017-12-27:30
-dspace.log.2017-12-28:54
-dspace.log.2017-12-29:68
-dspace.log.2017-12-30:89
-dspace.log.2017-12-31:53
-dspace.log.2018-01-01:45
-dspace.log.2018-01-02:34
-</code></pre></li>
-
-<li><p>Danny wrote to ask for help renewing the wildcard ilri.org certificate and I advised that we should probably use Let’s Encrypt if it’s just a handful of domains</p></li>
-</ul>
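Those per-day error counts can be totaled in one pass; a quick sketch that consumes the same `filename:count` output format that `grep -c` produces above:

```shell
# grep -c prints "filename:count" for each rotated log;
# awk splits on ":" and sums the count column.
grep -c "Error while searching for sidebar facets" dspace.log.* \
  | awk -F: '{sum += $2} END {print sum}'
```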
-
-
- -
-
December, 2017
- https://alanorth.github.io/cgspace-notes/2017-12/
- Fri, 01 Dec 2017 13:53:54 +0300
-
- https://alanorth.github.io/cgspace-notes/2017-12/
- <h2 id="2017-12-01">2017-12-01</h2>
-
-<ul>
-<li>Uptime Robot noticed that CGSpace went down</li>
-<li>The logs say “Timeout waiting for idle object”</li>
-<li>PostgreSQL activity says there are 115 connections currently</li>
-<li>The list of connections to XMLUI and REST API for today:</li>
-</ul>
-
-
- -
-
November, 2017
- https://alanorth.github.io/cgspace-notes/2017-11/
- Thu, 02 Nov 2017 09:37:54 +0200
-
- https://alanorth.github.io/cgspace-notes/2017-11/
- <h2 id="2017-11-01">2017-11-01</h2>
-
-<ul>
-<li>The CORE developers responded to say they are looking into their bot not respecting our robots.txt</li>
-</ul>
-
-<h2 id="2017-11-02">2017-11-02</h2>
-
-<ul>
-<li><p>Today there have been no hits by CORE and no alerts from Linode (coincidence?)</p>
-
-<pre><code># grep -c "CORE" /var/log/nginx/access.log
-0
-</code></pre></li>
-
-<li><p>Generate list of authors on CGSpace for Peter to go through and correct:</p>
-
-<pre><code>dspace=# \copy (select distinct text_value, count(*) as count from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 group by text_value order by count desc) to /tmp/authors.csv with csv;
-COPY 54701
-</code></pre></li>
-</ul>
-
-
- -
-
October, 2017
- https://alanorth.github.io/cgspace-notes/2017-10/
- Sun, 01 Oct 2017 08:07:54 +0300
-
- https://alanorth.github.io/cgspace-notes/2017-10/
- <h2 id="2017-10-01">2017-10-01</h2>
-
-<ul>
-<li><p>Peter emailed to point out that many items in the <a href="https://cgspace.cgiar.org/handle/10568/2703">ILRI archive collection</a> have multiple handles:</p>
-
-<pre><code>http://hdl.handle.net/10568/78495||http://hdl.handle.net/10568/79336
-</code></pre></li>
-
-<li><p>There appears to be a pattern but I’ll have to look a bit closer and try to clean them up automatically, either in SQL or in OpenRefine</p></li>
-
-<li><p>Add Katherine Lutz to the groups for content submission and edit steps of the CGIAR System collections</p></li>
-</ul>
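One way to spot the affected items before cleaning them up is to count the handles per value, splitting on the `||` separator shown above; a sketch, assuming the URI values were first exported to a text file (the path is hypothetical):

```shell
# Print values that contain more than one handle, prefixed by the count.
# The field separator is the literal "||" DSpace uses between multiple values.
awk -F'[|][|]' 'NF > 1 {print NF, $0}' /tmp/ilri-archive-uris.txt
```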
-
-
-
September, 2017
https://alanorth.github.io/cgspace-notes/2017-09/
diff --git a/docs/tags/notes/page/2/index.html b/docs/tags/notes/page/2/index.html
index 41871eabb..ab4901323 100644
--- a/docs/tags/notes/page/2/index.html
+++ b/docs/tags/notes/page/2/index.html
@@ -10,7 +10,7 @@
-
+
@@ -78,27 +78,21 @@
- December, 2018
- November, 2016
+
- 2018-12-01
+ 2016-11-01
-- Switch CGSpace (linode18) to use OpenJDK instead of Oracle JDK
-- I manually installed OpenJDK, then removed Oracle JDK, then re-ran the Ansible playbook to update all configuration files, etc
-- Then I ran all system updates and restarted the server
+- Add
dc.type
to the output options for Atmire’s Listings and Reports module (#286)
-2018-12-02
-
-
-- I noticed that there is another issue with PDF thumbnails on CGSpace, and I see there was another Ghostscript vulnerability last week
-
- Read more →
+
+ Read more →
@@ -108,187 +102,30 @@
- November, 2018
- October, 2016
+
- 2018-11-01
+ 2016-10-03
-- Finalize AReS Phase I and Phase II ToRs
-- Send a note about my dspace-statistics-api to the dspace-tech mailing list
-
-
-2018-11-03
+Testing adding ORCIDs to a CSV file for a single item to see if the author orders get messed up
+Need to test the following scenarios to see how author order is affected:
-- Linode has been sending mails a few times a day recently that CGSpace (linode18) has had high CPU usage
-- Today these are the top 10 IPs:
-
- Read more →
- October, 2018
-
-
- 2018-10-01
-
-
-- Phil Thornton got an ORCID identifier so we need to add it to the list on CGSpace and tag his existing items
-- I created a GitHub issue to track this #389, because I’m super busy in Nairobi right now
-
- Read more →
- September, 2018
-
-
- 2018-09-02
-
-
-- New PostgreSQL JDBC driver version 42.2.5
-- I’ll update the DSpace role in our Ansible infrastructure playbooks and run the updated playbooks on CGSpace and DSpace Test
-- Also, I’ll re-run the
postgresql
tasks because the custom PostgreSQL variables are dynamic according to the system’s RAM, and we never re-ran them after migrating to larger Linodes last month
-- I’m testing the new DSpace 5.8 branch in my Ubuntu 18.04 environment and I’m getting those autowire errors in Tomcat 8.5.30 again:
-
- Read more →
- August, 2018
-
-
- 2018-08-01
-
-
-DSpace Test had crashed at some point yesterday morning and I see the following in dmesg
:
-
-[Tue Jul 31 00:00:41 2018] Out of memory: Kill process 1394 (java) score 668 or sacrifice child
-[Tue Jul 31 00:00:41 2018] Killed process 1394 (java) total-vm:15601860kB, anon-rss:5355528kB, file-rss:0kB, shmem-rss:0kB
-[Tue Jul 31 00:00:41 2018] oom_reaper: reaped process 1394 (java), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
-
-
-Judging from the time of the crash it was probably related to the Discovery indexing that starts at midnight
-
-From the DSpace log I see that eventually Solr stopped responding, so I guess the java
process that was OOM killed above was Tomcat’s
-
-I’m not sure why Tomcat didn’t crash with an OutOfMemoryError…
-
-Anyways, perhaps I should increase the JVM heap from 5120m to 6144m like we did a few months ago when we tried to run the whole CGSpace Solr core
-
-The server only has 8GB of RAM so we’ll eventually need to upgrade to a larger one because we’ll start starving the OS, PostgreSQL, and command line batch processes
-
-I ran all system updates on DSpace Test and rebooted it
-
- Read more →
- July, 2018
-
-
- 2018-07-01
-
-
-I want to upgrade DSpace Test to DSpace 5.8 so I took a backup of its current database just in case:
-
-$ pg_dump -b -v -o --format=custom -U dspace -f dspace-2018-07-01.backup dspace
-
-
-During the mvn package
stage on the 5.8 branch I kept getting issues with java running out of memory:
-
-There is insufficient memory for the Java Runtime Environment to continue.
-
-
- Read more →
- June, 2018
-
-
- 2018-06-04
-
-
-- Test the DSpace 5.8 module upgrades from Atmire (#378)
-
-
-- There seems to be a problem with the CUA and L&R versions in
pom.xml
because they are using SNAPSHOT and it doesn’t build
+- ORCIDs only
+- ORCIDs plus normal authors
-- I added the new CCAFS Phase II Project Tag
PII-FP1_PACCA2
and merged it into the 5_x-prod
branch (#379)
-I proofed and tested the ILRI author corrections that Peter sent back to me this week:
+I exported a random item’s metadata as CSV, deleted all columns except id and collection, and made a new column called ORCID:dc.contributor.author
with the following random ORCIDs from the ORCID registry:
-$ ./fix-metadata-values.py -i /tmp/2018-05-30-Correct-660-authors.csv -db dspace -u dspace -p 'fuuu' -f dc.contributor.author -t correct -m 3 -n
-
-
-I think a sane proofing workflow in OpenRefine is to apply the custom text facets for check/delete/remove and illegal characters that I developed in March, 2018
-
-Time to index ~70,000 items on CGSpace:
-
-$ time schedtool -D -e ionice -c2 -n7 nice -n19 [dspace]/bin/dspace index-discovery -b
-
-real 74m42.646s
-user 8m5.056s
-sys 2m7.289s
+0000-0002-6115-0956||0000-0002-3812-8793||0000-0001-7462-405X
- Read more →
+ Read more →
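The test CSV described above would look something like this (a sketch; the item id and collection handle are placeholders, and the ORCID column name is copied from the note):

```shell
# Hypothetical metadata CSV for the ORCID author-order test;
# "123" and "10568/1" stand in for a real item id and collection handle.
cat > /tmp/orcid-test.csv <<'EOF'
id,collection,ORCID:dc.contributor.author
123,10568/1,0000-0002-6115-0956||0000-0002-3812-8793||0000-0001-7462-405X
EOF
```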
@@ -298,26 +135,26 @@ sys 2m7.289s
- May, 2018
- September, 2016
+
- 2018-05-01
+ 2016-09-01
-- I cleared the Solr statistics core on DSpace Test by issuing two commands directly to the Solr admin interface:
+
- Discuss helping CCAFS with some batch tagging of ORCID IDs for their authors
+- Discuss how the migration of CGIAR’s Active Directory to a flat structure will break our LDAP groups in DSpace
+- We had been using
DC=ILRI
to determine whether a user was ILRI or not
-
-Then I reduced the JVM heap size from 6144 back to 5120m
-Also, I switched it to use OpenJDK instead of Oracle Java, as well as re-worked the Ansible infrastructure scripts to support hosts choosing which distribution they want to use
+It looks like we might be able to use OUs now, instead of DCs:
+
+$ ldapsearch -x -H ldaps://svcgroot2.cgiarad.org:3269/ -b "dc=cgiarad,dc=org" -D "admigration1@cgiarad.org" -W "(sAMAccountName=admigration1)"
+
- Read more →
+ Read more →
bower.json
because most are several versions out of datefonts
)Start working on DSpace 5.1 → 5.5 port:
+ +$ git checkout -b 55new 5_x-prod
+$ git reset --hard ilri/5_x-prod
+$ git rebase -i dspace-5.5
+
dc.description.sponsorship
to Discovery sidebar facets and make investors clickable in item view (#232)I think this query should find and replace all authors that have “,” at the end of their names:
+ +dspacetest=# update metadatavalue set text_value = regexp_replace(text_value, '(^.+?),$', '\1') where metadata_field_id=3 and resource_type_id=2 and text_value ~ '^.+?,$';
+UPDATE 95
+dspacetest=# select text_value from metadatavalue where metadata_field_id=3 and resource_type_id=2 and text_value ~ '^.+?,$';
+text_value
+------------
+(0 rows)
+
In this case the select query was showing 95 results before the update
ListSets
verb: http://ebrary.ifpri.org/oai/oai.php?verb=ListSetsdc.identifier.fund
to cg.identifier.cpwfproject
and then the rest to dc.description.sponsorship
There are 3,000 IPs accessing the REST API in a 24-hour period!
+ +# awk '{print $1}' /var/log/nginx/rest.log | uniq | wc -l
+3168
+
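One caveat with the count above: `uniq` only collapses adjacent duplicate lines, so an unsorted access log can inflate the number; sorting first (or using `sort -u`) gives the true count of distinct IPs:

```shell
# sort -u de-duplicates globally, unlike bare uniq which only
# collapses adjacent repeated lines.
awk '{print $1}' /var/log/nginx/rest.log | sort -u | wc -l
```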
checker
log has some errors we should pay attention to:index-lucene-update
cron job active on CGSpace, but I’m pretty sure we don’t need it as of the latest few versions of Atmire’s Listings and Reports moduledc.rights
proof of concept that I had sent him last weekjmx_tomcat_dbpools
provided by Ubuntu’s munin-plugins-java
package and used the stuff I discovered about JMX in 2018-0110568/12503
from 10568/27869
to 10568/27629
using the move_collections.sh script I wrote last year.02/Jan/2018:11:27:17 +0000
when Uptime Robot got an HTTP 500Replace lzop
with xz
in log compression cron jobs on DSpace Test—it uses less space:
And just before that I see this:
- -Caused by: org.apache.tomcat.jdbc.pool.PoolExhaustedException: [http-bio-127.0.0.1-8443-exec-980] Timeout: Pool empty. Unable to fetch a connection in 5 seconds, none available[size:50; busy:50; idle:0; lastwait:5000].
-
Ah hah! So the pool was actually empty!
I need to increase that, let’s try to bump it up from 50 to 75
After that one client got an HTTP 499 but then the rest were HTTP 200, so I don’t know what the hell Uptime Robot saw
I notice this error quite a few times in dspace.log:
- -2018-01-02 01:21:19,137 ERROR org.dspace.app.xmlui.aspect.discovery.SidebarFacetsTransformer @ Error while searching for sidebar facets
-org.dspace.discovery.SearchServiceException: org.apache.solr.search.SyntaxError: Cannot parse 'dateIssued_keyword:[1976+TO+1979]': Encountered " "]" "] "" at line 1, column 32.
-
And there are many of these errors every day for the past month:
- -$ grep -c "Error while searching for sidebar facets" dspace.log.*
-dspace.log.2017-11-21:4
-dspace.log.2017-11-22:1
-dspace.log.2017-11-23:4
-dspace.log.2017-11-24:11
-dspace.log.2017-11-25:0
-dspace.log.2017-11-26:1
-dspace.log.2017-11-27:7
-dspace.log.2017-11-28:21
-dspace.log.2017-11-29:31
-dspace.log.2017-11-30:15
-dspace.log.2017-12-01:15
-dspace.log.2017-12-02:20
-dspace.log.2017-12-03:38
-dspace.log.2017-12-04:65
-dspace.log.2017-12-05:43
-dspace.log.2017-12-06:72
-dspace.log.2017-12-07:27
-dspace.log.2017-12-08:15
-dspace.log.2017-12-09:29
-dspace.log.2017-12-10:35
-dspace.log.2017-12-11:20
-dspace.log.2017-12-12:44
-dspace.log.2017-12-13:36
-dspace.log.2017-12-14:59
-dspace.log.2017-12-15:104
-dspace.log.2017-12-16:53
-dspace.log.2017-12-17:66
-dspace.log.2017-12-18:83
-dspace.log.2017-12-19:101
-dspace.log.2017-12-20:74
-dspace.log.2017-12-21:55
-dspace.log.2017-12-22:66
-dspace.log.2017-12-23:50
-dspace.log.2017-12-24:85
-dspace.log.2017-12-25:62
-dspace.log.2017-12-26:49
-dspace.log.2017-12-27:30
-dspace.log.2017-12-28:54
-dspace.log.2017-12-29:68
-dspace.log.2017-12-30:89
-dspace.log.2017-12-31:53
-dspace.log.2018-01-01:45
-dspace.log.2018-01-02:34
-
Danny wrote to ask for help renewing the wildcard ilri.org certificate and I advised that we should probably use Let’s Encrypt if it’s just a handful of domains
Today there have been no hits by CORE and no alerts from Linode (coincidence?)
- -# grep -c "CORE" /var/log/nginx/access.log
-0
-
Generate list of authors on CGSpace for Peter to go through and correct:
- -dspace=# \copy (select distinct text_value, count(*) as count from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 group by text_value order by count desc) to /tmp/authors.csv with csv;
-COPY 54701
+# cd /home/dspacetest.cgiar.org/log
+# ls -lh dspace.log.2015-11-18*
+-rw-rw-r-- 1 tomcat7 tomcat7 2.0M Nov 18 23:59 dspace.log.2015-11-18
+-rw-rw-r-- 1 tomcat7 tomcat7 387K Nov 18 23:59 dspace.log.2015-11-18.lzo
+-rw-rw-r-- 1 tomcat7 tomcat7 169K Nov 18 23:59 dspace.log.2015-11-18.xz
Peter emailed to point out that many items in the ILRI archive collection have multiple handles:
+http://hdl.handle.net/10568/78495||http://hdl.handle.net/10568/79336
+Last week I had increased the limit from 30 to 60, which seemed to help, but now there are many more idle connections:
+
+$ psql -c 'SELECT * from pg_stat_activity;' | grep idle | grep -c cgspace
+78
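To see who is holding those idle connections, the same `pg_stat_activity` output can be tallied per user; a sketch, assuming the username sits in the fourth `|`-separated column of psql's default table layout (column positions vary across PostgreSQL versions):

```shell
# Count idle connections per user: grep the idle rows, pull the user
# column out of psql's "|"-separated table output, then tally.
# The column index ($4) is an assumption and may need adjusting.
psql -c 'SELECT * from pg_stat_activity;' | grep idle \
  | awk -F'|' '{print $4}' | sort | uniq -c | sort -rn
```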
-
-There appears to be a pattern but I’ll have to look a bit closer and try to clean them up automatically, either in SQL or in OpenRefine
-
-Add Katherine Lutz to the groups for content submission and edit steps of the CGIAR System collections
- Read more →
-
dspace.log.2017-08-01
, they are all using the same Tomcat sessionrobots.txt
only blocks the top-level /discover
and /browse
URLs… we will need to find a way to forbid them from accessing these!X-Robots-Tag "none"
HTTP header, but this only forbids the search engine from indexing the page, not crawling it!dc.description.abstract
column, which caused OpenRefine to choke when exporting the CSVg/^$/d
-x
) plus sed
to format the output into quasi XML:Documenting day-to-day work on the CGSpace repository.
-dc.rights
to the input form, including some inline instructions/hints:Testing the CMYK patch on a collection with 650 items:
- -$ [dspace]/bin/dspace filter-media -f -i 10568/16498 -p "ImageMagick PDF Thumbnail" -v >& /tmp/filter-media-cmyk.txt
-
filter-media
bug that causes it to process JPGs even when limiting to the PDF thumbnail plugin: DS-3516filter-media
plugin creates JPG thumbnails with the CMYK colorspace when the source PDF is using CMYKInterestingly, it seems DSpace 4.x’s thumbnails were sRGB, but forcing regeneration using DSpace 5.x’s ImageMagick plugin creates CMYK JPGs if the source PDF was CMYK (see 10568⁄51999):
- -$ identify ~/Desktop/alc_contrastes_desafios.jpg
-/Users/aorth/Desktop/alc_contrastes_desafios.jpg JPEG 464x600 464x600+0+0 8-bit CMYK 168KB 0.000u 0:00.000
-
An item was mapped twice erroneously again, so I had to remove one of the mappings manually:
- -dspace=# select * from collection2item where item_id = '80278';
-id | collection_id | item_id
--------+---------------+---------
-92551 | 313 | 80278
-92550 | 313 | 80278
-90774 | 1051 | 80278
-(3 rows)
-dspace=# delete from collection2item where id = 92551 and item_id = 80278;
-DELETE 1
-
Create issue on GitHub to track the addition of CCAFS Phase II project tags (#301)
Looks like we’ll be using cg.identifier.ccafsprojectpii
as the field name
While looking in the logs for errors, I see tons of warnings about Atmire MQM:
- -2016-12-02 03:00:32,352 WARN com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=CREATE, SubjectType=BUNDLE, SubjectID=70316, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632305, dispatcher=1544803905, detail=[null], transactionID="TX157907838689377964651674089851855413607")
-2016-12-02 03:00:32,353 WARN com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=MODIFY_METADATA, SubjectType=BUNDLE, SubjectID =70316, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632309, dispatcher=1544803905, detail="dc.title", transactionID="TX157907838689377964651674089851855413607")
-2016-12-02 03:00:32,353 WARN com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=ADD, SubjectType=ITEM, SubjectID=80044, Object Type=BUNDLE, ObjectID=70316, TimeStamp=1480647632311, dispatcher=1544803905, detail="THUMBNAIL", transactionID="TX157907838689377964651674089851855413607")
-2016-12-02 03:00:32,353 WARN com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=ADD, SubjectType=BUNDLE, SubjectID=70316, Obje ctType=BITSTREAM, ObjectID=86715, TimeStamp=1480647632318, dispatcher=1544803905, detail="-1", transactionID="TX157907838689377964651674089851855413607")
-2016-12-02 03:00:32,353 WARN com.atmire.metadataquality.batchedit.BatchEditConsumer @ BatchEditConsumer should not have been given this kind of Subject in an event, skipping: org.dspace.event.Event(eventType=MODIFY, SubjectType=ITEM, SubjectID=80044, ObjectType=(Unknown), ObjectID=-1, TimeStamp=1480647632351, dispatcher=1544803905, detail=[null], transactionID="TX157907838689377964651674089851855413607")
-
I see thousands of them in the logs for the last few months, so it’s not related to the DSpace 5.5 upgrade
I’ve raised a ticket with Atmire to ask
Another worrying error from dspace.log is:
dc.type
to the output options for Atmire’s Listings and Reports module (#286)I exported a random item’s metadata as CSV, deleted all columns except id and collection, and made a new coloum called ORCID:dc.contributor.author
with the following random ORCIDs from the ORCID registry:
0000-0002-6115-0956||0000-0002-3812-8793||0000-0001-7462-405X
-
DC=ILRI
to determine whether a user was ILRI or notIt looks like we might be able to use OUs now, instead of DCs:
- -$ ldapsearch -x -H ldaps://svcgroot2.cgiarad.org:3269/ -b "dc=cgiarad,dc=org" -D "admigration1@cgiarad.org" -W "(sAMAccountName=admigration1)"
-
bower.json
because most are several versions out of datefonts
)Start working on DSpace 5.1 → 5.5 port:
- -$ git checkout -b 55new 5_x-prod
-$ git reset --hard ilri/5_x-prod
-$ git rebase -i dspace-5.5
-
dc.description.sponsorship
to Discovery sidebar facets and make investors clickable in item view (#232)I think this query should find and replace all authors that have “,” at the end of their names:
- -dspacetest=# update metadatavalue set text_value = regexp_replace(text_value, '(^.+?),$', '\1') where metadata_field_id=3 and resource_type_id=2 and text_value ~ '^.+?,$';
-UPDATE 95
-dspacetest=# select text_value from metadatavalue where metadata_field_id=3 and resource_type_id=2 and text_value ~ '^.+?,$';
-text_value
-------------
-(0 rows)
-
In this case the select query was showing 95 results before the update
Documenting day-to-day work on the CGSpace repository.
-ListSets
verb: http://ebrary.ifpri.org/oai/oai.php?verb=ListSetsdc.identifier.fund
to cg.identifier.cpwfproject
and then the rest to dc.description.sponsorship
There are 3,000 IPs accessing the REST API in a 24-hour period!
- -# awk '{print $1}' /var/log/nginx/rest.log | uniq | wc -l
-3168
-
checker
log has some errors we should pay attention to:index-lucene-update
cron job active on CGSpace, but I’m pretty sure we don’t need it as of the latest few versions of Atmire’s Listings and Reports module10568/12503
from 10568/27869
to 10568/27629
using the move_collections.sh script I wrote last year.Replace lzop
with xz
in log compression cron jobs on DSpace Test—it uses less space:
# cd /home/dspacetest.cgiar.org/log
-# ls -lh dspace.log.2015-11-18*
--rw-rw-r-- 1 tomcat7 tomcat7 2.0M Nov 18 23:59 dspace.log.2015-11-18
--rw-rw-r-- 1 tomcat7 tomcat7 387K Nov 18 23:59 dspace.log.2015-11-18.lzo
--rw-rw-r-- 1 tomcat7 tomcat7 169K Nov 18 23:59 dspace.log.2015-11-18.xz
-
Last week I had increased the limit from 30 to 60, which seemed to help, but now there are many more idle connections:
- -$ psql -c 'SELECT * from pg_stat_activity;' | grep idle | grep -c cgspace
-78
-
I don’t see anything interesting in the web server logs around that time though:
+ +# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "02/Jan/2019:0(1|2|3)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
+ 92 40.77.167.4
+ 99 210.7.29.100
+120 38.126.157.45
+177 35.237.175.180
+177 40.77.167.32
+216 66.249.75.219
+225 18.203.76.93
+261 46.101.86.248
+357 207.46.13.1
+903 54.70.40.11
+
Rough notes for importing the CGIAR Library content. It was decided that this content would go to a new top-level community called CGIAR System Organization.
@@ -415,24 +439,6 @@ COPY 54701 - -bower.json
because most are several versions out of datefonts
)Start working on DSpace 5.1 → 5.5 port:
- -$ git checkout -b 55new 5_x-prod
-$ git reset --hard ilri/5_x-prod
-$ git rebase -i dspace-5.5
-
bower.json
because most are several versions out of datefonts
)Start working on DSpace 5.1 → 5.5 port:
+ +$ git checkout -b 55new 5_x-prod
+$ git reset --hard ilri/5_x-prod
+$ git rebase -i dspace-5.5
+