diff --git a/content/posts/2020-09.md b/content/posts/2020-09.md index fbb41f5c0..34671c050 100644 --- a/content/posts/2020-09.md +++ b/content/posts/2020-09.md @@ -49,4 +49,64 @@ $ ldapsearch -x -H ldaps://AZCGNEROOT2.CGIARAD.ORG:636/ -b "dc=cgiarad,dc=org" -
   - Run all system updates on DSpace Test (linode26) and reboot it
   - After the restart LDAP login works...
+## 2020-09-03
+
+- Fix some erroneous "review status" fields that Abenet noticed on AReS
+  - I used my `fix-metadata-values.py` and `delete-metadata-values.py` scripts with the following input files:
+
+```
+$ cat 2020-09-03-fix-review-status.csv
+dc.description.version,correct
+Externally Peer Reviewed,Peer Review
+Peer Reviewed,Peer Review
+Peer review,Peer Review
+Peer reviewed,Peer Review
+Peer-Reviewed,Peer Review
+Peer-reviewed,Peer Review
+peer Review,Peer Review
+$ cat 2020-09-03-delete-review-status.csv
+dc.description.version
+Report
+Formally Published
+Poster
+Unrefereed reprint
+$ ./delete-metadata-values.py -i 2020-09-03-delete-review-status.csv -db dspace -u dspace -p 'fuuu' -f dc.description.version -m 68
+$ ./fix-metadata-values.py -i 2020-09-03-fix-review-status.csv -db dspace -u dspace -p 'fuuu' -f dc.description.version -t 'correct' -m 68
+```
+
+- Start reviewing 95 items for IITA (20201stbatch)
+  - I used my [csv-metadata-quality](https://github.com/ilri/csv-metadata-quality) tool to check and fix some low-hanging fruit first
+  - This fixed a few instances of unnecessary Unicode, excessive whitespace, invalid multi-value separators, and duplicate metadata values
+  - Then I looked at the data in OpenRefine and noticed some things:
+    - All issue dates use year only, but some have months in the citation so they could be more specific
+    - I normalized all the DOIs to use "https://doi.org" format
+    - I fixed a few AGROVOC subjects with a simple GREL: `value.replace("GRAINS","GRAIN").replace("SOILS","SOIL").replace("CORN","MAIZE")`
+    - But there are a few more that are invalid that she will have to look at
+    - I uploaded the items to [DSpace Test](https://dspacetest.cgiar.org/handle/10568/108357) and it was apparently successful, but I get these errors on the console:
+
+```
+Thu Sep 03 12:26:33 CEST 2020 | Query:containerItem:ea7a2648-180d-4fce-bdc5-c3aa2304fc58
+Error while updating
+java.lang.NullPointerException
+        at com.atmire.dspace.cua.CUASolrLoggerServiceImpl$5.visit(SourceFile:1131)
+        at com.atmire.dspace.cua.CUASolrLoggerServiceImpl.visitEachStatisticShard(SourceFile:212)
+        at com.atmire.dspace.cua.CUASolrLoggerServiceImpl.update(SourceFile:1104)
+        at com.atmire.dspace.cua.CUASolrLoggerServiceImpl.update(SourceFile:1093)
+        at org.dspace.statistics.StatisticsLoggingConsumer.consume(SourceFile:104)
+        at org.dspace.event.BasicDispatcher.consume(BasicDispatcher.java:177)
+        at org.dspace.event.BasicDispatcher.dispatch(BasicDispatcher.java:123)
+        at org.dspace.core.Context.dispatchEvents(Context.java:455)
+        at org.dspace.core.Context.commit(Context.java:424)
+        at org.dspace.core.Context.complete(Context.java:380)
+        at org.dspace.app.bulkedit.MetadataImport.main(MetadataImport.java:1399)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)
+```
+
+- There are more in the DSpace log so I will raise it with Atmire immediately
+
diff --git a/docs/2015-11/index.html b/docs/2015-11/index.html index 4d250f15f..869f9f11a 100644 --- a/docs/2015-11/index.html +++ b/docs/2015-11/index.html @@ -239,6 +239,8 @@ db.statementpool = true
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -247,8 +249,6 @@ db.statementpool = true
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2015-12/index.html b/docs/2015-12/index.html index ad1f5c627..7c046aee8 100644 --- a/docs/2015-12/index.html +++ b/docs/2015-12/index.html @@ -261,6 +261,8 @@ $ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -269,8 +271,6 @@ $ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2016-01/index.html b/docs/2016-01/index.html index 6a54ee109..f386d24a4 100644 --- a/docs/2016-01/index.html +++ b/docs/2016-01/index.html @@ -197,6 +197,8 @@ $ find SimpleArchiveForBio/ -iname “*.pdf” -exec basename {} ; | sor
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -205,8 +207,6 @@ $ find SimpleArchiveForBio/ -iname “*.pdf” -exec basename {} ; | sor
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2016-02/index.html b/docs/2016-02/index.html index d8e6d5238..9636d750c 100644 --- a/docs/2016-02/index.html +++ b/docs/2016-02/index.html @@ -375,6 +375,8 @@ Bitstream: tést señora alimentación.pdf
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -383,8 +385,6 @@ Bitstream: tést señora alimentación.pdf
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2016-03/index.html b/docs/2016-03/index.html index 24ea4d74a..9d32ee124 100644 --- a/docs/2016-03/index.html +++ b/docs/2016-03/index.html @@ -313,6 +313,8 @@ Reinstall my local (Mac OS X) DSpace stack with Tomcat 7, PostgreSQL 9.3, and Ja
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -321,8 +323,6 @@ Reinstall my local (Mac OS X) DSpace stack with Tomcat 7, PostgreSQL 9.3, and Ja
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2016-04/index.html b/docs/2016-04/index.html index 0cc9c5162..31394adbc 100644 --- a/docs/2016-04/index.html +++ b/docs/2016-04/index.html @@ -492,6 +492,8 @@ dspace.log.2016-04-27:7271
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -500,8 +502,6 @@ dspace.log.2016-04-27:7271
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2016-05/index.html b/docs/2016-05/index.html index a02cad3e1..5d8e3d26a 100644 --- a/docs/2016-05/index.html +++ b/docs/2016-05/index.html @@ -368,6 +368,8 @@ sys 0m20.540s
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -376,8 +378,6 @@ sys 0m20.540s
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2016-06/index.html b/docs/2016-06/index.html index f49ae0438..06e706e34 100644 --- a/docs/2016-06/index.html +++ b/docs/2016-06/index.html @@ -406,6 +406,8 @@ $ ./delete-metadata-values.py -f dc.contributor.corporate -i Corporate-Authors-D
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -414,8 +416,6 @@ $ ./delete-metadata-values.py -f dc.contributor.corporate -i Corporate-Authors-D
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2016-07/index.html b/docs/2016-07/index.html index be6978bdb..9de26e0fa 100644 --- a/docs/2016-07/index.html +++ b/docs/2016-07/index.html @@ -322,6 +322,8 @@ discovery.index.authority.ignore-variants=true
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -330,8 +332,6 @@ discovery.index.authority.ignore-variants=true
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2016-08/index.html b/docs/2016-08/index.html index 0f88665bd..7d2185119 100644 --- a/docs/2016-08/index.html +++ b/docs/2016-08/index.html @@ -386,6 +386,8 @@ $ JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx512m" /home/cgspace.cgiar.org/b
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -394,8 +396,6 @@ $ JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx512m" /home/cgspace.cgiar.org/b
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2016-09/index.html b/docs/2016-09/index.html index 5ee04ccd4..04934cf53 100644 --- a/docs/2016-09/index.html +++ b/docs/2016-09/index.html @@ -603,6 +603,8 @@ $ ./delete-metadata-values.py -i ilrisubjects-delete-13.csv -f cg.subject.ilri -
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -611,8 +613,6 @@ $ ./delete-metadata-values.py -i ilrisubjects-delete-13.csv -f cg.subject.ilri -
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2016-10/index.html b/docs/2016-10/index.html index 380b18b23..03c04c044 100644 --- a/docs/2016-10/index.html +++ b/docs/2016-10/index.html @@ -369,6 +369,8 @@ dspace=# update metadatavalue set text_value = regexp_replace(text_value, 'http:
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -377,8 +379,6 @@ dspace=# update metadatavalue set text_value = regexp_replace(text_value, 'http:
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2016-11/index.html b/docs/2016-11/index.html index b22249417..0142eabd6 100644 --- a/docs/2016-11/index.html +++ b/docs/2016-11/index.html @@ -545,6 +545,8 @@ org.dspace.discovery.SearchServiceException: Error executing query
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -553,8 +555,6 @@ org.dspace.discovery.SearchServiceException: Error executing query
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2016-12/index.html b/docs/2016-12/index.html index 47e9e3c62..12e60f42e 100644 --- a/docs/2016-12/index.html +++ b/docs/2016-12/index.html @@ -781,6 +781,8 @@ $ exit
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -789,8 +791,6 @@ $ exit
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2017-01/index.html b/docs/2017-01/index.html index 846e52e19..a42e75d99 100644 --- a/docs/2017-01/index.html +++ b/docs/2017-01/index.html @@ -366,6 +366,8 @@ $ gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook -dNOPAUSE -
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -374,8 +376,6 @@ $ gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook -dNOPAUSE -
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2017-02/index.html b/docs/2017-02/index.html index 0048c5683..53a568ffe 100644 --- a/docs/2017-02/index.html +++ b/docs/2017-02/index.html @@ -421,6 +421,8 @@ COPY 1968
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -429,8 +431,6 @@ COPY 1968
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2017-03/index.html b/docs/2017-03/index.html index e97ce13f2..decbc8721 100644 --- a/docs/2017-03/index.html +++ b/docs/2017-03/index.html @@ -352,6 +352,8 @@ $ ./delete-metadata-values.py -i Investors-Delete-121.csv -f dc.description.spon
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -360,8 +362,6 @@ $ ./delete-metadata-values.py -i Investors-Delete-121.csv -f dc.description.spon
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2017-04/index.html b/docs/2017-04/index.html index a2a8316fb..6cf4153ca 100644 --- a/docs/2017-04/index.html +++ b/docs/2017-04/index.html @@ -582,6 +582,8 @@ $ gem install compass -v 1.0.3
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -590,8 +592,6 @@ $ gem install compass -v 1.0.3
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2017-05/index.html b/docs/2017-05/index.html index 3200fa27e..74a7c7a36 100644 --- a/docs/2017-05/index.html +++ b/docs/2017-05/index.html @@ -388,6 +388,8 @@ UPDATE 187
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -396,8 +398,6 @@ UPDATE 187
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2017-06/index.html b/docs/2017-06/index.html index e3b601078..a89f94a6f 100644 --- a/docs/2017-06/index.html +++ b/docs/2017-06/index.html @@ -267,6 +267,8 @@ $ JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8" [dspace]/bin/dspace impo
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -275,8 +277,6 @@ $ JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8" [dspace]/bin/dspace impo
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2017-07/index.html b/docs/2017-07/index.html index 5fc6ef70b..3d2328fed 100644 --- a/docs/2017-07/index.html +++ b/docs/2017-07/index.html @@ -272,6 +272,8 @@ delete from metadatavalue where resource_type_id=2 and metadata_field_id=235 and
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -280,8 +282,6 @@ delete from metadatavalue where resource_type_id=2 and metadata_field_id=235 and
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2017-08/index.html b/docs/2017-08/index.html index 0d3d645a2..0c5613b2a 100644 --- a/docs/2017-08/index.html +++ b/docs/2017-08/index.html @@ -514,6 +514,8 @@ org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -522,8 +524,6 @@ org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2017-09/index.html b/docs/2017-09/index.html index 6f803b659..d1f689400 100644 --- a/docs/2017-09/index.html +++ b/docs/2017-09/index.html @@ -656,6 +656,8 @@ Cert Status: good
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -664,8 +666,6 @@ Cert Status: good
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2017-10/index.html b/docs/2017-10/index.html index d9bc73b91..673953119 100644 --- a/docs/2017-10/index.html +++ b/docs/2017-10/index.html @@ -440,6 +440,8 @@ session_id=6C30F10B4351A4ED83EC6ED50AFD6B6A
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -448,8 +450,6 @@ session_id=6C30F10B4351A4ED83EC6ED50AFD6B6A
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2017-11/index.html b/docs/2017-11/index.html index 379d30ab8..671cc42bc 100644 --- a/docs/2017-11/index.html +++ b/docs/2017-11/index.html @@ -941,6 +941,8 @@ $ cat dspace.log.2017-11-28 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | u
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -949,8 +951,6 @@ $ cat dspace.log.2017-11-28 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | u
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2017-12/index.html b/docs/2017-12/index.html index 3b6a24856..3348de63a 100644 --- a/docs/2017-12/index.html +++ b/docs/2017-12/index.html @@ -780,6 +780,8 @@ DELETE 20
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -788,8 +790,6 @@ DELETE 20
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2018-01/index.html b/docs/2018-01/index.html index 6fca4c871..f9682954a 100644 --- a/docs/2018-01/index.html +++ b/docs/2018-01/index.html @@ -1449,6 +1449,8 @@ Catalina:type=Manager,context=/,host=localhost activeSessions 8
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -1457,8 +1459,6 @@ Catalina:type=Manager,context=/,host=localhost activeSessions 8
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2018-02/index.html b/docs/2018-02/index.html index 733cfe719..d5a993335 100644 --- a/docs/2018-02/index.html +++ b/docs/2018-02/index.html @@ -1036,6 +1036,8 @@ UPDATE 3
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -1044,8 +1046,6 @@ UPDATE 3
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2018-03/index.html b/docs/2018-03/index.html index 2776b4a5e..3a00467bb 100644 --- a/docs/2018-03/index.html +++ b/docs/2018-03/index.html @@ -582,6 +582,8 @@ Fixed 5 occurences of: GENEBANKS
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -590,8 +592,6 @@ Fixed 5 occurences of: GENEBANKS
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2018-04/index.html b/docs/2018-04/index.html index 5003eb0a7..7d79e8f77 100644 --- a/docs/2018-04/index.html +++ b/docs/2018-04/index.html @@ -591,6 +591,8 @@ $ pg_restore -O -U dspacetest -d dspacetest -W -h localhost /tmp/dspace_2018-04-
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -599,8 +601,6 @@ $ pg_restore -O -U dspacetest -d dspacetest -W -h localhost /tmp/dspace_2018-04-
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2018-05/index.html b/docs/2018-05/index.html index b861d3ae1..8a8b9fad1 100644 --- a/docs/2018-05/index.html +++ b/docs/2018-05/index.html @@ -520,6 +520,8 @@ $ psql -h localhost -U postgres dspacetest
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -528,8 +530,6 @@ $ psql -h localhost -U postgres dspacetest
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2018-06/index.html b/docs/2018-06/index.html index 0ccbbffe6..b0f932e18 100644 --- a/docs/2018-06/index.html +++ b/docs/2018-06/index.html @@ -514,6 +514,8 @@ $ sed '/^id/d' 10568-*.csv | csvcut -c 1,2 > map-to-cifor-archive.csv
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -522,8 +524,6 @@ $ sed '/^id/d' 10568-*.csv | csvcut -c 1,2 > map-to-cifor-archive.csv
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2018-07/index.html b/docs/2018-07/index.html index b3d4d9665..faac01115 100644 --- a/docs/2018-07/index.html +++ b/docs/2018-07/index.html @@ -566,6 +566,8 @@ dspace=# select count(text_value) from metadatavalue where resource_type_id=2 an
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -574,8 +576,6 @@ dspace=# select count(text_value) from metadatavalue where resource_type_id=2 an
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2018-08/index.html b/docs/2018-08/index.html index 73cccc614..6874dcbb4 100644 --- a/docs/2018-08/index.html +++ b/docs/2018-08/index.html @@ -439,6 +439,8 @@ $ dspace database migrate ignored
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -447,8 +449,6 @@ $ dspace database migrate ignored
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2018-09/index.html b/docs/2018-09/index.html index 310aa6e45..37edc80c2 100644 --- a/docs/2018-09/index.html +++ b/docs/2018-09/index.html @@ -745,6 +745,8 @@ UPDATE metadatavalue SET text_value='ja' WHERE resource_type_id=2 AND metadata_f
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -753,8 +755,6 @@ UPDATE metadatavalue SET text_value='ja' WHERE resource_type_id=2 AND metadata_f
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2018-10/index.html b/docs/2018-10/index.html index 500f8fa97..aa9e5a48a 100644 --- a/docs/2018-10/index.html +++ b/docs/2018-10/index.html @@ -653,6 +653,8 @@ $ curl -X GET -H "Content-Type: application/json" -H "Accept: app
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -661,8 +663,6 @@ $ curl -X GET -H "Content-Type: application/json" -H "Accept: app
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2018-11/index.html b/docs/2018-11/index.html index 04087c9c2..5c8ef97a4 100644 --- a/docs/2018-11/index.html +++ b/docs/2018-11/index.html @@ -550,6 +550,8 @@ $ dspace dsrun org.dspace.eperson.Groomer -a -b 11/27/2016 -d
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -558,8 +560,6 @@ $ dspace dsrun org.dspace.eperson.Groomer -a -b 11/27/2016 -d
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2018-12/index.html b/docs/2018-12/index.html index 68b38f186..d20177ddb 100644 --- a/docs/2018-12/index.html +++ b/docs/2018-12/index.html @@ -591,6 +591,8 @@ UPDATE 1
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -599,8 +601,6 @@ UPDATE 1
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2019-01/index.html b/docs/2019-01/index.html index 3bfbb73e7..51c8a5cfa 100644 --- a/docs/2019-01/index.html +++ b/docs/2019-01/index.html @@ -1261,6 +1261,8 @@ identify: CorruptImageProfile `xmp' @ warning/profile.c/SetImageProfileInternal/
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -1269,8 +1271,6 @@ identify: CorruptImageProfile `xmp' @ warning/profile.c/SetImageProfileInternal/
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2019-02/index.html b/docs/2019-02/index.html index e45a6dbbb..7fb2df775 100644 --- a/docs/2019-02/index.html +++ b/docs/2019-02/index.html @@ -1341,6 +1341,8 @@ Please see the DSpace documentation for assistance.
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -1349,8 +1351,6 @@ Please see the DSpace documentation for assistance.
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2019-03/index.html b/docs/2019-03/index.html index b090b2e46..b1bba99b3 100644 --- a/docs/2019-03/index.html +++ b/docs/2019-03/index.html @@ -1205,6 +1205,8 @@ sys 0m2.551s
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -1213,8 +1215,6 @@ sys 0m2.551s
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2019-04/index.html b/docs/2019-04/index.html index f298b6cf2..5c0e3a830 100644 --- a/docs/2019-04/index.html +++ b/docs/2019-04/index.html @@ -1296,6 +1296,8 @@ UPDATE 14
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -1304,8 +1306,6 @@ UPDATE 14
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2019-05/index.html b/docs/2019-05/index.html index 6e8777f4f..825015acb 100644 --- a/docs/2019-05/index.html +++ b/docs/2019-05/index.html @@ -628,6 +628,8 @@ COPY 64871
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -636,8 +638,6 @@ COPY 64871
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2019-06/index.html b/docs/2019-06/index.html index 8c4c3512f..ddd96d506 100644 --- a/docs/2019-06/index.html +++ b/docs/2019-06/index.html @@ -314,6 +314,8 @@ UPDATE 2
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -322,8 +324,6 @@ UPDATE 2
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2019-07/index.html b/docs/2019-07/index.html index e258e1eea..82247ddb4 100644 --- a/docs/2019-07/index.html +++ b/docs/2019-07/index.html @@ -551,6 +551,8 @@ issn.validate('1020-3362')
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -559,8 +561,6 @@ issn.validate('1020-3362')
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2019-08/index.html b/docs/2019-08/index.html index 9b96b748c..963acfbec 100644 --- a/docs/2019-08/index.html +++ b/docs/2019-08/index.html @@ -570,6 +570,8 @@ sys 2m27.496s
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -578,8 +580,6 @@ sys 2m27.496s
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2019-09/index.html b/docs/2019-09/index.html index 9b70c8593..016b59e51 100644 --- a/docs/2019-09/index.html +++ b/docs/2019-09/index.html @@ -578,6 +578,8 @@ $ csv-metadata-quality -i /tmp/clarisa-institutions.csv -o /tmp/clarisa-institut
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -586,8 +588,6 @@ $ csv-metadata-quality -i /tmp/clarisa-institutions.csv -o /tmp/clarisa-institut
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2019-10/index.html b/docs/2019-10/index.html index f1670b728..c6b4d4ff7 100644 --- a/docs/2019-10/index.html +++ b/docs/2019-10/index.html @@ -382,6 +382,8 @@ $ dspace import -a -c 10568/104057 -e fuu@cgiar.org -m 2019-10-15-Bioversity.map
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -390,8 +392,6 @@ $ dspace import -a -c 10568/104057 -e fuu@cgiar.org -m 2019-10-15-Bioversity.map
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2019-11/index.html b/docs/2019-11/index.html index 3155f0f58..9e16bef93 100644 --- a/docs/2019-11/index.html +++ b/docs/2019-11/index.html @@ -689,6 +689,8 @@ $ tidy -xml -utf8 -iq -m -w 0 dspace/config/controlled-vocabularies/cg-creator-i
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -697,8 +699,6 @@ $ tidy -xml -utf8 -iq -m -w 0 dspace/config/controlled-vocabularies/cg-creator-i
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2019-12/index.html b/docs/2019-12/index.html index 13d88b625..1b1813fb9 100644 --- a/docs/2019-12/index.html +++ b/docs/2019-12/index.html @@ -401,6 +401,8 @@ UPDATE 1
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -409,8 +411,6 @@ UPDATE 1
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2020-01/index.html b/docs/2020-01/index.html index 9b6c153bb..de5d8f2dc 100644 --- a/docs/2020-01/index.html +++ b/docs/2020-01/index.html @@ -601,6 +601,8 @@ COPY 2900
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -609,8 +611,6 @@ COPY 2900
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2020-02/index.html b/docs/2020-02/index.html index 94bc29468..7d3feff57 100644 --- a/docs/2020-02/index.html +++ b/docs/2020-02/index.html @@ -1272,6 +1272,8 @@ Moving: 21993 into core statistics-2019
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -1280,8 +1282,6 @@ Moving: 21993 into core statistics-2019
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2020-03/index.html b/docs/2020-03/index.html index f6bc58a44..c66bc7459 100644 --- a/docs/2020-03/index.html +++ b/docs/2020-03/index.html @@ -481,6 +481,8 @@ $ tidy -xml -utf8 -iq -m -w 0 dspace/config/controlled-vocabularies/cg-creator-i
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -489,8 +491,6 @@ $ tidy -xml -utf8 -iq -m -w 0 dspace/config/controlled-vocabularies/cg-creator-i
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2020-04/index.html b/docs/2020-04/index.html index 3c8d8801b..7f2b4b8a5 100644 --- a/docs/2020-04/index.html +++ b/docs/2020-04/index.html @@ -655,6 +655,8 @@ $ psql -c 'select * from pg_stat_activity' | wc -l
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -663,8 +665,6 @@ $ psql -c 'select * from pg_stat_activity' | wc -l
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2020-05/index.html b/docs/2020-05/index.html index 0b66cb4a6..eec5ff946 100644 --- a/docs/2020-05/index.html +++ b/docs/2020-05/index.html @@ -474,6 +474,8 @@ Caused by: java.lang.NullPointerException
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -482,8 +484,6 @@ Caused by: java.lang.NullPointerException
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2020-06/index.html b/docs/2020-06/index.html index 75b1f0248..f50b3399d 100644 --- a/docs/2020-06/index.html +++ b/docs/2020-06/index.html @@ -808,6 +808,8 @@ $ csvcut -c 'id,cg.subject.ilri[],cg.subject.ilri[en_US],dc.subject[en_US]' /tmp
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -816,8 +818,6 @@ $ csvcut -c 'id,cg.subject.ilri[],cg.subject.ilri[en_US],dc.subject[en_US]' /tmp
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2020-07/index.html b/docs/2020-07/index.html index b5a726a61..9193db52c 100644 --- a/docs/2020-07/index.html +++ b/docs/2020-07/index.html @@ -1139,6 +1139,8 @@ Fixed 4 occurences of: Muloi, D.M.
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -1147,8 +1149,6 @@ Fixed 4 occurences of: Muloi, D.M.
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2020-08/index.html b/docs/2020-08/index.html index 50b1db208..f74ebcaf3 100644 --- a/docs/2020-08/index.html +++ b/docs/2020-08/index.html @@ -19,7 +19,7 @@ It is class based so I can easily add support for other vocabularies, and the te - + @@ -45,7 +45,7 @@ It is class based so I can easily add support for other vocabularies, and the te "url": "https://alanorth.github.io/cgspace-notes/2020-08/", "wordCount": "3672", "datePublished": "2020-08-02T15:35:54+03:00", - "dateModified": "2020-08-22T13:29:08+03:00", + "dateModified": "2020-09-02T13:39:11+03:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -795,6 +795,8 @@ $ grep -c added /tmp/2020-08-27-countrycodetagger.log
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -803,8 +805,6 @@ $ grep -c added /tmp/2020-08-27-countrycodetagger.log
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/2020-09/index.html b/docs/2020-09/index.html new file mode 100644 index 000000000..9c6537e4f --- /dev/null +++ b/docs/2020-09/index.html @@ -0,0 +1,320 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + September, 2020 | CGSpace Notes + + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+ + + + +
+
+

CGSpace Notes

+

Documenting day-to-day work on the CGSpace repository.

+
+
+ + + + +
+
+
+ + + + +
+
+

September, 2020

+ +
+

2020-09-02

+
    +
  • Replace Marissa van Epp with Rhys Bucknall in the CCAFS groups on CGSpace because Marissa no longer works at CCAFS
  • +
  • The AReS Explorer hasn’t updated its index since 2020-08-22 when I last forced it +
      +
    • I restarted it again now and told Moayad that the automatic indexing isn’t working
    • +
    +
  • +
  • Add Alliance of Bioversity International and CIAT to affiliations on CGSpace
  • +
  • Abenet told me that the general search text on AReS doesn’t get reset when you use the “Reset Filters” button + +
  • +
  • I filed an issue on OpenRXV to make some minor edits to the admin UI: https://github.com/ilri/OpenRXV/issues/40
  • +
+
    +
  • I ran the country code tagger on CGSpace:
  • +
+
$ time chrt -b 0 dspace curate -t countrycodetagger -i all -r - -l 500 -s object | tee /tmp/2020-09-02-countrycodetagger.log
+...
+real    2m10.516s
+user    1m43.953s
+sys     0m15.192s
+$ grep -c added /tmp/2020-09-02-countrycodetagger.log
+39
+
    +
  • I still need to create a cron job for this…
  • +
  • Sisay and Abenet said they can’t log in with LDAP on DSpace Test (DSpace 6) +
      +
    • I tried and I can’t either… but it is working on CGSpace
    • +
    • The error on DSpace 6 is:
    • +
    +
  • +
+
2020-09-02 12:03:10,666 INFO  org.dspace.authenticate.LDAPAuthentication @ anonymous:session_id=A629116488DCC467E1EA2062A2E2EFD7:ip_addr=92.220.02.201:failed_login:no DN found for user aorth
+
    +
  • I tried to query LDAP directly using the application credentials with ldapsearch and it works:
  • +
+
$ ldapsearch -x -H ldaps://AZCGNEROOT2.CGIARAD.ORG:636/ -b "dc=cgiarad,dc=org" -D "applicationaccount@cgiarad.org" -W "(sAMAccountName=me)"
+
    +
  • According to the DSpace 6 docs we need to escape commas in our LDAP parameters due to the new configuration system +
      +
    • I escaped the commas and restarted DSpace (though technically a restart shouldn’t be needed, because the new configuration system hot-reloads configs)
    • +
    • Run all system updates on DSpace Test (linode26) and reboot it
    • +
    • After the restart LDAP login works…
    • +
    +
  • +
+
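The comma escaping mentioned above can be sketched in Python. This is only an illustration, assuming the escape is a backslash placed before each comma (my reading of the DSpace 6 note, not something the log confirms). `escape_commas` is a hypothetical helper, and the sample value reuses the base DN from the `ldapsearch` command.

```python
import re


def escape_commas(value: str) -> str:
    """Put a backslash before every comma that is not already escaped."""
    # The negative lookbehind skips commas that already follow a backslash,
    # so running the helper twice leaves the value unchanged.
    return re.sub(r"(?<!\\),", r"\\,", value)


# Base DN taken from the ldapsearch command; the escaped form is what
# would be written into the configuration file under the assumption above.
base_dn = "dc=cgiarad,dc=org"
print(escape_commas(base_dn))
```

Making the helper idempotent matters here because a value may already be partly escaped after an earlier edit.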

2020-09-03

+
    +
  • Fix some erroneous “review status” fields that Abenet noticed on AReS +
      +
    • I used my fix-metadata-values.py and delete-metadata-values.py scripts with the following input files:
    • +
    +
  • +
+
$ cat 2020-09-03-fix-review-status.csv
+dc.description.version,correct
+Externally Peer Reviewed,Peer Review
+Peer Reviewed,Peer Review
+Peer review,Peer Review
+Peer reviewed,Peer Review
+Peer-Reviewed,Peer Review
+Peer-reviewed,Peer Review
+peer Review,Peer Review
+$ cat 2020-09-03-delete-review-status.csv
+dc.description.version
+Report
+Formally Published
+Poster
+Unrefereed reprint
+$ ./delete-metadata-values.py -i 2020-09-03-delete-review-status.csv -db dspace -u dspace -p 'fuuu' -f dc.description.version -m 68
+$ ./fix-metadata-values.py -i 2020-09-03-fix-review-status.csv -db dspace -u dspace -p 'fuuu' -f dc.description.version -t 'correct' -m 68
+
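The effect of the two input files can be sketched in memory with Python. This is a simplified illustration and not the code of `fix-metadata-values.py` or `delete-metadata-values.py`, which run against the database. The function names are hypothetical, and the value "Internal Review" in the sample is made up to show that unlisted values pass through.

```python
import csv
import io

# Contents of the two input files shown above.
FIX_CSV = """dc.description.version,correct
Externally Peer Reviewed,Peer Review
Peer Reviewed,Peer Review
Peer review,Peer Review
Peer reviewed,Peer Review
Peer-Reviewed,Peer Review
Peer-reviewed,Peer Review
peer Review,Peer Review
"""

DELETE_CSV = """dc.description.version
Report
Formally Published
Poster
Unrefereed reprint
"""


def load_corrections(text: str, field: str, target: str) -> dict:
    """Map each incorrect value in column `field` to its fix in column `target`."""
    return {row[field]: row[target] for row in csv.DictReader(io.StringIO(text))}


def load_deletions(text: str, field: str) -> set:
    """Collect the values in column `field` that should be removed."""
    return {row[field] for row in csv.DictReader(io.StringIO(text))}


def clean(values: list, corrections: dict, deletions: set) -> list:
    """Drop values marked for deletion, then apply exact-match corrections."""
    return [corrections.get(v, v) for v in values if v not in deletions]


corrections = load_corrections(FIX_CSV, "dc.description.version", "correct")
deletions = load_deletions(DELETE_CSV, "dc.description.version")

# Hypothetical sample values; "Internal Review" appears in neither file.
sample = ["Peer-reviewed", "Poster", "Peer Review", "Internal Review"]
print(clean(sample, corrections, deletions))
```

Matching is exact and case-sensitive, which is why the fix file lists each capitalization variant separately.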
    +
  • Start reviewing 95 items for IITA (20201stbatch) +
      +
    • I used my csv-metadata-quality tool to check and fix some low-hanging fruit first
    • +
    • This fixed a few instances of unnecessary Unicode, excessive whitespace, invalid multi-value separators, and duplicate metadata values
    • +
    • Then I looked at the data in OpenRefine and noticed some things: +
        +
      • All issue dates use year only, but some have months in the citation so they could be more specific
      • +
      • I normalized all the DOIs to use “https://doi.org” format
      • +
      • I fixed a few AGROVOC subjects with a simple GREL: value.replace("GRAINS","GRAIN").replace("SOILS","SOIL").replace("CORN","MAIZE")
      • +
      • But there are a few more that are invalid that she will have to look at
      • +
      • I uploaded the items to DSpace Test and the import was apparently successful, but I got these errors on the console:
      • +
      +
    • +
    +
  • +
+
Thu Sep 03 12:26:33 CEST 2020 | Query:containerItem:ea7a2648-180d-4fce-bdc5-c3aa2304fc58
+Error while updating
+java.lang.NullPointerException
+        at com.atmire.dspace.cua.CUASolrLoggerServiceImpl$5.visit(SourceFile:1131)
+        at com.atmire.dspace.cua.CUASolrLoggerServiceImpl.visitEachStatisticShard(SourceFile:212)
+        at com.atmire.dspace.cua.CUASolrLoggerServiceImpl.update(SourceFile:1104)
+        at com.atmire.dspace.cua.CUASolrLoggerServiceImpl.update(SourceFile:1093)
+        at org.dspace.statistics.StatisticsLoggingConsumer.consume(SourceFile:104)
+        at org.dspace.event.BasicDispatcher.consume(BasicDispatcher.java:177)
+        at org.dspace.event.BasicDispatcher.dispatch(BasicDispatcher.java:123)
+        at org.dspace.core.Context.dispatchEvents(Context.java:455)
+        at org.dspace.core.Context.commit(Context.java:424)
+        at org.dspace.core.Context.complete(Context.java:380)
+        at org.dspace.app.bulkedit.MetadataImport.main(MetadataImport.java:1399)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
+        at java.lang.reflect.Method.invoke(Method.java:498)
+        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
+        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)
+
    +
  • There are more in the DSpace log so I will raise it with Atmire immediately
  • +
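
Going back to the OpenRefine cleanup earlier in this entry, the DOI normalization could be sketched roughly as follows. This is an assumption about which input variants occur (bare DOIs, `doi:` prefixes, and `http://dx.doi.org/` style URLs); the helper name and the placeholder DOI `10.1000/xyz123` are hypothetical:

```python
import re

# Prefixes that may precede the DOI itself; all of them are stripped before
# the canonical https://doi.org/ prefix is added back.
DOI_PREFIX = re.compile(r"^(?:https?://(?:dx\.)?doi\.org/|doi:\s*)", re.IGNORECASE)


def normalize_doi(value: str) -> str:
    """Return the DOI in https://doi.org/<doi> form."""
    doi = DOI_PREFIX.sub("", value.strip())
    return f"https://doi.org/{doi}"


for raw in ["10.1000/xyz123", "doi:10.1000/xyz123", "http://dx.doi.org/10.1000/xyz123"]:
    print(normalize_doi(raw))
```

Stripping any known prefix first and then adding the canonical one keeps the function idempotent, so values that are already normalized pass through unchanged.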
diff --git a/docs/404.html b/docs/404.html
index a06f0acd8..46b5588ac 100644
--- a/docs/404.html
+++ b/docs/404.html
@@ -94,6 +94,8 @@
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -102,8 +104,6 @@
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/categories/index.html b/docs/categories/index.html
index e92308224..be85a4831 100644
--- a/docs/categories/index.html
+++ b/docs/categories/index.html
@@ -9,7 +9,7 @@
-
+
@@ -83,7 +83,7 @@

Notes

- +
Read more → @@ -107,6 +107,8 @@
    +
  1. September, 2020
  2. +
  3. August, 2020
  4. July, 2020
  5. @@ -115,8 +117,6 @@
  6. May, 2020
  7. -
  8. April, 2020
  9. -
diff --git a/docs/categories/index.xml b/docs/categories/index.xml
index cc3960f3e..8352a82c7 100644
--- a/docs/categories/index.xml
+++ b/docs/categories/index.xml
@@ -6,7 +6,7 @@
 Recent content in Categories on CGSpace Notes
 Hugo -- gohugo.io
 en-us
- Sun, 02 Aug 2020 15:35:54 +0300
+ Wed, 02 Sep 2020 15:35:54 +0300
@@ -14,7 +14,7 @@
 Notes
 https://alanorth.github.io/cgspace-notes/categories/notes/
- Sun, 02 Aug 2020 15:35:54 +0300
+ Wed, 02 Sep 2020 15:35:54 +0300
 https://alanorth.github.io/cgspace-notes/categories/notes/
diff --git a/docs/categories/notes/index.html b/docs/categories/notes/index.html
index 23293b76e..503a35f4a 100644
--- a/docs/categories/notes/index.html
+++ b/docs/categories/notes/index.html
@@ -9,7 +9,7 @@
-
+
@@ -80,6 +80,39 @@
+
+
+

September, 2020

+ +
+

2020-09-02

+
    +
  • Replace Marissa van Epp with Rhys Bucknall in the CCAFS groups on CGSpace because Marissa no longer works at CCAFS
  • +
  • The AReS Explorer hasn’t updated its index since 2020-08-22 when I last forced it +
      +
    • I restarted it again now and told Moayad that the automatic indexing isn’t working
    • +
    +
  • +
  • Add Alliance of Bioversity International and CIAT to affiliations on CGSpace
  • +
  • Abenet told me that the general search text on AReS doesn’t get reset when you use the “Reset Filters” button + +
  • +
  • I filed an issue on OpenRXV to make some minor edits to the admin UI: https://github.com/ilri/OpenRXV/issues/40
  • +
+ Read more → +
+ + + + + +

August, 2020

@@ -347,44 +380,6 @@ - -
-
-

November, 2019

- -
-

2019-11-04

-
    -
  • Peter noticed that there were 5.2 million hits on CGSpace in 2019-10 according to the Atmire usage statistics -
      -
    • I looked in the nginx logs and see 4.6 million in the access logs, and 1.2 million in the API logs:
    • -
    -
  • -
-
# zcat --force /var/log/nginx/*access.log.*.gz | grep -cE "[0-9]{1,2}/Oct/2019"
-4671942
-# zcat --force /var/log/nginx/{rest,oai,statistics}.log.*.gz | grep -cE "[0-9]{1,2}/Oct/2019"
-1277694
-
    -
  • So 4.6 million from XMLUI and another 1.2 million from API requests
  • -
  • Let’s see how many of the REST API requests were for bitstreams (because they are counted in Solr stats):
  • -
-
# zcat --force /var/log/nginx/rest.log.*.gz | grep -c -E "[0-9]{1,2}/Oct/2019"
-1183456 
-# zcat --force /var/log/nginx/rest.log.*.gz | grep -E "[0-9]{1,2}/Oct/2019" | grep -c -E "/rest/bitstreams"
-106781
-
- Read more → -
- - - - -