Add notes for 2021-10

Alan Orth 2021-10-04 19:40:13 +03:00
parent 45ae9e7820
commit 4c11bc1c1e
Signed by: alanorth
GPG Key ID: 0FB860CC9C45B1B9
111 changed files with 1402 additions and 805 deletions

110
content/posts/2021-10.md Normal file
View File

@ -0,0 +1,110 @@
---
title: "October, 2021"
date: 2021-10-01T11:14:07+03:00
author: "Alan Orth"
categories: ["Notes"]
---
## 2021-10-01
- Export all affiliations on CGSpace and run them against the latest RoR data dump:
```console
localhost/dspace63= > \COPY (SELECT DISTINCT text_value as "cg.contributor.affiliation", count(*) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id = 211 GROUP BY text_value ORDER BY count DESC) to /tmp/2021-10-01-affiliations.csv WITH CSV HEADER;
$ csvcut -c 1 /tmp/2021-10-01-affiliations.csv | sed 1d > /tmp/2021-10-01-affiliations.txt
$ ./ilri/ror-lookup.py -i /tmp/2021-10-01-affiliations.txt -r 2021-09-23-ror-data.json -o /tmp/2021-10-01-affiliations-matching.csv
$ csvgrep -c matched -m true /tmp/2021-10-01-affiliations-matching.csv | sed 1d | wc -l
1879
$ wc -l /tmp/2021-10-01-affiliations.txt
7100 /tmp/2021-10-01-affiliations.txt
```
- So we have 1879/7100 (26.46%) matching already
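The percentage is simply matched names over total names. A minimal sketch of the same check in Python, assuming the matches CSV has a `matched` column holding `true` or `false` (as the `csvgrep` call above implies). The function name and the `organization` column used when trying it out are hypothetical:

```python
import csv
import io


def match_rate(csv_text: str) -> tuple[int, int, float]:
    """Return (matched, total, percent) for a CSV with a `matched` column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    matched = sum(1 for row in rows if row["matched"].strip().lower() == "true")
    total = len(rows)
    percent = round(100 * matched / total, 2) if total else 0.0
    return matched, total, percent


# Sanity check with the numbers from the notes: 1879 of 7100
print(round(100 * 1879 / 7100, 2))  # 26.46
```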
<!--more-->
## 2021-10-03
- Dominique from IWMI asked me for information about how CGSpace partners are using CGSpace APIs to feed their websites
- Start a fresh indexing on AReS
- Udana sent me his file of 292 non-IWMI publications for the Virtual library on water management
  - He added licenses
  - I want to clean up the `dcterms.extent` field though because it has volume, issue, and pages there
  - I cloned the column several times and extracted values based on their positions, for example:
    - Volume: `value.partition(":")[0]`
    - Issue: `value.partition("(")[2].partition(")")[0]`
    - Page: `"p. " + value.replace(".", "")`
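To see what those three expressions do, here is a small sketch that wraps each one in a plain Python function. The sample values are hypothetical, since the notes do not record the exact format of the extent strings, and the function names are only for illustration:

```python
def volume_part(value: str) -> str:
    # Everything before the first colon
    return value.partition(":")[0]


def issue_part(value: str) -> str:
    # Text between the first "(" and the next ")"
    return value.partition("(")[2].partition(")")[0]


def pages_part(value: str) -> str:
    # Drop periods, then prefix the result with "p. "
    return "p. " + value.replace(".", "")


sample = "12(3):45-67"  # hypothetical extent value
print(volume_part(sample))   # 12(3)
print(issue_part(sample))    # 3
print(pages_part("45-67."))  # p. 45-67
```

Each expression was applied to its own cloned column, so an intermediate result such as `12(3)` would presumably still need a further pass before it is a clean volume number.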
## 2021-10-04
- Start looking at the last month of Solr statistics on CGSpace
- I see a number of IPs with "normal" user agents who clearly behave like bots
  - 198.15.130.18: 21,000 requests to /discover with a normal-looking user agent, from ASN 11282 (SERVERYOU, US)
  - 93.158.90.107: 8,500 requests to handle and browse links with a Firefox 84.0 user agent, from ASN 12552 (IPO-EU, SE)
  - 193.235.141.162: 4,800 requests to handle, browse, and discovery links with a Firefox 84.0 user agent, from ASN 51747 (INTERNETBOLAGET, SE)
  - 3.225.28.105: 2,900 requests to REST API for the CIAT Story Maps collection with a normal user agent, from ASN 14618 (AMAZON-AES, US)
  - 34.228.236.6: 2,800 requests to discovery for the CGIAR System community with user agent `Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)`, from ASN 14618 (AMAZON-AES, US)
  - 18.212.137.2: 2,800 requests to discovery for the CGIAR System community with user agent `Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)`, from ASN 14618 (AMAZON-AES, US)
  - 3.81.123.72: 2,800 requests to discovery and handles for the CGIAR System community with user agent `Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)`, from ASN 14618 (AMAZON-AES, US)
  - 3.227.16.188: 2,800 requests to discovery and handles for the CGIAR System community with user agent `Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)`, from ASN 14618 (AMAZON-AES, US)
- Looking closer into the requests with this Mozilla/4.0 user agent, I see 500+ IPs using it:
```console
# zcat --force /var/log/nginx/*.log* | grep 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)' | awk '{print $1}' | sort | uniq > /tmp/mozilla-4.0-ips.txt
# wc -l /tmp/mozilla-4.0-ips.txt
543 /tmp/mozilla-4.0-ips.txt
```
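The same extraction can be sketched in Python for cases where the logs need further processing. This assumes, as the `awk '{print $1}'` step does, that the client IP is the first whitespace-separated field of each log line. Unlike `zcat --force`, it decides whether a file is compressed from the `.gz` extension alone. The function names are hypothetical:

```python
import glob
import gzip

USER_AGENT = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"


def open_log(path):
    # Read gzip-compressed logs transparently and plain logs as-is
    if path.endswith(".gz"):
        return gzip.open(path, "rt", errors="replace")
    return open(path, "rt", errors="replace")


def unique_ips(pattern, user_agent=USER_AGENT):
    """Return the sorted unique client IPs whose log lines contain user_agent."""
    ips = set()
    for path in glob.glob(pattern):
        with open_log(path) as handle:
            for line in handle:
                if user_agent in line:
                    fields = line.split()
                    if fields:
                        ips.add(fields[0])
    return sorted(ips)
```

Calling `unique_ips("/var/log/nginx/*.log*")` would then mirror the shell pipeline above.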
- Then I resolved the IPs and extracted the ones belonging to Amazon:
```console
$ ./ilri/resolve-addresses-geoip2.py -i /tmp/mozilla-4.0-ips.txt -k "$ABUSEIPDB_API_KEY" -o /tmp/mozilla-4.0-ips.csv
$ csvgrep -c asn -m 14618 /tmp/mozilla-4.0-ips.csv | csvcut -c ip | sed 1d | tee /tmp/amazon-ips.txt | wc -l
```
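For reference, a rough Python equivalent of the `csvgrep | csvcut` step, assuming the resolved CSV has `ip` and `asn` columns as the command above suggests. It compares the ASN exactly, which is an assumption and may be stricter than `csvgrep -m`. The function name and the `org` column used when trying it out are hypothetical:

```python
import csv
import io


def ips_for_asn(csv_text: str, asn: str = "14618") -> list[str]:
    """Return the `ip` values of rows whose `asn` column equals asn."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["ip"] for row in reader if row["asn"].strip() == asn]
```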
- I think I will purge them all, since I have several indicators that they are bots: a mysterious user agent and IPs owned by Amazon
- Even more interesting, these requests are weighted VERY heavily on the CGIAR System community:
```console
1592 GET /handle/10947/2526
1592 GET /handle/10947/2527
1592 GET /handle/10947/34
1593 GET /handle/10947/6
1594 GET /handle/10947/1
1598 GET /handle/10947/2515
1598 GET /handle/10947/2516
1599 GET /handle/10568/101335
1599 GET /handle/10568/91688
1599 GET /handle/10947/2517
1599 GET /handle/10947/2518
1599 GET /handle/10947/2519
1599 GET /handle/10947/2708
1599 GET /handle/10947/2871
1600 GET /handle/10568/89342
1600 GET /handle/10947/4467
1607 GET /handle/10568/103816
290382 GET /handle/10568/83389
```
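The command that produced this tally is not recorded in the notes. One way to build a similar per-path count, sketched in Python under the assumption that the request line is the first double-quoted field of each log line (as in nginx's default combined format); the function name is hypothetical:

```python
from collections import Counter


def count_requests(lines):
    """Count "METHOD /path" pairs and return them sorted by ascending count."""
    counts = Counter()
    for line in lines:
        parts = line.split('"')
        if len(parts) > 1:
            # The request looks like: GET /handle/10568/83389 HTTP/1.1
            request = parts[1].split()
            if len(request) >= 2:
                counts[f"{request[0]} {request[1]}"] += 1
    # Ascending order puts the heaviest paths last, as in the listing above
    return sorted(counts.items(), key=lambda item: item[1])
```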
- Before I purge all those I will ask Samuel Stacey from the System Office, to hopefully get some insight...
- Meeting with Michael Victor, Peter, Jane, and Abenet about the future of repositories in the One CGIAR
- Meeting with Michelle from Altmetric about their new CSV upload system
  - I sent her some examples of Handles that have DOIs, but no linked score (yet) to see if an association will be created when she uploads them
```csv
doi,handle
10.1016/j.agsy.2021.103263,10568/115288
10.3389/fgene.2021.723360,10568/115287
10.3389/fpls.2021.720670,10568/115285
```
- Extract the AGROVOC subjects from IWMI's 292 publications to validate them against AGROVOC:
```console
$ csvcut -c 'dcterms.subject[en_US]' ~/Downloads/2021-10-03-non-IWMI-publications.csv | sed -e 1d -e 's/||/\n/g' -e 's/"//g' | sort -u > /tmp/agrovoc.txt
$ ./ilri/agrovoc-lookup.py -i /tmp/agrovoc.txt -o /tmp/agrovoc-matches.csv
$ csvgrep -c 'number of matches' -m '0' /tmp/agrovoc-matches.csv | csvcut -c 1 > /tmp/invalid-agrovoc.csv
```
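The first step of that pipeline splits multi-value cells on `||` and keeps the unique terms. A minimal Python sketch of the same idea, assuming the column name from the command above. Unlike the `sed` version it also trims whitespace and skips empty cells, which is a deliberate difference. The function name is hypothetical:

```python
import csv
import io


def unique_subjects(csv_text: str, column: str = "dcterms.subject[en_US]") -> list[str]:
    """Split `||`-separated subject cells and return the sorted unique terms."""
    subjects = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        for term in row[column].split("||"):
            term = term.strip()
            if term:
                subjects.add(term)
    return sorted(subjects)
```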
<!-- vim: set sw=2 ts=2: -->

View File

@ -242,6 +242,8 @@ db.statementpool = true
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -250,8 +252,6 @@ db.statementpool = true
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -264,6 +264,8 @@ $ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -272,8 +274,6 @@ $ curl -o /dev/null -s -w %{time_total}\\n https://cgspace.cgiar.org/rest/handle
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -200,6 +200,8 @@ $ find SimpleArchiveForBio/ -iname &ldquo;*.pdf&rdquo; -exec basename {} ; | sor
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -208,8 +210,6 @@ $ find SimpleArchiveForBio/ -iname &ldquo;*.pdf&rdquo; -exec basename {} ; | sor
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -378,6 +378,8 @@ Bitstream: tést señora alimentación.pdf
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -386,8 +388,6 @@ Bitstream: tést señora alimentación.pdf
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -316,6 +316,8 @@ Reinstall my local (Mac OS X) DSpace stack with Tomcat 7, PostgreSQL 9.3, and Ja
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -324,8 +326,6 @@ Reinstall my local (Mac OS X) DSpace stack with Tomcat 7, PostgreSQL 9.3, and Ja
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -495,6 +495,8 @@ dspace.log.2016-04-27:7271
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -503,8 +505,6 @@ dspace.log.2016-04-27:7271
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -371,6 +371,8 @@ sys 0m20.540s
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -379,8 +381,6 @@ sys 0m20.540s
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -409,6 +409,8 @@ $ ./delete-metadata-values.py -f dc.contributor.corporate -i Corporate-Authors-D
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -417,8 +419,6 @@ $ ./delete-metadata-values.py -f dc.contributor.corporate -i Corporate-Authors-D
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -325,6 +325,8 @@ discovery.index.authority.ignore-variants=true
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -333,8 +335,6 @@ discovery.index.authority.ignore-variants=true
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -389,6 +389,8 @@ $ JAVA_OPTS=&quot;-Dfile.encoding=UTF-8 -Xmx512m&quot; /home/cgspace.cgiar.org/b
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -397,8 +399,6 @@ $ JAVA_OPTS=&quot;-Dfile.encoding=UTF-8 -Xmx512m&quot; /home/cgspace.cgiar.org/b
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -606,6 +606,8 @@ $ ./delete-metadata-values.py -i ilrisubjects-delete-13.csv -f cg.subject.ilri -
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -614,8 +616,6 @@ $ ./delete-metadata-values.py -i ilrisubjects-delete-13.csv -f cg.subject.ilri -
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -372,6 +372,8 @@ dspace=# update metadatavalue set text_value = regexp_replace(text_value, 'http:
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -380,8 +382,6 @@ dspace=# update metadatavalue set text_value = regexp_replace(text_value, 'http:
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -548,6 +548,8 @@ org.dspace.discovery.SearchServiceException: Error executing query
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -556,8 +558,6 @@ org.dspace.discovery.SearchServiceException: Error executing query
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -784,6 +784,8 @@ $ exit
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -792,8 +794,6 @@ $ exit
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -369,6 +369,8 @@ $ gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook -dNOPAUSE -
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -377,8 +379,6 @@ $ gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook -dNOPAUSE -
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -424,6 +424,8 @@ COPY 1968
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -432,8 +434,6 @@ COPY 1968
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -355,6 +355,8 @@ $ ./delete-metadata-values.py -i Investors-Delete-121.csv -f dc.description.spon
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -363,8 +365,6 @@ $ ./delete-metadata-values.py -i Investors-Delete-121.csv -f dc.description.spon
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -585,6 +585,8 @@ $ gem install compass -v 1.0.3
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -593,8 +595,6 @@ $ gem install compass -v 1.0.3
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -391,6 +391,8 @@ UPDATE 187
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -399,8 +401,6 @@ UPDATE 187
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -270,6 +270,8 @@ $ JAVA_OPTS=&quot;-Xmx1024m -Dfile.encoding=UTF-8&quot; [dspace]/bin/dspace impo
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -278,8 +280,6 @@ $ JAVA_OPTS=&quot;-Xmx1024m -Dfile.encoding=UTF-8&quot; [dspace]/bin/dspace impo
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -275,6 +275,8 @@ delete from metadatavalue where resource_type_id=2 and metadata_field_id=235 and
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -283,8 +285,6 @@ delete from metadatavalue where resource_type_id=2 and metadata_field_id=235 and
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -517,6 +517,8 @@ org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -525,8 +527,6 @@ org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -659,6 +659,8 @@ Cert Status: good
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -667,8 +669,6 @@ Cert Status: good
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -443,6 +443,8 @@ session_id=6C30F10B4351A4ED83EC6ED50AFD6B6A
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -451,8 +453,6 @@ session_id=6C30F10B4351A4ED83EC6ED50AFD6B6A
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -944,6 +944,8 @@ $ cat dspace.log.2017-11-28 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | u
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -952,8 +954,6 @@ $ cat dspace.log.2017-11-28 | grep -o -E 'session_id=[A-Z0-9]{32}' | sort -n | u
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -783,6 +783,8 @@ DELETE 20
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -791,8 +793,6 @@ DELETE 20
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -1452,6 +1452,8 @@ Catalina:type=Manager,context=/,host=localhost activeSessions 8
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -1460,8 +1462,6 @@ Catalina:type=Manager,context=/,host=localhost activeSessions 8
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -1039,6 +1039,8 @@ UPDATE 3
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -1047,8 +1049,6 @@ UPDATE 3
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -585,6 +585,8 @@ Fixed 5 occurences of: GENEBANKS
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -593,8 +595,6 @@ Fixed 5 occurences of: GENEBANKS
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -594,6 +594,8 @@ $ pg_restore -O -U dspacetest -d dspacetest -W -h localhost /tmp/dspace_2018-04-
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -602,8 +604,6 @@ $ pg_restore -O -U dspacetest -d dspacetest -W -h localhost /tmp/dspace_2018-04-
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -523,6 +523,8 @@ $ psql -h localhost -U postgres dspacetest
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -531,8 +533,6 @@ $ psql -h localhost -U postgres dspacetest
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -517,6 +517,8 @@ $ sed '/^id/d' 10568-*.csv | csvcut -c 1,2 &gt; map-to-cifor-archive.csv
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -525,8 +527,6 @@ $ sed '/^id/d' 10568-*.csv | csvcut -c 1,2 &gt; map-to-cifor-archive.csv
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -569,6 +569,8 @@ dspace=# select count(text_value) from metadatavalue where resource_type_id=2 an
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -577,8 +579,6 @@ dspace=# select count(text_value) from metadatavalue where resource_type_id=2 an
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -442,6 +442,8 @@ $ dspace database migrate ignored
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -450,8 +452,6 @@ $ dspace database migrate ignored
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -748,6 +748,8 @@ UPDATE metadatavalue SET text_value='ja' WHERE resource_type_id=2 AND metadata_f
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -756,8 +758,6 @@ UPDATE metadatavalue SET text_value='ja' WHERE resource_type_id=2 AND metadata_f
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -656,6 +656,8 @@ $ curl -X GET -H &quot;Content-Type: application/json&quot; -H &quot;Accept: app
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -664,8 +666,6 @@ $ curl -X GET -H &quot;Content-Type: application/json&quot; -H &quot;Accept: app
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -553,6 +553,8 @@ $ dspace dsrun org.dspace.eperson.Groomer -a -b 11/27/2016 -d
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -561,8 +563,6 @@ $ dspace dsrun org.dspace.eperson.Groomer -a -b 11/27/2016 -d
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -594,6 +594,8 @@ UPDATE 1
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -602,8 +604,6 @@ UPDATE 1
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -1264,6 +1264,8 @@ identify: CorruptImageProfile `xmp' @ warning/profile.c/SetImageProfileInternal/
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -1272,8 +1274,6 @@ identify: CorruptImageProfile `xmp' @ warning/profile.c/SetImageProfileInternal/
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -1344,6 +1344,8 @@ Please see the DSpace documentation for assistance.
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -1352,8 +1354,6 @@ Please see the DSpace documentation for assistance.
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -1208,6 +1208,8 @@ sys 0m2.551s
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -1216,8 +1218,6 @@ sys 0m2.551s
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -1299,6 +1299,8 @@ UPDATE 14
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -1307,8 +1309,6 @@ UPDATE 14
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -631,6 +631,8 @@ COPY 64871
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -639,8 +641,6 @@ COPY 64871
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -317,6 +317,8 @@ UPDATE 2
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -325,8 +327,6 @@ UPDATE 2
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -554,6 +554,8 @@ issn.validate('1020-3362')
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -562,8 +564,6 @@ issn.validate('1020-3362')
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -573,6 +573,8 @@ sys 2m27.496s
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -581,8 +583,6 @@ sys 2m27.496s
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -581,6 +581,8 @@ $ csv-metadata-quality -i /tmp/clarisa-institutions.csv -o /tmp/clarisa-institut
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -589,8 +591,6 @@ $ csv-metadata-quality -i /tmp/clarisa-institutions.csv -o /tmp/clarisa-institut
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -385,6 +385,8 @@ $ dspace import -a -c 10568/104057 -e fuu@cgiar.org -m 2019-10-15-Bioversity.map
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -393,8 +395,6 @@ $ dspace import -a -c 10568/104057 -e fuu@cgiar.org -m 2019-10-15-Bioversity.map
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -692,6 +692,8 @@ $ tidy -xml -utf8 -iq -m -w 0 dspace/config/controlled-vocabularies/cg-creator-i
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -700,8 +702,6 @@ $ tidy -xml -utf8 -iq -m -w 0 dspace/config/controlled-vocabularies/cg-creator-i
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -404,6 +404,8 @@ UPDATE 1
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -412,8 +414,6 @@ UPDATE 1
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -604,6 +604,8 @@ COPY 2900
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -612,8 +614,6 @@ COPY 2900
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -1275,6 +1275,8 @@ Moving: 21993 into core statistics-2019
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -1283,8 +1285,6 @@ Moving: 21993 into core statistics-2019
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -484,6 +484,8 @@ $ tidy -xml -utf8 -iq -m -w 0 dspace/config/controlled-vocabularies/cg-creator-i
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -492,8 +494,6 @@ $ tidy -xml -utf8 -iq -m -w 0 dspace/config/controlled-vocabularies/cg-creator-i
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

View File

@ -658,6 +658,8 @@ $ psql -c 'select * from pg_stat_activity' | wc -l
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -666,8 +668,6 @@ $ psql -c 'select * from pg_stat_activity' | wc -l
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>


@ -477,6 +477,8 @@ Caused by: java.lang.NullPointerException
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -485,8 +487,6 @@ Caused by: java.lang.NullPointerException
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>


@ -811,6 +811,8 @@ $ csvcut -c 'id,cg.subject.ilri[],cg.subject.ilri[en_US],dc.subject[en_US]' /tmp
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -819,8 +821,6 @@ $ csvcut -c 'id,cg.subject.ilri[],cg.subject.ilri[en_US],dc.subject[en_US]' /tmp
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>


@ -1142,6 +1142,8 @@ Fixed 4 occurences of: Muloi, D.M.
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -1150,8 +1152,6 @@ Fixed 4 occurences of: Muloi, D.M.
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>


@ -798,6 +798,8 @@ $ grep -c added /tmp/2020-08-27-countrycodetagger.log
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -806,8 +808,6 @@ $ grep -c added /tmp/2020-08-27-countrycodetagger.log
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>


@ -717,6 +717,8 @@ solr_query_params = {
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -725,8 +727,6 @@ solr_query_params = {
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>


@ -1241,6 +1241,8 @@ $ ./delete-metadata-values.py -i 2020-10-31-delete-74-sponsors.csv -db dspace -u
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -1249,8 +1251,6 @@ $ ./delete-metadata-values.py -i 2020-10-31-delete-74-sponsors.csv -db dspace -u
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>


@ -731,6 +731,8 @@ $ ./fix-metadata-values.py -i 2020-11-30-fix-hung-orcid.csv -db dspace63 -u dspa
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -739,8 +741,6 @@ $ ./fix-metadata-values.py -i 2020-11-30-fix-hung-orcid.csv -db dspace63 -u dspa
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>


@ -869,6 +869,8 @@ $ curl -XDELETE 'http://localhost:9200/openrxv-items-2020-12-29?pretty'
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -877,8 +879,6 @@ $ curl -XDELETE 'http://localhost:9200/openrxv-items-2020-12-29?pretty'
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>


@ -688,6 +688,8 @@ java.lang.IllegalArgumentException: Invalid character found in the request targe
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -696,8 +698,6 @@ java.lang.IllegalArgumentException: Invalid character found in the request targe
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>


@ -898,6 +898,8 @@ dspace.log.2021-02-28:0
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -906,8 +908,6 @@ dspace.log.2021-02-28:0
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>


@ -875,6 +875,8 @@ COPY 3081
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -883,8 +885,6 @@ COPY 3081
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>


@ -1042,6 +1042,8 @@ $ ./ilri/doi-to-handle.py -i /tmp/dois.txt -o /tmp/handles.csv -db dspace63 -u d
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -1050,8 +1052,6 @@ $ ./ilri/doi-to-handle.py -i /tmp/dois.txt -o /tmp/handles.csv -db dspace63 -u d
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>


@ -685,6 +685,8 @@ Please see the DSpace documentation for assistance.
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -693,8 +695,6 @@ Please see the DSpace documentation for assistance.
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>


@ -693,6 +693,8 @@ COPY 1710
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -701,8 +703,6 @@ COPY 1710
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>


@ -715,6 +715,8 @@ $ cat AS* /tmp/ddos-networks-to-block.txt | sed -e '/^$/d' -e '/^#/d' -e '/^{/d'
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -723,8 +725,6 @@ $ cat AS* /tmp/ddos-networks-to-block.txt | sed -e '/^$/d' -e '/^#/d' -e '/^{/d'
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>


@ -606,6 +606,8 @@ $ ./ilri/add-orcid-identifiers-csv.py -i 2021-08-25-add-orcids.csv -db dspace -u
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -614,8 +616,6 @@ $ ./ilri/add-orcid-identifiers-csv.py -i 2021-08-25-add-orcids.csv -db dspace -u
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>


@ -26,7 +26,7 @@ The syntax Moayad showed me last month doesn&rsquo;t seem to honor the search qu
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2021-09/" />
<meta property="article:published_time" content="2021-09-01T09:14:07+03:00" />
<meta property="article:modified_time" content="2021-09-28T22:00:36+03:00" />
<meta property="article:modified_time" content="2021-10-04T11:10:54+03:00" />
@ -58,9 +58,9 @@ The syntax Moayad showed me last month doesn&rsquo;t seem to honor the search qu
"@type": "BlogPosting",
"headline": "September, 2021",
"url": "https://alanorth.github.io/cgspace-notes/2021-09/",
"wordCount": "2812",
"wordCount": "2864",
"datePublished": "2021-09-01T09:14:07+03:00",
"dateModified": "2021-09-28T22:00:36+03:00",
"dateModified": "2021-10-04T11:10:54+03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@ -223,7 +223,7 @@ $ docker-compose build
<li>Some people from the Alliance contacted me last week about AICCRA metadata
<ul>
<li>They have internal things called Components and Clusters, so they were asking how to store these in CGSpace</li>
<li>I suggested adding new metadata values: <code>cg.subject.aiccraComponent</code> and cg.subject.aiccraCluster`</li>
<li>I suggested adding new metadata values: <code>cg.subject.aiccraComponent</code> and <code>cg.subject.aiccraCluster</code></li>
<li>On second thought, these are identifiers so perhaps this is better: <code>cg.identifier.aiccraComponent</code> and <code>cg.identifier.aiccraCluster</code></li>
</ul>
</li>
@ -558,6 +558,15 @@ $ csvcut -c subject,'match type' /tmp/2021-09-29-ilri-subjects.csv | sed -e 's/m
</ul>
</li>
</ul>
<h2 id="2021-09-30">2021-09-30</h2>
<ul>
<li>Look over 292 non-IWMI publications from Udana for inclusion into the Virtual library on water management collection on CGSpace
<ul>
<li>I did some minor cleanup to remove blank columns and run it through the csv-metadata-quality tool</li>
<li>I told him to add licenses and journal volume/issue and asked Abenet for input as well</li>
</ul>
</li>
</ul>
<!-- raw HTML omitted -->
@ -579,6 +588,8 @@ $ csvcut -c subject,'match type' /tmp/2021-09-29-ilri-subjects.csv | sed -e 's/m
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -587,8 +598,6 @@ $ csvcut -c subject,'match type' /tmp/2021-09-29-ilri-subjects.csv | sed -e 's/m
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

309
docs/2021-10/index.html Normal file

@ -0,0 +1,309 @@
<!DOCTYPE html>
<html lang="en" >
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
<meta property="og:title" content="October, 2021" />
<meta property="og:description" content="2021-10-01
Export all affiliations on CGSpace and run them against the latest RoR data dump:
localhost/dspace63= &gt; \COPY (SELECT DISTINCT text_value as &quot;cg.contributor.affiliation&quot;, count(*) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id = 211 GROUP BY text_value ORDER BY count DESC) to /tmp/2021-10-01-affiliations.csv WITH CSV HEADER;
$ csvcut -c 1 /tmp/2021-10-01-affiliations.csv | sed 1d &gt; /tmp/2021-10-01-affiliations.txt
$ ./ilri/ror-lookup.py -i /tmp/2021-10-01-affiliations.txt -r 2021-09-23-ror-data.json -o /tmp/2021-10-01-affiliations-matching.csv
$ csvgrep -c matched -m true /tmp/2021-10-01-affiliations-matching.csv | sed 1d | wc -l
1879
$ wc -l /tmp/2021-10-01-affiliations.txt
7100 /tmp/2021-10-01-affiliations.txt
So we have 1879/7100 (26.46%) matching already
" />
<meta property="og:type" content="article" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2021-10/" />
<meta property="article:published_time" content="2021-10-01T11:14:07+03:00" />
<meta property="article:modified_time" content="2021-10-01T11:14:07+03:00" />
<meta name="twitter:card" content="summary"/>
<meta name="twitter:title" content="October, 2021"/>
<meta name="twitter:description" content="2021-10-01
Export all affiliations on CGSpace and run them against the latest RoR data dump:
localhost/dspace63= &gt; \COPY (SELECT DISTINCT text_value as &quot;cg.contributor.affiliation&quot;, count(*) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id = 211 GROUP BY text_value ORDER BY count DESC) to /tmp/2021-10-01-affiliations.csv WITH CSV HEADER;
$ csvcut -c 1 /tmp/2021-10-01-affiliations.csv | sed 1d &gt; /tmp/2021-10-01-affiliations.txt
$ ./ilri/ror-lookup.py -i /tmp/2021-10-01-affiliations.txt -r 2021-09-23-ror-data.json -o /tmp/2021-10-01-affiliations-matching.csv
$ csvgrep -c matched -m true /tmp/2021-10-01-affiliations-matching.csv | sed 1d | wc -l
1879
$ wc -l /tmp/2021-10-01-affiliations.txt
7100 /tmp/2021-10-01-affiliations.txt
So we have 1879/7100 (26.46%) matching already
"/>
<meta name="generator" content="Hugo 0.88.1" />
<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "BlogPosting",
"headline": "October, 2021",
"url": "https://alanorth.github.io/cgspace-notes/2021-10/",
"wordCount": "697",
"datePublished": "2021-10-01T11:14:07+03:00",
"dateModified": "2021-10-01T11:14:07+03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
},
"keywords": "Notes"
}
</script>
<link rel="canonical" href="https://alanorth.github.io/cgspace-notes/2021-10/">
<title>October, 2021 | CGSpace Notes</title>
<!-- combined, minified CSS -->
<link href="https://alanorth.github.io/cgspace-notes/css/style.beb8012edc08ba10be012f079d618dc243812267efe62e11f22fe49618f976a4.css" rel="stylesheet" integrity="sha256-vrgBLtwIuhC&#43;AS8HnWGNwkOBImfv5i4R8i/klhj5dqQ=" crossorigin="anonymous">
<!-- minified Font Awesome for SVG icons -->
<script defer src="https://alanorth.github.io/cgspace-notes/js/fontawesome.min.f5072c55a0721857184db93a50561d7dc13975b4de2e19db7f81eb5f3fa57270.js" integrity="sha256-9QcsVaByGFcYTbk6UFYdfcE5dbTeLhnbf4HrXz&#43;lcnA=" crossorigin="anonymous"></script>
<!-- RSS 2.0 feed -->
</head>
<body>
<div class="blog-masthead">
<div class="container">
<nav class="nav blog-nav">
<a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>
</nav>
</div>
</div>
<header class="blog-header">
<div class="container">
<h1 class="blog-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/" rel="home">CGSpace Notes</a></h1>
<p class="lead blog-description" dir="auto">Documenting day-to-day work on the <a href="https://cgspace.cgiar.org">CGSpace</a> repository.</p>
</div>
</header>
<div class="container">
<div class="row">
<div class="col-sm-8 blog-main">
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2021-10/">October, 2021</a></h2>
<p class="blog-post-meta">
<time datetime="2021-10-01T11:14:07+03:00">Fri Oct 01, 2021</time>
in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2021-10-01">2021-10-01</h2>
<ul>
<li>Export all affiliations on CGSpace and run them against the latest RoR data dump:</li>
</ul>
<pre tabindex="0"><code class="language-console" data-lang="console">localhost/dspace63= &gt; \COPY (SELECT DISTINCT text_value as &quot;cg.contributor.affiliation&quot;, count(*) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id = 211 GROUP BY text_value ORDER BY count DESC) to /tmp/2021-10-01-affiliations.csv WITH CSV HEADER;
$ csvcut -c 1 /tmp/2021-10-01-affiliations.csv | sed 1d &gt; /tmp/2021-10-01-affiliations.txt
$ ./ilri/ror-lookup.py -i /tmp/2021-10-01-affiliations.txt -r 2021-09-23-ror-data.json -o /tmp/2021-10-01-affiliations-matching.csv
$ csvgrep -c matched -m true /tmp/2021-10-01-affiliations-matching.csv | sed 1d | wc -l
1879
$ wc -l /tmp/2021-10-01-affiliations.txt
7100 /tmp/2021-10-01-affiliations.txt
</code></pre><ul>
<li>So we have 1879/7100 (26.46%) matching already</li>
</ul>
<h2 id="2021-10-03">2021-10-03</h2>
<ul>
<li>Dominique from IWMI asked me for information about how CGSpace partners are using CGSpace APIs to feed their websites</li>
<li>Start a fresh indexing on AReS</li>
<li>Udana sent me his file of 292 non-IWMI publications for the Virtual library on water management
<ul>
<li>He added licenses</li>
<li>I want to clean up the <code>dcterms.extent</code> field though because it has volume, issue, and pages there</li>
<li>I cloned the column several times and extracted values based on their positions, for example:
<ul>
<li>Volume: <code>value.partition(&quot;:&quot;)[0]</code></li>
<li>Issue: <code>value.partition(&quot;(&quot;)[2].partition(&quot;)&quot;)[0]</code></li>
<li>Page: <code>&quot;p. &quot; + value.replace(&quot;.&quot;, &quot;&quot;)</code></li>
</ul>
</li>
</ul>
</li>
</ul>
<h2 id="2021-10-04">2021-10-04</h2>
<ul>
<li>Start looking at the last month of Solr statistics on CGSpace
<ul>
<li>I see a number of IPs with &ldquo;normal&rdquo; user agents who clearly behave like bots
<ul>
<li>198.15.130.18: 21,000 requests to /discover with a normal-looking user agent, from ASN 11282 (SERVERYOU, US)</li>
<li>93.158.90.107: 8,500 requests to handle and browse links with a Firefox 84.0 user agent, from ASN 12552 (IPO-EU, SE)</li>
<li>193.235.141.162: 4,800 requests to handle, browse, and discovery links with a Firefox 84.0 user agent, from ASN 51747 (INTERNETBOLAGET, SE)</li>
<li>3.225.28.105: 2,900 requests to REST API for the CIAT Story Maps collection with a normal user agent, from ASN 14618 (AMAZON-AES, US)</li>
<li>34.228.236.6: 2,800 requests to discovery for the CGIAR System community with user agent <code>Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)</code>, from ASN 14618 (AMAZON-AES, US)</li>
<li>18.212.137.2: 2,800 requests to discovery for the CGIAR System community with user agent <code>Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)</code>, from ASN 14618 (AMAZON-AES, US)</li>
<li>3.81.123.72: 2,800 requests to discovery and handles for the CGIAR System community with user agent <code>Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)</code>, from ASN 14618 (AMAZON-AES, US)</li>
<li>3.227.16.188: 2,800 requests to discovery and handles for the CGIAR System community with user agent <code>Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)</code>, from ASN 14618 (AMAZON-AES, US)</li>
</ul>
</li>
<li>Looking closer into the requests with this Mozilla/4.0 user agent, I see 500+ IPs using it:</li>
</ul>
</li>
</ul>
<pre tabindex="0"><code class="language-console" data-lang="console"># zcat --force /var/log/nginx/*.log* | grep 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)' | awk '{print $1}' | sort | uniq &gt; /tmp/mozilla-4.0-ips.txt
# wc -l /tmp/mozilla-4.0-ips.txt
543 /tmp/mozilla-4.0-ips.txt
</code></pre><ul>
<li>Then I resolved the IPs and extracted the ones belonging to Amazon:</li>
</ul>
<pre tabindex="0"><code class="language-console" data-lang="console">$ ./ilri/resolve-addresses-geoip2.py -i /tmp/mozilla-4.0-ips.txt -k &quot;$ABUSEIPDB_API_KEY&quot; -o /tmp/mozilla-4.0-ips.csv
$ csvgrep -c asn -m 14618 /tmp/mozilla-4.0-ips.csv | csvcut -c ip | sed 1d | tee /tmp/amazon-ips.txt | wc -l
</code></pre><ul>
<li>I am thinking I will purge them all, as I have several indicators that they are bots: mysterious user agent, IP owned by Amazon</li>
<li>Even more interesting, these requests are weighted VERY heavily on the CGIAR System community:</li>
</ul>
<pre tabindex="0"><code class="language-console" data-lang="console"> 1592 GET /handle/10947/2526
1592 GET /handle/10947/2527
1592 GET /handle/10947/34
1593 GET /handle/10947/6
1594 GET /handle/10947/1
1598 GET /handle/10947/2515
1598 GET /handle/10947/2516
1599 GET /handle/10568/101335
1599 GET /handle/10568/91688
1599 GET /handle/10947/2517
1599 GET /handle/10947/2518
1599 GET /handle/10947/2519
1599 GET /handle/10947/2708
1599 GET /handle/10947/2871
1600 GET /handle/10568/89342
1600 GET /handle/10947/4467
1607 GET /handle/10568/103816
290382 GET /handle/10568/83389
</code></pre><ul>
<li>Before I purge all of those I will ask Samuel Stacey from the System office, hoping to get some insight&hellip;</li>
<li>Meeting with Michael Victor, Peter, Jane, and Abenet about the future of repositories in the One CGIAR</li>
<li>Meeting with Michelle from Altmetric about their new CSV upload system
<ul>
<li>I sent her some examples of Handles that have DOIs, but no linked score (yet) to see if an association will be created when she uploads them</li>
</ul>
</li>
</ul>
<pre tabindex="0"><code class="language-csv" data-lang="csv">doi,handle
10.1016/j.agsy.2021.103263,10568/115288
10.3389/fgene.2021.723360,10568/115287
10.3389/fpls.2021.720670,10568/115285
</code></pre><ul>
<li>Extract the AGROVOC subjects from IWMI&rsquo;s 292 publications to validate them against AGROVOC:</li>
</ul>
<pre tabindex="0"><code class="language-console" data-lang="console">$ csvcut -c 'dcterms.subject[en_US]' ~/Downloads/2021-10-03-non-IWMI-publications.csv | sed -e 1d -e 's/||/\n/g' -e 's/&quot;//g' | sort -u &gt; /tmp/agrovoc.txt
$ ./ilri/agrovoc-lookup.py -i /tmp/agrovoc.txt -o /tmp/agrovoc-matches.csv
$ csvgrep -c 'number of matches' -m '0' /tmp/agrovoc-matches.csv | csvcut -c 1 &gt; /tmp/invalid-agrovoc.csv
</code></pre><!-- raw HTML omitted -->
</article>
</div> <!-- /.blog-main -->
<aside class="col-sm-3 ml-auto blog-sidebar">
<section class="sidebar-module">
<h4>Recent Posts</h4>
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
<li><a href="/cgspace-notes/2021-07/">July, 2021</a></li>
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
</ol>
</section>
<section class="sidebar-module">
<h4>Links</h4>
<ol class="list-unstyled">
<li><a href="https://cgspace.cgiar.org">CGSpace</a></li>
<li><a href="https://dspacetest.cgiar.org">DSpace Test</a></li>
<li><a href="https://github.com/ilri/DSpace">CGSpace @ GitHub</a></li>
</ol>
</section>
</aside>
</div> <!-- /.row -->
</div> <!-- /.container -->
<footer class="blog-footer">
<p dir="auto">
Blog template created by <a href="https://twitter.com/mdo">@mdo</a>, ported to Hugo by <a href='https://twitter.com/mralanorth'>@mralanorth</a>.
</p>
<p>
<a href="#">Back to top</a>
</p>
</footer>
</body>
</html>


@ -95,6 +95,8 @@
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -103,8 +105,6 @@
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>


@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/" />
<meta property="og:updated_time" content="2021-09-28T22:00:36+03:00" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
@ -84,7 +84,7 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/categories/notes/">Notes</a></h2>
<p class="blog-post-meta"><time datetime="2021-09-01T09:14:07+03:00">Wed Sep 01, 2021</time> by Alan Orth</p>
<p class="blog-post-meta"><time datetime="2021-10-01T11:14:07+03:00">Fri Oct 01, 2021</time> by Alan Orth</p>
</header>
<a href='https://alanorth.github.io/cgspace-notes/categories/notes/'>Read more →</a>
@ -108,6 +108,8 @@
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -116,8 +118,6 @@
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>


@ -6,11 +6,11 @@
<description>Recent content in Categories on CGSpace Notes</description>
<generator>Hugo -- gohugo.io</generator>
<language>en-us</language>
<lastBuildDate>Wed, 01 Sep 2021 09:14:07 +0300</lastBuildDate><atom:link href="https://alanorth.github.io/cgspace-notes/categories/index.xml" rel="self" type="application/rss+xml" />
<lastBuildDate>Fri, 01 Oct 2021 11:14:07 +0300</lastBuildDate><atom:link href="https://alanorth.github.io/cgspace-notes/categories/index.xml" rel="self" type="application/rss+xml" />
<item>
<title>Notes</title>
<link>https://alanorth.github.io/cgspace-notes/categories/notes/</link>
<pubDate>Wed, 01 Sep 2021 09:14:07 +0300</pubDate>
<pubDate>Fri, 01 Oct 2021 11:14:07 +0300</pubDate>
<guid>https://alanorth.github.io/cgspace-notes/categories/notes/</guid>
<description></description>


@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2021-09-28T22:00:36+03:00" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
@ -81,6 +81,38 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2021-10/">October, 2021</a></h2>
<p class="blog-post-meta"><time datetime="2021-10-01T11:14:07+03:00">Fri Oct 01, 2021</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2021-10-01">2021-10-01</h2>
<ul>
<li>Export all affiliations on CGSpace and run them against the latest RoR data dump:</li>
</ul>
<pre tabindex="0"><code class="language-console" data-lang="console">localhost/dspace63= &gt; \COPY (SELECT DISTINCT text_value as &quot;cg.contributor.affiliation&quot;, count(*) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id = 211 GROUP BY text_value ORDER BY count DESC) to /tmp/2021-10-01-affiliations.csv WITH CSV HEADER;
$ csvcut -c 1 /tmp/2021-10-01-affiliations.csv | sed 1d &gt; /tmp/2021-10-01-affiliations.txt
$ ./ilri/ror-lookup.py -i /tmp/2021-10-01-affiliations.txt -r 2021-09-23-ror-data.json -o /tmp/2021-10-01-affiliations-matching.csv
$ csvgrep -c matched -m true /tmp/2021-10-01-affiliations-matching.csv | sed 1d | wc -l
1879
$ wc -l /tmp/2021-10-01-affiliations.txt
7100 /tmp/2021-10-01-affiliations.txt
</code></pre><ul>
<li>So we have 1879/7100 (26.46%) matching already</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2021-10/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2021-09/">September, 2021</a></h2>
@ -333,40 +365,6 @@ COPY 20994
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2021-01/">January, 2021</a></h2>
<p class="blog-post-meta"><time datetime="2021-01-03T10:13:54+02:00">Sun Jan 03, 2021</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2021-01-03">2021-01-03</h2>
<ul>
<li>Peter notified me that some filters on AReS were broken again
<ul>
<li>It&rsquo;s the same issue with the field names getting <code>.keyword</code> appended to the end that I already <a href="https://github.com/ilri/OpenRXV/issues/66">filed an issue on OpenRXV about last month</a></li>
<li>I fixed the broken filters (careful to not edit any others, lest they break too!)</li>
</ul>
</li>
<li>Fix an issue with start page number for the DSpace REST API and statistics API in OpenRXV
<ul>
<li>The start page had been &ldquo;1&rdquo; in the UI, but in the backend they were doing some gymnastics to adjust to the zero-based offset/limit/page of the DSpace REST API and the statistics API</li>
<li>I adjusted it to default to 0 and added a note to the admin screen</li>
<li>I realized that this issue was actually causing the first page of 100 statistics to be missing&hellip;</li>
<li>For example, <a href="https://cgspace.cgiar.org/handle/10568/66839">this item</a> has 51 views on CGSpace, but 0 on AReS</li>
</ul>
</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2021-01/'>Read more →</a>
</article>
<nav class="blog-pagination">
@ -391,6 +389,8 @@ COPY 20994
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -399,8 +399,6 @@ COPY 20994
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>


@ -6,7 +6,30 @@
<description>Recent content in Notes on CGSpace Notes</description>
<generator>Hugo -- gohugo.io</generator>
<language>en-us</language>
<lastBuildDate>Wed, 01 Sep 2021 09:14:07 +0300</lastBuildDate><atom:link href="https://alanorth.github.io/cgspace-notes/categories/notes/index.xml" rel="self" type="application/rss+xml" />
<lastBuildDate>Fri, 01 Oct 2021 11:14:07 +0300</lastBuildDate><atom:link href="https://alanorth.github.io/cgspace-notes/categories/notes/index.xml" rel="self" type="application/rss+xml" />
<item>
<title>October, 2021</title>
<link>https://alanorth.github.io/cgspace-notes/2021-10/</link>
<pubDate>Fri, 01 Oct 2021 11:14:07 +0300</pubDate>
<guid>https://alanorth.github.io/cgspace-notes/2021-10/</guid>
<description>&lt;h2 id=&#34;2021-10-01&#34;&gt;2021-10-01&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Export all affiliations on CGSpace and run them against the latest RoR data dump:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code class=&#34;language-console&#34; data-lang=&#34;console&#34;&gt;localhost/dspace63= &amp;gt; \COPY (SELECT DISTINCT text_value as &amp;quot;cg.contributor.affiliation&amp;quot;, count(*) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id = 211 GROUP BY text_value ORDER BY count DESC) to /tmp/2021-10-01-affiliations.csv WITH CSV HEADER;
$ csvcut -c 1 /tmp/2021-10-01-affiliations.csv | sed 1d &amp;gt; /tmp/2021-10-01-affiliations.txt
$ ./ilri/ror-lookup.py -i /tmp/2021-10-01-affiliations.txt -r 2021-09-23-ror-data.json -o /tmp/2021-10-01-affiliations-matching.csv
$ csvgrep -c matched -m true /tmp/2021-10-01-affiliations-matching.csv | sed 1d | wc -l
1879
$ wc -l /tmp/2021-10-01-affiliations.txt
7100 /tmp/2021-10-01-affiliations.txt
&lt;/code&gt;&lt;/pre&gt;&lt;ul&gt;
&lt;li&gt;So we have 1879/7100 (26.46%) matching already&lt;/li&gt;
&lt;/ul&gt;</description>
</item>
<item>
<title>September, 2021</title>
<link>https://alanorth.github.io/cgspace-notes/2021-09/</link>


@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2021-09-28T22:00:36+03:00" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
@ -81,6 +81,40 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2021-01/">January, 2021</a></h2>
<p class="blog-post-meta"><time datetime="2021-01-03T10:13:54+02:00">Sun Jan 03, 2021</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2021-01-03">2021-01-03</h2>
<ul>
<li>Peter notified me that some filters on AReS were broken again
<ul>
<li>It&rsquo;s the same issue with the field names getting <code>.keyword</code> appended to the end that I already <a href="https://github.com/ilri/OpenRXV/issues/66">filed an issue on OpenRXV about last month</a></li>
<li>I fixed the broken filters (careful to not edit any others, lest they break too!)</li>
</ul>
</li>
<li>Fix an issue with start page number for the DSpace REST API and statistics API in OpenRXV
<ul>
<li>The start page had been &ldquo;1&rdquo; in the UI, but in the backend they were doing some gymnastics to adjust to the zero-based offset/limit/page of the DSpace REST API and the statistics API</li>
<li>I adjusted it to default to 0 and added a note to the admin screen</li>
<li>I realized that this issue was actually causing the first page of 100 statistics to be missing&hellip;</li>
<li>For example, <a href="https://cgspace.cgiar.org/handle/10568/66839">this item</a> has 51 views on CGSpace, but 0 on AReS</li>
</ul>
</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2021-01/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-12/">December, 2020</a></h2>
@ -324,39 +358,6 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-04/">April, 2020</a></h2>
<p class="blog-post-meta"><time datetime="2020-04-02T10:53:24+03:00">Thu Apr 02, 2020</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2020-04-02">2020-04-02</h2>
<ul>
<li>Maria asked me to update Charles Staver&rsquo;s ORCID iD in the submission template and on CGSpace, as his name was lower case before, and now he has corrected it
<ul>
<li>I updated the fifty-eight existing items on CGSpace</li>
</ul>
</li>
<li>Looking into the items Udana had asked about last week that were missing Altmetric donuts:
<ul>
<li><a href="https://hdl.handle.net/10568/103225">The first</a> is still missing its DOI, so I added it and <a href="https://twitter.com/mralanorth/status/1245632619661766657">tweeted its handle</a> (after a few hours there was a donut with score 222)</li>
<li><a href="https://hdl.handle.net/10568/106899">The second item</a> now has a donut with score 2 since I <a href="https://twitter.com/mralanorth/status/1243158045540134913">tweeted its handle</a> last week</li>
<li><a href="https://hdl.handle.net/10568/107258">The third item</a> now has a donut with score 1 since I <a href="https://twitter.com/mralanorth/status/1243158786392625153">tweeted it</a> last week</li>
</ul>
</li>
<li>On the same note, the <a href="https://hdl.handle.net/10568/106573">one item</a> Abenet pointed out last week now has a donut with score of 104 after I <a href="https://twitter.com/mralanorth/status/1243163710241345536">tweeted it</a> last week</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2020-04/'>Read more →</a>
</article>
<nav class="blog-pagination">
<a class="btn btn-outline-primary" href="/cgspace-notes/categories/notes/" rel="prev" role="button">Previous page</a>
@ -381,6 +382,8 @@
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -389,8 +392,6 @@
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2021-09-28T22:00:36+03:00" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
@ -81,6 +81,39 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-04/">April, 2020</a></h2>
<p class="blog-post-meta"><time datetime="2020-04-02T10:53:24+03:00">Thu Apr 02, 2020</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2020-04-02">2020-04-02</h2>
<ul>
<li>Maria asked me to update Charles Staver&rsquo;s ORCID iD in the submission template and on CGSpace, as his name was lower case before, and now he has corrected it
<ul>
<li>I updated the fifty-eight existing items on CGSpace</li>
</ul>
</li>
<li>Looking into the items Udana had asked about last week that were missing Altmetric donuts:
<ul>
<li><a href="https://hdl.handle.net/10568/103225">The first</a> is still missing its DOI, so I added it and <a href="https://twitter.com/mralanorth/status/1245632619661766657">tweeted its handle</a> (after a few hours there was a donut with score 222)</li>
<li><a href="https://hdl.handle.net/10568/106899">The second item</a> now has a donut with score 2 since I <a href="https://twitter.com/mralanorth/status/1243158045540134913">tweeted its handle</a> last week</li>
<li><a href="https://hdl.handle.net/10568/107258">The third item</a> now has a donut with score 1 since I <a href="https://twitter.com/mralanorth/status/1243158786392625153">tweeted it</a> last week</li>
</ul>
</li>
<li>On the same note, the <a href="https://hdl.handle.net/10568/106573">one item</a> Abenet pointed out last week now has a donut with score of 104 after I <a href="https://twitter.com/mralanorth/status/1243163710241345536">tweeted it</a> last week</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2020-04/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-03/">March, 2020</a></h2>
@ -368,32 +401,6 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-06/">June, 2019</a></h2>
<p class="blog-post-meta"><time datetime="2019-06-02T10:57:51+03:00">Sun Jun 02, 2019</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2019-06-02">2019-06-02</h2>
<ul>
<li>Merge the <a href="https://github.com/ilri/DSpace/pull/425">Solr filterCache</a> and <a href="https://github.com/ilri/DSpace/pull/426">XMLUI ISI journal</a> changes to the <code>5_x-prod</code> branch and deploy on CGSpace</li>
<li>Run system updates on CGSpace (linode18) and reboot it</li>
</ul>
<h2 id="2019-06-03">2019-06-03</h2>
<ul>
<li>Skype with Marie-Angélique and Abenet about <a href="https://agriculturalsemantics.github.io/cg-core/cgcore.html">CG Core v2</a></li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2019-06/'>Read more →</a>
</article>
<nav class="blog-pagination">
<a class="btn btn-outline-primary" href="/cgspace-notes/categories/notes/page/2/" rel="prev" role="button">Previous page</a>
@ -418,6 +425,8 @@
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -426,8 +435,6 @@
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2021-09-28T22:00:36+03:00" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
@ -81,6 +81,32 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-06/">June, 2019</a></h2>
<p class="blog-post-meta"><time datetime="2019-06-02T10:57:51+03:00">Sun Jun 02, 2019</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2019-06-02">2019-06-02</h2>
<ul>
<li>Merge the <a href="https://github.com/ilri/DSpace/pull/425">Solr filterCache</a> and <a href="https://github.com/ilri/DSpace/pull/426">XMLUI ISI journal</a> changes to the <code>5_x-prod</code> branch and deploy on CGSpace</li>
<li>Run system updates on CGSpace (linode18) and reboot it</li>
</ul>
<h2 id="2019-06-03">2019-06-03</h2>
<ul>
<li>Skype with Marie-Angélique and Abenet about <a href="https://agriculturalsemantics.github.io/cg-core/cgcore.html">CG Core v2</a></li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2019-06/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-05/">May, 2019</a></h2>
@ -365,38 +391,6 @@ sys 0m1.979s
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2018-08/">August, 2018</a></h2>
<p class="blog-post-meta"><time datetime="2018-08-01T11:52:54+03:00">Wed Aug 01, 2018</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2018-08-01">2018-08-01</h2>
<ul>
<li>DSpace Test had crashed at some point yesterday morning and I see the following in <code>dmesg</code>:</li>
</ul>
<pre tabindex="0"><code>[Tue Jul 31 00:00:41 2018] Out of memory: Kill process 1394 (java) score 668 or sacrifice child
[Tue Jul 31 00:00:41 2018] Killed process 1394 (java) total-vm:15601860kB, anon-rss:5355528kB, file-rss:0kB, shmem-rss:0kB
[Tue Jul 31 00:00:41 2018] oom_reaper: reaped process 1394 (java), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
</code></pre><ul>
<li>Judging from the time of the crash it was probably related to the Discovery indexing that starts at midnight</li>
<li>From the DSpace log I see that eventually Solr stopped responding, so I guess the <code>java</code> process that was OOM killed above was Tomcat&rsquo;s</li>
<li>I&rsquo;m not sure why Tomcat didn&rsquo;t crash with an OutOfMemoryError&hellip;</li>
<li>Anyways, perhaps I should increase the JVM heap from 5120m to 6144m like we did a few months ago when we tried to run the whole CGSpace Solr core</li>
<li>The server only has 8GB of RAM so we&rsquo;ll eventually need to upgrade to a larger one because we&rsquo;ll start starving the OS, PostgreSQL, and command line batch processes</li>
<li>I ran all system updates on DSpace Test and rebooted it</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2018-08/'>Read more →</a>
</article>
<nav class="blog-pagination">
<a class="btn btn-outline-primary" href="/cgspace-notes/categories/notes/page/3/" rel="prev" role="button">Previous page</a>
@ -421,6 +415,8 @@ sys 0m1.979s
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -429,8 +425,6 @@ sys 0m1.979s
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2021-09-28T22:00:36+03:00" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
@ -81,6 +81,38 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2018-08/">August, 2018</a></h2>
<p class="blog-post-meta"><time datetime="2018-08-01T11:52:54+03:00">Wed Aug 01, 2018</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2018-08-01">2018-08-01</h2>
<ul>
<li>DSpace Test had crashed at some point yesterday morning and I see the following in <code>dmesg</code>:</li>
</ul>
<pre tabindex="0"><code>[Tue Jul 31 00:00:41 2018] Out of memory: Kill process 1394 (java) score 668 or sacrifice child
[Tue Jul 31 00:00:41 2018] Killed process 1394 (java) total-vm:15601860kB, anon-rss:5355528kB, file-rss:0kB, shmem-rss:0kB
[Tue Jul 31 00:00:41 2018] oom_reaper: reaped process 1394 (java), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
</code></pre><ul>
<li>Judging from the time of the crash it was probably related to the Discovery indexing that starts at midnight</li>
<li>From the DSpace log I see that eventually Solr stopped responding, so I guess the <code>java</code> process that was OOM killed above was Tomcat&rsquo;s</li>
<li>I&rsquo;m not sure why Tomcat didn&rsquo;t crash with an OutOfMemoryError&hellip;</li>
<li>Anyways, perhaps I should increase the JVM heap from 5120m to 6144m like we did a few months ago when we tried to run the whole CGSpace Solr core</li>
<li>The server only has 8GB of RAM so we&rsquo;ll eventually need to upgrade to a larger one because we&rsquo;ll start starving the OS, PostgreSQL, and command line batch processes</li>
<li>I ran all system updates on DSpace Test and rebooted it</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2018-08/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2018-07/">July, 2018</a></h2>
@ -381,32 +413,6 @@ COPY 54701
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2017-10/">October, 2017</a></h2>
<p class="blog-post-meta"><time datetime="2017-10-01T08:07:54+03:00">Sun Oct 01, 2017</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2017-10-01">2017-10-01</h2>
<ul>
<li>Peter emailed to point out that many items in the <a href="https://cgspace.cgiar.org/handle/10568/2703">ILRI archive collection</a> have multiple handles:</li>
</ul>
<pre tabindex="0"><code>http://hdl.handle.net/10568/78495||http://hdl.handle.net/10568/79336
</code></pre><ul>
<li>There appears to be a pattern but I&rsquo;ll have to look a bit closer and try to clean them up automatically, either in SQL or in OpenRefine</li>
<li>Add Katherine Lutz to the groups for content submission and edit steps of the CGIAR System collections</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2017-10/'>Read more →</a>
</article>
<nav class="blog-pagination">
<a class="btn btn-outline-primary" href="/cgspace-notes/categories/notes/page/4/" rel="prev" role="button">Previous page</a>
@ -431,6 +437,8 @@ COPY 54701
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -439,8 +447,6 @@ COPY 54701
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
<meta property="og:updated_time" content="2021-09-28T22:00:36+03:00" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
@ -81,6 +81,32 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2017-10/">October, 2017</a></h2>
<p class="blog-post-meta"><time datetime="2017-10-01T08:07:54+03:00">Sun Oct 01, 2017</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2017-10-01">2017-10-01</h2>
<ul>
<li>Peter emailed to point out that many items in the <a href="https://cgspace.cgiar.org/handle/10568/2703">ILRI archive collection</a> have multiple handles:</li>
</ul>
<pre tabindex="0"><code>http://hdl.handle.net/10568/78495||http://hdl.handle.net/10568/79336
</code></pre><ul>
<li>There appears to be a pattern but I&rsquo;ll have to look a bit closer and try to clean them up automatically, either in SQL or in OpenRefine</li>
<li>Add Katherine Lutz to the groups for content submission and edit steps of the CGIAR System collections</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2017-10/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/cgiar-library-migration/">CGIAR Library Migration</a></h2>
@ -124,6 +150,8 @@
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -132,8 +160,6 @@
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

@ -282,6 +282,8 @@ dspace=# select setval('handle_seq',86873);
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -290,8 +292,6 @@ dspace=# select setval('handle_seq',86873);
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

@ -467,6 +467,8 @@
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -475,8 +477,6 @@
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

@ -471,6 +471,8 @@ $ chrt -b 0 dspace solr-upgrade-statistics-6x -n 2500000 -i statistics-2018
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -479,8 +481,6 @@ $ chrt -b 0 dspace solr-upgrade-statistics-6x -n 2500000 -i statistics-2018
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2021-09-28T22:00:36+03:00" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
@ -31,7 +31,7 @@
"@type": "Person",
"name": "Alan Orth"
},
"dateModified": "2021-09-01T09:14:07+03:00",
"dateModified": "2021-10-01T11:14:07+03:00",
"keywords": "notes, migration, notes",
"description":"Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."
}
@ -96,6 +96,38 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2021-10/">October, 2021</a></h2>
<p class="blog-post-meta"><time datetime="2021-10-01T11:14:07+03:00">Fri Oct 01, 2021</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2021-10-01">2021-10-01</h2>
<ul>
<li>Export all affiliations on CGSpace and run them against the latest RoR data dump:</li>
</ul>
<pre tabindex="0"><code class="language-console" data-lang="console">localhost/dspace63= &gt; \COPY (SELECT DISTINCT text_value as &quot;cg.contributor.affiliation&quot;, count(*) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id = 211 GROUP BY text_value ORDER BY count DESC) to /tmp/2021-10-01-affiliations.csv WITH CSV HEADER;
$ csvcut -c 1 /tmp/2021-10-01-affiliations.csv | sed 1d &gt; /tmp/2021-10-01-affiliations.txt
$ ./ilri/ror-lookup.py -i /tmp/2021-10-01-affiliations.txt -r 2021-09-23-ror-data.json -o /tmp/2021-10-01-affiliations-matching.csv
$ csvgrep -c matched -m true /tmp/2021-10-01-affiliations-matching.csv | sed 1d | wc -l
1879
$ wc -l /tmp/2021-10-01-affiliations.txt
7100 /tmp/2021-10-01-affiliations.txt
</code></pre><ul>
<li>So we have 1879/7100 (26.46%) matching already</li>
</ul>
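<p>As a quick sanity check on that percentage, here is a minimal Python sketch. The <code>match_rate</code> helper is a hypothetical name used only for illustration (it is not part of <code>ror-lookup.py</code>), and the two counts are copied from the command output above:</p>

```python
def match_rate(matched: int, total: int) -> float:
    """Return the share of matched affiliations as a percentage, rounded to two places."""
    if total <= 0:
        raise ValueError("total must be a positive number of affiliations")
    return round(matched / total * 100, 2)


# Counts taken from the csvgrep and wc output in the notes
matched = 1879
total = 7100

print(f"{matched}/{total} ({match_rate(matched, total):.2f}%) matching")
```

<p>Re-running the same calculation after a newer ROR data dump would give a rough measure of how much the match rate moves.</p>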
<a href='https://alanorth.github.io/cgspace-notes/2021-10/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2021-09/">September, 2021</a></h2>
@ -348,40 +380,6 @@ COPY 20994
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2021-01/">January, 2021</a></h2>
<p class="blog-post-meta"><time datetime="2021-01-03T10:13:54+02:00">Sun Jan 03, 2021</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2021-01-03">2021-01-03</h2>
<ul>
<li>Peter notified me that some filters on AReS were broken again
<ul>
<li>It&rsquo;s the same issue with the field names getting <code>.keyword</code> appended to the end that I already <a href="https://github.com/ilri/OpenRXV/issues/66">filed an issue on OpenRXV about last month</a></li>
<li>I fixed the broken filters (careful to not edit any others, lest they break too!)</li>
</ul>
</li>
<li>Fix an issue with start page number for the DSpace REST API and statistics API in OpenRXV
<ul>
<li>The start page had been &ldquo;1&rdquo; in the UI, but in the backend they were doing some gymnastics to adjust to the zero-based offset/limit/page of the DSpace REST API and the statistics API</li>
<li>I adjusted it to default to 0 and added a note to the admin screen</li>
<li>I realized that this issue was actually causing the first page of 100 statistics to be missing&hellip;</li>
<li>For example, <a href="https://cgspace.cgiar.org/handle/10568/66839">this item</a> has 51 views on CGSpace, but 0 on AReS</li>
</ul>
</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2021-01/'>Read more →</a>
</article>
<nav class="blog-pagination">
@ -406,6 +404,8 @@ COPY 20994
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -414,8 +414,6 @@ COPY 20994
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

@ -6,7 +6,30 @@
<description>Recent content on CGSpace Notes</description>
<generator>Hugo -- gohugo.io</generator>
<language>en-us</language>
<lastBuildDate>Wed, 01 Sep 2021 09:14:07 +0300</lastBuildDate><atom:link href="https://alanorth.github.io/cgspace-notes/index.xml" rel="self" type="application/rss+xml" />
<lastBuildDate>Fri, 01 Oct 2021 11:14:07 +0300</lastBuildDate><atom:link href="https://alanorth.github.io/cgspace-notes/index.xml" rel="self" type="application/rss+xml" />
<item>
<title>October, 2021</title>
<link>https://alanorth.github.io/cgspace-notes/2021-10/</link>
<pubDate>Fri, 01 Oct 2021 11:14:07 +0300</pubDate>
<guid>https://alanorth.github.io/cgspace-notes/2021-10/</guid>
<description>&lt;h2 id=&#34;2021-10-01&#34;&gt;2021-10-01&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Export all affiliations on CGSpace and run them against the latest RoR data dump:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code class=&#34;language-console&#34; data-lang=&#34;console&#34;&gt;localhost/dspace63= &amp;gt; \COPY (SELECT DISTINCT text_value as &amp;quot;cg.contributor.affiliation&amp;quot;, count(*) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id = 211 GROUP BY text_value ORDER BY count DESC) to /tmp/2021-10-01-affiliations.csv WITH CSV HEADER;
$ csvcut -c 1 /tmp/2021-10-01-affiliations.csv | sed 1d &amp;gt; /tmp/2021-10-01-affiliations.txt
$ ./ilri/ror-lookup.py -i /tmp/2021-10-01-affiliations.txt -r 2021-09-23-ror-data.json -o /tmp/2021-10-01-affiliations-matching.csv
$ csvgrep -c matched -m true /tmp/2021-10-01-affiliations-matching.csv | sed 1d | wc -l
1879
$ wc -l /tmp/2021-10-01-affiliations.txt
7100 /tmp/2021-10-01-affiliations.txt
&lt;/code&gt;&lt;/pre&gt;&lt;ul&gt;
&lt;li&gt;So we have 1879/7100 (26.46%) matching already&lt;/li&gt;
&lt;/ul&gt;</description>
</item>
<item>
<title>September, 2021</title>
<link>https://alanorth.github.io/cgspace-notes/2021-09/</link>

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2021-09-28T22:00:36+03:00" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
@ -31,7 +31,7 @@
"@type": "Person",
"name": "Alan Orth"
},
"dateModified": "2021-09-01T09:14:07+03:00",
"dateModified": "2021-10-01T11:14:07+03:00",
"keywords": "notes, migration, notes",
"description":"Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."
}
@ -96,6 +96,40 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2021-01/">January, 2021</a></h2>
<p class="blog-post-meta"><time datetime="2021-01-03T10:13:54+02:00">Sun Jan 03, 2021</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2021-01-03">2021-01-03</h2>
<ul>
<li>Peter notified me that some filters on AReS were broken again
<ul>
<li>It&rsquo;s the same issue with the field names getting <code>.keyword</code> appended to the end that I already <a href="https://github.com/ilri/OpenRXV/issues/66">filed an issue on OpenRXV about last month</a></li>
<li>I fixed the broken filters (careful to not edit any others, lest they break too!)</li>
</ul>
</li>
<li>Fix an issue with start page number for the DSpace REST API and statistics API in OpenRXV
<ul>
<li>The start page had been &ldquo;1&rdquo; in the UI, but in the backend they were doing some gymnastics to adjust to the zero-based offset/limit/page of the DSpace REST API and the statistics API</li>
<li>I adjusted it to default to 0 and added a note to the admin screen</li>
<li>I realized that this issue was actually causing the first page of 100 statistics to be missing&hellip;</li>
<li>For example, <a href="https://cgspace.cgiar.org/handle/10568/66839">this item</a> has 51 views on CGSpace, but 0 on AReS</li>
</ul>
</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2021-01/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-12/">December, 2020</a></h2>
@ -339,39 +373,6 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-04/">April, 2020</a></h2>
<p class="blog-post-meta"><time datetime="2020-04-02T10:53:24+03:00">Thu Apr 02, 2020</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2020-04-02">2020-04-02</h2>
<ul>
<li>Maria asked me to update Charles Staver&rsquo;s ORCID iD in the submission template and on CGSpace, as his name was lower case before, and now he has corrected it
<ul>
<li>I updated the fifty-eight existing items on CGSpace</li>
</ul>
</li>
<li>Looking into the items Udana had asked about last week that were missing Altmetric donuts:
<ul>
<li><a href="https://hdl.handle.net/10568/103225">The first</a> is still missing its DOI, so I added it and <a href="https://twitter.com/mralanorth/status/1245632619661766657">tweeted its handle</a> (after a few hours there was a donut with score 222)</li>
<li><a href="https://hdl.handle.net/10568/106899">The second item</a> now has a donut with score 2 since I <a href="https://twitter.com/mralanorth/status/1243158045540134913">tweeted its handle</a> last week</li>
<li><a href="https://hdl.handle.net/10568/107258">The third item</a> now has a donut with score 1 since I <a href="https://twitter.com/mralanorth/status/1243158786392625153">tweeted it</a> last week</li>
</ul>
</li>
<li>On the same note, the <a href="https://hdl.handle.net/10568/106573">one item</a> Abenet pointed out last week now has a donut with score of 104 after I <a href="https://twitter.com/mralanorth/status/1243163710241345536">tweeted it</a> last week</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2020-04/'>Read more →</a>
</article>
<nav class="blog-pagination">
<a class="btn btn-outline-primary" href="/cgspace-notes/" rel="prev" role="button">Previous page</a>
@ -396,6 +397,8 @@
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -404,8 +407,6 @@
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2021-09-28T22:00:36+03:00" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
@ -31,7 +31,7 @@
"@type": "Person",
"name": "Alan Orth"
},
"dateModified": "2021-09-01T09:14:07+03:00",
"dateModified": "2021-10-01T11:14:07+03:00",
"keywords": "notes, migration, notes",
"description":"Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."
}
@ -96,6 +96,39 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-04/">April, 2020</a></h2>
<p class="blog-post-meta"><time datetime="2020-04-02T10:53:24+03:00">Thu Apr 02, 2020</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2020-04-02">2020-04-02</h2>
<ul>
<li>Maria asked me to update Charles Staver&rsquo;s ORCID iD in the submission template and on CGSpace, as his name was lower case before, and now he has corrected it
<ul>
<li>I updated the fifty-eight existing items on CGSpace</li>
</ul>
</li>
<li>Looking into the items Udana had asked about last week that were missing Altmetric donuts:
<ul>
<li><a href="https://hdl.handle.net/10568/103225">The first</a> is still missing its DOI, so I added it and <a href="https://twitter.com/mralanorth/status/1245632619661766657">tweeted its handle</a> (after a few hours there was a donut with score 222)</li>
<li><a href="https://hdl.handle.net/10568/106899">The second item</a> now has a donut with score 2 since I <a href="https://twitter.com/mralanorth/status/1243158045540134913">tweeted its handle</a> last week</li>
<li><a href="https://hdl.handle.net/10568/107258">The third item</a> now has a donut with score 1 since I <a href="https://twitter.com/mralanorth/status/1243158786392625153">tweeted it</a> last week</li>
</ul>
</li>
<li>On the same note, the <a href="https://hdl.handle.net/10568/106573">one item</a> Abenet pointed out last week now has a donut with score of 104 after I <a href="https://twitter.com/mralanorth/status/1243163710241345536">tweeted it</a> last week</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2020-04/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-03/">March, 2020</a></h2>
@ -383,32 +416,6 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-06/">June, 2019</a></h2>
<p class="blog-post-meta"><time datetime="2019-06-02T10:57:51+03:00">Sun Jun 02, 2019</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2019-06-02">2019-06-02</h2>
<ul>
<li>Merge the <a href="https://github.com/ilri/DSpace/pull/425">Solr filterCache</a> and <a href="https://github.com/ilri/DSpace/pull/426">XMLUI ISI journal</a> changes to the <code>5_x-prod</code> branch and deploy on CGSpace</li>
<li>Run system updates on CGSpace (linode18) and reboot it</li>
</ul>
<h2 id="2019-06-03">2019-06-03</h2>
<ul>
<li>Skype with Marie-Angélique and Abenet about <a href="https://agriculturalsemantics.github.io/cg-core/cgcore.html">CG Core v2</a></li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2019-06/'>Read more →</a>
</article>
<nav class="blog-pagination">
<a class="btn btn-outline-primary" href="/cgspace-notes/page/2/" rel="prev" role="button">Previous page</a>
@ -433,6 +440,8 @@
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -441,8 +450,6 @@
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>


@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2021-09-28T22:00:36+03:00" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
@ -31,7 +31,7 @@
"@type": "Person",
"name": "Alan Orth"
},
"dateModified": "2021-09-01T09:14:07+03:00",
"dateModified": "2021-10-01T11:14:07+03:00",
"keywords": "notes, migration, notes",
"description":"Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."
}
@ -96,6 +96,32 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-06/">June, 2019</a></h2>
<p class="blog-post-meta"><time datetime="2019-06-02T10:57:51+03:00">Sun Jun 02, 2019</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2019-06-02">2019-06-02</h2>
<ul>
<li>Merge the <a href="https://github.com/ilri/DSpace/pull/425">Solr filterCache</a> and <a href="https://github.com/ilri/DSpace/pull/426">XMLUI ISI journal</a> changes to the <code>5_x-prod</code> branch and deploy on CGSpace</li>
<li>Run system updates on CGSpace (linode18) and reboot it</li>
</ul>
<h2 id="2019-06-03">2019-06-03</h2>
<ul>
<li>Skype with Marie-Angélique and Abenet about <a href="https://agriculturalsemantics.github.io/cg-core/cgcore.html">CG Core v2</a></li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2019-06/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-05/">May, 2019</a></h2>
@ -380,38 +406,6 @@ sys 0m1.979s
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2018-08/">August, 2018</a></h2>
<p class="blog-post-meta"><time datetime="2018-08-01T11:52:54+03:00">Wed Aug 01, 2018</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2018-08-01">2018-08-01</h2>
<ul>
<li>DSpace Test had crashed at some point yesterday morning and I see the following in <code>dmesg</code>:</li>
</ul>
<pre tabindex="0"><code>[Tue Jul 31 00:00:41 2018] Out of memory: Kill process 1394 (java) score 668 or sacrifice child
[Tue Jul 31 00:00:41 2018] Killed process 1394 (java) total-vm:15601860kB, anon-rss:5355528kB, file-rss:0kB, shmem-rss:0kB
[Tue Jul 31 00:00:41 2018] oom_reaper: reaped process 1394 (java), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
</code></pre><ul>
<li>Judging from the time of the crash it was probably related to the Discovery indexing that starts at midnight</li>
<li>From the DSpace log I see that eventually Solr stopped responding, so I guess the <code>java</code> process that was OOM killed above was Tomcat&rsquo;s</li>
<li>I&rsquo;m not sure why Tomcat didn&rsquo;t crash with an OutOfMemoryError&hellip;</li>
<li>Anyways, perhaps I should increase the JVM heap from 5120m to 6144m like we did a few months ago when we tried to run the whole CGSpace Solr core</li>
<li>The server only has 8GB of RAM so we&rsquo;ll eventually need to upgrade to a larger one because we&rsquo;ll start starving the OS, PostgreSQL, and command line batch processes</li>
<li>I ran all system updates on DSpace Test and rebooted it</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2018-08/'>Read more →</a>
</article>
<nav class="blog-pagination">
<a class="btn btn-outline-primary" href="/cgspace-notes/page/3/" rel="prev" role="button">Previous page</a>
@ -436,6 +430,8 @@ sys 0m1.979s
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -444,8 +440,6 @@ sys 0m1.979s
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>


@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2021-09-28T22:00:36+03:00" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
@ -31,7 +31,7 @@
"@type": "Person",
"name": "Alan Orth"
},
"dateModified": "2021-09-01T09:14:07+03:00",
"dateModified": "2021-10-01T11:14:07+03:00",
"keywords": "notes, migration, notes",
"description":"Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."
}
@ -96,6 +96,38 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2018-08/">August, 2018</a></h2>
<p class="blog-post-meta"><time datetime="2018-08-01T11:52:54+03:00">Wed Aug 01, 2018</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2018-08-01">2018-08-01</h2>
<ul>
<li>DSpace Test had crashed at some point yesterday morning and I see the following in <code>dmesg</code>:</li>
</ul>
<pre tabindex="0"><code>[Tue Jul 31 00:00:41 2018] Out of memory: Kill process 1394 (java) score 668 or sacrifice child
[Tue Jul 31 00:00:41 2018] Killed process 1394 (java) total-vm:15601860kB, anon-rss:5355528kB, file-rss:0kB, shmem-rss:0kB
[Tue Jul 31 00:00:41 2018] oom_reaper: reaped process 1394 (java), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
</code></pre><ul>
<li>Judging from the time of the crash it was probably related to the Discovery indexing that starts at midnight</li>
<li>From the DSpace log I see that eventually Solr stopped responding, so I guess the <code>java</code> process that was OOM killed above was Tomcat&rsquo;s</li>
<li>I&rsquo;m not sure why Tomcat didn&rsquo;t crash with an OutOfMemoryError&hellip;</li>
<li>Anyways, perhaps I should increase the JVM heap from 5120m to 6144m like we did a few months ago when we tried to run the whole CGSpace Solr core</li>
<li>The server only has 8GB of RAM so we&rsquo;ll eventually need to upgrade to a larger one because we&rsquo;ll start starving the OS, PostgreSQL, and command line batch processes</li>
<li>I ran all system updates on DSpace Test and rebooted it</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2018-08/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2018-07/">July, 2018</a></h2>
@ -396,32 +428,6 @@ COPY 54701
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2017-10/">October, 2017</a></h2>
<p class="blog-post-meta"><time datetime="2017-10-01T08:07:54+03:00">Sun Oct 01, 2017</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2017-10-01">2017-10-01</h2>
<ul>
<li>Peter emailed to point out that many items in the <a href="https://cgspace.cgiar.org/handle/10568/2703">ILRI archive collection</a> have multiple handles:</li>
</ul>
<pre tabindex="0"><code>http://hdl.handle.net/10568/78495||http://hdl.handle.net/10568/79336
</code></pre><ul>
<li>There appears to be a pattern but I&rsquo;ll have to look a bit closer and try to clean them up automatically, either in SQL or in OpenRefine</li>
<li>Add Katherine Lutz to the groups for content submission and edit steps of the CGIAR System collections</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2017-10/'>Read more →</a>
</article>
<nav class="blog-pagination">
<a class="btn btn-outline-primary" href="/cgspace-notes/page/4/" rel="prev" role="button">Previous page</a>
@ -446,6 +452,8 @@ COPY 54701
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -454,8 +462,6 @@ COPY 54701
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>


@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2021-09-28T22:00:36+03:00" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
@ -31,7 +31,7 @@
"@type": "Person",
"name": "Alan Orth"
},
"dateModified": "2021-09-01T09:14:07+03:00",
"dateModified": "2021-10-01T11:14:07+03:00",
"keywords": "notes, migration, notes",
"description":"Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."
}
@ -96,6 +96,32 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2017-10/">October, 2017</a></h2>
<p class="blog-post-meta"><time datetime="2017-10-01T08:07:54+03:00">Sun Oct 01, 2017</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2017-10-01">2017-10-01</h2>
<ul>
<li>Peter emailed to point out that many items in the <a href="https://cgspace.cgiar.org/handle/10568/2703">ILRI archive collection</a> have multiple handles:</li>
</ul>
<pre tabindex="0"><code>http://hdl.handle.net/10568/78495||http://hdl.handle.net/10568/79336
</code></pre><ul>
<li>There appears to be a pattern but I&rsquo;ll have to look a bit closer and try to clean them up automatically, either in SQL or in OpenRefine</li>
<li>Add Katherine Lutz to the groups for content submission and edit steps of the CGIAR System collections</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2017-10/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/cgiar-library-migration/">CGIAR Library Migration</a></h2>
@ -341,29 +367,6 @@ DELETE 1
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2017-01/">January, 2017</a></h2>
<p class="blog-post-meta"><time datetime="2017-01-02T10:43:00+03:00">Mon Jan 02, 2017</time> by Alan Orth in
<span class="fas fa-tag" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/tags/notes/" rel="tag">Notes</a>
</p>
</header>
<h2 id="2017-01-02">2017-01-02</h2>
<ul>
<li>I checked to see if the Solr sharding task that is supposed to run on January 1st had run and saw there was an error</li>
<li>I tested on DSpace Test as well and it doesn&rsquo;t work there either</li>
<li>I asked on the dspace-tech mailing list because it seems to be broken, and actually now I&rsquo;m not sure if we&rsquo;ve ever had the sharding task run successfully over all these years</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2017-01/'>Read more →</a>
</article>
<nav class="blog-pagination">
<a class="btn btn-outline-primary" href="/cgspace-notes/page/5/" rel="prev" role="button">Previous page</a>
@ -388,6 +391,8 @@ DELETE 1
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -396,8 +401,6 @@ DELETE 1
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>


@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2021-09-28T22:00:36+03:00" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
@ -31,7 +31,7 @@
"@type": "Person",
"name": "Alan Orth"
},
"dateModified": "2021-09-01T09:14:07+03:00",
"dateModified": "2021-10-01T11:14:07+03:00",
"keywords": "notes, migration, notes",
"description":"Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."
}
@ -96,6 +96,29 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2017-01/">January, 2017</a></h2>
<p class="blog-post-meta"><time datetime="2017-01-02T10:43:00+03:00">Mon Jan 02, 2017</time> by Alan Orth in
<span class="fas fa-tag" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/tags/notes/" rel="tag">Notes</a>
</p>
</header>
<h2 id="2017-01-02">2017-01-02</h2>
<ul>
<li>I checked to see if the Solr sharding task that is supposed to run on January 1st had run and saw there was an error</li>
<li>I tested on DSpace Test as well and it doesn&rsquo;t work there either</li>
<li>I asked on the dspace-tech mailing list because it seems to be broken, and actually now I&rsquo;m not sure if we&rsquo;ve ever had the sharding task run successfully over all these years</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2017-01/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2016-12/">December, 2016</a></h2>
@ -343,29 +366,6 @@ dspacetest=# select text_value from metadatavalue where metadata_field_id=3 and
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2016-03/">March, 2016</a></h2>
<p class="blog-post-meta"><time datetime="2016-03-02T16:50:00+03:00">Wed Mar 02, 2016</time> by Alan Orth in
<span class="fas fa-tag" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/tags/notes/" rel="tag">Notes</a>
</p>
</header>
<h2 id="2016-03-02">2016-03-02</h2>
<ul>
<li>Looking at issues with author authorities on CGSpace</li>
<li>For some reason we still have the <code>index-lucene-update</code> cron job active on CGSpace, but I&rsquo;m pretty sure we don&rsquo;t need it as of the latest few versions of Atmire&rsquo;s Listings and Reports module</li>
<li>Reinstall my local (Mac OS X) DSpace stack with Tomcat 7, PostgreSQL 9.3, and Java JDK 1.7 to match environment on CGSpace server</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2016-03/'>Read more →</a>
</article>
<nav class="blog-pagination">
<a class="btn btn-outline-primary" href="/cgspace-notes/page/6/" rel="prev" role="button">Previous page</a>
@ -390,6 +390,8 @@ dspacetest=# select text_value from metadatavalue where metadata_field_id=3 and
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -398,8 +400,6 @@ dspacetest=# select text_value from metadatavalue where metadata_field_id=3 and
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>


@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
<meta property="og:updated_time" content="2021-09-28T22:00:36+03:00" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
@ -31,7 +31,7 @@
"@type": "Person",
"name": "Alan Orth"
},
"dateModified": "2021-09-01T09:14:07+03:00",
"dateModified": "2021-10-01T11:14:07+03:00",
"keywords": "notes, migration, notes",
"description":"Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."
}
@ -96,6 +96,29 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2016-03/">March, 2016</a></h2>
<p class="blog-post-meta"><time datetime="2016-03-02T16:50:00+03:00">Wed Mar 02, 2016</time> by Alan Orth in
<span class="fas fa-tag" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/tags/notes/" rel="tag">Notes</a>
</p>
</header>
<h2 id="2016-03-02">2016-03-02</h2>
<ul>
<li>Looking at issues with author authorities on CGSpace</li>
<li>For some reason we still have the <code>index-lucene-update</code> cron job active on CGSpace, but I&rsquo;m pretty sure we don&rsquo;t need it as of the latest few versions of Atmire&rsquo;s Listings and Reports module</li>
<li>Reinstall my local (Mac OS X) DSpace stack with Tomcat 7, PostgreSQL 9.3, and Java JDK 1.7 to match environment on CGSpace server</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2016-03/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2016-02/">February, 2016</a></h2>
@ -223,6 +246,8 @@
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -231,8 +256,6 @@
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>


@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2021-09-28T22:00:36+03:00" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
@ -31,7 +31,7 @@
"@type": "Person",
"name": "Alan Orth"
},
"dateModified": "2021-09-01T09:14:07+03:00",
"dateModified": "2021-10-01T11:14:07+03:00",
"keywords": "notes, migration, notes",
"description":"Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."
}
@ -96,6 +96,38 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2021-10/">October, 2021</a></h2>
<p class="blog-post-meta"><time datetime="2021-10-01T11:14:07+03:00">Fri Oct 01, 2021</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2021-10-01">2021-10-01</h2>
<ul>
<li>Export all affiliations on CGSpace and run them against the latest RoR data dump:</li>
</ul>
<pre tabindex="0"><code class="language-console" data-lang="console">localhost/dspace63= &gt; \COPY (SELECT DISTINCT text_value as &quot;cg.contributor.affiliation&quot;, count(*) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id = 211 GROUP BY text_value ORDER BY count DESC) to /tmp/2021-10-01-affiliations.csv WITH CSV HEADER;
$ csvcut -c 1 /tmp/2021-10-01-affiliations.csv | sed 1d &gt; /tmp/2021-10-01-affiliations.txt
$ ./ilri/ror-lookup.py -i /tmp/2021-10-01-affiliations.txt -r 2021-09-23-ror-data.json -o /tmp/2021-10-01-affiliations-matching.csv
$ csvgrep -c matched -m true /tmp/2021-10-01-affiliations-matching.csv | sed 1d | wc -l
1879
$ wc -l /tmp/2021-10-01-affiliations.txt
7100 /tmp/2021-10-01-affiliations.txt
</code></pre><ul>
<li>So we have 1879/7100 (26.46%) matching already</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2021-10/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2021-09/">September, 2021</a></h2>
@ -348,40 +380,6 @@ COPY 20994
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2021-01/">January, 2021</a></h2>
<p class="blog-post-meta"><time datetime="2021-01-03T10:13:54+02:00">Sun Jan 03, 2021</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2021-01-03">2021-01-03</h2>
<ul>
<li>Peter notified me that some filters on AReS were broken again
<ul>
<li>It&rsquo;s the same issue with the field names getting <code>.keyword</code> appended to the end that I already <a href="https://github.com/ilri/OpenRXV/issues/66">filed an issue on OpenRXV about last month</a></li>
<li>I fixed the broken filters (careful to not edit any others, lest they break too!)</li>
</ul>
</li>
<li>Fix an issue with start page number for the DSpace REST API and statistics API in OpenRXV
<ul>
<li>The start page had been &ldquo;1&rdquo; in the UI, but in the backend they were doing some gymnastics to adjust to the zero-based offset/limit/page of the DSpace REST API and the statistics API</li>
<li>I adjusted it to default to 0 and added a note to the admin screen</li>
<li>I realized that this issue was actually causing the first page of 100 statistics to be missing&hellip;</li>
<li>For example, <a href="https://cgspace.cgiar.org/handle/10568/66839">this item</a> has 51 views on CGSpace, but 0 on AReS</li>
</ul>
</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2021-01/'>Read more →</a>
</article>
<nav class="blog-pagination">
@ -406,6 +404,8 @@ COPY 20994
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -414,8 +414,6 @@ COPY 20994
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>


@ -6,7 +6,30 @@
<description>Recent content in Posts on CGSpace Notes</description>
<generator>Hugo -- gohugo.io</generator>
<language>en-us</language>
<lastBuildDate>Wed, 01 Sep 2021 09:14:07 +0300</lastBuildDate><atom:link href="https://alanorth.github.io/cgspace-notes/posts/index.xml" rel="self" type="application/rss+xml" />
<lastBuildDate>Fri, 01 Oct 2021 11:14:07 +0300</lastBuildDate><atom:link href="https://alanorth.github.io/cgspace-notes/posts/index.xml" rel="self" type="application/rss+xml" />
<item>
<title>October, 2021</title>
<link>https://alanorth.github.io/cgspace-notes/2021-10/</link>
<pubDate>Fri, 01 Oct 2021 11:14:07 +0300</pubDate>
<guid>https://alanorth.github.io/cgspace-notes/2021-10/</guid>
<description>&lt;h2 id=&#34;2021-10-01&#34;&gt;2021-10-01&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Export all affiliations on CGSpace and run them against the latest RoR data dump:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code class=&#34;language-console&#34; data-lang=&#34;console&#34;&gt;localhost/dspace63= &amp;gt; \COPY (SELECT DISTINCT text_value as &amp;quot;cg.contributor.affiliation&amp;quot;, count(*) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id = 211 GROUP BY text_value ORDER BY count DESC) to /tmp/2021-10-01-affiliations.csv WITH CSV HEADER;
$ csvcut -c 1 /tmp/2021-10-01-affiliations.csv | sed 1d &amp;gt; /tmp/2021-10-01-affiliations.txt
$ ./ilri/ror-lookup.py -i /tmp/2021-10-01-affiliations.txt -r 2021-09-23-ror-data.json -o /tmp/2021-10-01-affiliations-matching.csv
$ csvgrep -c matched -m true /tmp/2021-10-01-affiliations-matching.csv | sed 1d | wc -l
1879
$ wc -l /tmp/2021-10-01-affiliations.txt
7100 /tmp/2021-10-01-affiliations.txt
&lt;/code&gt;&lt;/pre&gt;&lt;ul&gt;
&lt;li&gt;So we have 1879/7100 (26.46%) matching already&lt;/li&gt;
&lt;/ul&gt;</description>
</item>
<item>
<title>September, 2021</title>
<link>https://alanorth.github.io/cgspace-notes/2021-09/</link>


@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2021-09-28T22:00:36+03:00" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
@ -31,7 +31,7 @@
"@type": "Person",
"name": "Alan Orth"
},
"dateModified": "2021-09-01T09:14:07+03:00",
"dateModified": "2021-10-01T11:14:07+03:00",
"keywords": "notes, migration, notes",
"description":"Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."
}
@ -96,6 +96,40 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2021-01/">January, 2021</a></h2>
<p class="blog-post-meta"><time datetime="2021-01-03T10:13:54+02:00">Sun Jan 03, 2021</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2021-01-03">2021-01-03</h2>
<ul>
<li>Peter notified me that some filters on AReS were broken again
<ul>
<li>It&rsquo;s the same issue with the field names getting <code>.keyword</code> appended to the end that I already <a href="https://github.com/ilri/OpenRXV/issues/66">filed an issue on OpenRXV about last month</a></li>
<li>I fixed the broken filters (careful to not edit any others, lest they break too!)</li>
</ul>
</li>
<li>Fix an issue with start page number for the DSpace REST API and statistics API in OpenRXV
<ul>
<li>The start page had been &ldquo;1&rdquo; in the UI, but in the backend they were doing some gymnastics to adjust to the zero-based offset/limit/page of the DSpace REST API and the statistics API</li>
<li>I adjusted it to default to 0 and added a note to the admin screen</li>
<li>I realized that this issue was actually causing the first page of 100 statistics to be missing&hellip;</li>
<li>For example, <a href="https://cgspace.cgiar.org/handle/10568/66839">this item</a> has 51 views on CGSpace, but 0 on AReS</li>
</ul>
</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2021-01/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-12/">December, 2020</a></h2>
@ -339,39 +373,6 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-04/">April, 2020</a></h2>
<p class="blog-post-meta"><time datetime="2020-04-02T10:53:24+03:00">Thu Apr 02, 2020</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2020-04-02">2020-04-02</h2>
<ul>
<li>Maria asked me to update Charles Staver&rsquo;s ORCID iD in the submission template and on CGSpace, as his name was lower case before, and now he has corrected it
<ul>
<li>I updated the fifty-eight existing items on CGSpace</li>
</ul>
</li>
<li>Looking into the items Udana had asked about last week that were missing Altmetric donuts:
<ul>
<li><a href="https://hdl.handle.net/10568/103225">The first</a> is still missing its DOI, so I added it and <a href="https://twitter.com/mralanorth/status/1245632619661766657">tweeted its handle</a> (after a few hours there was a donut with score 222)</li>
<li><a href="https://hdl.handle.net/10568/106899">The second item</a> now has a donut with score 2 since I <a href="https://twitter.com/mralanorth/status/1243158045540134913">tweeted its handle</a> last week</li>
<li><a href="https://hdl.handle.net/10568/107258">The third item</a> now has a donut with score 1 since I <a href="https://twitter.com/mralanorth/status/1243158786392625153">tweeted it</a> last week</li>
</ul>
</li>
<li>On the same note, the <a href="https://hdl.handle.net/10568/106573">one item</a> Abenet pointed out last week now has a donut with score of 104 after I <a href="https://twitter.com/mralanorth/status/1243163710241345536">tweeted it</a> last week</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2020-04/'>Read more →</a>
</article>
<nav class="blog-pagination">
<a class="btn btn-outline-primary" href="/cgspace-notes/posts/" rel="prev" role="button">Previous page</a>
@ -396,6 +397,8 @@
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -404,8 +407,6 @@
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>


@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2021-09-28T22:00:36+03:00" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
@ -31,7 +31,7 @@
"@type": "Person",
"name": "Alan Orth"
},
"dateModified": "2021-09-01T09:14:07+03:00",
"dateModified": "2021-10-01T11:14:07+03:00",
"keywords": "notes, migration, notes",
"description":"Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."
}
@ -96,6 +96,39 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-04/">April, 2020</a></h2>
<p class="blog-post-meta"><time datetime="2020-04-02T10:53:24+03:00">Thu Apr 02, 2020</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2020-04-02">2020-04-02</h2>
<ul>
<li>Maria asked me to update Charles Staver&rsquo;s ORCID iD in the submission template and on CGSpace, as his name was lower case before, and now he has corrected it
<ul>
<li>I updated the fifty-eight existing items on CGSpace</li>
</ul>
</li>
<li>Looking into the items Udana had asked about last week that were missing Altmetric donuts:
<ul>
<li><a href="https://hdl.handle.net/10568/103225">The first</a> is still missing its DOI, so I added it and <a href="https://twitter.com/mralanorth/status/1245632619661766657">tweeted its handle</a> (after a few hours there was a donut with score 222)</li>
<li><a href="https://hdl.handle.net/10568/106899">The second item</a> now has a donut with score 2 since I <a href="https://twitter.com/mralanorth/status/1243158045540134913">tweeted its handle</a> last week</li>
<li><a href="https://hdl.handle.net/10568/107258">The third item</a> now has a donut with score 1 since I <a href="https://twitter.com/mralanorth/status/1243158786392625153">tweeted it</a> last week</li>
</ul>
</li>
<li>On the same note, the <a href="https://hdl.handle.net/10568/106573">one item</a> Abenet pointed out last week now has a donut with score of 104 after I <a href="https://twitter.com/mralanorth/status/1243163710241345536">tweeted it</a> last week</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2020-04/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-03/">March, 2020</a></h2>
@ -383,32 +416,6 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-06/">June, 2019</a></h2>
<p class="blog-post-meta"><time datetime="2019-06-02T10:57:51+03:00">Sun Jun 02, 2019</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2019-06-02">2019-06-02</h2>
<ul>
<li>Merge the <a href="https://github.com/ilri/DSpace/pull/425">Solr filterCache</a> and <a href="https://github.com/ilri/DSpace/pull/426">XMLUI ISI journal</a> changes to the <code>5_x-prod</code> branch and deploy on CGSpace</li>
<li>Run system updates on CGSpace (linode18) and reboot it</li>
</ul>
<h2 id="2019-06-03">2019-06-03</h2>
<ul>
<li>Skype with Marie-Angélique and Abenet about <a href="https://agriculturalsemantics.github.io/cg-core/cgcore.html">CG Core v2</a></li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2019-06/'>Read more →</a>
</article>
<nav class="blog-pagination">
<a class="btn btn-outline-primary" href="/cgspace-notes/posts/page/2/" rel="prev" role="button">Previous page</a>
@ -433,6 +440,8 @@
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -441,8 +450,6 @@
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>


@ -10,7 +10,7 @@
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
<meta property="og:type" content="website" />
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
<meta property="og:updated_time" content="2021-09-28T22:00:36+03:00" />
<meta property="og:updated_time" content="2021-10-04T11:10:54+03:00" />
@ -31,7 +31,7 @@
"@type": "Person",
"name": "Alan Orth"
},
"dateModified": "2021-09-01T09:14:07+03:00",
"dateModified": "2021-10-01T11:14:07+03:00",
"keywords": "notes, migration, notes",
"description":"Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."
}
@ -96,6 +96,32 @@
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-06/">June, 2019</a></h2>
<p class="blog-post-meta"><time datetime="2019-06-02T10:57:51+03:00">Sun Jun 02, 2019</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2019-06-02">2019-06-02</h2>
<ul>
<li>Merge the <a href="https://github.com/ilri/DSpace/pull/425">Solr filterCache</a> and <a href="https://github.com/ilri/DSpace/pull/426">XMLUI ISI journal</a> changes to the <code>5_x-prod</code> branch and deploy on CGSpace</li>
<li>Run system updates on CGSpace (linode18) and reboot it</li>
</ul>
<h2 id="2019-06-03">2019-06-03</h2>
<ul>
<li>Skype with Marie-Angélique and Abenet about <a href="https://agriculturalsemantics.github.io/cg-core/cgcore.html">CG Core v2</a></li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2019-06/'>Read more →</a>
</article>
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-05/">May, 2019</a></h2>
@ -380,38 +406,6 @@ sys 0m1.979s
<article class="blog-post">
<header>
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2018-08/">August, 2018</a></h2>
<p class="blog-post-meta"><time datetime="2018-08-01T11:52:54+03:00">Wed Aug 01, 2018</time> by Alan Orth in
<span class="fas fa-folder" aria-hidden="true"></span>&nbsp;<a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
</p>
</header>
<h2 id="2018-08-01">2018-08-01</h2>
<ul>
<li>DSpace Test had crashed at some point yesterday morning and I see the following in <code>dmesg</code>:</li>
</ul>
<pre tabindex="0"><code>[Tue Jul 31 00:00:41 2018] Out of memory: Kill process 1394 (java) score 668 or sacrifice child
[Tue Jul 31 00:00:41 2018] Killed process 1394 (java) total-vm:15601860kB, anon-rss:5355528kB, file-rss:0kB, shmem-rss:0kB
[Tue Jul 31 00:00:41 2018] oom_reaper: reaped process 1394 (java), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
</code></pre><ul>
<li>Judging from the time of the crash it was probably related to the Discovery indexing that starts at midnight</li>
<li>From the DSpace log I see that eventually Solr stopped responding, so I guess the <code>java</code> process that was OOM killed above was Tomcat&rsquo;s</li>
<li>I&rsquo;m not sure why Tomcat didn&rsquo;t crash with an OutOfMemoryError&hellip;</li>
<li>Anyways, perhaps I should increase the JVM heap from 5120m to 6144m like we did a few months ago when we tried to run the whole CGSpace Solr core</li>
<li>The server only has 8GB of RAM so we&rsquo;ll eventually need to upgrade to a larger one because we&rsquo;ll start starving the OS, PostgreSQL, and command line batch processes</li>
<li>I ran all system updates on DSpace Test and rebooted it</li>
</ul>
<a href='https://alanorth.github.io/cgspace-notes/2018-08/'>Read more →</a>
</article>
<nav class="blog-pagination">
<a class="btn btn-outline-primary" href="/cgspace-notes/posts/page/3/" rel="prev" role="button">Previous page</a>
@ -436,6 +430,8 @@ sys 0m1.979s
<ol class="list-unstyled">
<li><a href="/cgspace-notes/2021-10/">October, 2021</a></li>
<li><a href="/cgspace-notes/2021-09/">September, 2021</a></li>
<li><a href="/cgspace-notes/2021-08/">August, 2021</a></li>
@ -444,8 +440,6 @@ sys 0m1.979s
<li><a href="/cgspace-notes/2021-06/">June, 2021</a></li>
<li><a href="/cgspace-notes/2021-05/">May, 2021</a></li>
</ol>
</section>

Some files were not shown because too many files have changed in this diff