mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Add notes for 2022-03-04
This commit is contained in:
@ -34,7 +34,7 @@ I see that CGSpace (linode18) is still using PostgreSQL JDBC driver version 42.2
|
||||
|
||||
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.92.2" />
|
||||
<meta name="generator" content="Hugo 0.93.1" />
|
||||
|
||||
|
||||
|
||||
@ -166,7 +166,7 @@ I see that CGSpace (linode18) is still using PostgreSQL JDBC driver version 42.2
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<pre tabindex="0"><code># cat /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "07/May/2020:(01|03|04)" | goaccess --log-format=COMBINED -
|
||||
<pre tabindex="0"><code># cat /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "07/May/2020:(01|03|04)" | goaccess --log-format=COMBINED -
|
||||
</code></pre><ul>
|
||||
<li>The two main IPs making requests around then are 188.134.31.88 and 212.34.8.188
|
||||
<ul>
|
||||
@ -211,9 +211,9 @@ Total number of bot hits purged: 192332
|
||||
</ul>
|
||||
<pre tabindex="0"><code>$ cat 2020-05-11-add-orcids.csv
|
||||
dc.contributor.author,cg.creator.id
|
||||
"Lutakome, P.","Pius Lutakome: 0000-0002-0804-2649"
|
||||
"Lutakome, Pius","Pius Lutakome: 0000-0002-0804-2649"
|
||||
$ ./add-orcid-identifiers-csv.py -i 2020-05-11-add-orcids.csv -db dspace -u dspace -p 'fuuu' -d
|
||||
"Lutakome, P.","Pius Lutakome: 0000-0002-0804-2649"
|
||||
"Lutakome, Pius","Pius Lutakome: 0000-0002-0804-2649"
|
||||
$ ./add-orcid-identifiers-csv.py -i 2020-05-11-add-orcids.csv -db dspace -u dspace -p 'fuuu' -d
|
||||
</code></pre><ul>
|
||||
<li>Run system updates on CGSpace (linode18) and reboot it
|
||||
<ul>
|
||||
@ -265,8 +265,8 @@ $ ./add-orcid-identifiers-csv.py -i 2020-05-11-add-orcids.csv -db dspace -u dspa
|
||||
</ul>
|
||||
<pre tabindex="0"><code>$ cat 2020-05-19-add-orcids.csv
|
||||
dc.contributor.author,cg.creator.id
|
||||
"Bahta, Sirak T.","Sirak Bahta: 0000-0002-5728-2489"
|
||||
$ ./add-orcid-identifiers-csv.py -i 2020-05-19-add-orcids.csv -db dspace -u dspace -p 'fuuu' -d
|
||||
"Bahta, Sirak T.","Sirak Bahta: 0000-0002-5728-2489"
|
||||
$ ./add-orcid-identifiers-csv.py -i 2020-05-19-add-orcids.csv -db dspace -u dspace -p 'fuuu' -d
|
||||
</code></pre><ul>
|
||||
<li>An IITA user is having issues submitting to CGSpace and I see there are a rising number of PostgreSQL connections waiting in transaction and in lock:</li>
|
||||
</ul>
|
||||
@ -300,9 +300,9 @@ $ ./add-orcid-identifiers-csv.py -i 2020-05-19-add-orcids.csv -db dspace -u dspa
|
||||
</ul>
|
||||
<pre tabindex="0"><code>$ cat 2020-05-25-add-orcids.csv
|
||||
dc.contributor.author,cg.creator.id
|
||||
"Díaz, Manuel F.","Manuel Francisco Diaz Baca: 0000-0001-8996-5092"
|
||||
"Díaz, Manuel Francisco","Manuel Francisco Diaz Baca: 0000-0001-8996-5092"
|
||||
$ ./add-orcid-identifiers-csv.py -i 2020-05-25-add-orcids.csv -db dspace -u dspace -p 'fuuu' -d
|
||||
"Díaz, Manuel F.","Manuel Francisco Diaz Baca: 0000-0001-8996-5092"
|
||||
"Díaz, Manuel Francisco","Manuel Francisco Diaz Baca: 0000-0001-8996-5092"
|
||||
$ ./add-orcid-identifiers-csv.py -i 2020-05-25-add-orcids.csv -db dspace -u dspace -p 'fuuu' -d
|
||||
</code></pre><ul>
|
||||
<li>Last week Maria asked again about searching for items by accession or issue date
|
||||
<ul>
|
||||
@ -327,7 +327,7 @@ $ ./add-orcid-identifiers-csv.py -i 2020-05-25-add-orcids.csv -db dspace -u dspa
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<pre tabindex="0"><code># cat /var/log/nginx/*.log.1 | grep -E "29/May/2020:(02|03|04|05)" | goaccess --log-format=COMBINED -
|
||||
<pre tabindex="0"><code># cat /var/log/nginx/*.log.1 | grep -E "29/May/2020:(02|03|04|05)" | goaccess --log-format=COMBINED -
|
||||
</code></pre><ul>
|
||||
<li>The top is 172.104.229.92, which is the AReS harvester (still not using a user agent, but it’s tagged as a bot in the nginx mapping)</li>
|
||||
<li>Second is 188.134.31.88, which is a Russian host that we also saw in the last few weeks, using a browser user agent and hitting the XMLUI (but it is tagged as a bot in nginx as well)</li>
|
||||
@ -361,13 +361,13 @@ $ ./add-orcid-identifiers-csv.py -i 2020-05-25-add-orcids.csv -db dspace -u dspa
|
||||
<pre tabindex="0"><code>$ sudo su - postgres
|
||||
$ dropdb dspacetest
|
||||
$ createdb -O dspacetest --encoding=UNICODE dspacetest
|
||||
$ psql dspacetest -c 'alter user dspacetest superuser;'
|
||||
$ psql dspacetest -c 'alter user dspacetest superuser;'
|
||||
$ pg_restore -d dspacetest -O --role=dspacetest /tmp/cgspace_2020-05-31.backup
|
||||
$ psql dspacetest -c 'alter user dspacetest nosuperuser;'
|
||||
$ psql dspacetest -c 'alter user dspacetest nosuperuser;'
|
||||
# run DSpace 5 version of update-sequences.sql!!!
|
||||
$ psql -f /home/dspace/src/git/DSpace/dspace/etc/postgres/update-sequences.sql dspacetest
|
||||
$ psql dspacetest -c "DELETE FROM schema_version WHERE version IN ('5.8.2015.12.03.3');"
|
||||
$ psql dspacetest -c 'CREATE EXTENSION pgcrypto;'
|
||||
$ psql dspacetest -c "DELETE FROM schema_version WHERE version IN ('5.8.2015.12.03.3');"
|
||||
$ psql dspacetest -c 'CREATE EXTENSION pgcrypto;'
|
||||
$ exit
|
||||
</code></pre><ul>
|
||||
<li>Now switch to the DSpace 6.x branch and start a build:</li>
|
||||
@ -391,7 +391,7 @@ $ ant update
|
||||
<li>I had a mistake in my Solr internal URL parameter so DSpace couldn’t find it, but once I fixed that DSpace starts up OK!</li>
|
||||
<li>Once the initial Discovery reindexing was completed (after three hours or so!) I started the Solr statistics UUID migration:</li>
|
||||
</ul>
|
||||
<pre tabindex="0"><code>$ export JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8"
|
||||
<pre tabindex="0"><code>$ export JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8"
|
||||
$ dspace solr-upgrade-statistics-6x -i statistics -n 250000
|
||||
$ dspace solr-upgrade-statistics-6x -i statistics -n 1000000
|
||||
$ dspace solr-upgrade-statistics-6x -i statistics -n 1000000
|
||||
@ -400,8 +400,8 @@ $ dspace solr-upgrade-statistics-6x -i statistics -n 1000000
|
||||
<li>It’s taking about 35 minutes for 1,000,000 records…</li>
|
||||
<li>Some issues towards the end of this core:</li>
|
||||
</ul>
|
||||
<pre tabindex="0"><code>Exception: Error while creating field 'p_group_id{type=uuid,properties=indexed,stored,multiValued}' from value '10'
|
||||
org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Error while creating field 'p_group_id{type=uuid,properties=indexed,stored,multiValued}' from value '10'
|
||||
<pre tabindex="0"><code>Exception: Error while creating field 'p_group_id{type=uuid,properties=indexed,stored,multiValued}' from value '10'
|
||||
org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Error while creating field 'p_group_id{type=uuid,properties=indexed,stored,multiValued}' from value '10'
|
||||
at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:552)
|
||||
at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
|
||||
at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
|
||||
@ -425,17 +425,17 @@ org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Error whil
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<pre tabindex="0"><code>$ ./run.sh -s http://localhost:8081/solr/statistics -a export -o statistics-unmigrated.json -k uid -f '(*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)'
|
||||
$ curl -s "http://localhost:8081/solr/statistics/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>(*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)</query></delete>"
|
||||
<pre tabindex="0"><code>$ ./run.sh -s http://localhost:8081/solr/statistics -a export -o statistics-unmigrated.json -k uid -f '(*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)'
|
||||
$ curl -s "http://localhost:8081/solr/statistics/update?softCommit=true" -H "Content-Type: text/xml" --data-binary "<delete><query>(*:* NOT id:/.{36}/) AND (*:* NOT id:/.+-unmigrated/)</query></delete>"
|
||||
</code></pre><ul>
|
||||
<li>Now the UUID conversion script says there is nothing left to convert, so I can try to run the Atmire CUA conversion utility:</li>
|
||||
</ul>
|
||||
<pre tabindex="0"><code>$ export JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8"
|
||||
<pre tabindex="0"><code>$ export JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8"
|
||||
$ dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 1
|
||||
</code></pre><ul>
|
||||
<li>The processing is very slow and there are lots of errors like this:</li>
|
||||
</ul>
|
||||
<pre tabindex="0"><code>Record uid: 7b5b3900-28e8-417f-9c1c-e7d88a753221 couldn't be processed
|
||||
<pre tabindex="0"><code>Record uid: 7b5b3900-28e8-417f-9c1c-e7d88a753221 couldn't be processed
|
||||
com.atmire.statistics.util.update.atomic.ProcessingException: something went wrong while processing record uid: 7b5b3900-28e8-417f-9c1c-e7d88a753221, an error occured in the com.atmire.statistics.util.update.atomic.processor.ContainerOwnerDBProcessor
|
||||
at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.applyProcessors(AtomicStatisticsUpdater.java:304)
|
||||
at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.processRecords(AtomicStatisticsUpdater.java:176)
|
||||
|
Reference in New Issue
Block a user