mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2024-11-14 10:57:03 +01:00
27 KiB
title | date | author | categories | |
---|---|---|---|---|
April, 2020 | 2020-04-02T10:53:24+03:00 | Alan Orth | | |
2020-04-02
- Maria asked me to update Charles Staver's ORCID iD in the submission template and on CGSpace, as his name was lower case before, and now he has corrected it
- I updated the fifty-eight existing items on CGSpace
- Looking into the items Udana had asked about last week that were missing Altmetric donuts:
- The first is still missing its DOI, so I added it and tweeted its handle (after a few hours there was a donut with score 222)
- The second item now has a donut with score 2 since I tweeted its handle last week
- The third item now has a donut with score 1 since I tweeted it last week
- On the same note, the one item Abenet pointed out last week now has a donut with score of 104 after I tweeted it last week
- Altmetric responded about one item that had no donut since at least 2019-12 and said they fixed some problems with their bot's user agent
- I decided to tweet the item, as I can't remember if I ever did it before
2020-04-05
- Update PostgreSQL JDBC driver to version 42.2.12
2020-04-07
- Yesterday Atmire sent me their pull request for DSpace 6 modules
- Peter pointed out that some items have his ORCID identifier (`cg.creator.id`) twice
- I think this is because my early `add-orcid-identifiers.py` script was adding identifiers to existing records without properly checking if there was already one present (at first it only checked if there was one with the exact `place` value)
- As a test I dropped all his ORCID identifiers and added them back with the `add-orcid-identifiers.py` script:
$ psql -h localhost -U postgres dspace -c "DELETE FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=240 AND text_value LIKE '%Ballantyne%';"
DELETE 97
$ ./add-orcid-identifiers-csv.py -i 2020-04-07-peter-orcids.csv -db dspace -u dspace -p 'fuuu' -d
- I used this CSV with the script (all records with his name have the name standardized like this):
dc.contributor.author,cg.creator.id
"Ballantyne, Peter G.","Peter G. Ballantyne: 0000-0001-9346-2893"
- Then I tried another way, to identify all duplicate ORCID identifiers for a given resource ID and group them so I can see if count is greater than 1:
dspace=# \COPY (SELECT DISTINCT(resource_id, text_value) as distinct_orcid, COUNT(*) FROM metadatavalue WHERE resource_type_id = 2 AND metadata_field_id = 240 GROUP BY distinct_orcid ORDER BY count DESC) TO /tmp/2020-04-07-duplicate-orcids.csv WITH CSV HEADER;
COPY 15209
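- A rough sketch of the same duplicate check outside the database, assuming the rows have already been exported as `(resource_id, text_value)` pairs (the function name and the sample rows are made up for illustration):

```python
from collections import Counter


def find_duplicate_orcids(rows):
    """Return {(resource_id, text_value): count} for pairs seen more than once."""
    counts = Counter(rows)
    return {pair: n for pair, n in counts.items() if n > 1}


# Hypothetical sample rows, not real CGSpace data
rows = [
    (1001, "Peter G. Ballantyne: 0000-0001-9346-2893"),
    (1001, "Peter G. Ballantyne: 0000-0001-9346-2893"),
    (1002, "Andy Jarvis: 0000-0001-6543-0798"),
]
duplicates = find_duplicate_orcids(rows)
print(duplicates)
```

- Only pairs with a count greater than 1 are kept, which is the same thing I was looking for by sorting the SQL output by count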
- Of those, about nine authors had duplicate ORCID identifiers over about thirty records, so I created a CSV with all their name variations and ORCID identifiers:
dc.contributor.author,cg.creator.id
"Ballantyne, Peter G.","Peter G. Ballantyne: 0000-0001-9346-2893"
"Ramirez-Villegas, Julian","Julian Ramirez-Villegas: 0000-0002-8044-583X"
"Villegas-Ramirez, J","Julian Ramirez-Villegas: 0000-0002-8044-583X"
"Ishitani, Manabu","Manabu Ishitani: 0000-0002-6950-4018"
"Manabu, Ishitani","Manabu Ishitani: 0000-0002-6950-4018"
"Ishitani, M.","Manabu Ishitani: 0000-0002-6950-4018"
"Ishitani, M.","Manabu Ishitani: 0000-0002-6950-4018"
"Buruchara, Robin A.","Robin Buruchara: 0000-0003-0934-1218"
"Buruchara, Robin","Robin Buruchara: 0000-0003-0934-1218"
"Jarvis, Andy","Andy Jarvis: 0000-0001-6543-0798"
"Jarvis, Andrew","Andy Jarvis: 0000-0001-6543-0798"
"Jarvis, A.","Andy Jarvis: 0000-0001-6543-0798"
"Tohme, Joseph M.","Joe Tohme: 0000-0003-2765-7101"
"Hansen, James","James Hansen: 0000-0002-8599-7895"
"Hansen, James W.","James Hansen: 0000-0002-8599-7895"
"Asseng, Senthold","Senthold Asseng: 0000-0002-7583-3811"
- Then I deleted all their existing ORCID identifier records:
dspace=# DELETE FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=240 AND text_value SIMILAR TO '%(0000-0001-6543-0798|0000-0001-9346-2893|0000-0002-6950-4018|0000-0002-7583-3811|0000-0002-8044-583X|0000-0002-8599-7895|0000-0003-0934-1218|0000-0003-2765-7101)%';
DELETE 994
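- The `SIMILAR TO` pattern could probably be generated from the CSV rather than typed by hand; a minimal sketch, assuming the CSV has the two columns shown above (the function name `orcid_pattern` is hypothetical):

```python
import csv
import io
import re

# Four groups of four characters, where the last one may be the X check digit
ORCID_RE = re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]")


def orcid_pattern(csv_text):
    """Build a SIMILAR TO pattern from the cg.creator.id column of a CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    ids = set()
    for row in reader:
        match = ORCID_RE.search(row["cg.creator.id"])
        if match:
            ids.add(match.group(0))
    return "%(" + "|".join(sorted(ids)) + ")%"


# Three rows taken from the CSV above
sample = '''dc.contributor.author,cg.creator.id
"Jarvis, Andy","Andy Jarvis: 0000-0001-6543-0798"
"Jarvis, A.","Andy Jarvis: 0000-0001-6543-0798"
"Ramirez-Villegas, Julian","Julian Ramirez-Villegas: 0000-0002-8044-583X"
'''
pattern = orcid_pattern(sample)
print(pattern)
```

- Using a set means each identifier appears only once in the pattern no matter how many name variations share it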
- And then I added them again using the `add-orcid-identifiers` script:
$ ./add-orcid-identifiers-csv.py -i 2020-04-07-fix-duplicate-orcids.csv -db dspace -u dspace -p 'fuuu' -d
- I ran the fixes on DSpace Test and CGSpace as well
- I started testing the pull request sent by Atmire yesterday
- I notice that we now need `yarn` to build, and I need to bump the Node.js `engine` version in our Mirage 2 theme in order to get it to build on Node.js 10.x
- Font Awesome icons for GitHub etc weren't loading, and after a bit of troubleshooting I replaced version 4.5.0 with 5.13.0 and to my surprise they now include Mendeley and ORCID so we can get rid of the Academicons dependency
2020-04-12
- Testing the Atmire DSpace 6.3 code with a clean CGSpace DSpace 5.8 database snapshot
- One Flyway migration failed so I had to manually remove it (and of course create the pgcrypto extension):
dspace63=# DELETE FROM schema_version WHERE version IN ('5.8.2015.12.03.3');
dspace63=# CREATE EXTENSION pgcrypto;
- Then DSpace 6.3 started up OK and I was able to see some statistics in the Content and Usage Analysis (CUA) module, but not on community, collection, or item pages
- I also noticed at least one of these errors in the DSpace log:
2020-04-12 16:34:33,363 ERROR com.atmire.dspace.app.xmlui.aspect.statistics.editorparts.DataTableTransformer @ java.lang.IllegalArgumentException: Invalid UUID string: 1
- And I remembered I actually need to run the DSpace 6.4 Solr UUID migrations:
$ export JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8"
$ ~/dspace63/bin/dspace solr-upgrade-statistics-6x
- Run system updates on DSpace Test (linode26) and reboot it
- More work on the DSpace 6.3 stuff, improving the GDPR consent logic to use haven instead of cookieconsent
- It works better by injecting the Google Analytics script after the user clicks agree, and it also has a preferences section that gets automatically injected on the privacy page!
2020-04-13
- I realized that `solr-upgrade-statistics-6x` only processes 100,000 records by default so I think we actually need to finish running it for all legacy Solr records before asking Atmire why CUA statlets and detailed statistics aren't working
- For now I am just doing 250,000 records at a time on my local environment:
$ export JAVA_OPTS="-Xmx2000m -Dfile.encoding=UTF-8"
$ ~/dspace63/bin/dspace solr-upgrade-statistics-6x -n 250000
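- Back-of-the-envelope arithmetic for how many runs that means, assuming roughly 1.5 million legacy records locally and assuming each run picks up where the previous one stopped (I have not verified the latter):

```python
import math


def runs_needed(total_records, batch_size):
    """Number of invocations needed to cover every record once."""
    return math.ceil(total_records / batch_size)


print(runs_needed(1_500_000, 250_000))  # batches of 250,000 via -n
print(runs_needed(1_500_000, 100_000))  # the default batch size
```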
- Despite running the migration for all of my local 1.5 million Solr records, I still see a few hundred thousand like `-1` and `0-unmigrated`
- I will purge them all and try to import only a subset...
- After importing again I see there are indeed tens of thousands of these documents with IDs "-1" and "0"
- They are all `type: 5`, which is "SITE" according to `Constants.java`:
/** DSpace site type */
public static final int SITE = 5;
- Even after deleting those documents and re-running `solr-upgrade-statistics-6x` I still get the UUID errors when using CUA and the statlets
- I have sent some feedback and questions to Atmire (including about the encoding issue with glyphicons in the header trail)
- In other news, my local Artifactory container stopped working for some reason so I re-created it and it seems some things have changed upstream (port 8082 for web UI?):
$ podman rm artifactory
$ podman pull docker.bintray.io/jfrog/artifactory-oss:latest
$ podman create --ulimit nofile=32000:32000 --name artifactory -v artifactory_data:/var/opt/jfrog/artifactory -p 8081-8082:8081-8082 docker.bintray.io/jfrog/artifactory-oss
$ podman start artifactory
2020-04-14
- A few days ago Peter asked me to update an author's name on CGSpace and in the controlled vocabularies:
dspace=# UPDATE metadatavalue SET text_value='Knight-Jones, Theodore J.D.' WHERE resource_type_id=2 AND metadata_field_id=3 AND text_value='Knight-Jones, T.J.D.';
- I updated his existing records on CGSpace, changed the controlled lists, added his ORCID identifier to the controlled list, and tagged his thirty-nine items with the ORCID iD
- The new DSpace 6 stuff that Atmire sent modifies the Mirage 2 `pom.xml` to copy each theme's resulting `node_modules` to each theme after building and installing with `ant update` because they moved some packages from bower to npm and now reference them in `page-structure.xsl`
- This is a good idea, because bower is no longer supported, and npm has gotten a lot better, but it causes an extra 200,000 files to get copied!
- Most scripts are concatenated into `theme.js` during build, so we don't need the `node_modules` after that, but there are three scripts in `page-structure.xsl` that are not included there
- The scripts are a very old version of modernizr which is not even available on npm, html5shiv, and respond.js
- For modernizr I can simply download a static copy and put it in `0_CGIAR/scripts` and concatenate it into `theme.js`
- For the others, I can revert to using them from bower's `vendor` directory, which is installed by the parent XMLUI Mirage 2 theme
- During this process I also realized that `mvn clean` doesn't actually clean everything, and `dspace/modules/xmlui-mirage2/target` remains from previous builds and contains a bunch of shit (including all the themes which I was trying to build without!)
- This must be a DSpace bug, but I should theoretically check on vanilla DSpace and then file a bug...
2020-04-17
- Atmire responded to some of the issues I raised earlier this week about the DSpace 6 pull request
- They said they don't think the glyphicon encoding issue is due to their changes, but I built a new clean version of the vanilla `6_x-dev` branch from before their pull request and it does not have the encoding issue in the Mirage 2 header trails
- Also, they said we need to use something called `AtomicStatisticsUpdateCLI` to do the Solr legacy integer ID to UUID conversion so I asked for more information about that workflow
2020-04-20
- Looking into a high rate of outgoing bandwidth from yesterday on CGSpace (linode18):
# cat /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "19/Apr/2020:0[6789]" | goaccess --log-format=COMBINED -
- One host in Russia (91.241.19.70) downloaded 23GiB over those few hours in the morning
- It looks like all the requests were for one single item's bitstreams:
# grep -c 91.241.19.70 /var/log/nginx/access.log.1
8900
# grep 91.241.19.70 /var/log/nginx/access.log.1 | grep -c '10568/35187'
8900
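- For a per-IP bandwidth figure without goaccess, the byte counts can be summed directly from the log; a minimal sketch, assuming the combined log format and using invented sample lines (the function name `bytes_by_ip` is hypothetical):

```python
import re
from collections import defaultdict

# Combined log format: IP, identity, user, [time], "request", status, bytes, ...
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (\d{3}) (\d+|-)')


def bytes_by_ip(lines):
    """Sum the response bytes per client IP, skipping lines that do not parse."""
    totals = defaultdict(int)
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue
        ip, _status, size = match.groups()
        if size != "-":
            totals[ip] += int(size)
    return dict(totals)


# Hypothetical log lines; the paths and sizes are invented
sample = [
    '91.241.19.70 - - [19/Apr/2020:06:00:01 +0200] "GET /bitstream/handle/10568/35187/a.pdf HTTP/1.1" 200 2500000 "-" "Mozilla/5.0"',
    '91.241.19.70 - - [19/Apr/2020:06:00:02 +0200] "GET /bitstream/handle/10568/35187/a.pdf HTTP/1.1" 200 2500000 "-" "Mozilla/5.0"',
    '192.0.2.10 - - [19/Apr/2020:06:00:03 +0200] "GET /handle/10568/1 HTTP/1.1" 200 1200 "-" "curl/7.68.0"',
]
totals = bytes_by_ip(sample)
print(totals)
```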
- I thought the host might have been Yandex misbehaving, but its user agent is:
Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_3; nl-nl) AppleWebKit/527 (KHTML, like Gecko) Version/3.1.1 Safari/525.20
- I will purge that IP from the Solr statistics using my `check-spider-ip-hits.sh` script:
$ ./check-spider-ip-hits.sh -d -f /tmp/ip -p
(DEBUG) Using spider IPs file: /tmp/ip
(DEBUG) Checking for hits from spider IP: 91.241.19.70
Purging 8909 hits from 91.241.19.70 in statistics
Total number of bot hits purged: 8909
- While investigating that I noticed ORCID identifiers missing from a few authors' names, so I added them with my `add-orcid-identifiers.py` script:
$ ./add-orcid-identifiers-csv.py -i 2020-04-20-add-orcids.csv -db dspace -u dspace -p 'fuuu' -d
- The contents of `2020-04-20-add-orcids.csv` were:
dc.contributor.author,cg.creator.id
"Schut, Marc","Marc Schut: 0000-0002-3361-4581"
"Schut, M.","Marc Schut: 0000-0002-3361-4581"
"Kamau, G.","Geoffrey Kamau: 0000-0002-6995-4801"
"Kamau, G","Geoffrey Kamau: 0000-0002-6995-4801"
"Triomphe, Bernard","Bernard Triomphe: 0000-0001-6657-3002"
"Waters-Bayer, Ann","Ann Waters-Bayer: 0000-0003-1887-7903"
"Klerkx, Laurens","Laurens Klerkx: 0000-0002-1664-886X"
- I confirmed some of the authors' names from the report itself, then by looking at their profiles on ORCID.org
- Add new ILRI subject "COVID19" to the `5_x-prod` branch
- Add new CCAFS Phase II project tags to the `5_x-prod` branch
- I will deploy these to CGSpace in the next few days
2020-04-24
- Atmire responded to my ticket about the encoding issue with glyphicons and said their test server does not show this same issue
- They asked if I am using the `JAVA_OPTS=-Dfile.encoding=UTF-8` when building DSpace and running Tomcat
- I set it explicitly for Maven and Ant just now (and cleared all XMLUI caches) but the issue is still there
- I asked them if they are building on macOS or Linux, and which Node.js version (I'm using 10.20.1, which is the current LTS branch).
2020-04-25
- I researched a bit more about the encoding issue with glyphicons and realized it's due to a bug in libsass
- I have Ruby sass version 3.4.25 installed in my local environment, but DSpace Mirage 2 is supposed to build with 3.3.14
- Downgrading the version fixes it, though I wonder: why did I not have this issue on the `6_x-dev` branch before Atmire's pull request, and also on `5_x-prod` where I've been building for a few months here...
- I deployed the latest `5_x-prod` branch on CGSpace (linode18), ran all updates, and rebooted the server
- This includes the "COVID19" ILRI subject, the new CCAFS Phase II project tags, and the changes to an ILRI author name
- After restarting the server I had to restart Tomcat three times before all Solr statistics cores loaded properly.
- After that I started a full Discovery reindexing to pick up some author changes I made last week:
$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx1024m"
$ time chrt -i 0 ionice -c2 -n7 nice -n19 dspace index-discovery -b
- I ran the `dspace cleanup -v` process on CGSpace and got an error:
Error: ERROR: update or delete on table "bitstream" violates foreign key constraint "bundle_primary_bitstream_id_fkey" on table "bundle"
Detail: Key (bitstream_id)=(184980) is still referenced from table "bundle".
- The solution is, as always:
$ psql -d dspace -U dspace -c 'update bundle set primary_bitstream_id=NULL where primary_bitstream_id in (184980);'
UPDATE 1
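- Since this comes up again and again, the statement could be derived from the error text so the ID always matches the one PostgreSQL complained about; a minimal sketch (the function name `fix_statement` is hypothetical, and it only prints the SQL rather than running it):

```python
import re


def fix_statement(error_detail):
    """Build the UPDATE for the bitstream ID named in the cleanup error."""
    match = re.search(r"Key \(bitstream_id\)=\((\d+)\)", error_detail)
    if match is None:
        raise ValueError("no bitstream_id found in error detail")
    bitstream_id = int(match.group(1))
    return (
        "UPDATE bundle SET primary_bitstream_id=NULL "
        f"WHERE primary_bitstream_id IN ({bitstream_id});"
    )


detail = 'Detail: Key (bitstream_id)=(184980) is still referenced from table "bundle".'
statement = fix_statement(detail)
print(statement)
```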
- I spent some time working on the XMLUI themes in DSpace 6
- Atmire's pull request modifies `pom.xml` to not exclude `node_modules`, which means an extra ~260,000 files get copied to our installation folder because of all the themes
- I worked on the `Gruntfile.js` to copy Font Awesome and Bootstrap glyphicon fonts out of `node_modules` and into `fonts` at build time, but still `jquery-ui.min.css` was being referenced as a `url()` in CSS
- SASS can include imported CSS in your compiled CSS, instead of including an `@import url(..)`, if you import it without the ".css", but our version of Ruby SASS doesn't support that
- I hacked `Gruntfile.js` to use `dart-sass` instead of Ruby `compass` (including installing compass's mixins via npm!) but then dart-sass converts all the glyphicon ASCII escape codes to Unicode literals and they show up garbled in Firefox
- I tried to use `node-sass` instead of dart-sass and it doesn't replace the ASCII escapes with literals, but then I get the encoding issue with the glyphicon in the header trail again! Back to square one!
- So that was a waste of five hours...
- I might just leave this tiny hack in `0_CGIAR/styles/_style.scss` to override this and be done with it:
.breadcrumb > li + li:before {
content: "/\00a0";
}
2020-04-27
- File an issue on DSpace Jira about the `mvn clean` task not removing the Mirage 2 target directory
- My changes to DSpace XMLUI Mirage 2 build process mean that we don't need Ruby gems at all anymore! We can completely build without them!
- Trying to test the `com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI` script but there is an error:
Exception: org.apache.solr.search.SyntaxError: Cannot parse 'cua_version:${cua.version.number}': Encountered " "}" "} "" at line 1, column 32.
Was expecting one of:
"TO" ...
<RANGE_QUOTED> ...
<RANGE_GOOP> ...
- Seems something is wrong with the variable interpolation, and I see two configurations in the `atmire-cua.cfg` file:
atmire-cua.cua.version.number=${cua.version.number}
atmire-cua.version.number=${cua.version.number}
- I sent a message to Atmire to check
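- A quick way to spot this kind of problem in a deployed config would be to scan for values that still contain a literal `${...}`; a minimal sketch, assuming simple `key=value` lines (the function name is hypothetical, and placeholders in an unfiltered source config would of course be normal):

```python
import re

PLACEHOLDER_RE = re.compile(r"\$\{[^}]+\}")


def unresolved_placeholders(config_text):
    """Return {key: [placeholders]} for values still containing ${...}."""
    unresolved = {}
    for line in config_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not key=value
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        found = PLACEHOLDER_RE.findall(value)
        if found:
            unresolved[key.strip()] = found
    return unresolved


sample = """
atmire-cua.cua.version.number=${cua.version.number}
atmire-cua.version.number=${cua.version.number}
"""
result = unresolved_placeholders(sample)
print(result)
```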
2020-04-28
- I did some work on DSpace 6 to modify our XMLUI theme to use Font Awesome icons via SVG + JavaScript instead of using web fonts
- The difference is about 105K less, plus two fewer network requests since we don't need the web fonts anymore
- Before:
  - `scripts/theme.js`: 654K
  - `styles/main.css`: 220K
  - `fa-brands-400.woff2`: 75K
  - `fa-solid-900.woff2`: 78K
  - Total: 1027K
- After:
  - `scripts/theme.js`: 704K
  - `styles/main.css`: 218K
  - Total: 922K
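- The arithmetic behind those totals, as a small check using the sizes listed above (in K):

```python
before = {
    "scripts/theme.js": 654,
    "styles/main.css": 220,
    "fa-brands-400.woff2": 75,
    "fa-solid-900.woff2": 78,
}
after = {
    "scripts/theme.js": 704,
    "styles/main.css": 218,
}

total_before = sum(before.values())
total_after = sum(after.values())
saved = total_before - total_after
fewer_requests = len(before) - len(after)
print(total_before, total_after, saved, fewer_requests)
```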
- I manually edited the CUA version variable and was then able to run the `com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI` script
- On the first run it took one hour to process 100,000 records on my local test instance...
- On the second run it took one hour to process 140,000 records
- On the third run it took one hour to process 150,000 records
2020-04-29
- I found out that running the Atmire CUA script with more memory and a larger number of records (-r 20000) makes it run faster
- Now the process finishes, but there are errors on some records:
Record uid: ee085cc0-0110-42c5-80b9-0fad4015ed9f couldn't be processed
com.atmire.statistics.util.update.atomic.ProcessingException: something went wrong while processing record uid: ee085cc0-0110-42c5-80b9-0fad4015ed9f, an error occured in the com.atmire.statistics.util.update.atomic.processor.ContainerOwnerDBProcessor
at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.applyProcessors(AtomicStatisticsUpdater.java:304)
at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.processRecords(AtomicStatisticsUpdater.java:176)
at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.performRun(AtomicStatisticsUpdater.java:161)
at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.update(AtomicStatisticsUpdater.java:128)
at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI.main(AtomicStatisticsUpdateCLI.java:78)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:229)
at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:81)
Caused by: java.lang.NullPointerException
at com.atmire.dspace.cua.CUADSOServiceImpl.findByLegacyID(CUADSOServiceImpl.java:40)
at com.atmire.statistics.util.update.atomic.processor.AtomicUpdateProcessor.getDSpaceObject(AtomicUpdateProcessor.java:49)
at com.atmire.statistics.util.update.atomic.processor.ContainerOwnerDBProcessor.process(ContainerOwnerDBProcessor.java:45)
at com.atmire.statistics.util.update.atomic.processor.ContainerOwnerDBProcessor.visit(ContainerOwnerDBProcessor.java:38)
at com.atmire.statistics.util.update.atomic.record.UsageRecord.accept(UsageRecord.java:23)
at com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdater.applyProcessors(AtomicStatisticsUpdater.java:301)
... 10 more
- I've sent a message to Atmire to ask for advice
- Also, now I can actually see the CUA statlets and usage statistics
- Unfortunately it seems they are using Font Awesome 4 in their CUA module and this means that some icons are broken because the names have changed, but also some of their code is using Unicode characters instead of classes in spans!
- I've reverted my sweet SVG work from yesterday and adjusted the Font Awesome 5 SCSS to add a few more icons that they are using
- Tezira said she was having issues submitting items on CGSpace today
- I looked at all the errors in the DSpace log and see a few SQL pooling errors around mid day:
$ grep ERROR dspace.log.2020-04-29 | cut -f 3- -d' ' | sort | uniq -c | sort -n
1 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL findByUnique Error -
1 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL find Error -
1 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL query singleTable Error -
1 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL QueryTable Error -
- I looked in Munin and I see that there is a strange spike in the database pool usage this afternoon:
- Looking at the past month it does seem to be something strange:
- Database connections do seem high:
$ psql -c 'select * from pg_stat_activity' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | sort | uniq -c
5 dspaceApi
6 dspaceCli
88 dspaceWeb
- Most of those are idle in transaction:
$ psql -c 'select * from pg_stat_activity' | grep 'dspaceWeb' | grep -c "idle in transaction"
67
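- The same breakdown could be done per pool in one pass instead of one grep per name; a minimal sketch, assuming the `application_name` and `state` columns of `pg_stat_activity` have been fetched as pairs (the sample rows are invented):

```python
from collections import Counter


def idle_in_transaction(rows):
    """Count 'idle in transaction' sessions per application name."""
    return Counter(app for app, state in rows if state == "idle in transaction")


# Hypothetical (application_name, state) rows
rows = [
    ("dspaceWeb", "idle in transaction"),
    ("dspaceWeb", "idle in transaction"),
    ("dspaceWeb", "active"),
    ("dspaceApi", "idle"),
    ("dspaceCli", "idle in transaction"),
]
counts = idle_in_transaction(rows)
print(counts)
```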
- I don't see anything in the PostgreSQL or Tomcat logs suggesting anything is wrong... I think the solution to clear these idle connections is probably to just restart Tomcat
- I looked at the Solr stats for this month and see lots of suspicious IPs:
$ curl -s 'http://localhost:8081/solr/statistics/select?q=*:*&fq=dateYearMonth:2020-04&rows=0&wt=json&indent=true&facet=true&facet.field=ip'
"88.99.115.53",23621, # Hetzner, using XMLUI and REST API with no user agent
"104.154.216.0",11865,# Google cloud, scraping XMLUI with no user agent
"104.198.96.245",4925,# Google cloud, using REST API with no user agent
"52.34.238.26",2907, # EcoSearch on XMLUI, user agent: EcoSearch (+https://search.ecointernet.org/)
- And a bunch more... ugh...
- 70.32.90.172: scraping REST API for IWMI/WLE pages with no user agent
- 2a01:7e00::f03c:91ff:fe16:fcb: Linode, REST API, no user agent
- 2607:f298:5:101d:f816:3eff:fed9:a484: DreamHost, XMLUI and REST API, python-requests/2.18.4
- 2a00:1768:2001:7a::20: Netherlands, XMLUI, trying SQL injections
- I need to start blocking requests without a user agent...
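- To pull the busiest IPs out of that facet response automatically, something like the following might work; it is only a sketch, assuming Solr's default flat JSON layout where values and counts alternate in one list, and the 5,000 threshold is arbitrary:

```python
import json


def top_facets(response_text, field, minimum):
    """Pair up Solr's flat [value, count, ...] facet list and filter by count."""
    data = json.loads(response_text)
    flat = data["facet_counts"]["facet_fields"][field]
    pairs = zip(flat[0::2], flat[1::2])
    return [(value, count) for value, count in pairs if count >= minimum]


# Trimmed response using the counts from the output above
sample = json.dumps({
    "facet_counts": {
        "facet_fields": {
            "ip": ["88.99.115.53", 23621, "104.154.216.0", 11865,
                   "104.198.96.245", 4925, "52.34.238.26", 2907]
        }
    }
})
busy = top_facets(sample, "ip", 5000)
print(busy)
```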
- I purged the hits from these IPs using my `check-spider-ip-hits.sh` script:
$ for year in {2010..2019}; do ./check-spider-ip-hits.sh -f /tmp/ips -s statistics-$year -p; done
$ ./check-spider-ip-hits.sh -f /tmp/ips -s statistics -p
- Then I added a few of them to the bot mapping in the nginx config because it appears they are regular harvesters since 2018
- Looking through the Solr stats faceted by the `userAgent` field I see some interesting ones:
$ curl 'http://localhost:8081/solr/statistics/select?q=*%3A*&rows=0&wt=json&indent=true&facet=true&facet.field=userAgent'
...
"Delphi 2009",50725,
"OgScrper/1.0.0",12421,
- Delphi is only used by IP addresses in Greece, so that's obviously the GARDIAN people harvesting us...
- I have no idea what OgScrper is, but it's not a user!
- Then there are 276,000 hits from `MEL-API` from Jordanian IPs in 2018, so that's obviously CodeObia guys...
- Other user agents:
- GigablastOpenSource/1
- Owlin Domain Resolver V1
- API scraper
- MetaURI
- I don't know why, but my `check-spider-hits.sh` script doesn't seem to be handling the user agents with spaces properly so I will delete those manually after
- First delete the ones without spaces, creating a temp file in `/tmp/agents` containing the patterns:
$ for year in {2010..2019}; do ./check-spider-hits.sh -f /tmp/agents -s statistics-$year -p; done
$ ./check-spider-hits.sh -f /tmp/agents -s statistics -p
- That's about 300,000 hits purged...
- Then remove the ones with spaces manually, checking the query syntax first, then deleting in yearly cores and the statistics core:
$ curl -s "http://localhost:8081/solr/statistics/select" -d "q=userAgent:/Delphi 2009/&rows=0"
...
<lst name="responseHeader"><int name="status">0</int><int name="QTime">52</int><lst name="params"><str name="q">userAgent:/Delphi 2009/</str><str name="rows">0</str></lst></lst><result name="response" numFound="38760" start="0"></result>
$ for year in {2010..2019}; do curl -s "http://localhost:8081/solr/statistics-$year/update?softCommit=true" -H "Content-Type: text/xml" --data-binary '<delete><query>userAgent:"Delphi 2009"</query></delete>'; done
$ curl -s "http://localhost:8081/solr/statistics/update?softCommit=true" -H "Content-Type: text/xml" --data-binary '<delete><query>userAgent:"Delphi 2009"</query></delete>'
- Quoting them works for now until I can look into it and handle it properly in the script
- This was about 400,000 hits in total purged from the Solr statistics
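- Until the script handles spaces, the quoted delete payload could be generated per agent; a minimal sketch that only builds the XML body (the function name is hypothetical, and it assumes a quoted phrase on `userAgent` matches the way it did in the manual commands above):

```python
from xml.sax.saxutils import escape


def delete_by_user_agent(agent):
    """Build a Solr delete-by-query body matching a quoted user agent phrase."""
    # Escape backslashes and double quotes so the phrase stays one quoted term
    phrase = agent.replace("\\", "\\\\").replace('"', '\\"')
    query = f'userAgent:"{phrase}"'
    # Escape &, < and > so the query is safe inside the XML element
    return f"<delete><query>{escape(query)}</query></delete>"


body = delete_by_user_agent("Delphi 2009")
print(body)
```

- The body would then be sent with `--data-binary` to each core's `update` handler, the same as in the curl commands above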
2020-04-30
- The TLS certificates on DSpace Test (linode26) have not been renewing correctly
- The log shows the message "No renewals were attempted"
- The `certbot-auto certificates` command doesn't list the certificate I have installed
- I guess this is because I copied it from the previous server...
- I moved the Let's Encrypt directory, got a new cert, then revoked the old one:
# mv /etc/letsencrypt /etc/letsencrypt.bak
# /opt/certbot-auto certonly --standalone --email fu@m.com -d dspacetest.cgiar.org --pre-hook "/bin/systemctl stop nginx" --post-hook "/bin/systemctl start nginx"
# /opt/certbot-auto revoke --cert-path /etc/letsencrypt.bak/live/dspacetest.cgiar.org/cert.pem
# rm -rf /etc/letsencrypt.bak
- Run all system updates on DSpace Test and reboot it
- Tezira is still having issues submitting to CGSpace and the database is definitely busy according to Munin:
- But I don't see a lot of connections in PostgreSQL itself:
$ psql -c 'select * from pg_stat_activity' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | sort | uniq -c
5 dspaceApi
6 dspaceCli
14 dspaceWeb
$ psql -c 'select * from pg_stat_activity' | wc -l
30
- Tezira said she cleared her browser cache and then was able to submit again
- She said once she logged back in she had very many "Untitled" submissions pending
- I see that the database connections are indeed much lower now:
- The PostgreSQL log shows a lot of errors about deadlocks and queries waiting on other processes...
ERROR: deadlock detected