mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2024-11-18 04:37:04 +01:00
212 lines
7.4 KiB
Markdown
212 lines
7.4 KiB
Markdown
+++
|
|
date = "2016-07-01T10:53:00+03:00"
|
|
author = "Alan Orth"
|
|
title = "July, 2016"
|
|
tags = ["notes"]
|
|
image = "../images/bg.jpg"
|
|
|
|
+++
|
|
## 2016-07-01
|
|
|
|
- Add `dc.description.sponsorship` to Discovery sidebar facets and make investors clickable in item view ([#232](https://github.com/ilri/DSpace/issues/232))
|
|
- I think this query should find and replace all authors that have "," at the end of their names:
|
|
|
|
```
|
|
dspacetest=# update metadatavalue set text_value = regexp_replace(text_value, '(^.+?),$', '\1') where metadata_field_id=3 and resource_type_id=2 and text_value ~ '^.+?,$';
|
|
UPDATE 95
|
|
dspacetest=# select text_value from metadatavalue where metadata_field_id=3 and resource_type_id=2 and text_value ~ '^.+?,$';
|
|
text_value
|
|
------------
|
|
(0 rows)
|
|
```
|
|
|
|
- In this case the select query was showing 95 results before the update
|
|
|
|
## 2016-07-02
|
|
|
|
- Comment on DSpace Jira ticket about author lookup search text ([DS-2329](https://jira.duraspace.org/browse/DS-2329))
|
|
|
|
## 2016-07-04
|
|
|
|
- Seems the database's author authority values mean nothing without the `authority` Solr core from the host where they were created!
|
|
|
|
## 2016-07-05
|
|
|
|
- Amend `backup-solr.sh` script so it backs up the entire Solr folder
|
|
- We *really* only need `statistics` and `authority` but meh
|
|
- Fix metadata for species on DSpace Test:
|
|
|
|
```
|
|
$ ./fix-metadata-values.py -i /tmp/Species-Peter-Fix.csv -f dc.Species -t CORRECT -m 94 -d dspacetest -u dspacetest -p 'fuuu'
|
|
```
|
|
|
|
- Will run later on CGSpace
|
|
- A user is still having problems with Sherpa/Romeo causing crashes during the submission process when the journal is "ungraded"
|
|
- I tested the [patch for DS-2740](https://jira.duraspace.org/browse/DS-2740) that I had found last month and it seems to work
|
|
- I will merge it to `5_x-prod`
|
|
|
|
## 2016-07-06
|
|
|
|
- Delete 23 blank metadata values from CGSpace:
|
|
|
|
```
|
|
cgspace=# delete from metadatavalue where resource_type_id=2 and text_value='';
|
|
DELETE 23
|
|
```
|
|
|
|
- Complete phase three of metadata migration, for the following fields:
|
|
- dc.title.jtitle → dc.source
|
|
- dc.crsubject.crpsubject → cg.contributor.crp
|
|
- dc.contributor.affiliation → cg.contributor.affiliation
|
|
- dc.Species → cg.species
|
|
- dc.srplace.subregion → cg.coverage.subregion
|
|
- dc.contributor.corporate → dc.contributor.author
|
|
- dc.identifier.url → cg.identifier.url
|
|
- dc.identifier.doi → cg.identifier.doi
|
|
- dc.identifier.googleurl → cg.identifier.googleurl
|
|
- dc.identifier.dataurl → cg.identifier.dataurl
|
|
- Also, run fixes and deletes for species and author affiliations (over 1000 corrections!)
|
|
|
|
```
|
|
$ ./fix-metadata-values.py -i Species-Peter-Fix.csv -f dc.Species -t CORRECT -m 212 -d dspace -u dspace -p 'fuuu'
|
|
$ ./fix-metadata-values.py -i Affiliations-Fix-1045-Peter-Abenet.csv -f dc.contributor.affiliation -t Correct -m 211 -d dspace -u dspace -p 'fuuu'
|
|
$ ./delete-metadata-values.py -f dc.contributor.affiliation -i Affiliations-Delete-Peter-Abenet.csv -m 211 -u dspace -d dspace -p 'fuuu'
|
|
```
|
|
|
|
- I then ran all server updates and rebooted the server
|
|
|
|
## 2016-07-11
|
|
|
|
- Doing some author cleanups from Peter and Abenet:
|
|
|
|
```
|
|
$ ./fix-metadata-values.py -i /tmp/Authors-Fix-205-UTF8.csv -f dc.contributor.author -t correct -m 3 -d dspacetest -u dspacetest -p fuuu
|
|
$ ./delete-metadata-values.py -f dc.contributor.author -i /tmp/Authors-Delete-UTF8.csv -m 3 -u dspacetest -d dspacetest -p fuuu
|
|
```
|
|
|
|
## 2016-07-13
|
|
|
|
- Run the author cleanups on CGSpace and start a full Discovery re-index
|
|
|
|
## 2016-07-14
|
|
|
|
- Test LDAP settings for new root LDAP
|
|
- Seems to work when binding as a top-level user
|
|
|
|
## 2016-07-18
|
|
|
|
- Adjust identifiers in XMLUI item display to be more prominent
|
|
- Add species and breed to the XMLUI item display
|
|
- CGSpace crashed late at night and the DSpace logs were showing:
|
|
|
|
```
|
|
2016-07-18 20:26:30,941 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL connection Error -
|
|
org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object
|
|
...
|
|
```
|
|
|
|
- I suspect it's someone hitting REST too much:
|
|
|
|
```
|
|
# awk '{print $1}' /var/log/nginx/rest.log | sort -n | uniq -c | sort -h | tail -n 3
|
|
710 66.249.78.38
|
|
1781 181.118.144.29
|
|
24904 70.32.99.142
|
|
```
|
|
|
|
- I just blocked access to `/rest` for that last IP for now:
|
|
|
|
```
|
|
# log rest requests
|
|
location /rest {
|
|
access_log /var/log/nginx/rest.log;
|
|
proxy_pass http://127.0.0.1:8443;
|
|
deny 70.32.99.142;
|
|
}
|
|
```
|
|
|
|
## 2016-07-21
|
|
|
|
- Mitigate the [HTTPoxy](https://httpoxy.org) vulnerability for Tomcat etc in nginx: https://github.com/ilri/rmg-ansible-public/pull/38
|
|
- Unblock 70.32.99.142 from `/rest` as it has been blocked for a few days
|
|
|
|
## 2016-07-22
|
|
|
|
- Help Paola from CCAFS with thumbnails for batch uploads
|
|
- She has been struggling to get the dimensions right, and manually enlarging smaller thumbnails, renaming PNGs to JPG, etc
|
|
- Altmetric reports having an issue with some of our authors being doubled...
|
|
- This is related to authority and confidence!
|
|
- We might need to use `index.authority.ignore-prefered=true` to tell the Discovery index to prefer the variation that exists in the metadatavalue rather than what it finds in the authority cache.
|
|
- Trying these on DSpace Test after a discussion by Daniel Scharon on the dspace-tech mailing list:
|
|
|
|
```
|
|
index.authority.ignore-prefered.dc.contributor.author=true
|
|
index.authority.ignore-variants.dc.contributor.author=false
|
|
```
|
|
|
|
- After reindexing I don't see any change in Discovery's display of authors, and still have entries like:
|
|
|
|
```
|
|
Grace, D. (464)
|
|
Grace, D. (62)
|
|
```
|
|
|
|
- I asked for clarification of the following options on the DSpace mailing list:
|
|
|
|
```
|
|
index.authority.ignore
|
|
index.authority.ignore-prefered
|
|
index.authority.ignore-variants
|
|
```
|
|
|
|
- In the mean time, I will try these on DSpace Test (plus a reindex):
|
|
|
|
```
|
|
index.authority.ignore=true
|
|
index.authority.ignore-prefered=true
|
|
index.authority.ignore-variants=true
|
|
```
|
|
|
|
- Enabled usage of `X-Forwarded-For` in DSpace admin control panel ([#255](https://github.com/ilri/DSpace/pull/255)
|
|
- It was misconfigured and disabled, but already working for some reason *sigh*
|
|
- ... no luck. Trying with just:
|
|
|
|
```
|
|
index.authority.ignore=true
|
|
```
|
|
|
|
- After re-indexing and clearing the XMLUI cache nothing has changed
|
|
|
|
## 2016-07-25
|
|
|
|
- Trying a few more settings (plus reindex) for Discovery on DSpace Test:
|
|
|
|
```
|
|
index.authority.ignore-prefered.dc.contributor.author=true
|
|
index.authority.ignore-variants=true
|
|
```
|
|
|
|
- Run all OS updates and reboot DSpace Test server
|
|
- No changes to Discovery after reindexing... hmm.
|
|
- Integrate and massively clean up About page ([#256](https://github.com/ilri/DSpace/pull/256))
|
|
|
|
![About page](../images/2016/07/cgspace-about-page.png)
|
|
|
|
- The DSpace source code mentions the configuration key `discovery.index.authority.ignore-prefered.*` (with prefix of discovery, despite the docs saying otherwise), so I'm trying the following on DSpace Test:
|
|
|
|
```
|
|
discovery.index.authority.ignore-prefered.dc.contributor.author=true
|
|
discovery.index.authority.ignore-variants=true
|
|
```
|
|
|
|
- Still no change!
|
|
- Deploy species, breed, and identifier changes to CGSpace, as well as About page
|
|
- Run Linode RAM upgrade (8→12GB)
|
|
- Re-sync DSpace Test with CGSpace
|
|
- I noticed that our backup scripts don't send Solr cores to S3 so I amended the script
|
|
|
|
## 2016-07-31
|
|
|
|
- Work on removing Dryland Systems and Humidtropics subjects from Discovery sidebar and Browse by
|
|
- Also change "Subjects" to "AGROVOC keywords" in Discovery sidebar/search and Browse by ([#257](https://github.com/ilri/DSpace/issues/257))
|