cgspace-notes/content/2016-07.md

127 lines
4.4 KiB
Markdown
Raw Normal View History

2016-07-01 15:00:04 +02:00
+++
date = "2016-07-01T10:53:00+03:00"
author = "Alan Orth"
title = "July, 2016"
tags = ["notes"]
image = "../images/bg.jpg"
+++
## 2016-07-01
- Add `dc.description.sponsorship` to Discovery sidebar facets and make investors clickable in item view ([#232](https://github.com/ilri/DSpace/issues/232))
2016-07-02 16:04:52 +02:00
- I think this query should find and replace all authors that have "," at the end of their names:
2016-07-01 15:56:54 +02:00
```
dspacetest=# update metadatavalue set text_value = regexp_replace(text_value, '(^.+?),$', '\1') where metadata_field_id=3 and text_value ~ '^.+?,$';
UPDATE 95
dspacetest=# select text_value from metadatavalue where metadata_field_id=3 and text_value ~ '^.+?,$';
text_value
------------
(0 rows)
```
2016-07-02 16:04:52 +02:00
- In this case the select query was showing 95 results before the update
## 2016-07-02
- Comment on DSpace Jira ticket about author lookup search text ([DS-2329](https://jira.duraspace.org/browse/DS-2329))
2016-07-05 18:07:25 +02:00
## 2016-07-04
- Seems the database's author authority values mean nothing without the `authority` Solr core from the host where they were created!
## 2016-07-05
- Amend `backup-solr.sh` script so it backs up the entire Solr folder
- We *really* only need `statistics` and `authority` but meh
- Fix metadata for species on DSpace Test:
```
$ ./fix-metadata-values.py -i /tmp/Species-Peter-Fix.csv -f dc.Species -t CORRECT -m 94 -d dspacetest -u dspacetest -p 'fuuu'
```
- Will run later on CGSpace
- A user is still having problems with Sherpa/Romeo causing crashes during the submission process when the journal is "ungraded"
- I tested the [patch for DS-2740](https://jira.duraspace.org/browse/DS-2740) that I had found last month and it seems to work
- I will merge it to `5_x-prod`
2016-07-06 17:49:52 +02:00
## 2016-07-06
- Delete 23 blank metadata values from CGSpace:
```
cgspace=# delete from metadatavalue where resource_type_id=2 and text_value='';
DELETE 23
```
- Complete phase three of metadata migration, for the following fields:
- dc.title.jtitle → dc.source
- dc.crsubject.crpsubject → cg.contributor.crp
- dc.contributor.affiliation → cg.contributor.affiliation
- dc.Species → cg.species
- dc.srplace.subregion → cg.coverage.subregion
- dc.contributor.corporate → dc.contributor.author
- dc.identifier.url → cg.identifier.url
- dc.identifier.doi → cg.identifier.doi
- dc.identifier.googleurl → cg.identifier.googleurl
- dc.identifier.dataurl → cg.identifier.dataurl
- Also, run fixes and deletes for species and author affiliations (over 1000 corrections!)
```
$ ./fix-metadata-values.py -i Species-Peter-Fix.csv -f dc.Species -t CORRECT -m 212 -d dspace -u dspace -p 'fuuu'
2016-07-07 08:21:12 +02:00
$ ./fix-metadata-values.py -i Affiliations-Fix-1045-Peter-Abenet.csv -f dc.contributor.affiliation -t Correct -m 211 -d dspace -u dspace -p 'fuuu'
$ ./delete-metadata-values.py -f dc.contributor.affiliation -i Affiliations-Delete-Peter-Abenet.csv -m 211 -u dspace -d dspace -p 'fuuu'
2016-07-06 17:49:52 +02:00
```
- I then ran all server updates and rebooted the server
2016-07-11 22:10:00 +02:00
## 2016-07-11
- Doing some author cleanups from Peter and Abenet:
```
$ ./fix-metadata-values.py -i /tmp/Authors-Fix-205-UTF8.csv -f dc.contributor.author -t correct -m 3 -d dspacetest -u dspacetest -p fuuu
$ ./delete-metadata-values.py -f dc.contributor.author -i /tmp/Authors-Delete-UTF8.csv -m 3 -u dspacetest -d dspacetest -p fuuu
```
2016-07-13 15:05:33 +02:00
## 2016-07-13
- Run the author cleanups on CGSpace and start a full Discovery re-index
2016-07-18 16:25:30 +02:00
## 2016-07-18
- Adjust identifiers in XMLUI item display to be more prominent
- Add species and breed to the XMLUI item display
2016-07-19 13:38:30 +02:00
- CGSpace crashed late at night and the DSpace logs were showing:
```
2016-07-18 20:26:30,941 ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL connection Error -
org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object
...
```
- I suspect it's someone hitting REST too much:
```
# awk '{print $1}' /var/log/nginx/rest.log | sort -n | uniq -c | sort -h | tail -n 3
710 66.249.78.38
1781 181.118.144.29
24904 70.32.99.142
```
- I just blocked access to `/rest` for that last IP for now:
```
# log rest requests
location /rest {
access_log /var/log/nginx/rest.log;
proxy_pass http://127.0.0.1:8443;
deny 70.32.99.142;
}
```
2016-07-21 17:06:56 +02:00
## 2016-07-21
- Mitigate the [HTTPoxy](https://httpoxy.org) vulnerability for Tomcat etc in nginx: https://github.com/ilri/rmg-ansible-public/pull/38
- Unblock 70.32.99.142 from `/rest` as it has been blocked for a few days