+++
date = "2016-10-03T15:53:00+03:00"
author = "Alan Orth"
title = "October, 2016"
tags = ["Notes"]

+++
## 2016-10-03

- Testing adding [ORCIDs to a CSV](https://wiki.duraspace.org/display/DSDOC5x/ORCID+Integration#ORCIDIntegration-EditingexistingitemsusingBatchCSVEditing) file for a single item to see if the author order gets messed up
- Need to test the following scenarios to see how author order is affected:
  - ORCIDs only
  - ORCIDs plus normal authors
- I exported a random item's metadata as CSV, deleted *all columns* except id and collection, and made a new column called `ORCID:dc.contributor.author` with the following random ORCIDs from the ORCID registry:

```
0000-0002-6115-0956||0000-0002-3812-8793||0000-0001-7462-405X
```
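For illustration, a test CSV like the one described above could be generated with a few lines of Python. This is only a sketch: the function name `build_orcid_csv`, the `blank_author_column` switch, and the `id` and `collection` values are placeholders of mine, not values from the real item.

```python
import csv
import io

# ORCID iDs from the notes above; multiple values are joined with "||"
ORCIDS = [
    "0000-0002-6115-0956",
    "0000-0002-3812-8793",
    "0000-0001-7462-405X",
]


def build_orcid_csv(item_id, collection, orcids, blank_author_column=False):
    """Return CSV text containing a header and one row for a single item."""
    fieldnames = ["id", "collection", "ORCID:dc.contributor.author"]
    row = {
        "id": item_id,
        "collection": collection,
        "ORCID:dc.contributor.author": "||".join(orcids),
    }
    if blank_author_column:
        # Optionally add an empty dc.contributor.author column as a test scenario
        fieldnames.append("dc.contributor.author")
        row["dc.contributor.author"] = ""

    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerow(row)
    return buf.getvalue()


if __name__ == "__main__":
    # "1234" and "10568/XXXXX" are hypothetical placeholders, not a real item
    print(build_orcid_csv("1234", "10568/XXXXX", ORCIDS), end="")
```

Calling it once with `blank_author_column=False` and once with `True` gives two files that differ only in the presence of an empty `dc.contributor.author` column.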

- Hmm, with the `dc.contributor.author` column removed, DSpace doesn't detect any changes
- With a blank `dc.contributor.author` column, DSpace wants to remove all non-ORCID authors and add the new ORCID authors
- I added the [disclaimer text](https://github.com/ilri/DSpace/issues/234) to the About page, then added a footer link to the disclaimer's ID, but there is a Bootstrap issue that causes the page content to disappear when using in-page anchors: https://github.com/twbs/bootstrap/issues/1768

![Bootstrap issue with in-page anchors](2016/10/bootstrap-issue.png)

- Looks like we'll just have to add the text to the About page (without a link) or add a separate page

## 2016-10-04

- Start testing cleanups of authors that Peter sent last week
- Out of 40,000+ rows, Peter had indicated corrections for ~3,200 of them, which is too many to look through carefully, so I did some basic quality checking:
  - Trim leading/trailing whitespace
  - Find invalid characters
  - Cluster values to merge obvious authors
- That left us with 3,180 valid corrections and 3 deletions:

```
$ ./fix-metadata-values.py -i authors-fix-3180.csv -f dc.contributor.author -t correct -m 3 -d dspacetest -u dspacetest -p fuuu
$ ./delete-metadata-values.py -i authors-delete-3.csv -f dc.contributor.author -m 3 -d dspacetest -u dspacetest -p fuuu
```
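The notes don't say which tool was used for the quality checks listed above, so the following is only a sketch of what they could look like in Python. The helper names are mine, the definition of "invalid" (Unicode control/format/unassigned characters and U+FFFD) is an assumption, and the clustering uses a simple key-collision fingerprint, which is one common approach rather than necessarily the one applied here.

```python
import re
import unicodedata
from collections import defaultdict


def trim(value):
    """Strip leading and trailing whitespace."""
    return value.strip()


def has_invalid_chars(value):
    """True if the value contains characters in a Unicode "C*" category
    (control, format, unassigned, etc.) or the replacement character.
    This definition of "invalid" is an assumption for the sketch."""
    return any(
        unicodedata.category(ch).startswith("C") or ch == "\ufffd"
        for ch in value
    )


def fingerprint(value):
    """Key-collision fingerprint: drop accents and punctuation, lowercase,
    then sort the unique tokens."""
    decomposed = unicodedata.normalize("NFKD", value)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    tokens = re.findall(r"\w+", no_accents.lower())
    return " ".join(sorted(set(tokens)))


def cluster(values):
    """Group values with the same fingerprint; keep only groups that
    contain more than one distinct spelling."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return [sorted(group) for group in groups.values() if len(group) > 1]
```

For example, `cluster(["Doe, Jane", "Doe, Jane.", "Smith, John"])` groups the first two spellings together, which is the kind of obvious merge candidate worth reviewing by hand.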

- Remove old about page ([#284](https://github.com/ilri/DSpace/pull/284))
- CGSpace crashed a few times today
- Generate list of unique authors in CCAFS collections:

```
dspacetest=# \copy (select distinct text_value, count(*) as count from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 AND resource_id IN (select item_id from collection2item where collection_id IN (select resource_id from handle where handle in ('10568/32729', '10568/5472', '10568/5473', '10568/10288', '10568/70974', '10568/3547', '10568/3549', '10568/3531','10568/16890','10568/5470','10568/3546', '10568/36024', '10568/66581', '10568/21789', '10568/5469', '10568/5468', '10568/3548', '10568/71053', '10568/25167'))) group by text_value order by count desc) to /tmp/ccafs-authors.csv with csv;
```
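Because the export above uses `with csv` and no `header` option, the resulting file has two unnamed columns: the author string and its count. A minimal sketch for inspecting the most frequent authors could look like this; the function name `top_authors` and the `limit` parameter are my own, not part of any existing script.

```python
import csv


def top_authors(csv_path, limit=10):
    """Return the `limit` most frequent (author, count) pairs from a
    headerless two-column CSV such as the one exported above."""
    rows = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for text_value, count in csv.reader(handle):
            rows.append((text_value, int(count)))
    # The query already orders by count, but sort again to be safe
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]


if __name__ == "__main__":
    import os

    path = "/tmp/ccafs-authors.csv"  # path taken from the \copy command above
    if os.path.exists(path):
        for author, count in top_authors(path):
            print(f"{count:6d}  {author}")
```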

## 2016-10-05

- Work on more infrastructure cleanups for Ansible DSpace role
- Clean up Let's Encrypt plumbing and submit pull request for rmg-ansible-public ([#60](https://github.com/ilri/rmg-ansible-public/pull/60))

## 2016-10-06

- Nice! DSpace Test (linode02) is now having `java.lang.OutOfMemoryError: Java heap space` errors...
- Heap space is 2048m, and we have 5GB of RAM being used for OS cache (Solr!) so let's just bump the heap to 3072m
- Magdalena from CCAFS asked why the colors in the thumbnails for these [two](https://cgspace.cgiar.org/handle/10568/71249) [items](https://cgspace.cgiar.org/handle/10568/71259) look different, even though they are the same in the PDF itself

![CMYK vs sRGB colors](2016/10/cmyk-vs-srgb.jpg)

- Turns out the first PDF was exported from InDesign using CMYK and the second one was using sRGB
- Run all system updates on DSpace Test and reboot it

## 2016-10-08

- Re-deploy CGSpace with latest changes from late September and early October
- Run fixes for ILRI subjects and delete blank metadata values:

```
dspace=# delete from metadatavalue where resource_type_id=2 and text_value='';
DELETE 11
```

- Run all system updates and reboot CGSpace
- Delete ten gigs of old 2015 Tomcat logs that never got rotated (WTF?):

```
root@linode01:~# ls -lh /var/log/tomcat7/localhost_access_log.2015* | wc -l
47
```
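Before deleting old logs like these, it can help to see how many files match and how much space they use. The sketch below does that with a glob pattern; the function name `summarize_logs` is hypothetical, and the pattern in the usage example is the one from the `ls` command above.

```python
import glob
import os


def summarize_logs(pattern):
    """Return (file_count, total_bytes) for regular files matching a glob."""
    paths = [p for p in glob.glob(pattern) if os.path.isfile(p)]
    total = sum(os.path.getsize(p) for p in paths)
    return len(paths), total


if __name__ == "__main__":
    count, total = summarize_logs("/var/log/tomcat7/localhost_access_log.2015*")
    print(f"{count} files, {total / 1024**3:.1f} GiB")
```

The function only reports; actually removing the files is left as a separate, deliberate step.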

- Delete 2GB `cron-filter-media.log` file, as it is just a log from a cron job and it doesn't get rotated like normal log files (it has been growing for almost a year now, maybe)

## 2016-10-14

- Run all system updates on DSpace Test and reboot server
- Looking into some issues with Discovery filters in Atmire's content and usage analysis module after adjusting the filter class
- Looks like changing the filters from `configuration.DiscoverySearchFilterFacet` to `configuration.DiscoverySearchFilter` breaks them in the Atmire CUA module