+++
date = "2016-10-03T15:53:00+03:00"
author = "Alan Orth"
title = "October, 2016"
tags = ["Notes"]
+++
## 2016-10-03
- Testing adding [ORCIDs to a CSV](https://wiki.duraspace.org/display/DSDOC5x/ORCID+Integration#ORCIDIntegration-EditingexistingitemsusingBatchCSVEditing) file for a single item to see if the author orders get messed up
- Need to test the following scenarios to see how author order is affected:
  - ORCIDs only
  - ORCIDs plus normal authors
- I exported a random item's metadata as CSV, deleted *all columns* except id and collection, and made a new column called `ORCID:dc.contributor.author` with the following random ORCIDs from the ORCID registry:
```
0000-0002-6115-0956||0000-0002-3812-8793||0000-0001-7462-405X
```
- Hmm, with the `dc.contributor.author` column removed, DSpace doesn't detect any changes
- With a blank `dc.contributor.author` column, DSpace wants to remove all non-ORCID authors and add the new ORCID authors
- I added the [disclaimer text](https://github.com/ilri/DSpace/issues/234) to the About page, then added a footer link to the disclaimer's ID, but there is a Bootstrap issue that causes the page content to disappear when using in-page anchors: https://github.com/twbs/bootstrap/issues/1768
![Bootstrap issue with in-page anchors](2016/10/bootstrap-issue.png)
- Looks like we'll just have to add the text to the About page (without a link) or add a separate page
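
For repeating the ORCID test above, the CSV could be generated with a few lines of Python. This is only a sketch: the item id and collection handle are hypothetical placeholders (the real ones come from the DSpace export), and the `blank_author_column` switch just reproduces the two cases tried above, with the `dc.contributor.author` column either missing or present but blank.

```python
import csv
import io

# Hypothetical placeholders; the real values come from the exported item's CSV.
ITEM_ID = "12345"
COLLECTION = "10568/1"

# The three ORCIDs used in the test above.
ORCIDS = [
    "0000-0002-6115-0956",
    "0000-0002-3812-8793",
    "0000-0001-7462-405X",
]


def build_orcid_csv(item_id, collection, orcids, blank_author_column=False):
    """Return CSV text with a single row whose ORCIDs are joined by '||'."""
    fieldnames = ["id", "collection", "ORCID:dc.contributor.author"]
    row = {
        "id": item_id,
        "collection": collection,
        "ORCID:dc.contributor.author": "||".join(orcids),
    }
    if blank_author_column:
        # Second scenario: keep the normal author column, but leave it empty.
        fieldnames.append("dc.contributor.author")
        row["dc.contributor.author"] = ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerow(row)
    return buffer.getvalue()


if __name__ == "__main__":
    print(build_orcid_csv(ITEM_ID, COLLECTION, ORCIDS, blank_author_column=True))
```

The output still has to go through DSpace's batch metadata import to see what it does with the author order.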
## 2016-10-04
- Start testing cleanups of authors that Peter sent last week
- Out of 40,000+ rows, Peter had indicated corrections for ~3,200 of them, which is too many to look through carefully, so I did some basic quality checking:
  - Trim leading/trailing whitespace
  - Find invalid characters
  - Cluster values to merge obvious authors
- That left us with 3,180 valid corrections and 3 deletions:
```
$ ./fix-metadata-values.py -i authors-fix-3180.csv -f dc.contributor.author -t correct -m 3 -d dspacetest -u dspacetest -p fuuu
$ ./delete-metadata-values.py -i authors-delete-3.csv -f dc.contributor.author -m 3 -d dspacetest -u dspacetest -p fuuu
```
- Remove old about page ([#284](https://github.com/ilri/DSpace/pull/284))
- CGSpace crashed a few times today
- Generate list of unique authors in CCAFS collections:
```
dspacetest=# \copy (select distinct text_value, count(*) as count from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 AND resource_id IN (select item_id from collection2item where collection_id IN (select resource_id from handle where handle in ('10568/32729', '10568/5472', '10568/5473', '10568/10288', '10568/70974', '10568/3547', '10568/3549', '10568/3531','10568/16890','10568/5470','10568/3546', '10568/36024', '10568/66581', '10568/21789', '10568/5469', '10568/5468', '10568/3548', '10568/71053', '10568/25167'))) group by text_value order by count desc) to /tmp/ccafs-authors.csv with csv;
```
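
The first two quality checks on Peter's corrections (trimming whitespace and finding invalid characters) can be sketched in Python. The column names `dc.contributor.author` and `correct` are assumptions inferred from the `-f` and `-t` flags passed to `fix-metadata-values.py` above, and "invalid" here only means control characters or the Unicode replacement character, which is a guess at a reasonable definition. The clustering step is not attempted.

```python
import csv
import io
import unicodedata


def check_corrections(rows, source_col="dc.contributor.author", target_col="correct"):
    """Trim whitespace in every cell and split rows into (clean, suspicious)."""
    clean, suspicious = [], []
    for row in rows:
        # Strip leading/trailing whitespace from every value in the row.
        fixed = {key: (value or "").strip() for key, value in row.items()}
        values = (fixed.get(source_col, ""), fixed.get(target_col, ""))
        # Assumed definition of "invalid": control characters (category Cc)
        # or U+FFFD, which usually indicates an earlier encoding problem.
        has_invalid = any(
            unicodedata.category(ch) == "Cc" or ch == "\ufffd"
            for value in values
            for ch in value
        )
        (suspicious if has_invalid else clean).append(fixed)
    return clean, suspicious


if __name__ == "__main__":
    sample = 'dc.contributor.author,correct\n" Orth, A. ","Orth, Alan"\n'
    clean, suspicious = check_corrections(csv.DictReader(io.StringIO(sample)))
    print(len(clean), "clean rows,", len(suspicious), "suspicious rows")
```

Rows that land in `suspicious` would still need a manual look before going into the corrections file.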
## 2016-10-05
- Work on more infrastructure cleanups for Ansible DSpace role
- Clean up Let's Encrypt plumbing and submit pull request for rmg-ansible-public ([#60](https://github.com/ilri/rmg-ansible-public/pull/60))
## 2016-10-06
- Nice! DSpace Test (linode02) is now having `java.lang.OutOfMemoryError: Java heap space` errors...
- Heap space is 2048m, and we have 5GB of RAM being used for OS cache (Solr!) so let's just bump the memory to 3072m
- Magdalena from CCAFS asked why the colors in the thumbnails for these [two](https://cgspace.cgiar.org/handle/10568/71249) [items](https://cgspace.cgiar.org/handle/10568/71259) look different, even though they are the same in the PDF itself
![CMYK vs sRGB colors](2016/10/cmyk-vs-srgb.jpg)
- Turns out the first PDF was exported from InDesign using CMYK and the second one was using sRGB
- Run all system updates on DSpace Test and reboot it
## 2016-10-08
- Re-deploy CGSpace with latest changes from late September and early October
- Run fixes for ILRI subjects and delete blank metadata values:
```
dspace=# delete from metadatavalue where resource_type_id=2 and text_value='';
DELETE 11
```
- Run all system updates and reboot CGSpace
- Delete ten gigs of old 2015 Tomcat logs that never got rotated (WTF?):
```
root@linode01:~# ls -lh /var/log/tomcat7/localhost_access_log.2015* | wc -l
47
```
- Delete the 2GB `cron-filter-media.log` file, as it is just the output of a cron job and does not get rotated like normal log files (it has probably been growing for almost a year)
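
To see how much space a set of unrotated logs takes up before deleting them, something like the following would work. It is a minimal sketch; `summarize_logs` is a hypothetical helper, and the glob pattern is the same one used in the `ls` command above.

```python
import glob
import os


def summarize_logs(pattern):
    """Return (file_count, total_bytes) for regular files matching a glob pattern."""
    paths = [p for p in glob.glob(pattern) if os.path.isfile(p)]
    return len(paths), sum(os.path.getsize(p) for p in paths)


if __name__ == "__main__":
    count, total = summarize_logs("/var/log/tomcat7/localhost_access_log.2015*")
    print(f"{count} files, {total / 1024 ** 3:.1f} GiB")
```

This only reports the count and size; the deletion itself is left as a separate, deliberate step.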
## 2016-10-14
- Run all system updates on DSpace Test and reboot server
- Looking into some issues with Discovery filters in Atmire's content and usage analysis module after adjusting the filter class
- Looks like changing the filters from `configuration.DiscoverySearchFilterFacet` to `configuration.DiscoverySearchFilter` breaks them in Atmire CUA module
## 2016-10-17
- A bit more cleanup on the CCAFS authors, and run the corrections on DSpace Test:
```
$ ./fix-metadata-values.py -i ccafs-authors-oct-16.csv -f dc.contributor.author -t 'correct name' -m 3 -d dspace -u dspace -p fuuu
```
- One observation is that there are still some old versions of names in the author lookup because authors appear in other communities (as we only corrected authors from CCAFS for this round)
## 2016-10-18
- Start working on DSpace 5.5 porting work again:
```
$ git checkout -b 5_x-55 5_x-prod
$ git rebase -i dspace-5.5
```
- Have to fix about ten merge conflicts, mostly in the SCSS for the CGIAR theme
- Skip 1e34751b8cf17021f45d4cf2b9a5800c93fb4cb2 in favor of upstream's 55e623d1c2b8b7b1fa45db6728e172e06bfa8598 (fixes X-Forwarded-For header) because I had made the same fix myself and it's better to use the upstream one
- I notice this rebase gets rid of GitHub merge commits... which actually might be fine because merges are fucking annoying to deal with when remote people merge without pulling and rebasing their branch first
- Finished up applying the 5.5 sitemap changes to all themes
- Merge the `discovery.xml` cleanups ([#278](https://github.com/ilri/DSpace/pull/278))
- Merge some minor edits to the distribution license ([#285](https://github.com/ilri/DSpace/pull/285))