- Testing adding [ORCIDs to a CSV](https://wiki.lyrasis.org/display/DSDOC5x/ORCID+Integration#ORCIDIntegration-EditingexistingitemsusingBatchCSVEditing) file for a single item to see if the author orders get messed up
- Need to test the following scenarios to see how author order is affected:
- ORCIDs only
- ORCIDs plus normal authors
- I exported a random item's metadata as CSV, deleted *all columns* except `id` and `collection`, and made a new column called `ORCID:dc.contributor.author` with the following random ORCIDs from the ORCID registry:
- Hmm, with the `dc.contributor.author` column removed, DSpace doesn't detect any changes
- With a blank `dc.contributor.author` column, DSpace wants to remove all non-ORCID authors and add the new ORCID authors
- I added the [disclaimer text](https://github.com/ilri/DSpace/issues/234) to the About page, then added a footer link to the disclaimer's ID, but there is a Bootstrap issue that causes the page content to disappear when using in-page anchors: https://github.com/twbs/bootstrap/issues/1768
- Generate list of unique authors in CCAFS collections:
```
dspacetest=# \copy (select distinct text_value, count(*) as count from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 AND resource_id IN (select item_id from collection2item where collection_id IN (select resource_id from handle where handle in ('10568/32729', '10568/5472', '10568/5473', '10568/10288', '10568/70974', '10568/3547', '10568/3549', '10568/3531','10568/16890','10568/5470','10568/3546', '10568/36024', '10568/66581', '10568/21789', '10568/5469', '10568/5468', '10568/3548', '10568/71053', '10568/25167'))) group by text_value order by count desc) to /tmp/ccafs-authors.csv with csv;
```
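Going back to the ORCID batch-edit test earlier in this entry, the trimmed CSV can also be produced programmatically. This is a minimal sketch only: the item id and collection handle are placeholders, the ORCID iD is the sample identifier from ORCID's documentation rather than one of the iDs used in the test, and it assumes DSpace's default `||` separator for multiple values in one cell.

```python
import csv
import io


def build_orcid_csv(item_id, collection, orcids, keep_author_column=True):
    """Return CSV text for a one-item batch edit that adds ORCID authors."""
    header = ["id", "collection"]
    row = [item_id, collection]
    if keep_author_column:
        # Without this (blank) column DSpace did not detect any changes
        header.append("dc.contributor.author")
        row.append("")
    header.append("ORCID:dc.contributor.author")
    # Assumption: multiple values in a cell are joined with "||"
    row.append("||".join(orcids))

    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerow(row)
    return buf.getvalue()


if __name__ == "__main__":
    # Placeholder id and handle; sample iD from the ORCID documentation
    print(build_orcid_csv("12345", "10568/00000", ["0000-0002-1825-0097"]))
```

Toggling `keep_author_column` makes it easy to generate both variants tested above (column removed versus column present but blank).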
## 2016-10-05
- Work on more infrastructure cleanups for Ansible DSpace role
- Clean up Let's Encrypt plumbing and submit pull request for rmg-ansible-public ([#60](https://github.com/ilri/rmg-ansible-public/pull/60))
## 2016-10-06
- Nice! DSpace Test (linode02) is now having `java.lang.OutOfMemoryError: Java heap space` errors...
- Heap space is 2048m, and we have 5GB of RAM being used for OS cache (Solr!) so let's just bump the memory to 3072m
- Magdalena from CCAFS asked why the colors in the thumbnails for these [two](https://cgspace.cgiar.org/handle/10568/71249) [items](https://cgspace.cgiar.org/handle/10568/71259) look different, even though they are the same in the PDF itself
- Re-deploy CGSpace with latest changes from late September and early October
- Run fixes for ILRI subjects and delete blank metadata values:
```
dspace=# delete from metadatavalue where resource_type_id=2 and text_value='';
DELETE 11
```
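A cautious way to run a cleanup like this is to count the matching rows first and refuse to delete if the number looks wrong. The sketch below illustrates that idea with Python's built-in `sqlite3` as a stand-in for PostgreSQL; the two-column table and the `expected_max` limit are assumptions for illustration, not the real DSpace schema or an existing tool.

```python
import sqlite3

BLANK_FILTER = "resource_type_id = 2 AND text_value = ''"


def delete_blank_values(conn, expected_max):
    """Delete blank item metadata values if there are at most expected_max."""
    cur = conn.cursor()
    cur.execute(f"SELECT COUNT(*) FROM metadatavalue WHERE {BLANK_FILTER}")
    (count,) = cur.fetchone()
    if count > expected_max:
        raise RuntimeError(f"refusing to delete {count} rows (limit {expected_max})")
    cur.execute(f"DELETE FROM metadatavalue WHERE {BLANK_FILTER}")
    deleted = cur.rowcount
    conn.commit()
    return deleted


def demo_connection():
    """In-memory stand-in table with two blank item values."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE metadatavalue (resource_type_id INTEGER, text_value TEXT)"
    )
    conn.executemany(
        "INSERT INTO metadatavalue VALUES (?, ?)",
        [(2, ""), (2, "Kenya"), (2, ""), (4, "")],
    )
    conn.commit()
    return conn
```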
- Run all system updates and reboot CGSpace
- Delete ten gigs of old 2015 Tomcat logs that never got rotated (WTF?):
```
root@linode01:~# ls -lh /var/log/tomcat7/localhost_access_log.2015* | wc -l
47
```
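Before deleting a batch of logs like this, it can help to know how much space the matching files actually occupy. A small sketch, assuming nothing beyond the glob pattern from the listing above (the function name is made up for illustration):

```python
import glob
import os


def summarize_logs(pattern):
    """Return (file_count, total_bytes) for files matching a glob pattern."""
    paths = glob.glob(pattern)
    total = sum(os.path.getsize(p) for p in paths)
    return len(paths), total


if __name__ == "__main__":
    count, total = summarize_logs("/var/log/tomcat7/localhost_access_log.2015*")
    print(f"{count} files, {total / 1024**3:.1f} GiB")
```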
- Delete the 2GB `cron-filter-media.log` file, as it is just output from a cron job and doesn't get rotated like normal log files (it has been growing for almost a year, maybe)
- Run all system updates on DSpace Test and reboot server
- Looking into some issues with Discovery filters in Atmire's content and usage analysis module after adjusting the filter class
- Looks like changing the filters from `configuration.DiscoverySearchFilterFacet` to `configuration.DiscoverySearchFilter` breaks them in Atmire CUA module
- One observation is that there are still some old versions of names in the author lookup because authors appear in other communities (as we only corrected authors from CCAFS for this round)
- Have to fix about ten merge conflicts, mostly in the SCSS for the CGIAR theme
- Skip 1e34751b8cf17021f45d4cf2b9a5800c93fb4cb2 in lieu of upstream's 55e623d1c2b8b7b1fa45db6728e172e06bfa8598 (fixes X-Forwarded-For header) because I had made the same fix myself and it's better to use the upstream one
- I notice this rebase gets rid of GitHub merge commits... which actually might be fine because merges are fucking annoying to deal with when remote people merge without pulling and rebasing their branch first
- Finished up applying the 5.5 sitemap changes to all themes
- Start testing some things for DSpace 5.5, like command line metadata import, PDF media filter, and Atmire CUA
- Start looking at batch fixing of "old" ILRI website links without www or https, for example:
```
dspace=# select * from metadatavalue where resource_type_id=2 and text_value like 'http://ilri.org%';
```
- Also CCAFS has HTTPS and their links should use it where possible:
```
dspace=# select * from metadatavalue where resource_type_id=2 and text_value like 'http://ccafs.cgiar.org%';
```
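To preview what a batch fix would do to values like these before touching the database, the rewrite can be tried on sample strings first. This is only a sketch under two assumptions: that bare `ilri.org` links should end up on `https://www.ilri.org`, and that CCAFS links keep their host and only change scheme.

```python
import re

# \b stops the pattern from matching longer hosts such as ilri.organic.example
ILRI_OLD = re.compile(r"http://ilri\.org\b")
CCAFS_OLD = re.compile(r"http://ccafs\.cgiar\.org\b")


def upgrade_links(text):
    """Rewrite old-style ILRI and CCAFS links to their HTTPS form."""
    text = ILRI_OLD.sub("https://www.ilri.org", text)
    text = CCAFS_OLD.sub("https://ccafs.cgiar.org", text)
    return text
```

Values that already use HTTPS pass through unchanged, so running the function twice has no further effect.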
- And this will find community and collection HTML text that is using the old style PNG/JPG icons for RSS and email (we should be using Font Awesome icons instead):
```
dspace=# select text_value from metadatavalue where resource_type_id in (3,4) and text_value like '%Iconrss2.png%';
```
- Turns out there are shit tons of varieties of this, like with http, https, www, separate `</img>` tags, alignments, etc
- Had to find all variations and replace them individually:
```
dspace=# update metadatavalue set text_value = regexp_replace(text_value, '<img align="left" src="https://www.ilri.org/images/Iconrss2.png"/>', '<span class="fa fa-rss fa-2x" aria-hidden="true"></span>') where resource_type_id in (3,4) and text_value like '%<img align="left" src="https://www.ilri.org/images/Iconrss2.png"/>%';
dspace=# update metadatavalue set text_value = regexp_replace(text_value, '<img align="left" src="https://www.ilri.org/images/email.jpg"/>', '<span class="fa fa-at fa-2x" aria-hidden="true"></span>') where resource_type_id in (3,4) and text_value like '%<img align="left" src="https://www.ilri.org/images/email.jpg"/>%';
dspace=# update metadatavalue set text_value = regexp_replace(text_value, '<img align="left" src="http://www.ilri.org/images/Iconrss2.png"/>', '<span class="fa fa-rss fa-2x" aria-hidden="true"></span>') where resource_type_id in (3,4) and text_value like '%<img align="left" src="http://www.ilri.org/images/Iconrss2.png"/>%';
dspace=# update metadatavalue set text_value = regexp_replace(text_value, '<img align="left" src="http://www.ilri.org/images/email.jpg"/>', '<span class="fa fa-at fa-2x" aria-hidden="true"></span>') where resource_type_id in (3,4) and text_value like '%<img align="left" src="http://www.ilri.org/images/email.jpg"/>%';
dspace=# update metadatavalue set text_value = regexp_replace(text_value, '<img align="left" src="http://www.ilri.org/images/Iconrss2.png"></img>', '<span class="fa fa-rss fa-2x" aria-hidden="true"></span>') where resource_type_id in (3,4) and text_value like '%<img align="left" src="http://www.ilri.org/images/Iconrss2.png"></img>%';
dspace=# update metadatavalue set text_value = regexp_replace(text_value, '<img align="left" src="http://www.ilri.org/images/email.jpg"></img>', '<span class="fa fa-at fa-2x" aria-hidden="true"></span>') where resource_type_id in (3,4) and text_value like '%<img align="left" src="http://www.ilri.org/images/email.jpg"></img>%';
dspace=# update metadatavalue set text_value = regexp_replace(text_value, '<img align="left" src="https://ilri.org/images/Iconrss2.png"></img>', '<span class="fa fa-rss fa-2x" aria-hidden="true"></span>') where resource_type_id in (3,4) and text_value like '%<img align="left" src="https://ilri.org/images/Iconrss2.png"></img>%';
dspace=# update metadatavalue set text_value = regexp_replace(text_value, '<img align="left" src="https://ilri.org/images/email.jpg"></img>', '<span class="fa fa-at fa-2x" aria-hidden="true"></span>') where resource_type_id in (3,4) and text_value like '%<img align="left" src="https://ilri.org/images/email.jpg"></img>%';
dspace=# update metadatavalue set text_value = regexp_replace(text_value, '<img align="left" src="https://www.ilri.org/images/Iconrss2.png"></img>', '<span class="fa fa-rss fa-2x" aria-hidden="true"></span>') where resource_type_id in (3,4) and text_value like '%<img align="left" src="https://www.ilri.org/images/Iconrss2.png"></img>%';
dspace=# update metadatavalue set text_value = regexp_replace(text_value, '<img align="left" src="https://www.ilri.org/images/email.jpg"></img>', '<span class="fa fa-at fa-2x" aria-hidden="true"></span>') where resource_type_id in (3,4) and text_value like '%<img align="left" src="https://www.ilri.org/images/email.jpg"></img>%';
dspace=# update metadatavalue set text_value = regexp_replace(text_value, '<img align="left" src="https://ilri.org/images/Iconrss2.png"/>', '<span class="fa fa-rss fa-2x" aria-hidden="true"></span>') where resource_type_id in (3,4) and text_value like '%<img align="left" src="https://ilri.org/images/Iconrss2.png"/>%';
dspace=# update metadatavalue set text_value = regexp_replace(text_value, '<img align="left" src="https://ilri.org/images/email.jpg"/>', '<span class="fa fa-at fa-2x" aria-hidden="true"></span>') where resource_type_id in (3,4) and text_value like '%<img align="left" src="https://ilri.org/images/email.jpg"/>%';
dspace=# update metadatavalue set text_value = regexp_replace(text_value, '<img valign="center" align="left" src="https://www.ilri.org/images/Iconrss2.png"/>', '<span class="fa fa-rss fa-2x" aria-hidden="true"></span>') where resource_type_id in (3,4) and text_value like '%<img valign="center" align="left" src="https://www.ilri.org/images/Iconrss2.png"/>%';
dspace=# update metadatavalue set text_value = regexp_replace(text_value, '<img valign="center" align="left" src="https://www.ilri.org/images/email.jpg"/>', '<span class="fa fa-at fa-2x" aria-hidden="true"></span>') where resource_type_id in (3,4) and text_value like '%<img valign="center" align="left" src="https://www.ilri.org/images/email.jpg"/>%';
dspace=# update metadatavalue set text_value = regexp_replace(text_value, '<img valign="center" align="left" src="http://www.ilri.org/images/Iconrss2.png"/>', '<span class="fa fa-rss fa-2x" aria-hidden="true"></span>') where resource_type_id in (3,4) and text_value like '%<img valign="center" align="left" src="http://www.ilri.org/images/Iconrss2.png"/>%';
dspace=# update metadatavalue set text_value = regexp_replace(text_value, '<img valign="center" align="left" src="http://www.ilri.org/images/email.jpg"/>', '<span class="fa fa-at fa-2x" aria-hidden="true"></span>') where resource_type_id in (3,4) and text_value like '%<img valign="center" align="left" src="http://www.ilri.org/images/email.jpg"/>%';
```
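The variations handled one by one above differ only in scheme, the optional `www`, an optional `valign` attribute, and how the tag is closed, so in principle a single pattern could cover them all. The following is a sketch of that idea in Python, not what was actually run; it matches whitespace between attributes loosely, and the function and constant names are made up for illustration.

```python
import re

# Which Font Awesome class replaces which old image
ICON_CLASSES = {"Iconrss2.png": "fa-rss", "email.jpg": "fa-at"}

IMG_RE = re.compile(
    r'<img\s*(?:valign="center"\s*)?align="left"\s*'
    r'src="https?://(?:www\.)?ilri\.org/images/(Iconrss2\.png|email\.jpg)"\s*'
    r'(?:/>|></img>)'
)


def replace_icons(html):
    """Swap old RSS/email <img> tags for Font Awesome <span> icons."""
    def to_span(match):
        icon = ICON_CLASSES[match.group(1)]
        return f'<span class="fa {icon} fa-2x" aria-hidden="true"></span>'

    return IMG_RE.sub(to_span, html)
```

Running candidate community and collection HTML through `replace_icons` first is a cheap way to spot variations the pattern still misses before writing the equivalent SQL.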
- Getting rid of these reduces the number of network requests each client makes on community/collection pages, and makes use of Font Awesome icons (which they are already loading anyways!)
- And now that I start looking, I want to fix a bunch of links to popular sites that should be using HTTPS, like Twitter, Facebook, Google, Feed Burner, DOI, etc
- I should look to see if any of those domains is sending an HTTP 301 or setting HSTS headers to their HTTPS domains, then just replace them
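One way to do that check is to request each domain over plain HTTP and look for a permanent redirect to HTTPS, then request it over HTTPS and look for a `Strict-Transport-Security` header. A rough sketch using only the standard library follows; the helper names are hypothetical, and treating 308 as equivalent to 301 is my own assumption about what counts as "permanent".

```python
import http.client


def redirects_to_https(status, headers):
    """True if a plain-HTTP response permanently redirects to an https:// URL."""
    location = headers.get("location", "")
    # 301 and 308 are the permanent redirect status codes
    return status in (301, 308) and location.lower().startswith("https://")


def sets_hsts(headers):
    """True if an HTTPS response carries a Strict-Transport-Security header."""
    return "strict-transport-security" in headers


def probe(host, secure=False, path="/"):
    """Send a HEAD request; return (status, headers with lower-cased names)."""
    conn_cls = http.client.HTTPSConnection if secure else http.client.HTTPConnection
    conn = conn_cls(host, timeout=10)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, {k.lower(): v for k, v in resp.getheaders()}
    finally:
        conn.close()

# Example (needs network access):
#   status, headers = probe("twitter.com")
#   print(redirects_to_https(status, headers))
```

Keeping the decision logic separate from the network call means it can be checked against canned responses without hitting any real site.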
- Update authority and confidence values for two authors:
```
dspace=# update metadatavalue set authority='799da1d8-22f3-43f5-8233-3d2ef5ebf8a8', confidence=600 where metadata_field_id=3 and resource_type_id=2 and text_value like 'Charleston, B.%';
UPDATE 10
dspace=# update metadatavalue set authority='e936f5c5-343d-4c46-aa91-7a1fff6277ed', confidence=600 where metadata_field_id=3 and resource_type_id=2 and text_value like 'Knight-Jones%';
UPDATE 36
```
- I updated the authority index but nothing seemed to change, so I'll wait and do it again after I update Discovery below
- Skype chat with Tsega about the [IFPRI contentdm bridge](https://github.com/ilri/ckm-cgspace-contentdm-bridge)
- We tested harvesting OAI in an example collection to see how it works
- Talk to Carlos Quiros about CG Core metadata in CGSpace
- Get a list of countries from CGSpace so I can do some batch corrections:
```
dspace=# \copy (select distinct text_value, count(*) from metadatavalue where metadata_field_id=228 group by text_value order by count desc) to /tmp/countries.csv with csv;
```
- Fix a bunch of countries in Open Refine and run the corrections on CGSpace:
- Run a few URL corrections for ilri.org and doi.org, etc:
```
dspace=# update metadatavalue set text_value = regexp_replace(text_value, 'http://www.ilri.org','https://www.ilri.org') where resource_type_id=2 and text_value like '%http://www.ilri.org%';
dspace=# update metadatavalue set text_value = regexp_replace(text_value, 'http://mahider.ilri.org', 'https://cgspace.cgiar.org') where resource_type_id=2 and text_value like '%http://mahider.%.org%' and metadata_field_id not in (28);
dspace=# update metadatavalue set text_value = regexp_replace(text_value, 'http://dx.doi.org', 'https://dx.doi.org') where resource_type_id=2 and text_value like '%http://dx.doi.org%' and metadata_field_id not in (18,26,28,111);
dspace=# update metadatavalue set text_value = regexp_replace(text_value, 'http://doi.org', 'https://dx.doi.org') where resource_type_id=2 and text_value like '%http://doi.org%' and metadata_field_id not in (18,26,28,111);
```
- I skipped metadata fields like citation and description
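The skip logic in the doi.org updates can be modelled on plain tuples to sanity-check it before running anything against the database. This is a sketch only: rows are simplified to `(metadata_field_id, text_value)` pairs, the excluded IDs are copied from the `not in (...)` clause above, and field ID `1` in the usage example is an arbitrary stand-in for any non-excluded field.

```python
import re

# Field IDs left untouched, copied from the doi.org updates above
EXCLUDED_FIELDS = {18, 26, 28, 111}

# Dots are escaped so they match only a literal "."
DOI_OLD = re.compile(r"http://(?:dx\.)?doi\.org")


def fix_doi_links(rows):
    """Rewrite old DOI links in (field_id, text) rows, skipping excluded fields."""
    fixed = []
    for field_id, text in rows:
        if field_id not in EXCLUDED_FIELDS:
            text = DOI_OLD.sub("https://dx.doi.org", text)
        fixed.append((field_id, text))
    return fixed


if __name__ == "__main__":
    sample = [(1, "http://doi.org/10.1/example"), (18, "http://doi.org/10.1/example")]
    for row in fix_doi_links(sample):
        print(row)
```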