---
title: "October, 2017"
date: 2017-10-01T08:07:54+03:00
author: "Alan Orth"
tags: ["Notes"]
---
## 2017-10-01
- Peter emailed to point out that many items in the [ILRI archive collection](https://cgspace.cgiar.org/handle/10568/2703) have multiple handles:
```
http://hdl.handle.net/10568/78495||http://hdl.handle.net/10568/79336
```
- There appears to be a pattern but I'll have to look a bit closer and try to clean them up automatically, either in SQL or in OpenRefine
- Add Katherine Lutz to the groups for content submission and edit steps of the CGIAR System collections
<!--more-->
## 2017-10-02
- Peter Ballantyne said he was having problems logging into CGSpace with "both" of his accounts (CGIAR LDAP and personal, apparently)
- I looked in the logs and saw some LDAP lookup failures due to timeout but also strangely a "no DN found" error:
```
2017-10-01 20:24:57,928 WARN org.dspace.authenticate.LDAPAuthentication @ anonymous:session_id=CA0AA5FEAEA8805645489404CDCE9594:ip_addr=41.204.190.40:ldap_attribute_lookup:type=failed_search javax.naming.CommunicationException\colon; svcgroot2.cgiarad.org\colon;3269 [Root exception is java.net.ConnectException\colon; Connection timed out (Connection timed out)]
2017-10-01 20:22:37,982 INFO org.dspace.authenticate.LDAPAuthentication @ anonymous:session_id=CA0AA5FEAEA8805645489404CDCE9594:ip_addr=41.204.190.40:failed_login:no DN found for user pballantyne
```
- I thought maybe his account had expired (seeing as it was the first of the month) but he says he was finally able to log in today
- The logs for yesterday show fourteen errors related to LDAP auth failures:
```
$ grep -c "ldap_authentication:type=failed_auth" dspace.log.2017-10-01
14
```
- For what it's worth, there are no errors on any other recent days, so it must have been some network issue on Linode or CGNET's LDAP server
- Linode emailed to say that linode578611 (DSpace Test) needs to migrate to a new host for a security update so I initiated the migration immediately rather than waiting for the scheduled time in two weeks
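
To compare the failure count across several days rather than grepping one file at a time, something like the following could work. It is only a sketch: the marker string is the one used in the `grep` above, but the log directory and the `dspace.log.2017-*` file name pattern are assumptions, and `count_failed_auth` is a hypothetical helper name.

```python
from pathlib import Path

# Same string as in the grep above.
FAILED_AUTH_MARKER = "ldap_authentication:type=failed_auth"


def count_failed_auth(log_dir, pattern="dspace.log.2017-*"):
    """Return {log file name: number of lines containing the marker}.

    Mirrors `grep -c`: a line is counted once even if the marker
    appears in it more than once. The glob pattern is an assumption
    about how the daily log files are named.
    """
    counts = {}
    for log_file in sorted(Path(log_dir).glob(pattern)):
        # errors="replace" keeps the count going if a log contains odd bytes
        with log_file.open(errors="replace") as handle:
            counts[log_file.name] = sum(
                FAILED_AUTH_MARKER in line for line in handle
            )
    return counts
```

Calling `count_failed_auth()` with the DSpace log directory returns one count per daily log file, so a single bad day stands out next to the zeroes.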
## 2017-10-04
- Twice in the last twenty-four hours Linode has alerted about high CPU usage on CGSpace (linode2533629)
- Communicate with Sam from the CGIAR System Organization about some broken links coming from their CGIAR Library domain to CGSpace
- The first is a link to a browse page that should be handled better in nginx:
```
http://library.cgiar.org/browse?value=Intellectual%20Assets%20Reports&type=subject → https://cgspace.cgiar.org/browse?value=Intellectual%20Assets%20Reports&type=subject
```
- We'll need to check for browse links and handle them properly, including swapping the `subject` parameter for `systemsubject` (which doesn't exist in Discovery yet, but we'll need to add it) as we have moved their poorly curated subjects from `dc.subject` to `cg.subject.system`
- The second link was a direct link to a bitstream which has broken due to the sequence being updated, so I told him he should link to the handle of the item instead
- Help Sisay proof sixty-two IITA records on DSpace Test
- Lots of inconsistencies and errors in subjects, dc.format.extent, regions, countries
- Merge the Discovery search changes for ISI Journal ([#341](https://github.com/ilri/DSpace/pull/341))
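
The browse link mapping described above for library.cgiar.org could look roughly like this. The real redirect would live in nginx; this Python sketch only illustrates the intended URL transformation, and it assumes the `systemsubject` browse type has already been added to Discovery (it has not yet). `rewrite_library_browse_url` is a hypothetical name.

```python
from urllib.parse import parse_qsl, quote, urlencode, urlsplit, urlunsplit


def rewrite_library_browse_url(url):
    """Map a library.cgiar.org browse URL to its CGSpace equivalent.

    Swaps type=subject for type=systemsubject and leaves every other
    parameter alone. URLs that are not library.cgiar.org browse links
    are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.netloc != "library.cgiar.org" or parts.path != "/browse":
        return url

    params = [
        (key, "systemsubject" if (key, value) == ("type", "subject") else value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    # quote_via=quote encodes spaces as %20 (not +), as in the original links
    query = urlencode(params, quote_via=quote)
    return urlunsplit(("https", "cgspace.cgiar.org", parts.path, query, ""))
```

Parameter order is preserved, so the rewritten link differs from the original only in scheme, host, and the `type` value.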
## 2017-10-05
- Twice in the past twenty-four hours Linode has warned that CGSpace's outbound traffic rate was exceeding the notification threshold
- I had a look at yesterday's OAI and REST logs in `/var/log/nginx` but didn't see anything unusual:
```
# awk '{print $1}' /var/log/nginx/rest.log.1 | sort -n | uniq -c | sort -h | tail -n 10
141 157.55.39.240
145 40.77.167.85
162 66.249.66.92
181 66.249.66.95
211 66.249.66.91
312 66.249.66.94
384 66.249.66.90
1495 50.116.102.77
3904 70.32.83.92
9904 45.5.184.196
# awk '{print $1}' /var/log/nginx/oai.log.1 | sort -n | uniq -c | sort -h | tail -n 10
5 66.249.66.71
6 66.249.66.67
6 68.180.229.31
8 41.84.227.85
8 66.249.66.92
17 66.249.66.65
24 66.249.66.91
38 66.249.66.95
69 66.249.66.90
148 66.249.66.94
```
- Working on the nginx redirects for CGIAR Library
- We should start using 301 redirects and also allow for `/sitemap` to work on the library.cgiar.org domain so the CGIAR System Organization people can update their Google Search Console and allow Google to find their content in a structured way
- Remove eleven occurrences of `ACP` in IITA's `cg.coverage.region` using the Atmire batch edit module from Discovery
- Need to investigate how we can verify the library.cgiar.org domain using the HTML or DNS methods
- Run corrections on 143 ILRI Archive items that had two `dc.identifier.uri` values (Handle) that Peter had pointed out earlier this week
- I used OpenRefine to isolate them and then fixed and re-imported them into CGSpace
- I manually checked a dozen of them and it appeared that the correct handle was always the second one, so I just deleted the first one
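
The actual fix was done in OpenRefine, but the rule itself (keep the second handle, drop the first) is simple enough to sketch in Python. This is an illustration only: it assumes the two values are joined with `||` as in the example from earlier in the week, and `keep_second_handle` is a hypothetical name.

```python
def keep_second_handle(value, separator="||"):
    """Return the second handle when a field holds exactly two.

    Anything else (one handle, three or more, empty) is returned
    unchanged so that unexpected cases get a manual look.
    """
    handles = [handle for handle in value.split(separator) if handle]
    if len(handles) == 2:
        return handles[1]
    return value
```

Leaving values that don't contain exactly two handles untouched is deliberate, since the "second one is correct" pattern was only spot-checked on items with two handles.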
## 2017-10-06
- I saw a nice tweak to thumbnail presentation on the Cardiff Metropolitan University DSpace: https://repository.cardiffmet.ac.uk/handle/10369/8780
- It adds a subtle border and box shadow, before and after:
![Original flat thumbnails](/cgspace-notes/2017/10/dspace-thumbnail-original.png)
![Tweaked with border and box shadow](/cgspace-notes/2017/10/dspace-thumbnail-box-shadow.png)
- I'll post it to the Yammer group to see what people think
- I figured out a way to do the HTML verification for Google Search Console for library.cgiar.org
- We can drop the HTML file in their XMLUI theme folder and it will get copied to the webapps directory during build/install
- Then we add an nginx alias for that URL in the library.cgiar.org vhost
- This method is kinda a hack but at least we can put all the pieces into git to be reproducible
- I will tell Tunji to send me the verification file
## 2017-10-10
- Deploy logic to allow verification of the library.cgiar.org domain in the Google Search Console ([#343](https://github.com/ilri/DSpace/pull/343))
- After verifying both the HTTP and HTTPS domains and submitting a sitemap it will be interesting to see how the stats in the console as well as the search results change (currently 28,500 results):
![Google Search Console](/cgspace-notes/2017/10/google-search-console.png)
![Google Search Console 2](/cgspace-notes/2017/10/google-search-console-2.png)
![Google Search results](/cgspace-notes/2017/10/google-search-results.png)
- I tried to submit a "Change of Address" request in the Google Search Console but I need to be an owner on CGSpace's console (currently I'm just a user) in order to do that