---
title: "October, 2017"
date: 2017-10-01T08:07:54+03:00
author: "Alan Orth"
tags: ["Notes"]
---

## 2017-10-01

- Peter emailed to point out that many items in the [ILRI archive collection](https://cgspace.cgiar.org/handle/10568/2703) have multiple handles:

```
http://hdl.handle.net/10568/78495||http://hdl.handle.net/10568/79336
```

- There appears to be a pattern but I'll have to look a bit closer and try to clean them up automatically, either in SQL or in OpenRefine
- Add Katherine Lutz to the groups for content submission and edit steps of the CGIAR System collections

## 2017-10-02

- Peter Ballantyne said he was having problems logging into CGSpace with "both" of his accounts (CGIAR LDAP and personal, apparently)
- I looked in the logs and saw some LDAP lookup failures due to timeout but also, strangely, a "no DN found" error:

```
2017-10-01 20:24:57,928 WARN  org.dspace.authenticate.LDAPAuthentication @ anonymous:session_id=CA0AA5FEAEA8805645489404CDCE9594:ip_addr=41.204.190.40:ldap_attribute_lookup:type=failed_search javax.naming.CommunicationException\colon; svcgroot2.cgiarad.org\colon;3269 [Root exception is java.net.ConnectException\colon; Connection timed out (Connection timed out)]
2017-10-01 20:22:37,982 INFO  org.dspace.authenticate.LDAPAuthentication @ anonymous:session_id=CA0AA5FEAEA8805645489404CDCE9594:ip_addr=41.204.190.40:failed_login:no DN found for user pballantyne
```

- I thought maybe his account had expired (seeing as it was the first of the month) but he says he was finally able to log in today
- The logs for yesterday show fourteen errors related to LDAP auth failures:

```
$ grep -c "ldap_authentication:type=failed_auth" dspace.log.2017-10-01
14
```

- For what it's worth, there are no errors on any other recent days, so it must have been some network issue on Linode or CGNET's LDAP server
- Linode emailed to say that linode578611 (DSpace Test) needs to migrate to a new host for a security update so I initiated the migration immediately rather
than waiting for the scheduled time in two weeks

## 2017-10-04

- Twice in the last twenty-four hours Linode has alerted about high CPU usage on CGSpace (linode2533629)
- Communicate with Sam from the CGIAR System Organization about some broken links coming from their CGIAR Library domain to CGSpace
- The first is a link to a browse page that should be handled better in nginx:

```
http://library.cgiar.org/browse?value=Intellectual%20Assets%20Reports&type=subject → https://cgspace.cgiar.org/browse?value=Intellectual%20Assets%20Reports&type=subject
```

- We'll need to check for browse links and handle them properly, including swapping the `subject` parameter for `systemsubject` (which doesn't exist in Discovery yet, but we'll need to add it) as we have moved their poorly curated subjects from `dc.subject` to `cg.subject.system`
- The second link was a direct link to a bitstream which has broken due to the sequence being updated, so I told him he should link to the handle of the item instead
- Help Sisay proof sixty-two IITA records on DSpace Test
- Lots of inconsistencies and errors in subjects, `dc.format.extent`, regions, countries
- Merge the Discovery search changes for ISI Journal ([#341](https://github.com/ilri/DSpace/pull/341))

## 2017-10-05

- Twice in the past twenty-four hours Linode has warned that CGSpace's outbound traffic rate was exceeding the notification threshold
- I had a look at yesterday's OAI and REST logs in `/var/log/nginx` but didn't see anything unusual:

```
# awk '{print $1}' /var/log/nginx/rest.log.1 | sort -n | uniq -c | sort -h | tail -n 10
    141 157.55.39.240
    145 40.77.167.85
    162 66.249.66.92
    181 66.249.66.95
    211 66.249.66.91
    312 66.249.66.94
    384 66.249.66.90
   1495 50.116.102.77
   3904 70.32.83.92
   9904 45.5.184.196
# awk '{print $1}' /var/log/nginx/oai.log.1 | sort -n | uniq -c | sort -h | tail -n 10
      5 66.249.66.71
      6 66.249.66.67
      6 68.180.229.31
      8 41.84.227.85
      8 66.249.66.92
     17 66.249.66.65
     24 66.249.66.91
     38 66.249.66.95
     69 66.249.66.90
    148 66.249.66.94
```

- Working on the nginx redirects for CGIAR Library
- We should start using 301 redirects and also allow for `/sitemap` to work on the library.cgiar.org domain so the CGIAR System Organization people can update their Google Search Console and allow Google to find their content in a structured way
- Remove eleven occurrences of `ACP` in IITA's `cg.coverage.region` using the Atmire batch edit module from Discovery
- Need to investigate how we can verify the library.cgiar.org domain using the HTML or DNS methods
- Run corrections on 143 ILRI Archive items that had two `dc.identifier.uri` values (Handle) that Peter had pointed out earlier this week
- I used OpenRefine to isolate them and then fixed and re-imported them into CGSpace
- I manually checked a dozen of them and it appeared that the correct handle was always the second one, so I just deleted the first one

## 2017-10-06

- I saw a nice tweak to thumbnail presentation on the Cardiff Metropolitan University DSpace: https://repository.cardiffmet.ac.uk/handle/10369/8780
- It adds a subtle border and box shadow, before and after:

![Original flat thumbnails](/cgspace-notes/2017/10/dspace-thumbnail-original.png)
![Tweaked with border and box shadow](/cgspace-notes/2017/10/dspace-thumbnail-box-shadow.png)

- I'll post it to the Yammer group to see what people think
- I figured out a way to do the HTML verification for Google Search Console for library.cgiar.org
- We can drop the HTML file in their XMLUI theme folder and it will get copied to the webapps directory during build/install
- Then we add an nginx alias for that URL in the library.cgiar.org vhost
- This method is kinda a hack but at least we can put all the pieces into git to be reproducible
- I will tell Tunji to send me the verification file

## 2017-10-10

- Deploy logic to allow verification of the library.cgiar.org domain in the Google Search Console ([#343](https://github.com/ilri/DSpace/pull/343))
- After verifying both the HTTP and HTTPS domains
and submitting a sitemap it will be interesting to see how the stats in the console as well as the search results change (currently 28,500 results):

![Google Search Console](/cgspace-notes/2017/10/google-search-console.png)
![Google Search Console 2](/cgspace-notes/2017/10/google-search-console-2.png)
![Google Search results](/cgspace-notes/2017/10/google-search-results.png)

- I tried to submit a "Change of Address" request in the Google Search Console but I need to be an owner on CGSpace's console (currently I'm just a user) in order to do that
- Manually clean up some communities and collections that Peter had requested a few weeks ago
- Delete community 10568/102 (ILRI Research and Development Issues)
- Move five collections to 10568/27629 (ILRI Projects) using `move-collections.sh` with the following configuration:

```
10568/1637 10568/174 10568/27629
10568/1642 10568/174 10568/27629
10568/1614 10568/174 10568/27629
10568/75561 10568/150 10568/27629
10568/183 10568/230 10568/27629
```

- Delete community 10568/174 (Sustainable livestock futures)
- Delete collections in 10568/27629 that have zero items (33 of them!)
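
A move list like the one above could be sanity-checked before handing it to the real script. This is only a rough sketch, not part of `move-collections.sh`: the `check_moves` helper and the scratch file are hypothetical, and the column order (collection, current parent community, destination community) is my assumption based on the five lines shown.

```shell
#!/bin/sh
# Sketch only: validate a move list before running the real script.
# Assumed column order: collection, current parent community, destination community.

check_moves() {
    # $1 = path to the move list, $2 = expected destination community handle
    awk -v dest="$2" '
        # Every line should have exactly three whitespace-separated fields
        NF != 3 { printf "line %d: expected 3 fields, got %d\n", NR, NF; bad = 1; next }
        {
            # Each field should look like a handle, i.e. prefix/suffix made of digits
            for (i = 1; i <= NF; i++)
                if ($i !~ /^[0-9]+\/[0-9]+$/) { printf "line %d: malformed handle %s\n", NR, $i; bad = 1 }
        }
        # The third column should always be the community we are moving into
        $3 != dest { printf "line %d: unexpected destination %s\n", NR, $3; bad = 1 }
        END { if (!bad) printf "%d moves look well-formed\n", NR; exit bad + 0 }
    ' "$1"
}

moves=$(mktemp)   # hypothetical scratch file, not the script's real input path
cat > "$moves" <<'EOF'
10568/1637 10568/174 10568/27629
10568/1642 10568/174 10568/27629
10568/1614 10568/174 10568/27629
10568/75561 10568/150 10568/27629
10568/183 10568/230 10568/27629
EOF

check_moves "$moves" 10568/27629
```

The function exits non-zero on any malformed line, so it could gate the actual move in a one-liner (`check_moves file dest && ...`). It only checks the shape of the file; it does not confirm that the handles exist in DSpace.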