+++
date = "2016-05-01T23:06:00+03:00"
author = "Alan Orth"
title = "May, 2016"
tags = ["notes"]
image = "../images/bg.jpg"
+++

## 2016-05-01

- Since yesterday there have been 10,000 REST errors and the site has been unstable again
- I have blocked access to the API now
- There are more than 3,000 IPs accessing the REST API in a 24-hour period!

```
# awk '{print $1}' /var/log/nginx/rest.log | uniq | wc -l
3168
```

- The two most frequent requesters are in Ethiopia and Colombia: 213.55.99.121 and 181.118.144.29
- 100% of the requests coming from Ethiopia look like this and result in an HTTP 500:

```
GET /rest/handle/10568/NaN?expand=parentCommunityList,metadata HTTP/1.1
```

- For now I'll block just the Ethiopian IP
- The owner of that application has said that the `NaN` (not a number) is an error in his code and he'll fix it

## 2016-05-03

- Update nginx to 1.10.x branch on CGSpace
- Fix a reference to `dc.type.output` in Discovery that I had missed when we migrated to `dc.type` last month ([#223](https://github.com/ilri/DSpace/pull/223))

![Item type in Discovery results](../images/2016/05/discovery-types.png)

## 2016-05-06

- DSpace Test is down, `catalina.out` has lots of messages about heap space from some time yesterday (!)
- It looks like Sisay was doing some batch imports
- Hmm, the disk is also full
- I decided to blow away the Solr indexes, since they are 50GB and we don't really need all the Atmire stuff there right now
- I will re-generate the Discovery indexes after re-deploying
- Testing `renew-letsencrypt.sh` script for nginx

```
#!/usr/bin/env bash

readonly SERVICE_BIN=/usr/sbin/service
readonly LETSENCRYPT_BIN=/opt/letsencrypt/letsencrypt-auto

# stop nginx so LE can listen on port 443
$SERVICE_BIN nginx stop

# renew non-interactively, keeping the verbose output for troubleshooting
$LETSENCRYPT_BIN renew -nvv --standalone --standalone-supported-challenges tls-sni-01 > /var/log/letsencrypt/renew.log 2>&1

# save the exit code before starting nginx overwrites $?
LE_RESULT=$?

# start nginx again whether or not the renewal succeeded
$SERVICE_BIN nginx start

if [[ "$LE_RESULT" != 0 ]]; then
    echo 'Automated renewal failed:'

    cat /var/log/letsencrypt/renew.log

    exit 1
fi
```

- Seems to work well

## 2016-05-10

- Start looking at more metadata migrations
- There are lots of fields in the `dcterms` namespace that look interesting, like:
  - dcterms.type
  - dcterms.spatial
- Not sure what `dcterms` is...
- Looks like these were [added in DSpace 4](https://wiki.duraspace.org/display/DSDOC5x/Metadata+and+Bitstream+Format+Registries#MetadataandBitstreamFormatRegistries-DublinCoreTermsRegistry(DCTERMS)) to allow for future work to make DSpace more flexible
- CGSpace's `dc` registry has 96 items, and the default DSpace one has 73.

## 2016-05-11

- Identify and propose the next phase of CGSpace fields to migrate:
  - dc.title.jtitle → cg.title.journal
  - dc.identifier.status → cg.identifier.status
  - dc.river.basin → cg.river.basin
  - dc.Species → cg.species
  - dc.targetaudience → cg.targetaudience
  - dc.fulltextstatus → cg.fulltextstatus
  - dc.editon → cg.edition
  - dc.isijournal → cg.isijournal
- Start a test rebase of the `5_x-prod` branch on top of the `dspace-5.5` tag
- There were a handful of conflicts that I didn't understand
- After completing the rebase I tried to build with the module versions Atmire had indicated as being 5.5 ready, but I got this error:

```
[ERROR] Failed to execute goal on project additions: Could not resolve dependencies for project org.dspace.modules:additions:jar:5.5: Could not find artifact com.atmire:atmire-metadata-quality-api:jar:5.5-2.10.1-0 in sonatype-releases (https://oss.sonatype.org/content/repositories/releases/) -> [Help 1]
```

- I've sent them a question about it
- A user mentioned having problems with uploading a 33 MB PDF
- I told her I would increase the limit temporarily tomorrow morning
- Turns out she was able to decrease the size of the PDF so we didn't have to do anything

## 2016-05-12

- Looks like the issue that Abenet was having a few days ago with "Connection Reset" in Firefox might be due to a Firefox 46 issue: https://bugzilla.mozilla.org/show_bug.cgi?id=1268775
- I finally found a copy of the latest CG Core metadata guidelines and it looks like we can add a few more fields to our next migration:
  - dc.rplace.region → cg.coverage.region
  - dc.cplace.country → cg.coverage.country
- Questions for CG people:
  - Our `dc.place` and `dc.srplace.subregion` could both map to `cg.coverage.admin-unit`?
  - Should we use `dc.contributor.crp` or `cg.contributor.crp` for the CRP (ours is `dc.crsubject.crpsubject`)?
  - Our `dc.contributor.affiliation` and `dc.contributor.corporate` could both map to `dc.contributor` and possibly `dc.contributor.center` depending on whether it's a CG center or not
  - `dc.title.jtitle` could either map to `dc.publisher` or `dc.source` depending on how you read things
- Found ~200 messed up CIAT values in `dc.publisher`:

```
# select text_value from metadatavalue where resource_type_id=2 and metadata_field_id=39 and text_value similar to '% %';
```
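- Note to self about the 2016-05-01 IP count: `uniq` only collapses *adjacent* duplicate lines, so without a `sort` in front of it the 3,168 figure may overstate the number of distinct IPs. A minimal sketch of the difference, using a made-up sample log (the addresses are documentation-range placeholders, not real requesters, and the log format is simplified):

```shell
#!/usr/bin/env bash
# Sketch: count distinct client IPs in an access log.
# The sample log is hypothetical; on the server the input would be
# /var/log/nginx/rest.log.
set -euo pipefail

sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
192.0.2.10 - - "GET /rest/items HTTP/1.1" 200
198.51.100.7 - - "GET /rest/items HTTP/1.1" 200
192.0.2.10 - - "GET /rest/handle/10568/NaN HTTP/1.1" 500
203.0.113.5 - - "GET /rest/items HTTP/1.1" 200
EOF

# uniq alone only merges adjacent duplicates, so 192.0.2.10 is counted twice here
adjacent_only=$(awk '{print $1}' "$sample_log" | uniq | wc -l)

# sort -u removes duplicates regardless of where they appear in the log
distinct=$(awk '{print $1}' "$sample_log" | sort -u | wc -l)

echo "uniq only: $adjacent_only, sort -u: $distinct"
rm -f "$sample_log"
```

- Swapping `uniq` for `sort -u` (or `sort | uniq`) in the original one-liner should give the real number of distinct IPs; I have not re-run it against the production log.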