
---
title: "January, 2019"
date: 2019-01-02T09:48:30+02:00
author: "Alan Orth"
tags: ["Notes"]
---
## 2019-01-02
- Linode alerted that CGSpace (linode18) had a higher outbound traffic rate than normal early this morning
- I don't see anything interesting in the web server logs around that time though:
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "02/Jan/2019:0(1|2|3)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
- Analyzing the types of requests made by the top few IPs during that time:
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "02/Jan/2019:0(1|2|3)" | grep | grep -o -E "(bitstream|discover|handle)" | sort | uniq -c
30 bitstream
534 discover
352 handle
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "02/Jan/2019:0(1|2|3)" | grep | grep -o -E "(bitstream|discover|handle)" | sort | uniq -c
194 bitstream
345 handle
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "02/Jan/2019:0(1|2|3)" | grep | grep -o -E "(bitstream|discover|handle)" | sort | uniq -c
261 handle
- It's not clear to me what was causing the outbound traffic spike
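
The same tallies could be produced in one pass with a short Python script. This is only a sketch: it assumes nginx's default combined log format, the sample lines and IP addresses are made up (documentation ranges), and unlike `grep -o` it only looks at the request path and counts one type per request.

```python
import re
from collections import Counter

# Client IP, timestamp and request path from a combined-format log line.
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(?:\S+) (\S+)')
KIND_RE = re.compile(r"bitstream|discover|handle")


def summarize(lines, window_re):
    """Count requests per IP and per request type inside a time window."""
    window = re.compile(window_re)
    ips, kinds = Counter(), Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if not m or not window.search(m.group(2)):
            continue  # unparseable line or outside the window
        ips[m.group(1)] += 1
        kind = KIND_RE.search(m.group(3))
        if kind:
            kinds[kind.group(0)] += 1
    return ips, kinds


# Hypothetical sample lines, not real CGSpace traffic.
sample = [
    '192.0.2.1 - - [02/Jan/2019:01:15:02 +0100] "GET /handle/123456789/1 HTTP/1.1" 200 512 "-" "bot"',
    '192.0.2.1 - - [02/Jan/2019:02:20:11 +0100] "GET /discover?query=soil HTTP/1.1" 200 900 "-" "bot"',
    '198.51.100.7 - - [02/Jan/2019:03:05:40 +0100] "GET /bitstream/handle/123456789/1/a.pdf HTTP/1.1" 200 4096 "-" "bot"',
    '192.0.2.1 - - [02/Jan/2019:05:00:00 +0100] "GET /handle/123456789/2 HTTP/1.1" 200 512 "-" "bot"',
]

ips, kinds = summarize(sample, r"02/Jan/2019:0[123]")
for ip, count in ips.most_common(10):
    print(count, ip)
print(dict(kinds))
```
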
- Oh nice! The once-per-year cron job for rotating the Solr statistics actually worked now (for the first time ever!):
Moving: 81742 into core statistics-2010
Moving: 1837285 into core statistics-2011
Moving: 3764612 into core statistics-2012
Moving: 4557946 into core statistics-2013
Moving: 5483684 into core statistics-2014
Moving: 2941736 into core statistics-2015
Moving: 5926070 into core statistics-2016
Moving: 10562554 into core statistics-2017
Moving: 18497180 into core statistics-2018
- This could be why the outbound traffic rate was high, due to the S3 backup that runs at 3:30 AM...
- Run all system updates on DSpace Test (linode19) and reboot the server
## 2019-01-03
- Update local Docker image for DSpace PostgreSQL, re-using the existing data volume:
$ sudo docker pull postgres:9.6-alpine
$ sudo docker rm dspacedb
$ sudo docker run --name dspacedb -v /home/aorth/.local/lib/containers/volumes/dspacedb_data:/var/lib/postgresql/data -e POSTGRES_PASSWORD=postgres -p 5432:5432 -d postgres:9.6-alpine
- Testing DSpace 5.9 with Tomcat 8.5.37 on my local machine and I see that Atmire's Listings and Reports still doesn't work
- After logging in via XMLUI and clicking the Listings and Reports link from the sidebar it redirects me to a JSPUI login page
- If I log in again there, Listings and Reports works... hmm.
- The JSPUI application—which Listings and Reports depends upon—also does not load, though the error is perhaps unrelated:
2019-01-03 14:45:21,727 INFO org.dspace.browse.BrowseEngine @ anonymous:session_id=9471D72242DAA05BCC87734FE3C66EA6:ip_addr=
2019-01-03 14:45:21,971 INFO @ facets for scope, null: 23
2019-01-03 14:45:22,115 WARN @ :session_id=9471D72242DAA05BCC87734FE3C66EA6:internal_error:-- URL Was: http://localhost:8080/jspui/internal-error
-- Method: GET
-- Parameters were:
org.apache.jasper.JasperException: /home.jsp (line: [214], column: [1]) /discovery/static-tagcloud-facet.jsp (line: [57], column: [8]) No tag [tagcloud] defined in tag library imported with prefix [dspace]
at org.apache.jasper.compiler.DefaultErrorHandler.jspError(
at org.apache.jasper.compiler.ErrorDispatcher.dispatch(
at org.apache.jasper.compiler.ErrorDispatcher.jspError(
at org.apache.jasper.compiler.Parser.processIncludeDirective(
at org.apache.jasper.compiler.Parser.parseIncludeDirective(
at org.apache.jasper.compiler.Parser.parseDirective(
at org.apache.jasper.compiler.Parser.parseElements(
at org.apache.jasper.compiler.Parser.parseBody(
at org.apache.jasper.compiler.Parser.parseOptionalBody(
at org.apache.jasper.compiler.Parser.parseCustomTag(
at org.apache.jasper.compiler.Parser.parseElements(
at org.apache.jasper.compiler.Parser.parse(
at org.apache.jasper.compiler.ParserController.doParse(
at org.apache.jasper.compiler.ParserController.parse(
at org.apache.jasper.compiler.Compiler.generateJava(
at org.apache.jasper.compiler.Compiler.compile(
at org.apache.jasper.compiler.Compiler.compile(
at org.apache.jasper.compiler.Compiler.compile(
at org.apache.jasper.JspCompilationContext.compile(
at org.apache.jasper.servlet.JspServletWrapper.service(
at org.apache.jasper.servlet.JspServlet.serviceJspFile(
at org.apache.jasper.servlet.JspServlet.service(
at javax.servlet.http.HttpServlet.service(
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(
at org.apache.catalina.core.ApplicationFilterChain.doFilter(
at org.apache.tomcat.websocket.server.WsFilter.doFilter(
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(
at org.apache.catalina.core.ApplicationFilterChain.doFilter(
at org.apache.catalina.core.ApplicationDispatcher.invoke(
at org.apache.catalina.core.ApplicationDispatcher.processRequest(
at org.apache.catalina.core.ApplicationDispatcher.doForward(
at org.apache.catalina.core.ApplicationDispatcher.forward(
at org.apache.jsp.index_jsp._jspService(
at org.apache.jasper.runtime.HttpJspBase.service(
at javax.servlet.http.HttpServlet.service(
at org.apache.jasper.servlet.JspServletWrapper.service(
at org.apache.jasper.servlet.JspServlet.serviceJspFile(
at org.apache.jasper.servlet.JspServlet.service(
at javax.servlet.http.HttpServlet.service(
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(
at org.apache.catalina.core.ApplicationFilterChain.doFilter(
at org.apache.tomcat.websocket.server.WsFilter.doFilter(
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(
at org.apache.catalina.core.ApplicationFilterChain.doFilter(
at org.dspace.utils.servlet.DSpaceWebappServletFilter.doFilter(
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(
at org.apache.catalina.core.ApplicationFilterChain.doFilter(
at org.apache.catalina.core.StandardWrapperValve.invoke(
at org.apache.catalina.core.StandardContextValve.invoke(
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(
at org.apache.catalina.core.StandardHostValve.invoke(
at org.apache.catalina.valves.ErrorReportValve.invoke(
at org.apache.catalina.valves.CrawlerSessionManagerValve.invoke(
at org.apache.catalina.valves.AbstractAccessLogValve.invoke(
at org.apache.catalina.core.StandardEngineValve.invoke(
at org.apache.catalina.connector.CoyoteAdapter.service(
at org.apache.coyote.http11.Http11Processor.service(
at org.apache.coyote.AbstractProcessorLight.process(
at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(
at java.util.concurrent.ThreadPoolExecutor.runWorker(
at java.util.concurrent.ThreadPoolExecutor$
at org.apache.tomcat.util.threads.TaskThread$
- I notice that I get different JSESSIONID cookies for `/` (XMLUI) and `/jspui` (JSPUI) on Tomcat 8.5.37, I wonder if it's the same on Tomcat 7.0.92... yes I do.
- Hmm, on Tomcat 7.0.92 I see that I get a `` session cookie after logging into XMLUI, and then when I browse to JSPUI I am still logged in...
- I didn't see that cookie being set on Tomcat 8.5.37
- I sent a message to the dspace-tech mailing list to ask
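
One way to compare what the two webapps hand out is to collect the `Set-Cookie` headers from each and group the cookie names by their `Path` attribute. A rough sketch using only the standard library follows; the header values are hypothetical stand-ins, not captured from my Tomcat instances.

```python
from http.cookies import SimpleCookie


def cookie_paths(set_cookie_headers):
    """Map each cookie name to the set of Path attributes it was issued with."""
    seen = {}
    for header in set_cookie_headers:
        jar = SimpleCookie()  # fresh jar so same-named cookies don't overwrite each other
        jar.load(header)
        for name, morsel in jar.items():
            seen.setdefault(name, set()).add(morsel["path"] or "/")
    return seen


# Hypothetical Set-Cookie values for XMLUI (/) and JSPUI (/jspui).
headers = [
    "JSESSIONID=AAAA; Path=/; HttpOnly",
    "JSESSIONID=BBBB; Path=/jspui; HttpOnly",
]

paths = cookie_paths(headers)
print(paths)
```

A cookie name that shows up under more than one path is scoped per webapp, so a login in one context would not automatically carry over to the other.
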
## 2019-01-04
- Linode sent a message last night that CGSpace (linode18) had high CPU usage, but I don't see anything around that time in the web server logs:
# zcat --force /var/log/nginx/*.log /var/log/nginx/*.log.1 | grep -E "03/Jan/2019:1(7|8|9)" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
- I'm thinking about trying to validate our `dc.subject` terms against [AGROVOC webservices](
- There seem to be a few APIs and the documentation is kinda confusing, but I found this REST endpoint that does work well, for example searching for `SOIL`:
$ http
HTTP/1.1 200 OK
Access-Control-Allow-Origin: *
Connection: Keep-Alive
Content-Length: 493
Content-Type: application/json; charset=utf-8
Date: Fri, 04 Jan 2019 13:44:27 GMT
Keep-Alive: timeout=5, max=100
Server: Apache
Strict-Transport-Security: max-age=63072000; includeSubdomains
Vary: Accept
X-Content-Type-Options: nosniff
X-Frame-Options: ALLOW-FROM
"@context": {
"@language": "en",
"altLabel": "skos:altLabel",
"hiddenLabel": "skos:hiddenLabel",
"isothes": "",
"onki": "",
"prefLabel": "skos:prefLabel",
"results": {
"@container": "@list",
"@id": "onki:results"
"skos": "",
"type": "@type",
"uri": "@id"
"results": [
"lang": "en",
"prefLabel": "soil",
"type": [
"uri": "",
"vocab": "agrovoc"
"uri": ""
- The API does not appear to be case sensitive (searches for `SOIL` and `soil` return the same thing)
- I'm a bit confused that there's no obvious return code or status when a term is not found, for example `SOILS`:
HTTP/1.1 200 OK
Access-Control-Allow-Origin: *
Connection: Keep-Alive
Content-Length: 367
Content-Type: application/json; charset=utf-8
Date: Fri, 04 Jan 2019 13:48:31 GMT
Keep-Alive: timeout=5, max=100
Server: Apache
Strict-Transport-Security: max-age=63072000; includeSubdomains
Vary: Accept
X-Content-Type-Options: nosniff
X-Frame-Options: ALLOW-FROM
"@context": {
"@language": "en",
"altLabel": "skos:altLabel",
"hiddenLabel": "skos:hiddenLabel",
"isothes": "",
"onki": "",
"prefLabel": "skos:prefLabel",
"results": {
"@container": "@list",
"@id": "onki:results"
"skos": "",
"type": "@type",
"uri": "@id"
"results": [],
"uri": ""
- I guess the `results` object will just be empty...
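
Since the HTTP status is 200 either way, a validation script would have to look at the body. A minimal sketch, assuming the response has already been parsed from JSON into a dict shaped like the output above (the function names are my own):

```python
def term_found(response):
    """True if the search response contains at least one hit."""
    return bool(response.get("results"))


def matched_labels(response):
    """Return the prefLabel of every hit in the response."""
    return [hit.get("prefLabel") for hit in response.get("results", [])]


# Trimmed-down versions of the two responses shown above.
soil = {"results": [{"lang": "en", "prefLabel": "soil", "vocab": "agrovoc"}]}
soils = {"results": []}

print(term_found(soil), matched_labels(soil))
print(term_found(soils), matched_labels(soils))
```
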
- Another way would be to try with SPARQL, perhaps using the Python 2.7 [sparql-client](
$ python2.7 -m virtualenv /tmp/sparql
$ . /tmp/sparql/bin/activate
$ pip install sparql-client ipython
$ ipython
In [10]: import sparql
In [11]: s = sparql.Service("", "utf-8", "GET")
In [12]: statement=('PREFIX skos: <> '
...: 'SELECT '
...: '?label '
...: 'WHERE { '
...: '{ ?concept skos:altLabel ?label . } UNION { ?concept skos:prefLabel ?label . } '
...: 'FILTER regex(str(?label), "^fish", "i") . '
...: '} LIMIT 10')
In [13]: result = s.query(statement)
In [14]: for row in result.fetchone():
...: print(row)
(<Literal "fish catching"@en>,)
(<Literal "fish harvesting"@en>,)
(<Literal "fish meat"@en>,)
(<Literal "fish roe"@en>,)
(<Literal "fish conversion"@en>,)
(<Literal "fisheries catches (composition)"@en>,)
(<Literal "fishtail palm"@en>,)
(<Literal "fishflies"@en>,)
(<Literal "fishery biology"@en>,)
(<Literal "fish production"@en>,)
- The SPARQL query comes from my notes in [2017-08]({{< relref "" >}})
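
To reuse that query for other prefixes, the statement could be generated by a small helper. This is a sketch only: the function name is made up, the namespace default is the standard W3C SKOS core IRI (supplied here as an assumption), and regex metacharacters in the prefix are passed through unescaped.

```python
# Standard W3C SKOS core namespace, used as an assumed default.
SKOS_CORE = "http://www.w3.org/2004/02/skos/core#"


def label_query(prefix, limit=10, skos=SKOS_CORE):
    """Build a SPARQL query for labels starting with `prefix` (case-insensitive)."""
    # Escape backslashes and double quotes so the prefix stays inside the string literal.
    safe = prefix.replace("\\", "\\\\").replace('"', '\\"')
    return (
        f"PREFIX skos: <{skos}> "
        "SELECT ?label WHERE { "
        "{ ?concept skos:altLabel ?label . } UNION { ?concept skos:prefLabel ?label . } "
        f'FILTER regex(str(?label), "^{safe}", "i") . '
        f"}} LIMIT {int(limit)}"
    )


statement = label_query("fish")
print(statement)
```

The resulting string can be passed to `s.query()` in the same way as the hand-written statement.
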
## 2019-01-06
- I built a clean DSpace 5.8 installation from the upstream `dspace-5.8` tag and the issue with the XMLUI/JSPUI login is still there with Tomcat 8.5.37
- If I log into XMLUI and then navigate to JSPUI I need to log in again
- XMLUI does not set the `` session cookie in Tomcat 8.5.37 for some reason
- I sent an update to the dspace-tech mailing list to ask for more help troubleshooting
## 2019-01-07
- I built a clean DSpace 6.3 installation from the upstream `dspace-6.3` tag and the issue with the XMLUI/JSPUI login is still there with Tomcat 8.5.37
- If I log into XMLUI and then navigate to JSPUI I need to log in again
- XMLUI does not set the `` session cookie in Tomcat 8.5.37 for some reason
- I sent an update to the dspace-tech mailing list to ask for more help troubleshooting
## 2019-01-08
- Tim Donohue responded to my thread about the cookies on the dspace-tech mailing list
- He suspects it's a change of behavior in Tomcat 8.5, and indeed I see a mention of new cookie processing in the [Tomcat 8.5 migration guide](
- I tried to switch my XMLUI and JSPUI contexts to use the `LegacyCookieProcessor`, but it didn't seem to help
- I [filed DS-4140 on the DSpace issue tracker](
## 2019-01-11
- Tezira wrote to say she has stopped receiving the `DSpace Submission Approved and Archived` emails from CGSpace as of January 2nd
- I told her that I haven't done anything to disable it lately, but that I would check
- Bizu also says she hasn't received them lately
## 2019-01-14
- Day one of CGSpace AReS meeting in Amman
## 2019-01-15
- Day two of CGSpace AReS meeting in Amman
- Discuss possibly extending the [dspace-statistics-api]( to make community and collection statistics available
- Discuss new "final" CG Core document and some changes that we'll need to make on CGSpace and other repositories
- We agreed to try to stick to pure Dublin Core where possible, then use fields that exist in standard DSpace, and use "cg" namespace for everything else
- Major changes are to move `` to `dc.creator` (which MELSpace and WorldFish are already using in their DSpace repositories)
- I am testing the speed of the WorldFish DSpace repository's REST API and it's five to ten times faster than CGSpace as I tested in [2018-10]({{< relref "" >}}):
$ time http --print h ',bitstreams,parentCommunityList&limit=100&offset=0'
0.16s user 0.03s system 3% cpu 5.185 total
0.17s user 0.02s system 2% cpu 7.123 total
0.18s user 0.02s system 6% cpu 3.047 total
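
To compare runs more easily, the wall-clock figure can be pulled out of each `time` line and averaged. A small sketch, assuming the zsh-style output shown above where the `total` field is plain seconds (runs over a minute are printed differently and would not match):

```python
import re
from statistics import mean

# Wall-clock seconds are the number between "cpu" and "total".
TOTAL_RE = re.compile(r"cpu\s+([\d.]+)\s+total")


def wall_seconds(lines):
    """Extract the wall-clock seconds from zsh `time` output lines."""
    return [float(m.group(1)) for line in lines if (m := TOTAL_RE.search(line))]


runs = [
    "0.16s user 0.03s system 3% cpu 5.185 total",
    "0.17s user 0.02s system 2% cpu 7.123 total",
    "0.18s user 0.02s system 6% cpu 3.047 total",
]

totals = wall_seconds(runs)
average = mean(totals)
print(totals, round(average, 3))
```
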
<!-- vim: set sw=2 ts=2: -->