org.apache.jasper.JasperException: /home.jsp (line: [214], column: [1]) /discovery/static-tagcloud-facet.jsp (line: [57], column: [8]) No tag [tagcloud] defined in tag library imported with prefix [dspace]
at org.apache.jasper.compiler.DefaultErrorHandler.jspError(DefaultErrorHandler.java:41)
at org.apache.jasper.compiler.ErrorDispatcher.dispatch(ErrorDispatcher.java:291)
at org.apache.jasper.compiler.ErrorDispatcher.jspError(ErrorDispatcher.java:97)
at org.apache.jasper.compiler.Parser.processIncludeDirective(Parser.java:347)
at org.apache.jasper.compiler.Parser.parseIncludeDirective(Parser.java:380)
at org.apache.jasper.compiler.Parser.parseDirective(Parser.java:481)
at org.apache.jasper.compiler.Parser.parseElements(Parser.java:1445)
at org.apache.jasper.compiler.Parser.parseBody(Parser.java:1683)
at org.apache.jasper.compiler.Parser.parseOptionalBody(Parser.java:1016)
at org.apache.jasper.compiler.Parser.parseCustomTag(Parser.java:1291)
at org.apache.jasper.compiler.Parser.parseElements(Parser.java:1470)
at org.apache.jasper.compiler.Parser.parse(Parser.java:144)
at org.apache.jasper.compiler.ParserController.doParse(ParserController.java:244)
at org.apache.jasper.compiler.ParserController.parse(ParserController.java:105)
at org.apache.jasper.compiler.Compiler.generateJava(Compiler.java:202)
at org.apache.jasper.compiler.Compiler.compile(Compiler.java:373)
at org.apache.jasper.compiler.Compiler.compile(Compiler.java:350)
at org.apache.jasper.compiler.Compiler.compile(Compiler.java:334)
at org.apache.jasper.JspCompilationContext.compile(JspCompilationContext.java:595)
at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:399)
at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:386)
at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:330)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:742)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.apache.catalina.core.ApplicationDispatcher.invoke(ApplicationDispatcher.java:728)
at org.apache.catalina.core.ApplicationDispatcher.processRequest(ApplicationDispatcher.java:470)
at org.apache.catalina.core.ApplicationDispatcher.doForward(ApplicationDispatcher.java:395)
at org.apache.catalina.core.ApplicationDispatcher.forward(ApplicationDispatcher.java:316)
at org.dspace.app.webui.util.JSPManager.showJSP(JSPManager.java:60)
at org.apache.jsp.index_jsp._jspService(index_jsp.java:191)
at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:742)
at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:476)
at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:386)
at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:330)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:742)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.dspace.utils.servlet.DSpaceWebappServletFilter.doFilter(DSpaceWebappServletFilter.java:78)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:198)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:493)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:140)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:81)
at org.apache.catalina.valves.CrawlerSessionManagerValve.invoke(CrawlerSessionManagerValve.java:234)
at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:650)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:87)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:342)
at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:800)
at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66)
at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:806)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1498)
at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:748)
</code></pre><ul>
<li>I notice that I get different JSESSIONID cookies for <code>/</code> (XMLUI) and <code>/jspui</code> (JSPUI) on Tomcat 8.5.37. I wondered whether it’s the same on Tomcat 7.0.92, and yes, it is.</li>
<li>Hmm, on Tomcat 7.0.92 I see that I get a <code>dspace.current.user.id</code> session cookie after logging into XMLUI, and then when I browse to JSPUI I am still logged in…
<ul>
<li>I didn’t see that cookie being set on Tomcat 8.5.37</li>
</ul>
</li>
<li>I sent a message to the dspace-tech mailing list to ask</li>
</ul>
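<p>To compare which cookies each Tomcat version sets, the <code>Set-Cookie</code> headers from a login response can be diffed by name. This is a minimal sketch using only the Python standard library. The header values below are hypothetical placeholders shaped like what I saw, not captured output, and <code>cookie_names</code> is just an illustrative helper.</p>

```python
from http.cookies import SimpleCookie


def cookie_names(set_cookie_headers):
    """Return the names of all cookies found in a list of Set-Cookie header values."""
    names = set()
    for header in set_cookie_headers:
        jar = SimpleCookie()
        jar.load(header)  # attributes such as Path and HttpOnly are not counted as cookies
        names.update(jar.keys())
    return names


# Hypothetical header values, only shaped like the responses from the two Tomcat versions.
tomcat7_headers = [
    "JSESSIONID=AAAA1111; Path=/; HttpOnly",
    "dspace.current.user.id=1; Path=/",
]
tomcat85_headers = ["JSESSIONID=BBBB2222; Path=/; HttpOnly"]

# Cookies that Tomcat 7 sets but Tomcat 8.5 does not.
missing = cookie_names(tomcat7_headers) - cookie_names(tomcat85_headers)
print(sorted(missing))
```

<p>Feeding it real headers (for example copied from the browser’s network inspector) would show at a glance whether <code>dspace.current.user.id</code> is the only difference.</p>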
<h2 id="2019-01-04">2019-01-04</h2>
<ul>
<li>Linode sent a message last night that CGSpace (linode18) had high CPU usage, but I don’t see anything around that time in the web server logs:</li>
<li>I’m thinking about trying to validate our <code>dc.subject</code> terms against <a href="http://aims.fao.org/agrovoc/webservices">AGROVOC webservices</a></li>
<li>There seem to be a few APIs and the documentation is kinda confusing, but I found this REST endpoint that does work well, for example searching for <code>SOIL</code>:</li>
<li>The SPARQL query comes from my notes in <a href="/cgspace-notes/2017-08/">2017-08</a></li>
</ul>
<h2 id="2019-01-06">2019-01-06</h2>
<ul>
<li>I built a clean DSpace 5.8 installation from the upstream <code>dspace-5.8</code> tag and the issue with the XMLUI/JSPUI login is still there with Tomcat 8.5.37
<ul>
<li>If I log into XMLUI and then navigate to JSPUI I need to log in again</li>
<li>XMLUI does not set the <code>dspace.current.user.id</code> session cookie in Tomcat 8.5.37 for some reason</li>
<li>I sent an update to the dspace-tech mailing list to ask for more help troubleshooting</li>
</ul>
</li>
</ul>
<h2 id="2019-01-07">2019-01-07</h2>
<ul>
<li>I built a clean DSpace 6.3 installation from the upstream <code>dspace-6.3</code> tag and the issue with the XMLUI/JSPUI login is still there with Tomcat 8.5.37
<ul>
<li>If I log into XMLUI and then navigate to JSPUI I need to log in again</li>
<li>XMLUI does not set the <code>dspace.current.user.id</code> session cookie in Tomcat 8.5.37 for some reason</li>
<li>I sent an update to the dspace-tech mailing list to ask for more help troubleshooting</li>
</ul>
</li>
</ul>
<h2 id="2019-01-08">2019-01-08</h2>
<ul>
<li>Tim Donohue responded to my thread about the cookies on the dspace-tech mailing list
<ul>
<li>He suspects it’s a change of behavior in Tomcat 8.5, and indeed I see a mention of new cookie processing in the <a href="https://tomcat.apache.org/migration-85.html#Cookies">Tomcat 8.5 migration guide</a></li>
<li>I tried to switch my XMLUI and JSPUI contexts to use the <code>LegacyCookieProcessor</code>, but it didn’t seem to help</li>
<li>I <a href="https://jira.duraspace.org/browse/DS-4140">filed DS-4140 on the DSpace issue tracker</a></li>
</ul>
</li>
</ul>
<h2 id="2019-01-11">2019-01-11</h2>
<ul>
<li>Tezira wrote to say she has stopped receiving the <code>DSpace Submission Approved and Archived</code> emails from CGSpace as of January 2nd
<ul>
<li>I told her that I haven’t done anything to disable it lately, but that I would check</li>
<li>Bizu also says she hasn’t received them lately</li>
</ul>
</li>
</ul>
<h2 id="2019-01-14">2019-01-14</h2>
<ul>
<li>Day one of CGSpace AReS meeting in Amman</li>
</ul>
<h2 id="2019-01-15">2019-01-15</h2>
<ul>
<li>Day two of CGSpace AReS meeting in Amman
<ul>
<li>Discuss possibly extending the <a href="https://github.com/ilri/dspace-statistics-api">dspace-statistics-api</a> to make community and collection statistics available</li>
<li>Discuss new “final” CG Core document and some changes that we’ll need to do on CGSpace and other repositories</li>
<li>We agreed to try to stick to pure Dublin Core where possible, then use fields that exist in standard DSpace, and use “cg” namespace for everything else</li>
<li>Major changes are to move <code>dc.contributor.author</code> to <code>dc.creator</code> (which MELSpace and WorldFish are already using in their DSpace repositories)</li>
</ul>
</li>
<li>I am testing the speed of the WorldFish DSpace repository’s REST API and it’s five to ten times faster than CGSpace as I tested in <a href="/cgspace-notes/2018-10/">2018-10</a>:</li>
</ul>
<pre tabindex="0"><code>$ time http --print h 'https://digitalarchive.worldfishcenter.org/rest/items?expand=metadata,bitstreams,parentCommunityList&limit=100&offset=0'
0.16s user 0.03s system 3% cpu 5.185 total
0.17s user 0.02s system 2% cpu 7.123 total
0.18s user 0.02s system 6% cpu 3.047 total
</code></pre><ul>
<li>In other news, Linode sent a mail last night that the CPU load on CGSpace (linode18) was high, here are the top IPs in the logs around those few hours:</li>
<li>And what is the relationship between DC and DCTERMS?</li>
<li>DSpace uses DCTERMS in the metadata it embeds in XMLUI item views!</li>
<li>We really need to look at this more carefully and see the impacts that might be made from switching core fields like languages, abstract, authors, etc</li>
<li>We can check WorldFish and MELSpace repositories to see what effects these changes have had on theirs because they have already adopted some of these changes…</li>
<li>I think I understand the difference between DC and DCTERMS finally: DC is the original set of fifteen elements and DCTERMS is the newer version that was supposed to address many of the drawbacks of the original with regard to digital content</li>
<li>We might be able to use some proper fields for citation, abstract, etc that are part of DCTERMS</li>
<li>To make matters more confusing, there is also “qualified Dublin Core” that uses the original fifteen elements of legacy DC and qualifies them, like <code>dc.date.accessioned</code>
<ul>
<li>According to Wikipedia <a href="https://en.wikipedia.org/wiki/Dublin_Core">Qualified Dublin Core was superseded by DCTERMS in 2008</a>!</li>
</ul>
</li>
<li>So we should be trying to use DCTERMS where possible, unless it is some internal thing that might mess up DSpace (like dates)</li>
<li>“Elements 1.1” means legacy DC</li>
<li>
<p>There’s no official set of Dublin Core qualifiers so I can’t tell if things like <code>dc.contributor.author</code> that are used by DSpace are official</p>
</li>
<li>
<p>I found a great <a href="https://www.dri.ie/sites/default/files/files/qualified-dublin-core-metadata-guidelines.pdf">presentation from 2015 by the Digital Repository of Ireland</a> that discusses using MARC Relator Terms with Dublin Core elements</p>
</li>
<li>
<p>It seems that <code>dc.contributor.author</code> would be a supported term according to this <a href="https://memory.loc.gov/diglib/loc.terms/relators/dc-contributor.html">Library of Congress list</a> linked from the <a href="http://dublincore.org/usage/documents/relators/">Dublin Core website</a></p>
</li>
<li>
<p>The Library of Congress document specifically says:</p>
<p>These terms conform with the DCMI Abstract Model and may be used in DCMI application profiles. DCMI endorses their use with Dublin Core elements as indicated.</p>
</li>
</ul>
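<p>The move of <code>dc.contributor.author</code> to <code>dc.creator</code> discussed at the AReS meeting could be tried out on exported metadata before touching the repository. The sketch below assumes a record is a plain dict of field name to list of values, which is a hypothetical shape for illustration and not DSpace’s own data model. Only the one mapping mentioned in the notes is included.</p>

```python
# The single field move agreed in the meeting notes; anything else would be a guess.
FIELD_MOVES = {"dc.contributor.author": "dc.creator"}


def move_fields(metadata, moves=FIELD_MOVES):
    """Return a copy of `metadata` with fields renamed according to `moves`.

    If the target field already exists, the moved values are appended to it.
    """
    result = {}
    for field, values in metadata.items():
        target = moves.get(field, field)
        result.setdefault(target, []).extend(values)
    return result


# Hypothetical record, only for illustration.
record = {
    "dc.title": ["An example title"],
    "dc.contributor.author": ["Author, One", "Author, Two"],
}
migrated = move_fields(record)
print(migrated)
```

<p>Fields not named in <code>FIELD_MOVES</code> pass through untouched, so internal fields such as dates stay where DSpace expects them.</p>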
<h2 id="2019-01-20">2019-01-20</h2>
<ul>
<li>That’s weird, I logged into DSpace Test (linode19) and it says it has been up for 213 days:</li>
<li>I’ve definitely rebooted it several times in the past few months… according to <code>journalctl -b</code> it was a few weeks ago on 2019-01-02</li>
<li>I re-ran the Ansible DSpace tag, ran all system updates, and rebooted the host</li>
<li>After rebooting I notice that the Linode kernel went down from 4.19.8 to 4.18.16…</li>
<li>Atmire sent a quote on our <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=657">ticket about purchasing the Metadata Quality Module (MQM) for DSpace 5.8</a></li>
<li>Abenet asked me for an <a href="https://cgspace.cgiar.org/open-search/discover?query=crpsubject:Livestock&sort_by=3&order=DESC">OpenSearch query that could generate an RSS feed for items in the Livestock CRP</a></li>
<li>According to my notes, <code>sort_by=3</code> is accession date (as configured in <code>dspace.cfg</code>)</li>
<li>The query currently shows 3023 items, but a <a href="https://cgspace.cgiar.org/discover?filtertype_1=crpsubject&filter_relational_operator_1=equals&filter_1=Livestock&submit_apply_filter=&query=">Discovery search for Livestock CRP only returns 858 items</a></li>
<li>That query seems to return items tagged with <code>Livestock and Fish</code> CRP as well… hmm.</li>
</ul>
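<p>For reference, the OpenSearch URL above can be assembled from its parts, which makes it easier to try other fields or sort orders. This is a small sketch using the standard library. The base URL and the three parameters are the ones from the link in the notes, while the function name is just illustrative.</p>

```python
from urllib.parse import urlencode

# Base URL taken from the OpenSearch link in the notes.
OPENSEARCH_URL = "https://cgspace.cgiar.org/open-search/discover"


def opensearch_feed_url(field, value, sort_by=3, order="DESC"):
    """Build an OpenSearch URL for a `field:value` query."""
    params = {
        "query": f"{field}:{value}",
        "sort_by": sort_by,  # 3 is accession date according to the notes (set in dspace.cfg)
        "order": order,
    }
    # Keep ":" unescaped so the result matches the form of the URL in the notes.
    return f"{OPENSEARCH_URL}?{urlencode(params, safe=':')}"


print(opensearch_feed_url("crpsubject", "Livestock"))
```

<p>This only reproduces the query as written. It does nothing about the query also matching <code>Livestock and Fish</code>.</p>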
<h2 id="2019-01-21">2019-01-21</h2>
<ul>
<li>Investigating running Tomcat 7 on Ubuntu 18.04 with the tarball and a custom systemd package instead of waiting for our DSpace to get compatible with Ubuntu 18.04’s Tomcat 8.5</li>
<li>I could either run with a simple <code>tomcat7.service</code> like this:</li>
</ul>
<pre tabindex="0"><code>[Unit]
Description=Apache Tomcat 7 Web Application Container
</code></pre><ul>
<li>I see that <code>jsvc</code> and <code>libcommons-daemon-java</code> are both available on Ubuntu so that should be easy to port</li>
<li>We probably don’t need Eclipse Java Bytecode Compiler (ecj)</li>
<li>I tested Tomcat 7.0.92 on Arch Linux using the <code>tomcat7.service</code> with <code>jsvc</code> and it works… nice!</li>
<li>I think I might manage this the same way I do the restic releases in the <a href="https://github.com/ilri/rmg-ansible-public">Ansible infrastructure scripts</a>, where I download a specific version and symlink to some generic location without the version number</li>
<li>I verified that there is indeed an issue with sharded Solr statistics cores on DSpace, which will cause inaccurate results in the dspace-statistics-api:</li>
<li>I opened an issue on the GitHub issue tracker (<a href="https://github.com/ilri/dspace-statistics-api/issues/10">#10</a>)</li>
<li>I don’t think the <a href="https://solrclient.readthedocs.io/en/latest/">SolrClient library</a> we are currently using supports these types of queries so we might have to just do raw queries with requests</li>
<li>The <a href="https://github.com/django-haystack/pysolr">pysolr</a> library says it supports multicore indexes, but I am not sure it does (or at least not with our setup):</li>
<li>If I double check one item from above, for example <code>77572</code>, it appears this is only working on the current statistics core and not the shards:</li>
<li>I should be able to modify the dspace-statistics-api to check the shards via the Solr core status, then add the <code>shards</code> parameter to each query to make the search distributed among the cores</li>
<li>I implemented a proof of concept to query the Solr STATUS for active cores and to add them with a <code>shards</code> query string</li>
<li>A few things I noticed:
<ul>
<li>Solr doesn’t mind if you use an empty <code>shards</code> parameter</li>
<li>Solr doesn’t mind if you have an extra comma at the end of the <code>shards</code> parameter</li>
<li>If you are searching multiple cores, you need to include the base core in the <code>shards</code> parameter as well</li>
<li>For example, compare the following two queries, first including the base core and the shard in the <code>shards</code> parameter, and then only including the shard:</li>
<li>Release <a href="https://github.com/ilri/dspace-statistics-api/releases/tag/v0.9.0">version 0.9.0 of the dspace-statistics-api</a> to address the issue of querying multiple Solr statistics shards</li>
<li>I deployed it on DSpace Test (linode19) and restarted the indexer and now it shows all the stats from 2018 as well (756 pages of views, instead of 6)</li>
<li>I deployed it on CGSpace (linode18) and restarted the indexer as well</li>
<li>Linode sent an alert that CGSpace (linode18) was using high CPU this afternoon, the top ten IPs during that time were:</li>
<li>I don’t think we’ve seen 196.191.127.37 before. Its user agent is:</li>
</ul>
<pre tabindex="0"><code>Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 UBrowser/7.0.185.1002 Safari/537.36
</code></pre><ul>
<li>Interestingly this IP is located in Addis Ababa…</li>
<li>Another interesting one is 154.113.73.30, which is apparently at IITA Nigeria and uses the user agent:</li>
</ul>
<pre tabindex="0"><code>Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36
</code></pre><h2 id="2019-01-23">2019-01-23</h2>
<ul>
<li>Peter noticed that some goo.gl links in our tweets from Feedburner are broken, for example this one from last week:</li>
</ul>
<blockquote class="twitter-tweet"><p lang="en" dir="ltr"><a href="https://twitter.com/hashtag/ILRI?src=hash&ref_src=twsrc%5Etfw">#ILRI</a> research: Towards unlocking the potential of the hides and skins value chain in Somaliland <a href="https://t.co/EZH7ALW4dp">https://t.co/EZH7ALW4dp</a></p>— ILRI.org (@ILRI) <a href="https://twitter.com/ILRI/status/1086330519904673793?ref_src=twsrc%5Etfw">January 18, 2019</a></blockquote>
<ul>
<li>The shortened link is <a href="goo.gl/fb/VRj9Gq">goo.gl/fb/VRj9Gq</a> and it shows a “Dynamic Link not found” error from Firebase:</li>
</ul>
<p><img src="/cgspace-notes/2019/01/firebase-link-not-found.png" alt="Dynamic Link not found"></p>
<ul>
<li>
<p>Apparently Google announced last year that they plan to <a href="https://developers.googleblog.com/2018/03/transitioning-google-url-shortener.html">discontinue the shortener and transition to Firebase Dynamic Links in March, 2019</a>, so maybe this is related…</p>
</li>
<li>
<p>Very interesting discussion of methods for <a href="https://jdebp.eu/FGA/systemd-house-of-horror/tomcat.html">running Tomcat under systemd</a></p>
</li>
<li>
<p>We can set the ulimit options that used to be in <code>/etc/default/tomcat7</code> with systemd’s <code>LimitNOFILE</code> and <code>LimitAS</code> (see the <code>systemd.exec</code> man page)</p>
<ul>
<li>Note that we need to use <code>infinity</code> instead of <code>unlimited</code> for the address space</li>
</ul>
</li>
<li>
<p>Create accounts for Bosun from IITA and Valerio from ICARDA / CGMEL on DSpace Test</p>
</li>
<li>
<p>Maria Garruccio asked me for a list of author affiliations from all of their submitted items so she can clean them up</p>
</li>
<li>
<p>I got a list of their collections from the CGSpace XMLUI and then used an SQL query to dump the unique values to CSV:</p>
</li>
</ul>
<pre tabindex="0"><code>dspace=# \copy (select distinct text_value, count(*) from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'affiliation') AND resource_type_id = 2 AND resource_id IN (select item_id from collection2item where collection_id IN (select resource_id from handle where handle in ('10568/35501', '10568/41728', '10568/49622', '10568/56589', '10568/56592', '10568/65064', '10568/65718', '10568/65719', '10568/67373', '10568/67731', '10568/68235', '10568/68546', '10568/69089', '10568/69160', '10568/69419', '10568/69556', '10568/70131', '10568/70252', '10568/70978'))) group by text_value order by count desc) to /tmp/bioversity-affiliations.csv with csv;
COPY 1109
</code></pre><ul>
<li>Send a mail to the dspace-tech mailing list about the OpenSearch issue we had with the Livestock CRP</li>
<li>Linode sent an alert that CGSpace (linode18) had a high load this morning, here are the top ten IPs during that time:</li>
<li>
<p>Just to make sure these were not uploaded by the user or something, I manually forced the regeneration of these with DSpace’s <code>filter-media</code>:</p>
</li>
<li>Both of these were successful, so there must have been an update to ImageMagick or Ghostscript in Ubuntu since early 2018-12</li>
<li>Looking at the apt history logs I see that on 2018-12-07 a security update for Ghostscript was installed (version 9.26~dfsg+0-0ubuntu0.16.04.3)</li>
<li>I think this Launchpad discussion is relevant: <a href="https://bugs.launchpad.net/ubuntu/+source/ghostscript/+bug/1806517">https://bugs.launchpad.net/ubuntu/+source/ghostscript/+bug/1806517</a></li>
<li>As well as the original Ghostscript bug report: <a href="https://bugs.ghostscript.com/show_bug.cgi?id=699815">https://bugs.ghostscript.com/show_bug.cgi?id=699815</a></li>
</ul>
<h2 id="2019-01-24">2019-01-24</h2>
<ul>
<li>I noticed Ubuntu’s Ghostscript 9.26 works on some troublesome PDFs where Arch’s Ghostscript 9.26 doesn’t, so the fix for the first/last page crash is not the patch I found yesterday</li>
<li>Ubuntu’s Ghostscript uses another <a href="http://git.ghostscript.com/?p=ghostpdl.git;h=fae21f1668d2b44b18b84cf0923a1d5f3008a696">patch from Ghostscript git</a> (<a href="https://bugs.ghostscript.com/show_bug.cgi?id=700315">upstream bug report</a>)</li>
<li>I re-compiled Arch’s ghostscript with the patch and then I was able to generate a thumbnail from one of the <a href="https://cgspace.cgiar.org/handle/10568/98390">troublesome PDFs</a></li>
<li>I reported it to the Arch Linux bug tracker (<a href="https://bugs.archlinux.org/task/61513">61513</a>)</li>
<li>I told Atmire to go ahead with the Metadata Quality Module addition based on our <code>5_x-dev</code> branch (<a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=657">657</a>)</li>
<li>Linode sent alerts last night to say that CGSpace (linode18) was using high CPU, here are the top ten IPs from the nginx logs around that time:</li>
<li>The full <a href="http://id.loc.gov/vocabulary/relators.html">list of MARC Relators on the Library of Congress website</a> linked from the <a href="http://dublincore.org/usage/documents/relators/">DCMI relators page</a> is very confusing</li>
<li>Looking at the default DSpace XMLUI crosswalk in <a href="https://github.com/DSpace/DSpace/blob/dspace-5_x/dspace/config/crosswalks/xhtml-head-item.properties">xhtml-head-item.properties</a> I see a very complete mapping of DSpace DC and QDC fields to DCTERMS
<ul>
<li>This is good for standards-compliant web crawlers, but what about for those harvesting via REST or OAI APIs?</li>
</ul>
</li>
<li>I sent a message titled “<a href="https://groups.google.com/forum/#!topic/dspace-tech/phV_t51TGuE">DC, QDC, and DCTERMS: reviewing our metadata practices</a>” to the dspace-tech mailing list to ask about some of this</li>
</ul>
<h2 id="2019-01-25">2019-01-25</h2>
<ul>
<li>A little bit more work on getting Tomcat to run from a tarball on our <a href="https://github.com/ilri/rmg-ansible-public">Ansible infrastructure playbooks</a>
<ul>
<li>I tested by doing a Tomcat 7.0.91 installation, then switching it to 7.0.92 and it worked… nice!</li>
<li>I refined the tasks so much that I was confident enough to deploy them on DSpace Test and it went very well</li>
<li>Basically I just stopped tomcat7, created a dspace user, removed tomcat7, chown’d everything to the dspace user, then ran the playbook</li>
<li>So now DSpace Test (linode19) is running Tomcat 7.0.92… w00t</li>
<li>Now we need to monitor it for a few weeks to see if there is anything we missed, and then I can change CGSpace (linode18) as well, and we’re ready for Ubuntu 18.04 too!</li>
</ul>
</li>
</ul>
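<p>The idea of unpacking a specific Tomcat version and pointing a generic, version-less symlink at it can be sketched outside Ansible as well. This is a rough illustration and not the playbook itself. The <code>apache-tomcat-&lt;version&gt;</code> directory name and the <code>tomcat7</code> link name are assumptions, and the atomic rename relies on POSIX behaviour.</p>

```python
import os
import tempfile


def activate_version(base_dir, version, link_name="tomcat7"):
    """Point base_dir/link_name at base_dir/apache-tomcat-<version>."""
    target = f"apache-tomcat-{version}"  # assumed name of the unpacked tarball directory
    if not os.path.isdir(os.path.join(base_dir, target)):
        raise FileNotFoundError(f"{target} is not unpacked in {base_dir}")
    link_path = os.path.join(base_dir, link_name)
    tmp_path = link_path + ".tmp"
    if os.path.lexists(tmp_path):
        os.remove(tmp_path)
    os.symlink(target, tmp_path)     # relative link, so base_dir can be moved as a whole
    os.replace(tmp_path, link_path)  # rename over the old link in one step
    return link_path


# Demo in a throwaway directory: install 7.0.91, then switch to 7.0.92.
demo_base = tempfile.mkdtemp()
for demo_version in ("7.0.91", "7.0.92"):
    os.mkdir(os.path.join(demo_base, f"apache-tomcat-{demo_version}"))
    link = activate_version(demo_base, demo_version)
print(os.readlink(link))
```

<p>Because services only ever reference the generic path, switching versions is a matter of unpacking the new tarball, re-pointing the link, and restarting Tomcat.</p>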
<h2 id="2019-01-27">2019-01-27</h2>
<ul>
<li>Linode sent an email that the server was using a lot of CPU this morning, and these were the top IPs in the web server logs at the time:</li>
<li>205.186.128.185 is CCAFS or perhaps another Macaroni Bros harvester (new ILRI website?)</li>
</ul>
</li>
</ul>
<h2 id="2019-01-28">2019-01-28</h2>
<ul>
<li>Udana from WLE asked me about the interaction between their publication website and their items on CGSpace
<ul>
<li>There is an item that is mapped into their collection from IWMI and is missing their <code>cg.identifier.wletheme</code> metadata</li>
<li>I told him that, as far as I remember, when WLE introduced Phase II research themes in 2017 we decided to infer theme ownership from the collection hierarchy and we created a <a href="https://cgspace.cgiar.org/handle/10568/81268">WLE Phase II Research Themes</a> subCommunity</li>
<li>Perhaps they need to ask Macaroni Bros about the mapping</li>
</ul>
</li>
<li>Linode alerted that CGSpace (linode18) was using too much CPU again this morning, here are the active IPs from the web server log at the time:</li>
<li>There seems to be a pattern with <code>70.32.83.92</code> and <code>205.186.128.185</code> lately!</li>
<li>Every morning at 8AM they are the top users… I should tell them to stagger their requests…</li>
<li>I signed up for a <a href="https://visualping.io/">VisualPing</a> of the <a href="https://jdbc.postgresql.org/download.html">PostgreSQL JDBC driver download page</a> to my CGIAR email address
<ul>
<li>Hopefully this will one day alert me that a new driver is released!</li>
</ul>
</li>
<li>Last night Linode sent an alert that CGSpace (linode18) was using high CPU, here are the most active IPs in the hours just before, during, and after the alert:</li>
<li>Of course there is CIAT’s <code>45.5.186.2</code>, but also <code>45.5.184.2</code> appears to be CIAT… I wonder why they have two harvesters?</li>
<li><code>199.47.87.140</code> and <code>199.47.87.141</code> are TurnItIn with the following user agent:</li>
<li>Linode sent an alert about CGSpace (linode18) CPU usage this morning, here are the top IPs in the web server logs just before, during, and after the alert:</li>
<li>I might need to adjust the threshold again, because the load average this morning was 296% and the activity looks pretty normal (as always recently)</li>
</ul>
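<p>Since I keep pulling the top IPs out of the web server logs after every Linode alert, here is a minimal sketch of that tally. It assumes the client IP is the first whitespace-separated field of each line, as in nginx’s default <code>combined</code> format. The sample lines reuse IPs mentioned in these notes, but the timestamps, paths, and counts are made up for illustration.</p>

```python
from collections import Counter


def top_ips(log_lines, n=10):
    """Return the n most frequent client IPs as (ip, count) pairs.

    Assumes the IP is the first whitespace-separated field of each line.
    """
    counts = Counter()
    for line in log_lines:
        fields = line.split(maxsplit=1)
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts.most_common(n)


# Made-up sample lines, only shaped like nginx access log entries.
sample = (
    ['70.32.83.92 - - [28/Jan/2019:08:00:01 +0000] "GET /rest/items HTTP/1.1" 200 512'] * 3
    + ['205.186.128.185 - - [28/Jan/2019:08:00:02 +0000] "GET /rest/items HTTP/1.1" 200 512'] * 2
    + ['45.5.186.2 - - [28/Jan/2019:08:00:03 +0000] "GET /oai/request HTTP/1.1" 200 512']
)

for ip, hits in top_ips(sample):
    print(hits, ip)
```

<p>In practice the lines would come from an open log file filtered to the hours around the alert, and the function accepts any iterable of strings.</p>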
<h2 id="2019-01-31">2019-01-31</h2>
<ul>
<li>Linode sent alerts about CGSpace (linode18) last night and this morning, here are the top IPs before, during, and after those times:</li>