- A few users noticed that CGSpace wasn't loading items today; item pages seem blank
- I looked at the PostgreSQL locks but they don't seem unusual
- I guess this is the same "blank item page" issue that we had a few times in 2019 that we never solved
- I restarted Tomcat and PostgreSQL and the issue was gone
- Since I was restarting Tomcat anyways I decided to redeploy the latest changes from the `5_x-prod` branch and I added a note about COVID-19 items to the CGSpace frontpage at Peter's request
<!--more-->
- Also, Linode is alerting that we had high outbound traffic rate early this morning around midnight AND high CPU load later in the morning
- They used to be "TurnitinBot"... hhmmmm, seems they use both: https://turnitin.com/robot/crawlerinfo.html
- I will add Turnitin to the DSpace bot user agent list, but I see they are requesting `robots.txt` and only requesting item pages, so that's impressive! I don't need to add them to the "bad bot" rate limit list in nginx
- While looking at the logs I noticed eighty-one IPs in the range 185.152.250.x making a few requests with this user agent:
```
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:76.0) Gecko/20100101 Firefox/76.0
```
- It's only a few hundred requests each, but I am very suspicious so I will record it here and purge their IPs from Solr
- Then I see 185.187.30.14 and 185.187.30.13 making requests also, with several different "normal" user agents
- They are both apparently in France, belonging to Scalair FR hosting
- I will purge their requests from Solr too
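To double-check which requesters actually fall inside the suspicious range before purging, here is a minimal sketch using Python's `ipaddress` module. Treating "185.152.250.x" as the network `185.152.250.0/24` is my assumption, the helper name `count_hits_in_network` is hypothetical, and the sample list is made up for illustration (only 185.187.30.14 comes from the notes above).

```python
import ipaddress
from collections import Counter

# Assumption: "185.152.250.x" means the whole 185.152.250.0/24 network
SUSPICIOUS_NET = ipaddress.ip_network("185.152.250.0/24")


def count_hits_in_network(ips, network=SUSPICIOUS_NET):
    """Count requests per client IP, keeping only IPs inside the network."""
    hits = Counter()
    for ip in ips:
        if ipaddress.ip_address(ip) in network:
            hits[ip] += 1
    return hits


# Hypothetical client IPs, as they might be extracted from an access log
sample_ips = [
    "185.152.250.10",
    "185.152.250.10",
    "185.152.250.77",
    "185.187.30.14",  # outside the /24, so it is ignored here
]
hits = count_hits_in_network(sample_ips)
```

The per-IP counts make it easy to confirm the "few hundred requests each" pattern before deciding what to purge.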
- Now I see some other new bots I hadn't noticed before:
- `Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0) LinkCheck by Siteimprove.com`
- `Consilio (WebHare Platform 4.28.2-dev); LinkChecker)`, which appears to be a [university CMS](https://www.utwente.nl/en/websites/webhare/)
- I will add `LinkCheck`, `Consilio`, and `WebHare` to the list of DSpace bot agents and purge them from Solr stats
- COUNTER-Robots list already has `link.?check` but for some reason DSpace didn't match that and I see hits for some of these...
- Maybe I should add `[Ll]ink.?[Cc]heck.?` to a custom list for now?
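One possible explanation for the missed matches is case sensitivity. The sketch below uses Python's `re` module as a stand-in (DSpace itself uses Java regular expressions, and whether it applies the patterns case-sensitively is an assumption on my part) to compare the upstream `link.?check` pattern with the proposed custom one against the two user agents above.

```python
import re

# User agents copied from the notes above
agents = [
    "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0) LinkCheck by Siteimprove.com",
    "Consilio (WebHare Platform 4.28.2-dev); LinkChecker)",
]

patterns = {
    # Upstream COUNTER-Robots pattern, applied case-sensitively
    "upstream": re.compile(r"link.?check"),
    # Same pattern, but ignoring case
    "upstream_ignorecase": re.compile(r"link.?check", re.IGNORECASE),
    # Proposed custom pattern with explicit upper/lower case alternatives
    "custom": re.compile(r"[Ll]ink.?[Cc]heck.?"),
}

# For each pattern, record whether it matches each agent (in list order)
results = {
    name: [bool(pattern.search(agent)) for agent in agents]
    for name, pattern in patterns.items()
}

for name, matched in results.items():
    print(name, matched)
```

If the matching really is case-sensitive, the lowercase upstream pattern misses both agents while the custom one catches them, which would be consistent with the stray hits.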
- For now I added `Turnitin` to the [new bots pull request on COUNTER-Robots](https://github.com/atmire/COUNTER-Robots/pull/34)
- I purged 20,000 hits from IPs and 45,000 hits from user agents
- I will revert the default "example" agents file back to the upstream master branch of COUNTER-Robots, and then add all my custom ones that are pending in pull requests they haven't merged yet:
[java] java.lang.RuntimeException: Failed to startup the DSpace Service Manager: failure starting up spring service manager: Error creating bean with name 'DefaultStorageUpdateConfig': Injection of autowired dependencies failed; nested exception is org.springframework.beans.factory.BeanCreationException: Could not autowire method: public void com.atmire.statistics.util.StorageReportsUpdater.setStorageReportServices(java.util.List); nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'cuaEPersonStorageReportService': Injection of autowired dependencies failed; nested exception is org.springframework.beans.factory.BeanCreationException: Could not autowire field: private com.atmire.dspace.cua.dao.storage.CUAEPersonStorageReportDAO com.atmire.dspace.cua.CUAStorageReportServiceImpl$CUAEPersonStorageReportServiceImpl.CUAEPersonStorageReportDAO; nested exception is org.springframework.beans.factory.NoUniqueBeanDefinitionException: No qualifying bean of type [com.atmire.dspace.cua.dao.storage.CUAEPersonStorageReportDAO] is defined: expected single matching bean but found 2: com.atmire.dspace.cua.dao.impl.CUAStorageReportDAOImpl$CUAEPersonStorageReportDAOImpl#0,com.atmire.dspace.cua.dao.impl.CUAStorageReportDAOImpl$CUAEPersonStorageReportDAOImpl#1
- Atmire says they are able to build fine, so I tried again and noticed that I had been building with `-Denv=dspacetest.cgiar.org`, which is not necessary for DSpace 6 of course
- Once I removed that it builds fine
- I quickly re-applied the Font Awesome 5 changes to use SVG+JS instead of web fonts (from 2020-04) and things are looking good!
- Run all system updates on DSpace Test (linode26), deploy latest `6_x-dev-atmire-modules` branch, and reboot it
## 2020-07-02
- I need to export some Solr statistics data from CGSpace to test Salem's modifications to the dspace-statistics-api
- He modified it to query Solr on the fly instead of indexing it, which will be heavier and slower, but allows us to get more granular stats and countries/cities
- Because we have so many records I want to use solr-import-export-json to get several months at a time with a date range, but it seems there are some issues with curl first (need to disable globbing with `-g` and URL encode the range)
- For reference, the [Solr 4.10.x DateField docs](https://lucene.apache.org/solr/4_10_2/solr-core/org/apache/solr/schema/DateField.html)
- This range works in Solr UI: `[2019-01-01T00:00:00Z TO 2019-06-30T23:59:59Z]`
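As a rough illustration of the URL-encoding step, the range above can be percent-encoded with Python's standard library before pasting it into a curl command. The field name `time` and the helper `solr_time_filter` are assumptions for this sketch, not something taken from the tool's documentation.

```python
from urllib.parse import quote


def solr_time_filter(start, end, field="time"):
    """Return a percent-encoded Solr range filter such as time:[start TO end].

    The default field name "time" is an assumption for this sketch.
    """
    raw = f"{field}:[{start} TO {end}]"
    # safe="" also encodes ":" so only unreserved characters stay literal
    return quote(raw, safe="")


encoded = solr_time_filter("2019-01-01T00:00:00Z", "2019-06-30T23:59:59Z")
print(encoded)
```

Encoding the brackets avoids curl interpreting them as a glob range, though `-g` is still the safer option when the range is passed unencoded.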
- Import twelve items into the [CRP Livestock multimedia](https://hdl.handle.net/10568/97076) collection for Peter Ballantyne
- I ran the data through csv-metadata-quality first to validate and fix some common mistakes
- Interesting to check the data with `csvstat` to see if there are any duplicates
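A duplicate check like this can also be sketched in plain Python, for cases where `csvstat` is not at hand. The column names and sample rows below are hypothetical, and `find_duplicates` is a helper made up for this note.

```python
import csv
import io
from collections import Counter


def find_duplicates(csv_text, column):
    """Return values that occur more than once in the given column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(row[column].strip() for row in reader)
    return {value: n for value, n in counts.items() if n > 1}


# Hypothetical metadata rows, not the real import data
sample = (
    "dc.title,dc.type\n"
    "Video A,Video\n"
    "Video B,Video\n"
    "Video A,Video\n"
)
dupes = find_duplicates(sample, "dc.title")
print(dupes)
```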
- Peter recently asked me to add Target audience (`cg.targetaudience`) to the CGSpace sidebar facets and AReS filters
- I added it on my local DSpace test instance, but I'm waiting for him to tell me what he wants the header to be ("Audiences", "Target audience", etc.)
- Peter also asked me to increase the size of links in the CGSpace "Welcome" text
- I suggested using the CSS `font-size: larger` property to just bump it up one relative to what it already is
- He said it looks good, but that the links now seem OK (I told him to refresh, as I had made them bold a few days ago), so we don't need to adjust it after all
- Mohammed Salem modified my [dspace-statistics-api](https://github.com/ilri/dspace-statistics-api) to query Solr directly so I started writing a script to benchmark it today
- I will monitor the JVM memory and CPU usage in visualvm, just like I did in 2019-04
- I noticed an issue with his limit parameter so I sent him some feedback on that in the meantime