cgspace-notes/2017-08.md at 8383cd466bd9e475c36e9fec05f6ddba99427b65

mirror of https://github.com/alanorth/cgspace-notes.git synced 2024-11-09 16:45:45 +01:00

They moved to Lyrasis in 2019.

2020-04-13 15:30:34 +03:00

28 KiB

Raw Blame History

+++ date = "2017-08-01T11:51:52+03:00" author = "Alan Orth" title = "August, 2017" tags = ["Notes"]

+++

2017-08-01

Linode sent an alert that CGSpace (linode18) was using 350% CPU for the past two hours
I looked in the Activity pane of the Admin Control Panel and it seems that Google, Baidu, Yahoo, and Bing are all crawling with massive numbers of bots concurrently (~100 total, mostly Baidu and Google)
The good thing is that, according to dspace.log.2017-08-01, they are all using the same Tomcat session
This means our Tomcat Crawler Session Valve is working
But many of the bots are browsing dynamic URLs like:
- /handle/10568/3353/discover
- /handle/10568/16510/browse
The robots.txt only blocks the top-level /discover and /browse URLs... we will need to find a way to forbid them from accessing these!
Relevant issue from DSpace Jira (semi resolved in DSpace 6.0): https://jira.duraspace.org/browse/DS-2962
It turns out that we're already adding the X-Robots-Tag "none" HTTP header, but this only forbids the search engine from indexing the page, not crawling it!
Also, the bot has to successfully browse the page first so it can receive the HTTP header...
We might actually have to block these requests with HTTP 403 depending on the user agent
Abenet pointed out that the CGIAR Library Historical Archive collection I sent July 20th only had ~100 entries, instead of 2415
This was due to newline characters in the dc.description.abstract column, which caused OpenRefine to choke when exporting the CSV
I exported a new CSV from the collection on DSpace Test and then manually removed the characters in vim using g/^$/d
Then I cleaned up the author authorities and HTML characters in OpenRefine and sent the file back to Abenet

2017-08-02

Magdalena from CCAFS asked if there was a way to get the top ten items published in 2016 (note: not the top items in 2016!)
I think Atmire's Content and Usage Analysis module should be able to do this but I will have to look at the configuration and maybe email Atmire if I can't figure it out
I had a look at the moduel configuration and couldn't figure out a way to do this, so I opened a ticket on the Atmire tracker
Atmire responded about the missing workflow statistics issue a few weeks ago but I didn't see it for some reason
They said they added a publication and saw the workflow stat for the user, so I should try again and let them know

2017-08-05

Usman from CIFOR emailed to ask about the status of our OAI tests for harvesting their DSpace repository
I told him that the OAI appears to not be harvesting properly after the first sync, and that the control panel shows an "Internal error" for that collection:

I don't see anything related in our logs, so I asked him to check for our server's IP in their logs
Also, in the mean time I stopped the harvesting process, reset the status, and restarted the process via the Admin control panel (note: I didn't reset the collection, just the harvester status!)

2017-08-07

Apply Abenet's corrections for the CGIAR Library's Consortium subcommunity (697 records)
I had to fix a few small things, like moving the dc.title column away from the beginning of the row, delete blank spaces in the abstract in vim using :g/^$/d, add the dc.subject[en_US] column back, as she had deleted it and DSpace didn't detect the changes made there (we needed to blank the values instead)

2017-08-08

Apply Abenet's corrections for the CGIAR Library's historic archive subcommunity (2415 records)
I had to add the dc.subject[en_US] column back with blank values so that DSpace could detect the changes
I applied the changes in 500 item batches

2017-08-09

Run system updates on DSpace Test and reboot server
Help ICARDA upgrade their MELSpace to DSpace 5.7 using the docker-dspace container
- We had to import the PostgreSQL dump to the PostgreSQL container using: pg_restore -U postgres -d dspace blah.dump
- Otherwise, when using -O it messes up the permissions on the schema and DSpace can't read it

2017-08-10

Apply last updates to the CGIAR Library's Fund community (812 items)
Had to do some quality checks and column renames before importing, as either Sisay or Abenet renamed a few columns and the metadata importer wanted to remove/add new metadata for title, abstract, etc.
Also I applied the HTML entities unescape transform on the abstract column in Open Refine
I need to get an author list from the database for only the CGIAR Library community to send to Peter
It turns out that I had already used this SQL query in May, 2017 to get the authors from CGIAR Library:

dspace#= \copy (select distinct text_value, count(*) from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 AND resource_id IN (select item_id from collection2item where collection_id IN (select resource_id from handle where handle in ('10568/93761', '10947/1', '10947/10', '10947/11', '10947/12', '10947/13', '10947/14', '10947/15', '10947/16', '10947/17', '10947/18', '10947/19', '10947/2', '10947/20', '10947/21', '10947/22', '10947/23', '10947/24', '10947/25', '10947/2512', '10947/2515', '10947/2516', '10947/2517', '10947/2518', '10947/2519', '10947/2520', '10947/2521', '10947/2522', '10947/2523', '10947/2524', '10947/2525', '10947/2526', '10947/2527', '10947/2528', '10947/2529', '10947/2530', '10947/2531', '10947/2532', '10947/2533', '10947/2534', '10947/2535', '10947/2536', '10947/2537', '10947/2538', '10947/2539', '10947/2540', '10947/2541', '10947/2589', '10947/26', '10947/2631', '10947/27', '10947/2708', '10947/2776', '10947/2782', '10947/2784', '10947/2786', '10947/2790', '10947/28', '10947/2805', '10947/2836', '10947/2871', '10947/2878', '10947/29', '10947/2900', '10947/2919', '10947/3', '10947/30', '10947/31', '10947/32', '10947/33', '10947/34', '10947/3457', '10947/35', '10947/36', '10947/37', '10947/38', '10947/39', '10947/4', '10947/40', '10947/4052', '10947/4054', '10947/4056', '10947/4068', '10947/41', '10947/42', '10947/43', '10947/4368', '10947/44', '10947/4467', '10947/45', '10947/4508', '10947/4509', '10947/4510', '10947/4573', '10947/46', '10947/4635', '10947/4636', '10947/4637', '10947/4638', '10947/4639', '10947/4651', '10947/4657', '10947/47', '10947/48', '10947/49', '10947/5', '10947/50', '10947/51', '10947/5308', '10947/5322', '10947/5324', '10947/5326', '10947/6', '10947/7', '10947/8', '10947/9'))) group by text_value order by count desc) to /tmp/cgiar-library-authors.csv with csv;

Meeting with Peter and CGSpace team
- Alan to follow up with ICARDA about depositing in CGSpace, we want ICARD and Drylands legacy content but not duplicates
- Alan to follow up on dc.rights, where are we?
- Alan to follow up with Atmire about a dedicated field for ORCIDs, based on the discussion in the June, 2017 DCAT meeting
- Alan to ask about how to query external services like AGROVOC in the DSpace submission form
Follow up with Atmire on the ticket about ORCID metadata in DSpace
Follow up with Lili and Andrea about the pending CCAFS metadata and flagship updates

2017-08-11

CGSpace had load issues and was throwing errors related to PostgreSQL
I told Tsega to reduce the max connections from 70 to 40 because actually each web application gets that limit and so for xmlui, oai, jspui, rest, etc it could be 70 x 4 = 280 connections depending on the load, and the PostgreSQL config itself is only 100!
I learned this on a recent discussion on the DSpace wiki
I need to either look into setting up a database pool through JNDI or increase the PostgreSQL max connections
Also, I need to find out where the load is coming from (rest?) and possibly block bots from accessing dynamic pages like Browse and Discover instead of just sending an X-Robots-Tag HTTP header
I noticed that Google has bitstreams from the rest interface in the search index. I need to ask on the dspace-tech mailing list to see what other people are doing about this, and maybe start issuing an X-Robots-Tag: none there!

2017-08-12

I sent a message to the mailing list about the duplicate content issue with /rest and /bitstream URLs
Looking at the logs for the REST API on /rest, it looks like there is someone hammering doing testing or something on it...

# awk '{print $1}' /var/log/nginx/rest.log.1 | sort -n | uniq -c | sort -h | tail -n 5
    140 66.249.66.91
    404 66.249.66.90
   1479 50.116.102.77
   9794 45.5.184.196
  85736 70.32.83.92

The top offender is 70.32.83.92 which is actually the same IP as ccafs.cgiar.org, so I will email the Macaroni Bros to see if they can test on DSpace Test instead
I've enabled logging of /oai requests on nginx as well so we can potentially determine bad actors here (also to see if anyone is actually using OAI!)

    # log oai requests
    location /oai {
        access_log /var/log/nginx/oai.log;
        proxy_pass http://tomcat_http;
    }

2017-08-13

Macaroni Bros say that CCAFS wants them to check once every hour for changes
I told them to check every four or six hours

2017-08-14

Run author corrections on CGIAR Library community from Peter

$ ./fix-metadata-values.py -i /tmp/authors-fix-523.csv -f dc.contributor.author -t correct -m 3 -d dspace -u dspace -p fuuuu

There were only three deletions so I just did them manually:

dspace=# delete from metadatavalue where resource_type_id=2 and metadata_field_id=3 and text_value='C';
DELETE 1
dspace=# delete from metadatavalue where resource_type_id=2 and metadata_field_id=3 and text_value='WSSD';

Generate a new list of authors from the CGIAR Library community for Peter to look through now that the initial corrections have been done
Thinking about resource limits for PostgreSQL again after last week's CGSpace crash and related to a recently discussion I had in the comments of the April, 2017 DCAT meeting notes
In that thread Chris Wilper suggests a new default of 35 max connections for db.maxconnections (from the current default of 30), knowing that each DSpace web application gets to use up to this many on its own
It would be good to approximate what the theoretical maximum number of connections on a busy server would be, perhaps by looking to see which apps use SQL:

$ grep -rsI SQLException dspace-jspui | wc -l          
473
$ grep -rsI SQLException dspace-oai | wc -l  
63
$ grep -rsI SQLException dspace-rest | wc -l
139
$ grep -rsI SQLException dspace-solr | wc -l                                                                               
0
$ grep -rsI SQLException dspace-xmlui | wc -l
866

Of those five applications we're running, only solr appears not to use the database directly
And JSPUI is only used internally (so it doesn't really count), leaving us with OAI, REST, and XMLUI
Assuming each takes a theoretical maximum of 35 connections during a heavy load (35 * 3 = 105), that would put the connections well above PostgreSQL's default max of 100 connections (remember a handful of connections are reserved for the PostgreSQL super user, see superuser_reserved_connections)
So we should adjust PostgreSQL's max connections to be DSpace's db.maxconnections * 3 + 3
This would allow each application to use up to db.maxconnections and not to go over the system's PostgreSQL limit
Perhaps since CGSpace is a busy site with lots of resources we could actually use something like 40 for db.maxconnections
Also worth looking into is to set up a database pool using JNDI, as apparently DSpace's db.poolname hasn't been used since around DSpace 1.7 (according to Chris Wilper's comments in the thread)
Need to go check the PostgreSQL connection stats in Munin on CGSpace from the past week to get an idea if 40 is appropriate
Looks like connections hover around 50:

Unfortunately I don't have the breakdown of which DSpace apps are making those connections (I'll assume XMLUI)
So I guess a limit of 30 (DSpace default) is too low, but 70 causes problems when the load increases and the system's PostgreSQL max_connections is too low
For now I think maybe setting DSpace's db.maxconnections to 40 and adjusting the system's max_connections might be a good starting point: 40 * 3 + 3 = 123
Apply 223 more author corrections from Peter on CGIAR Library
Help Magdalena from CCAFS with some CUA statistics questions

2017-08-15

Increase the nginx upload limit on CGSpace (linode18) so Sisay can upload 23 CIAT reports
Do some last minute cleanups and de-duplications of the CGIAR Library data, as I need to send it to Peter this week
Metadata fields like dc.contributor.author, dc.publisher, dc.type, and a few others had somehow been duplicated along the line
Also, a few dozen dc.description.abstract fields still had various HTML tags and entities in them
Also, a bunch of dc.subject fields that were not AGROVOC had not been moved properly to cg.system.subject

2017-08-16

I wanted to merge the various field variations like cg.subject.system and cg.subject.system[en_US] in OpenRefine but I realized it would be easier in PostgreSQL:

dspace=# select distinct text_value, text_lang from metadatavalue where resource_type_id=2 and metadata_field_id=254;

And actually, we can do it for other generic fields for items in those collections, for example dc.description.abstract:

dspace=# update metadatavalue set text_lang='en_US' where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'description' and qualifier = 'abstract') AND resource_type_id = 2 AND resource_id IN (select item_id from collection2item where collection_id IN (select resource_id from handle where handle in ('10568/93761', '10947/1', '10947/10', '10947/11', '10947/12', '10947/13', '10947/14', '10947/15', '10947/16', '10947/17', '10947/18', '10947/19', '10947/2', '10947/20', '10947/21', '10947/22', '10947/23', '10947/24', '10947/25', '10947/2512', '10947/2515', '10947/2516', '10947/2517', '10947/2518', '10947/2519', '10947/2520', '10947/2521', '10947/2522', '10947/2523', '10947/2524', '10947/2525', '10947/2526', '10947/2527', '10947/2528', '10947/2529', '10947/2530', '10947/2531', '10947/2532', '10947/2533', '10947/2534', '10947/2535', '10947/2536', '10947/2537', '10947/2538', '10947/2539', '10947/2540', '10947/2541', '10947/2589', '10947/26', '10947/2631', '10947/27', '10947/2708', '10947/2776', '10947/2782', '10947/2784', '10947/2786', '10947/2790', '10947/28', '10947/2805', '10947/2836', '10947/2871', '10947/2878', '10947/29', '10947/2900', '10947/2919', '10947/3', '10947/30', '10947/31', '10947/32', '10947/33', '10947/34', '10947/3457', '10947/35', '10947/36', '10947/37', '10947/38', '10947/39', '10947/4', '10947/40', '10947/4052', '10947/4054', '10947/4056', '10947/4068', '10947/41', '10947/42', '10947/43', '10947/4368', '10947/44', '10947/4467', '10947/45', '10947/4508', '10947/4509', '10947/4510', '10947/4573', '10947/46', '10947/4635', '10947/4636', '10947/4637', '10947/4638', '10947/4639', '10947/4651', '10947/4657', '10947/47', '10947/48', '10947/49', '10947/5', '10947/50', '10947/51', '10947/5308', '10947/5322', '10947/5324', '10947/5326', '10947/6', '10947/7', '10947/8', '10947/9')))

And on others like dc.language.iso, dc.relation.ispartofseries, dc.type, dc.title, etc...
Also, to move fields from dc.identifier.url to cg.identifier.url[en_US] (because we don't use the Dublin Core one for some reason):

dspace=# update metadatavalue set metadata_field_id = 219, text_lang = 'en_US' where resource_type_id = 2 AND metadata_field_id = 237;
UPDATE 15

Set the text_lang of all dc.identifier.uri (Handle) fields to be NULL, just like default DSpace does:

dspace=# update metadatavalue set text_lang=NULL where resource_type_id = 2 and metadata_field_id = 25 and text_value like 'http://hdl.handle.net/10947/%';
UPDATE 4248

Also update the text_lang of dc.contributor.author fields for metadata in these collections:

dspace=# update metadatavalue set text_lang=NULL where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 AND resource_id IN (select item_id from collection2item where collection_id IN (select resource_id from handle where handle in ('10568/93761', '10947/1', '10947/10', '10947/11', '10947/12', '10947/13', '10947/14', '10947/15', '10947/16', '10947/17', '10947/18', '10947/19', '10947/2', '10947/20', '10947/21', '10947/22', '10947/23', '10947/24', '10947/25', '10947/2512', '10947/2515', '10947/2516', '10947/2517', '10947/2518', '10947/2519', '10947/2520', '10947/2521', '10947/2522', '10947/2523', '10947/2524', '10947/2525', '10947/2526', '10947/2527', '10947/2528', '10947/2529', '10947/2530', '10947/2531', '10947/2532', '10947/2533', '10947/2534', '10947/2535', '10947/2536', '10947/2537', '10947/2538', '10947/2539', '10947/2540', '10947/2541', '10947/2589', '10947/26', '10947/2631', '10947/27', '10947/2708', '10947/2776', '10947/2782', '10947/2784', '10947/2786', '10947/2790', '10947/28', '10947/2805', '10947/2836', '10947/2871', '10947/2878', '10947/29', '10947/2900', '10947/2919', '10947/3', '10947/30', '10947/31', '10947/32', '10947/33', '10947/34', '10947/3457', '10947/35', '10947/36', '10947/37', '10947/38', '10947/39', '10947/4', '10947/40', '10947/4052', '10947/4054', '10947/4056', '10947/4068', '10947/41', '10947/42', '10947/43', '10947/4368', '10947/44', '10947/4467', '10947/45', '10947/4508', '10947/4509', '10947/4510', '10947/4573', '10947/46', '10947/4635', '10947/4636', '10947/4637', '10947/4638', '10947/4639', '10947/4651', '10947/4657', '10947/47', '10947/48', '10947/49', '10947/5', '10947/50', '10947/51', '10947/5308', '10947/5322', '10947/5324', '10947/5326', '10947/6', '10947/7', '10947/8', '10947/9')));
UPDATE 4899

Wow, I just wrote this baller regex facet to find duplicate authors:

isNotNull(value.match(/(CGIAR .+?)\|\|\1/))

This would be true if the authors were like CGIAR System Management Office||CGIAR System Management Office, which some of the CGIAR Library's were
Unfortunately when you fix these in OpenRefine and then submit the metadata to DSpace it doesn't detect any changes, so you have to edit them all manually via DSpace's "Edit Item"
Ooh! And an even more interesting regex would match any duplicated author:

isNotNull(value.match(/(.+?)\|\|\1/))

Which means it can also be used to find items with duplicate dc.subject fields...
Finally sent Peter the final dump of the CGIAR System Organization community so he can have a last look at it
Post a message to the dspace-tech mailing list to ask about querying the AGROVOC API from the submission form
Abenet was asking if there was some way to hide certain internal items from the "ILRI Research Outputs" RSS feed (which is the top-level ILRI community feed), because Shirley was complaining
I think we could use harvest.includerestricted.rss = false but the items might need to be 100% restricted, not just the metadata
Adjust Ansible postgres role to use max_connections from a template variable and deploy a new limit of 123 on CGSpace

2017-08-17

Run Peter's edits to the CGIAR System Organization community on DSpace Test
Uptime Robot said CGSpace went down for 1 minute, not sure why
Looking in dspace.log.2017-08-17 I see some weird errors that might be related?

2017-08-17 07:55:31,396 ERROR net.sf.ehcache.store.DiskStore @ cocoon-ehcacheCache: Could not read disk store element for key PK_G-aspect-cocoon://DRI/12/handle/10568/65885?pipelinehash=823411183535858997_T-Navigation-3368194896954203241. Error was invalid stream header: 00000000
java.io.StreamCorruptedException: invalid stream header: 00000000

Weird that these errors seem to have started on August 11th, the same day we had capacity issues with PostgreSQL:

# grep -c "ERROR net.sf.ehcache.store.DiskStore" dspace.log.2017-08-*
dspace.log.2017-08-01:0
dspace.log.2017-08-02:0
dspace.log.2017-08-03:0
dspace.log.2017-08-04:0
dspace.log.2017-08-05:0
dspace.log.2017-08-06:0
dspace.log.2017-08-07:0
dspace.log.2017-08-08:0
dspace.log.2017-08-09:0
dspace.log.2017-08-10:0
dspace.log.2017-08-11:8806
dspace.log.2017-08-12:5496
dspace.log.2017-08-13:2925
dspace.log.2017-08-14:2135
dspace.log.2017-08-15:1506
dspace.log.2017-08-16:1935
dspace.log.2017-08-17:584

There are none in 2017-07 either...
A few posts on the dspace-tech mailing list say this is related to the Cocoon cache somehow
I will clear the XMLUI cache for now and see if the errors continue (though perpaps shutting down Tomcat and removing the cache is more effective somehow?)
We tested the option for limiting restricted items from the RSS feeds on DSpace Test
I created four items, and only the two with public metadata showed up in the community's RSS feed:
- Public metadata, public bitstream ✓
- Public metadata, restricted bitstream ✓
- Restricted metadata, restricted bitstream ✗
- Private item ✗
Peter responded and said that he doesn't want to limit items to be restricted just so we can change the RSS feeds

2017-08-18

Someone on the dspace-tech mailing list responded with some tips about using the authority framework to do external queries from the submission form
He linked to some examples from DSpace-CRIS that use this functionality: VIAFAuthority
I wired it up to the dc.subject field of the submission interface using the "lookup" type and it works!
I think we can use this example to get a working AGROVOC query
More information about authority framework: https://wiki.lyrasis.org/display/DSPACE/Authority+Control+of+Metadata+Values
Wow, I'm playing with the AGROVOC SPARQL endpoint using the sparql-query tool:

$ ./sparql-query http://202.45.139.84:10035/catalogs/fao/repositories/agrovoc
sparql$ PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT 
    ?label 
WHERE {  
   {  ?concept  skos:altLabel ?label . } UNION {  ?concept  skos:prefLabel ?label . }
   FILTER regex(str(?label), "^fish", "i") .
} LIMIT 10;

┌───────────────────────┐                                                      
│ ?label                │                                                      
├───────────────────────┤                                                      
│ fisheries legislation │                                                      
│ fishery legislation   │                                                      
│ fishery law           │                                                      
│ fish production       │                                                      
│ fish farming          │                                                      
│ fishing industry      │                                                      
│ fisheries data        │                                                      
│ fishing power         │                                                      
│ fishing times         │                                                      
│ fish passes           │                                                      
└───────────────────────┘

More examples about SPARQL syntax: https://github.com/rsinger/openlcsh/wiki/Sparql-Examples
I found this blog post about speeding up the Tomcat startup time: http://skybert.net/java/improve-tomcat-startup-time/
The startup time went from ~80s to 40s!

2017-08-19

More examples of SPARQL queries: https://github.com/rsinger/openlcsh/wiki/Sparql-Examples
Specifically the explanation of the FILTER regex
Might want to SELECT DISTINCT or increase the LIMIT to get terms like "wheat" and "fish" to be visible
Test queries online on the AGROVOC SPARQL portal: http://202.45.139.84:10035/catalogs/fao/repositories/agrovoc

2017-08-20

Since I cleared the XMLUI cache on 2017-08-17 there haven't been any more ERROR net.sf.ehcache.store.DiskStore errors
Look at the CGIAR Library to see if I can find the items that have been submitted since May:

dspace=# select * from metadatavalue where metadata_field_id=11 and date(text_value) > '2017-05-01T00:00:00Z';
 metadata_value_id | item_id | metadata_field_id |      text_value      | text_lang | place | authority | confidence 
-------------------+---------+-------------------+----------------------+-----------+-------+-----------+------------
            123117 |    5872 |                11 | 2017-06-28T13:05:18Z |           |     1 |           |         -1
            123042 |    5869 |                11 | 2017-05-15T03:29:23Z |           |     1 |           |         -1
            123056 |    5870 |                11 | 2017-05-22T11:27:15Z |           |     1 |           |         -1
            123072 |    5871 |                11 | 2017-06-06T07:46:01Z |           |     1 |           |         -1
            123171 |    5874 |                11 | 2017-08-04T07:51:20Z |           |     1 |           |         -1
(5 rows)

According to dc.date.accessioned (metadata field id 11) there have only been five items submitted since May
These are their handles:

dspace=# select handle from item, handle where handle.resource_id = item.item_id AND item.item_id in (select item_id from metadatavalue where metadata_field_id=11 and date(text_value) > '2017-05-01T00:00:00Z');
   handle   
------------
 10947/4658
 10947/4659
 10947/4660
 10947/4661
 10947/4664
(5 rows)

2017-08-23

Start testing the nginx configs for the CGIAR Library migration as well as start making a checklist

2017-08-28

Bram had written to me two weeks ago to set up a chat about ORCID stuff but the email apparently bounced and I only found out when he emaiiled me on another account
I told him I can chat in a few weeks when I'm back

2017-08-31

I notice that in many WLE collections Marianne Gadeberg is in the edit or approval steps, but she is also in the groups for those steps.
I think we need to have a process to go back and check / fix some of these scenarios—to remove her user from the step and instead add her to the group—because we have way too many authorizations and in late 2016 we had performance issues with Solr because of this
I asked Sisay about this and hinted that he should go back and fix these things, but let's see what he says
Saw CGSpace go down briefly today and noticed SQL connection pool errors in the dspace log file:

ERROR org.dspace.storage.rdbms.DatabaseManager @ SQL connection Error
org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object

Looking at the logs I see we have been having hundreds or thousands of these errors a few times per week in 2017-07 and almost every day in 2017-08
It seems that I changed the db.maxconnections setting from 70 to 40 around 2017-08-14, but Macaroni Bros also reduced their hourly hammering of the REST API then
Nevertheless, it seems like a connection limit is not enough and that I should increase it (as well as the system's PostgreSQL max_connections)

28 KiB Raw Blame History