Update notes for 2017-08-14

This commit is contained in:
2017-08-14 14:41:36 +03:00
parent 5fcf4f536d
commit a4e2af0b48
4 changed files with 68 additions and 1 deletions

View File

@ -138,3 +138,35 @@ dspace=# delete from metadatavalue where resource_type_id=2 and metadata_field_i
```
- Generate a new list of authors from the CGIAR Library community for Peter to look through now that the initial corrections have been done
- Thinking about resource limits for PostgreSQL again after last week's CGSpace crash and related to a recently discussion I had in the comments of the [April, 2017 DCAT meeting notes](https://wiki.duraspace.org/display/cmtygp/DCAT+Meeting+April+2017)
- In that thread Chris Wilper suggests a new default of 35 max connections for `db.maxconnections` (from the current default of 30), knowing that _each DSpace web application_ gets to use up to this many on its own
- It would be good to approximate what the theoretical maximum number of connections on a busy server would be, perhaps by looking to see which apps use SQL:
```
$ grep -rsI SQLException dspace-jspui | wc -l
473
$ grep -rsI SQLException dspace-oai | wc -l
63
$ grep -rsI SQLException dspace-rest | wc -l
139
$ grep -rsI SQLException dspace-solr | wc -l
0
$ grep -rsI SQLException dspace-xmlui | wc -l
866
```
- Of those five applications we're running, only `solr` appears not to use the database directly
- And JSPUI is only used internally (so it doesn't really count), leaving us with OAI, REST, and XMLUI
- Assuming each takes a theoretical maximum of 35 connections during a heavy load (35 * 3 = 105), that would put the connections well above PostgreSQL's default max of 100 connections (remember a handful of connections are reserved for the PostgreSQL super user, see `superuser_reserved_connections`)
- So we should adjust PostgreSQL's max connections to be DSpace's `db.maxconnections` * 3 + 3
- This would allow each application to use up to `db.maxconnections` and not to go over the system's PostgreSQL limit
- Perhaps since CGSpace is a busy site with lots of resources we could actually use something like 40 for `db.maxconnections`
- Also worth looking into is to set up a database pool using JNDI, as apparently DSpace's `db.poolname` hasn't been used since around DSpace 1.7 (according to Chris Wilper's comments in the thread)
- Need to go check the PostgreSQL connection stats in Munin on CGSpace from the past week to get an idea if 40 is appropriate
- Looks like connections hover around 50:
![PostgreSQL connections 2017-08](/cgspace-notes/2017/08/postgresql-connections-cgspace.png)
- Unfortunately I don't have the breakdown of which DSpace apps are making those connections (I'll assume XMLUI)
- So I guess a limit of 30 (DSpace default) is too low, but 70 causes problems when the load increases and the system's PostgreSQL `max_connections` is too low
- For now I think maybe setting DSpace's `db.maxconnections` to 40 and adjusting the system's `max_connections` might be a good starting point: 40 * 3 + 4 = 123