- Last month I enabled `log_lock_waits` in PostgreSQL, so I checked the log and was surprised to find only a few lock waits since I restarted PostgreSQL three days ago:
- I think you could analyze the locks for the `dspaceWeb` user (XMLUI) and find out what queries were locking... but it's so much information and I don't know where to start
- For now I just restarted PostgreSQL...
- Francesca was able to do her submission immediately...
- On a related note, I want to enable the `pg_stat_statements` feature to see which queries get run the most, so I created the extension on the CGSpace database
- I was doing some research on PostgreSQL locks and found some interesting things to consider
- The default `lock_timeout` is 0, aka disabled
- The default `statement_timeout` is 0, aka disabled
- The recommendation seems to be to set `statement_timeout` first, with a rule of thumb of [ten times longer than your longest query](https://github.com/jberkus/annotated.conf/blob/master/postgresql.10.simple.conf#L211)
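- A minimal sketch of that rule of thumb, assuming I have a list of durations (in milliseconds) for the slowest legitimate queries; the function name and the sample durations are hypothetical, not measurements from CGSpace:

```python
def suggest_statement_timeout_ms(query_durations_ms, factor=10):
    """Return a candidate statement_timeout in milliseconds.

    Applies the "ten times longer than your longest query" rule of thumb.
    """
    if not query_durations_ms:
        raise ValueError("need at least one observed query duration")
    return int(max(query_durations_ms) * factor)


# Hypothetical durations (ms), for illustration only
observed = [120.5, 843.0, 2300.0]
timeout_ms = suggest_statement_timeout_ms(observed)

# PostgreSQL reads a bare number for statement_timeout as milliseconds
print(f"statement_timeout = {timeout_ms}")
```

- The resulting value would still need a sanity check against long-running maintenance jobs before going into `postgresql.conf`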
- Mark Wood mentioned the `checker` cron job that apparently runs in one transaction and might be an issue
- I definitely saw it holding a bunch of locks for ~30 minutes during the first part of its execution, then it dropped them and did some other less-intensive things without locks
- Bizuwork was still not receiving emails even after we fixed the SMTP access on CGSpace
- After some troubleshooting it turns out that the emails from CGSpace were going to her Junk folder!
## 2021-12-03
- I see GARDIAN is now using a "GARDIAN" user agent finally
- I will add them to our local spider agent override in DSpace so that the hits don't get counted in Solr
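- A rough way to sanity check a pattern before adding it to the override, assuming the agent patterns are treated as regular expressions searched against the user agent string; the helper and the sample user agents are hypothetical, and DSpace's real matching rules may differ:

```python
import re

# "GARDIAN" is the agent named above; any other entries would be assumptions
AGENT_PATTERNS = ["GARDIAN"]


def is_spider(user_agent, patterns=AGENT_PATTERNS):
    """Return True if any pattern matches somewhere in the user agent."""
    return any(re.search(pattern, user_agent) for pattern in patterns)


for agent in ["GARDIAN", "Mozilla/5.0 (X11; Linux x86_64)"]:
    print(agent, is_spider(agent))
```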
## 2021-12-05
- Proofed fifty records Abenet sent me from Africa Rice Center ("AfricaRice 1st batch Import")
- Fixed forty-six incorrect collections
- Cleaned up and normalized affiliations
- Cleaned up dates (extra `*` character in all?)
- Cleaned up citation format
- Fixed some encoding issues in abstracts
- Removed empty columns
- Removed one duplicate: Enhancing Rice Productivity and Soil Nitrogen Using Dual-Purpose Cowpea-NERICA® Rice Sequence in Degraded Savanna
- Added volume and issue metadata by extracting it from the citations
- All PDFs hosted on davidpublishing.com are dead...
- All DOIs linking to African Journal of Agricultural Research are dead...
- Fixed a handful of items marked as "Open Access" that are actually closed
- Added many missing ISSNs
- Added many missing countries/regions
- Fixed invalid AGROVOC terms and added some more based on article subjects
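- For the volume and issue extraction mentioned above, a minimal sketch assuming citations use the common `volume(issue)` form such as `12(3)`; the regular expression and the sample citation are hypothetical, and real citations will need manual review:

```python
import re

# Matches e.g. "12(3)" or "12 (3-4)"; the citation format is an assumption
VOLUME_ISSUE = re.compile(r"\b(?P<volume>\d+)\s*\((?P<issue>\d+(?:[-–]\d+)?)\)")


def extract_volume_issue(citation):
    """Return (volume, issue) as strings, or None if nothing matches."""
    match = VOLUME_ISSUE.search(citation)
    if match is None:
        return None
    return match.group("volume"), match.group("issue")


# Made-up citation for illustration only
sample = "Author, A. 2015. Some title. Journal Name 12(3): 45-67."
print(extract_volume_issue(sample))
```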
- I also made some minor changes to the [CSV Metadata Quality Checker](https://github.com/ilri/csv-metadata-quality)
- Added the ability to check if the item's title exists in the citation
- Updated to only run the mojibake check if we're not running in unsafe mode (so we don't print the same warning during both the check and fix steps)
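- The title check can be sketched roughly like this, assuming a case-insensitive, whitespace-normalized substring match; this is an illustration with hypothetical function names, not the checker's actual code:

```python
import re


def normalize(text):
    """Lowercase the text and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", text).strip().lower()


def title_in_citation(title, citation):
    """Return True if the normalized title appears in the normalized citation."""
    return normalize(title) in normalize(citation)


# Made-up values for illustration only
title = "Some  Title About Rice"
citation = "Author, A. 2015. Some title about rice. Journal Name 12(3): 45-67."
print(title_in_citation(title, citation))
```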
- Some minor work on the `check-duplicates.py` script I wrote last month
- I found some corner cases where there were items that matched in the database, but they were `in_archive=f` and/or `withdrawn=t`, so now I check that before trying to resolve the handles of potential duplicates
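- The guard amounts to something like the following sketch, assuming each database match is available as a dict with boolean `in_archive` and `withdrawn` values; the row layout, the `uuid` values, and the function name are hypothetical and not taken from `check-duplicates.py`:

```python
def resolvable_candidates(rows):
    """Keep only matches that are archived and not withdrawn."""
    return [row for row in rows if row["in_archive"] and not row["withdrawn"]]


# Made-up matches for illustration only
matches = [
    {"uuid": "a", "in_archive": True, "withdrawn": False},
    {"uuid": "b", "in_archive": False, "withdrawn": False},
    {"uuid": "c", "in_archive": True, "withdrawn": True},
]

# Only the first match is worth a handle lookup
print([row["uuid"] for row in resolvable_candidates(matches)])
```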
- More work on the Africa Rice Center 1st batch import
- I merged the metadata for three duplicates in Africa Rice's items and mapped them on CGSpace
- I did a bit more work to add missing AGROVOC subjects, countries, regions, extents, etc., and then uploaded the forty-six items to CGSpace
- I started looking at the seventy CAS records that Abenet has been working on for the past few months