mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Add notes for 2023-02-12
@@ -87,4 +87,45 @@ curl -f -H "Content-Type: application/json" -X POST "https://dspacetest.cgiar.or
- Export CGSpace to update Initiative mappings and country/region mappings
- Then start a harvest on AReS

## 2023-02-09

- Do some minor work on the CSS on the DSpace 7 test

## 2023-02-10

- I noticed a large number of PostgreSQL locks from dspaceWeb on CGSpace:

```console
$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | grep -o -E '(dspaceWeb|dspaceApi|dspaceCli)' | sort | uniq -c
2033 dspaceWeb
```

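The pipeline above tallies how many lock rows belong to each DSpace connection pool. As a rough illustration, the same count could be sketched in Python, assuming the `psql` output has already been captured as text. The function name `count_pools` and the sample rows are hypothetical and not part of the original notes:

```python
import re
from collections import Counter

# Pool names taken from the grep pattern in the command above.
POOL_PATTERN = re.compile(r"dspaceWeb|dspaceApi|dspaceCli")


def count_pools(psql_output: str) -> Counter:
    """Count every occurrence of a pool name, like `grep -o ... | sort | uniq -c`."""
    return Counter(POOL_PATTERN.findall(psql_output))


if __name__ == "__main__":
    # Hypothetical, heavily abbreviated rows; real pg_locks output has many more columns.
    sample = (
        "relation | 1234 | dspaceWeb\n"
        "relation | 1235 | dspaceWeb\n"
        "relation | 1236 | dspaceCli\n"
    )
    for name, count in sorted(count_pools(sample).items()):
        print(f"{count:7d} {name}")
```

Like `grep -o`, this counts every match rather than every line, so a row that mentions a pool name twice is counted twice.
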
- Looking at the lock age, I see some that are already one day old, including this curious query:

```console
select nextval ('public.registrationdata_seq')
```

- I killed all locks that were more than a few hours old
- Export CGSpace to update Initiative collection mappings
- Discuss adding `dcterms.available` to the submission form
- I also looked in the `dcterms.description` field on CGSpace and found ~1,500 items where there is an indication of an online published date
- Using some facets in OpenRefine I narrowed down the ones mentioning "online" and then extracted the dates to a new column:

```console
cells['dcterms.description[en_US]'].value.replace(/.*?(\d+{2}) ([a-zA-Z]+) (\d+{2}).*/,"$3-$2-$1")
```

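To illustrate the idea behind the extraction above, here is a minimal Python sketch. It assumes the description contains a date written as day, month name, and four-digit year (for example "26 April 2022"). The function name `extract_online_date` and the sample strings are hypothetical, and the pattern is a simplified stand-in, not the GREL expression itself:

```python
import re
from typing import Optional

# Assumed shape: one or two digit day, a month name, a four digit year.
DATE_PATTERN = re.compile(r"(\d{1,2}) ([a-zA-Z]+) (\d{4})")


def extract_online_date(description: str) -> Optional[str]:
    """Return the first date found as year-month-day, or None if there is none."""
    match = DATE_PATTERN.search(description)
    if match is None:
        return None
    day, month, year = match.groups()
    # Reorder to year-month-day, like the "$3-$2-$1" replacement above.
    return f"{year}-{month}-{day}"


if __name__ == "__main__":
    print(extract_online_date("First published online: 26 April 2022"))
    print(extract_online_date("No date mentioned here"))
```

The month is still a name at this point, so a value such as "2022-April-26" needs a second pass to become a numeric date.
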
- Then to handle formats like "2022-April-26" and "2021-Nov-11" I used some replacement GRELs (note the order so we don't replace short patterns in longer strings prematurely):

```console
value.replace("January","01").replace("February","02").replace("March","03").replace("April","04").replace("May","05").replace("June","06").replace("July","07").replace("August","08").replace("September","09").replace("October","10").replace("November","11").replace("December","12")
value.replace("Jan","01").replace("Feb","02").replace("Mar","03").replace("Apr","04").replace("May","05").replace("Jun","06").replace("Jul","07").replace("Aug","08").replace("Sep","09").replace("Oct","10").replace("Nov","11").replace("Dec","12")
```

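The ordering note above matters because every abbreviation is a prefix of its full name: replacing "Jan" first would turn "January" into "01uary". A minimal Python sketch of the same two-pass replacement follows. The function name `month_names_to_numbers` is hypothetical, and the sketch assumes English month names only:

```python
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def month_names_to_numbers(value: str) -> str:
    """Replace English month names with zero-padded numbers, longest names first."""
    # Pass 1: full names, so "November" is never partially replaced via "Nov".
    for number, name in enumerate(MONTHS, start=1):
        value = value.replace(name, f"{number:02d}")
    # Pass 2: three-letter abbreviations for whatever is left.
    for number, name in enumerate(MONTHS, start=1):
        value = value.replace(name[:3], f"{number:02d}")
    return value


if __name__ == "__main__":
    print(month_names_to_numbers("2022-April-26"))  # 2022-04-26
    print(month_names_to_numbers("2021-Nov-11"))    # 2021-11-11
```

Plain substring replacement is only safe on values that have already been reduced to a date, because it would also rewrite "May" or "Mar" inside ordinary words.
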
- This covered about 1,300 items, then I did about 100 more of the messier ones with some more regex wrangling
- I removed the `dcterms.description[en_US]` field from items where I updated the dates
- Then I added `dcterms.available` to the submission form and the item view
- We need to announce this to the editors

<!-- vim: set sw=2 ts=2: -->