- The upgrade was mostly normal, but I had to unhold the openjdk package in order for `do-release-upgrade` to run:
```console
# apt-mark unhold openjdk-8-jdk-headless:amd64 openjdk-8-jre-headless:amd64
```
- In [2022-11]({{< relref "2022-11.md" >}}) an upstream Java update broke the DSpace 6 Handle server, so we will have to pin the OpenJDK packages again after the upgrade to Ubuntu 22.04
- After the upgrade I made sure CGSpace was working, then proceeded to upgrade PostgreSQL from 12 to 14, like I did on [DSpace Test in 2023-03]({{< relref "2023-03.md" >}})
- Then I had to downgrade OpenJDK to fix the Handle server, using the packages I had previously downloaded for Ubuntu 20.04 because they no longer exist on Launchpad
- Export CGSpace to check and update `dcterms.extent` fields
- I normalized about 1,500 values to use either the "p. 1-6" or the "5 p." format
- Also, I used this GREL expression to extract missing pages from the citation field: `cells['dcterms.bibliographicCitation[en_US]'].value.match(/.*(pp?\.\s?\d+[-–]\d+).*/)[0]`
- This was over 4,000 items with a format like "p. 1-6" and "pp. 1-6" in the citation
- I used another GREL expression to extract another 5,000: `cells['dcterms.bibliographicCitation[en_US]'].value.match(/.*?(\d+\s+?[Pp]+\.).*/)[0]`
- This was for formats like "1 p." (note the lazy `.*?` at the beginning, which protects against a greedy `.*` swallowing the leading digits of the page count)
- I also did some work to capture a handful of missing DOIs and ISSNs, but it was only about 100 items and I will have to wait until the 10,000+ above finish importing
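
To illustrate what the two GREL expressions above are doing, here is a rough Python sketch. It assumes GREL's `match()` behaves like `re.fullmatch()` (the whole string must match, and the capture groups come back as an array). The citation strings and the `first_group` helper are made up for illustration, and Java's regex engine (which OpenRefine uses) may differ from Python's in edge cases.

```python
import re

# Python translations of the two GREL patterns from the notes above
RANGE_RE = re.compile(r".*(pp?\.\s?\d+[-–]\d+).*")   # e.g. "p. 1-6"
COUNT_RE = re.compile(r".*?(\d+\s+?[Pp]+\.).*")      # e.g. "12 p." (lazy prefix)

# Same as COUNT_RE but with a greedy prefix, only to show why the lazy one is needed
GREEDY_COUNT_RE = re.compile(r".*(\d+\s+?[Pp]+\.).*")


def first_group(pattern, citation):
    """Hypothetical helper: mimic GREL's value.match(...)[0].

    Returns the first capture group if the whole string matches, else None.
    """
    m = pattern.fullmatch(citation)
    return m.group(1) if m else None


# Made-up citations, not real CGSpace records
range_citation = "Orth, A. 2023. Some article. Some Journal 12(3): p. 1-6"
count_citation = "Orth, A. 2023. Some report. Nairobi, Kenya: ILRI. 12 p."

print(first_group(RANGE_RE, range_citation))         # p. 1-6
print(first_group(COUNT_RE, count_citation))         # 12 p.
print(first_group(GREEDY_COUNT_RE, count_citation))  # 2 p. (greedy prefix ate the "1")
```

The last line shows the problem with a greedy prefix: `.*` consumes as much as it can, leaving the capture group with only the final digit of the page count. In this sketch the greedy prefix in the first pattern has a similar effect on "pp. 1-6", which comes back as "p. 1-6".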