mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Add notes for 2020-11-22
@ -409,4 +409,54 @@ $ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid =
- Very curious that there was such a high number of rolled back transactions after the update

## 2020-11-22

- PostgreSQL situation on CGSpace (linode18) looks much better now:

- In other news, I noticed that harvesting DSpace 6 works fine in OpenRXV, but the statistics fail on page 1
- I filed an issue: https://github.com/ilri/OpenRXV/issues/59
- Abenet asked for help trying to add a new user to the Bioversity and CIAT groups on CGSpace
- I see that the user search only shows five results per page, so the user in question appears on page 2
- I asked Abenet if she was getting an error or it was simply this...
- Maria Garuccio sent me an example report that she wants to be able to generate from AReS
- First, she would like to have the option to group by output type
- Second, she would like to be able to control the sorting in the template, for example sorting the citations alphabetically
- I filed an issue: https://github.com/ilri/OpenRXV/issues/60
- Mohammad Salem had asked if there was an item ID to UUID mapping for CGSpace
- I found a thread on the dspace-tech mailing list that pointed out that there is a new `uuid` column in the item table
- Only old items have an `item_id` so we can get a mapping easily:

```
dspace=# \COPY (SELECT item_id,uuid FROM item WHERE in_archive='t' AND withdrawn='f' AND item_id IS NOT NULL) TO /tmp/2020-11-22-item-id2uuid.csv WITH CSV HEADER;
COPY 87411
```
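A minimal sketch of how the exported CSV could be used for lookups, assuming the header row is `item_id,uuid` as produced by the `\COPY ... WITH CSV HEADER` command above. The function name `load_id_to_uuid` and the sample rows are hypothetical, made up for illustration only:

```python
import csv
import io


def load_id_to_uuid(csv_file):
    """Return a dict mapping legacy integer item_id -> UUID string."""
    reader = csv.DictReader(csv_file)
    return {int(row["item_id"]): row["uuid"] for row in reader}


# Hypothetical sample rows in the same shape as the \COPY output
# (header line, then item_id,uuid); the values are invented.
sample = io.StringIO(
    "item_id,uuid\n"
    "10,11111111-1111-1111-1111-111111111111\n"
    "20,22222222-2222-2222-2222-222222222222\n"
)
mapping = load_id_to_uuid(sample)
print(mapping[10])
```

With the real export, the same function could be given `open("/tmp/2020-11-22-item-id2uuid.csv", newline="")` instead of the in-memory sample.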
- Saving some notes I wrote down about faceting by community and collection in Solr, for potential use in the future in the DSpace Statistics API
- Facet by owningComm to see total number of distinct communities (136):

```
facet=true&facet.mincount=1&facet.field=owningComm&facet.limit=1&facet.offset=0&stats=true&stats.field=id&stats.calcdistinct=true
```

- Facet by owningComm and get the first 5 distinct:

```
facet=true&facet.mincount=1&facet.field=owningComm&facet.limit=5&facet.offset=0&facet.pivot=id,countryCode
```

- Facet by owningComm and countryCode using facet.pivot and maybe I can just skip the normal facet params?

```
facet=true&f.owningComm.facet.limit=5&f.owningComm.facet.offset=5&facet.pivot=owningComm,countryCode
```

- Facet by owningComm and countryCode using facet.pivot and limiting to top five countries... fuck it's possible!

```
facet=true&f.owningComm.facet.limit=5&f.owningComm.facet.offset=5&f.countryCode.facet.limit=5&facet.pivot=owningComm,countryCode
```
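For potential reuse, the last query string above could be assembled programmatically. This is only a sketch: the helper names `pivot_facet_params` and `flatten_pivot` are hypothetical, and the sample pivot data is invented. It assumes the pivot list has the shape Solr's JSON response returns under `facet_counts` → `facet_pivot`, where each entry carries `field`, `value`, `count` and a nested `pivot` list:

```python
from urllib.parse import urlencode


def pivot_facet_params(pivot_fields, limits=None, offsets=None):
    """Build query parameters for a facet.pivot request with per-field paging."""
    params = [("facet", "true")]
    for field, limit in (limits or {}).items():
        params.append((f"f.{field}.facet.limit", str(limit)))
    for field, offset in (offsets or {}).items():
        params.append((f"f.{field}.facet.offset", str(offset)))
    params.append(("facet.pivot", ",".join(pivot_fields)))
    return params


def flatten_pivot(pivot):
    """Yield (outer value, inner value, count) from a two-level pivot list."""
    for outer in pivot:
        for inner in outer.get("pivot", []):
            yield outer["value"], inner["value"], inner["count"]


params = pivot_facet_params(
    ["owningComm", "countryCode"],
    limits={"owningComm": 5, "countryCode": 5},
    offsets={"owningComm": 5},
)
# safe="," keeps the comma in facet.pivot readable instead of %2C
query = urlencode(params, safe=",")
print(query)

# Invented pivot data, only to show the assumed response shape.
sample_pivot = [
    {
        "field": "owningComm",
        "value": "comm-a",
        "count": 3,
        "pivot": [
            {"field": "countryCode", "value": "KE", "count": 2},
            {"field": "countryCode", "value": "US", "count": 1},
        ],
    }
]
rows = list(flatten_pivot(sample_pivot))
print(rows)
```

The parameters are the same as in the hand-written query, although the order differs slightly because all limits are emitted before the offsets. Solr does not depend on parameter order.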
<!-- vim: set sw=2 ts=2: -->
@ -410,8 +410,14 @@ $ curl -s "http://localhost:8081/solr/statistics-2010/update?softCommit=true" -H
### Processing Solr statistics with AtomicStatisticsUpdateCLI

On 2020-11-18 I finished processing the Solr statistics with solr-upgrade-statistics-6x and I started processing them with AtomicStatisticsUpdateCLI.

## statistics

First the current year's statistics core, in 12-hour batches:

```
$ chrt -b 0 dspace dsrun com.atmire.statistics.util.update.atomic.AtomicStatisticsUpdateCLI -t 12 -c statistics
```

It took ~38 hours to finish processing this core.