mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Add notes for 2021-06-25
@@ -324,4 +324,49 @@ $ docker logs api 2>/dev/null | grep dspace_add_missing_items | sort | uniq | wc
- Spent a few hours with Moayad troubleshooting and improving OpenRXV
- We found a bug in the harvesting code that can occur when you are harvesting DSpace 5 and DSpace 6 instances, as DSpace 5 uses numeric (long) IDs, and DSpace 6 uses UUIDs

## 2021-06-25

- The new OpenRXV code creates almost 200,000 jobs when the plugins start
- I figured out how to use [bee-queue/arena](https://github.com/bee-queue/arena/tree/master/example) to view our Bull job queue
- Also, we can see the jobs directly using redis-cli:

```console
$ redis-cli
127.0.0.1:6379> SCAN 0 COUNT 5
1) "49152"
2) 1) "bull:plugins:476595"
   2) "bull:plugins:367382"
   3) "bull:plugins:369228"
   4) "bull:plugins:438986"
   5) "bull:plugins:366215"
```

- We can apparently get the name of the job stored in each hash using `HGET`:

```console
127.0.0.1:6379> TYPE bull:plugins:401827
hash
127.0.0.1:6379> HGET bull:plugins:401827 name
"dspace_add_missing_items"
```

- I whipped up a one-liner to get the keys for all plugin jobs, convert them to redis `HGET` commands that extract the value of the name field, and then sort the names by their counts:

```console
$ redis-cli KEYS "bull:plugins:*" \
  | sed -e 's/^bull/HGET bull/' -e 's/\([[:digit:]]\)$/\1 name/' \
  | ncat -w 3 localhost 6379 \
  | grep -v -E '^\$' | sort | uniq -c | sort -h
      3 dspace_health_check
      4 -ERR wrong number of arguments for 'hget' command
     12 mel_downloads_and_views
    129 dspace_altmetrics
    932 dspace_downloads_and_views
 186428 dspace_add_missing_items
```
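To see what the `sed` step produces without touching redis, here is a minimal sketch that feeds it a few sample key names. The `keys_to_hget` function name and the `bull:plugins:id` sample key are assumptions for illustration only; the two `sed` expressions are the same as in the one-liner above.

```shell
# Rewrite Bull job keys into HGET commands, one per line.
# keys_to_hget is a hypothetical helper name, not part of OpenRXV or Bull.
keys_to_hget() {
  # 1st expression: prefix every key with "HGET "
  # 2nd expression: append the field "name" only when the key ends in a digit
  sed -e 's/^bull/HGET bull/' -e 's/\([[:digit:]]\)$/\1 name/'
}

# Sample input: two numeric job keys and one non-numeric key
# (bull:plugins:id is assumed here purely for illustration).
printf '%s\n' 'bull:plugins:401827' 'bull:plugins:367382' 'bull:plugins:id' \
  | keys_to_hget
```

Keys that do not end in a digit get no field name appended, so the resulting `HGET` has only one argument. That is presumably where the four `-ERR wrong number of arguments for 'hget' command` replies in the output come from.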
- Note that this uses `ncat` to send commands directly to redis all at once instead of one at a time (`netcat` didn't work here, as it doesn't know when our input is finished and never quits)
- I thought of using `redis-cli --pipe`, but then you have to construct the commands in the redis protocol format, with the number of arguments and the byte length of each argument
- There is clearly something wrong with the new DSpace health check plugin, as it creates WAY too many jobs every time we run the plugins

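For reference, this is roughly what constructing commands in the redis protocol format for `redis-cli --pipe` would involve. It is a minimal sketch, not something that was used here: `resp_encode` is a hypothetical helper name, and it assumes ASCII-only arguments because `${#arg}` counts characters rather than bytes.

```shell
# Encode one redis command in the redis wire protocol (RESP):
#   *<number of args>\r\n
#   then for each arg:  $<length of arg>\r\n<arg>\r\n
# resp_encode is a hypothetical helper; ${#arg} counts characters, so this
# sketch assumes ASCII arguments (true for the bull:plugins:* keys above).
resp_encode() {
  printf '*%d\r\n' "$#"
  for arg in "$@"; do
    printf '$%d\r\n%s\r\n' "${#arg}" "$arg"
  done
}

# Example: encode the HGET from earlier and dump it with od so the
# carriage returns and newlines are visible.
resp_encode HGET bull:plugins:401827 name | od -c
```

One encoded command per key could then be piped to `redis-cli --pipe` in place of the `ncat` step. As far as I know, `--pipe` only reports reply and error counts rather than printing each reply, so it may not suit the counting use case above anyway.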
<!-- vim: set sw=2 ts=2: -->