mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Add notes
@@ -642,7 +642,7 @@ UPDATE 18659
$ dspace metadata-import -f /tmp/0.csv
```

- It took FOREVER to import each file... like several hours. MY GOD DSpace 6 is slow.
- It took FOREVER to import each file... like several hours *each*. MY GOD DSpace 6 is slow.
- Help Dominique Perera debug some issues with the WordPress DSpace importer plugin from Macaroni Bros
- She is not seeing the community list for CGSpace, and I see weird requests like this in the logs:

@@ -653,4 +653,90 @@ $ dspace metadata-import -f /tmp/0.csv

- The first request is OK, but the second one is malformed for sure

## 2021-02-24

- Export a list of journals for Peter to look through:

```console
localhost/dspace63= > \COPY (SELECT DISTINCT text_value as "cg.journal", count(*) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=251 GROUP BY text_value ORDER BY count DESC) to /tmp/2021-02-24-journals.csv WITH CSV HEADER;
COPY 3345
```

- Start a fresh harvest on AReS because Udana mapped some items today and wants to include them in his report:

```console
$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp'
# start indexing in AReS
```

- Also, I want to include the new series name/number cleanups so it's not a total waste of time

## 2021-02-25

- Hmm, the AReS harvest last night seems to have finished successfully, but the number of items is lower than I was expecting:

```console
$ curl -s 'http://localhost:9200/openrxv-items-temp/_count?q=*&pretty'
{
  "count" : 99546,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  }
}
```

- The current items index has 101380 items... I wonder what happened
- I started a new indexing
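
The comparison above between the temporary and live indexes could be scripted. This is only a minimal sketch, assuming Elasticsearch answers on `localhost:9200` as in the curl command; the live index name `openrxv-items` and all function names are assumptions for illustration, not something taken from AReS itself.

```python
import json
from urllib.request import urlopen

ES_URL = "http://localhost:9200"  # same host as the curl command above


def parse_count(body: str) -> int:
    """Extract the "count" field from an Elasticsearch _count response."""
    return int(json.loads(body)["count"])


def index_count(index: str, base_url: str = ES_URL) -> int:
    """Ask Elasticsearch how many documents an index holds."""
    with urlopen(f"{base_url}/{index}/_count?q=*") as response:
        return parse_count(response.read().decode("utf-8"))


def shortfall(temp_count: int, live_count: int) -> int:
    """Documents missing from the temp index compared to the live one (0 if none)."""
    return max(live_count - temp_count, 0)


# Example usage (needs a running Elasticsearch, so it is left commented out;
# "openrxv-items" as the live index name is an assumption):
# temp = index_count("openrxv-items-temp")
# live = index_count("openrxv-items")
# print(f"temp={temp} live={live} missing={shortfall(temp, live)}")
```

With the two numbers from these notes, `shortfall(99546, 101380)` gives 1834 missing items.
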

## 2021-02-26

- Last night's indexing was more successful; there are now 101479 items in the index
- Yesterday Yousef sent a [pull request](https://github.com/ilri/OpenRXV/pull/77/) for the next/previous buttons on OpenRXV
- I tested it this morning and it seems to be working

## 2021-02-28

- Abenet asked me to import seventy-three records for CRP Forests, Trees and Agroforestry
- I checked them briefly and found that there were thirty+ journal articles, and none of them had `cg.journal`, `cg.volume`, `cg.issue`, or `dcterms.license` so I spent a little time adding them
- I used two GREL expressions to extract the journal volume and issue from the citation into new columns:

```console
value.partition(/[0-9]+\([0-9]+\)/)[1].replace(/\(.*\)/,"")
value.partition(/[0-9]+\([0-9]+\)/)[1].replace(/^\d+\((\d+)\)/,"$1")
```
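
For anyone more at home in Python than GREL, the two expressions above can be approximated as follows. This is a rough sketch: the `partition` helper is a hand-written stand-in for GREL's `partition()` (returning before, match, after), and the sample citation in the usage note is made up.

```python
import re

VOLUME_ISSUE = re.compile(r"[0-9]+\([0-9]+\)")  # matches e.g. "12(3)"


def partition(value: str, pattern: re.Pattern) -> list:
    """Stand-in for GREL's partition(): [before, match, after]."""
    match = pattern.search(value)
    if match is None:
        # GREL returns the whole string plus two empty strings when nothing matches
        return [value, "", ""]
    return [value[: match.start()], match.group(0), value[match.end():]]


def volume(citation: str) -> str:
    # Mirrors .replace(/\(.*\)/, ""): drop the parenthesised issue number
    return re.sub(r"\(.*\)", "", partition(citation, VOLUME_ISSUE)[1])


def issue(citation: str) -> str:
    # Mirrors .replace(/^\d+\((\d+)\)/, "$1"); Python writes the group reference as \1
    return re.sub(r"^\d+\((\d+)\)", r"\1", partition(citation, VOLUME_ISSUE)[1])
```

For a hypothetical citation ending in `Some Journal 12(3): 45-67.`, `volume()` returns `"12"` and `issue()` returns `"3"`; a citation without a `volume(issue)` pattern yields an empty string for both.
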

- This `value.partition` was new to me... and it took me a bit of time to figure out whether I needed to escape the parentheses in the issue number or not (no) and how to reference a capture group with `value.replace`
- I tried to check the 1095 CIFOR records from last week for duplicates on DSpace Test, but the page says "Processing" and never loads
- I don't see any errors in the logs, but there are two jQuery errors in the browser console
- I filed [an issue](https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=934) with Atmire
- Upload twelve items to CGSpace for Peter
- Niroshini from IWMI is still having issues adding WLE subjects to items during the metadata review step in the workflow
- It seems the BatchEditConsumer log spam is gone since I applied [Atmire's patch](https://github.com/ilri/DSpace/pull/462)

```console
$ grep -c 'BatchEditConsumer should not have been given' dspace.log.2021-02-[12]*
dspace.log.2021-02-10:5067
dspace.log.2021-02-11:2647
dspace.log.2021-02-12:4231
dspace.log.2021-02-13:221
dspace.log.2021-02-14:0
dspace.log.2021-02-15:0
dspace.log.2021-02-16:0
dspace.log.2021-02-17:0
dspace.log.2021-02-18:0
dspace.log.2021-02-19:0
dspace.log.2021-02-20:0
dspace.log.2021-02-21:0
dspace.log.2021-02-22:0
dspace.log.2021-02-23:0
dspace.log.2021-02-24:0
dspace.log.2021-02-25:0
dspace.log.2021-02-26:0
dspace.log.2021-02-27:0
dspace.log.2021-02-28:0
```
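
If the per-day counts are needed as data rather than terminal output, the `grep -c` above could be mirrored in Python. A minimal sketch, assuming the logs sit in one directory; the function names and the `log_dir` argument are made up for illustration.

```python
from pathlib import Path

NEEDLE = "BatchEditConsumer should not have been given"


def count_matching_lines(lines, needle: str = NEEDLE) -> int:
    """Count lines containing needle, like `grep -c` with a fixed string."""
    return sum(1 for line in lines if needle in line)


def count_per_log(log_dir: str, pattern: str = "dspace.log.2021-02-[12]*") -> dict:
    """Map each matching log file name to its number of matching lines."""
    counts = {}
    for path in sorted(Path(log_dir).glob(pattern)):
        # errors="replace" keeps odd bytes in the logs from aborting the count
        with path.open(errors="replace") as handle:
            counts[path.name] = count_matching_lines(handle)
    return counts
```

The returned dictionary is keyed by file name, so the first day with a zero count can be picked out with an ordinary loop.
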

<!-- vim: set sw=2 ts=2: -->
97
content/posts/2021-03.md
Normal file
@@ -0,0 +1,97 @@
---
title: "March, 2021"
date: 2021-03-01T10:13:54+02:00
author: "Alan Orth"
categories: ["Notes"]
---

## 2021-03-01

- Discuss some OpenRXV issues with Abdullah from CodeObia
- He's trying to work on the DSpace 6+ metadata schema autoimport using the DSpace 6+ REST API
- Also, we found some issues building and running OpenRXV currently due to an ecosystem shift in the Node.js dependencies

<!--more-->

## 2021-03-02

- I fixed three build and runtime issues in OpenRXV:
- [fix highcharts-angular and ngx-tour-core build](https://github.com/ilri/OpenRXV/pull/80)
- [frontend/package.json: Pin @types/ramda at 0.27.34](https://github.com/ilri/OpenRXV/pull/82)
- Then I merged a few fixes that Abdullah had worked on last week

## 2021-03-03

- I [fixed another frontend build warning on OpenRXV](https://github.com/ilri/OpenRXV/issues/83)
- Then I [updated the frontend container to use Node.js 12 and Ubuntu 20.04](https://github.com/ilri/OpenRXV/pull/84)
- Also, I [added a GitHub Actions workflow to build the frontend](https://github.com/ilri/OpenRXV/pull/85)
- I did some testing of Abdullah's patch for the values mapping search on OpenRXV
- It still doesn't work with multi-word values, so I recorded a video with wf-recorder and uploaded it to [the issue](https://github.com/ilri/OpenRXV/issues/43) for him to investigate

## 2021-03-04

- Peter has been having issues with the workflow since yesterday
- I looked at the Munin stats and see a high number of database locks since yesterday



- I looked at the number of locks in PostgreSQL and it's definitely high again:

```console
$ psql -c 'SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid;' | wc -l
1020
```

- I reported it to Atmire to take a look, on the [same issue](https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=851) where we had been tracking this before
- Abenet asked me to add a new ORCID for ILRI staff member Zoe Campbell
- I added it to the controlled vocabulary and then tagged her existing items on CGSpace using my `add-orcid-identifiers-csv.py` script:

```console
$ cat 2021-03-04-add-zoe-campbell-orcid.csv
dc.contributor.author,cg.creator.identifier
"Campbell, Zoë","Zoe Campbell: 0000-0002-4759-9976"
"Campbell, Zoe A.","Zoe Campbell: 0000-0002-4759-9976"
$ ./ilri/add-orcid-identifiers-csv.py -i 2021-03-04-add-zoe-campbell-orcid.csv -db dspace -u dspace -p 'fuuu'
```
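
Before running a CSV like the one above against the database, it may be worth checking that every identifier is a well-formed ORCID iD. The sketch below is not part of `add-orcid-identifiers-csv.py`; it is a hypothetical pre-flight check that pulls the iD out of each row and verifies its final character with the ISO 7064 MOD 11-2 checksum that ORCID uses.

```python
import csv
import io
import re

ORCID_RE = re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]")


def orcid_checksum_ok(orcid: str) -> bool:
    """Validate the final character of an ORCID iD (ISO 7064 MOD 11-2)."""
    chars = orcid.replace("-", "")
    total = 0
    for digit in chars[:-1]:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    return chars[-1] == ("X" if result == 10 else str(result))


def check_rows(csv_text: str) -> list:
    """Return (author, orcid, is_valid) for each row of the CSV text."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        match = ORCID_RE.search(row["cg.creator.identifier"])
        orcid = match.group(0) if match else ""
        is_valid = bool(orcid) and orcid_checksum_ok(orcid)
        rows.append((row["dc.contributor.author"], orcid, is_valid))
    return rows
```

Both rows in the file above pass: the checksum of `0000-0002-4759-9976` works out to `6`, which matches its last digit.
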

- I still need to do cleanup on the journal articles metadata
- Peter sent me some cleanups but I can't use them in the search/replace format he gave
- I think it's better to export the metadata values with IDs and import cleaned up ones as CSV

```console
localhost/dspace63= > \COPY (SELECT dspace_object_id AS id, text_value as "cg.journal" FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=251) to /tmp/2021-02-24-journals.csv WITH CSV HEADER;
COPY 32087
```

- I used OpenRefine to remove all journal values that didn't contain at least one of these characters: ; ( )
- Then I cloned the `cg.journal` field to `cg.volume` and `cg.issue`
- I used some GREL expressions like these to extract the journal name, volume, and issue:

```console
value.partition(';')[0].trim() # to get journal names
value.partition(/[0-9]+\([0-9]+\)/)[1].replace(/^(\d+)\(\d+\)/,"$1") # to get journal volumes
value.partition(/[0-9]+\([0-9]+\)/)[1].replace(/^\d+\((\d+)\)/,"$1") # to get journal issues
```
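
The whole OpenRefine pass (filter, clone, extract) could also be done on the exported CSV directly. This is a sketch under assumptions, not what was actually run: it expects the `id` and `cg.journal` columns from the export above, and the function names and sample values are invented for illustration.

```python
import csv
import io
import re

VOLUME_ISSUE = re.compile(r"(\d+)\((\d+)\)")  # e.g. "12(3)" -> ("12", "3")


def split_journal(value: str) -> dict:
    """Split a combined journal string into name, volume and issue."""
    name = value.split(";", 1)[0].strip()  # like partition(';')[0].trim()
    match = VOLUME_ISSUE.search(value)
    volume, issue = match.groups() if match else ("", "")
    return {"cg.journal": name, "cg.volume": volume, "cg.issue": issue}


def clean_rows(csv_text: str) -> list:
    """Keep rows whose journal contains ';', '(' or ')' and split them."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        value = row["cg.journal"]
        if any(char in value for char in ";()"):
            cleaned.append({"id": row["id"], **split_journal(value)})
    return cleaned
```

A hypothetical value such as `Some Journal; 12(3)` becomes journal `Some Journal`, volume `12`, issue `3`, while a plain `Some Journal` row is skipped because it needs no cleanup. The resulting rows could be written back out with `csv.DictWriter` for import.
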

- Then I uploaded the changes to CGSpace using `dspace metadata-import`
- Margarita from CCAFS was asking about an error deleting some items that were showing up in Google and should have been private
- The error was "Authorization denied for action OBSOLETE (DELETE) on BITSTREAM:bd157345-448e ..."
- I searched the DSpace issue tracker and found several issues reporting this:
- [DS-3985 Delete item fails](https://jira.lyrasis.org/browse/DS-3985)
- [DS-4004 Authorization denied Exception when trying to delete permanently an item, collection or community as a non-Admin user](https://jira.lyrasis.org/browse/DS-4004)
- [DS-4297 Authorization error when trying to delete item by submitter/administrator](https://jira.lyrasis.org/browse/DS-4297)
- The issue is apparently with non-admin users who are in the admin and submit groups of the owning collection...
- In this case the item was uploaded to the CCAFS Reports collection, and Margarita is a non-admin user who is a member of the collection's admin and submit groups, exactly as the issue described
- I added a comment about our issue to [DS-4297](https://jira.lyrasis.org/browse/DS-4297)
- Yesterday Abenet added me to a WLE collection's approver/editor steps so we can try to figure out why Niroshini is having issues adding metadata to Udana's submissions
- I edited Udana's submission to CGSpace:
- corrected the title
- added language English
- changed the link to the external item page instead of PDF
- added SDGs from the external item page
- added AGROVOC subjects from the external item page
- added pagination (extent)
- changed the license to "other" because CC-BY-NC-ND is not printed anywhere in the PDF or external item page

<!-- vim: set sw=2 ts=2: -->