mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Add notes for 2018-09-30
@ -611,4 +611,73 @@ $ time schedtool -D -e ionice -c2 -n7 nice -n19 dspace index-discovery -b
- It seems to be Moayad trying to do the AReS explorer indexing
- He was sending too many (5 or 10) concurrent requests to the server, but still... why is this shit so slow?!

## 2018-09-30

- Valerio keeps sending items on CGSpace that have weird or incorrect languages, authors, etc
- I think I should just batch export and update all languages...

```
dspace=# \copy (select distinct text_value, count(*) from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND resource_type_id = 2 group by text_value order by count desc) to /tmp/2018-09-30-languages.csv with csv;
```
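
- A rough sketch of how the exported CSV could be screened for suspicious values, assuming the file has two columns (value, count) and no header row, which is what `\copy ... with csv` writes by default. The `EXPECTED` set and the function name are hypothetical, not CGSpace's real controlled vocabulary:

```python
import csv
import io

# Hypothetical allowlist of language codes we expect to see (an assumption,
# not the actual CGSpace vocabulary).
EXPECTED = {"en", "fr", "es", "ja", "hi", "pt", "ar", "zh"}


def suspicious_languages(csv_file, expected=EXPECTED):
    """Return (value, count) pairs whose value is not in the allowlist.

    Multi-valued cells like 'en|hi' are flagged as well, because they are
    not a single code.
    """
    flagged = []
    for row in csv.reader(csv_file):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        value, count = row[0], int(row[1])
        if value not in expected:
            flagged.append((value, count))
    return flagged


if __name__ == "__main__":
    # Made-up sample rows in the same shape as the export, for illustration only.
    sample = io.StringIO("en,100\nOther,6\ngh,2\njn,2\nen|hi,12\n")
    for value, count in suspicious_languages(sample):
        print(f"{value}\t{count}")
```

- For the real export, the same function could be called on `open("/tmp/2018-09-30-languages.csv", newline="")` instead of the in-memory sample.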

- Then I can simply delete the "Other" and "other" ones because that's not useful at all:

```
dspace=# DELETE FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND text_value='Other';
DELETE 6
dspace=# DELETE FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND text_value='other';
DELETE 79
```

- Looking through the list I see some weird language codes like `gh`, so I checked out those items:

```
dspace=# SELECT resource_id FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND text_value='gh';
 resource_id
-------------
       94530
       94529
dspace=# SELECT handle,item_id FROM item, handle WHERE handle.resource_type_id=2 AND handle.resource_id = item.item_id AND handle.resource_id in (94530, 94529);
   handle    | item_id
-------------+---------
 10568/91386 |   94529
 10568/91387 |   94530
```

- Those items are from Ghana, so the submitter apparently thought `gh` was a language... I can safely delete them:

```
dspace=# DELETE FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND text_value='gh';
DELETE 2
```
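
- The value-to-handle lookup used for `gh` takes two queries; it could probably be folded into one by joining `metadatavalue` to `handle` directly. A minimal sketch against a toy in-memory SQLite schema, so this is neither the real DSpace schema nor PostgreSQL: only the columns used in the queries in these notes are modelled, field id `38` is a made-up placeholder, and `handles_for_language` is a hypothetical helper:

```python
import sqlite3

# Toy stand-ins for the DSpace tables; only the columns used in the
# notes' queries are modelled, and field id 38 is a made-up placeholder.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE metadatavalue (resource_id INTEGER, resource_type_id INTEGER,
                            metadata_field_id INTEGER, text_value TEXT);
CREATE TABLE handle (handle TEXT, resource_type_id INTEGER, resource_id INTEGER);
INSERT INTO metadatavalue VALUES (94529, 2, 38, 'gh'), (94530, 2, 38, 'gh'),
                                 (94001, 2, 38, 'jn');
INSERT INTO handle VALUES ('10568/91386', 2, 94529), ('10568/91387', 2, 94530),
                          ('10568/90868', 2, 94001);
""")


def handles_for_language(conn, field_id, value):
    """Look up handles of items carrying a given language value in one query."""
    query = """
        SELECT h.handle, m.resource_id
        FROM metadatavalue m
        JOIN handle h ON h.resource_type_id = m.resource_type_id
                     AND h.resource_id = m.resource_id
        WHERE m.resource_type_id = 2
          AND m.metadata_field_id = ?
          AND m.text_value = ?
        ORDER BY m.resource_id
    """
    return conn.execute(query, (field_id, value)).fetchall()


print(handles_for_language(conn, 38, "gh"))
# [('10568/91386', 94529), ('10568/91387', 94530)]
```

- Unlike the original query this skips the `item` table, which only matters if a handle could point at a resource that has no item row.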

- The next issue would be `jn`:

```
dspace=# SELECT resource_id FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND text_value='jn';
 resource_id
-------------
       94001
       94003
dspace=# SELECT handle,item_id FROM item, handle WHERE handle.resource_type_id=2 AND handle.resource_id = item.item_id AND handle.resource_id in (94001, 94003);
   handle    | item_id
-------------+---------
 10568/90868 |   94001
 10568/90870 |   94003
```

- Those items are about Japan, so I will update them to be `ja`
- Other replacements:

```
DELETE FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND text_value='gh';
UPDATE metadatavalue SET text_value='fr' WHERE resource_type_id=2 AND metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND text_value='fn';
UPDATE metadatavalue SET text_value='hi' WHERE resource_type_id=2 AND metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND text_value='in';
UPDATE metadatavalue SET text_value='ja' WHERE resource_type_id=2 AND metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND text_value='Ja';
UPDATE metadatavalue SET text_value='ja' WHERE resource_type_id=2 AND metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND text_value='jn';
UPDATE metadatavalue SET text_value='ja' WHERE resource_type_id=2 AND metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND text_value='jp';
```
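
- Since the `UPDATE` statements differ only in the old and new value, they could be generated from a small mapping instead of being typed by hand. A sketch only: `REPLACEMENTS`, `sql_literal` and `build_updates` are hypothetical names, the quoting helper assumes PostgreSQL's default `standard_conforming_strings` behaviour, and wrapping the batch in a transaction is my addition rather than something the statements above did:

```python
# Corrections taken from the UPDATE statements in the notes: wrong value -> fixed code.
REPLACEMENTS = {"fn": "fr", "in": "hi", "Ja": "ja", "jn": "ja", "jp": "ja"}

# Same sub-select the notes use to find the language.iso field id.
FIELD_ID_SQL = (
    "(select metadata_field_id from metadatafieldregistry "
    "where element = 'language' and qualifier = 'iso')"
)


def sql_literal(value):
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_updates(replacements):
    """Return one UPDATE statement per (old, new) pair, wrapped in a transaction."""
    statements = ["BEGIN;"]
    for old, new in replacements.items():
        statements.append(
            f"UPDATE metadatavalue SET text_value={sql_literal(new)} "
            f"WHERE resource_type_id=2 AND metadata_field_id = {FIELD_ID_SQL} "
            f"AND text_value={sql_literal(old)};"
        )
    statements.append("COMMIT;")
    return statements


if __name__ == "__main__":
    print("\n".join(build_updates(REPLACEMENTS)))
```

- The output could be reviewed and then pasted into `psql`; keeping it as plain SQL text means nothing touches the database until someone has read it.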

- Then there are 12 items with `en|hi`, but they were all in one collection so I just exported it as a CSV and then re-imported the corrected metadata

<!-- vim: set sw=2 ts=2: -->