Add notes for 2018-09-30

2018-09-30 08:23:48 +03:00
parent 22aaf9a05c
commit d65b9e24ee
60 changed files with 270 additions and 66 deletions

@ -611,4 +611,73 @@ $ time schedtool -D -e ionice -c2 -n7 nice -n19 dspace index-discovery -b
- It seems to be Moayad trying to do the AReS explorer indexing
- He was sending too many (5 or 10) concurrent requests to the server, but still... why is this shit so slow?!
## 2018-09-30
- Valerio keeps submitting items to CGSpace that have weird or incorrect languages, authors, etc.
- I think I should just batch export and update all languages...
```
dspace=# \copy (select distinct text_value, count(*) from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND resource_type_id = 2 group by text_value order by count desc) to /tmp/2018-09-30-languages.csv with csv;
```
- Then I can simply delete the "Other" and "other" values because they are not useful at all:
```
dspace=# DELETE FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND text_value='Other';
DELETE 6
dspace=# DELETE FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND text_value='other';
DELETE 79
```
- Looking through the list I see some weird language codes like `gh`, so I checked out those items:
```
dspace=# SELECT resource_id FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND text_value='gh';
resource_id
-------------
94530
94529
dspace=# SELECT handle,item_id FROM item, handle WHERE handle.resource_type_id=2 AND handle.resource_id = item.item_id AND handle.resource_id in (94530, 94529);
handle | item_id
-------------+---------
10568/91386 | 94529
10568/91387 | 94530
```
- Those items are from Ghana, so the submitter apparently thought `gh` was a language... I can safely delete the `gh` language values (not the items themselves):
```
dspace=# DELETE FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND text_value='gh';
DELETE 2
```
- The next issue would be `jn`:
```
dspace=# SELECT resource_id FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND text_value='jn';
resource_id
-------------
94001
94003
dspace=# SELECT handle,item_id FROM item, handle WHERE handle.resource_type_id=2 AND handle.resource_id = item.item_id AND handle.resource_id in (94001, 94003);
handle | item_id
-------------+---------
10568/90868 | 94001
10568/90870 | 94003
```
- Those items are about Japan, so I will update them to be `ja`
- Other replacements:
```
DELETE FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND text_value='gh';
UPDATE metadatavalue SET text_value='fr' WHERE resource_type_id=2 AND metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND text_value='fn';
UPDATE metadatavalue SET text_value='hi' WHERE resource_type_id=2 AND metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND text_value='in';
UPDATE metadatavalue SET text_value='ja' WHERE resource_type_id=2 AND metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND text_value='Ja';
UPDATE metadatavalue SET text_value='ja' WHERE resource_type_id=2 AND metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND text_value='jn';
UPDATE metadatavalue SET text_value='ja' WHERE resource_type_id=2 AND metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'language' and qualifier = 'iso') AND text_value='jp';
```
- Then there are 12 items with `en|hi`, but they were all in one collection, so I just exported that collection as a CSV and re-imported the corrected metadata
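- The replacement statements above all share the same shape, so they could be generated from a mapping instead of being written out by hand. A minimal Python sketch of that idea, assuming the same table and field names used in the queries above; `REPLACEMENTS`, `quote`, `build_statement` and `build_statements` are hypothetical names made up for illustration, and the script only prints SQL for review rather than running anything against the database:

```python
# Hypothetical helper: prints statements in the same style as the ones in these notes.
# It does not connect to PostgreSQL; the output is meant to be reviewed and pasted into psql.

# Subquery copied from the notes: looks up the field ID for the language "iso" qualifier.
FIELD_SUBQUERY = (
    "(select metadata_field_id from metadatafieldregistry "
    "where element = 'language' and qualifier = 'iso')"
)

# Replacements taken from the notes above; None means "delete the value".
REPLACEMENTS = {
    "gh": None,
    "fn": "fr",
    "in": "hi",
    "Ja": "ja",
    "jn": "ja",
    "jp": "ja",
}


def quote(value):
    """Quote a string as a SQL literal by doubling any single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_statement(old, new):
    """Return one DELETE or UPDATE statement for a single language value."""
    where = (
        f"WHERE resource_type_id=2 AND metadata_field_id = {FIELD_SUBQUERY} "
        f"AND text_value={quote(old)}"
    )
    if new is None:
        return f"DELETE FROM metadatavalue {where};"
    return f"UPDATE metadatavalue SET text_value={quote(new)} {where};"


def build_statements(replacements):
    """Build one statement per mapping entry, in insertion order."""
    return [build_statement(old, new) for old, new in replacements.items()]


if __name__ == "__main__":
    for statement in build_statements(REPLACEMENTS):
        print(statement)
```

  Keeping the mapping in one place makes it easier to review the old and new codes side by side before anything is run in `psql`.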
<!-- vim: set sw=2 ts=2: -->