mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2024-11-17 12:17:05 +01:00
52 lines
1.6 KiB
Markdown
+++
date = "2016-02-05T13:18:00+03:00"
author = "Alan Orth"
title = "February, 2016"
tags = ["notes"]
image = "../images/bg.jpg"

+++
## 2016-02-05
- Looking at some DAGRIS data for Abenet Yabowork
- Lots of issues with spaces, newlines, etc. causing the import to fail
- I noticed we have a very *interesting* list of countries on CGSpace:

![CGSpace country list](../images/2016/02/cgspace-countries.png)

- Not only are there 49,000 countries, we have some blanks (25)...
- Also, lots of things like "COTE D`LVOIRE" and "COTE D IVOIRE"
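
One way to surface variants like these is fuzzy string matching. A minimal sketch using Python's standard `difflib`, assuming the country values have already been exported to a list. Apart from the two Côte d'Ivoire spellings quoted above, the sample values are made up, and the 0.8 cutoff is an arbitrary starting point, not a tuned threshold:

```python
import difflib

# Hypothetical sample; only the two COTE D... spellings come from the notes.
countries = ["KENYA", "ETHIOPIA", "COTE D`LVOIRE", "COTE D IVOIRE", "Kenya", ""]


def find_near_duplicates(values, cutoff=0.8):
    """Return pairs of distinct, non-blank values that look like variants."""
    # Drop blanks and exact duplicates, then compare every remaining pair.
    cleaned = sorted({v.strip() for v in values if v.strip()})
    pairs = []
    for i, a in enumerate(cleaned):
        for b in cleaned[i + 1:]:
            # Compare case-insensitively so "KENYA" and "Kenya" also match.
            ratio = difflib.SequenceMatcher(None, a.upper(), b.upper()).ratio()
            if ratio >= cutoff:
                pairs.append((a, b))
    return pairs


pairs = find_near_duplicates(countries)
for a, b in pairs:
    print(f"{a!r} <-> {b!r}")
```

The flagged pairs are only candidates: someone still has to decide which spelling is the correct one before changing anything.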
## 2016-02-06
- Found a way to get items with null/empty metadata values from SQL
- First, find the `metadata_field_id` for the field you want from the `metadatafieldregistry` table:
```
dspacetest=# select * from metadatafieldregistry;
```
- In this case our country field is 78
- Now find all resources with type 2 (item) that have null/empty values for that field:
```
dspacetest=# select resource_id from metadatavalue where resource_type_id=2 and metadata_field_id=78 and (text_value='' OR text_value IS NULL);
```
- Then you can find the handle that owns it from its `resource_id`:
```
dspacetest=# select handle from item, handle where handle.resource_id = item.item_id AND item.item_id = '22678';
```
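
The two lookups above could probably be combined into a single join that goes straight from empty values to handles. The sketch below tries the idea on a toy in-memory SQLite database, not on the real DSpace schema. The table and column names follow the queries above, but the `resource_type_id` column on `handle` is an assumption about the schema, and the `123456789/...` handles and all row values except `22678` and field `78` are placeholders:

```python
import sqlite3

# Toy stand-in for the two DSpace tables; NOT the real schema or data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE metadatavalue (
    resource_id INTEGER, resource_type_id INTEGER,
    metadata_field_id INTEGER, text_value TEXT
);
CREATE TABLE handle (
    handle TEXT, resource_type_id INTEGER, resource_id INTEGER
);
INSERT INTO metadatavalue VALUES
    (22678, 2, 78, ''),       -- empty country
    (22679, 2, 78, NULL),     -- NULL country
    (22680, 2, 78, 'KENYA'),  -- populated country
    (22678, 2, 64, '');       -- empty value in a different field
INSERT INTO handle VALUES
    ('123456789/1', 2, 22678),
    ('123456789/2', 2, 22679),
    ('123456789/3', 2, 22680),
    ('123456789/4', 3, 22678); -- another resource type sharing the same id
""")

# Join on both id and type so a non-item with the same id is not picked up.
QUERY = """
SELECT DISTINCT h.handle
FROM metadatavalue AS m
JOIN handle AS h
  ON h.resource_id = m.resource_id
 AND h.resource_type_id = m.resource_type_id
WHERE m.resource_type_id = 2
  AND m.metadata_field_id = 78
  AND (m.text_value = '' OR m.text_value IS NULL)
ORDER BY h.handle
"""

handles = [row[0] for row in conn.execute(QUERY)]
print(handles)
```

The SQL itself is plain enough that it should carry over to `psql` largely unchanged, but that has not been checked against a real DSpace database here.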
- It's 25 items so editing in the web UI is annoying, let's try SQL!
```
dspacetest=# delete from metadatavalue where metadata_field_id=78 and text_value='';
DELETE 25
```
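
Note that the delete above only matches `text_value=''`, while the earlier select also looked for NULLs and restricted to `resource_type_id=2`. A more cautious variant would use the same conditions as the select and run inside a transaction that only commits if the row count is the one expected (in `psql` that would mean wrapping the statement in `BEGIN` and `COMMIT`/`ROLLBACK`). A minimal sketch of the pattern on a toy SQLite table; the helper `delete_empty_values` and all row values are hypothetical:

```python
import sqlite3

# Toy stand-in for metadatavalue; NOT the real DSpace table or data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE metadatavalue (
    resource_id INTEGER, resource_type_id INTEGER,
    metadata_field_id INTEGER, text_value TEXT
);
INSERT INTO metadatavalue VALUES
    (1, 2, 78, ''), (2, 2, 78, NULL), (3, 2, 78, 'KENYA'), (1, 2, 64, '');
""")

# Same conditions as the select used to find the empty values.
DELETE_SQL = """
DELETE FROM metadatavalue
WHERE resource_type_id = 2
  AND metadata_field_id = ?
  AND (text_value = '' OR text_value IS NULL)
"""


def delete_empty_values(conn, field_id, expected):
    """Delete empty/NULL values for a field, but only if the count matches."""
    # "with conn" commits on success and rolls back if an exception escapes.
    with conn:
        cur = conn.execute(DELETE_SQL, (field_id,))
        if cur.rowcount != expected:
            raise RuntimeError(
                f"expected {expected} rows, matched {cur.rowcount}"
            )
    return cur.rowcount


deleted = delete_empty_values(conn, 78, expected=2)
remaining = conn.execute("SELECT COUNT(*) FROM metadatavalue").fetchone()[0]
print(deleted, remaining)
```

If the count is off, nothing is deleted and the mismatch can be investigated first.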
- After that perhaps a regular `dspace index-discovery` (no -b) *should* suffice...
- Hmm, I indexed, cleared the Cocoon cache, and restarted Tomcat but the 25 "|||" countries are still there
- Maybe I need to do a full re-index...