Add notes for 2022-07-08

2022-07-08 15:49:45 +03:00
parent 19715c3295
commit 11ce30438c
29 changed files with 154 additions and 36 deletions


@@ -118,4 +118,53 @@ UPDATE 104
- I will also have to remove "Academicians" from input-forms.xml
<!-- vim: set sw=2 ts=2: -->
## 2022-07-07
- Finalize lists of non-AGROVOC subjects in CGSpace that I started last week
- I used the [SQL helper functions](https://wiki.lyrasis.org/display/DSPACE/Helper+SQL+functions+for+DSpace+6) to find the collections where each term was used:
```console
localhost/dspace= ☘ SELECT DISTINCT(ds6_item2collectionhandle(dspace_object_id)) AS collection, COUNT(*) FROM metadatavalue WHERE dspace_object_id IN (SELECT uuid FROM item) AND LOWER(text_value) = 'water demand' GROUP BY collection ORDER BY count DESC LIMIT 5;
 collection  │ count
─────────────┼───────
 10568/36178 │    56
 10568/36185 │    46
 10568/36181 │    35
 10568/36188 │    28
 10568/36179 │    21
(5 rows)
```
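Running that query once per term gets repetitive, so a parameterized loop can help. The following is only a minimal sketch under stated assumptions: it uses an in-memory SQLite table with a plain `collection` column standing in for `ds6_item2collectionhandle()`, the function name `top_collections` is hypothetical, and the rows and handles are made up for illustration. Against the real PostgreSQL database the same pattern would go through a driver such as psycopg2, which uses `%s` placeholders instead of `?`.

```python
import sqlite3


def top_collections(conn, term, limit=5):
    """Return (collection, count) pairs for a subject term, most used first."""
    query = """
        SELECT collection, COUNT(*) AS n
        FROM metadatavalue
        WHERE LOWER(text_value) = ?
        GROUP BY collection
        ORDER BY n DESC, collection
        LIMIT ?
    """
    return conn.execute(query, (term.lower(), limit)).fetchall()


# Toy data: hypothetical handles, not real CGSpace collections
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metadatavalue (collection TEXT, text_value TEXT)")
conn.executemany(
    "INSERT INTO metadatavalue VALUES (?, ?)",
    [
        ("10568/1", "water demand"),
        ("10568/1", "Water Demand"),
        ("10568/2", "water demand"),
        ("10568/2", "soil erosion"),
    ],
)

# One query per term, with the term passed as a bound parameter
for term in ["water demand", "soil erosion"]:
    print(term, top_collections(conn, term))
```

Passing the term as a bound parameter avoids quoting problems with terms that contain apostrophes.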
- For now I only did terms from my list that had 100 or more occurrences in CGSpace
- This leaves us with thirty-six terms that I will send to Sara Jani and Elizabeth Arnaud to evaluate for possible inclusion in AGROVOC
- Write to some submitters from CIAT, Bioversity, and CCAFS to ask if they are still uploading new items with their legacy subject fields on CGSpace
- We want to remove them from the submission form to create space for new fields
- Update one term I noticed people using that was close to AGROVOC:
```console
dspace=# UPDATE metadatavalue SET text_value='development policies' WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=187 AND text_value='development policy';
UPDATE 108
```
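One way to guard a bulk update like this is to count the matching rows first and only commit if the update touches exactly that many. This is a sketch, not the procedure used here: it runs against an in-memory SQLite toy table, drops the `item` subquery for brevity, and the rows (including field ID 64) are invented for illustration. Only field ID 187 and the two term spellings come from the update above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metadatavalue (metadata_field_id INTEGER, text_value TEXT)"
)
conn.executemany(
    "INSERT INTO metadatavalue VALUES (?, ?)",
    [
        (187, "development policy"),
        (187, "development policy"),
        (187, "development policies"),
        (64, "development policy"),  # hypothetical other field, must stay untouched
    ],
)
conn.commit()

where = "metadata_field_id = ? AND text_value = ?"
params = (187, "development policy")

# Dry run: how many rows would the UPDATE touch?
expected = conn.execute(
    f"SELECT COUNT(*) FROM metadatavalue WHERE {where}", params
).fetchone()[0]

# "with conn" commits on success and rolls back if an exception is raised
with conn:
    cur = conn.execute(
        f"UPDATE metadatavalue SET text_value = ? WHERE {where}",
        ("development policies",) + params,
    )
    if cur.rowcount != expected:
        raise RuntimeError(f"expected {expected} rows, updated {cur.rowcount}")

print(f"updated {cur.rowcount} rows")
```

In `psql` the equivalent would be to wrap the statement in `BEGIN;` and finish with `COMMIT;` or `ROLLBACK;` after checking the reported row count.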
- After contacting some editors I removed some old metadata fields from the submission form and browse indexes:
- Bioversity subject (`cg.subject.bioversity`)
- CCAFS phase 1 project tag (`cg.identifier.ccafsproject`)
- CIAT project tag (`cg.identifier.ciatproject`)
- CIAT subject (`cg.subject.ciat`)
- Work on cleaning and proofing forty-six AfricaRice items for CGSpace
- Last week we identified some duplicates so I removed those
- The data is of mediocre quality
- I've been fixing citations (nitpick), adding licenses, adding volume/issue/extent, fixing DOIs, and adding some AGROVOC subjects
- I even found titles with typos that look like OCR errors
## 2022-07-08
- Finalize the cleaning and proofing of AfricaRice records
- I found two suspicious items that claim to have been published but that I can't find in the respective journals, so I removed those
- I uploaded the forty-four items to [DSpace Test](https://dspacetest.cgiar.org/handle/10568/119135)
- Margarita from CCAFS said they are no longer using the CCAFS subject or CCAFS phase 2 project tag
- I removed these from the input-forms.xml and Discovery facets:
- `cg.identifier.ccafsprojectpii`
- `cg.subject.ccafs`
- For now we will keep them in the search filters