Add notes for 2020-07-08

2020-07-08 16:30:40 +03:00
parent 5291baa539
commit 8d42c71a44
24 changed files with 100 additions and 27 deletions


@@ -352,7 +352,7 @@ Total number of bot hits purged: 29025
## 2020-06-14
- Abenet asked for a list of authors from CIP's community so that Gabriel can make some corrections
- Abenet asked for a list of authors from CIP's community so that Gabriela can make some corrections
- I generated a list of collections in CIP's two communities using the REST API:
```


@@ -303,4 +303,38 @@ $ ./fix-metadata-values.py -i 2020-07-07-fix-sponsors.csv -db dspace -u dspace -
![Altmetric and Dimensions.ai badge](/cgspace-notes/2020/07/dimensions-badge2.png)
## 2020-07-08
- Generate a CSV of all the AGROVOC subjects that didn't match from the top 6500 I exported earlier this week:
```
$ csvgrep -c 'number of matches' -r "^0$" 2020-07-05-cgspace-subjects.csv | csvcut -c 1 > 2020-07-05-cgspace-invalid-subjects.csv
```
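- For reference, a rough Python sketch of what that csvgrep / csvcut pipeline does, using only the standard library; the function name `unmatched_subjects`, the `subject` header, and the sample rows are made up for illustration (only the `number of matches` column name comes from the command above):

```python
import csv
import io


def unmatched_subjects(reader):
    """Yield column one (header included) for rows with exactly 0 matches."""
    header = next(reader)
    # Column name taken from the csvgrep command; its position is looked up
    matches_idx = header.index("number of matches")
    yield header[0]
    for row in reader:
        # Equivalent of: csvgrep -c 'number of matches' -r "^0$"
        if row[matches_idx] == "0":
            # Equivalent of: csvcut -c 1
            yield row[0]


# Hypothetical sample data standing in for the exported subjects CSV
sample = io.StringIO(
    "subject,number of matches\n"
    "MAIZE,1\n"
    "FOO BAR,0\n"
    "WHEAT,10\n"
)
rows = list(unmatched_subjects(csv.reader(sample)))
print(rows)
```

- Note that `10` is not selected, because the pattern is anchored to match a value of exactly `0`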
- Yesterday Gabriela from CIP emailed to say that she was removing the accents from her authors' names because of "funny character" issues with reports generated from CGSpace
- I told her that it's probably her Windows / Excel that is messing up the data, and she figured out how to open them correctly!
- Now she says she doesn't want to remove the accents after all and she sent me a new list of corrections
- I used csvgrep and found a few where she is still removing accents:
```
$ csvgrep -c 2 -r "^.+$" ~/Downloads/cip-authors-GH-20200706.csv | csvgrep -c 1 -r "^.*[À-ú].*$" | csvgrep -c 2 -r "^.*[À-ú].*$" -i | csvcut -c 1,2
dc.contributor.author,correction
"López, G.","Lopez, G."
"Gómez, R.","Gomez, R."
"García, M.","Garcia, M."
"Mejía, A.","Mejia, A."
"Quiróz, Roberto A.","Quiroz, R."
```
- csvgrep from the csvkit suite is *so cool*:
- Select lines with column two (the correction) having a value
- Select lines with column one (the original author name) having an accent / diacritic
- Select lines with column two (the correction) NOT having an accent (i.e., she is removing the accent)
- Select columns one and two
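- The same four steps can be sketched in plain Python, assuming the author is in column one and the correction in column two; the function name `find_accent_removals` and the sample rows are hypothetical, and the character class is copied as-is from the csvgrep command:

```python
import csv
import io
import re

# Same range as in the csvgrep command: U+00C0 to U+00FA. It covers most
# accented Latin-1 letters, but also a few non-letters such as × and ÷.
ACCENTED = re.compile(r"[À-ú]")


def find_accent_removals(rows):
    """Return (original, correction) pairs where the accent was dropped."""
    matches = []
    for row in rows:
        original, correction = row[0], row[1]
        if not correction:
            continue  # step 1: the correction must have a value
        if not ACCENTED.search(original):
            continue  # step 2: the original must contain an accent
        if ACCENTED.search(correction):
            continue  # step 3: the correction must NOT contain an accent
        matches.append((original, correction))  # step 4: keep both columns
    return matches


# Hypothetical sample data, not the real list of corrections
sample = io.StringIO(
    "dc.contributor.author,correction\n"
    '"López, G.","Lopez, G."\n'
    '"Gómez, R.","Gómez, Rafael"\n'
    '"Smith, J.","Smith, John"\n'
    '"García, M.",\n'
)
reader = csv.reader(sample)
header = next(reader)  # skip the header row
result = find_accent_removals(reader)
print(result)
```

- Only the first sample row is reported: the second keeps its accent, the third never had one, and the fourth has no correction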
- Peter said he liked the work I did on the badges yesterday, so I put some finishing touches on it to detect more DOI URI styles and pushed it to the `5_x-prod` branch
- I will port it to DSpace 6 soon
![Altmetric and Dimensions badges](/cgspace-notes/2020/07/altmetrics-dimensions-badges.png)
<!-- vim: set sw=2 ts=2: -->