mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Update notes for 2018-05-30
@ -365,3 +365,19 @@ $ sed 's/.*Item1.*/\n&/g' ~/cifor-duplicates.txt > ~/cifor-duplicates-cleaned.tx
```
- I told Vika to look through the list manually and indicate which ones are indeed duplicates that we should delete, and which ones to map to CIFOR's collection
- A few weeks ago Peter wanted a list of authors from the ILRI collections, so I need to find a way to get the handles of all those collections
- I can use the `/communities/{id}/collections` endpoint of the REST API but it only takes IDs (not handles) and doesn't seem to descend into sub communities
- Shit, so I need the IDs for the top-level ILRI community and all its sub communities (and their sub communities)
- There has got to be a better way to do this than going to each community and getting their handles and IDs manually
- Oh shit, I literally already wrote a script to get all collections in a community hierarchy from the REST API: [rest-find-collections.py](https://gist.github.com/alanorth/ddd7f555f0e487fe0e9d3eb4ff26ce50)
- The output isn't great, but all the handles and IDs are printed in debug mode:
```
$ ./rest-find-collections.py -u https://cgspace.cgiar.org/rest -d 10568/1 2> /tmp/ilri-collections.txt
```
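- For reference, the traversal such a script has to do can be sketched roughly like this. This is not the gist's actual code, only a minimal sketch that assumes the REST API exposes `/communities/{id}/communities` alongside `/communities/{id}/collections` and that each returned JSON object has `id` and `handle` keys. The names `find_collections` and `make_fetcher` are made up for illustration:

```python
import json
from typing import Callable, Iterator, Tuple
from urllib.request import Request, urlopen


def make_fetcher(base_url: str) -> Callable[[str], list]:
    """Return a function that GETs base_url + path and decodes the JSON body."""

    def get_json(path: str) -> list:
        # Ask for JSON explicitly (assumption: the server honours the Accept header)
        request = Request(base_url + path, headers={"Accept": "application/json"})
        with urlopen(request) as response:
            return json.load(response)

    return get_json


def find_collections(
    get_json: Callable[[str], list], community_id
) -> Iterator[Tuple[object, str]]:
    """Yield (id, handle) for every collection below a community, recursively."""
    # Collections owned directly by this community
    for collection in get_json(f"/communities/{community_id}/collections"):
        yield collection["id"], collection["handle"]
    # Descend into each sub community and repeat
    for sub_community in get_json(f"/communities/{community_id}/communities"):
        yield from find_collections(get_json, sub_community["id"])
```

- Calling `find_collections(make_fetcher("https://cgspace.cgiar.org/rest"), community_id)` with the ID of the top-level community would then yield every collection in the hierarchy. Passing the fetcher in as an argument keeps the recursion testable without touching the network.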
- Then I format the list of handles and put it into this SQL query to export authors from items ONLY in those collections (too many to list here):
```
dspace=# \copy (select distinct text_value, count(*) from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 AND resource_id IN (select item_id from collection2item where collection_id IN (select resource_id from handle where handle in ('10568/67236','10568/67274',...))) group by text_value order by count desc) to /tmp/ilri-authors.csv with csv;
```
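- The "format the list of handles" step could be scripted instead of done by hand. A minimal sketch, assuming the handles appear somewhere in the debug output as plain `10568/<number>` strings (the exact output format of the script is not shown here, so the regular expression is a guess). `handles_to_sql_list` is a hypothetical helper name:

```python
import re

# Assumption: every handle of interest uses the 10568 prefix seen in the notes
HANDLE_PATTERN = re.compile(r"\b10568/\d+\b")


def handles_to_sql_list(text: str) -> str:
    """Extract handles from text and format them for a SQL IN (...) clause."""
    handles = HANDLE_PATTERN.findall(text)
    # dict.fromkeys drops duplicates while keeping first-seen order
    unique_handles = dict.fromkeys(handles)
    return ",".join(f"'{handle}'" for handle in unique_handles)
```

- Something like `handles_to_sql_list(open("/tmp/ilri-collections.txt").read())` would then give a string to paste between the parentheses of `handle in (...)`. The pattern also matches community handles if the output prints them, so the list may need pruning before use.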