mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Update notes for 2018-07-12
@ -314,5 +314,15 @@ $ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=95.108.181.88' dspace.log.2018-07-
- So this bot is just like Baiduspider, and I need to add it to the nginx rate limiting
- I'll also add it to Tomcat's Crawler Session Manager Valve to force the re-use of a common Tomcat session for all crawlers, just in case
- Generate a list of all affiliations in CGSpace to send to Mohamed Salem to compare with the list on MEL (sorting the list by most occurrences):
```
dspace=# \copy (select distinct text_value, count(*) as count from metadatavalue where resource_type_id=2 and metadata_field_id=211 group by text_value order by count desc) to /tmp/affiliations.csv with csv header
COPY 4518
dspace=# \q
$ csvcut -c 1 < /tmp/affiliations.csv > /tmp/affiliations-1.csv
```
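A quick sanity check on an export like this, as a minimal sketch only: the sample file and its values below are made up and merely stand in for `/tmp/affiliations.csv`. It shows two things: counting the data rows (which should match the number reported by `COPY`, assuming no value contains a newline), and one likely reason to use a CSV-aware tool like `csvcut` rather than a plain comma split, since affiliation names can contain commas.

```shell
#!/bin/sh
# Hypothetical sample data standing in for /tmp/affiliations.csv
sample=$(mktemp)
printf '%s\n' \
  'text_value,count' \
  '"Example University, Example City",12' \
  'Example Institute,7' > "$sample"

# Data rows excluding the header; for the real export this should match
# the number reported by COPY, provided no value contains a newline
data_rows=$(tail -n +2 "$sample" | wc -l | tr -d ' ')
echo "data rows: $data_rows"

# A naive split on commas truncates the quoted value at its embedded comma
naive=$(sed -n '2p' "$sample" | cut -d, -f1)
echo "naive first column: $naive"

rm -f "$sample"
```

With the sample above, the naive split returns only the part of the quoted name before its comma, which is the kind of breakage a CSV-aware tool avoids.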
- We also need to discuss standardizing our countries and comparing our ORCID iDs
<!-- vim: set sw=2 ts=2: -->