Update notes for 2018-11-05

2018-11-05 17:45:39 +02:00
parent 6f561ce4b5
commit 9d81dc3176
3 changed files with 27 additions and 8 deletions

@@ -232,5 +232,14 @@ $ grep -c -E 'session_id=[A-Z0-9]{32}:ip_addr=2a03:2880:11ff' dspace.log.2018-11
- I added the "most-popular" pages to the list of pages that return `X-Robots-Tag: none`, to try to tell bots not to index or follow them
- Also, I implemented an nginx rate limit of twelve requests per minute on all dynamic pages, as I figure a human user might legitimately request one every five seconds
- I wrote a small Python script [add-dc-rights.py](https://gist.github.com/alanorth/4ff81d5f65613814a66cb6f84fdf1fc5) to add usage rights (`dc.rights`) to CGSpace items based on the CSV Hector gave me from MARLO:
```
$ ./add-dc-rights.py -i /tmp/marlo.csv -db dspace -u dspace -p 'fuuu'
```
- The file `marlo.csv` was cleaned up and formatted in OpenRefine
- 165 of the items in their 2017 data are from CGSpace!
- I will add the data to CGSpace this week
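
To sanity check the rate limit arithmetic above, here is a minimal Python sketch of what twelve requests per minute with no burst allowance means in practice. It is only an illustration, not nginx's actual implementation, and `filter_requests` and the sample timestamps are hypothetical:

```python
# Minimal sketch of a 12 requests/minute limit with no burst allowance.
# This is NOT how nginx implements limit_req; the names are hypothetical.

RATE_PER_MINUTE = 12
MIN_INTERVAL = 60 / RATE_PER_MINUTE  # seconds between accepted requests


def filter_requests(timestamps):
    """Return the request times (in seconds) that would be accepted."""
    accepted = []
    last = None
    for t in sorted(timestamps):
        # Accept the first request, then only those spaced far enough apart
        if last is None or t - last >= MIN_INTERVAL:
            accepted.append(t)
            last = t
    return accepted


if __name__ == "__main__":
    human = [0, 5, 10, 15]       # one request every five seconds
    bot = [0, 1, 2, 3, 4, 5, 6]  # one request every second
    print(filter_requests(human))
    print(filter_requests(bot))
```

A client that spaces its requests five seconds apart gets everything through, while one that requests every second only gets the requests that fall on the five second boundaries.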
<!-- vim: set sw=2 ts=2: -->