mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2024-09-30 06:04:16 +02:00
---
title: "September, 2019"
date: 2019-09-01T10:17:51+03:00
author: "Alan Orth"
tags: ["Notes"]
---

## 2019-09-01

- Linode emailed to say that CGSpace (linode18) had a high rate of outbound traffic for several hours this morning
- Here are the top ten IPs in the nginx XMLUI and REST/OAI logs this morning:

```
# zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E "01/Sep/2019:0" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
440 17.58.101.255
441 157.55.39.101
485 207.46.13.43
728 169.60.128.125
730 207.46.13.108
758 157.55.39.9
808 66.160.140.179
814 207.46.13.212
2472 163.172.71.23
6092 3.94.211.189
# zcat --force /var/log/nginx/rest.log /var/log/nginx/rest.log.1 /var/log/nginx/oai.log /var/log/nginx/oai.log.1 | grep -E "01/Sep/2019:0" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
33 2a01:7e00::f03c:91ff:fe16:fcb
57 3.83.192.124
57 3.87.77.25
57 54.82.1.8
822 2a01:9cc0:47:1:1a:4:0:2
1223 45.5.184.72
1633 172.104.229.92
5112 205.186.128.185
7249 2a01:7e00::f03c:91ff:fe18:7396
9124 45.5.186.2
```

<!--more-->

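The two pipelines above differ only in which log files they read, so the counting step could be wrapped in a small helper. This is a minimal sketch, not something from the original notes: `top_ips` and `sample_log` are hypothetical names, and the sample lines are synthetic (documentation IP ranges), not real CGSpace traffic.

```shell
# Sketch: tally the busiest client IPs from access-log lines on stdin.
# Usage: top_ips PATTERN [COUNT]
top_ips() {
  # -F treats PATTERN as a literal string such as "01/Sep/2019:0"
  grep -F -- "$1" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n "${2:-10}"
}

# Synthetic sample lines using documentation IP ranges, NOT real CGSpace traffic
sample_log() {
  cat <<'EOF'
192.0.2.10 - - [01/Sep/2019:02:00:01 +0200] "GET /discover HTTP/1.1" 200 1000 "-" "ExampleBot/1.0"
192.0.2.10 - - [01/Sep/2019:03:10:00 +0200] "GET /discover HTTP/1.1" 200 1000 "-" "ExampleBot/1.0"
198.51.100.7 - - [01/Sep/2019:04:00:00 +0200] "GET /handle/1/2 HTTP/1.1" 200 500 "-" "OtherBot/1.0"
192.0.2.10 - - [01/Sep/2019:13:00:00 +0200] "GET /discover HTTP/1.1" 200 1000 "-" "ExampleBot/1.0"
EOF
}

# Only the first three sample lines fall in hours 00-09 and get counted
sample_log | top_ips "01/Sep/2019:0" 10
```

Against the real logs the same helper would presumably be fed from `zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 | top_ips "01/Sep/2019:0"`.
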
- `3.94.211.189` is MauiBot, and most of its requests are to Discovery, where they get rate limited with HTTP 503
- `163.172.71.23` is some IP on Online SAS in France and its user agent is:

```
Mozilla/5.0 ((Windows; U; Windows NT 6.1; fr; rv:1.9.2) Gecko/20100115 Firefox/3.6)
```

- It actually got mostly HTTP 200 responses:

```
# zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E "01/Sep/2019:0" | grep 163.172.71.23 | awk '{print $9}' | sort | uniq -c
1775 200
703 499
72 503
```

- And it was mostly requesting Discover pages:

```
# zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E "01/Sep/2019:0" | grep 163.172.71.23 | grep -o -E "(bitstream|discover|handle)" | sort | uniq -c
2350 discover
71 handle
```

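One caveat with filtering by `grep 163.172.71.23`: the unescaped dots match any character and the pattern can match anywhere in the line, not only in the client address field. A stricter variant compares the first field exactly. This is a minimal sketch under the assumption that the status code is field 9, as in the `awk '{print $9}'` calls above; `status_by_ip` and `sample_log` are hypothetical names and the sample lines are synthetic.

```shell
# Sketch: count HTTP status codes for one client IP from access-log lines on stdin.
# Usage: status_by_ip IP
status_by_ip() {
  # Compare field 1 as an exact string, so 192.0.2.10 does not also match 192.0.2.101
  awk -v ip="$1" '$1 == ip {print $9}' | sort | uniq -c
}

# Synthetic sample lines using documentation IP ranges, NOT real CGSpace traffic
sample_log() {
  cat <<'EOF'
192.0.2.10 - - [01/Sep/2019:02:00:01 +0200] "GET /discover?query=a HTTP/1.1" 200 1000 "-" "ExampleBot/1.0"
192.0.2.10 - - [01/Sep/2019:02:00:02 +0200] "GET /discover?query=b HTTP/1.1" 200 1000 "-" "ExampleBot/1.0"
192.0.2.10 - - [01/Sep/2019:02:00:03 +0200] "GET /handle/1/2 HTTP/1.1" 503 200 "-" "ExampleBot/1.0"
192.0.2.101 - - [01/Sep/2019:02:00:04 +0200] "GET /handle/1/3 HTTP/1.1" 200 500 "-" "OtherBot/1.0"
EOF
}

# The 192.0.2.101 line is ignored even though its address starts with 192.0.2.10
sample_log | status_by_ip 192.0.2.10
```
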
- I'm not sure why the outbound traffic rate was so high...

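Request counts alone do not show which clients account for the outbound bytes, so one way to dig further might be to sum the response sizes per IP. This is a minimal sketch, assuming the logs use nginx's default `combined` format, where whitespace-separated field 10 is `$body_bytes_sent` (which excludes response headers). That assumption is consistent with the status code being field 9 above, but the notes do not confirm it. `bytes_by_ip` and `sample_log` are hypothetical names and the sample lines are synthetic.

```shell
# Sketch: sum response body bytes per client IP from access-log lines on stdin.
# ASSUMPTION: field 10 is $body_bytes_sent (nginx "combined" log format).
# Usage: bytes_by_ip [COUNT]
bytes_by_ip() {
  # printf "%.0f" avoids exponent notation for large totals in some awk versions
  awk '{bytes[$1] += $10} END {for (ip in bytes) printf "%.0f %s\n", bytes[ip], ip}' \
    | sort -n | tail -n "${1:-10}"
}

# Synthetic sample lines using documentation IP ranges, NOT real CGSpace traffic
sample_log() {
  cat <<'EOF'
192.0.2.10 - - [01/Sep/2019:02:00:01 +0200] "GET /discover HTTP/1.1" 200 1000 "-" "ExampleBot/1.0"
192.0.2.10 - - [01/Sep/2019:02:00:02 +0200] "GET /discover HTTP/1.1" 200 1000 "-" "ExampleBot/1.0"
192.0.2.10 - - [01/Sep/2019:02:00:03 +0200] "GET /discover HTTP/1.1" 503 200 "-" "ExampleBot/1.0"
198.51.100.7 - - [01/Sep/2019:02:00:04 +0200] "GET /bitstream/handle/1/2/report.pdf HTTP/1.1" 200 5000000 "-" "OtherBot/1.0"
EOF
}

# One large download outweighs three small requests from the busier client
sample_log | bytes_by_ip 10
```

In the synthetic sample the client with the fewest requests sends the most bytes, which is the kind of pattern a request-count ranking would hide.
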
## 2019-09-02

- Follow up with Carol and Francesca from Bioversity as they were on holiday during mid-to-late August
  - I told them to check the [temporary collection on DSpace Test](https://dspacetest.cgiar.org/handle/10568/103999) where I uploaded the 1,427 items so they can see how it will look
  - Also, I told them to advise me about the strange file extensions (.7z, .zip, .lck)
  - Also, I reminded Abenet to check the metadata, as the institutional authors at least will need some modification

## 2019-09-10

- Altmetric responded to say that they have fixed an issue with their badge code so now research outputs with multiple handles are showing badges!
  - See: https://hdl.handle.net/handle/10568/97825
- Follow up with Bosede about the mix-up with PDFs in the items uploaded in 2018-12 (aka Daniel1807)
  - These are the same ones that Peter noticed last week, and that Bosede and I discussed earlier this year but never sorted out
- Continue working on CG Core v2 migration, focusing on the crosswalk mappings
  - I think we can skip the MODS crosswalk for now because it is only used in [AIP exports that are meant for non-DSpace systems](https://wiki.duraspace.org/display/DSDOC5x/DSpace+AIP+Format#DSpaceAIPFormat-MODSSchema)
  - We should probably do the QDC crosswalk as well as those in `xhtml-head-item.properties`...
  - Ouch, there is potentially a lot of work in the OAI metadata formats like DIM, METS, and QDC (see `dspace/config/crosswalks/oai/*.xsl`)
  - In general I think I should only modify the left side of the crosswalk mappings (i.e., where metadata is coming from) so we maintain exactly the same output for search engines, etc.

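The "left side only" idea could be sketched as a filter that renames the source field of a mapping and leaves everything from the `=` onward byte-for-byte unchanged. This is a minimal sketch, assuming the crosswalk files are `source = output` properties lines. The field names `example.old.title` and `example.new.title`, the helper `remap_source_field`, and the sample lines are all hypothetical and are not taken from the actual CG Core v2 mappings.

```shell
# Sketch: rename the source (left) side of one "source = output" mapping on stdin.
# Usage: remap_source_field OLD_FIELD NEW_FIELD
remap_source_field() {
  awk -v old="$1" -v new="$2" '
    /^[ \t]*#/ || !/=/ { print; next }      # pass comments and non-mapping lines through
    {
      eq = index($0, "=")                   # split at the first "=" only
      key = substr($0, 1, eq - 1)
      rest = substr($0, eq)                 # "=" plus the untouched output side
      gsub(/^[ \t]+|[ \t]+$/, "", key)      # trim whitespace around the field name
      if (key == old) print new " " rest; else print
    }'
}

# HYPOTHETICAL mapping lines for illustration only
sample_crosswalk() {
  cat <<'EOF'
# hypothetical crosswalk excerpt
example.old.title = <dc:title>%s</dc:title>
example.old.subject = <dc:subject>%s</dc:subject>
EOF
}

sample_crosswalk | remap_source_field example.old.title example.new.title
```

Because only the text before the first `=` is rewritten, the emitted element on the right stays identical, which is the property the last bullet above is after.
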
<!-- vim: set sw=2 ts=2: -->