CGSpace Notes

Documenting day-to-day work on the CGSpace repository.

September, 2019

2019-09-01

  • Linode emailed to say that CGSpace (linode18) had a high rate of outbound traffic for several hours this morning
  • Here are the top ten IPs in the nginx XMLUI and REST/OAI logs this morning:

    # zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E "01/Sep/2019:0" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
    440 17.58.101.255
    441 157.55.39.101
    485 207.46.13.43
    728 169.60.128.125
    730 207.46.13.108
    758 157.55.39.9
    808 66.160.140.179
    814 207.46.13.212
    2472 163.172.71.23
    6092 3.94.211.189
    # zcat --force /var/log/nginx/rest.log /var/log/nginx/rest.log.1 /var/log/nginx/oai.log /var/log/nginx/oai.log.1 | grep -E "01/Sep/2019:0" | awk '{print $1}' | sort | uniq -c | sort -n | tail -n 10
     33 2a01:7e00::f03c:91ff:fe16:fcb
     57 3.83.192.124
     57 3.87.77.25
     57 54.82.1.8
    822 2a01:9cc0:47:1:1a:4:0:2
    1223 45.5.184.72
    1633 172.104.229.92
    5112 205.186.128.185
    7249 2a01:7e00::f03c:91ff:fe18:7396
    9124 45.5.186.2
    
  • 3.94.211.189 is MauiBot, and most of its requests are to Discovery and get rate limited with HTTP 503
  • 163.172.71.23 is some IP on Online SAS in France and its user agent is:

    Mozilla/5.0 ((Windows; U; Windows NT 6.1; fr; rv:1.9.2) Gecko/20100115 Firefox/3.6)
    
  • It actually got mostly HTTP 200 responses:

    # zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E "01/Sep/2019:0" | grep -F 163.172.71.23 | awk '{print $9}' | sort | uniq -c
    1775 200
    703 499
     72 503
    
  • And it was mostly requesting Discover pages:

    # zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E "01/Sep/2019:0" | grep -F 163.172.71.23 | grep -o -E "(bitstream|discover|handle)" | sort | uniq -c
    2350 discover
     71 handle
    
  • I’m not sure why the outbound traffic rate was so high…
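  • One way to dig further would be to rank clients by bytes sent rather than by request count, since a few large downloads can outweigh thousands of small requests. A minimal sketch, assuming the logs use nginx's default "combined" format, where field 10 is the response body size (consistent with field 9 being the status code in the commands above); the bytes_by_ip helper name is hypothetical:

```shell
# Sum response body bytes per client IP from nginx access log lines on stdin.
# Assumption: default "combined" log format, so that after whitespace splitting
# field 1 is the client IP and field 10 is $body_bytes_sent.
bytes_by_ip() {
    # Skip lines whose tenth field is not a plain number (e.g. malformed requests).
    awk '$10 ~ /^[0-9]+$/ { bytes[$1] += $10 }
         END { for (ip in bytes) printf "%.0f %s\n", bytes[ip], ip }' \
        | sort -n | tail -n 10
}

# Example usage, with the same log files and time filter as the commands above:
# zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 | grep -E "01/Sep/2019:0" | bytes_by_ip
```

    The output has the same shape as the request counts above, but the first column is total bytes, so the heaviest clients by traffic appear last.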

2019-09-02

  • Follow up with Carol and Francesca from Bioversity, as they were on holiday during mid-to-late August
    • I told them to check the temporary collection on DSpace Test where I uploaded the 1,427 items so they can see how it will look
    • Also, I told them to advise me about the strange file extensions (.7z, .zip, .lck)
    • Also, I reminded Abenet to check the metadata, as the institutional authors at least will need some modification