---
title: "August, 2021"
date: 2021-08-01T09:01:07+03:00
author: "Alan Orth"
categories: ["Notes"]
---
## 2021-08-01
- Update Docker images on AReS server (linode20) and reboot the server:
```console
# docker images | grep -v ^REPO | sed 's/ \+/:/g' | cut -d: -f1,2 | grep -v none | xargs -L1 docker pull
```
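- A minimal sketch of the same idea as a small filter, assuming the default tabular output of `docker images` (the function name `image_refs` is hypothetical, not part of Docker); it compares whole fields instead of splitting on `:`, so a registry with a port in its name should survive:

```shell
# Read `docker images` table output on stdin and print one repository:tag per line.
# image_refs is a hypothetical helper name used only for this sketch.
image_refs() {
  # Skip the header row, then drop dangling images whose repository or tag is <none>
  awk 'NR > 1 && $1 != "<none>" && $2 != "<none>" { print $1 ":" $2 }'
}

# Usage (needs a running Docker daemon):
#   docker images | image_refs | xargs -L1 docker pull
```

- Docker can also print the same list directly with `docker images --format '{{.Repository}}:{{.Tag}}'`, though dangling `<none>` entries would still need to be filtered out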
- I decided to upgrade linode20 from Ubuntu 18.04 to 20.04
<!--more-->
- First I ran all existing updates, took some backups, checked for broken packages, and rebooted, then switched the apt sources to focal and started the release upgrade:
```console
# apt update && apt dist-upgrade
# apt autoremove && apt autoclean
# check for any packages with residual configs we can purge
# dpkg -l | grep -E '^rc' | awk '{print $2}'
# dpkg -l | grep -E '^rc' | awk '{print $2}' | xargs dpkg -P
# dpkg -C
# dpkg -l > 2021-08-01-linode20-dpkg.txt
# tar -I zstd -cvf 2021-08-01-etc.tar.zst /etc
# reboot
# sed -i 's/bionic/focal/' /etc/apt/sources.list.d/*.list
# do-release-upgrade
```
- ... but of course it hit [the libxcrypt bug](https://bugs.launchpad.net/ubuntu/+source/libxcrypt/+bug/1903838)
- I had to get a copy of libcrypt.so.1.1.0 from a working Ubuntu 20.04 system and finish the upgrade manually:
```console
# apt install -f
# apt dist-upgrade
# reboot
```
- After rebooting I purged all packages with residual configs and cleaned up again:
```console
# dpkg -l | grep -E '^rc' | awk '{print $2}' | xargs dpkg -P
# apt autoremove && apt autoclean
```
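- A sketch of the residual-config purge as a reusable filter, assuming the usual `dpkg -l` layout where the first column is the state and the second is the package name (`residual_config_packages` is a hypothetical name); matching the whole first field avoids catching any other state that merely starts with `rc`:

```shell
# Read `dpkg -l` output on stdin and print the names of packages in state "rc"
# (removed, but configuration files remain).
# residual_config_packages is a hypothetical helper name used only for this sketch.
residual_config_packages() {
  awk '$1 == "rc" { print $2 }'
}

# Usage (as root); GNU xargs -r skips dpkg entirely when the list is empty:
#   dpkg -l | residual_config_packages | xargs -r dpkg -P
```

- The `-r` matters on a clean system: without it, GNU xargs still runs `dpkg -P` once with no package names, which fails with a usage error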
- Then I cleared my local Ansible fact cache and re-ran the [infrastructure playbooks](https://github.com/ilri/rmg-ansible-public)
- Open [an issue for the value mappings global replacement bug in OpenRXV](https://github.com/ilri/OpenRXV/issues/111)
- Advise Peter and Abenet on expected CGSpace budget for 2022
- Start a fresh harvest on AReS (linode20)
## 2021-08-02
- Help Udana with OAI validation on CGSpace
- He was checking the OAI base URL on OpenArchives and I had to verify the results in order to proceed to Step 2
- Now it seems to be verified (all green): https://www.openarchives.org/Register/ValidateSite?log=R23ZWX85
- We are listed in the OpenArchives list of databases conforming to OAI 2.0
## 2021-08-03
- Run fresh re-harvest on AReS
## 2021-08-05
- Have a quick call with Mishell Portilla from CIP about a journal article that was flagged as being in a predatory journal (Beall's List)
- We agreed to unmap it from RTB's collection for now, and I asked Peter and Abenet for advice on how to handle such cases in the future
- A developer from the Alliance asked for access to the CGSpace database so they can make some integration with PowerBI
- I told them we don't allow direct database access, and that it would be tricky anyway (that's what APIs are for!)
- I'm curious if there are still any requests coming in to CGSpace from the abusive Russian networks
- I extracted all the unique IPs that nginx processed in the last week:
```console
# zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 /var/log/nginx/access.log.2 /var/log/nginx/access.log.3 /var/log/nginx/access.log.4 /var/log/nginx/access.log.5 /var/log/nginx/access.log.6 /var/log/nginx/access.log.7 /var/log/nginx/access.log.8 | grep -E " (200|499) " | grep -v -E "(mahider|Googlebot|Turnitin|Grammarly|Unpaywall|UptimeRobot|bot)" | awk '{print $1}' | sort | uniq > /tmp/2021-08-05-all-ips.txt
# wc -l /tmp/2021-08-05-all-ips.txt
43428 /tmp/2021-08-05-all-ips.txt
```
- Already I can see that the total is much less than during the attack on one weekend last month (over 50,000!)
- Indeed, there are no longer any IPs coming in from those networks:
```console
$ ./ilri/resolve-addresses-geoip2.py -i /tmp/2021-08-05-all-ips.txt -o /tmp/2021-08-05-all-ips.csv
$ csvgrep -c asn -r '^(49453|46844|206485|62282|36352|35913|35624|8100)$' /tmp/2021-08-05-all-ips.csv | csvcut -c ip | sed 1d | sort | uniq > /tmp/2021-08-05-all-ips-to-purge.csv
$ wc -l /tmp/2021-08-05-all-ips-to-purge.csv
0 /tmp/2021-08-05-all-ips-to-purge.csv
```
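- The nginx log filtering from earlier in this entry, sketched as a reusable function, assuming log lines where the client IP is the first field and the status code is surrounded by spaces (`unique_client_ips` is a hypothetical name):

```shell
# Read nginx access log lines on stdin and print the unique client IPs of requests
# answered with HTTP 200 or 499, skipping the known bots and crawlers.
# unique_client_ips is a hypothetical helper name used only for this sketch.
unique_client_ips() {
  grep -E ' (200|499) ' \
    | grep -v -E '(mahider|Googlebot|Turnitin|Grammarly|Unpaywall|UptimeRobot|bot)' \
    | awk '{ print $1 }' \
    | sort -u
}

# Usage: zcat --force passes plain files through and decompresses gzipped ones
#   zcat --force /var/log/nginx/access.log /var/log/nginx/access.log.1 | unique_client_ips | wc -l
```

- `sort -u` gives the same result as `sort | uniq` with one process fewer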
<!-- vim: set sw=2 ts=2: -->