mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2025-01-27 05:49:12 +01:00
Add notes for 2022-09-30
@ -496,4 +496,79 @@ Fixed 1 occurences of: Amanda De Filippo: 0000-0002-1536-3221
...
```

## 2022-09-29

- I've been checking the size of the nginx proxy cache the last few days and it always seems to hover around 14,000 entries and 385MB:

```console
# find /var/cache/nginx/rest_cache/ -type f | wc -l
14202
# du -sh /var/cache/nginx/rest_cache
384M /var/cache/nginx/rest_cache
```
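
The same check can be scripted if it needs to run regularly. This is a minimal sketch, assuming a Python interpreter is available on the server; `cache_stats` is a hypothetical helper name, and it sums apparent file sizes, so the total may differ slightly from what `du` reports (which counts disk blocks).

```python
import os
import stat


def cache_stats(root):
    """Return (file_count, total_bytes) for regular files under root.

    Roughly mirrors `find ROOT -type f | wc -l` for the count; the byte
    total is the sum of apparent sizes, not disk blocks as in `du`.
    """
    count = 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat does not follow symlinks, like `find -type f`
                st = os.lstat(path)
            except FileNotFoundError:
                # a cache entry may be evicted while we are walking
                continue
            if stat.S_ISREG(st.st_mode):
                count += 1
                total += st.st_size
    return count, total
```

Calling `cache_stats("/var/cache/nginx/rest_cache")` as a user that can read the cache directory would return the entry count and size in bytes as a tuple.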

- Also on that note, I'm trying to implement a workaround for a potential caching issue that prevents MEL from updating items on DSpace Test
- I *think* we might need to allow requests with a JSESSIONID to bypass the cache, but I have to verify with Salem
- We can do this with an nginx map:

```console
# Check if the JSESSIONID cookie is present and contains a 32-character hex
# value, which would mean that a user is actively attempting to re-use their
# Tomcat session. Then we set the $active_user_session variable and use it
# to bypass the nginx proxy cache in REST requests.
map $cookie_jsessionid $active_user_session {
    # proxy_cache_bypass and proxy_no_cache only act when at least one of
    # their parameters is non-empty and not "0", so the default is empty
    # see: http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache_bypass
    default '';

    '~[A-Z0-9]{32}' 1;
}
```

- Then in the location block where we do the proxy cache:

```console
# Don't cache when the client sends a Cache-Control request header (for
# example "no-cache" when a user Shift-refreshes) or when a client has an
# active session (see the $cookie_jsessionid map).
proxy_cache_bypass $http_cache_control $active_user_session;
proxy_no_cache $http_cache_control $active_user_session;
```
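
To sanity check which requests would skip the cache with this configuration, the decision can be modelled outside nginx. This is only a sketch of my understanding of the two directives (they act when any parameter is non-empty and not "0"), not a replacement for testing against nginx itself; the function names are hypothetical.

```python
import re

# Same pattern as in the nginx map. The "~" prefix in nginx means a
# case-sensitive regex, and the pattern is not anchored, so it may match
# anywhere inside the cookie value.
SESSION_PATTERN = re.compile(r"[A-Z0-9]{32}")


def active_user_session(jsessionid):
    """Mimic the map: return "1" if the cookie value matches, else ""."""
    if jsessionid and SESSION_PATTERN.search(jsessionid):
        return "1"
    return ""


def skips_cache(cache_control, jsessionid):
    """Mimic proxy_cache_bypass / proxy_no_cache with the two parameters.

    The directives take effect when at least one parameter is non-empty
    and not equal to "0".
    """
    params = [cache_control or "", active_user_session(jsessionid)]
    return any(p not in ("", "0") for p in params)
```

For example, `skips_cache("", "0123456789ABCDEF0123456789ABCDEF")` is true (the value here is made up), while a request with neither header nor cookie is served from the cache as before. Note that any non-empty `Cache-Control` request header, such as `max-age=0`, also skips the cache, not only `no-cache`.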

- I found one client making 10,000 requests using a Windows 98 user agent:

```console
Mozilla/4.0 (compatible; MSIE 5.00; Windows 98)
```
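
A count like this can be pulled out of the access logs by grouping requests for one user agent by client IP. The sketch below assumes the logs use nginx's stock "combined" format, which may not match the actual log format on the server; `hits_by_ip` and `LINE_RE` are hypothetical names, and the simple regex does not handle escaped quotes inside a field.

```python
import re
from collections import Counter

# Assumed "combined" layout:
# ip - user [time] "request" status bytes "referer" "user agent"
LINE_RE = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ "[^"]*" "([^"]*)"'
)


def hits_by_ip(lines, user_agent):
    """Count log lines per client IP whose user agent equals user_agent."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group(2) == user_agent:
            counts[match.group(1)] += 1
    return counts
```

Passing an open log file as `lines` and then calling `.most_common(10)` on the result lists the busiest IPs for that agent.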

- They all come from one IP address (129.227.149.43) in Hong Kong
- The IP belongs to a hosting provider called Zenlayer
- I will add this IP to the nginx bot networks and purge its hits

```console
$ ./ilri/check-spider-ip-hits.sh -f /tmp/ip -p
Purging 33027 hits from 129.227.149.43 in statistics

Total number of bot hits purged: 33027
```

- So it seems we've seen this bot before and the total number of hits is much higher than the 10,000 this month
- I had a call with Salem and we verified that the nginx cache bypass for clients who provide a JSESSIONID fixes their issue with updating items/bitstreams from MEL
- The issue was that they delete all metadata and bitstreams and then add them again to make sure everything is up to date. In that process they also re-request the item with all expands to get the bitstreams, that response ends up getting cached, and then they try to delete the old bitstream
- I also noticed that someone made a [pull request to enable POSTing bitstreams to a particular bundle](https://github.com/DSpace/DSpace/pull/8343) and it works, so that's awesome!
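
The MEL caching problem described above can be illustrated with a toy model. This is a simplified sketch, not how DSpace, nginx, or MEL are implemented: `CachedRest` is a hypothetical class standing in for the REST API behind the proxy cache, and `bypass=True` stands in for a request carrying a JSESSIONID (neither read from nor written to the cache).

```python
class CachedRest:
    """Toy model of a backend behind a response cache."""

    def __init__(self):
        self.backend = {}  # item id -> list of bitstream ids
        self.cache = {}    # item id -> cached copy of that list

    def get_item(self, item_id, bypass=False):
        # Without bypass, a cached response is returned even if stale.
        if not bypass and item_id in self.cache:
            return self.cache[item_id]
        response = list(self.backend.get(item_id, []))
        # With bypass the response is not stored either (proxy_no_cache).
        if not bypass:
            self.cache[item_id] = response
        return response

    def replace_bitstreams(self, item_id, new_ids):
        # Writes go straight to the backend and do not touch the cache.
        self.backend[item_id] = list(new_ids)
```

If an item is fetched, its bitstreams are replaced, and it is fetched again without bypass, the second response still lists the old bitstream, so a client acting on it would try to delete something that no longer exists. With `bypass=True` the client sees the current list.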

## 2022-09-30

- I applied [the patch for POSTing bitstreams to other bundles](https://github.com/DSpace/DSpace/pull/8343) on CGSpace
- Testing a few other DSpace 6.4 patches on DSpace Test:
  - [DS-3791 Make sure the "yearDifference" takes into account that a gap of 10 year contains 11 years](https://github.com/DSpace/DSpace/pull/1901)
  - [DS-3873 Limit the usage of PDFBoxThumbnail to PDFs](https://github.com/DSpace/DSpace/pull/2501)
  - [Reduce itemCounter init](https://github.com/DSpace/DSpace/pull/2161)
  - [ImageMagick: Only execute "identify" on first page](https://github.com/DSpace/DSpace/pull/2201)
  - [DS-3881: Show no total results on search-filter](https://github.com/DSpace/DSpace/pull/2371)
  - [pass value instead of qualifier to method](https://github.com/DSpace/DSpace/pull/2699)
  - [dspace-api: check for null AND empty qualifier in findByElement()](https://github.com/DSpace/DSpace/pull/7993)
  - [Avoid exporting mapped Item more than once](https://github.com/DSpace/DSpace/pull/7995)
  - [[DS-4574] v. 6 - Upgrade DBCP2 dependency](https://github.com/DSpace/DSpace/pull/3162)
  - [bump up pdfbox version on 6.x to match main branch](https://github.com/DSpace/DSpace/pull/2742)

<!-- vim: set sw=2 ts=2: -->