diff --git a/content/posts/2020-12.md b/content/posts/2020-12.md index c39827bb7..3e337e700 100644 --- a/content/posts/2020-12.md +++ b/content/posts/2020-12.md @@ -389,8 +389,85 @@ dspace=# COMMIT; ```console $ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx512m" $ time chrt -b 0 ionice -c2 -n7 nice -n19 dspace index-discovery -b + +real 265m11.224s +user 171m29.141s +sys 2m41.097s ``` - Udana sent a report that the WLE approver is experiencing the same issue Peter highlighted a few weeks ago: they are unable to save metadata edits in the workflow +- Yesterday Atmire responded about the owningComm and owningColl duplicates in Solr saying they didn't see any anymore... + - Indeed I spent a few minutes looking randomly and I didn't find any either... + - I did, however, see lots of duplicates in countryCode_search, countryCode_ngram, ip_search, ip_ngram, userAgent_search, userAgent_ngram, referrer_search, referrer_ngram fields + - I sent feedback to them +- On the database locking front we haven't had issues in over a week and the Munin graphs look normal: + +![PostgreSQL connections all week](/cgspace-notes/2020/12/postgres_connections_ALL-week2.png) +![PostgreSQL locks all week](/cgspace-notes/2020/12/postgres_locks_ALL-week2.png) + +- After the Discovery re-indexing finished on CGSpace I prepared to start re-harvesting AReS by making sure the `openrxv-items-temp` index was empty and that the backup index I made yesterday was still there: + +```console +$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp?pretty' +{ + "acknowledged" : true +} +$ curl -s 'http://localhost:9200/openrxv-items-final/_count?q=*&pretty' +{ + "count" : 0, + "_shards" : { + "total" : 1, + "successful" : 1, + "skipped" : 0, + "failed" : 0 + } +} +$ curl -s 'http://localhost:9200/openrxv-items-2020-12-14/_count?q=*&pretty' +{ + "count" : 99992, + "_shards" : { + "total" : 1, + "successful" : 1, + "skipped" : 0, + "failed" : 0 + } +} +``` + +## 2020-12-16 + +- The harvesting on AReS 
finished last night so this morning I manually cloned the `openrxv-items-temp` index to `openrxv-items` + - First check the number of items in the temp index, then set it to read-only, then delete the items index, then clone the temp index to it, then re-enable writes on the temp index and delete it: + +```console +$ curl -s 'http://localhost:9200/openrxv-items-temp/_count?q=*&pretty' +{ + "count" : 100046, + "_shards" : { + "total" : 1, + "successful" : 1, + "skipped" : 0, + "failed" : 0 + } +} +$ curl -X PUT "localhost:9200/openrxv-items-temp/_settings?pretty" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}' +$ curl -XDELETE 'http://localhost:9200/openrxv-items?pretty' +$ curl -s -X POST "http://localhost:9200/openrxv-items-temp/_clone/openrxv-items?pretty" +$ curl -s 'http://localhost:9200/openrxv-items/_count?q=*&pretty' +{ + "count" : 100046, + "_shards" : { + "total" : 1, + "successful" : 1, + "skipped" : 0, + "failed" : 0 + } +} +$ curl -X PUT "localhost:9200/openrxv-items-temp/_settings?pretty" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": false}}' +$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp?pretty' +``` + +- Interestingly [the item](https://hdl.handle.net/10568/110447) that we noticed was duplicated now only appears once +- The [missing item](https://hdl.handle.net/10568/110133) is still missing diff --git a/docs/2020-12/index.html b/docs/2020-12/index.html index 427762f51..5970e2562 100644 --- a/docs/2020-12/index.html +++ b/docs/2020-12/index.html @@ -20,7 +20,7 @@ I started processing those (about 411,000 records): - + @@ -46,9 +46,9 @@ I started processing those (about 411,000 records): "@type": "BlogPosting", "headline": "December, 2020", "url": "https://alanorth.github.io/cgspace-notes/2020-12/", - "wordCount": "2037", + "wordCount": "2609", "datePublished": "2020-12-01T11:32:54+02:00", - "dateModified": "2020-12-13T16:16:10+02:00", + "dateModified": "2020-12-15T13:13:18+02:00", "author": { "@type": "Person", "name": "Alan Orth" @@ 
-490,7 +490,122 @@ $ query-json '.items | length' /tmp/policy2.json
$ curl -X PUT "localhost:9200/openrxv-items-temp/_settings?pretty" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
 $ curl -s -X POST http://localhost:9200/openrxv-items-temp/_clone/openrxv-items-2020-12-14
 $ curl -X PUT "localhost:9200/openrxv-items-temp/_settings?pretty" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": false}}'
-
+

2020-12-15

+ +
$ ./fix-metadata-values.py -i /tmp/2020-10-28-fix-1534-Authors.csv -db dspace -u dspace -p 'fuuu' -f dc.contributor.author -t 'correct' -m 3
+$ ./delete-metadata-values.py -i /tmp/2020-10-28-delete-2-Authors.csv -db dspace -u dspace -p 'fuuu' -f dc.contributor.author -m 3
+
+
dspace=# BEGIN;
+BEGIN
+dspace=# UPDATE metadatavalue SET text_value=LOWER(text_value) WHERE dspace_object_id IN (SELECT uuid FROM item) AND metadata_field_id=57 AND text_value ~ '[[:upper:]]';
+UPDATE 406
+dspace=# COMMIT;
+COMMIT
+
+
dspace=# BEGIN;
+dspace=# UPDATE metadatavalue SET text_value = REGEXP_REPLACE(text_value, 'fa fa-rss','fas fa-rss', 'g') WHERE text_value LIKE '%fa fa-rss%';
+UPDATE 74
+dspace=# UPDATE metadatavalue SET text_value = REGEXP_REPLACE(text_value, 'fa fa-at','fas fa-at', 'g') WHERE text_value LIKE '%fa fa-at%';
+UPDATE 74
+dspace=# COMMIT;
+
+
$ export JAVA_OPTS="-Dfile.encoding=UTF-8 -Xmx512m"
+$ time chrt -b 0 ionice -c2 -n7 nice -n19 dspace index-discovery -b
+
+real    265m11.224s
+user    171m29.141s
+sys     2m41.097s
+
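To put the `time` figures from the Discovery re-index in perspective, here is a minimal sketch that converts them to seconds and compares CPU time with wall-clock time. The helper name `to_seconds` is hypothetical, and the sketch assumes the `NNNmSS.SSSs` format shown in the output above.

```python
import re


def to_seconds(stamp):
    """Convert a `time`-style duration such as '265m11.224s' to seconds.

    Hypothetical helper: it only handles the minutes-plus-seconds form
    that appears in the console output above.
    """
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised duration: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


real = to_seconds("265m11.224s")
# user + sys is the CPU time the process actually consumed
cpu = to_seconds("171m29.141s") + to_seconds("2m41.097s")
print(f"wall clock: {real / 3600:.2f} h, CPU share of wall clock: {cpu / real:.0%}")
```

A CPU share well below 100% would be consistent with the job running under `nice`, `ionice` and `chrt -b`, but that reading is an interpretation rather than something the timings prove.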
+

PostgreSQL connections all week +PostgreSQL locks all week

+ +
$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp?pretty'
+{
+  "acknowledged" : true
+}
+$ curl -s 'http://localhost:9200/openrxv-items-final/_count?q=*&pretty'
+{
+  "count" : 0,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
+  }
+}
+$ curl -s 'http://localhost:9200/openrxv-items-2020-12-14/_count?q=*&pretty'
+{
+  "count" : 99992,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
+  }
+}
+
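The pre-harvest checks above read the `count` out of each `_count` response by eye. A minimal sketch of the same check in code follows, assuming the response shape shown in the console output; the function name `doc_count` is hypothetical and nothing here talks to Elasticsearch.

```python
import json


def doc_count(response_text):
    """Return the document count from an Elasticsearch `_count` response.

    Hypothetical helper: raises if any shard failed, because a partial
    count would be misleading when deciding whether a backup is intact.
    """
    data = json.loads(response_text)
    shards = data["_shards"]
    if shards["failed"] or shards["successful"] != shards["total"]:
        raise ValueError(f"incomplete count: {shards}")
    return data["count"]


# Response body copied from the backup index check above
backup_response = """
{
  "count" : 99992,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }
}
"""
print(doc_count(backup_response))
```

In practice the response text would come from the same `curl -s` calls shown above, for example by piping their output into a script.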

2020-12-16

+ +
$ curl -s 'http://localhost:9200/openrxv-items-temp/_count?q=*&pretty'
+{
+  "count" : 100046,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
+  }
+}
+$ curl -X PUT "localhost:9200/openrxv-items-temp/_settings?pretty" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": true}}'
+$ curl -XDELETE 'http://localhost:9200/openrxv-items?pretty'
+$ curl -s -X POST "http://localhost:9200/openrxv-items-temp/_clone/openrxv-items?pretty"
+$ curl -s 'http://localhost:9200/openrxv-items/_count?q=*&pretty'
+{
+  "count" : 100046,
+  "_shards" : {
+    "total" : 1,
+    "successful" : 1,
+    "skipped" : 0,
+    "failed" : 0
+  }
+}
+$ curl -X PUT "localhost:9200/openrxv-items-temp/_settings?pretty" -H 'Content-Type: application/json' -d'{"settings": {"index.blocks.write": false}}'
+$ curl -XDELETE 'http://localhost:9200/openrxv-items-temp?pretty'
+
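The order of the clone steps matters: the source index has to be write-blocked before `_clone`, and the target must not exist. The sketch below only builds the ordered request list, mirroring the `curl` commands above, and sends nothing. The function name `clone_plan` and the `base_url` default are assumptions for illustration.

```python
import json


def clone_plan(source, target, base_url="http://localhost:9200"):
    """Return the ordered (method, url, body) steps for replacing `target`
    with a clone of `source`. Hypothetical helper: it builds the plan only.
    """

    def write_block(flag):
        # Same settings payload as the curl -d argument above
        return json.dumps({"settings": {"index.blocks.write": flag}})

    return [
        ("GET", f"{base_url}/{source}/_count?q=*", None),        # count before
        ("PUT", f"{base_url}/{source}/_settings", write_block(True)),
        ("DELETE", f"{base_url}/{target}", None),                 # target must not exist
        ("POST", f"{base_url}/{source}/_clone/{target}", None),
        ("GET", f"{base_url}/{target}/_count?q=*", None),        # count after
        ("PUT", f"{base_url}/{source}/_settings", write_block(False)),
        ("DELETE", f"{base_url}/{source}", None),                 # drop the temp index
    ]


for method, url, body in clone_plan("openrxv-items-temp", "openrxv-items"):
    print(method, url, body or "")
```

Keeping the plan as data makes it easy to review the sequence, or to compare the two counts, before anything destructive is run.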
+ diff --git a/docs/2020/12/postgres_connections_ALL-week2.png b/docs/2020/12/postgres_connections_ALL-week2.png new file mode 100644 index 000000000..43e75793a Binary files /dev/null and b/docs/2020/12/postgres_connections_ALL-week2.png differ diff --git a/docs/2020/12/postgres_locks_ALL-week2.png b/docs/2020/12/postgres_locks_ALL-week2.png new file mode 100644 index 000000000..f3e257ca0 Binary files /dev/null and b/docs/2020/12/postgres_locks_ALL-week2.png differ diff --git a/docs/categories/index.html b/docs/categories/index.html index dc5fd8090..b845c9af9 100644 --- a/docs/categories/index.html +++ b/docs/categories/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/index.html b/docs/categories/notes/index.html index 13f16b0e8..ed4f9866f 100644 --- a/docs/categories/notes/index.html +++ b/docs/categories/notes/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/2/index.html b/docs/categories/notes/page/2/index.html index ad9f0bf3f..4e54f4427 100644 --- a/docs/categories/notes/page/2/index.html +++ b/docs/categories/notes/page/2/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/3/index.html b/docs/categories/notes/page/3/index.html index c30ef4942..237faf052 100644 --- a/docs/categories/notes/page/3/index.html +++ b/docs/categories/notes/page/3/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/4/index.html b/docs/categories/notes/page/4/index.html index 84acd820d..53990d757 100644 --- a/docs/categories/notes/page/4/index.html +++ b/docs/categories/notes/page/4/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/categories/notes/page/5/index.html b/docs/categories/notes/page/5/index.html index 13ff454c8..50401e9d4 100644 --- a/docs/categories/notes/page/5/index.html +++ b/docs/categories/notes/page/5/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/index.html b/docs/index.html index b0927d9b1..10d083385 100644 --- a/docs/index.html +++ b/docs/index.html @@ -10,7 +10,7 @@ - + diff 
--git a/docs/page/2/index.html b/docs/page/2/index.html index db2a86f3d..5b43ca620 100644 --- a/docs/page/2/index.html +++ b/docs/page/2/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/3/index.html b/docs/page/3/index.html index 1f1cc8983..fc7bc8b71 100644 --- a/docs/page/3/index.html +++ b/docs/page/3/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/4/index.html b/docs/page/4/index.html index dba8579df..1555a7df4 100644 --- a/docs/page/4/index.html +++ b/docs/page/4/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/5/index.html b/docs/page/5/index.html index 362ba51fd..d38f6c48d 100644 --- a/docs/page/5/index.html +++ b/docs/page/5/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/6/index.html b/docs/page/6/index.html index 668449f25..c2d8e7909 100644 --- a/docs/page/6/index.html +++ b/docs/page/6/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/page/7/index.html b/docs/page/7/index.html index 5f74603cf..0d6b69f7d 100644 --- a/docs/page/7/index.html +++ b/docs/page/7/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/index.html b/docs/posts/index.html index 9447796f0..8f7ddca68 100644 --- a/docs/posts/index.html +++ b/docs/posts/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/2/index.html b/docs/posts/page/2/index.html index 86df01283..1993dc377 100644 --- a/docs/posts/page/2/index.html +++ b/docs/posts/page/2/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/3/index.html b/docs/posts/page/3/index.html index 396756935..645dc1684 100644 --- a/docs/posts/page/3/index.html +++ b/docs/posts/page/3/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/4/index.html b/docs/posts/page/4/index.html index b2f8e56cd..e5cbbdd4d 100644 --- a/docs/posts/page/4/index.html +++ b/docs/posts/page/4/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/5/index.html b/docs/posts/page/5/index.html index 73c23112c..3ca1553ab 100644 --- a/docs/posts/page/5/index.html +++ b/docs/posts/page/5/index.html @@ -10,7 
+10,7 @@ - + diff --git a/docs/posts/page/6/index.html b/docs/posts/page/6/index.html index 452eb9da4..fc79ef858 100644 --- a/docs/posts/page/6/index.html +++ b/docs/posts/page/6/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/posts/page/7/index.html b/docs/posts/page/7/index.html index aeca6cfc4..2be119d9e 100644 --- a/docs/posts/page/7/index.html +++ b/docs/posts/page/7/index.html @@ -10,7 +10,7 @@ - + diff --git a/docs/sitemap.xml b/docs/sitemap.xml index 81b5da041..0be7e190b 100644 --- a/docs/sitemap.xml +++ b/docs/sitemap.xml @@ -4,27 +4,27 @@ https://alanorth.github.io/cgspace-notes/categories/ - 2020-12-13T16:16:10+02:00 + 2020-12-15T13:13:18+02:00 https://alanorth.github.io/cgspace-notes/ - 2020-12-13T16:16:10+02:00 + 2020-12-15T13:13:18+02:00 https://alanorth.github.io/cgspace-notes/2020-12/ - 2020-12-13T16:16:10+02:00 + 2020-12-15T13:13:18+02:00 https://alanorth.github.io/cgspace-notes/categories/notes/ - 2020-12-13T16:16:10+02:00 + 2020-12-15T13:13:18+02:00 https://alanorth.github.io/cgspace-notes/posts/ - 2020-12-13T16:16:10+02:00 + 2020-12-15T13:13:18+02:00 diff --git a/static/2020/12/postgres_connections_ALL-week2.png b/static/2020/12/postgres_connections_ALL-week2.png new file mode 100644 index 000000000..43e75793a Binary files /dev/null and b/static/2020/12/postgres_connections_ALL-week2.png differ diff --git a/static/2020/12/postgres_locks_ALL-week2.png b/static/2020/12/postgres_locks_ALL-week2.png new file mode 100644 index 000000000..f3e257ca0 Binary files /dev/null and b/static/2020/12/postgres_locks_ALL-week2.png differ