diff --git a/content/posts/2019-03.md b/content/posts/2019-03.md
index 43ccdfa91..2af39a221 100644
--- a/content/posts/2019-03.md
+++ b/content/posts/2019-03.md
@@ -485,4 +485,19 @@ $ xzgrep 'Can not load requested doc' cocoon.log.2019-03-08.xz | grep -oE '2019-
 - I'm not sure if it's cocoon or that's just a symptom of something else
+## 2019-03-19
+
+- I found a handful of AGROVOC subjects that use a non-breaking space (0x00a0) instead of a regular space, which makes for a pretty confusing debugging...
+- I will replace these in the database immediately to save myself the headache later:
+
+```
+dspace=# SELECT count(text_value) FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id = 57 AND text_value ~ '.+\u00a0.+';
+ count
+-------
+    84
+(1 row)
+```
+
+- Perhaps my `agrovoc-lookup.py` script could notify if it finds these because they potentially give false negatives
+
diff --git a/docs/2019-03/index.html b/docs/2019-03/index.html
index 6f1c8b33b..4050cfc52 100644
--- a/docs/2019-03/index.html
+++ b/docs/2019-03/index.html
@@ -25,7 +25,7 @@ I think I will need to ask Udana to re-copy and paste the abstracts with more ca
-
+
@@ -55,9 +55,9 @@ I think I will need to ask Udana to re-copy and paste the abstracts with more ca
  "@type": "BlogPosting",
  "headline": "March, 2019",
  "url": "https://alanorth.github.io/cgspace-notes/2019-03/",
- "wordCount": "2973",
+ "wordCount": "3049",
  "datePublished": "2019-03-01T12:16:30+01:00",
- "dateModified": "2019-03-18T15:32:22+02:00",
+ "dateModified": "2019-03-18T21:55:08+02:00",
  "author": {
  "@type": "Person",
  "name": "Alan Orth"
@@ -686,6 +686,24 @@ $ psql -c 'select * from pg_stat_activity' | grep -o -E '(dspaceWeb|dspaceApi|ds
+dspace=# SELECT count(text_value) FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id = 57 AND text_value ~ '.+\u00a0.+';
+ count
+-------
+ 84
+(1 row)
+
+
+Perhaps my agrovoc-lookup.py script could notify if it finds these because they potentially give false negatives
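
The note about having `agrovoc-lookup.py` flag these values could look roughly like the sketch below. This is not taken from the actual script: the function names `find_nbsp_terms` and `normalize_term` and the sample subjects are hypothetical. The check is also slightly broader than the SQL regex above, since it matches a non-breaking space at the start or end of a term too.

```python
# Hypothetical helpers; not part of the real agrovoc-lookup.py.
NBSP = "\u00a0"  # non-breaking space, the character found in the subjects


def find_nbsp_terms(terms):
    """Return the terms that contain at least one non-breaking space."""
    return [term for term in terms if NBSP in term]


def normalize_term(term):
    """Replace every non-breaking space with a regular space."""
    return term.replace(NBSP, " ")


if __name__ == "__main__":
    # Made-up sample subjects for illustration only.
    subjects = ["SOIL FERTILITY", "CLIMATE\u00a0CHANGE", "FOOD SECURITY"]
    for term in find_nbsp_terms(subjects):
        print(f"WARNING: non-breaking space in {term!r}, "
              f"would look up {normalize_term(term)!r} instead")
```

Warning before the lookup, rather than silently normalizing, keeps the bad values visible so they can still be fixed in the database.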