diff --git a/content/posts/2018-05.md b/content/posts/2018-05.md
index 55b05d179..b7cc94c8e 100644
--- a/content/posts/2018-05.md
+++ b/content/posts/2018-05.md
@@ -272,3 +272,10 @@ ga('send', 'pageview', {
 - I tested loading a certain page before and after adding this and afterwards I saw that the parameter `aip=1` was being sent with the analytics response to Google
 - According to the [analytics.js protocol parameter documentation](https://developers.google.com/analytics/devguides/collection/analyticsjs/field-reference#anonymizeIp) this means that IPs are being anonymized
 - After finding and fixing some duplicates in IITA's `IITA_April_27` test collection on DSpace Test (10568/92703) I told Sisay that he can move them to IITA's Journal Articles collection on CGSpace
+
+## 2018-05-17
+
+- Testing reconciliation of countries against Solr via conciliator, I notice that `CÔTE D'IVOIRE` doesn't match `COTE D'IVOIRE`, whereas with reconcile-csv it does
+- Also, when reconciling regions against Solr via conciliator `EASTERN AFRICA` doesn't match `EAST AFRICA`, whereas with reconcile-csv it does
+- And `SOUTH AMERICA` matches both `SOUTH ASIA` and `SOUTH AMERICA` with the same match score of 2... WTF.
+- It could be that I just need to tune the index or query filters in Solr (currently using the example `text_en` field type)
diff --git a/docs/2018-05/index.html b/docs/2018-05/index.html
index c0ea0c081..b09ca0057 100644
--- a/docs/2018-05/index.html
+++ b/docs/2018-05/index.html
@@ -27,7 +27,7 @@ Also, I switched it to use OpenJDK instead of Oracle Java, as well as re-worked
-
+
@@ -65,9 +65,9 @@ Also, I switched it to use OpenJDK instead of Oracle Java, as well as re-worked
   "@type": "BlogPosting",
   "headline": "May, 2018",
   "url": "https://alanorth.github.io/cgspace-notes/2018-05/",
-  "wordCount": "2081",
+  "wordCount": "2164",
   "datePublished": "2018-05-01T16:43:54+03:00",
-  "dateModified": "2018-05-16T14:17:54+03:00",
+  "dateModified": "2018-05-17T09:45:45+03:00",
   "author": {
     "@type": "Person",
     "name": "Alan Orth"
@@ -401,8 +401,8 @@ return "blank"
$ ./bin/solr start
$ ./bin/solr create_core -c countries
-$ ./bin/post -c countries ~/src/git/DSpace/2018-05-10-countries.csv
$ curl -X POST -H 'Content-type:application/json' --data-binary '{"add-field": {"name":"country", "type":"text_en", "multiValued":false, "stored":true}}' http://localhost:8983/solr/countries/schema
+$ ./bin/post -c countries ~/src/git/DSpace/2018-05-10-countries.csv
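Note on the 2018-05-17 entry: the `CÔTE D'IVOIRE` versus `COTE D'IVOIRE` mismatch looks like an accent-folding problem, which the stock `text_en` analysis chain may not handle. A minimal sketch of one possible tuning, using the same Solr Schema API endpoint as the `add-field` call in this diff, is below. The field type name `text_folded`, the `SOLR_CORE_URL` variable, and the `post_schema` helper are hypothetical names introduced here for illustration, and the filter chain is an assumption rather than a tested fix. It only targets the accent case; it is not expected to make `EASTERN AFRICA` match `EAST AFRICA`.

```shell
# Assumed core URL, matching the "countries" core created earlier in this diff.
SOLR_CORE_URL="${SOLR_CORE_URL:-http://localhost:8983/solr/countries}"

# Hypothetical field type: standard tokenizer, lowercasing, then ASCII folding
# so that accented characters such as "Ô" are indexed and queried as "O".
FIELD_TYPE_JSON='{"add-field-type": {
  "name": "text_folded",
  "class": "solr.TextField",
  "positionIncrementGap": "100",
  "analyzer": {
    "tokenizer": {"class": "solr.StandardTokenizerFactory"},
    "filters": [
      {"class": "solr.LowerCaseFilterFactory"},
      {"class": "solr.ASCIIFoldingFilterFactory"}
    ]
  }
}}'

# Switch the existing "country" field from text_en to the hypothetical type.
REPLACE_FIELD_JSON='{"replace-field": {"name":"country", "type":"text_folded", "multiValued":false, "stored":true}}'

# Helper (hypothetical): send one Schema API command to the core.
post_schema() {
  curl -X POST -H 'Content-type:application/json' --data-binary "$1" "$SOLR_CORE_URL/schema"
}
```

To try it against a running Solr, call `post_schema "$FIELD_TYPE_JSON"` followed by `post_schema "$REPLACE_FIELD_JSON"`, then re-post the CSV with `./bin/post`, since documents indexed under the old analysis chain are not re-analyzed automatically.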