From 935ee71f855c557fc2cc8b22efd0899c0043908e Mon Sep 17 00:00:00 2001 From: Alan Orth Date: Tue, 19 Nov 2019 17:17:28 +0200 Subject: [PATCH] Add notes for 2019-11-19 --- content/posts/2019-11.md | 28 +++++++++ docs/2019-11/index.html | 73 ++++++++++++++-------- docs/cgspace-cgcorev2-migration/index.html | 4 +- docs/sitemap.xml | 16 ++--- 4 files changed, 85 insertions(+), 36 deletions(-) diff --git a/content/posts/2019-11.md b/content/posts/2019-11.md index 9b9631276..be28c35e2 100644 --- a/content/posts/2019-11.md +++ b/content/posts/2019-11.md @@ -373,5 +373,33 @@ Guzzle/ curl/ PHP/ - I tweeted the item twice last week and the score never got linked - Then I noticed that I had already made a note about the same issue in 2019-04, when I also tweeted it several times... - I will ask Altmetric support for help with that +- Finally deploy `5_x-cgcorev2` branch on DSpace Test + +## 2019-11-18 + +- I sent a mail to the CGSpace partners in Addis about the CG Core v2 changes on DSpace Test +- Then I filed an [issue on the CG Core GitHub](https://github.com/AgriculturalSemantics/cg-core/issues/11) to let the metadata people know about our progress +- It seems like I will do a session about CG Core v2 implementation and limitations in DSpace for the data workshop in December in Nairobi (?) 
+ +## 2019-11-19 + +- Export IITA's community from CGSpace because they want to experiment with importing it into their internal DSpace for some testing or something + - I had previously sent them an export in 2019-04 +- Atmire merged my [pull request regarding unnecessary escaping of dashes](https://github.com/atmire/COUNTER-Robots/pull/28) in regular expressions, as well as [my suggestion of adding "User-Agent" to the list of patterns](https://github.com/atmire/COUNTER-Robots/issues/27) +- I made another [pull request to fix invalid escaping of one of their new patterns](https://github.com/atmire/COUNTER-Robots/pull/29) +- I ran my `check-spider-hits.sh` script again with these new patterns and found a bunch more statistics requests that match, for example: + - Found 39560 hits from ^Buck\/[0-9] in statistics + - Found 5471 hits from ^User-Agent in statistics + - Found 2994 hits from ^Buck\/[0-9] in statistics-2018 + - Found 14076 hits from ^User-Agent in statistics-2018 + - Found 16310 hits from ^User-Agent in statistics-2017 + - Found 4429 hits from ^User-Agent in statistics-2016 +- Buck is one I've never heard of before, its user agent is: + +``` +Buck/2.2; (+https://app.hypefactors.com/media-monitoring/about.html) +``` + +- All in all that's about 85,000 more hits purged, in addition to the 3.4 million I purged last week diff --git a/docs/2019-11/index.html b/docs/2019-11/index.html index 7c0b3da9c..dc3f9565d 100644 --- a/docs/2019-11/index.html +++ b/docs/2019-11/index.html @@ -34,7 +34,7 @@ Let’s see how many of the REST API requests were for bitstreams (because t - + @@ -73,9 +73,9 @@ Let’s see how many of the REST API requests were for bitstreams (because t "@type": "BlogPosting", "headline": "November, 2019", "url": "https:\/\/alanorth.github.io\/cgspace-notes\/2019-11\/", - "wordCount": "2595", + "wordCount": "2866", "datePublished": "2019-11-04T12:20:30+02:00", - "dateModified": "2019-11-17T14:21:58+02:00", + "dateModified": 
"2019-11-17T15:39:10+02:00", "author": { "@type": "Person", "name": "Alan Orth" @@ -524,32 +524,53 @@ $ ./resolve-orcids.py -i /tmp/2019-11-14-combined-orcids.txt -o /tmp/2019-11-14- -

Guzzle/ curl/ PHP/ +

Guzzle/ curl/ PHP/

+ +

+- Run system updates on DSpace Test and reboot the server
+
+## 2019-11-17
+
+- Altmetric support responded about our dashboard question, asking if the second "department" (aka WLE's collection) was added recently and might not have been included in the last harvest yet
+  - I told her no, that the department is several years old, and the item was added in 2017
+  - Then I looked again at the dashboard for each department and I see the item in both departments now... shit.
+  - A [search in the IWMI department shows the item](https://www.altmetric.com/explorer/outputs?department_id%5B%5D=CGSpace%3Agroup%3Acom_10568_16814&q=Towards%20sustainable%20sanitation%20management)
+  - A [search in the WLE department shows the item](https://www.altmetric.com/explorer/outputs?department_id%5B%5D=CGSpace%3Agroup%3Acom_10568_34494&q=Towards%20sustainable%20sanitation%20management)
+- I finally decided to revert `cg.hasMetadata` back to `cg.identifier.dataurl` in my CG Core v2 branch (see [#10](https://github.com/AgriculturalSemantics/cg-core/issues/10))
+- Regarding the [WLE item](https://hdl.handle.net/10568/97087) that has a much lower score than its DOI...
+  - I tweeted the item twice last week and the score never got linked
+  - Then I noticed that I had already made a note about the same issue in 2019-04, when I also tweeted it several times...
+  - I will ask Altmetric support for help with that
+- Finally deploy `5_x-cgcorev2` branch on DSpace Test
+
+## 2019-11-18
+
+- I sent a mail to the CGSpace partners in Addis about the CG Core v2 changes on DSpace Test
+- Then I filed an [issue on the CG Core GitHub](https://github.com/AgriculturalSemantics/cg-core/issues/11) to let the metadata people know about our progress
+- It seems like I will do a session about CG Core v2 implementation and limitations in DSpace for the data workshop in December in Nairobi (?)
+
+## 2019-11-19
+
+- Export IITA's community from CGSpace because they want to experiment with importing it into their internal DSpace for testing
+  - I had previously sent them an export in 2019-04
+- Atmire merged my [pull request regarding unnecessary escaping of dashes](https://github.com/atmire/COUNTER-Robots/pull/28) in regular expressions, as well as [my suggestion of adding "User-Agent" to the list of patterns](https://github.com/atmire/COUNTER-Robots/issues/27)
+- I made another [pull request to fix invalid escaping of one of their new patterns](https://github.com/atmire/COUNTER-Robots/pull/29)
+- I ran my `check-spider-hits.sh` script again with these new patterns and found a bunch more statistics requests that match, for example:
+  - Found 39560 hits from ^Buck\/[0-9] in statistics
+  - Found 5471 hits from ^User-Agent in statistics
+  - Found 2994 hits from ^Buck\/[0-9] in statistics-2018
+  - Found 14076 hits from ^User-Agent in statistics-2018
+  - Found 16310 hits from ^User-Agent in statistics-2017
+  - Found 4429 hits from ^User-Agent in statistics-2016
+- Buck is one I've never heard of before; its user agent is:
+
+```
+Buck/2.2; (+https://app.hypefactors.com/media-monitoring/about.html)
+```

+- All in all that's about 85,000 more hits purged, in addition to the 3.4 million I purged last week
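The pattern matching described in the notes for 2019-11-19 can be sketched in Python. This is a minimal illustration, not the actual `check-spider-hits.sh` script: the function name `count_pattern_hits` and the sample list are hypothetical, and only the two patterns, the Buck user agent string, and the hit counts come from the notes. The counts listed "for example" add up to 82,840, which is consistent with the rounded figure of about 85,000.

```python
import re
from collections import Counter

# Patterns quoted in the notes (regular expressions in the COUNTER-Robots style).
PATTERNS = [r"^Buck\/[0-9]", r"^User-Agent"]

# Hit counts listed in the notes, in the order they appear.
LISTED_HITS = [39560, 5471, 2994, 14076, 16310, 4429]


def count_pattern_hits(user_agents, patterns=PATTERNS):
    """Return a Counter mapping each pattern to the number of matching user agents."""
    compiled = [(pattern, re.compile(pattern)) for pattern in patterns]
    hits = Counter()
    for agent in user_agents:
        for pattern, regex in compiled:
            if regex.search(agent):
                hits[pattern] += 1
    return hits


# Hypothetical sample: only the first string is taken from the notes.
SAMPLE_AGENTS = [
    "Buck/2.2; (+https://app.hypefactors.com/media-monitoring/about.html)",
    "User-Agent: Mozilla/5.0",
    "Mozilla/5.0 (X11; Linux x86_64)",
]

if __name__ == "__main__":
    for pattern, count in count_pattern_hits(SAMPLE_AGENTS).items():
        print(f"Found {count} hits from {pattern} in sample")
    print(f"Sum of the listed hit counts: {sum(LISTED_HITS)}")
```

Both patterns are anchored with `^`, so a user agent that merely contains "Buck" or "User-Agent" somewhere in the middle would not be counted.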
diff --git a/docs/cgspace-cgcorev2-migration/index.html b/docs/cgspace-cgcorev2-migration/index.html index 4b0151c9e..bab02f77a 100644 --- a/docs/cgspace-cgcorev2-migration/index.html +++ b/docs/cgspace-cgcorev2-migration/index.html @@ -10,7 +10,7 @@ - + @@ -27,7 +27,7 @@ "url": "https:\/\/alanorth.github.io\/cgspace-notes\/cgspace-cgcorev2-migration\/", "wordCount": "546", "datePublished": "2019-10-28T13:27:35+02:00", - "dateModified": "2019-11-17T14:21:20+02:00", + "dateModified": "2019-11-17T15:39:10+02:00", "author": { "@type": "Person", "name": "Alan Orth" diff --git a/docs/sitemap.xml b/docs/sitemap.xml index 37148d49b..cea56644a 100644 --- a/docs/sitemap.xml +++ b/docs/sitemap.xml @@ -4,42 +4,42 @@ https://alanorth.github.io/cgspace-notes/categories/ - 2019-11-17T14:21:58+02:00 + 2019-11-17T15:39:10+02:00 https://alanorth.github.io/cgspace-notes/ - 2019-11-17T14:21:58+02:00 + 2019-11-17T15:39:10+02:00 https://alanorth.github.io/cgspace-notes/categories/notes/ - 2019-11-17T14:21:58+02:00 + 2019-11-17T15:39:10+02:00 https://alanorth.github.io/cgspace-notes/2019-11/ - 2019-11-17T14:21:58+02:00 + 2019-11-17T15:39:10+02:00 https://alanorth.github.io/cgspace-notes/posts/ - 2019-11-17T14:21:58+02:00 + 2019-11-17T15:39:10+02:00 https://alanorth.github.io/cgspace-notes/cgspace-cgcorev2-migration/ - 2019-11-17T14:21:20+02:00 + 2019-11-17T15:39:10+02:00 https://alanorth.github.io/cgspace-notes/tags/migration/ - 2019-11-17T14:21:20+02:00 + 2019-11-17T15:39:10+02:00 https://alanorth.github.io/cgspace-notes/tags/ - 2019-11-17T14:21:20+02:00 + 2019-11-17T15:39:10+02:00