mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2024-11-22 14:45:03 +01:00
Update notes for 2018-07-18
parent b17330f157
commit c451b22f2c
@ -393,5 +393,42 @@ $ tidy -xml -utf8 -iq -m -w 0 dspace/config/controlled-vocabularies/cg-creator-i
- Participate in call with IWMI and WLE to discuss Altmetric, CGSpace, and social media
- I told them that they should include the Handle link in their social media shares because that's the only way to get Altmetric to notice them and associate them with their DOIs
- I suggested that we should have a wider meeting about this, and that I would post that on Yammer
- I was curious about how and when Altmetric harvests the OAI, so I looked in nginx's OAI log
- For each day in the past week I see only about 50 to 100 requests, but about nine days ago I see 1500 requests
- In there I see two bots making about 750 requests each, and this one is probably Altmetric:

```
178.33.237.157 - - [09/Jul/2018:17:00:46 +0000] "GET /oai/request?verb=ListRecords&resumptionToken=oai_dc////100 HTTP/1.1" 200 58653 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_121)"
178.33.237.157 - - [09/Jul/2018:17:01:11 +0000] "GET /oai/request?verb=ListRecords&resumptionToken=oai_dc////200 HTTP/1.1" 200 67950 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_121)"
...
178.33.237.157 - - [09/Jul/2018:22:10:39 +0000] "GET /oai/request?verb=ListRecords&resumptionToken=oai_dc////73900 HTTP/1.1" 200 25049 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_121)"
```
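
- A minimal sketch of how log lines like the ones above could be tallied, not the method actually used for these notes. It assumes nginx's "combined" log format (which the excerpt appears to follow) and that the number at the end of the `resumptionToken` is a record offset; `SAMPLE_LOG`, `tally`, and `LAST_OFFSET` are hypothetical names for illustration:

```python
import re
from collections import Counter

# Two lines copied from the log excerpt above. A real run would read the
# nginx OAI access log instead (its path is not given in these notes).
SAMPLE_LOG = """\
178.33.237.157 - - [09/Jul/2018:17:00:46 +0000] "GET /oai/request?verb=ListRecords&resumptionToken=oai_dc////100 HTTP/1.1" 200 58653 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_121)"
178.33.237.157 - - [09/Jul/2018:17:01:11 +0000] "GET /oai/request?verb=ListRecords&resumptionToken=oai_dc////200 HTTP/1.1" 200 67950 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_121)"
"""

# Assumption: nginx "combined" log format.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-) '
    r'"[^"]*" "(?P<agent>[^"]*)"$'
)
# Assumption: the trailing number in the token is a record offset.
TOKEN_RE = re.compile(r"resumptionToken=oai_dc////(\d+)")


def tally(lines):
    """Count requests per day and per (IP, user agent), and collect offsets."""
    per_day, per_client, offsets = Counter(), Counter(), []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip anything that is not a combined-format entry
        per_day[match["day"]] += 1
        per_client[(match["ip"], match["agent"])] += 1
        token = TOKEN_RE.search(match["request"])
        if token:
            offsets.append(int(token.group(1)))
    return per_day, per_client, offsets


per_day, per_client, offsets = tally(SAMPLE_LOG.splitlines())

# Step between consecutive offsets, i.e. records returned per request.
page_size = offsets[1] - offsets[0]

# Offset in the final log line of the excerpt above.
LAST_OFFSET = 73900
pages = LAST_OFFSET // page_size

for day, count in per_day.items():
    print(day, count)
for (ip, agent), count in per_client.most_common():
    print(count, ip, agent)
print("records per request:", page_size, "requests to reach last offset:", pages)
```

- Keying the per-client counter on both IP and user agent keeps two bots apart even if they share an address or a generic client library string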

- So at 100 records per OAI request, reaching offset 73900 would take them 739 requests
- I wonder if I should add this user agent to the Tomcat Crawler Session Manager valve... does OAI use Tomcat sessions?
- Appears not, since the response has no `Set-Cookie` header:

```
$ http --print Hh 'https://cgspace.cgiar.org/oai/request?verb=ListRecords&resumptionToken=oai_dc////100'
GET /oai/request?verb=ListRecords&resumptionToken=oai_dc////100 HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: cgspace.cgiar.org
User-Agent: HTTPie/0.9.9

HTTP/1.1 200 OK
Connection: keep-alive
Content-Encoding: gzip
Content-Type: application/xml;charset=UTF-8
Date: Wed, 18 Jul 2018 14:46:37 GMT
Server: nginx
Strict-Transport-Security: max-age=15768000
Transfer-Encoding: chunked
Vary: Accept-Encoding
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
```
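
- A small sketch of the check the headers above allow, assuming Tomcat's default session cookie name `JSESSIONID` has not been changed; `sets_tomcat_session` and the `WITH_SESSION` response are hypothetical and only there for contrast:

```python
def sets_tomcat_session(raw_headers: str) -> bool:
    """Return True if any Set-Cookie response header carries JSESSIONID."""
    for line in raw_headers.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # status line or blank line, not a header
        if name.strip().lower() == "set-cookie" and "jsessionid=" in value.lower():
            return True
    return False


# Response headers copied from the HTTPie output above.
OAI_RESPONSE = """\
HTTP/1.1 200 OK
Connection: keep-alive
Content-Encoding: gzip
Content-Type: application/xml;charset=UTF-8
Date: Wed, 18 Jul 2018 14:46:37 GMT
Server: nginx
Strict-Transport-Security: max-age=15768000
Transfer-Encoding: chunked
Vary: Accept-Encoding
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
"""

# Hypothetical response, invented here to show what a session would look like.
WITH_SESSION = (
    "HTTP/1.1 200 OK\n"
    "Set-Cookie: JSESSIONID=0123456789ABCDEF; Path=/; Secure; HttpOnly\n"
)

print(sets_tomcat_session(OAI_RESPONSE))
print(sets_tomcat_session(WITH_SESSION))
```

- A single request without a session cookie only suggests that OAI does not create sessions; it is not proof for every verb or client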

<!-- vim: set sw=2 ts=2: -->

@ -30,7 +30,7 @@ There is insufficient memory for the Java Runtime Environment to continue.

<meta property="article:published_time" content="2018-07-01T12:56:54+03:00"/>

<meta property="article:modified_time" content="2018-07-18T13:16:53+03:00"/>
<meta property="article:modified_time" content="2018-07-18T13:25:02+03:00"/>

@ -71,9 +71,9 @@ There is insufficient memory for the Java Runtime Environment to continue.
"@type": "BlogPosting",
"headline": "July, 2018",
"url": "https://alanorth.github.io/cgspace-notes/2018-07/",
"wordCount": "2704",
"wordCount": "2896",
"datePublished": "2018-07-01T12:56:54+03:00",
"dateModified": "2018-07-18T13:16:53+03:00",
"dateModified": "2018-07-18T13:25:02+03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@ -582,8 +582,45 @@ $ ./resolve-orcids.py -i /tmp/2018-07-15-orcid-ids.txt -o /tmp/2018-07-15-resolv
<li>Participate in call with IWMI and WLE to discuss Altmetric, CGSpace, and social media</li>
<li>I told them that they should include the Handle link in their social media shares because that’s the only way to get Altmetric to notice them and associate them with their DOIs</li>
<li>I suggested that we should have a wider meeting about this, and that I would post that on Yammer</li>
<li>I was curious about how and when Altmetric harvests the OAI, so I looked in nginx’s OAI log</li>
<li>For each day in the past week I see only about 50 to 100 requests, but about nine days ago I see 1500 requests</li>
<li>In there I see two bots making about 750 requests each, and this one is probably Altmetric:</li>
</ul>

<pre><code>178.33.237.157 - - [09/Jul/2018:17:00:46 +0000] "GET /oai/request?verb=ListRecords&resumptionToken=oai_dc////100 HTTP/1.1" 200 58653 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_121)"
178.33.237.157 - - [09/Jul/2018:17:01:11 +0000] "GET /oai/request?verb=ListRecords&resumptionToken=oai_dc////200 HTTP/1.1" 200 67950 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_121)"
...
178.33.237.157 - - [09/Jul/2018:22:10:39 +0000] "GET /oai/request?verb=ListRecords&resumptionToken=oai_dc////73900 HTTP/1.1" 200 25049 "-" "Apache-HttpClient/4.5.2 (Java/1.8.0_121)"
</code></pre>

<ul>
<li>So at 100 records per OAI request, reaching offset 73900 would take them 739 requests</li>
<li>I wonder if I should add this user agent to the Tomcat Crawler Session Manager valve… does OAI use Tomcat sessions?</li>
<li>Appears not, since the response has no <code>Set-Cookie</code> header:</li>
</ul>

<pre><code>$ http --print Hh 'https://cgspace.cgiar.org/oai/request?verb=ListRecords&resumptionToken=oai_dc////100'
GET /oai/request?verb=ListRecords&resumptionToken=oai_dc////100 HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: cgspace.cgiar.org
User-Agent: HTTPie/0.9.9

HTTP/1.1 200 OK
Connection: keep-alive
Content-Encoding: gzip
Content-Type: application/xml;charset=UTF-8
Date: Wed, 18 Jul 2018 14:46:37 GMT
Server: nginx
Strict-Transport-Security: max-age=15768000
Transfer-Encoding: chunked
Vary: Accept-Encoding
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
</code></pre>

<!-- vim: set sw=2 ts=2: -->

@ -4,7 +4,7 @@

<url>
<loc>https://alanorth.github.io/cgspace-notes/2018-07/</loc>
<lastmod>2018-07-18T13:16:53+03:00</lastmod>
<lastmod>2018-07-18T13:25:02+03:00</lastmod>
</url>

<url>
@ -174,7 +174,7 @@

<url>
<loc>https://alanorth.github.io/cgspace-notes/</loc>
<lastmod>2018-07-18T13:16:53+03:00</lastmod>
<lastmod>2018-07-18T13:25:02+03:00</lastmod>
<priority>0</priority>
</url>

@ -185,7 +185,7 @@

<url>
<loc>https://alanorth.github.io/cgspace-notes/tags/notes/</loc>
<lastmod>2018-07-18T13:16:53+03:00</lastmod>
<lastmod>2018-07-18T13:25:02+03:00</lastmod>
<priority>0</priority>
</url>

@ -197,13 +197,13 @@

<url>
<loc>https://alanorth.github.io/cgspace-notes/posts/</loc>
<lastmod>2018-07-18T13:16:53+03:00</lastmod>
<lastmod>2018-07-18T13:25:02+03:00</lastmod>
<priority>0</priority>
</url>

<url>
<loc>https://alanorth.github.io/cgspace-notes/tags/</loc>
<lastmod>2018-07-18T13:16:53+03:00</lastmod>
<lastmod>2018-07-18T13:25:02+03:00</lastmod>
<priority>0</priority>
</url>
