Add notes for 2017-07-31

2017-08-01 08:55:37 +03:00
parent 2efbe26be9
commit ff336ce2ba
3 changed files with 25 additions and 8 deletions


@@ -27,7 +27,7 @@ We can use PostgreSQL’s extended output format (-x) plus sed to format the
<meta property="article:published_time" content="2017-07-01T18:03:52&#43;03:00"/>
<meta property="article:modified_time" content="2017-07-30T14:18:23&#43;03:00"/>
<meta property="article:modified_time" content="2017-07-31T12:06:21&#43;03:00"/>
@@ -73,9 +73,9 @@ We can use PostgreSQL&rsquo;s extended output format (-x) plus sed to format the
"@type": "BlogPosting",
"headline": "July, 2017",
"url": "https://alanorth.github.io/cgspace-notes/2017-07/",
"wordCount": "1086",
"wordCount": "1151",
"datePublished": "2017-07-01T18:03:52&#43;03:00",
"dateModified": "2017-07-30T14:18:23&#43;03:00",
"dateModified": "2017-07-31T12:06:21&#43;03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
@@ -313,6 +313,15 @@ delete from metadatavalue where resource_type_id=2 and metadata_field_id=235 and
<ul>
<li>Now just waiting to run them on CGSpace, and then apply the modified input forms after Macaroni Bros give me an updated list</li>
<li>Temporarily increase the nginx upload limit to 200MB for Sisay to upload the CIAT presentations</li>
<li>Looking at the CGSpace activity page, there are 52 Baidu bots concurrently crawling our website (I copied the activity page to a text file and grepped it)!</li>
</ul>
<pre><code>$ grep 180.76. /tmp/status | awk '{print $5}' | sort | uniq | wc -l
52
</code></pre>
<ul>
<li>From looking at the <code>dspace.log</code> I see they are all using the same session, which means our Crawler Session Manager Valve is working</li>
</ul>
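<p>The unique-address count above can also be sketched as a small reusable shell function. This is only an illustration, not the command that was actually run: the function name <code>count_unique_clients</code> is hypothetical, and it assumes, as the <code>awk</code> call above does, that the client address is the fifth whitespace-separated field of the saved activity page. Unlike the plain <code>grep</code>, it matches the prefix literally (so the dots are not treated as regex wildcards) and only at the start of that field.</p>

```shell
#!/bin/sh
# Hypothetical helper: count distinct client addresses that start with a
# given literal prefix in a saved copy of the activity page.
# Assumption: the address is the fifth whitespace-separated field.
count_unique_clients() {
    prefix="$1"   # literal address prefix, e.g. 180.76.
    file="$2"     # saved activity page, e.g. /tmp/status
    # index() == 1 is a literal "starts with" test, so "." matches only a dot
    awk -v p="$prefix" 'index($5, p) == 1 { print $5 }' "$file" | sort -u | wc -l
}
```

<p>It would be called as <code>count_unique_clients 180.76. /tmp/status</code>. If the result differs from the <code>grep</code> pipeline, the likely cause is lines where the pattern matched somewhere other than the start of the fifth field.</p>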