Update notes for 2017-08-01

This commit is contained in:
2017-08-01 12:03:37 +03:00
parent e3e602881e
commit 5b11434f0f
30 changed files with 91 additions and 71 deletions

View File

@ -119,6 +119,8 @@
</ul></li>
<li>The <code>robots.txt</code> only blocks the top-level <code>/discover</code> and <code>/browse</code> URLs&hellip; we will need to find a way to forbid them from accessing these!</li>
<li>Relevant issue from DSpace Jira (semi resolved in DSpace 6.0): <a href="https://jira.duraspace.org/browse/DS-2962">https://jira.duraspace.org/browse/DS-2962</a></li>
<li>It turns out that we&rsquo;re already adding the <code>X-Robots-Tag &quot;none&quot;</code> HTTP header, but this only forbids the search engine from <em>indexing</em> the page, not crawling it!</li>
<li>Also, the bot has to successfully browse the page first so it can receive the HTTP header&hellip;</li>
</ul>
<p></p>

View File

@ -32,6 +32,8 @@
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;robots.txt&lt;/code&gt; only blocks the top-level &lt;code&gt;/discover&lt;/code&gt; and &lt;code&gt;/browse&lt;/code&gt; URLs&amp;hellip; we will need to find a way to forbid them from accessing these!&lt;/li&gt;
&lt;li&gt;Relevant issue from DSpace Jira (semi resolved in DSpace 6.0): &lt;a href=&#34;https://jira.duraspace.org/browse/DS-2962&#34;&gt;https://jira.duraspace.org/browse/DS-2962&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;It turns out that we&amp;rsquo;re already adding the &lt;code&gt;X-Robots-Tag &amp;quot;none&amp;quot;&lt;/code&gt; HTTP header, but this only forbids the search engine from &lt;em&gt;indexing&lt;/em&gt; the page, not crawling it!&lt;/li&gt;
&lt;li&gt;Also, the bot has to successfully browse the page first so it can receive the HTTP header&amp;hellip;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;/p&gt;</description>