Update notes for 2016-03-21

Signed-off-by: Alan Orth <alan.orth@gmail.com>
2016-03-22 10:42:18 +02:00
parent c613704234
commit 8c9dc9e310
6 changed files with 18 additions and 0 deletions

@@ -238,6 +238,11 @@
<li>I will mark these errors as resolved because they have been returning HTTP 403 on purpose for a long time!</li>
<li>Google says the first time it saw this particular error was September 29, 2015&hellip; so maybe it accidentally saw it somehow&hellip;</li>
<li>On a related note, we have 51,000 items indexed from the sitemap, but 500,000 items in the Google index, so we DEFINITELY have a problem with duplicate content</li>
</ul>
<p><img src="../images/2016/03/google-index.png" alt="CGSpace pages in Google index" /></p>
<ul>
<li>Turns out this is a problem with DSpace&rsquo;s <code>robots.txt</code>, and there has been a Jira ticket open for it since December 2015: <a href="https://jira.duraspace.org/browse/DS-2962">https://jira.duraspace.org/browse/DS-2962</a></li>
<li>I am not sure yet if I want to apply the change from that ticket</li>
<li>For now I&rsquo;ve just set a bunch of these dynamic pages to not appear in search results by using the URL Parameters tool in Webmaster Tools</li>