Add notes for 2016-11-17

Alan Orth 2016-11-17 15:59:59 +02:00
parent e171398149
commit 88a52106b9
Signed by: alanorth
GPG Key ID: 0FB860CC9C45B1B9
25 changed files with 249 additions and 3 deletions

View File

@ -319,3 +319,39 @@ $ mvn -U -Dmirage2.on=true -Dmirage2.deps.included=false -Denv=localhost -P \!ds
```
- We absolutely don't use those modules, so we shouldn't build them in the first place
## 2016-11-17
- Generate a list of journal titles for Peter and Abenet to look through so we can make a controlled vocabulary out of them:
```
dspace=# \copy (select distinct text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=55 group by text_value order by count desc) to /tmp/journal-titles.csv with csv;
COPY 2515
```
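- For anyone who wants to check the counts outside of psql, here is a minimal Python sketch of the same "distinct value, count, most frequent first" logic. The titles are made-up sample data, and the in-memory buffer only stands in for `/tmp/journal-titles.csv`:

```python
import csv
import io
from collections import Counter

# Hypothetical journal titles standing in for the text_value rows.
titles = [
    "Food Policy",
    "Agricultural Systems",
    "Food Policy",
    "Climatic Change",
    "Food Policy",
    "Agricultural Systems",
]

# Count each distinct title, most frequent first
# (the same idea as GROUP BY text_value ORDER BY count DESC).
counts = Counter(titles).most_common()

# Write title,count rows as CSV; the buffer stands in for the real output file.
buffer = io.StringIO()
csv.writer(buffer).writerows(counts)
print(buffer.getvalue())
```

- Near-duplicate titles (different spelling or capitalization) show up as separate rows, which is what makes the list useful for building the controlled vocabulary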
- Send a message to users of the CGSpace REST API to notify them of the upcoming upgrade so they can test their apps against DSpace Test
- Test an update of old, non-HTTPS links to the CCAFS website in CGSpace metadata:
```
dspacetest=# update metadatavalue set text_value = regexp_replace(text_value, 'http://ccafs.cgiar.org','https://ccafs.cgiar.org') where resource_type_id=2 and text_value like '%http://ccafs.cgiar.org%';
UPDATE 164
dspace=# update metadatavalue set text_value = regexp_replace(text_value, 'http://ccafs.cgiar.org','https://ccafs.cgiar.org') where resource_type_id=2 and text_value like '%http://ccafs.cgiar.org%';
UPDATE 7
```
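- A note on the `regexp_replace` calls above: as far as I know, PostgreSQL replaces only the first match in each value unless `'g'` is passed as a fourth argument, so a value that contains the old link more than once needs another pass. The Python sketch below only imitates that behavior with the `re` module (it does not talk to the database), and the sample value is made up:

```python
import re

OLD = "http://ccafs.cgiar.org"
NEW = "https://ccafs.cgiar.org"

# Escape the dots so "." matches a literal dot rather than any character.
PATTERN = re.compile(re.escape(OLD))


def replace_first(value: str) -> str:
    """Replace only the first match, like regexp_replace without the 'g' flag."""
    return PATTERN.sub(NEW, value, count=1)


def replace_all(value: str) -> str:
    """Replace every match, like regexp_replace with the 'g' flag."""
    return PATTERN.sub(NEW, value)


# Hypothetical metadata value that mentions the old link twice.
sample = "See http://ccafs.cgiar.org/a and http://ccafs.cgiar.org/b"

once = replace_first(sample)
twice = replace_first(once)

print(once)                          # the second link is still plain HTTP
print(twice == replace_all(sample))  # two single passes match one global pass
```

- If that is what happened here, adding `'g'` as the fourth argument to `regexp_replace` should do the whole update in one pass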
- Had to run it twice to get them all, probably because `regexp_replace` stops after the first match in each value unless it is given the `g` ("global") flag
- Run the updates on CGSpace as well
- Run through some collections and manually regenerate some PDF thumbnails for items from before 2016 on DSpace Test to compare with CGSpace
- I'm debating forcing the re-generation of ALL thumbnails, since some come from DSpace 3 and 4 when the thumbnailing wasn't as good
- The results were very good; after we upgrade to 5.5 I think I will do it, perhaps one community / collection at a time:
```
$ [dspace]/bin/dspace filter-media -f -i 10568/67156 -p "ImageMagick PDF Thumbnail"
```
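- Going one community / collection at a time could be scripted roughly like this. It is only a sketch: `[dspace]` is the same placeholder path as in the command above, the second handle is made up, and the commands are printed rather than executed so each batch can be reviewed first:

```python
import shlex
from typing import List

DSPACE_BIN = "[dspace]/bin/dspace"  # placeholder path, as written in the notes
PLUGIN = "ImageMagick PDF Thumbnail"


def filter_media_command(handle: str) -> List[str]:
    """Build the forced filter-media command for one community or collection handle."""
    return [DSPACE_BIN, "filter-media", "-f", "-i", handle, "-p", PLUGIN]


# 10568/67156 comes from the notes; the second handle is hypothetical.
handles = ["10568/67156", "10568/00000"]

for handle in handles:
    # Print instead of executing, so nothing is regenerated by accident.
    print(shlex.join(filter_media_command(handle)))
```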
- In related news, I'm looking at thumbnails of thumbnails (the ones we uploaded manually before, and now DSpace's media filter has made thumbnails of THEM):
```
dspace=# select text_value from metadatavalue where text_value like '%.jpg.jpg';
```
- I'm not sure if there's anything we can do, actually, because we would have to remove those from the thumbnail bundles, and replace them with the regular JPGs from the content bundle, and then remove them from the assetstore...
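- As a first step towards sizing the problem, the `.jpg.jpg` names could be paired with the JPG they were probably generated from. A minimal sketch with made-up bitstream names; it only does the string matching and touches neither the bundles nor the assetstore:

```python
SUFFIX = ".jpg"


def is_thumbnail_of_thumbnail(name: str) -> bool:
    """True for names ending in .jpg.jpg (case-sensitive, like the LIKE query)."""
    return name.endswith(SUFFIX * 2)


def source_name(name: str) -> str:
    """Strip one trailing .jpg to guess the bitstream the thumbnail was made from."""
    return name[: -len(SUFFIX)] if is_thumbnail_of_thumbnail(name) else name


# Hypothetical bitstream names.
names = ["report.pdf.jpg", "cover.jpg.jpg", "cover.jpg", "map.JPG.jpg"]

doubled = [n for n in names if is_thumbnail_of_thumbnail(n)]
print(doubled)
print([source_name(n) for n in doubled])
```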

View File

@ -115,6 +115,8 @@ $ psql -c 'SELECT * from pg_stat_activity;' | grep idle | grep -c cgspac
<nav class="nav blog-nav">
<a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>
</nav>
</div>
</div>

View File

@ -118,6 +118,8 @@ Replace lzop with xz in log compression cron jobs on DSpace Test—it uses less
<nav class="nav blog-nav">
<a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>
</nav>
</div>
</div>

View File

@ -103,6 +103,8 @@ Update GitHub wiki for documentation of maintenance tasks.
<nav class="nav blog-nav">
<a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>
</nav>
</div>
</div>

View File

@ -124,6 +124,8 @@ Also, lots of things like &ldquo;COTE D`LVOIRE&rdquo; and &ldquo;COTE D IVOIRE&r
<nav class="nav blog-nav">
<a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>
</nav>
</div>
</div>

View File

@ -103,6 +103,8 @@ Reinstall my local (Mac OS X) DSpace stack with Tomcat 7, PostgreSQL 9.3, and Ja
<nav class="nav blog-nav">
<a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>
</nav>
</div>
</div>

View File

@ -109,6 +109,8 @@ Also, I noticed the checker log has some errors we should pay attention to:
<nav class="nav blog-nav">
<a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>
</nav>
</div>
</div>

View File

@ -115,6 +115,8 @@ There are 3,000 IPs accessing the REST API in a 24-hour period!
<nav class="nav blog-nav">
<a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>
</nav>
</div>
</div>

View File

@ -112,6 +112,8 @@ Working on second phase of metadata migration, looks like this will work for mov
<nav class="nav blog-nav">
<a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>
</nav>
</div>
</div>

View File

@ -136,6 +136,8 @@ In this case the select query was showing 95 results before the update
<nav class="nav blog-nav">
<a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>
</nav>
</div>
</div>

View File

@ -127,6 +127,8 @@ $ git rebase -i dspace-5.5
<nav class="nav blog-nav">
<a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>
</nav>
</div>
</div>

View File

@ -115,6 +115,8 @@ $ ldapsearch -x -H ldaps://svcgroot2.cgiarad.org:3269/ -b &quot;dc=cgiarad,dc=or
<nav class="nav blog-nav">
<a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>
</nav>
</div>
</div>

View File

@ -79,6 +79,8 @@
<nav class="nav blog-nav">
<a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>
</nav>
</div>
</div>

View File

@ -30,7 +30,7 @@
<meta itemprop="dateModified" content="2016-11-01T09:21:00&#43;03:00" />
<meta itemprop="wordCount" content="1596">
<meta itemprop="wordCount" content="1889">
@ -79,6 +79,8 @@
<nav class="nav blog-nav">
<a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>
</nav>
</div>
</div>
@ -464,6 +466,49 @@ Mozilla/5.0 (compatible; YandexImages/3.0; +http://yandex.com/bots)&quot; &quot;
<li>We absolutely don&rsquo;t use those modules, so we shouldn&rsquo;t build them in the first place</li>
</ul>
<h2 id="2016-11-17">2016-11-17</h2>
<ul>
<li>Generate a list of journal titles for Peter and Abenet to look through so we can make a controlled vocabulary out of them:</li>
</ul>
<pre><code>dspace=# \copy (select distinct text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=55 group by text_value order by count desc) to /tmp/journal-titles.csv with csv;
COPY 2515
</code></pre>
<ul>
<li>Send a message to users of the CGSpace REST API to notify them of the upcoming upgrade so they can test their apps against DSpace Test</li>
<li>Test an update of old, non-HTTPS links to the CCAFS website in CGSpace metadata:</li>
</ul>
<pre><code>dspacetest=# update metadatavalue set text_value = regexp_replace(text_value, 'http://ccafs.cgiar.org','https://ccafs.cgiar.org') where resource_type_id=2 and text_value like '%http://ccafs.cgiar.org%';
UPDATE 164
dspace=# update metadatavalue set text_value = regexp_replace(text_value, 'http://ccafs.cgiar.org','https://ccafs.cgiar.org') where resource_type_id=2 and text_value like '%http://ccafs.cgiar.org%';
UPDATE 7
</code></pre>
<ul>
<li>Had to run it twice to get all (not sure about &ldquo;global&rdquo; regex in PostgreSQL)</li>
<li>Run the updates on CGSpace as well</li>
<li>Run through some collections and manually regenerate some PDF thumbnails for items from before 2016 on DSpace Test to compare with CGSpace</li>
<li>I&rsquo;m debating forcing the re-generation of ALL thumbnails, since some come from DSpace 3 and 4 when the thumbnailing wasn&rsquo;t as good</li>
<li>The results were very good, I think that after we upgrade to 5.5 I will do it, perhaps one community / collection at a time:</li>
</ul>
<pre><code>$ [dspace]/bin/dspace filter-media -f -i 10568/67156 -p &quot;ImageMagick PDF Thumbnail&quot;
</code></pre>
<ul>
<li>In related news, I&rsquo;m looking at thumbnails of thumbnails (the ones we uploaded manually before, and now DSpace&rsquo;s media filter has made thumbnails of THEM):</li>
</ul>
<pre><code>dspace=# select text_value from metadatavalue where text_value like '%.jpg.jpg';
</code></pre>
<ul>
<li>I&rsquo;m not sure if there&rsquo;s anything we can do, actually, because we would have to remove those from the thumbnail bundles, and replace them with the regular JPGs from the content bundle, and then remove them from the assetstore&hellip;</li>
</ul>

File diff suppressed because one or more lines are too long

View File

@ -65,6 +65,8 @@
<nav class="nav blog-nav">
<a class="nav-link active" href="https://alanorth.github.io/cgspace-notes/">Home</a>
</nav>
</div>
</div>

View File

@ -372,6 +372,49 @@ Mozilla/5.0 (compatible; YandexImages/3.0; +http://yandex.com/bots)&amp;quot; &a
&lt;ul&gt;
&lt;li&gt;We absolutely don&amp;rsquo;t use those modules, so we shouldn&amp;rsquo;t build them in the first place&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;2016-11-17&#34;&gt;2016-11-17&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Generate a list of journal titles for Peter and Abenet to look through so we can make a controlled vocabulary out of them:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;dspace=# \copy (select distinct text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=55 group by text_value order by count desc) to /tmp/journal-titles.csv with csv;
COPY 2515
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;Send a message to users of the CGSpace REST API to notify them of the upcoming upgrade so they can test their apps against DSpace Test&lt;/li&gt;
&lt;li&gt;Test an update of old, non-HTTPS links to the CCAFS website in CGSpace metadata:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;dspacetest=# update metadatavalue set text_value = regexp_replace(text_value, &#39;http://ccafs.cgiar.org&#39;,&#39;https://ccafs.cgiar.org&#39;) where resource_type_id=2 and text_value like &#39;%http://ccafs.cgiar.org%&#39;;
UPDATE 164
dspace=# update metadatavalue set text_value = regexp_replace(text_value, &#39;http://ccafs.cgiar.org&#39;,&#39;https://ccafs.cgiar.org&#39;) where resource_type_id=2 and text_value like &#39;%http://ccafs.cgiar.org%&#39;;
UPDATE 7
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;Had to run it twice to get all (not sure about &amp;ldquo;global&amp;rdquo; regex in PostgreSQL)&lt;/li&gt;
&lt;li&gt;Run the updates on CGSpace as well&lt;/li&gt;
&lt;li&gt;Run through some collections and manually regenerate some PDF thumbnails for items from before 2016 on DSpace Test to compare with CGSpace&lt;/li&gt;
&lt;li&gt;I&amp;rsquo;m debating forcing the re-generation of ALL thumbnails, since some come from DSpace 3 and 4 when the thumbnailing wasn&amp;rsquo;t as good&lt;/li&gt;
&lt;li&gt;The results were very good, I think that after we upgrade to 5.5 I will do it, perhaps one community / collection at a time:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;$ [dspace]/bin/dspace filter-media -f -i 10568/67156 -p &amp;quot;ImageMagick PDF Thumbnail&amp;quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;In related news, I&amp;rsquo;m looking at thumbnails of thumbnails (the ones we uploaded manually before, and now DSpace&amp;rsquo;s media filter has made thumbnails of THEM):&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;dspace=# select text_value from metadatavalue where text_value like &#39;%.jpg.jpg&#39;;
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;I&amp;rsquo;m not sure if there&amp;rsquo;s anything we can do, actually, because we would have to remove those from the thumbnail bundles, and replace them with the regular JPGs from the content bundle, and then remove them from the assetstore&amp;hellip;&lt;/li&gt;
&lt;/ul&gt;
</description>
</item>

View File

@ -65,6 +65,8 @@
<nav class="nav blog-nav">
<a class="nav-link active" href="https://alanorth.github.io/cgspace-notes/">Home</a>
</nav>
</div>
</div>

View File

@ -65,6 +65,8 @@
<nav class="nav blog-nav">
<a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>
</nav>
</div>
</div>

View File

@ -372,6 +372,49 @@ Mozilla/5.0 (compatible; YandexImages/3.0; +http://yandex.com/bots)&amp;quot; &a
&lt;ul&gt;
&lt;li&gt;We absolutely don&amp;rsquo;t use those modules, so we shouldn&amp;rsquo;t build them in the first place&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;2016-11-17&#34;&gt;2016-11-17&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Generate a list of journal titles for Peter and Abenet to look through so we can make a controlled vocabulary out of them:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;dspace=# \copy (select distinct text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=55 group by text_value order by count desc) to /tmp/journal-titles.csv with csv;
COPY 2515
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;Send a message to users of the CGSpace REST API to notify them of the upcoming upgrade so they can test their apps against DSpace Test&lt;/li&gt;
&lt;li&gt;Test an update of old, non-HTTPS links to the CCAFS website in CGSpace metadata:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;dspacetest=# update metadatavalue set text_value = regexp_replace(text_value, &#39;http://ccafs.cgiar.org&#39;,&#39;https://ccafs.cgiar.org&#39;) where resource_type_id=2 and text_value like &#39;%http://ccafs.cgiar.org%&#39;;
UPDATE 164
dspace=# update metadatavalue set text_value = regexp_replace(text_value, &#39;http://ccafs.cgiar.org&#39;,&#39;https://ccafs.cgiar.org&#39;) where resource_type_id=2 and text_value like &#39;%http://ccafs.cgiar.org%&#39;;
UPDATE 7
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;Had to run it twice to get all (not sure about &amp;ldquo;global&amp;rdquo; regex in PostgreSQL)&lt;/li&gt;
&lt;li&gt;Run the updates on CGSpace as well&lt;/li&gt;
&lt;li&gt;Run through some collections and manually regenerate some PDF thumbnails for items from before 2016 on DSpace Test to compare with CGSpace&lt;/li&gt;
&lt;li&gt;I&amp;rsquo;m debating forcing the re-generation of ALL thumbnails, since some come from DSpace 3 and 4 when the thumbnailing wasn&amp;rsquo;t as good&lt;/li&gt;
&lt;li&gt;The results were very good, I think that after we upgrade to 5.5 I will do it, perhaps one community / collection at a time:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;$ [dspace]/bin/dspace filter-media -f -i 10568/67156 -p &amp;quot;ImageMagick PDF Thumbnail&amp;quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;In related news, I&amp;rsquo;m looking at thumbnails of thumbnails (the ones we uploaded manually before, and now DSpace&amp;rsquo;s media filter has made thumbnails of THEM):&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;dspace=# select text_value from metadatavalue where text_value like &#39;%.jpg.jpg&#39;;
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;I&amp;rsquo;m not sure if there&amp;rsquo;s anything we can do, actually, because we would have to remove those from the thumbnail bundles, and replace them with the regular JPGs from the content bundle, and then remove them from the assetstore&amp;hellip;&lt;/li&gt;
&lt;/ul&gt;
</description>
</item>

View File

@ -65,6 +65,8 @@
<nav class="nav blog-nav">
<a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>
</nav>
</div>
</div>

View File

@ -65,6 +65,8 @@
<nav class="nav blog-nav">
<a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>
</nav>
</div>
</div>

View File

@ -371,6 +371,49 @@ Mozilla/5.0 (compatible; YandexImages/3.0; +http://yandex.com/bots)&amp;quot; &a
&lt;ul&gt;
&lt;li&gt;We absolutely don&amp;rsquo;t use those modules, so we shouldn&amp;rsquo;t build them in the first place&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;2016-11-17&#34;&gt;2016-11-17&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Generate a list of journal titles for Peter and Abenet to look through so we can make a controlled vocabulary out of them:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;dspace=# \copy (select distinct text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=55 group by text_value order by count desc) to /tmp/journal-titles.csv with csv;
COPY 2515
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;Send a message to users of the CGSpace REST API to notify them of the upcoming upgrade so they can test their apps against DSpace Test&lt;/li&gt;
&lt;li&gt;Test an update of old, non-HTTPS links to the CCAFS website in CGSpace metadata:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;dspacetest=# update metadatavalue set text_value = regexp_replace(text_value, &#39;http://ccafs.cgiar.org&#39;,&#39;https://ccafs.cgiar.org&#39;) where resource_type_id=2 and text_value like &#39;%http://ccafs.cgiar.org%&#39;;
UPDATE 164
dspace=# update metadatavalue set text_value = regexp_replace(text_value, &#39;http://ccafs.cgiar.org&#39;,&#39;https://ccafs.cgiar.org&#39;) where resource_type_id=2 and text_value like &#39;%http://ccafs.cgiar.org%&#39;;
UPDATE 7
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;Had to run it twice to get all (not sure about &amp;ldquo;global&amp;rdquo; regex in PostgreSQL)&lt;/li&gt;
&lt;li&gt;Run the updates on CGSpace as well&lt;/li&gt;
&lt;li&gt;Run through some collections and manually regenerate some PDF thumbnails for items from before 2016 on DSpace Test to compare with CGSpace&lt;/li&gt;
&lt;li&gt;I&amp;rsquo;m debating forcing the re-generation of ALL thumbnails, since some come from DSpace 3 and 4 when the thumbnailing wasn&amp;rsquo;t as good&lt;/li&gt;
&lt;li&gt;The results were very good, I think that after we upgrade to 5.5 I will do it, perhaps one community / collection at a time:&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;$ [dspace]/bin/dspace filter-media -f -i 10568/67156 -p &amp;quot;ImageMagick PDF Thumbnail&amp;quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;In related news, I&amp;rsquo;m looking at thumbnails of thumbnails (the ones we uploaded manually before, and now DSpace&amp;rsquo;s media filter has made thumbnails of THEM):&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;dspace=# select text_value from metadatavalue where text_value like &#39;%.jpg.jpg&#39;;
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;I&amp;rsquo;m not sure if there&amp;rsquo;s anything we can do, actually, because we would have to remove those from the thumbnail bundles, and replace them with the regular JPGs from the content bundle, and then remove them from the assetstore&amp;hellip;&lt;/li&gt;
&lt;/ul&gt;
</description>
</item>

View File

@ -65,6 +65,8 @@
<nav class="nav blog-nav">
<a class="nav-link " href="https://alanorth.github.io/cgspace-notes/">Home</a>
</nav>
</div>
</div>

@ -1 +1 @@
Subproject commit e3f93bd38bd2c9d7aadf550b5d323ad45c566b0b
Subproject commit f6e262f05e1a07269f3c1c9c82e2e5ecd83b44d9