Mirror of https://github.com/alanorth/cgspace-notes.git
Update notes for 2017-02-28
parent a3f0d88945
commit 56a24bf456
@@ -310,4 +310,14 @@ dspace=# \copy (select resource_id, metadata_value_id from metadatavalue where r
 COPY 1968
 ```

-- And then using awk or uniq to either remove or print the lines that have a duplicate `resource_id` (meaning they belong to the same item in DSpace and are therefore duplicates), and then using the `metadata_value_id` to delete them
+- And then use awk to print the duplicate lines to a separate file:
+
+```
+$ awk -F',' 'seen[$1]++' /tmp/ciat.csv > /tmp/ciat-dupes.csv
+```
+
+- From that file I can create a list of 279 deletes and put them in a batch script like:
+
+```
+delete from metadatavalue where resource_type_id=2 and metadata_field_id=3 and metadata_value_id=2742061;
+```
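A note on the awk one-liner in the hunk above, since the idiom is terse: `seen[$1]++` is a pattern with no action, and awk's default action is to print the line whenever the pattern is true. The post-increment returns the old count for that `resource_id`, which is 0 (false) on its first occurrence and 1 or more (true) on every repeat, so only the duplicate rows are printed and the first copy of each value never lands in the dupes file. A minimal sketch with hypothetical sample data (the file name and ids below are illustrative, not from the commit):

```
# Hypothetical sample: resource_id,metadata_value_id rows where item 101
# has the same metadata value attached twice.
$ cat /tmp/sample.csv
101,2742060
101,2742061
102,2742099

# seen[$1]++ is false the first time a resource_id appears and true on
# every repeat, so awk prints only the duplicate rows; deleting everything
# in this output still leaves one value per item.
$ awk -F',' 'seen[$1]++' /tmp/sample.csv
101,2742061
```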
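As for turning `/tmp/ciat-dupes.csv` into the batch of 279 deletes, the commit only shows the final delete statement, but one plausible route is to reuse awk to wrap the second CSV column (the `metadata_value_id`) in that statement template. This is a sketch, not the author's actual command; the output file name is made up, and the `psql dspace` invocation assumes the database name shown in the `dspace=#` prompt in the hunk header:

```
# Sketch: emit one delete per duplicate row, taking the metadata_value_id
# from the second column, then run the whole batch through psql.
$ awk -F',' '{print "delete from metadatavalue where resource_type_id=2 and metadata_field_id=3 and metadata_value_id=" $2 ";"}' /tmp/ciat-dupes.csv > /tmp/ciat-deletes.sql
$ wc -l /tmp/ciat-deletes.sql    # expect 279 lines, one per duplicate
$ psql dspace < /tmp/ciat-deletes.sql
```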
@@ -90,7 +90,7 @@ Looks like we’ll be using cg.identifier.ccafsprojectpii as the field name
 "headline": "February, 2017",
 "url": "https://alanorth.github.io/cgspace-notes/2017-02/",
-"wordCount": "2019",
+"wordCount": "2028",
 "datePublished": "2017-02-07T07:04:52-08:00",
@@ -522,9 +522,19 @@ COPY 1968
 </code></pre>

 <ul>
-<li>And then using awk or uniq to either remove or print the lines that have a duplicate <code>resource_id</code> (meaning they belong to the same item in DSpace and are therefore duplicates), and then using the <code>metadata_value_id</code> to delete them</li>
+<li>And then use awk to print the duplicate lines to a separate file:</li>
 </ul>

+<pre><code>$ awk -F',' 'seen[$1]++' /tmp/ciat.csv &gt; /tmp/ciat-dupes.csv
+</code></pre>
+
+<ul>
+<li>From that file I can create a list of 279 deletes and put them in a batch script like:</li>
+</ul>
+
+<pre><code>delete from metadatavalue where resource_type_id=2 and metadata_field_id=3 and metadata_value_id=2742061;
+</code></pre>
@@ -372,8 +372,18 @@ COPY 1968
 </code></pre>

 <ul>
-<li>And then using awk or uniq to either remove or print the lines that have a duplicate <code>resource_id</code> (meaning they belong to the same item in DSpace and are therefore duplicates), and then using the <code>metadata_value_id</code> to delete them</li>
-</ul></description>
+<li>And then use awk to print the duplicate lines to a separate file:</li>
+</ul>
+
+<pre><code>$ awk -F',' 'seen[$1]++' /tmp/ciat.csv &gt; /tmp/ciat-dupes.csv
+</code></pre>
+
+<ul>
+<li>From that file I can create a list of 279 deletes and put them in a batch script like:</li>
+</ul>
+
+<pre><code>delete from metadatavalue where resource_type_id=2 and metadata_field_id=3 and metadata_value_id=2742061;
+</code></pre></description>
 </item>

 <item>
@@ -371,8 +371,18 @@ COPY 1968
 </code></pre>

 <ul>
-<li>And then using awk or uniq to either remove or print the lines that have a duplicate <code>resource_id</code> (meaning they belong to the same item in DSpace and are therefore duplicates), and then using the <code>metadata_value_id</code> to delete them</li>
-</ul></description>
+<li>And then use awk to print the duplicate lines to a separate file:</li>
+</ul>
+
+<pre><code>$ awk -F',' 'seen[$1]++' /tmp/ciat.csv &gt; /tmp/ciat-dupes.csv
+</code></pre>
+
+<ul>
+<li>From that file I can create a list of 279 deletes and put them in a batch script like:</li>
+</ul>
+
+<pre><code>delete from metadatavalue where resource_type_id=2 and metadata_field_id=3 and metadata_value_id=2742061;
+</code></pre></description>
 </item>

 <item>