mirror of https://github.com/alanorth/cgspace-notes.git
Update notes for 2017-02-28
parent a3f0d88945
commit 56a24bf456
@@ -310,4 +310,14 @@ dspace=# \copy (select resource_id, metadata_value_id from metadatavalue where r
COPY 1968
```
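- (`COPY 1968` is psql confirming that 1,968 rows were written to the CSV)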
- And then use awk or uniq to either remove or print the lines that have a duplicate `resource_id` (meaning they belong to the same item in DSpace and are therefore duplicates), and then use the `metadata_value_id` to delete them
- And then use awk to print the duplicate lines to a separate file:
```
$ awk -F',' 'seen[$1]++' /tmp/ciat.csv > /tmp/ciat-dupes.csv
```
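- (`seen[$1]++` is true only from the second occurrence of a given first field onward, so awk's default print action writes just the repeated `resource_id` lines to the dupes file)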
- From that file I can create a list of 279 deletes and put them in a batch script like:

```
delete from metadatavalue where resource_type_id=2 and metadata_field_id=3 and metadata_value_id=2742061;
```
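- An awk one-liner like the following could generate that batch script from the dupes file (a sketch: the second CSV column is the `metadata_value_id`, and the output file name here is just an example):

```
# sketch: turn the dupes CSV into a batch of delete statements (output path is hypothetical)
$ awk -F',' '{print "delete from metadatavalue where resource_type_id=2 and metadata_field_id=3 and metadata_value_id=" $2 ";"}' /tmp/ciat-dupes.csv > /tmp/ciat-deletes.sql
```

- The resulting file could then be run in one go with something like `psql dspace < /tmp/ciat-deletes.sql`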
@@ -90,7 +90,7 @@ Looks like we’ll be using cg.identifier.ccafsprojectpii as the field name
"headline": "February, 2017",
"url": "https://alanorth.github.io/cgspace-notes/2017-02/",
"wordCount": "2019",
"wordCount": "2028",
"datePublished": "2017-02-07T07:04:52-08:00",
@@ -522,9 +522,19 @@ COPY 1968
</code></pre>

<ul>
<li>And then use awk or uniq to either remove or print the lines that have a duplicate <code>resource_id</code> (meaning they belong to the same item in DSpace and are therefore duplicates), and then use the <code>metadata_value_id</code> to delete them</li>
<li>And then use awk to print the duplicate lines to a separate file:</li>
</ul>

<pre><code>$ awk -F',' 'seen[$1]++' /tmp/ciat.csv > /tmp/ciat-dupes.csv
</code></pre>

<ul>
<li>From that file I can create a list of 279 deletes and put them in a batch script like:</li>
</ul>

<pre><code>delete from metadatavalue where resource_type_id=2 and metadata_field_id=3 and metadata_value_id=2742061;
</code></pre>
@@ -372,8 +372,18 @@ COPY 1968
</code></pre>

<ul>
<li>And then use awk or uniq to either remove or print the lines that have a duplicate <code>resource_id</code> (meaning they belong to the same item in DSpace and are therefore duplicates), and then use the <code>metadata_value_id</code> to delete them</li>
</ul></description>
<li>And then use awk to print the duplicate lines to a separate file:</li>
</ul>

<pre><code>$ awk -F',' 'seen[$1]++' /tmp/ciat.csv &gt; /tmp/ciat-dupes.csv
</code></pre>

<ul>
<li>From that file I can create a list of 279 deletes and put them in a batch script like:</li>
</ul>

<pre><code>delete from metadatavalue where resource_type_id=2 and metadata_field_id=3 and metadata_value_id=2742061;
</code></pre></description>
</item>

<item>
@@ -372,8 +372,18 @@ COPY 1968
</code></pre>

<ul>
<li>And then use awk or uniq to either remove or print the lines that have a duplicate <code>resource_id</code> (meaning they belong to the same item in DSpace and are therefore duplicates), and then use the <code>metadata_value_id</code> to delete them</li>
</ul></description>
<li>And then use awk to print the duplicate lines to a separate file:</li>
</ul>

<pre><code>$ awk -F',' 'seen[$1]++' /tmp/ciat.csv &gt; /tmp/ciat-dupes.csv
</code></pre>

<ul>
<li>From that file I can create a list of 279 deletes and put them in a batch script like:</li>
</ul>

<pre><code>delete from metadatavalue where resource_type_id=2 and metadata_field_id=3 and metadata_value_id=2742061;
</code></pre></description>
</item>

<item>
@@ -371,8 +371,18 @@ COPY 1968
</code></pre>

<ul>
<li>And then use awk or uniq to either remove or print the lines that have a duplicate <code>resource_id</code> (meaning they belong to the same item in DSpace and are therefore duplicates), and then use the <code>metadata_value_id</code> to delete them</li>
</ul></description>
<li>And then use awk to print the duplicate lines to a separate file:</li>
</ul>

<pre><code>$ awk -F',' 'seen[$1]++' /tmp/ciat.csv &gt; /tmp/ciat-dupes.csv
</code></pre>

<ul>
<li>From that file I can create a list of 279 deletes and put them in a batch script like:</li>
</ul>

<pre><code>delete from metadatavalue where resource_type_id=2 and metadata_field_id=3 and metadata_value_id=2742061;
</code></pre></description>
</item>

<item>