mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2024-11-23 15:10:20 +01:00
Compare commits
2 Commits
17a241de5b
...
264cdcf1db
Author | SHA1 | Date | |
---|---|---|---|
264cdcf1db | |||
293b500b26 |
@ -53,7 +53,7 @@ categories: ["Notes"]
|
|||||||
- In the past I've found their _licensing_ information to not be very reliable (preferring Crossref), but I think their _open access status_ is more reliable, especially when the provider is listed as being the publisher
|
- In the past I've found their _licensing_ information to not be very reliable (preferring Crossref), but I think their _open access status_ is more reliable, especially when the provider is listed as being the publisher
|
||||||
- Even so, sometimes the version can be "acceptedVersion", which is presumably the author's version, as opposed to the "publishedVersion", which means it's available as open access on the publisher's website
|
- Even so, sometimes the version can be "acceptedVersion", which is presumably the author's version, as opposed to the "publishedVersion", which means it's available as open access on the publisher's website
|
||||||
- I did some quality assurance and found ~100 that were marked as Limited Access, but should have been Open Access, and fixed a handful of licenses
|
- I did some quality assurance and found ~100 that were marked as Limited Access, but should have been Open Access, and fixed a handful of licenses
|
||||||
- Delete duplicate metadata as describe in my DSpace issue from last year: https://github.com/DSpace/DSpace/issues/8253
|
- Delete duplicate metadata as described in my DSpace issue from last year: https://github.com/DSpace/DSpace/issues/8253
|
||||||
- Start working on some statistics on AGROVOC usage for my presentation next week
|
- Start working on some statistics on AGROVOC usage for my presentation next week
|
||||||
- I used the following SQL query to dump values from all subject fields and lower case them:
|
- I used the following SQL query to dump values from all subject fields and lower case them:
|
||||||
|
|
||||||
|
@ -185,4 +185,87 @@ dspace=*# COMMIT;
|
|||||||
COMMIT
|
COMMIT
|
||||||
```
|
```
|
||||||
|
|
||||||
|
## 2023-12-25
|
||||||
|
|
||||||
|
- Looking into [Solr backups](https://solr.apache.org/guide/8_11/making-and-restoring-backups.html)
|
||||||
|
- Since we are not running in Solr Cloud mode we need to use the replication endpoint for Solr standalone
|
||||||
|
- This works:
|
||||||
|
|
||||||
|
```console
|
||||||
|
$ curl 'http://localhost:8983/solr/statistics/replication?command=backup'
|
||||||
|
{
|
||||||
|
"responseHeader":{
|
||||||
|
"status":0,
|
||||||
|
"QTime":26},
|
||||||
|
"status":"OK"}
|
||||||
|
```
|
||||||
|
|
||||||
|
- Then I saw the size of the snapshot reach the size of the index...
|
||||||
|
|
||||||
|
```console
|
||||||
|
# du -sh /var/solr/data/configsets/statistics/data/*
|
||||||
|
22G /var/solr/data/configsets/statistics/data/index
|
||||||
|
16G /var/solr/data/configsets/statistics/data/snapshot.20231225074111671
|
||||||
|
4.0K /var/solr/data/configsets/statistics/data/snapshot_metadata
|
||||||
|
# du -sh /var/solr/data/configsets/statistics/data/*
|
||||||
|
22G /var/solr/data/configsets/statistics/data/index
|
||||||
|
20G /var/solr/data/configsets/statistics/data/snapshot.20231225074111671
|
||||||
|
4.0K /var/solr/data/configsets/statistics/data/snapshot_metadata
|
||||||
|
# du -sh /var/solr/data/configsets/statistics/data/*
|
||||||
|
22G /var/solr/data/configsets/statistics/data/index
|
||||||
|
21G /var/solr/data/configsets/statistics/data/snapshot.20231225074111671
|
||||||
|
4.0K /var/solr/data/configsets/statistics/data/snapshot_metadata
|
||||||
|
# du -sh /var/solr/data/configsets/statistics/data/*
|
||||||
|
22G /var/solr/data/configsets/statistics/data/index
|
||||||
|
22G /var/solr/data/configsets/statistics/data/snapshot.20231225074111671
|
||||||
|
4.0K /var/solr/data/configsets/statistics/data/snapshot_metadata
|
||||||
|
```
|
||||||
|
|
||||||
|
- Then I deleted all documents from the core and restored from the snapshot backup:
|
||||||
|
|
||||||
|
```console
|
||||||
|
$ curl http://localhost:8983/solr/statistics/update -H "Content-type: text/xml" --data-binary '<delete><query>*:*</query></delete>'
|
||||||
|
$ curl http://localhost:8983/solr/statistics/update -H "Content-type: text/xml" --data-binary '<commit />'
|
||||||
|
$ curl 'http://localhost:8983/solr/statistics/replication?command=restore&name=statistics'
|
||||||
|
```
|
||||||
|
|
||||||
|
- Interestingly the import worked fine, but created a new data index:
|
||||||
|
|
||||||
|
```console
|
||||||
|
# du -sh /var/solr/data/configsets/statistics/data/*
|
||||||
|
4.0K /var/solr/data/configsets/statistics/data/index.properties
|
||||||
|
22G /var/solr/data/configsets/statistics/data/restore.20231225154626463
|
||||||
|
4.0K /var/solr/data/configsets/statistics/data/snapshot_metadata
|
||||||
|
22G /var/solr/data/configsets/statistics/data/snapshot.statistics
|
||||||
|
```
|
||||||
|
|
||||||
|
- Not sure of the implications of that, but Solr uses the data just fine
|
||||||
|
- I can surely use this for atomic Solr backups
|
||||||
|
|
||||||
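The backup step from 2023-12-25 could be scripted rather than watched with `du`. Below is a minimal Python sketch, assuming the same standalone Solr on `localhost:8983` and the replication handler's `backup` and `details` commands. The helper names (`replication_url`, `as_dict`, `backup_status`, `wait_for_backup`) are hypothetical, and the response field names (`details`, `backup`, `status`) and the terminal values `success` and `failed` are assumptions about the Solr 8 response that should be checked against a real `command=details` response before relying on them.

```python
import json
import time
import urllib.parse
import urllib.request

SOLR = "http://localhost:8983/solr"  # host and port as used in the notes


def replication_url(core, command, **params):
    """Build a replication-handler URL for a standalone Solr core."""
    query = urllib.parse.urlencode({"command": command, "wt": "json", **params})
    return f"{SOLR}/{core}/replication?{query}"


def as_dict(value):
    """Normalise a Solr NamedList: JSON may render it as a map or as a flat [k, v, ...] list."""
    if isinstance(value, dict):
        return value
    if isinstance(value, list):
        return dict(zip(value[0::2], value[1::2]))
    return {}


def backup_status(details):
    """Return the backup status from a parsed `command=details` response, or None if absent.

    Assumption: the status lives under details -> backup -> status.
    """
    backup = as_dict(as_dict(details.get("details")).get("backup"))
    return backup.get("status")


def wait_for_backup(core, name, poll_seconds=30):
    """Trigger a named backup, then poll until a terminal status is reported.

    Assumption: "success" and "failed" are the terminal values. A status left
    over from an earlier backup could end the loop early, so treat the result
    as a hint rather than proof.
    """
    with urllib.request.urlopen(replication_url(core, "backup", name=name)) as resp:
        json.load(resp)  # the notes show {"status": "OK"} for an accepted request
    while True:
        with urllib.request.urlopen(replication_url(core, "details")) as resp:
            status = backup_status(json.load(resp))
        if status is not None and status.lower() in ("success", "failed"):
            return status
        time.sleep(poll_seconds)
```

Calling `wait_for_backup("statistics", "statistics")` would request a snapshot named `snapshot.statistics`, matching the name passed to `command=restore` above. A similar loop against `command=restorestatus` could follow a restore.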
|
## 2023-12-27
|
||||||
|
|
||||||
|
- Delete duplicate metadata as described in my DSpace issue from last year: https://github.com/DSpace/DSpace/issues/8253
|
||||||
|
- Do some other metadata cleanups on CGSpace
|
||||||
|
- I also looked up our DOIs on Crossref to get some missing abstracts and correct licenses and dates
|
||||||
|
- Some minor work on the CGSpace DSpace 7 theme to fix the navbar on mobile
|
||||||
|
- Some work on the IFPRI ISNAR archive
|
||||||
|
|
||||||
|
## 2023-12-28
|
||||||
|
|
||||||
|
- I started porting the [cgspace-java-helpers](https://github.com/ilri/cgspace-java-helpers) to DSpace 7
|
||||||
|
- Some work on the IFPRI ISNAR archive
|
||||||
|
- I ended up going through most of the PDFs to get better dates and abstracts
|
||||||
|
|
||||||
|
## 2023-12-29
|
||||||
|
|
||||||
|
- I created a new Hetzner server to replace the current DSpace 6 CGSpace next week when we migrate to DSpace 7
|
||||||
|
- Interesting, I haven't checked for content pointing to legacy domains in several years (!)
|
||||||
|
- `inurl:mahider.cgiar.org`: 0 results on Google!
|
||||||
|
- `inurl:mahider.ilri.org`: 2,100 results on Google
|
||||||
|
- `inurl:mahider.ilri.org inurl:https`: 2 results on Google (!)
|
||||||
|
- `inurl:dspace.ilri.org`: 1,390 results on Google
|
||||||
|
- `inurl:dspace.ilri.org inurl:https`: 0 results on Google (!)
|
||||||
|
- So it seems I can do away with the HTTPS virtual hosts finally
|
||||||
|
- Well my current certificates expired on 2021-02-13 and nobody noticed... so...
|
||||||
|
|
||||||
<!-- vim: set sw=2 ts=2: -->
|
<!-- vim: set sw=2 ts=2: -->
|
||||||
|
@ -7,17 +7,17 @@
|
|||||||
|
|
||||||
|
|
||||||
<meta property="og:title" content="July, 2023" />
|
<meta property="og:title" content="July, 2023" />
|
||||||
<meta property="og:description" content="2023-07-01 Export CGSpace to check for missing Initiative collection mappings Start harvesting on AReS 2023-07-02 Minor edits to the crossref_doi_lookup.py script while running some checks from 22,000 CGSpace DOIs 2023-07-03 I analyzed the licenses declared by Crossref and found with high confidence that ~400 of ours were incorrect I took the more accurate ones from Crossref and updated the items on CGSpace I took a few hundred ISBNs as well for where we were missing them I also tagged ~4,700 items with missing licenses as “Copyrighted; all rights reserved” based on their Crossref license status being TDM, mostly from Elsevier, Wiley, and Springer Checking a dozen or so manually, I confirmed that if Crossref only has a TDM license then it’s usually copyrighted (could still be open access, but we can’t tell via Crossref) I would be curious to write a script to check the Unpaywall API for open access status… In the past I found that their license status was not very accurate, but the open access status might be more reliable More minor work on the DSpace 7 item views I learned some new Angular template syntax I created a custom component to show Creative Commons licenses on the simple item page I also decided that I don’t like the Impact Area icons as a component because they don’t have any visual meaning 2023-07-04 Focus group meeting with CGSpace partners about DSpace 7 I added a themed file selection component to the CGSpace theme It displays the bistream description instead of the file name, just like we did in DSpace 6 XMLUI I added a custom component to show share icons 2023-07-05 I spent some time trying to update OpenRXV from Angular 9 to 10 to 11 to 12 to 13 Most things work but there are some minor bugs it seems Mishell from CIP emailed me to say she was having problems approving an item on CGSpace Looking at PostgreSQL I saw there were a dozen or so locks that were several hours and even over one day old so I killed 
those processes and told her to try again 2023-07-06 Types meeting I wrote a Python script to check Unpaywall for some information about DOIs 2023-07-7 Continue exploring Unpaywall data for some of our DOIs In the past I’ve found their licensing information to not be very reliable (preferring Crossref), but I think their open access status is more reliable, especially when the provider is listed as being the publisher Even so, sometimes the version can be “acceptedVersion”, which is presumably the author’s version, as opposed to the “publishedVersion”, which means it’s available as open access on the publisher’s website I did some quality assurance and found ~100 that were marked as Limited Access, but should have been Open Access, and fixed a handful of licenses Delete duplicate metadata as describe in my DSpace issue from last year: https://github." />
|
<meta property="og:description" content="2023-07-01 Export CGSpace to check for missing Initiative collection mappings Start harvesting on AReS 2023-07-02 Minor edits to the crossref_doi_lookup.py script while running some checks from 22,000 CGSpace DOIs 2023-07-03 I analyzed the licenses declared by Crossref and found with high confidence that ~400 of ours were incorrect I took the more accurate ones from Crossref and updated the items on CGSpace I took a few hundred ISBNs as well for where we were missing them I also tagged ~4,700 items with missing licenses as “Copyrighted; all rights reserved” based on their Crossref license status being TDM, mostly from Elsevier, Wiley, and Springer Checking a dozen or so manually, I confirmed that if Crossref only has a TDM license then it’s usually copyrighted (could still be open access, but we can’t tell via Crossref) I would be curious to write a script to check the Unpaywall API for open access status… In the past I found that their license status was not very accurate, but the open access status might be more reliable More minor work on the DSpace 7 item views I learned some new Angular template syntax I created a custom component to show Creative Commons licenses on the simple item page I also decided that I don’t like the Impact Area icons as a component because they don’t have any visual meaning 2023-07-04 Focus group meeting with CGSpace partners about DSpace 7 I added a themed file selection component to the CGSpace theme It displays the bistream description instead of the file name, just like we did in DSpace 6 XMLUI I added a custom component to show share icons 2023-07-05 I spent some time trying to update OpenRXV from Angular 9 to 10 to 11 to 12 to 13 Most things work but there are some minor bugs it seems Mishell from CIP emailed me to say she was having problems approving an item on CGSpace Looking at PostgreSQL I saw there were a dozen or so locks that were several hours and even over one day old so I killed 
those processes and told her to try again 2023-07-06 Types meeting I wrote a Python script to check Unpaywall for some information about DOIs 2023-07-7 Continue exploring Unpaywall data for some of our DOIs In the past I’ve found their licensing information to not be very reliable (preferring Crossref), but I think their open access status is more reliable, especially when the provider is listed as being the publisher Even so, sometimes the version can be “acceptedVersion”, which is presumably the author’s version, as opposed to the “publishedVersion”, which means it’s available as open access on the publisher’s website I did some quality assurance and found ~100 that were marked as Limited Access, but should have been Open Access, and fixed a handful of licenses Delete duplicate metadata as described in my DSpace issue from last year: https://github." />
|
||||||
<meta property="og:type" content="article" />
|
<meta property="og:type" content="article" />
|
||||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2023-07/" />
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2023-07/" />
|
||||||
<meta property="article:published_time" content="2023-07-01T17:14:36+03:00" />
|
<meta property="article:published_time" content="2023-07-01T17:14:36+03:00" />
|
||||||
<meta property="article:modified_time" content="2023-08-02T23:04:11+03:00" />
|
<meta property="article:modified_time" content="2023-12-27T10:48:32+03:00" />
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
<meta name="twitter:card" content="summary"/>
|
<meta name="twitter:card" content="summary"/>
|
||||||
<meta name="twitter:title" content="July, 2023"/>
|
<meta name="twitter:title" content="July, 2023"/>
|
||||||
<meta name="twitter:description" content="2023-07-01 Export CGSpace to check for missing Initiative collection mappings Start harvesting on AReS 2023-07-02 Minor edits to the crossref_doi_lookup.py script while running some checks from 22,000 CGSpace DOIs 2023-07-03 I analyzed the licenses declared by Crossref and found with high confidence that ~400 of ours were incorrect I took the more accurate ones from Crossref and updated the items on CGSpace I took a few hundred ISBNs as well for where we were missing them I also tagged ~4,700 items with missing licenses as “Copyrighted; all rights reserved” based on their Crossref license status being TDM, mostly from Elsevier, Wiley, and Springer Checking a dozen or so manually, I confirmed that if Crossref only has a TDM license then it’s usually copyrighted (could still be open access, but we can’t tell via Crossref) I would be curious to write a script to check the Unpaywall API for open access status… In the past I found that their license status was not very accurate, but the open access status might be more reliable More minor work on the DSpace 7 item views I learned some new Angular template syntax I created a custom component to show Creative Commons licenses on the simple item page I also decided that I don’t like the Impact Area icons as a component because they don’t have any visual meaning 2023-07-04 Focus group meeting with CGSpace partners about DSpace 7 I added a themed file selection component to the CGSpace theme It displays the bistream description instead of the file name, just like we did in DSpace 6 XMLUI I added a custom component to show share icons 2023-07-05 I spent some time trying to update OpenRXV from Angular 9 to 10 to 11 to 12 to 13 Most things work but there are some minor bugs it seems Mishell from CIP emailed me to say she was having problems approving an item on CGSpace Looking at PostgreSQL I saw there were a dozen or so locks that were several hours and even over one day old so I 
killed those processes and told her to try again 2023-07-06 Types meeting I wrote a Python script to check Unpaywall for some information about DOIs 2023-07-7 Continue exploring Unpaywall data for some of our DOIs In the past I’ve found their licensing information to not be very reliable (preferring Crossref), but I think their open access status is more reliable, especially when the provider is listed as being the publisher Even so, sometimes the version can be “acceptedVersion”, which is presumably the author’s version, as opposed to the “publishedVersion”, which means it’s available as open access on the publisher’s website I did some quality assurance and found ~100 that were marked as Limited Access, but should have been Open Access, and fixed a handful of licenses Delete duplicate metadata as describe in my DSpace issue from last year: https://github."/>
|
<meta name="twitter:description" content="2023-07-01 Export CGSpace to check for missing Initiative collection mappings Start harvesting on AReS 2023-07-02 Minor edits to the crossref_doi_lookup.py script while running some checks from 22,000 CGSpace DOIs 2023-07-03 I analyzed the licenses declared by Crossref and found with high confidence that ~400 of ours were incorrect I took the more accurate ones from Crossref and updated the items on CGSpace I took a few hundred ISBNs as well for where we were missing them I also tagged ~4,700 items with missing licenses as “Copyrighted; all rights reserved” based on their Crossref license status being TDM, mostly from Elsevier, Wiley, and Springer Checking a dozen or so manually, I confirmed that if Crossref only has a TDM license then it’s usually copyrighted (could still be open access, but we can’t tell via Crossref) I would be curious to write a script to check the Unpaywall API for open access status… In the past I found that their license status was not very accurate, but the open access status might be more reliable More minor work on the DSpace 7 item views I learned some new Angular template syntax I created a custom component to show Creative Commons licenses on the simple item page I also decided that I don’t like the Impact Area icons as a component because they don’t have any visual meaning 2023-07-04 Focus group meeting with CGSpace partners about DSpace 7 I added a themed file selection component to the CGSpace theme It displays the bistream description instead of the file name, just like we did in DSpace 6 XMLUI I added a custom component to show share icons 2023-07-05 I spent some time trying to update OpenRXV from Angular 9 to 10 to 11 to 12 to 13 Most things work but there are some minor bugs it seems Mishell from CIP emailed me to say she was having problems approving an item on CGSpace Looking at PostgreSQL I saw there were a dozen or so locks that were several hours and even over one day old so I 
killed those processes and told her to try again 2023-07-06 Types meeting I wrote a Python script to check Unpaywall for some information about DOIs 2023-07-7 Continue exploring Unpaywall data for some of our DOIs In the past I’ve found their licensing information to not be very reliable (preferring Crossref), but I think their open access status is more reliable, especially when the provider is listed as being the publisher Even so, sometimes the version can be “acceptedVersion”, which is presumably the author’s version, as opposed to the “publishedVersion”, which means it’s available as open access on the publisher’s website I did some quality assurance and found ~100 that were marked as Limited Access, but should have been Open Access, and fixed a handful of licenses Delete duplicate metadata as described in my DSpace issue from last year: https://github."/>
|
||||||
<meta name="generator" content="Hugo 0.121.1">
|
<meta name="generator" content="Hugo 0.121.1">
|
||||||
|
|
||||||
|
|
||||||
@ -30,7 +30,7 @@
|
|||||||
"url": "https://alanorth.github.io/cgspace-notes/2023-07/",
|
"url": "https://alanorth.github.io/cgspace-notes/2023-07/",
|
||||||
"wordCount": "2255",
|
"wordCount": "2255",
|
||||||
"datePublished": "2023-07-01T17:14:36+03:00",
|
"datePublished": "2023-07-01T17:14:36+03:00",
|
||||||
"dateModified": "2023-08-02T23:04:11+03:00",
|
"dateModified": "2023-12-27T10:48:32+03:00",
|
||||||
"author": {
|
"author": {
|
||||||
"@type": "Person",
|
"@type": "Person",
|
||||||
"name": "Alan Orth"
|
"name": "Alan Orth"
|
||||||
@ -170,7 +170,7 @@
|
|||||||
<li>I did some quality assurance and found ~100 that were marked as Limited Access, but should have been Open Access, and fixed a handful of licenses</li>
|
<li>I did some quality assurance and found ~100 that were marked as Limited Access, but should have been Open Access, and fixed a handful of licenses</li>
|
||||||
</ul>
|
</ul>
|
||||||
</li>
|
</li>
|
||||||
<li>Delete duplicate metadata as describe in my DSpace issue from last year: <a href="https://github.com/DSpace/DSpace/issues/8253">https://github.com/DSpace/DSpace/issues/8253</a></li>
|
<li>Delete duplicate metadata as described in my DSpace issue from last year: <a href="https://github.com/DSpace/DSpace/issues/8253">https://github.com/DSpace/DSpace/issues/8253</a></li>
|
||||||
<li>Start working on some statistics on AGROVOC usage for my presentation next week
|
<li>Start working on some statistics on AGROVOC usage for my presentation next week
|
||||||
<ul>
|
<ul>
|
||||||
<li>I used the following SQL query to dump values from all subject fields and lower case them:</li>
|
<li>I used the following SQL query to dump values from all subject fields and lower case them:</li>
|
||||||
|
@ -11,7 +11,7 @@
|
|||||||
<meta property="og:type" content="article" />
|
<meta property="og:type" content="article" />
|
||||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2023-12/" />
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2023-12/" />
|
||||||
<meta property="article:published_time" content="2023-12-01T08:48:36+03:00" />
|
<meta property="article:published_time" content="2023-12-01T08:48:36+03:00" />
|
||||||
<meta property="article:modified_time" content="2023-12-18T23:15:27+03:00" />
|
<meta property="article:modified_time" content="2023-12-21T10:08:59+03:00" />
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
@ -28,9 +28,9 @@
|
|||||||
"@type": "BlogPosting",
|
"@type": "BlogPosting",
|
||||||
"headline": "December, 2023",
|
"headline": "December, 2023",
|
||||||
"url": "https://alanorth.github.io/cgspace-notes/2023-12/",
|
"url": "https://alanorth.github.io/cgspace-notes/2023-12/",
|
||||||
"wordCount": "980",
|
"wordCount": "1323",
|
||||||
"datePublished": "2023-12-01T08:48:36+03:00",
|
"datePublished": "2023-12-01T08:48:36+03:00",
|
||||||
"dateModified": "2023-12-18T23:15:27+03:00",
|
"dateModified": "2023-12-21T10:08:59+03:00",
|
||||||
"author": {
|
"author": {
|
||||||
"@type": "Person",
|
"@type": "Person",
|
||||||
"name": "Alan Orth"
|
"name": "Alan Orth"
|
||||||
@ -296,7 +296,97 @@
|
|||||||
</span></span><span style="display:flex;"><span>UPDATE 462
|
</span></span><span style="display:flex;"><span>UPDATE 462
|
||||||
</span></span><span style="display:flex;"><span>dspace=*# COMMIT;
|
</span></span><span style="display:flex;"><span>dspace=*# COMMIT;
|
||||||
</span></span><span style="display:flex;"><span>COMMIT
|
</span></span><span style="display:flex;"><span>COMMIT
|
||||||
</span></span></code></pre></div><!-- raw HTML omitted -->
|
</span></span></code></pre></div><h2 id="2023-12-25">2023-12-25</h2>
|
||||||
|
<ul>
|
||||||
|
<li>Looking into <a href="https://solr.apache.org/guide/8_11/making-and-restoring-backups.html">Solr backups</a>
|
||||||
|
<ul>
|
||||||
|
<li>Since we are not running in Solr Cloud mode we need to use the replication endpoint for Solr standalone</li>
|
||||||
|
<li>This works:</li>
|
||||||
|
</ul>
|
||||||
|
</li>
|
||||||
|
</ul>
|
||||||
|
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ curl <span style="color:#e6db74">'http://localhost:8983/solr/statistics/replication?command=backup'</span>
|
||||||
|
</span></span><span style="display:flex;"><span>{
|
||||||
|
</span></span><span style="display:flex;"><span> "responseHeader":{
|
||||||
|
</span></span><span style="display:flex;"><span> "status":0,
|
||||||
|
</span></span><span style="display:flex;"><span> "QTime":26},
|
||||||
|
</span></span><span style="display:flex;"><span> "status":"OK"}
|
||||||
|
</span></span></code></pre></div><ul>
|
||||||
|
<li>Then I saw the size of the snapshot reach the size of the index…</li>
|
||||||
|
</ul>
|
||||||
|
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span># du -sh /var/solr/data/configsets/statistics/data/*
|
||||||
|
</span></span><span style="display:flex;"><span>22G /var/solr/data/configsets/statistics/data/index
|
||||||
|
</span></span><span style="display:flex;"><span>16G /var/solr/data/configsets/statistics/data/snapshot.20231225074111671
|
||||||
|
</span></span><span style="display:flex;"><span>4.0K /var/solr/data/configsets/statistics/data/snapshot_metadata
|
||||||
|
</span></span><span style="display:flex;"><span># du -sh /var/solr/data/configsets/statistics/data/*
|
||||||
|
</span></span><span style="display:flex;"><span>22G /var/solr/data/configsets/statistics/data/index
|
||||||
|
</span></span><span style="display:flex;"><span>20G /var/solr/data/configsets/statistics/data/snapshot.20231225074111671
|
||||||
|
</span></span><span style="display:flex;"><span>4.0K /var/solr/data/configsets/statistics/data/snapshot_metadata
|
||||||
|
</span></span><span style="display:flex;"><span># du -sh /var/solr/data/configsets/statistics/data/*
|
||||||
|
</span></span><span style="display:flex;"><span>22G /var/solr/data/configsets/statistics/data/index
|
||||||
|
</span></span><span style="display:flex;"><span>21G /var/solr/data/configsets/statistics/data/snapshot.20231225074111671
|
||||||
|
</span></span><span style="display:flex;"><span>4.0K /var/solr/data/configsets/statistics/data/snapshot_metadata
|
||||||
|
</span></span><span style="display:flex;"><span># du -sh /var/solr/data/configsets/statistics/data/*
|
||||||
|
</span></span><span style="display:flex;"><span>22G /var/solr/data/configsets/statistics/data/index
|
||||||
|
</span></span><span style="display:flex;"><span>22G /var/solr/data/configsets/statistics/data/snapshot.20231225074111671
|
||||||
|
</span></span><span style="display:flex;"><span>4.0K /var/solr/data/configsets/statistics/data/snapshot_metadata
|
||||||
|
</span></span></code></pre></div><ul>
|
||||||
|
<li>Then I deleted all documents from the core and restored from the snapshot backup:</li>
|
||||||
|
</ul>
|
||||||
|
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span>$ curl http://localhost:8983/solr/statistics/update -H <span style="color:#e6db74">"Content-type: text/xml"</span> --data-binary <span style="color:#e6db74">'<delete><query>*:*</query></delete>'</span>
|
||||||
|
</span></span><span style="display:flex;"><span>$ curl http://localhost:8983/solr/statistics/update -H <span style="color:#e6db74">"Content-type: text/xml"</span> --data-binary <span style="color:#e6db74">'<commit />'</span>
|
||||||
|
</span></span><span style="display:flex;"><span>$ curl <span style="color:#e6db74">'http://localhost:8983/solr/statistics/replication?command=restore&name=statistics'</span>
|
||||||
|
</span></span></code></pre></div><ul>
|
||||||
|
<li>Interestingly the import worked fine, but created a new data index:</li>
|
||||||
|
</ul>
|
||||||
|
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-console" data-lang="console"><span style="display:flex;"><span># du -sh /var/solr/data/configsets/statistics/data/*
|
||||||
|
</span></span><span style="display:flex;"><span>4.0K /var/solr/data/configsets/statistics/data/index.properties
|
||||||
|
</span></span><span style="display:flex;"><span>22G /var/solr/data/configsets/statistics/data/restore.20231225154626463
|
||||||
|
</span></span><span style="display:flex;"><span>4.0K /var/solr/data/configsets/statistics/data/snapshot_metadata
|
||||||
|
</span></span><span style="display:flex;"><span>22G /var/solr/data/configsets/statistics/data/snapshot.statistics
|
||||||
|
</span></span></code></pre></div><ul>
|
||||||
|
<li>Not sure of the implications of that, but Solr uses the data just fine</li>
|
||||||
|
<li>I can surely use this for atomic Solr backups</li>
|
||||||
|
</ul>
|
||||||
|
<h2 id="2023-12-27">2023-12-27</h2>
|
||||||
|
<ul>
|
||||||
|
<li>Delete duplicate metadata as described in my DSpace issue from last year: <a href="https://github.com/DSpace/DSpace/issues/8253">https://github.com/DSpace/DSpace/issues/8253</a></li>
|
||||||
|
<li>Do some other metadata cleanups on CGSpace
|
||||||
|
<ul>
|
||||||
|
<li>I also looked up our DOIs on Crossref to get some missing abstracts and correct licenses and dates</li>
|
||||||
|
</ul>
|
||||||
|
</li>
|
||||||
|
<li>Some minor work on the CGSpace DSpace 7 theme to fix the navbar on mobile</li>
|
||||||
|
<li>Some work on the IFPRI ISNAR archive</li>
|
||||||
|
</ul>
|
||||||
|
<h2 id="2023-12-28">2023-12-28</h2>
|
||||||
|
<ul>
|
||||||
|
<li>I started porting the <a href="https://github.com/ilri/cgspace-java-helpers">cgspace-java-helpers</a> to DSpace 7</li>
|
||||||
|
<li>Some work on the IFPRI ISNAR archive
|
||||||
|
<ul>
|
||||||
|
<li>I ended up going through most of the PDFs to get better dates and abstracts</li>
|
||||||
|
</ul>
|
||||||
|
</li>
|
||||||
|
</ul>
|
||||||
|
<h2 id="2023-12-29">2023-12-29</h2>
|
||||||
|
<ul>
|
||||||
|
<li>I created a new Hetzner server to replace the current DSpace 6 CGSpace next week when we migrate to DSpace 7</li>
|
||||||
|
<li>Interesting, I haven’t checked for content pointing to legacy domains in several years (!)
|
||||||
|
<ul>
|
||||||
|
<li><code>inurl:mahider.cgiar.org</code>: 0 results on Google!</li>
|
||||||
|
<li><code>inurl:mahider.ilri.org</code>: 2,100 results on Google</li>
|
||||||
|
<li><code>inurl:mahider.ilri.org inurl:https</code>: 2 results on Google (!)</li>
|
||||||
|
<li><code>inurl:dspace.ilri.org</code>: 1,390 results on Google</li>
|
||||||
|
<li><code>inurl:dspace.ilri.org inurl:https</code>: 0 results on Google (!)</li>
|
||||||
|
</ul>
|
||||||
|
</li>
|
||||||
|
<li>So it seems I can do away with the HTTPS virtual hosts finally
|
||||||
|
<ul>
|
||||||
|
<li>Well my current certificates expired on 2021-02-13 and nobody noticed… so…</li>
|
||||||
|
</ul>
|
||||||
|
</li>
|
||||||
|
</ul>
|
||||||
|
<!-- raw HTML omitted -->
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
@ -10,7 +10,7 @@
|
|||||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||||
<meta property="og:type" content="website" />
|
<meta property="og:type" content="website" />
|
||||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/" />
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/" />
|
||||||
<meta property="og:updated_time" content="2023-12-18T23:15:27+03:00" />
|
<meta property="og:updated_time" content="2023-12-27T10:48:32+03:00" />
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
@ -10,7 +10,7 @@
|
|||||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||||
<meta property="og:type" content="website" />
|
<meta property="og:type" content="website" />
|
||||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
|
||||||
<meta property="og:updated_time" content="2023-12-18T23:15:27+03:00" />
|
<meta property="og:updated_time" content="2023-12-27T10:48:32+03:00" />
@ -213,7 +213,7 @@
|
|||||||
|
|
||||||
</p>
|
</p>
|
||||||
</header>
|
</header>
|
||||||
2023-07-01 Export CGSpace to check for missing Initiative collection mappings Start harvesting on AReS 2023-07-02 Minor edits to the crossref_doi_lookup.py script while running some checks from 22,000 CGSpace DOIs 2023-07-03 I analyzed the licenses declared by Crossref and found with high confidence that ~400 of ours were incorrect I took the more accurate ones from Crossref and updated the items on CGSpace I took a few hundred ISBNs as well for where we were missing them I also tagged ~4,700 items with missing licenses as “Copyrighted; all rights reserved” based on their Crossref license status being TDM, mostly from Elsevier, Wiley, and Springer Checking a dozen or so manually, I confirmed that if Crossref only has a TDM license then it’s usually copyrighted (could still be open access, but we can’t tell via Crossref) I would be curious to write a script to check the Unpaywall API for open access status… In the past I found that their license status was not very accurate, but the open access status might be more reliable More minor work on the DSpace 7 item views I learned some new Angular template syntax I created a custom component to show Creative Commons licenses on the simple item page I also decided that I don’t like the Impact Area icons as a component because they don’t have any visual meaning 2023-07-04 Focus group meeting with CGSpace partners about DSpace 7 I added a themed file selection component to the CGSpace theme It displays the bistream description instead of the file name, just like we did in DSpace 6 XMLUI I added a custom component to show share icons 2023-07-05 I spent some time trying to update OpenRXV from Angular 9 to 10 to 11 to 12 to 13 Most things work but there are some minor bugs it seems Mishell from CIP emailed me to say she was having problems approving an item on CGSpace Looking at PostgreSQL I saw there were a dozen or so locks that were several hours and even over one day old so I killed those processes and told her to try 
again 2023-07-06 Types meeting I wrote a Python script to check Unpaywall for some information about DOIs 2023-07-7 Continue exploring Unpaywall data for some of our DOIs In the past I’ve found their licensing information to not be very reliable (preferring Crossref), but I think their open access status is more reliable, especially when the provider is listed as being the publisher Even so, sometimes the version can be “acceptedVersion”, which is presumably the author’s version, as opposed to the “publishedVersion”, which means it’s available as open access on the publisher’s website I did some quality assurance and found ~100 that were marked as Limited Access, but should have been Open Access, and fixed a handful of licenses Delete duplicate metadata as describe in my DSpace issue from last year: https://github.
|
2023-07-01 Export CGSpace to check for missing Initiative collection mappings Start harvesting on AReS 2023-07-02 Minor edits to the crossref_doi_lookup.py script while running some checks from 22,000 CGSpace DOIs 2023-07-03 I analyzed the licenses declared by Crossref and found with high confidence that ~400 of ours were incorrect I took the more accurate ones from Crossref and updated the items on CGSpace I took a few hundred ISBNs as well for where we were missing them I also tagged ~4,700 items with missing licenses as “Copyrighted; all rights reserved” based on their Crossref license status being TDM, mostly from Elsevier, Wiley, and Springer Checking a dozen or so manually, I confirmed that if Crossref only has a TDM license then it’s usually copyrighted (could still be open access, but we can’t tell via Crossref) I would be curious to write a script to check the Unpaywall API for open access status… In the past I found that their license status was not very accurate, but the open access status might be more reliable More minor work on the DSpace 7 item views I learned some new Angular template syntax I created a custom component to show Creative Commons licenses on the simple item page I also decided that I don’t like the Impact Area icons as a component because they don’t have any visual meaning 2023-07-04 Focus group meeting with CGSpace partners about DSpace 7 I added a themed file selection component to the CGSpace theme It displays the bistream description instead of the file name, just like we did in DSpace 6 XMLUI I added a custom component to show share icons 2023-07-05 I spent some time trying to update OpenRXV from Angular 9 to 10 to 11 to 12 to 13 Most things work but there are some minor bugs it seems Mishell from CIP emailed me to say she was having problems approving an item on CGSpace Looking at PostgreSQL I saw there were a dozen or so locks that were several hours and even over one day old so I killed those processes and told her to try 
again 2023-07-06 Types meeting I wrote a Python script to check Unpaywall for some information about DOIs 2023-07-7 Continue exploring Unpaywall data for some of our DOIs In the past I’ve found their licensing information to not be very reliable (preferring Crossref), but I think their open access status is more reliable, especially when the provider is listed as being the publisher Even so, sometimes the version can be “acceptedVersion”, which is presumably the author’s version, as opposed to the “publishedVersion”, which means it’s available as open access on the publisher’s website I did some quality assurance and found ~100 that were marked as Limited Access, but should have been Open Access, and fixed a handful of licenses Delete duplicate metadata as described in my DSpace issue from last year: https://github.
|
||||||
<a href='https://alanorth.github.io/cgspace-notes/2023-07/'>Read more →</a>
|
<a href='https://alanorth.github.io/cgspace-notes/2023-07/'>Read more →</a>
|
||||||
</article>
|
</article>
@ -48,7 +48,7 @@
|
|||||||
<link>https://alanorth.github.io/cgspace-notes/2023-07/</link>
|
<link>https://alanorth.github.io/cgspace-notes/2023-07/</link>
|
||||||
<pubDate>Sat, 01 Jul 2023 17:14:36 +0300</pubDate>
|
<pubDate>Sat, 01 Jul 2023 17:14:36 +0300</pubDate>
|
||||||
<guid>https://alanorth.github.io/cgspace-notes/2023-07/</guid>
|
<guid>https://alanorth.github.io/cgspace-notes/2023-07/</guid>
|
||||||
<description>2023-07-01 Export CGSpace to check for missing Initiative collection mappings Start harvesting on AReS 2023-07-02 Minor edits to the crossref_doi_lookup.py script while running some checks from 22,000 CGSpace DOIs 2023-07-03 I analyzed the licenses declared by Crossref and found with high confidence that ~400 of ours were incorrect I took the more accurate ones from Crossref and updated the items on CGSpace I took a few hundred ISBNs as well for where we were missing them I also tagged ~4,700 items with missing licenses as &ldquo;Copyrighted; all rights reserved&rdquo; based on their Crossref license status being TDM, mostly from Elsevier, Wiley, and Springer Checking a dozen or so manually, I confirmed that if Crossref only has a TDM license then it&rsquo;s usually copyrighted (could still be open access, but we can&rsquo;t tell via Crossref) I would be curious to write a script to check the Unpaywall API for open access status&hellip; In the past I found that their license status was not very accurate, but the open access status might be more reliable More minor work on the DSpace 7 item views I learned some new Angular template syntax I created a custom component to show Creative Commons licenses on the simple item page I also decided that I don&rsquo;t like the Impact Area icons as a component because they don&rsquo;t have any visual meaning 2023-07-04 Focus group meeting with CGSpace partners about DSpace 7 I added a themed file selection component to the CGSpace theme It displays the bistream description instead of the file name, just like we did in DSpace 6 XMLUI I added a custom component to show share icons 2023-07-05 I spent some time trying to update OpenRXV from Angular 9 to 10 to 11 to 12 to 13 Most things work but there are some minor bugs it seems Mishell from CIP emailed me to say she was having problems approving an item on CGSpace Looking at PostgreSQL I saw there were a dozen or so locks that were several hours and even over one day 
old so I killed those processes and told her to try again 2023-07-06 Types meeting I wrote a Python script to check Unpaywall for some information about DOIs 2023-07-7 Continue exploring Unpaywall data for some of our DOIs In the past I&rsquo;ve found their licensing information to not be very reliable (preferring Crossref), but I think their open access status is more reliable, especially when the provider is listed as being the publisher Even so, sometimes the version can be &ldquo;acceptedVersion&rdquo;, which is presumably the author&rsquo;s version, as opposed to the &ldquo;publishedVersion&rdquo;, which means it&rsquo;s available as open access on the publisher&rsquo;s website I did some quality assurance and found ~100 that were marked as Limited Access, but should have been Open Access, and fixed a handful of licenses Delete duplicate metadata as describe in my DSpace issue from last year: https://github.</description>
|
<description>2023-07-01 Export CGSpace to check for missing Initiative collection mappings Start harvesting on AReS 2023-07-02 Minor edits to the crossref_doi_lookup.py script while running some checks from 22,000 CGSpace DOIs 2023-07-03 I analyzed the licenses declared by Crossref and found with high confidence that ~400 of ours were incorrect I took the more accurate ones from Crossref and updated the items on CGSpace I took a few hundred ISBNs as well for where we were missing them I also tagged ~4,700 items with missing licenses as &ldquo;Copyrighted; all rights reserved&rdquo; based on their Crossref license status being TDM, mostly from Elsevier, Wiley, and Springer Checking a dozen or so manually, I confirmed that if Crossref only has a TDM license then it&rsquo;s usually copyrighted (could still be open access, but we can&rsquo;t tell via Crossref) I would be curious to write a script to check the Unpaywall API for open access status&hellip; In the past I found that their license status was not very accurate, but the open access status might be more reliable More minor work on the DSpace 7 item views I learned some new Angular template syntax I created a custom component to show Creative Commons licenses on the simple item page I also decided that I don&rsquo;t like the Impact Area icons as a component because they don&rsquo;t have any visual meaning 2023-07-04 Focus group meeting with CGSpace partners about DSpace 7 I added a themed file selection component to the CGSpace theme It displays the bistream description instead of the file name, just like we did in DSpace 6 XMLUI I added a custom component to show share icons 2023-07-05 I spent some time trying to update OpenRXV from Angular 9 to 10 to 11 to 12 to 13 Most things work but there are some minor bugs it seems Mishell from CIP emailed me to say she was having problems approving an item on CGSpace Looking at PostgreSQL I saw there were a dozen or so locks that were several hours and even over one day 
old so I killed those processes and told her to try again 2023-07-06 Types meeting I wrote a Python script to check Unpaywall for some information about DOIs 2023-07-7 Continue exploring Unpaywall data for some of our DOIs In the past I&rsquo;ve found their licensing information to not be very reliable (preferring Crossref), but I think their open access status is more reliable, especially when the provider is listed as being the publisher Even so, sometimes the version can be &ldquo;acceptedVersion&rdquo;, which is presumably the author&rsquo;s version, as opposed to the &ldquo;publishedVersion&rdquo;, which means it&rsquo;s available as open access on the publisher&rsquo;s website I did some quality assurance and found ~100 that were marked as Limited Access, but should have been Open Access, and fixed a handful of licenses Delete duplicate metadata as described in my DSpace issue from last year: https://github.</description>
|
||||||
</item>
|
</item>
|
||||||
<item>
|
<item>
|
||||||
<title>June, 2023</title>
|
<title>June, 2023</title>
|
||||||
|
@ -10,7 +10,7 @@
|
|||||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||||
<meta property="og:type" content="website" />
|
<meta property="og:type" content="website" />
|
||||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
|
||||||
<meta property="og:updated_time" content="2023-12-18T23:15:27+03:00" />
|
<meta property="og:updated_time" content="2023-12-27T10:48:32+03:00" />
@ -10,7 +10,7 @@
|
|||||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||||
<meta property="og:type" content="website" />
|
<meta property="og:type" content="website" />
|
||||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
|
||||||
<meta property="og:updated_time" content="2023-12-18T23:15:27+03:00" />
|
<meta property="og:updated_time" content="2023-12-27T10:48:32+03:00" />
@ -10,7 +10,7 @@
|
|||||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||||
<meta property="og:type" content="website" />
|
<meta property="og:type" content="website" />
|
||||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
|
||||||
<meta property="og:updated_time" content="2023-12-18T23:15:27+03:00" />
|
<meta property="og:updated_time" content="2023-12-27T10:48:32+03:00" />
@ -10,7 +10,7 @@
|
|||||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||||
<meta property="og:type" content="website" />
|
<meta property="og:type" content="website" />
|
||||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
|
||||||
<meta property="og:updated_time" content="2023-12-18T23:15:27+03:00" />
|
<meta property="og:updated_time" content="2023-12-27T10:48:32+03:00" />
@ -10,7 +10,7 @@
|
|||||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||||
<meta property="og:type" content="website" />
|
<meta property="og:type" content="website" />
|
||||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
|
||||||
<meta property="og:updated_time" content="2023-12-18T23:15:27+03:00" />
|
<meta property="og:updated_time" content="2023-12-27T10:48:32+03:00" />
@ -10,7 +10,7 @@
|
|||||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||||
<meta property="og:type" content="website" />
|
<meta property="og:type" content="website" />
|
||||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
|
||||||
<meta property="og:updated_time" content="2023-12-18T23:15:27+03:00" />
|
<meta property="og:updated_time" content="2023-12-27T10:48:32+03:00" />
@ -10,7 +10,7 @@
|
|||||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||||
<meta property="og:type" content="website" />
|
<meta property="og:type" content="website" />
|
||||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
|
||||||
<meta property="og:updated_time" content="2023-12-18T23:15:27+03:00" />
|
<meta property="og:updated_time" content="2023-12-27T10:48:32+03:00" />
@ -10,7 +10,7 @@
|
|||||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||||
<meta property="og:type" content="website" />
|
<meta property="og:type" content="website" />
|
||||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
|
||||||
<meta property="og:updated_time" content="2023-12-18T23:15:27+03:00" />
|
<meta property="og:updated_time" content="2023-12-27T10:48:32+03:00" />
@ -228,7 +228,7 @@
|
|||||||
|
|
||||||
</p>
|
</p>
|
||||||
</header>
|
</header>
|
||||||
2023-07-01 Export CGSpace to check for missing Initiative collection mappings Start harvesting on AReS 2023-07-02 Minor edits to the crossref_doi_lookup.py script while running some checks from 22,000 CGSpace DOIs 2023-07-03 I analyzed the licenses declared by Crossref and found with high confidence that ~400 of ours were incorrect I took the more accurate ones from Crossref and updated the items on CGSpace I took a few hundred ISBNs as well for where we were missing them I also tagged ~4,700 items with missing licenses as “Copyrighted; all rights reserved” based on their Crossref license status being TDM, mostly from Elsevier, Wiley, and Springer Checking a dozen or so manually, I confirmed that if Crossref only has a TDM license then it’s usually copyrighted (could still be open access, but we can’t tell via Crossref) I would be curious to write a script to check the Unpaywall API for open access status… In the past I found that their license status was not very accurate, but the open access status might be more reliable More minor work on the DSpace 7 item views I learned some new Angular template syntax I created a custom component to show Creative Commons licenses on the simple item page I also decided that I don’t like the Impact Area icons as a component because they don’t have any visual meaning 2023-07-04 Focus group meeting with CGSpace partners about DSpace 7 I added a themed file selection component to the CGSpace theme It displays the bistream description instead of the file name, just like we did in DSpace 6 XMLUI I added a custom component to show share icons 2023-07-05 I spent some time trying to update OpenRXV from Angular 9 to 10 to 11 to 12 to 13 Most things work but there are some minor bugs it seems Mishell from CIP emailed me to say she was having problems approving an item on CGSpace Looking at PostgreSQL I saw there were a dozen or so locks that were several hours and even over one day old so I killed those processes and told her to try 
again 2023-07-06 Types meeting I wrote a Python script to check Unpaywall for some information about DOIs 2023-07-7 Continue exploring Unpaywall data for some of our DOIs In the past I’ve found their licensing information to not be very reliable (preferring Crossref), but I think their open access status is more reliable, especially when the provider is listed as being the publisher Even so, sometimes the version can be “acceptedVersion”, which is presumably the author’s version, as opposed to the “publishedVersion”, which means it’s available as open access on the publisher’s website I did some quality assurance and found ~100 that were marked as Limited Access, but should have been Open Access, and fixed a handful of licenses Delete duplicate metadata as describe in my DSpace issue from last year: https://github.
|
2023-07-01 Export CGSpace to check for missing Initiative collection mappings Start harvesting on AReS 2023-07-02 Minor edits to the crossref_doi_lookup.py script while running some checks from 22,000 CGSpace DOIs 2023-07-03 I analyzed the licenses declared by Crossref and found with high confidence that ~400 of ours were incorrect I took the more accurate ones from Crossref and updated the items on CGSpace I took a few hundred ISBNs as well for where we were missing them I also tagged ~4,700 items with missing licenses as “Copyrighted; all rights reserved” based on their Crossref license status being TDM, mostly from Elsevier, Wiley, and Springer Checking a dozen or so manually, I confirmed that if Crossref only has a TDM license then it’s usually copyrighted (could still be open access, but we can’t tell via Crossref) I would be curious to write a script to check the Unpaywall API for open access status… In the past I found that their license status was not very accurate, but the open access status might be more reliable More minor work on the DSpace 7 item views I learned some new Angular template syntax I created a custom component to show Creative Commons licenses on the simple item page I also decided that I don’t like the Impact Area icons as a component because they don’t have any visual meaning 2023-07-04 Focus group meeting with CGSpace partners about DSpace 7 I added a themed file selection component to the CGSpace theme It displays the bistream description instead of the file name, just like we did in DSpace 6 XMLUI I added a custom component to show share icons 2023-07-05 I spent some time trying to update OpenRXV from Angular 9 to 10 to 11 to 12 to 13 Most things work but there are some minor bugs it seems Mishell from CIP emailed me to say she was having problems approving an item on CGSpace Looking at PostgreSQL I saw there were a dozen or so locks that were several hours and even over one day old so I killed those processes and told her to try 
again 2023-07-06 Types meeting I wrote a Python script to check Unpaywall for some information about DOIs 2023-07-7 Continue exploring Unpaywall data for some of our DOIs In the past I’ve found their licensing information to not be very reliable (preferring Crossref), but I think their open access status is more reliable, especially when the provider is listed as being the publisher Even so, sometimes the version can be “acceptedVersion”, which is presumably the author’s version, as opposed to the “publishedVersion”, which means it’s available as open access on the publisher’s website I did some quality assurance and found ~100 that were marked as Limited Access, but should have been Open Access, and fixed a handful of licenses Delete duplicate metadata as described in my DSpace issue from last year: https://github.
|
||||||
<a href='https://alanorth.github.io/cgspace-notes/2023-07/'>Read more →</a>
|
<a href='https://alanorth.github.io/cgspace-notes/2023-07/'>Read more →</a>
|
||||||
</article>
|
</article>
@ -48,7 +48,7 @@
|
|||||||
<link>https://alanorth.github.io/cgspace-notes/2023-07/</link>
|
<link>https://alanorth.github.io/cgspace-notes/2023-07/</link>
|
||||||
<pubDate>Sat, 01 Jul 2023 17:14:36 +0300</pubDate>
|
<pubDate>Sat, 01 Jul 2023 17:14:36 +0300</pubDate>
|
||||||
<guid>https://alanorth.github.io/cgspace-notes/2023-07/</guid>
|
<guid>https://alanorth.github.io/cgspace-notes/2023-07/</guid>
|
||||||
<description>2023-07-01 Export CGSpace to check for missing Initiative collection mappings Start harvesting on AReS 2023-07-02 Minor edits to the crossref_doi_lookup.py script while running some checks from 22,000 CGSpace DOIs 2023-07-03 I analyzed the licenses declared by Crossref and found with high confidence that ~400 of ours were incorrect I took the more accurate ones from Crossref and updated the items on CGSpace I took a few hundred ISBNs as well for where we were missing them I also tagged ~4,700 items with missing licenses as &ldquo;Copyrighted; all rights reserved&rdquo; based on their Crossref license status being TDM, mostly from Elsevier, Wiley, and Springer Checking a dozen or so manually, I confirmed that if Crossref only has a TDM license then it&rsquo;s usually copyrighted (could still be open access, but we can&rsquo;t tell via Crossref) I would be curious to write a script to check the Unpaywall API for open access status&hellip; In the past I found that their license status was not very accurate, but the open access status might be more reliable More minor work on the DSpace 7 item views I learned some new Angular template syntax I created a custom component to show Creative Commons licenses on the simple item page I also decided that I don&rsquo;t like the Impact Area icons as a component because they don&rsquo;t have any visual meaning 2023-07-04 Focus group meeting with CGSpace partners about DSpace 7 I added a themed file selection component to the CGSpace theme It displays the bistream description instead of the file name, just like we did in DSpace 6 XMLUI I added a custom component to show share icons 2023-07-05 I spent some time trying to update OpenRXV from Angular 9 to 10 to 11 to 12 to 13 Most things work but there are some minor bugs it seems Mishell from CIP emailed me to say she was having problems approving an item on CGSpace Looking at PostgreSQL I saw there were a dozen or so locks that were several hours and even over one day 
old so I killed those processes and told her to try again 2023-07-06 Types meeting I wrote a Python script to check Unpaywall for some information about DOIs 2023-07-7 Continue exploring Unpaywall data for some of our DOIs In the past I&rsquo;ve found their licensing information to not be very reliable (preferring Crossref), but I think their open access status is more reliable, especially when the provider is listed as being the publisher Even so, sometimes the version can be &ldquo;acceptedVersion&rdquo;, which is presumably the author&rsquo;s version, as opposed to the &ldquo;publishedVersion&rdquo;, which means it&rsquo;s available as open access on the publisher&rsquo;s website I did some quality assurance and found ~100 that were marked as Limited Access, but should have been Open Access, and fixed a handful of licenses Delete duplicate metadata as describe in my DSpace issue from last year: https://github.</description>
|
<description>2023-07-01 Export CGSpace to check for missing Initiative collection mappings Start harvesting on AReS 2023-07-02 Minor edits to the crossref_doi_lookup.py script while running some checks from 22,000 CGSpace DOIs 2023-07-03 I analyzed the licenses declared by Crossref and found with high confidence that ~400 of ours were incorrect I took the more accurate ones from Crossref and updated the items on CGSpace I took a few hundred ISBNs as well for where we were missing them I also tagged ~4,700 items with missing licenses as &ldquo;Copyrighted; all rights reserved&rdquo; based on their Crossref license status being TDM, mostly from Elsevier, Wiley, and Springer Checking a dozen or so manually, I confirmed that if Crossref only has a TDM license then it&rsquo;s usually copyrighted (could still be open access, but we can&rsquo;t tell via Crossref) I would be curious to write a script to check the Unpaywall API for open access status&hellip; In the past I found that their license status was not very accurate, but the open access status might be more reliable More minor work on the DSpace 7 item views I learned some new Angular template syntax I created a custom component to show Creative Commons licenses on the simple item page I also decided that I don&rsquo;t like the Impact Area icons as a component because they don&rsquo;t have any visual meaning 2023-07-04 Focus group meeting with CGSpace partners about DSpace 7 I added a themed file selection component to the CGSpace theme It displays the bistream description instead of the file name, just like we did in DSpace 6 XMLUI I added a custom component to show share icons 2023-07-05 I spent some time trying to update OpenRXV from Angular 9 to 10 to 11 to 12 to 13 Most things work but there are some minor bugs it seems Mishell from CIP emailed me to say she was having problems approving an item on CGSpace Looking at PostgreSQL I saw there were a dozen or so locks that were several hours and even over one day 
old so I killed those processes and told her to try again 2023-07-06 Types meeting I wrote a Python script to check Unpaywall for some information about DOIs 2023-07-7 Continue exploring Unpaywall data for some of our DOIs In the past I&rsquo;ve found their licensing information to not be very reliable (preferring Crossref), but I think their open access status is more reliable, especially when the provider is listed as being the publisher Even so, sometimes the version can be &ldquo;acceptedVersion&rdquo;, which is presumably the author&rsquo;s version, as opposed to the &ldquo;publishedVersion&rdquo;, which means it&rsquo;s available as open access on the publisher&rsquo;s website I did some quality assurance and found ~100 that were marked as Limited Access, but should have been Open Access, and fixed a handful of licenses Delete duplicate metadata as described in my DSpace issue from last year: https://github.</description>
|
||||||
</item>
|
</item>
|
||||||
<item>
|
<item>
|
||||||
<title>June, 2023</title>
|
<title>June, 2023</title>
|
||||||
|
@ -10,7 +10,7 @@
|
|||||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||||
<meta property="og:type" content="website" />
|
<meta property="og:type" content="website" />
|
||||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
|
||||||
<meta property="og:updated_time" content="2023-12-18T23:15:27+03:00" />
|
<meta property="og:updated_time" content="2023-12-27T10:48:32+03:00" />
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
@ -10,7 +10,7 @@
|
|||||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||||
<meta property="og:type" content="website" />
|
<meta property="og:type" content="website" />
|
||||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
|
||||||
<meta property="og:updated_time" content="2023-12-18T23:15:27+03:00" />
|
<meta property="og:updated_time" content="2023-12-27T10:48:32+03:00" />
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
@ -10,7 +10,7 @@
|
|||||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||||
<meta property="og:type" content="website" />
|
<meta property="og:type" content="website" />
|
||||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
|
||||||
<meta property="og:updated_time" content="2023-12-18T23:15:27+03:00" />
|
<meta property="og:updated_time" content="2023-12-27T10:48:32+03:00" />
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
@ -10,7 +10,7 @@
|
|||||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||||
<meta property="og:type" content="website" />
|
<meta property="og:type" content="website" />
|
||||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
|
||||||
<meta property="og:updated_time" content="2023-12-18T23:15:27+03:00" />
|
<meta property="og:updated_time" content="2023-12-27T10:48:32+03:00" />
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
@ -10,7 +10,7 @@
|
|||||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||||
<meta property="og:type" content="website" />
|
<meta property="og:type" content="website" />
|
||||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
|
||||||
<meta property="og:updated_time" content="2023-12-18T23:15:27+03:00" />
|
<meta property="og:updated_time" content="2023-12-27T10:48:32+03:00" />
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
@ -10,7 +10,7 @@
|
|||||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||||
<meta property="og:type" content="website" />
|
<meta property="og:type" content="website" />
|
||||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
|
||||||
<meta property="og:updated_time" content="2023-12-18T23:15:27+03:00" />
|
<meta property="og:updated_time" content="2023-12-27T10:48:32+03:00" />
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
@ -10,7 +10,7 @@
|
|||||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||||
<meta property="og:type" content="website" />
|
<meta property="og:type" content="website" />
|
||||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
|
||||||
<meta property="og:updated_time" content="2023-12-18T23:15:27+03:00" />
|
<meta property="og:updated_time" content="2023-12-27T10:48:32+03:00" />
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
@ -10,7 +10,7 @@
|
|||||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||||
<meta property="og:type" content="website" />
|
<meta property="og:type" content="website" />
|
||||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
|
||||||
<meta property="og:updated_time" content="2023-12-18T23:15:27+03:00" />
|
<meta property="og:updated_time" content="2023-12-27T10:48:32+03:00" />
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
@ -10,7 +10,7 @@
|
|||||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||||
<meta property="og:type" content="website" />
|
<meta property="og:type" content="website" />
|
||||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
|
||||||
<meta property="og:updated_time" content="2023-12-18T23:15:27+03:00" />
|
<meta property="og:updated_time" content="2023-12-27T10:48:32+03:00" />
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
@ -10,7 +10,7 @@
|
|||||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||||
<meta property="og:type" content="website" />
|
<meta property="og:type" content="website" />
|
||||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
|
||||||
<meta property="og:updated_time" content="2023-12-18T23:15:27+03:00" />
|
<meta property="og:updated_time" content="2023-12-27T10:48:32+03:00" />
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
@ -10,7 +10,7 @@
|
|||||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||||
<meta property="og:type" content="website" />
|
<meta property="og:type" content="website" />
|
||||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
|
||||||
<meta property="og:updated_time" content="2023-12-18T23:15:27+03:00" />
|
<meta property="og:updated_time" content="2023-12-27T10:48:32+03:00" />
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
@ -228,7 +228,7 @@
|
|||||||
|
|
||||||
</p>
|
</p>
|
||||||
</header>
|
</header>
|
||||||
2023-07-01 Export CGSpace to check for missing Initiative collection mappings Start harvesting on AReS 2023-07-02 Minor edits to the crossref_doi_lookup.py script while running some checks from 22,000 CGSpace DOIs 2023-07-03 I analyzed the licenses declared by Crossref and found with high confidence that ~400 of ours were incorrect I took the more accurate ones from Crossref and updated the items on CGSpace I took a few hundred ISBNs as well for where we were missing them I also tagged ~4,700 items with missing licenses as “Copyrighted; all rights reserved” based on their Crossref license status being TDM, mostly from Elsevier, Wiley, and Springer Checking a dozen or so manually, I confirmed that if Crossref only has a TDM license then it’s usually copyrighted (could still be open access, but we can’t tell via Crossref) I would be curious to write a script to check the Unpaywall API for open access status… In the past I found that their license status was not very accurate, but the open access status might be more reliable More minor work on the DSpace 7 item views I learned some new Angular template syntax I created a custom component to show Creative Commons licenses on the simple item page I also decided that I don’t like the Impact Area icons as a component because they don’t have any visual meaning 2023-07-04 Focus group meeting with CGSpace partners about DSpace 7 I added a themed file selection component to the CGSpace theme It displays the bistream description instead of the file name, just like we did in DSpace 6 XMLUI I added a custom component to show share icons 2023-07-05 I spent some time trying to update OpenRXV from Angular 9 to 10 to 11 to 12 to 13 Most things work but there are some minor bugs it seems Mishell from CIP emailed me to say she was having problems approving an item on CGSpace Looking at PostgreSQL I saw there were a dozen or so locks that were several hours and even over one day old so I killed those processes and told her to try 
again 2023-07-06 Types meeting I wrote a Python script to check Unpaywall for some information about DOIs 2023-07-7 Continue exploring Unpaywall data for some of our DOIs In the past I’ve found their licensing information to not be very reliable (preferring Crossref), but I think their open access status is more reliable, especially when the provider is listed as being the publisher Even so, sometimes the version can be “acceptedVersion”, which is presumably the author’s version, as opposed to the “publishedVersion”, which means it’s available as open access on the publisher’s website I did some quality assurance and found ~100 that were marked as Limited Access, but should have been Open Access, and fixed a handful of licenses Delete duplicate metadata as describe in my DSpace issue from last year: https://github.
|
2023-07-01 Export CGSpace to check for missing Initiative collection mappings Start harvesting on AReS 2023-07-02 Minor edits to the crossref_doi_lookup.py script while running some checks from 22,000 CGSpace DOIs 2023-07-03 I analyzed the licenses declared by Crossref and found with high confidence that ~400 of ours were incorrect I took the more accurate ones from Crossref and updated the items on CGSpace I took a few hundred ISBNs as well for where we were missing them I also tagged ~4,700 items with missing licenses as “Copyrighted; all rights reserved” based on their Crossref license status being TDM, mostly from Elsevier, Wiley, and Springer Checking a dozen or so manually, I confirmed that if Crossref only has a TDM license then it’s usually copyrighted (could still be open access, but we can’t tell via Crossref) I would be curious to write a script to check the Unpaywall API for open access status… In the past I found that their license status was not very accurate, but the open access status might be more reliable More minor work on the DSpace 7 item views I learned some new Angular template syntax I created a custom component to show Creative Commons licenses on the simple item page I also decided that I don’t like the Impact Area icons as a component because they don’t have any visual meaning 2023-07-04 Focus group meeting with CGSpace partners about DSpace 7 I added a themed file selection component to the CGSpace theme It displays the bistream description instead of the file name, just like we did in DSpace 6 XMLUI I added a custom component to show share icons 2023-07-05 I spent some time trying to update OpenRXV from Angular 9 to 10 to 11 to 12 to 13 Most things work but there are some minor bugs it seems Mishell from CIP emailed me to say she was having problems approving an item on CGSpace Looking at PostgreSQL I saw there were a dozen or so locks that were several hours and even over one day old so I killed those processes and told her to try 
again 2023-07-06 Types meeting I wrote a Python script to check Unpaywall for some information about DOIs 2023-07-7 Continue exploring Unpaywall data for some of our DOIs In the past I’ve found their licensing information to not be very reliable (preferring Crossref), but I think their open access status is more reliable, especially when the provider is listed as being the publisher Even so, sometimes the version can be “acceptedVersion”, which is presumably the author’s version, as opposed to the “publishedVersion”, which means it’s available as open access on the publisher’s website I did some quality assurance and found ~100 that were marked as Limited Access, but should have been Open Access, and fixed a handful of licenses Delete duplicate metadata as described in my DSpace issue from last year: https://github.
|
||||||
<a href='https://alanorth.github.io/cgspace-notes/2023-07/'>Read more →</a>
|
<a href='https://alanorth.github.io/cgspace-notes/2023-07/'>Read more →</a>
|
||||||
</article>
|
</article>
|
||||||
|
|
||||||
|
@ -48,7 +48,7 @@
|
|||||||
<link>https://alanorth.github.io/cgspace-notes/2023-07/</link>
|
<link>https://alanorth.github.io/cgspace-notes/2023-07/</link>
|
||||||
<pubDate>Sat, 01 Jul 2023 17:14:36 +0300</pubDate>
|
<pubDate>Sat, 01 Jul 2023 17:14:36 +0300</pubDate>
|
||||||
<guid>https://alanorth.github.io/cgspace-notes/2023-07/</guid>
|
<guid>https://alanorth.github.io/cgspace-notes/2023-07/</guid>
|
||||||
<description>2023-07-01 Export CGSpace to check for missing Initiative collection mappings Start harvesting on AReS 2023-07-02 Minor edits to the crossref_doi_lookup.py script while running some checks from 22,000 CGSpace DOIs 2023-07-03 I analyzed the licenses declared by Crossref and found with high confidence that ~400 of ours were incorrect I took the more accurate ones from Crossref and updated the items on CGSpace I took a few hundred ISBNs as well for where we were missing them I also tagged ~4,700 items with missing licenses as &ldquo;Copyrighted; all rights reserved&rdquo; based on their Crossref license status being TDM, mostly from Elsevier, Wiley, and Springer Checking a dozen or so manually, I confirmed that if Crossref only has a TDM license then it&rsquo;s usually copyrighted (could still be open access, but we can&rsquo;t tell via Crossref) I would be curious to write a script to check the Unpaywall API for open access status&hellip; In the past I found that their license status was not very accurate, but the open access status might be more reliable More minor work on the DSpace 7 item views I learned some new Angular template syntax I created a custom component to show Creative Commons licenses on the simple item page I also decided that I don&rsquo;t like the Impact Area icons as a component because they don&rsquo;t have any visual meaning 2023-07-04 Focus group meeting with CGSpace partners about DSpace 7 I added a themed file selection component to the CGSpace theme It displays the bistream description instead of the file name, just like we did in DSpace 6 XMLUI I added a custom component to show share icons 2023-07-05 I spent some time trying to update OpenRXV from Angular 9 to 10 to 11 to 12 to 13 Most things work but there are some minor bugs it seems Mishell from CIP emailed me to say she was having problems approving an item on CGSpace Looking at PostgreSQL I saw there were a dozen or so locks that were several hours and even over one day 
old so I killed those processes and told her to try again 2023-07-06 Types meeting I wrote a Python script to check Unpaywall for some information about DOIs 2023-07-7 Continue exploring Unpaywall data for some of our DOIs In the past I&rsquo;ve found their licensing information to not be very reliable (preferring Crossref), but I think their open access status is more reliable, especially when the provider is listed as being the publisher Even so, sometimes the version can be &ldquo;acceptedVersion&rdquo;, which is presumably the author&rsquo;s version, as opposed to the &ldquo;publishedVersion&rdquo;, which means it&rsquo;s available as open access on the publisher&rsquo;s website I did some quality assurance and found ~100 that were marked as Limited Access, but should have been Open Access, and fixed a handful of licenses Delete duplicate metadata as describe in my DSpace issue from last year: https://github.</description>
|
<description>2023-07-01 Export CGSpace to check for missing Initiative collection mappings Start harvesting on AReS 2023-07-02 Minor edits to the crossref_doi_lookup.py script while running some checks from 22,000 CGSpace DOIs 2023-07-03 I analyzed the licenses declared by Crossref and found with high confidence that ~400 of ours were incorrect I took the more accurate ones from Crossref and updated the items on CGSpace I took a few hundred ISBNs as well for where we were missing them I also tagged ~4,700 items with missing licenses as &ldquo;Copyrighted; all rights reserved&rdquo; based on their Crossref license status being TDM, mostly from Elsevier, Wiley, and Springer Checking a dozen or so manually, I confirmed that if Crossref only has a TDM license then it&rsquo;s usually copyrighted (could still be open access, but we can&rsquo;t tell via Crossref) I would be curious to write a script to check the Unpaywall API for open access status&hellip; In the past I found that their license status was not very accurate, but the open access status might be more reliable More minor work on the DSpace 7 item views I learned some new Angular template syntax I created a custom component to show Creative Commons licenses on the simple item page I also decided that I don&rsquo;t like the Impact Area icons as a component because they don&rsquo;t have any visual meaning 2023-07-04 Focus group meeting with CGSpace partners about DSpace 7 I added a themed file selection component to the CGSpace theme It displays the bistream description instead of the file name, just like we did in DSpace 6 XMLUI I added a custom component to show share icons 2023-07-05 I spent some time trying to update OpenRXV from Angular 9 to 10 to 11 to 12 to 13 Most things work but there are some minor bugs it seems Mishell from CIP emailed me to say she was having problems approving an item on CGSpace Looking at PostgreSQL I saw there were a dozen or so locks that were several hours and even over one day 
old so I killed those processes and told her to try again 2023-07-06 Types meeting I wrote a Python script to check Unpaywall for some information about DOIs 2023-07-7 Continue exploring Unpaywall data for some of our DOIs In the past I&rsquo;ve found their licensing information to not be very reliable (preferring Crossref), but I think their open access status is more reliable, especially when the provider is listed as being the publisher Even so, sometimes the version can be &ldquo;acceptedVersion&rdquo;, which is presumably the author&rsquo;s version, as opposed to the &ldquo;publishedVersion&rdquo;, which means it&rsquo;s available as open access on the publisher&rsquo;s website I did some quality assurance and found ~100 that were marked as Limited Access, but should have been Open Access, and fixed a handful of licenses Delete duplicate metadata as described in my DSpace issue from last year: https://github.</description>
|
||||||
</item>
|
</item>
|
||||||
<item>
|
<item>
|
||||||
<title>June, 2023</title>
|
<title>June, 2023</title>
|
||||||
|
@ -10,7 +10,7 @@
|
|||||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||||
<meta property="og:type" content="website" />
|
<meta property="og:type" content="website" />
|
||||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
|
||||||
<meta property="og:updated_time" content="2023-12-18T23:15:27+03:00" />
|
<meta property="og:updated_time" content="2023-12-27T10:48:32+03:00" />
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
@ -10,7 +10,7 @@
|
|||||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||||
<meta property="og:type" content="website" />
|
<meta property="og:type" content="website" />
|
||||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
|
||||||
<meta property="og:updated_time" content="2023-12-18T23:15:27+03:00" />
|
<meta property="og:updated_time" content="2023-12-27T10:48:32+03:00" />
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
@ -10,7 +10,7 @@
|
|||||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||||
<meta property="og:type" content="website" />
|
<meta property="og:type" content="website" />
|
||||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
|
||||||
<meta property="og:updated_time" content="2023-12-18T23:15:27+03:00" />
|
<meta property="og:updated_time" content="2023-12-27T10:48:32+03:00" />
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
@ -10,7 +10,7 @@
|
|||||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||||
<meta property="og:type" content="website" />
|
<meta property="og:type" content="website" />
|
||||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
|
||||||
<meta property="og:updated_time" content="2023-12-18T23:15:27+03:00" />
|
<meta property="og:updated_time" content="2023-12-27T10:48:32+03:00" />
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
@ -10,7 +10,7 @@
|
|||||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||||
<meta property="og:type" content="website" />
|
<meta property="og:type" content="website" />
|
||||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
|
||||||
<meta property="og:updated_time" content="2023-12-18T23:15:27+03:00" />
|
<meta property="og:updated_time" content="2023-12-27T10:48:32+03:00" />
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
@ -10,7 +10,7 @@
|
|||||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||||
<meta property="og:type" content="website" />
|
<meta property="og:type" content="website" />
|
||||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
|
||||||
<meta property="og:updated_time" content="2023-12-18T23:15:27+03:00" />
|
<meta property="og:updated_time" content="2023-12-27T10:48:32+03:00" />
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
@ -10,7 +10,7 @@
|
|||||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||||
<meta property="og:type" content="website" />
|
<meta property="og:type" content="website" />
|
||||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
|
||||||
<meta property="og:updated_time" content="2023-12-18T23:15:27+03:00" />
|
<meta property="og:updated_time" content="2023-12-27T10:48:32+03:00" />
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
@ -10,7 +10,7 @@
|
|||||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||||
<meta property="og:type" content="website" />
|
<meta property="og:type" content="website" />
|
||||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
|
||||||
<meta property="og:updated_time" content="2023-12-18T23:15:27+03:00" />
|
<meta property="og:updated_time" content="2023-12-27T10:48:32+03:00" />
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
@ -10,7 +10,7 @@
|
|||||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||||
<meta property="og:type" content="website" />
|
<meta property="og:type" content="website" />
|
||||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
|
||||||
<meta property="og:updated_time" content="2023-12-18T23:15:27+03:00" />
|
<meta property="og:updated_time" content="2023-12-27T10:48:32+03:00" />
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
@ -10,7 +10,7 @@
|
|||||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||||
<meta property="og:type" content="website" />
|
<meta property="og:type" content="website" />
|
||||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
|
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
|
||||||
<meta property="og:updated_time" content="2023-12-18T23:15:27+03:00" />
|
<meta property="og:updated_time" content="2023-12-27T10:48:32+03:00" />
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
@ -3,19 +3,19 @@
|
|||||||
xmlns:xhtml="http://www.w3.org/1999/xhtml">
|
xmlns:xhtml="http://www.w3.org/1999/xhtml">
|
||||||
<url>
|
<url>
|
||||||
<loc>https://alanorth.github.io/cgspace-notes/categories/</loc>
|
<loc>https://alanorth.github.io/cgspace-notes/categories/</loc>
|
||||||
<lastmod>2023-12-18T23:15:27+03:00</lastmod>
|
<lastmod>2023-12-27T10:48:32+03:00</lastmod>
|
||||||
</url><url>
|
</url><url>
|
||||||
<loc>https://alanorth.github.io/cgspace-notes/</loc>
|
<loc>https://alanorth.github.io/cgspace-notes/</loc>
|
||||||
<lastmod>2023-12-18T23:15:27+03:00</lastmod>
|
<lastmod>2023-12-27T10:48:32+03:00</lastmod>
|
||||||
</url><url>
|
</url><url>
|
||||||
<loc>https://alanorth.github.io/cgspace-notes/2023-12/</loc>
|
<loc>https://alanorth.github.io/cgspace-notes/2023-12/</loc>
|
||||||
<lastmod>2023-12-18T23:15:27+03:00</lastmod>
|
<lastmod>2023-12-21T10:08:59+03:00</lastmod>
|
||||||
</url><url>
|
</url><url>
|
||||||
<loc>https://alanorth.github.io/cgspace-notes/categories/notes/</loc>
|
<loc>https://alanorth.github.io/cgspace-notes/categories/notes/</loc>
|
||||||
<lastmod>2023-12-18T23:15:27+03:00</lastmod>
|
<lastmod>2023-12-27T10:48:32+03:00</lastmod>
|
||||||
</url><url>
|
</url><url>
|
||||||
<loc>https://alanorth.github.io/cgspace-notes/posts/</loc>
|
<loc>https://alanorth.github.io/cgspace-notes/posts/</loc>
|
||||||
<lastmod>2023-12-18T23:15:27+03:00</lastmod>
|
<lastmod>2023-12-27T10:48:32+03:00</lastmod>
|
||||||
</url><url>
|
</url><url>
|
||||||
<loc>https://alanorth.github.io/cgspace-notes/2023-11/</loc>
|
<loc>https://alanorth.github.io/cgspace-notes/2023-11/</loc>
|
||||||
<lastmod>2023-12-06T20:57:07+03:00</lastmod>
|
<lastmod>2023-12-06T20:57:07+03:00</lastmod>
|
||||||
@ -30,7 +30,7 @@
|
|||||||
<lastmod>2023-09-01T08:10:02+03:00</lastmod>
|
<lastmod>2023-09-01T08:10:02+03:00</lastmod>
|
||||||
</url><url>
|
</url><url>
|
||||||
<loc>https://alanorth.github.io/cgspace-notes/2023-07/</loc>
|
<loc>https://alanorth.github.io/cgspace-notes/2023-07/</loc>
|
||||||
<lastmod>2023-08-02T23:04:11+03:00</lastmod>
|
<lastmod>2023-12-27T10:48:32+03:00</lastmod>
|
||||||
</url><url>
|
</url><url>
|
||||||
<loc>https://alanorth.github.io/cgspace-notes/2023-06/</loc>
|
<loc>https://alanorth.github.io/cgspace-notes/2023-06/</loc>
|
||||||
<lastmod>2023-07-01T17:17:31+03:00</lastmod>
|
<lastmod>2023-07-01T17:17:31+03:00</lastmod>
|
||||||
|