mirror of
https://github.com/alanorth/cgspace-notes.git
synced 2024-11-22 14:45:03 +01:00
Add notes for 2020-06-30
parent
48159ebf8e
commit
a23b442c77
@ -437,5 +437,167 @@ COPY 3917
|
||||
|
||||
- Email GRID.ac to ask them where old names for institutes are stored, as I see them in the "Disambiguate" search function online, but not in the standalone data
|
||||
- For example, both "International Laboratory for Research on Animal Diseases" (ILRAD) and "International Livestock Centre for Africa" (ILCA) correctly return a hit for "International Livestock Research Institute", but those old names are nowhere in the data
|
||||
- I discovered two interesting OpenRefine reconciliation services:
|
||||
- [OpenRefine reconciler for the Research Organization Registry](https://github.com/ror-community/ror-reconciler)
|
||||
- [Getty Vocabularies OpenRefine Reconciliation](https://www.getty.edu/research/tools/vocabularies/obtain/openrefine.html) (see the Getty Thesaurus of Geographic Names ® (TGN))
|
||||
|
||||
## 2020-06-29
|
||||
|
||||
- I stumbled upon a sort of [standard for rights statements](https://rightsstatements.org/page/1.0/) that we might want to use for `dc.rights` eventually
|
||||
- I'm trying to understand the difference between `dcterms.coverage`, `dcterms.spatial`, and `dcterms.temporal`
|
||||
- According to the [Dublin Core specification for coverage](https://www.dublincore.org/specifications/dublin-core/dcmi-terms/terms/coverage/), we should prefer the more specific spatial and temporal subproperties:
|
||||
|
||||
> Because coverage is so broadly defined, it is preferable to use the more specific subproperties Temporal Coverage and Spatial Coverage.
|
||||
|
||||
- So I guess we should be using `dcterms.spatial` for countries... but then all regions, countries, etc. get merged together into that one field when you use DCTERMS
|
||||
- Perhaps better to use `cg.coverage.country` and crosswalk to `dcterms.spatial`
|
||||
- Another thing is that these values are not literals—you are supposed to embed classes...
|
||||
- I also notice that there is a [CrossRef funders registry](https://www.crossref.org/services/funder-registry/) with 23,000+ funders that you can [download as RDF](https://gitlab.com/crossref/open_funder_registry) or [access via an API](https://www.crossref.org/education/funder-registry/accessing-the-funder-registry/)
|
||||
|
||||
```
|
||||
$ http 'https://api.crossref.org/funders?query=Bill+and+Melinda+Gates&mailto=a.orth@cgiar.org'
|
||||
```
|
||||
|
||||
- Searching for "Bill and Melinda Gates" we can see the `name` literal and a list of `alt-names` literals
|
||||
- This could be good for checking our funders
|
||||
- The API currently returns pages for each funder in the vocabulary, but they are giving HTTP 404 right now: https://data.crossref.org/fundingdata/vocabulary/Label-599174
|
||||
- I sent an email to the CrossRef Funders Registry team
|
||||
- See the [CrossRef API docs](https://github.com/CrossRef/rest-api-doc) (specifically the parameters and filters)
|
||||
- I made a pull request on CG Core v2 to recommend using persistent identifiers for DOIs and ORCID iDs ([#26](https://github.com/AgriculturalSemantics/cg-core/pull/26))
|
||||
- I exported sponsors/funders from CGSpace and wrote a script to query the CrossRef API for matches:
|
||||
|
||||
```
|
||||
dspace=# \COPY (SELECT DISTINCT text_value FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=29) TO /tmp/2020-06-29-sponsors.csv;
|
||||
COPY 682
|
||||
```
|
||||
|
||||
- The script is `crossref-funders-lookup.py` and it is based on `agrovoc-lookup.py`
|
||||
- On that note, I realized I need to URL-encode the funder name myself before making the search request: the requests library *does* do URL encoding, but it seems to treat characters like `&` as query parameter separators, so searches for funders like `Bill & Melinda Gates Foundation` get misinterpreted
|
||||
- So then I noticed that I had worked around this in `agrovoc-lookup.py` a few years ago by just ignoring subjects with special characters like apostrophes and accents!
|
||||
- I tested the script on our funders:
|
||||
|
||||
```
|
||||
$ ./crossref-funders-lookup.py -i /tmp/2020-06-29-sponsors.csv -om /tmp/sponsors-matched.txt -or /tmp/sponsors-rejected.txt -d -e blah@blah.com
|
||||
$ wc -l /tmp/2020-06-29-sponsors.csv
|
||||
682 /tmp/2020-06-29-sponsors.csv
|
||||
$ wc -l /tmp/sponsors-*
|
||||
180 /tmp/sponsors-matched.txt
|
||||
502 /tmp/sponsors-rejected.txt
|
||||
682 total
|
||||
```
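The URL-encoding problem described above can be sketched as follows. This is a minimal illustration, not the actual `crossref-funders-lookup.py`: the helper name `build_funder_query_url` is hypothetical, and it assumes the funder name is percent-encoded with `urllib.parse.quote` from the standard library before being placed in the query string, so that `&` becomes `%26` instead of being read as a parameter separator.

```python
from urllib.parse import quote

# Base URL taken from the httpie example earlier in these notes.
API_URL = "https://api.crossref.org/funders"


def build_funder_query_url(funder: str, mailto: str) -> str:
    """Return a funders search URL with the funder name percent-encoded.

    Hypothetical helper, for illustration only.
    """
    # safe="" makes quote() encode every reserved character, including "&".
    encoded_funder = quote(funder, safe="")
    # Keep "@" readable in the mailto parameter; everything else reserved is encoded.
    encoded_mailto = quote(mailto, safe="@")
    return f"{API_URL}?query={encoded_funder}&mailto={encoded_mailto}"


url = build_funder_query_url("Bill & Melinda Gates Foundation", "blah@blah.com")
print(url)
```

Passing the name through the `params` argument of `requests.get()` instead of building the URL by hand should also encode it, which may be simpler than encoding it manually.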
|
||||
|
||||
- It seems that about 26% of our funders (180 of 682) already match... I bet a few more will match if I check for simple errors
|
||||
- Interesting, I found a few funders that we have correct, but can't figure out how to match them in the API:
|
||||
- `Claussen-Simon-Stiftung`
|
||||
- `H2020 Marie Skłodowska-Curie Actions`
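A rough sketch of how a funder name might be compared against the `name` and `alt-names` literals in an API response. The response shape used here (a `message` object containing an `items` list) is an assumption, the function name `funder_matches` is hypothetical, and the sample response is hand-written for illustration rather than copied from real API output.

```python
def funder_matches(funder: str, response: dict) -> bool:
    """Return True if funder equals a name or alt-name in the response.

    The comparison ignores case and surrounding whitespace.
    """
    wanted = funder.strip().casefold()
    # Assumed shape: {"message": {"items": [{"name": ..., "alt-names": [...]}]}}
    for item in response.get("message", {}).get("items", []):
        candidates = [item.get("name", "")] + list(item.get("alt-names", []))
        if any(wanted == candidate.strip().casefold() for candidate in candidates):
            return True
    return False


# Hand-written sample for illustration only, not real API output.
sample_response = {
    "message": {
        "items": [
            {
                "name": "Bill and Melinda Gates Foundation",
                "alt-names": ["Bill & Melinda Gates Foundation", "Gates Foundation"],
            }
        ]
    }
}

print(funder_matches("bill & melinda gates foundation", sample_response))
print(funder_matches("Claussen Simon Stiftung", sample_response))
```

An exact comparison like this is strict by design, so a funder that differs only in punctuation or accents would still be rejected and need a manual check.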
|
||||
|
||||
## 2020-06-30
|
||||
|
||||
- GRID responded to my question about historical names
|
||||
- They said the information is not part of the public GRID or ROR lists, but you can access it with a license to the Dimensions API
|
||||
- Gabriela from CIP sent me a list of erroneously added CIP subjects to remove from CGSpace:
|
||||
|
||||
```
|
||||
$ cat /tmp/2020-06-30-remove-cip-subjects.csv
|
||||
cg.subject.cip
|
||||
INTEGRATED PEST MANAGEMENT
|
||||
ORANGE FLESH SWEET POTATOES
|
||||
AEROPONICS
|
||||
FOOD SUPPLY
|
||||
SASHA
|
||||
SPHI
|
||||
INSECT LIFE CYCLE MODELLING
|
||||
SUSTAIN
|
||||
AGRICULTURAL INNOVATIONS
|
||||
NATIVE VARIETIES
|
||||
PHYTOPHTHORA INFESTANS
|
||||
$ ./delete-metadata-values.py -i /tmp/2020-06-30-remove-cip-subjects.csv -db dspace -u dspace -p 'fuuu' -f cg.subject.cip -m 127 -d
|
||||
```
|
||||
|
||||
- She also wants to change their `SWEET POTATOES` term to `SWEETPOTATOES`, both in the CIP subject list and existing items so I updated those too:
|
||||
|
||||
```
|
||||
$ cat /tmp/2020-06-30-fix-cip-subjects.csv
|
||||
cg.subject.cip,correct
|
||||
SWEET POTATOES,SWEETPOTATOES
|
||||
$ ./fix-metadata-values.py -i /tmp/2020-06-30-fix-cip-subjects.csv -db dspace -u dspace -p 'fuuu' -f cg.subject.cip -t correct -m 127 -d
|
||||
```
|
||||
|
||||
- She also finished doing all the corrections to authors that I had sent her last week, but many of the changes are removing Spanish accents from authors' names so I asked if she's really sure she wants to do that
|
||||
- I ran the fixes and deletes on CGSpace, but not on DSpace Test yet because those scripts need updating for DSpace 6 UUIDs
|
||||
- I spent about two hours manually checking our sponsors that were rejected from CrossRef and found about fifty-five corrections that I ran on CGSpace:
|
||||
|
||||
```
|
||||
$ cat 2020-06-29-fix-sponsors.csv
|
||||
dc.description.sponsorship,correct
|
||||
"Conselho Nacional de Desenvolvimento Científico e Tecnológico, Brazil","Conselho Nacional de Desenvolvimento Científico e Tecnológico"
|
||||
"Claussen Simon Stiftung","Claussen-Simon-Stiftung"
|
||||
"Fonds pour la formation á la Recherche dans l'Industrie et dans l'Agriculture, Belgium","Fonds pour la Formation à la Recherche dans l’Industrie et dans l’Agriculture"
|
||||
"Fundação de Amparo à Pesquisa do Estado de São Paulo, Brazil","Fundação de Amparo à Pesquisa do Estado de São Paulo"
|
||||
"Schlumberger Foundation Faculty for the Future","Schlumberger Foundation"
|
||||
"Wildlife Conservation Society, United States","Wildlife Conservation Society"
|
||||
"Portuguese Foundation for Science and Technology","Portuguese Science and Technology Foundation"
|
||||
"Wageningen University and Research","Wageningen University and Research Centre"
|
||||
"Leverhulme Centre for Integrative Research in Agriculture and Health","Leverhulme Centre for Integrative Research on Agriculture and Health"
|
||||
"Natural Science and Engineering Research Council of Canada","Natural Sciences and Engineering Research Council of Canada"
|
||||
"Biotechnology and Biological Sciences Research Council, United Kingdom","Biotechnology and Biological Sciences Research Council"
|
||||
"Home Grown Ceraels Authority United Kingdom","Home-Grown Cereals Authority"
|
||||
"Fiat Panis Foundation","Foundation fiat panis"
|
||||
"Defence Science and Technology Laboratory, United Kingdom","Defence Science and Technology Laboratory"
|
||||
"African Development Bank","African Development Bank Group"
|
||||
"Ministry of Health, Labour, and Welfare, Japan","Ministry of Health, Labour and Welfare"
|
||||
"World Academy of Sciences","The World Academy of Sciences"
|
||||
"Agricultural Research Council, South Africa","Agricultural Research Council"
|
||||
"Department of Homeland Security, USA","U.S. Department of Homeland Security"
|
||||
"Quadram Institute","Quadram Institute Bioscience"
|
||||
"Google.org","Google"
|
||||
"Department for Environment, Food and Rural Affairs, United Kingdom","Department for Environment, Food and Rural Affairs, UK Government"
|
||||
"National Commission for Science, Technology and Innovation, Kenya","National Commission for Science, Technology and Innovation"
|
||||
"Hainan Province Natural Science Foundation of China","Natural Science Foundation of Hainan Province"
|
||||
"German Society for International Cooperation (GIZ)","GIZ"
|
||||
"German Federal Ministry of Food and Agriculture","Federal Ministry of Food and Agriculture"
|
||||
"State Key Laboratory of Environmental Geochemistry, China","State Key Laboratory of Environmental Geochemistry"
|
||||
"QUT student scholarship","Queensland University of Technology"
|
||||
"Australia Centre for International Agricultural Research","Australian Centre for International Agricultural Research"
|
||||
"Belgian Science Policy","Belgian Federal Science Policy Office"
|
||||
"U.S. Department of Agriculture USDA","U.S. Department of Agriculture"
|
||||
"U.S.. Department of Agriculture (USDA)","U.S. Department of Agriculture"
|
||||
"Fundação de Amparo à Pesquisa do Estado de São Paulo ( FAPESP)","Fundação de Amparo à Pesquisa do Estado de São Paulo"
|
||||
"Fundação de Amparo à Pesquisa do Estado do Rio Grande do Sul, Brazil","Fundação de Amparo à Pesquisa do Estado do Rio Grande do Sul"
|
||||
"Fundação de Amparo à Pesquisa do Estado do Rio de Janeiro, Brazil","Fundação Carlos Chagas Filho de Amparo à Pesquisa do Estado do Rio de Janeiro"
|
||||
"Swedish University of Agricultural Sciences (SLU)","Swedish University of Agricultural Sciences"
|
||||
"U.S. Department of Agriculture (USDA)","U.S. Department of Agriculture"
|
||||
"Swedish International Development Cooperation Agency (Sida)","Sida"
|
||||
"Swedish International Development Agency","Sida"
|
||||
"Federal Ministry for Economic Cooperation and Development, Germany","Federal Ministry for Economic Cooperation and Development"
|
||||
"Natural Environment Research Council, United Kingdom","Natural Environment Research Council"
|
||||
"Economic and Social Research Council, United Kingdom","Economic and Social Research Council"
|
||||
"Medical Research Council, United Kingdom","Medical Research Council"
|
||||
"Federal Ministry for Education and Research, Germany","Federal Ministry for Education, Science, Research and Technology"
|
||||
"UK Government’s Department for International Development","Department for International Development, UK Government"
|
||||
"Department for International Development, United Kingdom","Department for International Development, UK Government"
|
||||
"United Nations Children's Fund","United Nations Children's Emergency Fund"
|
||||
"Swedish Research Council for Environment, Agricultural Science and Spatial Planning","Swedish Research Council for Environment, Agricultural Sciences and Spatial Planning"
|
||||
"Agence Nationale de la Recherche, France","French National Research Agency"
|
||||
"Fondation pour la recherche sur la biodiversité","Foundation for Research on Biodiversity"
|
||||
"Programa Nacional de Innovacion Agraria, Peru","Programa Nacional de Innovación Agraria, Peru"
|
||||
"United States Agency for International Development (USAID)","United States Agency for International Development"
|
||||
"West Africa Agricultural Productivity Programme","West Africa Agricultural Productivity Program"
|
||||
"West African Agricultural Productivity Project","West Africa Agricultural Productivity Program"
|
||||
"Rural Development Administration, Republic of Korea","Rural Development Administration"
|
||||
"UK’s Biotechnology and Biological Sciences Research Council (BBSRC)","Biotechnology and Biological Sciences Research Council"
|
||||
$ ./fix-metadata-values.py -i /tmp/2020-06-29-fix-sponsors.csv -db dspace -u dspace -p 'fuuu' -f dc.description.sponsorship -t correct -m 29
|
||||
```
|
||||
|
||||
- Peter wants me to add "CORONAVIRUS DISEASE" to all ILRI items that have ILRI subject "COVID19"
|
||||
- I exported the ILRI community and cut the columns I needed, then opened the file in OpenRefine:
|
||||
|
||||
```
|
||||
$ export JAVA_OPTS="-Xmx512m -Dfile.encoding=UTF-8"
|
||||
$ dspace metadata-export -i 10568/1 -f /tmp/ilri.csv
|
||||
$ csvcut -c 'id,cg.subject.ilri[],cg.subject.ilri[en_US],dc.subject[en_US]' /tmp/ilri.csv > /tmp/ilri-covid19.csv
|
||||
```
|
||||
|
||||
- I see that all items with "COVID19" already have "CORONAVIRUS DISEASE" so I don't need to do anything
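The same check could also be scripted rather than done by eye in OpenRefine. A minimal sketch, assuming multiple values in a cell are separated by `||` and that the column names are the ones passed to `csvcut` above. The function name `items_missing_term` is hypothetical, and accepting the required term in any of the subject columns is also an assumption. The sample rows are hand-written, not real CGSpace data.

```python
import csv
import io

ILRI_COLUMNS = ["cg.subject.ilri[]", "cg.subject.ilri[en_US]"]
ALL_SUBJECT_COLUMNS = ILRI_COLUMNS + ["dc.subject[en_US]"]


def split_values(cell):
    """Split a multi-value cell on "||" and drop empty values."""
    return {value.strip() for value in (cell or "").split("||") if value.strip()}


def items_missing_term(csv_file, trigger="COVID19", required="CORONAVIRUS DISEASE"):
    """Return ids of rows that have the trigger ILRI subject but lack the required term."""
    missing = []
    for row in csv.DictReader(csv_file):
        ilri = set().union(*(split_values(row.get(c)) for c in ILRI_COLUMNS))
        subjects = set().union(*(split_values(row.get(c)) for c in ALL_SUBJECT_COLUMNS))
        if trigger in ilri and required not in subjects:
            missing.append(row["id"])
    return missing


# Hand-written sample rows for illustration only.
sample_csv = io.StringIO(
    "id,cg.subject.ilri[],cg.subject.ilri[en_US],dc.subject[en_US]\n"
    "1,,COVID19||LIVESTOCK,CORONAVIRUS DISEASE||ZOONOSES\n"
    "2,,COVID19,ZOONOSES\n"
    "3,,LIVESTOCK,\n"
)
print(items_missing_term(sample_csv))
```

To run it on the real export, open `/tmp/ilri-covid19.csv` with `open(..., newline="", encoding="utf-8")` and pass the file object to `items_missing_term()`; an empty list would mean there is nothing to fix.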
|
||||
|
||||
<!-- vim: set sw=2 ts=2: -->
|
||||
|
@ -31,7 +31,7 @@ Last week I had increased the limit from 30 to 60, which seemed to help, but now
|
||||
$ psql -c 'SELECT * from pg_stat_activity;' | grep idle | grep -c cgspace
|
||||
78
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -33,7 +33,7 @@ Replace lzop with xz in log compression cron jobs on DSpace Test—it uses less
|
||||
-rw-rw-r-- 1 tomcat7 tomcat7 387K Nov 18 23:59 dspace.log.2015-11-18.lzo
|
||||
-rw-rw-r-- 1 tomcat7 tomcat7 169K Nov 18 23:59 dspace.log.2015-11-18.xz
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -25,7 +25,7 @@ Move ILRI collection 10568/12503 from 10568/27869 to 10568/27629 using the move_
|
||||
I realized it is only necessary to clear the Cocoon cache after moving collections—rather than reindexing—as no metadata has changed, and therefore no search or browse indexes need to be updated.
|
||||
Update GitHub wiki for documentation of maintenance tasks.
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -35,7 +35,7 @@ I noticed we have a very interesting list of countries on CGSpace:
|
||||
Not only are there 49,000 countries, we have some blanks (25)…
|
||||
Also, lots of things like “COTE D`LVOIRE” and “COTE D IVOIRE”
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -25,7 +25,7 @@ Looking at issues with author authorities on CGSpace
|
||||
For some reason we still have the index-lucene-update cron job active on CGSpace, but I’m pretty sure we don’t need it as of the latest few versions of Atmire’s Listings and Reports module
|
||||
Reinstall my local (Mac OS X) DSpace stack with Tomcat 7, PostgreSQL 9.3, and Java JDK 1.7 to match environment on CGSpace server
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -29,7 +29,7 @@ After running DSpace for over five years I’ve never needed to look in any
|
||||
This will save us a few gigs of backup space we’re paying for on S3
|
||||
Also, I noticed the checker log has some errors we should pay attention to:
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -31,7 +31,7 @@ There are 3,000 IPs accessing the REST API in a 24-hour period!
|
||||
# awk '{print $1}' /var/log/nginx/rest.log | uniq | wc -l
|
||||
3168
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -31,7 +31,7 @@ This is their publications set: http://ebrary.ifpri.org/oai/oai.php?verb=ListRec
|
||||
You can see the others by using the OAI ListSets verb: http://ebrary.ifpri.org/oai/oai.php?verb=ListSets
|
||||
Working on second phase of metadata migration, looks like this will work for moving CPWF-specific data in dc.identifier.fund to cg.identifier.cpwfproject and then the rest to dc.description.sponsorship
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -41,7 +41,7 @@ dspacetest=# select text_value from metadatavalue where metadata_field_id=3 and
|
||||
|
||||
In this case the select query was showing 95 results before the update
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -39,7 +39,7 @@ $ git checkout -b 55new 5_x-prod
|
||||
$ git reset --hard ilri/5_x-prod
|
||||
$ git rebase -i dspace-5.5
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -31,7 +31,7 @@ It looks like we might be able to use OUs now, instead of DCs:
|
||||
|
||||
$ ldapsearch -x -H ldaps://svcgroot2.cgiarad.org:3269/ -b "dc=cgiarad,dc=org" -D "admigration1@cgiarad.org" -W "(sAMAccountName=admigration1)"
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -39,7 +39,7 @@ I exported a random item’s metadata as CSV, deleted all columns except id
|
||||
|
||||
0000-0002-6115-0956||0000-0002-3812-8793||0000-0001-7462-405X
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -23,7 +23,7 @@ Add dc.type to the output options for Atmire’s Listings and Reports module
|
||||
Add dc.type to the output options for Atmire’s Listings and Reports module (#286)
|
||||
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -43,7 +43,7 @@ I see thousands of them in the logs for the last few months, so it’s not r
|
||||
I’ve raised a ticket with Atmire to ask
|
||||
Another worrying error from dspace.log is:
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -25,7 +25,7 @@ I checked to see if the Solr sharding task that is supposed to run on January 1s
|
||||
I tested on DSpace Test as well and it doesn’t work there either
|
||||
I asked on the dspace-tech mailing list because it seems to be broken, and actually now I’m not sure if we’ve ever had the sharding task run successfully over all these years
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -47,7 +47,7 @@ DELETE 1
|
||||
Create issue on GitHub to track the addition of CCAFS Phase II project tags (#301)
|
||||
Looks like we’ll be using cg.identifier.ccafsprojectpii as the field name
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -51,7 +51,7 @@ Interestingly, it seems DSpace 4.x’s thumbnails were sRGB, but forcing reg
|
||||
$ identify ~/Desktop/alc_contrastes_desafios.jpg
|
||||
/Users/aorth/Desktop/alc_contrastes_desafios.jpg JPEG 464x600 464x600+0+0 8-bit CMYK 168KB 0.000u 0:00.000
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -37,7 +37,7 @@ Testing the CMYK patch on a collection with 650 items:
|
||||
|
||||
$ [dspace]/bin/dspace filter-media -f -i 10568/16498 -p "ImageMagick PDF Thumbnail" -v >& /tmp/filter-media-cmyk.txt
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -15,7 +15,7 @@
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="May, 2017"/>
|
||||
<meta name="twitter:description" content="2017-05-01 ICARDA apparently started working on CG Core on their MEL repository They have done a few cg.* fields, but not very consistent and even copy some of CGSpace items: https://mel.cgiar.org/xmlui/handle/20.500.11766/6911?show=full https://cgspace.cgiar.org/handle/10568/73683 2017-05-02 Atmire got back about the Workflow Statistics issue, and apparently it’s a bug in the CUA module so they will send us a pull request 2017-05-04 Sync DSpace Test with database and assetstore from CGSpace Re-deploy DSpace Test with Atmire’s CUA patch for workflow statistics, run system updates, and restart the server Now I can see the workflow statistics and am able to select users, but everything returns 0 items Megan says there are still some mapped items are not appearing since last week, so I forced a full index-discovery -b Need to remember to check if the collection has more items (currently 39 on CGSpace, but 118 on the freshly reindexed DSPace Test) tomorrow: https://cgspace."/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -15,7 +15,7 @@
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="June, 2017"/>
|
||||
<meta name="twitter:description" content="2017-06-01 After discussion with WLE and CGSpace content people, we decided to just add one metadata field for the WLE Research Themes The cg.identifier.wletheme field will be used for both Phase I and Phase II Research Themes Then we’ll create a new sub-community for Phase II and create collections for the research themes there The current “Research Themes” community will be renamed to “WLE Phase I Research Themes” Tagged all items in the current Phase I collections with their appropriate themes Create pull request to add Phase II research themes to the submission form: #328 Add cg."/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -33,7 +33,7 @@ Merge changes for WLE Phase II theme rename (#329)
|
||||
Looking at extracting the metadata registries from ICARDA’s MEL DSpace database so we can compare fields with CGSpace
|
||||
We can use PostgreSQL’s extended output format (-x) plus sed to format the output into quasi XML:
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -57,7 +57,7 @@ This was due to newline characters in the dc.description.abstract column, which
|
||||
I exported a new CSV from the collection on DSpace Test and then manually removed the characters in vim using g/^$/d
|
||||
Then I cleaned up the author authorities and HTML characters in OpenRefine and sent the file back to Abenet
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -29,7 +29,7 @@ Linode sent an alert that CGSpace (linode18) was using 261% CPU for the past two
|
||||
|
||||
Ask Sisay to clean up the WLE approvers a bit, as Marianne’s user account is both in the approvers step as well as the group
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -31,7 +31,7 @@ http://hdl.handle.net/10568/78495||http://hdl.handle.net/10568/79336
|
||||
There appears to be a pattern but I’ll have to look a bit closer and try to clean them up automatically, either in SQL or in OpenRefine
|
||||
Add Katherine Lutz to the groups for content submission and edit steps of the CGIAR System collections
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -45,7 +45,7 @@ Generate list of authors on CGSpace for Peter to go through and correct:
|
||||
dspace=# \copy (select distinct text_value, count(*) as count from metadatavalue where metadata_field_id = (select metadata_field_id from metadatafieldregistry where element = 'contributor' and qualifier = 'author') AND resource_type_id = 2 group by text_value order by count desc) to /tmp/authors.csv with csv;
|
||||
COPY 54701
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -27,7 +27,7 @@ The logs say “Timeout waiting for idle object”
|
||||
PostgreSQL activity says there are 115 connections currently
|
||||
The list of connections to XMLUI and REST API for today:
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -147,7 +147,7 @@ dspace.log.2018-01-02:34
|
||||
|
||||
Danny wrote to ask for help renewing the wildcard ilri.org certificate and I advised that we should probably use Let’s Encrypt if it’s just a handful of domains
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -27,7 +27,7 @@ We don’t need to distinguish between internal and external works, so that
|
||||
Yesterday I figured out how to monitor DSpace sessions using JMX
|
||||
I copied the logic in the jmx_tomcat_dbpools provided by Ubuntu’s munin-plugins-java package and used the stuff I discovered about JMX in 2018-01
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -21,7 +21,7 @@ Export a CSV of the IITA community metadata for Martin Mueller
|
||||
|
||||
Export a CSV of the IITA community metadata for Martin Mueller
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -23,7 +23,7 @@ Catalina logs at least show some memory errors yesterday:
|
||||
I tried to test something on DSpace Test but noticed that it’s down since god knows when
|
||||
Catalina logs at least show some memory errors yesterday:
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -35,7 +35,7 @@ http://localhost:3000/solr/statistics/update?stream.body=%3Ccommit/%3E
|
||||
Then I reduced the JVM heap size from 6144 back to 5120m
|
||||
Also, I switched it to use OpenJDK instead of Oracle Java, as well as re-worked the Ansible infrastructure scripts to support hosts choosing which distribution they want to use
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -55,7 +55,7 @@ real 74m42.646s
|
||||
user 8m5.056s
|
||||
sys 2m7.289s
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -33,7 +33,7 @@ During the mvn package stage on the 5.8 branch I kept getting issues with java r
|
||||
|
||||
There is insufficient memory for the Java Runtime Environment to continue.
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -43,7 +43,7 @@ Anyways, perhaps I should increase the JVM heap from 5120m to 6144m like we did
|
||||
The server only has 8GB of RAM so we’ll eventually need to upgrade to a larger one because we’ll start starving the OS, PostgreSQL, and command line batch processes
|
||||
I ran all system updates on DSpace Test and rebooted it
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -27,7 +27,7 @@ I’ll update the DSpace role in our Ansible infrastructure playbooks and ru
|
||||
Also, I’ll re-run the postgresql tasks because the custom PostgreSQL variables are dynamic according to the system’s RAM, and we never re-ran them after migrating to larger Linodes last month
|
||||
I’m testing the new DSpace 5.8 branch in my Ubuntu 18.04 environment and I’m getting those autowire errors in Tomcat 8.5.30 again:
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -23,7 +23,7 @@ I created a GitHub issue to track this #389, because I’m super busy in Nai
|
||||
Phil Thornton got an ORCID identifier so we need to add it to the list on CGSpace and tag his existing items
|
||||
I created a GitHub issue to track this #389, because I’m super busy in Nairobi right now
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -33,7 +33,7 @@ Send a note about my dspace-statistics-api to the dspace-tech mailing list
|
||||
Linode has been sending mails a few times a day recently that CGSpace (linode18) has had high CPU usage
|
||||
Today these are the top 10 IPs:
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -33,7 +33,7 @@ Then I ran all system updates and restarted the server
|
||||
|
||||
I noticed that there is another issue with PDF thumbnails on CGSpace, and I see there was another Ghostscript vulnerability last week
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -47,7 +47,7 @@ I don’t see anything interesting in the web server logs around that time t
|
||||
357 207.46.13.1
|
||||
903 54.70.40.11
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -69,7 +69,7 @@ real 0m19.873s
|
||||
user 0m22.203s
|
||||
sys 0m1.979s
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -43,7 +43,7 @@ Most worryingly, there are encoding errors in the abstracts for eleven items, fo
|
||||
|
||||
I think I will need to ask Udana to re-copy and paste the abstracts with more care using Google Docs
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -61,7 +61,7 @@ $ ./fix-metadata-values.py -i /tmp/2019-02-21-fix-4-regions.csv -db dspace -u ds
|
||||
$ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-2-countries.csv -db dspace -u dspace -p 'fuuu' -m 228 -f cg.coverage.country -d
|
||||
$ ./delete-metadata-values.py -i /tmp/2019-02-21-delete-1-region.csv -db dspace -u dspace -p 'fuuu' -m 231 -f cg.coverage.region -d
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -45,7 +45,7 @@ DELETE 1
|
||||
|
||||
But after this I tried to delete the item from the XMLUI and it is still present…
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -31,7 +31,7 @@ Run system updates on CGSpace (linode18) and reboot it
|
||||
|
||||
Skype with Marie-Angélique and Abenet about CG Core v2
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -35,7 +35,7 @@ CGSpace
|
||||
|
||||
Abenet had another similar issue a few days ago when trying to find the stats for 2018 in the RTB community
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -43,7 +43,7 @@ After rebooting, all statistics cores were loaded… wow, that’s luck
|
||||
|
||||
Run system updates on DSpace Test (linode19) and reboot it
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -69,7 +69,7 @@ Here are the top ten IPs in the nginx XMLUI and REST/OAI logs this morning:
|
||||
7249 2a01:7e00::f03c:91ff:fe18:7396
|
||||
9124 45.5.186.2
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -15,7 +15,7 @@
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="October, 2019"/>
|
||||
<meta name="twitter:description" content="2019-10-01 Udana from IWMI asked me for a CSV export of their community on CGSpace I exported it, but a quick run through the csv-metadata-quality tool shows that there are some low-hanging fruits we can fix before I send him the data I will limit the scope to the titles, regions, subregions, and river basins for now to manually fix some non-breaking spaces (U+00A0) there that would otherwise be removed by the csv-metadata-quality script’s “unneccesary Unicode” fix: $ csvcut -c 'id,dc."/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -55,7 +55,7 @@ Let’s see how many of the REST API requests were for bitstreams (because t
|
||||
# zcat --force /var/log/nginx/rest.log.*.gz | grep -E "[0-9]{1,2}/Oct/2019" | grep -c -E "/rest/bitstreams"
|
||||
106781
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -43,7 +43,7 @@ Make sure all packages are up to date and the package manager is up to date, the
|
||||
# dpkg -C
|
||||
# reboot
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -53,7 +53,7 @@ I tweeted the CGSpace repository link
|
||||
|
||||
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -35,7 +35,7 @@ The code finally builds and runs with a fresh install
|
||||
|
||||
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -39,7 +39,7 @@ You need to download this into the DSpace 6.x source and compile it
|
||||
|
||||
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -45,7 +45,7 @@ The third item now has a donut with score 1 since I tweeted it last week
|
||||
|
||||
On the same note, the one item Abenet pointed out last week now has a donut with score of 104 after I tweeted it last week
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -31,7 +31,7 @@ I see that CGSpace (linode18) is still using PostgreSQL JDBC driver version 42.2
|
||||
|
||||
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -19,7 +19,7 @@ I tried to build the OAI registry on the freshly migrated DSpace 6 on DSpace Tes
|
||||
<meta property="og:type" content="article" />
|
||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/2020-06/" />
|
||||
<meta property="article:published_time" content="2020-06-01T13:55:39+03:00" />
|
||||
<meta property="article:modified_time" content="2020-06-23T16:13:27+03:00" />
|
||||
<meta property="article:modified_time" content="2020-06-28T18:13:44+03:00" />
|
||||
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="June, 2020"/>
|
||||
@ -33,7 +33,7 @@ I sent Atmire the dspace.log from today and told them to log into the server to
|
||||
In other news, I checked the statistics API on DSpace 6 and it’s working
|
||||
I tried to build the OAI registry on the freshly migrated DSpace 6 on DSpace Test and I get an error:
|
||||
"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
@ -43,9 +43,9 @@ I tried to build the OAI registry on the freshly migrated DSpace 6 on DSpace Tes
|
||||
"@type": "BlogPosting",
|
||||
"headline": "June, 2020",
|
||||
"url": "https://alanorth.github.io/cgspace-notes/2020-06/",
|
||||
"wordCount": "3319",
|
||||
"wordCount": "4764",
|
||||
"datePublished": "2020-06-01T13:55:39+03:00",
|
||||
"dateModified": "2020-06-23T16:13:27+03:00",
|
||||
"dateModified": "2020-06-28T18:13:44+03:00",
|
||||
"author": {
|
||||
"@type": "Person",
|
||||
"name": "Alan Orth"
|
||||
@ -594,6 +594,191 @@ COPY 3917
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<h2 id="2020-06-28">2020-06-28</h2>
|
||||
<ul>
|
||||
<li>Email GRID.ac to ask them about where old names for institutes are stored, as I see them in the “Disambiguate” search function online, but not in the standalone data
|
||||
<ul>
|
||||
<li>For example, both “International Laboratory for Research on Animal Diseases” (ILRAD) and “International Livestock Centre for Africa” (ILCA) correctly return a hit for “International Livestock Research Institute”, but it’s nowhere in the data</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>I discovered two interesting OpenRefine reconciliation services:
|
||||
<ul>
|
||||
<li><a href="https://github.com/ror-community/ror-reconciler">OpenRefine reconciler for the Research Organization Registry</a></li>
|
||||
<li><a href="https://www.getty.edu/research/tools/vocabularies/obtain/openrefine.html">Getty Vocabularies OpenRefine Reconciliation</a> (see the Getty Thesaurus of Geographic Names ® (TGN))</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<h2 id="2020-06-29">2020-06-29</h2>
|
||||
<ul>
|
||||
<li>I stumbled upon a sort of <a href="https://rightsstatements.org/page/1.0/">standard for rights statements</a> that we might want to use for <code>dc.rights</code> eventually</li>
|
||||
<li>I’m trying to understand the difference between <code>dcterms.coverage</code>, <code>dcterms.spatial</code>, and <code>dcterms.temporal</code>
|
||||
<ul>
|
||||
<li>According to the <a href="https://www.dublincore.org/specifications/dublin-core/dcmi-terms/terms/coverage/">Dublin Core specification for coverage</a>, we should prefer the more specific spatial and temporal subproperties:</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<blockquote>
|
||||
<p>Because coverage is so broadly defined, it is preferable to use the more specific subproperties Temporal Coverage and Spatial Coverage.</p>
|
||||
</blockquote>
|
||||
<ul>
|
||||
<li>So I guess we should be using this for countries… but then all regions, countries, etc get merged together into this when you use DCTERMS
|
||||
<ul>
|
||||
<li>Perhaps better to use <code>cg.coverage.country</code> and crosswalk to <code>dcterms.spatial</code></li>
|
||||
<li>Another thing is that these values are not literals—you are supposed to embed classes…</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>I also notice that there is a <a href="https://www.crossref.org/services/funder-registry/">CrossRef funders registry</a> with 23,000+ funders that you can <a href="https://gitlab.com/crossref/open_funder_registry">download as RDF</a> or <a href="https://www.crossref.org/education/funder-registry/accessing-the-funder-registry/">access via an API</a></li>
|
||||
</ul>
|
||||
<pre><code>$ http 'https://api.crossref.org/funders?query=Bill+and+Melinda+Gates&mailto=a.orth@cgiar.org'
|
||||
</code></pre><ul>
|
||||
<li>Searching for “Bill and Melinda Gates” we can see the <code>name</code> literal and a list of <code>alt-names</code> literals
|
||||
<ul>
|
||||
<li>This could be good for checking our funders</li>
|
||||
<li>The API currently returns pages for each funder in the vocabulary, but they are giving HTTP 404 right now: <a href="https://data.crossref.org/fundingdata/vocabulary/Label-599174">https://data.crossref.org/fundingdata/vocabulary/Label-599174</a></li>
|
||||
<li>I sent an email to the CrossRef Funders Registry team</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>See the <a href="https://github.com/CrossRef/rest-api-doc">CrossRef API docs</a> (specifically the parameters and filters)</li>
|
||||
<li>I made a pull request on CG Core v2 to recommend using persistent identifiers for DOIs and ORCID iDs (<a href="https://github.com/AgriculturalSemantics/cg-core/pull/26">#26</a>)</li>
|
||||
<li>I exported sponsors/funders from CGSpace and wrote a script to query the CrossRef API for matches:</li>
|
||||
</ul>
|
||||
<pre><code>dspace=# \COPY (SELECT DISTINCT text_value FROM metadatavalue WHERE resource_type_id=2 AND metadata_field_id=29) TO /tmp/2020-06-29-sponsors.csv;
|
||||
COPY 682
|
||||
</code></pre><ul>
|
||||
<li>The script is <code>crossref-funders-lookup.py</code> and it is based on <code>agrovoc-lookup.py</code>
|
||||
<ul>
|
||||
<li>On that note, I realized I need to URL encode the funder before making the search request because, while the requests library <em>does</em> do URL encoding, it seems to treat characters like <code>&</code> as query parameter separators, which causes searches for funders like <code>Bill & Melinda Gates Foundation</code> to be misinterpreted</li>
|
||||
<li>So then I noticed that I had worked around this in <code>agrovoc-lookup.py</code> a few years ago by just ignoring subjects with special characters like apostrophes and accents!</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>I tested the script on our funders:</li>
|
||||
</ul>
|
||||
<pre><code>$ ./crossref-funders-lookup.py -i /tmp/2020-06-29-sponsors.csv -om /tmp/sponsors-matched.txt -or /tmp/sponsors-rejected.txt -d -e blah@blah.com
|
||||
$ wc -l /tmp/2020-06-29-sponsors.csv
|
||||
682 /tmp/2020-06-29-sponsors.csv
|
||||
$ wc -l /tmp/sponsors-*
|
||||
180 /tmp/sponsors-matched.txt
|
||||
502 /tmp/sponsors-rejected.txt
|
||||
682 total
|
||||
</code></pre><ul>
|
||||
<li>It seems that about 26% of our funders (180 of 682) already match… I bet a few more will match if I check for simple errors
|
||||
<ul>
|
||||
<li>Interesting, I found a few funders that we have correct, but can’t figure out how to match them in the API:
|
||||
<ul>
|
||||
<li><code>Claussen-Simon-Stiftung</code></li>
|
||||
<li><code>H2020 Marie Skłodowska-Curie Actions</code></li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
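<p>The URL-encoding problem noted above for <code>crossref-funders-lookup.py</code> can be sketched as follows. This is a minimal illustration, not the actual script: the helper names <code>build_query_url</code> and <code>funder_matches</code> are hypothetical, the <code>message</code>/<code>items</code> envelope is an assumption about the API's JSON response, and the exact, case-insensitive comparison against <code>name</code> and <code>alt-names</code> is only one possible matching rule. The sample response is hand-written, so nothing here touches the network.</p>

```python
from urllib.parse import urlencode

# Endpoint taken from the notes above.
API = "https://api.crossref.org/funders"


def build_query_url(funder, mailto):
    """Build a search URL with the funder name safely percent-encoded.

    urlencode() escapes "&" as %26, so a name like "Bill & Melinda Gates
    Foundation" stays a single value instead of being split into two
    query parameters.
    """
    return f"{API}?{urlencode({'query': funder, 'mailto': mailto})}"


def funder_matches(funder, response):
    """Return True if the funder equals a name or alt-name in the response.

    Assumes (hypothetically) that results live under message -> items and
    that each item carries "name" and "alt-names" keys.
    """
    wanted = funder.casefold()
    for item in response.get("message", {}).get("items", []):
        names = [item.get("name", "")] + item.get("alt-names", [])
        if any(wanted == name.casefold() for name in names):
            return True
    return False


# Hand-written sample response for illustration only (not real API output).
sample_response = {
    "message": {
        "items": [
            {
                "name": "Bill and Melinda Gates Foundation",
                "alt-names": ["Bill & Melinda Gates Foundation"],
            }
        ]
    }
}

print(build_query_url("Bill & Melinda Gates Foundation", "blah@blah.com"))
print(funder_matches("Bill & Melinda Gates Foundation", sample_response))
```

<p>Passing the name through <code>urlencode</code> (or through the <code>params</code> argument of requests) avoids building the query string by hand, which is where a raw <code>&</code> would otherwise be read as a separator.</p>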
|
||||
<h2 id="2020-06-30">2020-06-30</h2>
|
||||
<ul>
|
||||
<li>GRID responded to my question about historical names
|
||||
<ul>
|
||||
<li>They said the information is not part of the public GRID or ROR lists, but you can access it with a license to the Dimensions API</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>Gabriela from CIP sent me a list of erroneously added CIP subjects to remove from CGSpace:</li>
|
||||
</ul>
|
||||
<pre><code>$ cat /tmp/2020-06-30-remove-cip-subjects.csv
|
||||
cg.subject.cip
|
||||
INTEGRATED PEST MANAGEMENT
|
||||
ORANGE FLESH SWEET POTATOES
|
||||
AEROPONICS
|
||||
FOOD SUPPLY
|
||||
SASHA
|
||||
SPHI
|
||||
INSECT LIFE CYCLE MODELLING
|
||||
SUSTAIN
|
||||
AGRICULTURAL INNOVATIONS
|
||||
NATIVE VARIETIES
|
||||
PHYTOPHTHORA INFESTANS
|
||||
$ ./delete-metadata-values.py -i /tmp/2020-06-30-remove-cip-subjects.csv -db dspace -u dspace -p 'fuuu' -f cg.subject.cip -m 127 -d
|
||||
</code></pre><ul>
|
||||
<li>She also wants to change their <code>SWEET POTATOES</code> term to <code>SWEETPOTATOES</code>, both in the CIP subject list and existing items so I updated those too:</li>
|
||||
</ul>
|
||||
<pre><code>$ cat /tmp/2020-06-30-fix-cip-subjects.csv
|
||||
cg.subject.cip,correct
|
||||
SWEET POTATOES,SWEETPOTATOES
|
||||
$ ./fix-metadata-values.py -i /tmp/2020-06-30-fix-cip-subjects.csv -db dspace -u dspace -p 'fuuu' -f cg.subject.cip -t correct -m 127 -d
|
||||
</code></pre><ul>
|
||||
<li>She also finished doing all the corrections to authors that I had sent her last week, but many of the changes remove Spanish accents from authors’ names, so I asked if she’s really sure she wants to do that</li>
|
||||
<li>I ran the fixes and deletes on CGSpace, but not on DSpace Test yet because those scripts need updating for DSpace 6 UUIDs</li>
|
||||
<li>I spent about two hours manually checking our sponsors that were rejected from CrossRef and found about fifty-five corrections that I ran on CGSpace:</li>
|
||||
</ul>
|
||||
<pre><code>$ cat 2020-06-29-fix-sponsors.csv
|
||||
dc.description.sponsorship,correct
|
||||
"Conselho Nacional de Desenvolvimento Científico e Tecnológico, Brazil","Conselho Nacional de Desenvolvimento Científico e Tecnológico"
|
||||
"Claussen Simon Stiftung","Claussen-Simon-Stiftung"
|
||||
"Fonds pour la formation á la Recherche dans l'Industrie et dans l'Agriculture, Belgium","Fonds pour la Formation à la Recherche dans l’Industrie et dans l’Agriculture"
|
||||
"Fundação de Amparo à Pesquisa do Estado de São Paulo, Brazil","Fundação de Amparo à Pesquisa do Estado de São Paulo"
|
||||
"Schlumberger Foundation Faculty for the Future","Schlumberger Foundation"
|
||||
"Wildlife Conservation Society, United States","Wildlife Conservation Society"
|
||||
"Portuguese Foundation for Science and Technology","Portuguese Science and Technology Foundation"
|
||||
"Wageningen University and Research","Wageningen University and Research Centre"
|
||||
"Leverhulme Centre for Integrative Research in Agriculture and Health","Leverhulme Centre for Integrative Research on Agriculture and Health"
|
||||
"Natural Science and Engineering Research Council of Canada","Natural Sciences and Engineering Research Council of Canada"
|
||||
"Biotechnology and Biological Sciences Research Council, United Kingdom","Biotechnology and Biological Sciences Research Council"
|
||||
"Home Grown Ceraels Authority United Kingdom","Home-Grown Cereals Authority"
|
||||
"Fiat Panis Foundation","Foundation fiat panis"
|
||||
"Defence Science and Technology Laboratory, United Kingdom","Defence Science and Technology Laboratory"
|
||||
"African Development Bank","African Development Bank Group"
|
||||
"Ministry of Health, Labour, and Welfare, Japan","Ministry of Health, Labour and Welfare"
|
||||
"World Academy of Sciences","The World Academy of Sciences"
|
||||
"Agricultural Research Council, South Africa","Agricultural Research Council"
|
||||
"Department of Homeland Security, USA","U.S. Department of Homeland Security"
|
||||
"Quadram Institute","Quadram Institute Bioscience"
|
||||
"Google.org","Google"
|
||||
"Department for Environment, Food and Rural Affairs, United Kingdom","Department for Environment, Food and Rural Affairs, UK Government"
|
||||
"National Commission for Science, Technology and Innovation, Kenya","National Commission for Science, Technology and Innovation"
|
||||
"Hainan Province Natural Science Foundation of China","Natural Science Foundation of Hainan Province"
|
||||
"German Society for International Cooperation (GIZ)","GIZ"
|
||||
"German Federal Ministry of Food and Agriculture","Federal Ministry of Food and Agriculture"
|
||||
"State Key Laboratory of Environmental Geochemistry, China","State Key Laboratory of Environmental Geochemistry"
|
||||
"QUT student scholarship","Queensland University of Technology"
|
||||
"Australia Centre for International Agricultural Research","Australian Centre for International Agricultural Research"
|
||||
"Belgian Science Policy","Belgian Federal Science Policy Office"
|
||||
"U.S. Department of Agriculture USDA","U.S. Department of Agriculture"
|
||||
"U.S.. Department of Agriculture (USDA)","U.S. Department of Agriculture"
|
||||
"Fundação de Amparo à Pesquisa do Estado de São Paulo ( FAPESP)","Fundação de Amparo à Pesquisa do Estado de São Paulo"
|
||||
"Fundação de Amparo à Pesquisa do Estado do Rio Grande do Sul, Brazil","Fundação de Amparo à Pesquisa do Estado do Rio Grande do Sul"
|
||||
"Fundação de Amparo à Pesquisa do Estado do Rio de Janeiro, Brazil","Fundação Carlos Chagas Filho de Amparo à Pesquisa do Estado do Rio de Janeiro"
|
||||
"Swedish University of Agricultural Sciences (SLU)","Swedish University of Agricultural Sciences"
|
||||
"U.S. Department of Agriculture (USDA)","U.S. Department of Agriculture"
|
||||
"Swedish International Development Cooperation Agency (Sida)","Sida"
|
||||
"Swedish International Development Agency","Sida"
|
||||
"Federal Ministry for Economic Cooperation and Development, Germany","Federal Ministry for Economic Cooperation and Development"
|
||||
"Natural Environment Research Council, United Kingdom","Natural Environment Research Council"
|
||||
"Economic and Social Research Council, United Kingdom","Economic and Social Research Council"
|
||||
"Medical Research Council, United Kingdom","Medical Research Council"
|
||||
"Federal Ministry for Education and Research, Germany","Federal Ministry for Education, Science, Research and Technology"
|
||||
"UK Government’s Department for International Development","Department for International Development, UK Government"
|
||||
"Department for International Development, United Kingdom","Department for International Development, UK Government"
|
||||
"United Nations Children's Fund","United Nations Children's Emergency Fund"
|
||||
"Swedish Research Council for Environment, Agricultural Science and Spatial Planning","Swedish Research Council for Environment, Agricultural Sciences and Spatial Planning"
|
||||
"Agence Nationale de la Recherche, France","French National Research Agency"
|
||||
"Fondation pour la recherche sur la biodiversité","Foundation for Research on Biodiversity"
|
||||
"Programa Nacional de Innovacion Agraria, Peru","Programa Nacional de Innovación Agraria, Peru"
|
||||
"United States Agency for International Development (USAID)","United States Agency for International Development"
|
||||
"West Africa Agricultural Productivity Programme","West Africa Agricultural Productivity Program"
|
||||
"West African Agricultural Productivity Project","West Africa Agricultural Productivity Program"
|
||||
"Rural Development Administration, Republic of Korea","Rural Development Administration"
|
||||
"UK’s Biotechnology and Biological Sciences Research Council (BBSRC)","Biotechnology and Biological Sciences Research Council"
|
||||
$ ./fix-metadata-values.py -i /tmp/2020-06-29-fix-sponsors.csv -db dspace -u dspace -p 'fuuu' -f dc.description.sponsorship -t correct -m 29
|
||||
</code></pre><ul>
|
||||
<li>Peter wants me to add “CORONAVIRUS DISEASE” to all ILRI items that have ILRI subject “COVID19”
|
||||
<ul>
|
||||
<li>I exported the ILRI community and cut the columns I needed, then opened the file in OpenRefine:</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<pre><code>$ export JAVA_OPTS="-Xmx512m -Dfile.encoding=UTF-8"
|
||||
$ dspace metadata-export -i 10568/1 -f /tmp/ilri.csv
|
||||
$ csvcut -c 'id,cg.subject.ilri[],cg.subject.ilri[en_US],dc.subject[en_US]' /tmp/ilri.csv > /tmp/ilri-covid19.csv
|
||||
</code></pre><ul>
|
||||
<li>I see that all items with “COVID19” already have “CORONAVIRUS DISEASE” so I don’t need to do anything</li>
|
||||
</ul>
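<p>The manual OpenRefine check described above could also be scripted. The following is a rough sketch rather than the procedure actually used: the function name <code>items_missing_term</code> is hypothetical, it assumes multiple values in a cell are joined with <code>||</code> (the default separator in DSpace CSV exports), and it looks for the required term in any exported column, since the notes do not say which field should hold “CORONAVIRUS DISEASE”. The sample rows are invented for illustration.</p>

```python
import csv
import io

# Assumed multi-value separator used in DSpace CSV exports.
SEPARATOR = "||"


def items_missing_term(rows, trigger="COVID19", required="CORONAVIRUS DISEASE"):
    """Return ids of rows that contain `trigger` but lack `required`.

    `rows` is any iterable of dicts, such as a csv.DictReader over the
    file produced by the csvcut step. Every column except "id" is
    searched for both terms.
    """
    missing = []
    for row in rows:
        values = set()
        for column, cell in row.items():
            if column == "id" or not cell:
                continue
            values.update(value.strip() for value in cell.split(SEPARATOR))
        if trigger in values and required not in values:
            missing.append(row["id"])
    return missing


# Invented sample data; a real run would open /tmp/ilri-covid19.csv instead.
sample = io.StringIO(
    "id,cg.subject.ilri[en_US],dc.subject[en_US]\n"
    "1,COVID19||LIVESTOCK,CORONAVIRUS DISEASE\n"
    "2,COVID19,ZOONOSES\n"
    "3,LIVESTOCK,\n"
)
print(items_missing_term(csv.DictReader(sample)))
```

<p>An empty result would correspond to the situation noted above, where every item tagged “COVID19” already carries “CORONAVIRUS DISEASE”.</p>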
|
||||
<!-- raw HTML omitted -->
|
||||
|
||||
|
||||
|
@ -14,7 +14,7 @@
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="404 Page not found"/>
|
||||
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -9,30 +9,16 @@
|
||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||
<meta property="og:type" content="website" />
|
||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/" />
|
||||
<meta property="og:updated_time" content="2020-06-23T16:13:27+03:00" />
|
||||
<meta property="og:updated_time" content="2020-06-28T18:13:44+03:00" />
|
||||
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="Categories"/>
|
||||
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
||||
<script type="application/ld+json">
|
||||
{
|
||||
"@context": "http://schema.org",
|
||||
"@type": "Blog",
|
||||
"headline": "CGSpace Notes",
|
||||
"url" : "https://alanorth.github.io/cgspace-notes/categories/",
|
||||
"author": {
|
||||
"@type": "Person",
|
||||
"name": "Alan Orth"
|
||||
},
|
||||
"dateModified": "2020-06-01T13:55:39+03:00",
|
||||
"keywords": "notes,""migration,""notes,",
|
||||
"description":"Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."
|
||||
}
|
||||
</script>
|
||||
|
||||
|
||||
<link rel="canonical" href="https://alanorth.github.io/cgspace-notes/categories/">
|
||||
@ -94,27 +80,13 @@
|
||||
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-06/">June, 2020</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2020-06-01T13:55:39+03:00">Mon Jun 01, 2020</time> by Alan Orth in
|
||||
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
|
||||
|
||||
|
||||
</p>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/categories/notes/">Notes</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2020-06-01T13:55:39+03:00">Mon Jun 01, 2020</time> by Alan Orth</p>
|
||||
</header>
|
||||
<h2 id="2020-06-01">2020-06-01</h2>
|
||||
<ul>
|
||||
<li>I tried to run the <code>AtomicStatisticsUpdateCLI</code> CUA migration script on DSpace Test (linode26) again and it is still going very slowly and has tons of errors like I noticed yesterday
|
||||
<ul>
|
||||
<li>I sent Atmire the dspace.log from today and told them to log into the server to debug the process</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>In other news, I checked the statistics API on DSpace 6 and it’s working</li>
|
||||
<li>I tried to build the OAI registry on the freshly migrated DSpace 6 on DSpace Test and I get an error:</li>
|
||||
</ul>
|
||||
<a href='https://alanorth.github.io/cgspace-notes/2020-06/'>Read more →</a>
|
||||
|
||||
<a href='https://alanorth.github.io/cgspace-notes/categories/notes/'>Read more →</a>
|
||||
</article>
|
||||
|
||||
|
||||
@ -122,279 +94,6 @@
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-05/">May, 2020</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2020-05-02T09:52:04+03:00">Sat May 02, 2020</time> by Alan Orth in
|
||||
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
|
||||
|
||||
|
||||
</p>
|
||||
</header>
|
||||
<h2 id="2020-05-02">2020-05-02</h2>
|
||||
<ul>
|
||||
<li>Peter said that CTA is having problems submitting an item to CGSpace
|
||||
<ul>
|
||||
<li>Looking at the PostgreSQL stats it seems to be the same issue that Tezira was having last week, as I see the number of connections in ‘idle in transaction’ and ‘waiting for lock’ state are increasing again</li>
|
||||
<li>I see that CGSpace (linode18) is still using PostgreSQL JDBC driver version 42.2.11, and there were some bugs related to transactions fixed in 42.2.12 (which I had updated in the Ansible playbooks, but not deployed yet)</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<a href='https://alanorth.github.io/cgspace-notes/2020-05/'>Read more →</a>
|
||||
</article>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-04/">April, 2020</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2020-04-02T10:53:24+03:00">Thu Apr 02, 2020</time> by Alan Orth in
|
||||
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
|
||||
|
||||
|
||||
</p>
|
||||
</header>
|
||||
<h2 id="2020-04-02">2020-04-02</h2>
|
||||
<ul>
|
||||
<li>Maria asked me to update Charles Staver’s ORCID iD in the submission template and on CGSpace, as his name was lower case before, and now he has corrected it
|
||||
<ul>
|
||||
<li>I updated the fifty-eight existing items on CGSpace</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>Looking into the items Udana had asked about last week that were missing Altmetric donuts:
|
||||
<ul>
|
||||
<li><a href="https://hdl.handle.net/10568/103225">The first</a> is still missing its DOI, so I added it and <a href="https://twitter.com/mralanorth/status/1245632619661766657">tweeted its handle</a> (after a few hours there was a donut with score 222)</li>
|
||||
<li><a href="https://hdl.handle.net/10568/106899">The second item</a> now has a donut with score 2 since I <a href="https://twitter.com/mralanorth/status/1243158045540134913">tweeted its handle</a> last week</li>
|
||||
<li><a href="https://hdl.handle.net/10568/107258">The third item</a> now has a donut with score 1 since I <a href="https://twitter.com/mralanorth/status/1243158786392625153">tweeted it</a> last week</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>On the same note, the <a href="https://hdl.handle.net/10568/106573">one item</a> Abenet pointed out last week now has a donut with score of 104 after I <a href="https://twitter.com/mralanorth/status/1243163710241345536">tweeted it</a> last week</li>
|
||||
</ul>
|
||||
<a href='https://alanorth.github.io/cgspace-notes/2020-04/'>Read more →</a>
|
||||
</article>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-03/">March, 2020</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2020-03-02T12:31:30+02:00">Mon Mar 02, 2020</time> by Alan Orth in
|
||||
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
|
||||
|
||||
|
||||
</p>
|
||||
</header>
|
||||
<h2 id="2020-03-02">2020-03-02</h2>
|
||||
<ul>
|
||||
<li>Update <a href="https://github.com/ilri/dspace-statistics-api">dspace-statistics-api</a> for DSpace 6+ UUIDs
|
||||
<ul>
|
||||
<li>Tag version 1.2.0 on GitHub</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>Test migrating legacy Solr statistics to UUIDs with the as-of-yet unreleased <a href="https://github.com/DSpace/DSpace/commit/184f2b2153479045fba6239342c63e7f8564b8b6#diff-0350ce2e13b28d5d61252b7a8f50a059">SolrUpgradePre6xStatistics.java</a>
|
||||
<ul>
|
||||
<li>You need to download this into the DSpace 6.x source and compile it</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<a href='https://alanorth.github.io/cgspace-notes/2020-03/'>Read more →</a>
|
||||
</article>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-02/">February, 2020</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2020-02-02T11:56:30+02:00">Sun Feb 02, 2020</time> by Alan Orth in
|
||||
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
|
||||
|
||||
|
||||
</p>
|
||||
</header>
|
||||
<h2 id="2020-02-02">2020-02-02</h2>
|
||||
<ul>
|
||||
<li>Continue working on porting CGSpace’s DSpace 5 code to DSpace 6.3 that I started yesterday
|
||||
<ul>
|
||||
<li>Sign up for an account with MaxMind so I can get the GeoLite2-City.mmdb database</li>
|
||||
<li>I still need to wire up the API credentials and cron job into the Ansible infrastructure playbooks</li>
|
||||
<li>Fix some minor issues in the config and XMLUI themes, like removing Atmire stuff</li>
|
||||
<li>The code finally builds and runs with a fresh install</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<a href='https://alanorth.github.io/cgspace-notes/2020-02/'>Read more →</a>
|
||||
</article>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-01/">January, 2020</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2020-01-06T10:48:30+02:00">Mon Jan 06, 2020</time> by Alan Orth in
|
||||
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
|
||||
|
||||
|
||||
</p>
|
||||
</header>
|
||||
<h2 id="2020-01-06">2020-01-06</h2>
|
||||
<ul>
|
||||
<li>Open <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=706">a ticket</a> with Atmire to request a quote for the upgrade to DSpace 6</li>
|
||||
<li>Last week Altmetric responded about the <a href="https://hdl.handle.net/10568/97087">item</a> that had a lower score than its DOI
|
||||
<ul>
|
||||
<li>The score is now linked to the DOI</li>
|
||||
<li>Another <a href="https://handle.hdl.net/10568/91278">item</a> that had the same problem in 2019 is now also linked to the score for its DOI</li>
|
||||
<li>Another <a href="https://hdl.handle.net/10568/81236">item</a> that had the same problem in 2019 has also been fixed</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<h2 id="2020-01-07">2020-01-07</h2>
|
||||
<ul>
|
||||
<li>Peter Ballantyne highlighted one more WLE <a href="https://hdl.handle.net/10568/101286">item</a> that is missing the Altmetric score that its DOI has
|
||||
<ul>
|
||||
<li>The DOI has a score of 259, but the Handle has no score at all</li>
|
||||
<li>I <a href="https://twitter.com/mralanorth/status/1214471427157626881">tweeted</a> the CGSpace repository link</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<a href='https://alanorth.github.io/cgspace-notes/2020-01/'>Read more →</a>
|
||||
</article>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-12/">December, 2019</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2019-12-01T11:22:30+02:00">Sun Dec 01, 2019</time> by Alan Orth in
|
||||
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
|
||||
|
||||
|
||||
</p>
|
||||
</header>
|
||||
<h2 id="2019-12-01">2019-12-01</h2>
|
||||
<ul>
|
||||
<li>Upgrade CGSpace (linode18) to Ubuntu 18.04:
|
||||
<ul>
|
||||
<li>Check any packages that have residual configs and purge them:</li>
|
||||
<li><!-- raw HTML omitted --># dpkg -l | grep -E '^rc' | awk '{print $2}' | xargs dpkg -P<!-- raw HTML omitted --></li>
|
||||
<li>Make sure all packages are up to date and the package manager is up to date, then reboot:</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<pre><code># apt update && apt full-upgrade
|
||||
# apt-get autoremove && apt-get autoclean
|
||||
# dpkg -C
|
||||
# reboot
|
||||
</code></pre>
|
||||
<a href='https://alanorth.github.io/cgspace-notes/2019-12/'>Read more →</a>
|
||||
</article>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-11/">November, 2019</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2019-11-04T12:20:30+02:00">Mon Nov 04, 2019</time> by Alan Orth in
|
||||
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
|
||||
|
||||
|
||||
</p>
|
||||
</header>
|
||||
<h2 id="2019-11-04">2019-11-04</h2>
|
||||
<ul>
|
||||
<li>Peter noticed that there were 5.2 million hits on CGSpace in 2019-10 according to the Atmire usage statistics
|
||||
<ul>
|
||||
<li>I looked in the nginx logs and see 4.6 million in the access logs, and 1.2 million in the API logs:</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<pre><code># zcat --force /var/log/nginx/*access.log.*.gz | grep -cE "[0-9]{1,2}/Oct/2019"
|
||||
4671942
|
||||
# zcat --force /var/log/nginx/{rest,oai,statistics}.log.*.gz | grep -cE "[0-9]{1,2}/Oct/2019"
|
||||
1277694
|
||||
</code></pre><ul>
|
||||
<li>So 4.6 million from XMLUI and another 1.2 million from API requests</li>
|
||||
<li>Let’s see how many of the REST API requests were for bitstreams (because they are counted in Solr stats):</li>
|
||||
</ul>
|
||||
<pre><code># zcat --force /var/log/nginx/rest.log.*.gz | grep -c -E "[0-9]{1,2}/Oct/2019"
|
||||
1183456
|
||||
# zcat --force /var/log/nginx/rest.log.*.gz | grep -E "[0-9]{1,2}/Oct/2019" | grep -c -E "/rest/bitstreams"
|
||||
106781
|
||||
</code></pre>
|
||||
<a href='https://alanorth.github.io/cgspace-notes/2019-11/'>Read more →</a>
|
||||
</article>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/cgspace-cgcorev2-migration/">CGSpace CG Core v2 Migration</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2019-10-28T13:27:35+02:00">Mon Oct 28, 2019</time> by Alan Orth in
|
||||
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
|
||||
|
||||
|
||||
<span class="fas fa-tag" aria-hidden="true"></span> <a href="/cgspace-notes/tags/migration/" rel="tag">Migration</a>
|
||||
|
||||
</p>
|
||||
</header>
|
||||
<p>Possible changes to CGSpace metadata fields to align more with DC, QDC, and DCTERMS as well as CG Core v2.</p>
|
||||
<p>With reference to <a href="https://agriculturalsemantics.github.io/cg-core/cgcore.html">CG Core v2 draft standard</a> by Marie-Angélique as well as <a href="http://www.dublincore.org/specifications/dublin-core/dcmi-terms/">DCMI DCTERMS</a>.</p>
|
||||
<a href='https://alanorth.github.io/cgspace-notes/cgspace-cgcorev2-migration/'>Read more →</a>
|
||||
</article>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-10/">October, 2019</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2019-10-01T13:20:51+03:00">Tue Oct 01, 2019</time> by Alan Orth in
|
||||
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
|
||||
|
||||
|
||||
</p>
|
||||
</header>
|
||||
2019-10-01: Udana from IWMI asked me for a CSV export of their community on CGSpace. I exported it, but a quick run through the csv-metadata-quality tool shows that there is some low-hanging fruit we can fix before I send him the data. I will limit the scope to the titles, regions, subregions, and river basins for now to manually fix some non-breaking spaces (U+00A0) there that would otherwise be removed by the csv-metadata-quality script’s “unnecessary Unicode” fix: $ csvcut -c 'id,dc.
|
||||
<a href='https://alanorth.github.io/cgspace-notes/2019-10/'>Read more →</a>
|
||||
</article>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<nav class="blog-pagination">
|
||||
|
||||
|
||||
|
||||
<a class="btn btn-outline-primary disabled" href="#" role="button" aria-disabled="true">Previous page</a>
|
||||
<a class="btn btn-outline-primary" href="/cgspace-notes/categories/page/2/" rel="next" role="button">Next page</a>
|
||||
|
||||
</nav>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
</div> <!-- /.blog-main -->
|
||||
|
@ -9,12 +9,12 @@
|
||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||
<meta property="og:type" content="website" />
|
||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
|
||||
<meta property="og:updated_time" content="2020-06-23T16:13:27+03:00" />
|
||||
<meta property="og:updated_time" content="2020-06-28T18:13:44+03:00" />
|
||||
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="Notes"/>
|
||||
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -9,12 +9,12 @@
|
||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||
<meta property="og:type" content="website" />
|
||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
|
||||
<meta property="og:updated_time" content="2020-06-23T16:13:27+03:00" />
|
||||
<meta property="og:updated_time" content="2020-06-28T18:13:44+03:00" />
|
||||
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="Notes"/>
|
||||
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -9,12 +9,12 @@
|
||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||
<meta property="og:type" content="website" />
|
||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
|
||||
<meta property="og:updated_time" content="2020-06-23T16:13:27+03:00" />
|
||||
<meta property="og:updated_time" content="2020-06-28T18:13:44+03:00" />
|
||||
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="Notes"/>
|
||||
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -9,12 +9,12 @@
|
||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||
<meta property="og:type" content="website" />
|
||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/categories/notes/" />
|
||||
<meta property="og:updated_time" content="2020-06-23T16:13:27+03:00" />
|
||||
<meta property="og:updated_time" content="2020-06-28T18:13:44+03:00" />
|
||||
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="Notes"/>
|
||||
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -15,7 +15,7 @@
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="CGIAR Library Migration"/>
|
||||
<meta name="twitter:description" content="Notes on the migration of the CGIAR Library to CGSpace"/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -15,7 +15,7 @@
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="CGSpace CG Core v2 Migration"/>
|
||||
<meta name="twitter:description" content="Possible changes to CGSpace metadata fields to align more with DC, QDC, and DCTERMS as well as CG Core v2."/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -9,12 +9,12 @@
|
||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||
<meta property="og:type" content="website" />
|
||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
|
||||
<meta property="og:updated_time" content="2020-06-23T16:13:27+03:00" />
|
||||
<meta property="og:updated_time" content="2020-06-28T18:13:44+03:00" />
|
||||
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="CGSpace Notes"/>
|
||||
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -9,12 +9,12 @@
|
||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||
<meta property="og:type" content="website" />
|
||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
|
||||
<meta property="og:updated_time" content="2020-06-23T16:13:27+03:00" />
|
||||
<meta property="og:updated_time" content="2020-06-28T18:13:44+03:00" />
|
||||
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="CGSpace Notes"/>
|
||||
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -9,12 +9,12 @@
|
||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||
<meta property="og:type" content="website" />
|
||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
|
||||
<meta property="og:updated_time" content="2020-06-23T16:13:27+03:00" />
|
||||
<meta property="og:updated_time" content="2020-06-28T18:13:44+03:00" />
|
||||
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="CGSpace Notes"/>
|
||||
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -9,12 +9,12 @@
|
||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||
<meta property="og:type" content="website" />
|
||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
|
||||
<meta property="og:updated_time" content="2020-06-23T16:13:27+03:00" />
|
||||
<meta property="og:updated_time" content="2020-06-28T18:13:44+03:00" />
|
||||
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="CGSpace Notes"/>
|
||||
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -9,12 +9,12 @@
|
||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||
<meta property="og:type" content="website" />
|
||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
|
||||
<meta property="og:updated_time" content="2020-06-23T16:13:27+03:00" />
|
||||
<meta property="og:updated_time" content="2020-06-28T18:13:44+03:00" />
|
||||
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="CGSpace Notes"/>
|
||||
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -9,12 +9,12 @@
|
||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||
<meta property="og:type" content="website" />
|
||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/" />
|
||||
<meta property="og:updated_time" content="2020-06-23T16:13:27+03:00" />
|
||||
<meta property="og:updated_time" content="2020-06-28T18:13:44+03:00" />
|
||||
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="CGSpace Notes"/>
|
||||
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -9,12 +9,12 @@
|
||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||
<meta property="og:type" content="website" />
|
||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
|
||||
<meta property="og:updated_time" content="2020-06-23T16:13:27+03:00" />
|
||||
<meta property="og:updated_time" content="2020-06-28T18:13:44+03:00" />
|
||||
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="Posts"/>
|
||||
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -9,12 +9,12 @@
|
||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||
<meta property="og:type" content="website" />
|
||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
|
||||
<meta property="og:updated_time" content="2020-06-23T16:13:27+03:00" />
|
||||
<meta property="og:updated_time" content="2020-06-28T18:13:44+03:00" />
|
||||
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="Posts"/>
|
||||
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -9,12 +9,12 @@
|
||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||
<meta property="og:type" content="website" />
|
||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
|
||||
<meta property="og:updated_time" content="2020-06-23T16:13:27+03:00" />
|
||||
<meta property="og:updated_time" content="2020-06-28T18:13:44+03:00" />
|
||||
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="Posts"/>
|
||||
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -9,12 +9,12 @@
|
||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||
<meta property="og:type" content="website" />
|
||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
|
||||
<meta property="og:updated_time" content="2020-06-23T16:13:27+03:00" />
|
||||
<meta property="og:updated_time" content="2020-06-28T18:13:44+03:00" />
|
||||
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="Posts"/>
|
||||
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -9,12 +9,12 @@
|
||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||
<meta property="og:type" content="website" />
|
||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
|
||||
<meta property="og:updated_time" content="2020-06-23T16:13:27+03:00" />
|
||||
<meta property="og:updated_time" content="2020-06-28T18:13:44+03:00" />
|
||||
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="Posts"/>
|
||||
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -9,12 +9,12 @@
|
||||
<meta property="og:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository." />
|
||||
<meta property="og:type" content="website" />
|
||||
<meta property="og:url" content="https://alanorth.github.io/cgspace-notes/posts/" />
|
||||
<meta property="og:updated_time" content="2020-06-23T16:13:27+03:00" />
|
||||
<meta property="og:updated_time" content="2020-06-28T18:13:44+03:00" />
|
||||
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="Posts"/>
|
||||
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@ -4,27 +4,27 @@
|
||||
|
||||
<url>
|
||||
<loc>https://alanorth.github.io/cgspace-notes/categories/</loc>
|
||||
<lastmod>2020-06-23T16:13:27+03:00</lastmod>
|
||||
<lastmod>2020-06-28T18:13:44+03:00</lastmod>
|
||||
</url>
|
||||
|
||||
<url>
|
||||
<loc>https://alanorth.github.io/cgspace-notes/</loc>
|
||||
<lastmod>2020-06-23T16:13:27+03:00</lastmod>
|
||||
<lastmod>2020-06-28T18:13:44+03:00</lastmod>
|
||||
</url>
|
||||
|
||||
<url>
|
||||
<loc>https://alanorth.github.io/cgspace-notes/2020-06/</loc>
|
||||
<lastmod>2020-06-23T16:13:27+03:00</lastmod>
|
||||
<lastmod>2020-06-28T18:13:44+03:00</lastmod>
|
||||
</url>
|
||||
|
||||
<url>
|
||||
<loc>https://alanorth.github.io/cgspace-notes/categories/notes/</loc>
|
||||
<lastmod>2020-06-23T16:13:27+03:00</lastmod>
|
||||
<lastmod>2020-06-28T18:13:44+03:00</lastmod>
|
||||
</url>
|
||||
|
||||
<url>
|
||||
<loc>https://alanorth.github.io/cgspace-notes/posts/</loc>
|
||||
<lastmod>2020-06-23T16:13:27+03:00</lastmod>
|
||||
<lastmod>2020-06-28T18:13:44+03:00</lastmod>
|
||||
</url>
|
||||
|
||||
<url>
|
||||
|
@ -14,25 +14,11 @@
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="Tags"/>
|
||||
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
||||
<script type="application/ld+json">
|
||||
{
|
||||
"@context": "http://schema.org",
|
||||
"@type": "Blog",
|
||||
"headline": "CGSpace Notes",
|
||||
"url" : "https://alanorth.github.io/cgspace-notes/tags/",
|
||||
"author": {
|
||||
"@type": "Person",
|
||||
"name": "Alan Orth"
|
||||
},
|
||||
"dateModified": "2019-10-28T13:27:35+02:00",
|
||||
"keywords": "notes,""migration,""notes,",
|
||||
"description":"Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."
|
||||
}
|
||||
</script>
|
||||
|
||||
|
||||
<link rel="canonical" href="https://alanorth.github.io/cgspace-notes/tags/">
|
||||
@ -94,27 +80,13 @@
|
||||
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-06/">June, 2020</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2020-06-01T13:55:39+03:00">Mon Jun 01, 2020</time> by Alan Orth in
|
||||
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
|
||||
|
||||
|
||||
</p>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/tags/migration/">Migration</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2019-10-28T13:27:35+02:00">Mon Oct 28, 2019</time> by Alan Orth</p>
|
||||
</header>
|
||||
<h2 id="2020-06-01">2020-06-01</h2>
|
||||
<ul>
|
||||
<li>I tried to run the <code>AtomicStatisticsUpdateCLI</code> CUA migration script on DSpace Test (linode26) again and it is still going very slowly and has tons of errors like I noticed yesterday
|
||||
<ul>
|
||||
<li>I sent Atmire the dspace.log from today and told them to log into the server to debug the process</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>In other news, I checked the statistics API on DSpace 6 and it’s working</li>
|
||||
<li>I tried to build the OAI registry on the freshly migrated DSpace 6 on DSpace Test and I get an error:</li>
|
||||
</ul>
|
||||
<a href='https://alanorth.github.io/cgspace-notes/2020-06/'>Read more →</a>
|
||||
|
||||
<a href='https://alanorth.github.io/cgspace-notes/tags/migration/'>Read more →</a>
|
||||
</article>
|
||||
|
||||
|
||||
@ -124,23 +96,11 @@
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-05/">May, 2020</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2020-05-02T09:52:04+03:00">Sat May 02, 2020</time> by Alan Orth in
|
||||
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
|
||||
|
||||
|
||||
</p>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/tags/notes/">Notes</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2017-09-07T16:54:52+07:00">Thu Sep 07, 2017</time> by Alan Orth</p>
|
||||
</header>
|
||||
<h2 id="2020-05-02">2020-05-02</h2>
|
||||
<ul>
|
||||
<li>Peter said that CTA is having problems submitting an item to CGSpace
|
||||
<ul>
|
||||
<li>Looking at the PostgreSQL stats it seems to be the same issue that Tezira was having last week, as I see the number of connections in ‘idle in transaction’ and ‘waiting for lock’ state are increasing again</li>
|
||||
<li>I see that CGSpace (linode18) is still using PostgreSQL JDBC driver version 42.2.11, and there were some bugs related to transactions fixed in 42.2.12 (which I had updated in the Ansible playbooks, but not deployed yet)</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<a href='https://alanorth.github.io/cgspace-notes/2020-05/'>Read more →</a>
|
||||
|
||||
<a href='https://alanorth.github.io/cgspace-notes/tags/notes/'>Read more →</a>
|
||||
</article>
|
||||
|
||||
|
||||
@ -148,253 +108,6 @@
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-04/">April, 2020</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2020-04-02T10:53:24+03:00">Thu Apr 02, 2020</time> by Alan Orth in
|
||||
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
|
||||
|
||||
|
||||
</p>
|
||||
</header>
|
||||
<h2 id="2020-04-02">2020-04-02</h2>
|
||||
<ul>
|
||||
<li>Maria asked me to update Charles Staver’s ORCID iD in the submission template and on CGSpace, as his name was lower case before, and now he has corrected it
|
||||
<ul>
|
||||
<li>I updated the fifty-eight existing items on CGSpace</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>Looking into the items Udana had asked about last week that were missing Altmetric donuts:
|
||||
<ul>
|
||||
<li><a href="https://hdl.handle.net/10568/103225">The first</a> is still missing its DOI, so I added it and <a href="https://twitter.com/mralanorth/status/1245632619661766657">tweeted its handle</a> (after a few hours there was a donut with score 222)</li>
|
||||
<li><a href="https://hdl.handle.net/10568/106899">The second item</a> now has a donut with score 2 since I <a href="https://twitter.com/mralanorth/status/1243158045540134913">tweeted its handle</a> last week</li>
|
||||
<li><a href="https://hdl.handle.net/10568/107258">The third item</a> now has a donut with score 1 since I <a href="https://twitter.com/mralanorth/status/1243158786392625153">tweeted it</a> last week</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>On the same note, the <a href="https://hdl.handle.net/10568/106573">one item</a> Abenet pointed out last week now has a donut with score of 104 after I <a href="https://twitter.com/mralanorth/status/1243163710241345536">tweeted it</a> last week</li>
|
||||
</ul>
|
||||
<a href='https://alanorth.github.io/cgspace-notes/2020-04/'>Read more →</a>
|
||||
</article>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-03/">March, 2020</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2020-03-02T12:31:30+02:00">Mon Mar 02, 2020</time> by Alan Orth in
|
||||
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
|
||||
|
||||
|
||||
</p>
|
||||
</header>
|
||||
<h2 id="2020-03-02">2020-03-02</h2>
|
||||
<ul>
|
||||
<li>Update <a href="https://github.com/ilri/dspace-statistics-api">dspace-statistics-api</a> for DSpace 6+ UUIDs
|
||||
<ul>
|
||||
<li>Tag version 1.2.0 on GitHub</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>Test migrating legacy Solr statistics to UUIDs with the as-of-yet unreleased <a href="https://github.com/DSpace/DSpace/commit/184f2b2153479045fba6239342c63e7f8564b8b6#diff-0350ce2e13b28d5d61252b7a8f50a059">SolrUpgradePre6xStatistics.java</a>
|
||||
<ul>
|
||||
<li>You need to download this into the DSpace 6.x source and compile it</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<a href='https://alanorth.github.io/cgspace-notes/2020-03/'>Read more →</a>
|
||||
</article>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-02/">February, 2020</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2020-02-02T11:56:30+02:00">Sun Feb 02, 2020</time> by Alan Orth in
|
||||
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
|
||||
|
||||
|
||||
</p>
|
||||
</header>
|
||||
<h2 id="2020-02-02">2020-02-02</h2>
|
||||
<ul>
|
||||
<li>Continue working on porting CGSpace’s DSpace 5 code to DSpace 6.3 that I started yesterday
|
||||
<ul>
|
||||
<li>Sign up for an account with MaxMind so I can get the GeoLite2-City.mmdb database</li>
|
||||
<li>I still need to wire up the API credentials and cron job into the Ansible infrastructure playbooks</li>
|
||||
<li>Fix some minor issues in the config and XMLUI themes, like removing Atmire stuff</li>
|
||||
<li>The code finally builds and runs with a fresh install</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<a href='https://alanorth.github.io/cgspace-notes/2020-02/'>Read more →</a>
|
||||
</article>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2020-01/">January, 2020</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2020-01-06T10:48:30+02:00">Mon Jan 06, 2020</time> by Alan Orth in
|
||||
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
|
||||
|
||||
|
||||
</p>
|
||||
</header>
|
||||
<h2 id="2020-01-06">2020-01-06</h2>
|
||||
<ul>
|
||||
<li>Open <a href="https://tracker.atmire.com/tickets-cgiar-ilri/view-ticket?id=706">a ticket</a> with Atmire to request a quote for the upgrade to DSpace 6</li>
|
||||
<li>Last week Altmetric responded about the <a href="https://hdl.handle.net/10568/97087">item</a> that had a lower score than its DOI
|
||||
<ul>
|
||||
<li>The score is now linked to the DOI</li>
|
||||
<li>Another <a href="https://handle.hdl.net/10568/91278">item</a> that had the same problem in 2019 is now also linked to the score for its DOI</li>
|
||||
<li>Another <a href="https://hdl.handle.net/10568/81236">item</a> that had the same problem in 2019 has also been fixed</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<h2 id="2020-01-07">2020-01-07</h2>
|
||||
<ul>
|
||||
<li>Peter Ballantyne highlighted one more WLE <a href="https://hdl.handle.net/10568/101286">item</a> that is missing the Altmetric score that its DOI has
|
||||
<ul>
|
||||
<li>The DOI has a score of 259, but the Handle has no score at all</li>
|
||||
<li>I <a href="https://twitter.com/mralanorth/status/1214471427157626881">tweeted</a> the CGSpace repository link</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<a href='https://alanorth.github.io/cgspace-notes/2020-01/'>Read more →</a>
|
||||
</article>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-12/">December, 2019</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2019-12-01T11:22:30+02:00">Sun Dec 01, 2019</time> by Alan Orth in
|
||||
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
|
||||
|
||||
|
||||
</p>
|
||||
</header>
|
||||
<h2 id="2019-12-01">2019-12-01</h2>
|
||||
<ul>
|
||||
<li>Upgrade CGSpace (linode18) to Ubuntu 18.04:
|
||||
<ul>
|
||||
<li>Check any packages that have residual configs and purge them:</li>
|
||||
<li><!-- raw HTML omitted --># dpkg -l | grep -E '^rc' | awk '{print $2}' | xargs dpkg -P<!-- raw HTML omitted --></li>
|
||||
<li>Make sure all packages are up to date and the package manager is up to date, then reboot:</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<pre><code># apt update && apt full-upgrade
|
||||
# apt-get autoremove && apt-get autoclean
|
||||
# dpkg -C
|
||||
# reboot
|
||||
</code></pre>
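<p>A minimal sketch of a dry run for the purge step above, assuming the standard <code>dpkg -l</code> layout where the first column is the status flag and the second is the package name. The helper name <code>list_rc_packages</code> is hypothetical and not part of the original notes; it only prints the candidates so they can be reviewed before anything is passed to <code>dpkg -P</code>:</p>

```shell
# Hypothetical helper: read `dpkg -l` output on stdin and print the names of
# packages in state "rc" (package removed, configuration files remain).
list_rc_packages() {
    awk '$1 == "rc" { print $2 }'
}

# Review the list first, then purge only if it looks right
# (xargs -r is a GNU extension that skips the command on empty input):
#   dpkg -l | list_rc_packages
#   dpkg -l | list_rc_packages | xargs -r dpkg -P
```

<p>Matching on the first field exactly, rather than on a line prefix, avoids picking up any other status flag that merely begins with the same letters.</p>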
|
||||
<a href='https://alanorth.github.io/cgspace-notes/2019-12/'>Read more →</a>
|
||||
</article>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-11/">November, 2019</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2019-11-04T12:20:30+02:00">Mon Nov 04, 2019</time> by Alan Orth in
|
||||
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
|
||||
|
||||
|
||||
</p>
|
||||
</header>
|
||||
<h2 id="2019-11-04">2019-11-04</h2>
|
||||
<ul>
|
||||
<li>Peter noticed that there were 5.2 million hits on CGSpace in 2019-10 according to the Atmire usage statistics
|
||||
<ul>
|
||||
<li>I looked in the nginx logs and see 4.6 million in the access logs, and 1.2 million in the API logs:</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
<pre><code># zcat --force /var/log/nginx/*access.log.*.gz | grep -cE "[0-9]{1,2}/Oct/2019"
|
||||
4671942
|
||||
# zcat --force /var/log/nginx/{rest,oai,statistics}.log.*.gz | grep -cE "[0-9]{1,2}/Oct/2019"
|
||||
1277694
|
||||
</code></pre><ul>
|
||||
<li>So 4.6 million from XMLUI and another 1.2 million from API requests</li>
|
||||
<li>Let’s see how many of the REST API requests were for bitstreams (because they are counted in Solr stats):</li>
|
||||
</ul>
|
||||
<pre><code># zcat --force /var/log/nginx/rest.log.*.gz | grep -c -E "[0-9]{1,2}/Oct/2019"
|
||||
1183456
|
||||
# zcat --force /var/log/nginx/rest.log.*.gz | grep -E "[0-9]{1,2}/Oct/2019" | grep -c -E "/rest/bitstreams"
|
||||
106781
|
||||
</code></pre>
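<p>To put the two counts above in proportion, here is a quick sketch that computes what share of the REST API requests were for bitstreams. The helper name <code>percent_of</code> is hypothetical; the two numbers are the ones printed by the <code>rest.log</code> commands above:</p>

```shell
# Hypothetical helper: print PART as a percentage of TOTAL, one decimal place.
percent_of() {
    awk -v part="$1" -v total="$2" 'BEGIN { printf "%.1f%%\n", part / total * 100 }'
}

# Bitstream requests as a share of all REST API requests in 2019-10
percent_of 106781 1183456
```

<p>With these counts, bitstream requests come to roughly 9% of the REST API traffic for the month.</p>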
|
||||
<a href='https://alanorth.github.io/cgspace-notes/2019-11/'>Read more →</a>
|
||||
</article>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/cgspace-cgcorev2-migration/">CGSpace CG Core v2 Migration</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2019-10-28T13:27:35+02:00">Mon Oct 28, 2019</time> by Alan Orth in
|
||||
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
|
||||
|
||||
|
||||
<span class="fas fa-tag" aria-hidden="true"></span> <a href="/cgspace-notes/tags/migration/" rel="tag">Migration</a>
|
||||
|
||||
</p>
|
||||
</header>
|
||||
<p>Possible changes to CGSpace metadata fields to align more with DC, QDC, and DCTERMS as well as CG Core v2.</p>
|
||||
<p>With reference to <a href="https://agriculturalsemantics.github.io/cg-core/cgcore.html">CG Core v2 draft standard</a> by Marie-Angélique as well as <a href="http://www.dublincore.org/specifications/dublin-core/dcmi-terms/">DCMI DCTERMS</a>.</p>
|
||||
<a href='https://alanorth.github.io/cgspace-notes/cgspace-cgcorev2-migration/'>Read more →</a>
|
||||
</article>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<article class="blog-post">
|
||||
<header>
|
||||
<h2 class="blog-post-title" dir="auto"><a href="https://alanorth.github.io/cgspace-notes/2019-10/">October, 2019</a></h2>
|
||||
<p class="blog-post-meta"><time datetime="2019-10-01T13:20:51+03:00">Tue Oct 01, 2019</time> by Alan Orth in
|
||||
<span class="fas fa-folder" aria-hidden="true"></span> <a href="/cgspace-notes/categories/notes/" rel="category tag">Notes</a>
|
||||
|
||||
|
||||
</p>
|
||||
</header>
|
||||
2019-10-01: Udana from IWMI asked me for a CSV export of their community on CGSpace. I exported it, but a quick run through the csv-metadata-quality tool shows that there is some low-hanging fruit we can fix before I send him the data. I will limit the scope to the titles, regions, subregions, and river basins for now to manually fix some non-breaking spaces (U+00A0) there that would otherwise be removed by the csv-metadata-quality script’s “unnecessary Unicode” fix: $ csvcut -c 'id,dc.
|
||||
<a href='https://alanorth.github.io/cgspace-notes/2019-10/'>Read more →</a>
|
||||
</article>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<nav class="blog-pagination">
|
||||
|
||||
|
||||
|
||||
<a class="btn btn-outline-primary disabled" href="#" role="button" aria-disabled="true">Previous page</a>
|
||||
<a class="btn btn-outline-primary" href="/cgspace-notes/tags/page/2/" rel="next" role="button">Next page</a>
|
||||
|
||||
</nav>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
</div> <!-- /.blog-main -->
|
||||
|
@@ -14,7 +14,7 @@
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="Migration"/>
|
||||
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@@ -14,7 +14,7 @@
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="Notes"/>
|
||||
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@@ -14,7 +14,7 @@
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="Notes"/>
|
||||
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|
@@ -14,7 +14,7 @@
|
||||
<meta name="twitter:card" content="summary"/>
|
||||
<meta name="twitter:title" content="Notes"/>
|
||||
<meta name="twitter:description" content="Documenting day-to-day work on the [CGSpace](https://cgspace.cgiar.org) repository."/>
|
||||
<meta name="generator" content="Hugo 0.72.0" />
|
||||
<meta name="generator" content="Hugo 0.73.0" />
|
||||
|
||||
|
||||
|
||||
|