- Looking at the CG Core document again, I'll send some feedback to Peter and Abenet:
- We use cg.contributor.crp to indicate the CRP(s) affiliated with the item
- DSpace has dc.date.available, but this field isn't particularly meaningful other than as an automatic timestamp at the time of item accession (and is identical to dc.date.accessioned)
- dc.relation exists in CGSpace but isn't used; instead we use dc.relation.ispartofseries, which appears ~5,000 times and holds the series name and the number within that series
- Also, I'm noticing some weird outliers in `cg.coverage.region`, need to remember to go correct these later:
```
dspace=# select text_value from metadatavalue where resource_type_id=2 and metadata_field_id=227;
```
- Trying to find a way to get the number of items submitted by a certain user in 2016
- It's not possible in the DSpace search / module interfaces, but might be able to be derived from `dc.description.provenance`, as that field contains the name and email of the submitter/approver, ie:
```
Submitted by Francesca Giampieri (fgiampieri) on 2016-01-19T13:56:43Z^M
```
- This SQL query returns fields that were submitted or approved by giampieri in 2016 and contain a "checksum" (ie, there was a bitstream in the submission):
```
dspace=# select * from metadatavalue where resource_type_id=2 and metadata_field_id=28 and text_value ~ '^(Submitted|Approved).*giampieri.*2016-.*checksum.*';
```
- Then this one does the same, but for fields that don't contain checksums (ie, there was no bitstream in the submission):
```
dspace=# select * from metadatavalue where resource_type_id=2 and metadata_field_id=28 and text_value ~ '^(Submitted|Approved).*giampieri.*2016-.*' and text_value !~ '^(Submitted|Approved).*giampieri.*2016-.*checksum.*';
```
- For some reason there seem to be way too many fields, for example there are 498 + 13 here, which is 511 items for just this one user.
- It looks like there can be a scenario where the user submitted AND approved it, so some records might be doubled...
- In that case it might just be better to see how many the user submitted (both _with_ and _without_ bitstreams):
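If the doubling really comes from the same user both submitting and approving an item, counting only the values that start with `Submitted` should give one hit per item. The Python sketch below illustrates that logic, not the actual query. Apart from the first line of the first string, the sample provenance values are made up, and `re.DOTALL` is used on the assumption that the checksum sits on a later line of the same value:

```python
import re

# Hypothetical provenance values; only the first line of the first one is from the notes.
provenance = [
    "Submitted by Francesca Giampieri (fgiampieri) on 2016-01-19T13:56:43Z\r\n"
    "No. of bitstreams: 1\r\nexample.pdf: 1234 bytes, checksum: abc123 (MD5)",
    "Approved for entry into archive by Francesca Giampieri (fgiampieri) "
    "on 2016-01-19T14:02:10Z\r\n"
    "No. of bitstreams: 1\r\nexample.pdf: 1234 bytes, checksum: abc123 (MD5)",
    "Submitted by Francesca Giampieri (fgiampieri) on 2016-02-03T09:15:00Z\r\n"
    "No. of bitstreams: 0",
    "Submitted by Someone Else (selse) on 2016-03-01T10:00:00Z\r\n"
    "No. of bitstreams: 0",
]

# Same shape as the SQL patterns above, but anchored on "Submitted" only.
submitted = re.compile(r"^Submitted.*giampieri.*2016-", re.DOTALL)
has_checksum = re.compile(r"checksum")


def count_submissions(values):
    """Return (with_bitstream, without_bitstream) counts for submitted items."""
    with_bitstream = without_bitstream = 0
    for value in values:
        if not submitted.search(value):
            continue  # approvals and other users are skipped
        if has_checksum.search(value):
            with_bitstream += 1
        else:
            without_bitstream += 1
    return with_bitstream, without_bitstream


print(count_submissions(provenance))
```

The approval record and the other user's submission are both skipped, so each of the sample user's items is counted once.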
- After reading the [notes for DCAT April 2017](https://wiki.duraspace.org/display/cmtygp/DCAT+Meeting+April+2017) I am testing some new settings for PostgreSQL on DSpace Test:
- `db.maxconnections` 30→70 (the default PostgreSQL config allows 100 connections, so DSpace's default of 30 is quite low)
- `db.maxwait` 5000→10000
- `db.maxidle` 8→20 (DSpace default is -1, unlimited, but we had set it to 8 earlier)
- I need to look at the Munin graphs after a few days to see if the load has changed
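A rough sanity check for the pool settings above: if each DSpace web application keeps its own connection pool, the combined `db.maxconnections` could exceed what PostgreSQL accepts. Both the per-application pool and the count of three applications are assumptions here, not something taken from the DSpace Test configuration. A small Python sketch of that arithmetic:

```python
# PostgreSQL's default connection limit, as mentioned in the notes.
PG_MAX_CONNECTIONS = 100


def worst_case_connections(pool_max, webapps):
    """Upper bound on open connections if every pool fills up at once."""
    return pool_max * webapps


def fits(pool_max, webapps, pg_max=PG_MAX_CONNECTIONS):
    """True if the worst case still fits under PostgreSQL's limit."""
    return worst_case_connections(pool_max, webapps) <= pg_max


# webapps=3 is a hypothetical count used only for illustration.
for pool_max in (30, 70):
    total = worst_case_connections(pool_max, webapps=3)
    print(f"db.maxconnections={pool_max}: up to {total} connections, "
          f"fits={fits(pool_max, 3)}")
```

If the assumption holds, this is one more thing to watch for in the Munin graphs alongside the load.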
- Sisay added their OAI as a source to a new collection, but using the Simple Dublin Core method, so many fields are unqualified and duplicated
- Looking at the [documentation](https://wiki.duraspace.org/display/DSDOC5x/XMLUI+Configuration+and+Customization) it seems that we probably want to be using DSpace Intermediate Metadata
- CIFOR is starting to test aligning their metadata more with CGSpace/CG Core
- They shared a [test item](https://data.cifor.org/dspace/xmlui/handle/11463/947?show=full) which is using `cg.coverage.country`, `cg.subject.cifor`, `dc.subject`, and `dc.date.issued`
- Looking at their OAI I'm not sure it has updated as I don't see the new fields: https://data.cifor.org/dspace/oai/request?verb=ListRecords&resumptionToken=oai_dc///col_11463_6/900
- Maybe they need to make sure they are running the OAI cache refresh cron job, or maybe OAI doesn't export these?
- I added `cg.subject.cifor` to the metadata registry and I'm waiting for the harvester to re-harvest to see if it picks up more data now
- Another possibility is that we could use a crosswalk... but I've never done it.
## 2017-04-11
- Looking at the item from CIFOR it hasn't been updated yet, maybe they aren't running the cron job
- Side note: WTF, I just saw an item on CGSpace's OAI that is using `dc.cplace.country` and `dc.rplace.region`, which we stopped using in 2016 after the metadata migrations:
![stale metadata in OAI](/cgspace-notes/2017/04/cplace.png)
- The particular item is [10568/6](http://hdl.handle.net/10568/6) and, for what it's worth, the stale metadata only appears in the OAI view:
```
OAI 2.0 manager action ended. It took 829 seconds.
```
- After reading some threads on the DSpace mailing list, I see that `clean-cache` is actually only for caching _responses_, ie to client requests in the OAI web application
- These are stored in `[dspace]/var/oai/requests/`
- The import command should theoretically catch situations like this where an item's metadata was updated, but in this case we changed the metadata schema and it doesn't seem to catch it (could be a bug!)
```
OAI 2.0 manager action ended. It took 1032 seconds.
real 17m20.156s
user 4m35.293s
sys 1m29.310s
```
- Now the data for 10568/6 is correct in OAI: https://cgspace.cgiar.org/oai/request?verb=GetRecord&metadataPrefix=dim&identifier=oai:cgspace.cgiar.org:10568/6
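Requests like the one above can be built programmatically when several items need checking. A minimal Python sketch, where `oai_get_record_url` is a hypothetical helper and only the base URL, the `dim` prefix and the identifier format come from the notes:

```python
from urllib.parse import urlencode

BASE_URL = "https://cgspace.cgiar.org/oai/request"


def oai_get_record_url(handle, metadata_prefix="dim", base_url=BASE_URL,
                       repository="cgspace.cgiar.org"):
    """Build an OAI-PMH GetRecord URL for a DSpace handle such as 10568/6."""
    params = {
        "verb": "GetRecord",
        "metadataPrefix": metadata_prefix,
        "identifier": f"oai:{repository}:{handle}",
    }
    # Keep ':' and '/' unescaped so the URL has the same form as the one above.
    return f"{base_url}?{urlencode(params, safe=':/')}"


print(oai_get_record_url("10568/6"))
```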
- Perhaps I need to file a bug for this, or at least ask on the dspace-tech mailing list?
- I wonder if we could use a crosswalk to convert to a format that CG Core wants, like `<date Type="Available">`
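Conceptually, a crosswalk of this kind maps a qualified field to an element plus a `Type` attribute. The Python sketch below only illustrates that mapping; a real DSpace crosswalk would be configured differently. The function name, the sample date, and the rule of capitalising the qualifier are assumptions extrapolated from the single `<date Type="Available">` example:

```python
import xml.etree.ElementTree as ET


def crosswalk_field(field, value):
    """Turn e.g. dc.date.available into <date Type="Available">value</date>."""
    parts = field.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected schema.element.qualifier, got {field!r}")
    _schema, element, qualifier = parts
    # Assumed rule: the qualifier becomes the Type attribute, capitalised.
    node = ET.Element(element, {"Type": qualifier.capitalize()})
    node.text = value
    return node


# The timestamp is a made-up sample value.
node = crosswalk_field("dc.date.available", "2017-04-11T08:00:00Z")
print(ET.tostring(node, encoding="unicode"))
```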