Update notes for 2017-01-24

2017-01-24 12:41:58 +02:00
parent dad9c406f6
commit 54c60de7d1
5 changed files with 70 additions and 22 deletions

@@ -194,8 +194,7 @@ value + "__description:" + cells["dc.type"].value
- Test importing of the new CIAT records (actually there are 232, not 234):
```
$ JAVA_OPTS="-Xmx512m -Dfile.encoding=UTF-8" /home/dspacetest.cgiar.org/bin/dspace import --add --eperson=aorth@mjanja.ch --collection=10568
/79042 --source /home/aorth/CIAT_234/SimpleArchiveFormat/ --mapfile=/tmp/ciat.map &> /tmp/ciat.log
$ JAVA_OPTS="-Xmx512m -Dfile.encoding=UTF-8" /home/dspacetest.cgiar.org/bin/dspace import --add --eperson=aorth@mjanja.ch --collection=10568/79042 --source /home/aorth/CIAT_234/SimpleArchiveFormat/ --mapfile=/tmp/ciat.map &> /tmp/ciat.log
```
- Many of the PDFs are 20, 30, 40, 50+ MB, which makes a total of 4GB
@@ -246,3 +245,12 @@ $ for community in 10568/171 10568/27868 10568/231 10568/27869 10568/150 10568/2
```
$ ./fix-metadata-values.py -i /tmp/fix-49-journal-titles.csv -f dc.source -t correct -m 55 -d dspace -u dspace -p 'password'
```
- Create a new list of the top 500 journal titles from the database:
```
dspace-# \copy (select distinct text_value, count(*) from metadatavalue where resource_type_id=2 and metadata_field_id=55 group by text_value order by count desc limit 500) to /tmp/journal-titles.csv with csv;
```
- Then sort them in OpenRefine and create a controlled vocabulary by manually adding the XML markup (pull request [#298](https://github.com/ilri/DSpace/pull/298))
- This would be the last item remaining before closing the meta issue about switching to controlled vocabularies ([#69](https://github.com/ilri/DSpace/pull/69))
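- The XML markup above was added by hand; a minimal Python sketch of how the same step might be scripted follows. It assumes the CSV has the title in the first column and the count in the second (as the `\copy` query would produce), and it assumes the vocabulary uses nested `node` elements inside `isComposedBy`, which should be checked against an existing vocabulary file before use. The function name, the `dc-source` id, and the sample titles are hypothetical:

```python
import csv
import io
import xml.etree.ElementTree as ET


def titles_to_vocabulary(csv_text, vocab_id="dc-source", label="Journal Title"):
    """Build a vocabulary tree from CSV rows of (title, count).

    Assumption: the node/isComposedBy layout and the id/label attributes
    are illustrative and must match whatever the target system expects.
    """
    rows = csv.reader(io.StringIO(csv_text))
    # Keep only the title column, drop blanks and duplicates,
    # then sort case-insensitively (ties broken by the raw string).
    titles = sorted(
        {row[0].strip() for row in rows if row and row[0].strip()},
        key=lambda t: (t.casefold(), t),
    )

    root = ET.Element("node", id=vocab_id, label=label)
    composed = ET.SubElement(root, "isComposedBy")
    for title in titles:
        # Assumption: the title doubles as both id and label.
        ET.SubElement(composed, "node", id=title, label=title)
    return root


# Made-up sample rows standing in for /tmp/journal-titles.csv
sample = 'Nature,12\n"Food Policy",7\n"Agriculture, Ecosystems & Environment",9\n'
vocabulary = titles_to_vocabulary(sample)
# ElementTree escapes characters such as "&" when serializing
xml_text = ET.tostring(vocabulary, encoding="unicode")
print(xml_text)
```

- Building the tree with `xml.etree.ElementTree` rather than string concatenation means titles containing `&` or quotes are escaped automatically, which is the main thing that is easy to get wrong when adding the markup manually.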