<!DOCTYPE html>
< html lang = "en" >
< head >
< meta charset = "utf-8" >
< meta name = "viewport" content = "width=device-width, initial-scale=1, shrink-to-fit=no" >
< meta property = "og:title" content = "May, 2016" / >
< meta property = "og:description" content = "2016-05-01
Since yesterday there have been 10,000 REST errors and the site has been unstable again
I have blocked access to the API now
There are 3,000 IPs accessing the REST API in a 24-hour period!
# awk '{print $1}' /var/log/nginx/rest.log | uniq | wc -l
3168
" />
< meta property = "og:type" content = "article" / >
< meta property = "og:url" content = "https://alanorth.github.io/cgspace-notes/2016-05/" / >
< meta property = "article:published_time" content = "2016-05-01T23:06:00+03:00" / >
< meta property = "article:modified_time" content = "2020-04-13T15:30:24+03:00" / >
< meta name = "twitter:card" content = "summary" / >
< meta name = "twitter:title" content = "May, 2016" / >
< meta name = "twitter:description" content = "2016-05-01
Since yesterday there have been 10,000 REST errors and the site has been unstable again
I have blocked access to the API now
There are 3,000 IPs accessing the REST API in a 24-hour period!
# awk '{print $1}' /var/log/nginx/rest.log | uniq | wc -l
3168
"/>
< meta name = "generator" content = "Hugo 0.109.0" >
< script type = "application/ld+json" >
{
"@context": "http://schema.org",
"@type": "BlogPosting",
"headline": "May, 2016",
"url": "https://alanorth.github.io/cgspace-notes/2016-05/",
"wordCount": "1349",
"datePublished": "2016-05-01T23:06:00+03:00",
"dateModified": "2020-04-13T15:30:24+03:00",
"author": {
"@type": "Person",
"name": "Alan Orth"
},
"keywords": "Notes"
}
< / script >
< link rel = "canonical" href = "https://alanorth.github.io/cgspace-notes/2016-05/" >
< title > May, 2016 | CGSpace Notes< / title >
<!-- combined, minified CSS -->
< link href = "https://alanorth.github.io/cgspace-notes/css/style.c6ba80bc50669557645abe05f86b73cc5af84408ed20f1551a267bc19ece8228.css" rel = "stylesheet" integrity = "sha256-xrqAvFBmlVdkWr4F+GtzzFr4RAjtIPFVGiZ7wZ7Ogig=" crossorigin = "anonymous" >
<!-- minified Font Awesome for SVG icons -->
< script defer src = "https://alanorth.github.io/cgspace-notes/js/fontawesome.min.f5072c55a0721857184db93a50561d7dc13975b4de2e19db7f81eb5f3fa57270.js" integrity = "sha256-9QcsVaByGFcYTbk6UFYdfcE5dbTeLhnbf4HrXz+lcnA=" crossorigin = "anonymous" > < / script >
<!-- RSS 2.0 feed -->
< / head >
< body >
< div class = "blog-masthead" >
< div class = "container" >
< nav class = "nav blog-nav" >
< a class = "nav-link " href = "https://alanorth.github.io/cgspace-notes/" > Home< / a >
< / nav >
< / div >
< / div >
< header class = "blog-header" >
< div class = "container" >
< h1 class = "blog-title" dir = "auto" > < a href = "https://alanorth.github.io/cgspace-notes/" rel = "home" > CGSpace Notes< / a > < / h1 >
< p class = "lead blog-description" dir = "auto" > Documenting day-to-day work on the < a href = "https://cgspace.cgiar.org" > CGSpace< / a > repository.< / p >
< / div >
< / header >
< div class = "container" >
< div class = "row" >
< div class = "col-sm-8 blog-main" >
< article class = "blog-post" >
< header >
< h2 class = "blog-post-title" dir = "auto" > < a href = "https://alanorth.github.io/cgspace-notes/2016-05/" > May, 2016< / a > < / h2 >
< p class = "blog-post-meta" >
< time datetime = "2016-05-01T23:06:00+03:00" > Sun May 01, 2016< / time >
in
< span class = "fas fa-tag" aria-hidden = "true" > < / span > < a href = "/tags/notes/" rel = "tag" > Notes< / a >
< / p >
< / header >
< h2 id = "2016-05-01" > 2016-05-01< / h2 >
< ul >
< li > Since yesterday there have been 10,000 REST errors and the site has been unstable again< / li >
< li > I have blocked access to the API now< / li >
< li > There are 3,000 IPs accessing the REST API in a 24-hour period!< / li >
< / ul >
< pre tabindex = "0" > < code > # awk '{print $1}' /var/log/nginx/rest.log | uniq | wc -l
3168
< / code > < / pre > < ul >
< li > The two most frequent requesters are in Ethiopia and Colombia: 213.55.99.121 and 181.118.144.29< / li >
< li > 100% of the requests coming from Ethiopia are like this and result in an HTTP 500:< / li >
< / ul >
< pre tabindex = "0" > < code > GET /rest/handle/10568/NaN?expand=parentCommunityList,metadata HTTP/1.1
< / code > < / pre > < ul >
< li > For now I’ll block just the Ethiopian IP< / li >
< li > The owner of that application has said that the < code > NaN< / code > (not a number) is an error in his code and he’ll fix it< / li >
< / ul >
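< p > A note on the count above: < code > uniq< / code > only collapses adjacent duplicate lines, so on a log that is not sorted by IP it can count the same client more than once (piping through < code > sort -u< / code > avoids that). Below is a minimal Python sketch of a distinct-client count, assuming the client IP is the first whitespace-separated field of each line, as in the < code > awk< / code > command. The function name and the sample lines are hypothetical.< / p >

```python
def count_unique_clients(lines):
    """Count distinct values of the first whitespace-separated field."""
    clients = set()
    for line in lines:
        fields = line.split()
        if fields:  # skip blank lines
            clients.add(fields[0])
    return len(clients)


# Hypothetical sample: two clients whose requests are interleaved,
# which is the case where plain `uniq` would over-count.
sample = [
    "213.55.99.121 GET /rest/handle/10568/NaN",
    "181.118.144.29 GET /rest/items",
    "213.55.99.121 GET /rest/handle/10568/NaN",
]
print(count_unique_clients(sample))
```

< p > Using a set makes the result independent of line order, which is the property the unsorted pipeline lacks.< / p >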
< h2 id = "2016-05-03" > 2016-05-03< / h2 >
< ul >
< li > Update nginx to 1.10.x branch on CGSpace< / li >
< li > Fix a reference to < code > dc.type.output< / code > in Discovery that I had missed when we migrated to < code > dc.type< / code > last month (< a href = "https://github.com/ilri/DSpace/pull/223" > #223< / a > )< / li >
< / ul >
< p > < img src = "/cgspace-notes/2016/05/discovery-types.png" alt = "Item type in Discovery results" > < / p >
< h2 id = "2016-05-06" > 2016-05-06< / h2 >
< ul >
< li > DSpace Test is down, < code > catalina.out< / code > has lots of messages about heap space from some time yesterday (!)< / li >
< li > It looks like Sisay was doing some batch imports< / li >
< li > Hmm, also disk space is full< / li >
< li > I decided to blow away the Solr indexes, since they are 50GB and we don’t really need all the Atmire stuff there right now< / li >
< li > I will re-generate the Discovery indexes after re-deploying< / li >
< li > Testing < code > renew-letsencrypt.sh< / code > script for nginx< / li >
< / ul >
< pre tabindex = "0" > < code > #!/usr/bin/env bash
readonly SERVICE_BIN=/usr/sbin/service
readonly LETSENCRYPT_BIN=/opt/letsencrypt/letsencrypt-auto
# stop nginx so LE can listen on port 443
$SERVICE_BIN nginx stop
$LETSENCRYPT_BIN renew -nvv --standalone --standalone-supported-challenges tls-sni-01 > /var/log/letsencrypt/renew.log 2>&1
LE_RESULT=$?
$SERVICE_BIN nginx start
if [[ "$LE_RESULT" != 0 ]]; then
    echo 'Automated renewal failed:'
    cat /var/log/letsencrypt/renew.log
    exit 1
fi
< / code > < / pre > < ul >
< li > Seems to work well< / li >
< / ul >
< h2 id = "2016-05-10" > 2016-05-10< / h2 >
< ul >
< li > Start looking at more metadata migrations< / li >
< li > There are lots of fields in < code > dcterms< / code > namespace that look interesting, like:
< ul >
< li > dcterms.type< / li >
< li > dcterms.spatial< / li >
< / ul >
< / li >
< li > Not sure what < code > dcterms< / code > is… < / li >
< li > Looks like these were < a href = "https://wiki.lyrasis.org/display/DSDOC5x/Metadata+and+Bitstream+Format+Registries#MetadataandBitstreamFormatRegistries-DublinCoreTermsRegistry(DCTERMS)" > added in DSpace 4< / a > to allow for future work to make DSpace more flexible< / li >
< li > CGSpace’s < code > dc< / code > registry has 96 items, and the default DSpace one has 73.< / li >
< / ul >
< h2 id = "2016-05-11" > 2016-05-11< / h2 >
< ul >
< li >
< p > Identify and propose the next phase of CGSpace fields to migrate:< / p >
< ul >
< li > dc.title.jtitle → cg.title.journal< / li >
< li > dc.identifier.status → cg.identifier.status< / li >
< li > dc.river.basin → cg.river.basin< / li >
< li > dc.Species → cg.species< / li >
< li > dc.targetaudience → cg.targetaudience< / li >
< li > dc.fulltextstatus → cg.fulltextstatus< / li >
< li > dc.editon → cg.edition< / li >
< li > dc.isijournal → cg.isijournal< / li >
< / ul >
< / li >
< li >
< p > Start a test rebase of the < code > 5_x-prod< / code > branch on top of the < code > dspace-5.5< / code > tag< / p >
< / li >
< li >
< p > There were a handful of conflicts that I didn’t understand< / p >
< / li >
< li >
< p > After completing the rebase I tried to build with the module versions Atmire had indicated as being 5.5 ready but I got this error:< / p >
< / li >
< / ul >
< pre tabindex = "0" > < code > [ERROR] Failed to execute goal on project additions: Could not resolve dependencies for project org.dspace.modules:additions:jar:5.5: Could not find artifact com.atmire:atmire-metadata-quality-api:jar:5.5-2.10.1-0 in sonatype-releases (https://oss.sonatype.org/content/repositories/releases/) -> [Help 1]
< / code > < / pre > < ul >
< li > I’ve sent them a question about it< / li >
< li > A user mentioned having problems with uploading a 33 MB PDF< / li >
< li > I told her I would increase the limit temporarily tomorrow morning< / li >
< li > Turns out she was able to decrease the size of the PDF so we didn’t have to do anything< / li >
< / ul >
< h2 id = "2016-05-12" > 2016-05-12< / h2 >
< ul >
< li > Looks like the issue that Abenet was having a few days ago with “Connection Reset” in Firefox might be due to a Firefox 46 issue: < a href = "https://bugzilla.mozilla.org/show_bug.cgi?id=1268775" > https://bugzilla.mozilla.org/show_bug.cgi?id=1268775< / a > < / li >
< li > I finally found a copy of the latest CG Core metadata guidelines and it looks like we can add a few more fields to our next migration:
< ul >
< li > dc.rplace.region → cg.coverage.region< / li >
< li > dc.cplace.country → cg.coverage.country< / li >
< / ul >
< / li >
< li > Questions for CG people:
< ul >
< li > Our < code > dc.place< / code > and < code > dc.srplace.subregion< / code > could both map to < code > cg.coverage.admin-unit< / code > ?< / li >
< li > Should we use < code > dc.contributor.crp< / code > or < code > cg.contributor.crp< / code > for the CRP (ours is < code > dc.crsubject.crpsubject< / code > )?< / li >
< li > Our < code > dc.contributor.affiliation< / code > and < code > dc.contributor.corporate< / code > could both map to < code > dc.contributor< / code > and possibly < code > dc.contributor.center< / code > depending on whether it’s a CG center or not< / li >
< li > < code > dc.title.jtitle< / code > could either map to < code > dc.publisher< / code > or < code > dc.source< / code > depending on how you read things< / li >
< / ul >
< / li >
< li > Found ~200 messed up CIAT values in < code > dc.publisher< / code > :< / li >
< / ul >
< pre tabindex = "0" > < code > # select text_value from metadatavalue where resource_type_id=2 and metadata_field_id=39 and text_value similar to "% %";
< / code > < / pre > < h2 id = "2016-05-13" > 2016-05-13< / h2 >
< ul >
< li > More theorizing about CGcore< / li >
< li > Add two new fields:
< ul >
< li > dc.srplace.subregion → cg.coverage.admin-unit< / li >
< li > dc.place → cg.place< / li >
< / ul >
< / li >
< li > < code > dc.place< / code > is our own field, so it’s easy to move< / li >
< li > I’ve removed < code > dc.title.jtitle< / code > from the list for now because there’s no use moving it out of DC until we know where it will go (see discussion yesterday)< / li >
< / ul >
< h2 id = "2016-05-18" > 2016-05-18< / h2 >
< ul >
< li > Work on 707 CCAFS records< / li >
< li > They have thumbnails on Flickr and elsewhere< / li >
< li > In OpenRefine I created a new < code > filename< / code > column based on the < code > thumbnail< / code > column with the following GREL:< / li >
< / ul >
< pre tabindex = "0" > < code > if(cells['thumbnails'].value.contains('hqdefault'), cells['thumbnails'].value.split('/')[-2] + '.jpg', cells['thumbnails'].value.split('/')[-1])
< / code > < / pre > < ul >
< li > Because ~400 records had the same filename on Flickr (hqdefault.jpg) but different UUIDs in the URL< / li >
< li > So for the < code > hqdefault.jpg< / code > ones I just take the UUID (-2) and use it as the filename< / li >
< li > Before importing with SAFBuilder I tested adding “__bundle:THUMBNAIL” to the < code > filename< / code > column and it works fine< / li >
< / ul >
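< p > For readers without OpenRefine, the filename rule from the GREL expression above can be sketched in Python. This is only an illustration of the same logic: the function name and the example URLs are hypothetical, and the real thumbnail URLs are not shown in these notes.< / p >

```python
def thumbnail_filename(url):
    """Derive a local filename from a thumbnail URL, mirroring the GREL rule."""
    parts = url.split("/")
    if "hqdefault" in url:
        # Every such file is called hqdefault.jpg, so use the second-to-last
        # path segment (the identifier) to keep filenames distinct.
        return parts[-2] + ".jpg"
    # Otherwise the last path segment is already a usable filename.
    return parts[-1]


# Hypothetical URLs for illustration only
print(thumbnail_filename("https://example.org/vi/abc123/hqdefault.jpg"))
print(thumbnail_filename("https://example.org/photos/image-01.jpg"))
```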
< h2 id = "2016-05-19" > 2016-05-19< / h2 >
< ul >
< li > More quality control on < code > filename< / code > field of CCAFS records to make processing in shell and SAFBuilder more reliable:< / li >
< / ul >
< pre tabindex = "0" > < code > value.replace('_','').replace('-','')
< / code > < / pre > < ul >
< li > We need to hold off on moving < code > dc.Species< / code > to < code > cg.species< / code > because it is only used for plants, and might be better to move it to something like < code > cg.species.plant< / code > < / li >
< li > And < code > dc.identifier.fund< / code > is MOSTLY used for CPWF project identifier but has some other sponsorship things
< ul >
< li > We should move PN*, SG*, CBA, IA, and PHASE* values to < code > cg.identifier.cpwfproject< / code > < / li >
< li > The rest, like BMGF and USAID etc, might have to go to either < code > dc.description.sponsorship< / code > or < code > cg.identifier.fund< / code > (not sure yet)< / li >
< li > There are also some mistakes in CPWF’s things, like “PN 47”< / li >
< li > This ought to catch all the CPWF values (there don’t appear to be any SG* values):< / li >
< / ul >
< / li >
< / ul >
< pre tabindex = "0" > < code > # select text_value from metadatavalue where resource_type_id=2 and metadata_field_id=75 and (text_value like 'PN%' or text_value like 'PHASE%' or text_value = 'CBA' or text_value = 'IA');
< / code > < / pre > < h2 id = "2016-05-20" > 2016-05-20< / h2 >
< ul >
< li > More work on CCAFS Video and Images records< / li >
< li > For SAFBuilder we need to modify filename column to have the thumbnail bundle:< / li >
< / ul >
< pre tabindex = "0" > < code > value + "__bundle:THUMBNAIL"
< / code > < / pre > < ul >
< li > Also, I fixed some weird characters using OpenRefine’s transform with the following GREL:< / li >
< / ul >
< pre tabindex = "0" > < code > value.replace(/\u0081/,'')
< / code > < / pre > < ul >
< li > Write shell script to resize thumbnails with height larger than 400: < a href = "https://gist.github.com/alanorth/131401dcd39d00e0ce12e1be3ed13256" > https://gist.github.com/alanorth/131401dcd39d00e0ce12e1be3ed13256< / a > < / li >
< li > Upload 707 CCAFS records to DSpace Test< / li >
< li > A few miscellaneous fixes for XMLUI display niggles (spaces in item lists and link target < code > _black< / code > ): < a href = "https://github.com/ilri/DSpace/pull/224" > #224< / a > < / li >
< li > Work on configuration changes for Phase 2 metadata migrations< / li >
< / ul >
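< p > The resize script itself lives in the gist linked above and is not reproduced here. As a rough sketch of the arithmetic only, the function below computes new dimensions for an image taller than 400 pixels. The 400-pixel threshold comes from the note above; preserving the aspect ratio and the function name are assumptions.< / p >

```python
def fit_height(width, height, max_height=400):
    """Return (width, height) scaled down so height does not exceed max_height."""
    if height <= max_height:
        # Already small enough: leave the image untouched.
        return width, height
    scale = max_height / height
    # Keep the aspect ratio (assumption) and never round the width down to 0.
    return max(1, round(width * scale)), max_height


print(fit_height(1200, 800))
print(fit_height(300, 200))
```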
< h2 id = "2016-05-23" > 2016-05-23< / h2 >
< ul >
< li > Try to import the CCAFS Images and Videos to CGSpace but had some issues with LibreOffice and OpenRefine< / li >
< li > LibreOffice excludes empty cells when it exports and all the fields shift over to the left and cause URLs to go to Subjects, etc.< / li >
< li > Google Docs does this better, but somehow reorders the rows and when I paste the thumbnail/filename row in they don’t match!< / li >
< li > I will have to try later< / li >
< / ul >
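< p > This is not a diagnosis of what LibreOffice did, but the column shift is easy to check for. The sketch below uses Python’s standard < code > csv< / code > module to show what a column-preserving round trip looks like: an empty cell stays in place as an empty string, so later fields do not slide left. The column names and values are hypothetical.< / p >

```python
import csv
import io

# Hypothetical record with an empty "subject" cell
rows = [
    ["title", "subject", "url"],
    ["Example record", "", "https://example.org/thumb.jpg"],
]

# Write the rows as CSV, then read them back
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
parsed = list(csv.reader(io.StringIO(buffer.getvalue())))

# The empty cell survives, so the URL is still in the third column
print(parsed[1])
```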
< h2 id = "2016-05-30" > 2016-05-30< / h2 >
< ul >
< li > Export CCAFS video and image records from DSpace Test using the migrate option (< code > -m< / code > ):< / li >
< / ul >
< pre tabindex = "0" > < code > $ mkdir ~/ccafs-images
$ /home/dspacetest.cgiar.org/bin/dspace export -t COLLECTION -i 10568/79355 -d ~/ccafs-images -n 0 -m
< / code > < / pre > < ul >
< li > And then import to CGSpace:< / li >
< / ul >
< pre tabindex = "0" > < code > $ JAVA_OPTS="-Xmx512m -Dfile.encoding=UTF-8" /home/cgspace.cgiar.org/bin/dspace import --add --eperson=aorth@mjanja.ch --collection=10568/70974 --source /tmp/ccafs-images --mapfile=/tmp/ccafs-images-may30.map &> /tmp/ccafs-images-may30.log
< / code > < / pre > < ul >
< li > But now we have double authors for “CGIAR Research Program on Climate Change, Agriculture and Food Security” in the authority< / li >
< li > I’m trying to do a Discovery index before messing with the authority index< / li >
< li > Looks like we are missing the < code > index-authority< / code > cron job, so who knows what’s up with our authority index< / li >
< li > Run system updates on DSpace Test, re-deploy code, and reboot the server< / li >
< li > Clean up and import ~200 CTA records to CGSpace via CSV like:< / li >
< / ul >
< pre tabindex = "0" > < code > $ export JAVA_OPTS="-Xmx512m -Dfile.encoding=UTF-8"
$ /home/cgspace.cgiar.org/bin/dspace metadata-import -e aorth@mjanja.ch -f ~/CTA-May30/CTA-42229.csv &> ~/CTA-May30/CTA-42229.log
< / code > < / pre > < ul >
< li > Discovery indexing took a few hours for some reason, and after that I started the < code > index-authority< / code > script< / li >
< / ul >
< pre tabindex = "0" > < code > $ JAVA_OPTS="-Xmx1024m -Dfile.encoding=UTF-8" /home/cgspace.cgiar.org/bin/dspace index-authority
< / code > < / pre > < h2 id = "2016-05-31" > 2016-05-31< / h2 >
< ul >
< li > The < code > index-authority< / code > script ran overnight and was finished in the morning< / li >
< li > Hopefully this was because we haven’t been running it regularly and it will speed up next time< / li >
< li > I am running it again with a timer to see:< / li >
< / ul >
< pre tabindex = "0" > < code > $ time /home/cgspace.cgiar.org/bin/dspace index-authority
Retrieving all data
Initialize org.dspace.authority.indexer.DSpaceAuthorityIndexer
Cleaning the old index
Writing new data
All done !
real 37m26.538s
user 2m24.627s
sys 0m20.540s
< / code > < / pre > < ul >
< li > Update < code > tomcat7< / code > crontab on CGSpace and DSpace Test to have the < code > index-authority< / code > script that we were missing< / li >
< li > Add new ILRI subject and CCAFS project tags to < code > input-forms.xml< / code > (< a href = "https://github.com/ilri/DSpace/pull/226" > #226< / a > , < a href = "https://github.com/ilri/DSpace/pull/225" > #225< / a > )< / li >
< li > Manually mapped the authors of a few old CCAFS records to the new CCAFS authority UUID and re-indexed authority indexes to see if it helps correct those items.< / li >
< li > Re-sync DSpace Test data with CGSpace< / li >
< li > Clean up and import ~65 more CTA items into CGSpace< / li >
< / ul >
< / article >
< / div > <!-- /.blog-main -->
< aside class = "col-sm-3 ml-auto blog-sidebar" >
< section class = "sidebar-module" >
< h4 > Recent Posts< / h4 >
< ol class = "list-unstyled" >
< li > < a href = "/cgspace-notes/2023-01/" > January, 2023< / a > < / li >
< li > < a href = "/cgspace-notes/2022-12/" > December, 2022< / a > < / li >
< li > < a href = "/cgspace-notes/2022-11/" > November, 2022< / a > < / li >
< li > < a href = "/cgspace-notes/2022-10/" > October, 2022< / a > < / li >
< li > < a href = "/cgspace-notes/2022-09/" > September, 2022< / a > < / li >
< / ol >
< / section >
< section class = "sidebar-module" >
< h4 > Links< / h4 >
< ol class = "list-unstyled" >
< li > < a href = "https://cgspace.cgiar.org" > CGSpace< / a > < / li >
< li > < a href = "https://dspacetest.cgiar.org" > DSpace Test< / a > < / li >
< li > < a href = "https://github.com/ilri/DSpace" > CGSpace @ GitHub< / a > < / li >
< / ol >
< / section >
< / aside >
< / div > <!-- /.row -->
< / div > <!-- /.container -->
< footer class = "blog-footer" >
< p dir = "auto" >
Blog template created by < a href = "https://twitter.com/mdo" > @mdo< / a > , ported to Hugo by < a href = 'https://twitter.com/mralanorth' > @mralanorth< / a > .
< / p >
< p >
< a href = "#" > Back to top< / a >
< / p >
< / footer >
< / body >
< / html >