diff --git a/content/posts/2018-09.md b/content/posts/2018-09.md
index 373e112a2..a30557055 100644
--- a/content/posts/2018-09.md
+++ b/content/posts/2018-09.md
@@ -510,5 +510,19 @@ http://localhost:8081/solr/statistics/update?commit=true&stream.body=
diff --git a/docs/2018-09/index.html b/docs/2018-09/index.html
index 2249c19f4..f98688079 100644
--- a/docs/2018-09/index.html
+++ b/docs/2018-09/index.html
@@ -18,7 +18,7 @@ I’m testing the new DSpace 5.8 branch in my Ubuntu 18.04 environment and I
 " />
-
+
  • And magically all those 81,000 documents are gone!
  • +
  • After a few hours the Solr statistics core is down to 44GB on CGSpace!
  • +
  • I did a major refactor and logic fix in the DSpace Statistics API’s indexer.py
  • +
  • Basically, it turns out that using facet.mincount=1 is really beneficial for me: it reduces the size of the Solr result set and the amount of data we need to ingest into PostgreSQL, and the API returns HTTP 404 Not Found for items without views or downloads anyway
  • +
  • I deployed the new version on CGSpace and now there are only 92,000 pages of item views, which is half of what it was before… but that still seems like way too many (we only have about 74,000 items, which I’d assume would be 740 pages of 100)
  • +
  • Anyway, I think the indexing kind of crashed…
  • +
  • But systemd seems to have restarted it, and I now see:
  • + + +
    Indexing item views (page 28 of 753)
    +...
    +Indexing item downloads (page 260 of 260)
    +
    +
    +
diff --git a/docs/sitemap.xml b/docs/sitemap.xml
index b7e93182a..28487b86e 100644
--- a/docs/sitemap.xml
+++ b/docs/sitemap.xml
@@ -4,7 +4,7 @@
 https://alanorth.github.io/cgspace-notes/2018-09/
- 2018-09-25T22:06:05+03:00
+ 2018-09-25T23:54:29+03:00
@@ -184,7 +184,7 @@
 https://alanorth.github.io/cgspace-notes/
- 2018-09-25T22:06:05+03:00
+ 2018-09-25T23:54:29+03:00
 0
@@ -195,7 +195,7 @@
 https://alanorth.github.io/cgspace-notes/tags/notes/
- 2018-09-25T22:06:05+03:00
+ 2018-09-25T23:54:29+03:00
 0
@@ -207,13 +207,13 @@
 https://alanorth.github.io/cgspace-notes/posts/
- 2018-09-25T22:06:05+03:00
+ 2018-09-25T23:54:29+03:00
 0
 https://alanorth.github.io/cgspace-notes/tags/
- 2018-09-25T22:06:05+03:00
+ 2018-09-25T23:54:29+03:00
 0
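The page-count reasoning in the 2018-09 notes above (about 74,000 items at 100 per page should give roughly 740 pages) can be checked with a small sketch. This is not the actual code from indexer.py: the function names, the `type:2` query, and the `id` facet field are assumptions made for illustration. Only the Solr parameter names (`facet`, `facet.field`, `facet.mincount`, `facet.limit`, `facet.offset`, `rows`) are standard.

```python
import math

PAGE_SIZE = 100  # results per page, as implied by "740 pages of 100"


def expected_pages(total, page_size=PAGE_SIZE):
    """Number of pages needed to walk `total` facet values."""
    return math.ceil(total / page_size)


def facet_params(page, page_size=PAGE_SIZE):
    """Solr parameters for one page of per-item counts (hypothetical query)."""
    return {
        "q": "type:2",           # assumption: restrict to item-level hits
        "rows": 0,               # only facet counts are needed, not documents
        "facet": "true",
        "facet.field": "id",     # assumption: facet on the item identifier
        "facet.mincount": 1,     # skip items with zero hits
        "facet.limit": page_size,
        "facet.offset": page * page_size,
    }


print(expected_pages(74000))             # 740
print(facet_params(27)["facet.offset"])  # 2700
```

By the same arithmetic, the 753 pages reported after the restart correspond to at most 75,300 facet values, which is close to the estimate of about 74,000 items.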