<p class="lead">The DSpace CSV Metadata Quality Checker is a collection of sanity checks and automated fixes for a number of common issues in metadata files.</p>
<label for="formFile" class="form-label">Select a CSV file to process (or try <a href="https://raw.githubusercontent.com/ilri/csv-metadata-quality/master/data/test.csv">test.csv</a>)</label>
<input class="form-check-input" type="checkbox" id="excludeFieldsCheckbox" name="excludeCheckbox" aria-label="Checkbox for following text input">
</div>
<input type="text" class="form-control" placeholder="dcterms.subject" id="excludeFieldsText" name="excludeText" aria-label="Text input with checkbox">
<div id="excludeHelp" class="form-text">Optionally indicate fields to skip during analysis. Separate multiple fields with a comma, for example: <code>dcterms.issued,dcterms.subject</code>.</div>
<input class="form-check-input" type="checkbox" checked="true" id="agrovocFieldsCheckbox" name="agrovocCheckbox" aria-label="Checkbox for following text input">
<div id="agrovocHelp" class="form-text">Optionally indicate fields to validate against <a href="https://agrovoc.uniroma2.it/agrovoc/agrovoc/en/" title="AGROVOC Multilingual Thesaurus">AGROVOC</a>. Separate multiple fields with a comma, for example: <code>dcterms.subject,cg.coverage.country</code>. Note: this can take an extra minute or more depending on your data. If you run into a problem, please try again; it is generally faster the second time.</div>
<div id="unsafeHelp" class="form-text">This will remove newlines, perform <a href="https://withblue.ink/2019/03/11/why-you-need-to-normalize-unicode-strings.html" title='When "Zoë" !== "Zoë". Or why you need to normalize Unicode strings'>normalization of Unicode characters</a>, and attempt to fix <a href="https://en.wikipedia.org/wiki/Mojibake">mojibake</a> character encoding issues. Read more about these <a href="https://github.com/ilri/csv-metadata-quality#unsafe-fixes">unsafe fixes</a>.</div>
<div id="experimentalHelp" class="form-text">Attempt to validate whether the value of an item's <code>dc.language.iso</code> or <code>dcterms.language</code> field matches the <em>actual</em> language of text used in its title, abstract, and citation. Read more about these <a href="https://github.com/ilri/csv-metadata-quality#experimental-checks">experimental checks</a>.</div>