mirror of https://github.com/ilri/csv-metadata-quality.git synced 2025-05-08 06:06:00 +02:00

Add support for removing newlines

This was tricky because of the nature of newlines. Strictly speaking,
we are only removing Unix line feeds (U+000A) here, because Windows
carriage returns (U+000D) are already removed by the string stripping
in the whitespace fix.
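
As a rough illustration of the fix described above, here is a minimal
sketch in Python. The function name `remove_line_feeds` is hypothetical
and is not taken from this commit. It assumes carriage returns have
already been handled by the whitespace fix, as stated above, so it only
deals with U+000A.

```python
def remove_line_feeds(value: str) -> str:
    """Hypothetical helper: drop every Unix line feed (U+000A) from a value.

    Assumption: carriage returns (U+000D) were already removed by an
    earlier whitespace fix, so only "\n" is handled here.
    """
    return value.replace("\n", "")


# The value from the new test row, split across two lines by a line feed.
cleaned = remove_line_feeds("TANZA\nNIA")
print(cleaned)
```

Deleting the line feed outright, rather than replacing it with a space,
is a design choice in this sketch. It suits a value like the one in the
test row, where the line feed falls in the middle of a word.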

Creating the test case in Vim was difficult because I couldn't figure
out how to manually enter a line feed character. In the end I used a
search and replace on a known pattern like "ALAN", replacing it with
\r. Neither entering the Unicode code point (U+000A) directly nor
typing an "Enter" character after ^V worked. Grrr.
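
One possible alternative to editing the test file by hand is to
generate the row with a script. The sketch below uses Python's standard
`csv` module and writes to an in-memory buffer instead of the real test
file. The row values are copied from the test case in this commit; the
rest is an assumption for illustration only.

```python
import csv
import io

# Write one row whose last field contains an embedded line feed (U+000A).
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["Newline", "2019-07-30", "", "", "", "TANZA\nNIA"])
row_text = buffer.getvalue()

# The csv module quotes the field containing the line feed, so the
# record spans two physical lines while remaining a single CSV row.
print(row_text)

# Read it back to confirm the line feed survives the round trip.
parsed = next(csv.reader(io.StringIO(row_text)))
print(repr(parsed[5]))
```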
2019-07-30 20:05:12 +03:00
parent 346e66ca98
commit 40d5f7d81b
5 changed files with 47 additions and 0 deletions


@@ -17,3 +17,5 @@ Invalid ISO 639-2 language,2019-07-29,,,jp,
 Invalid ISO 639-3 language,2019-07-29,,,chi,
 Invalid language,2019-07-29,,,Span,
 Invalid AGROVOC subject,2019-07-29,,,,FOREST
+Newline,2019-07-30,,,,"TANZA
+NIA"

line  dc.contributor.author       birthdate   dc.identifier.issn  dc.identifier.isbn  dc.language.iso  dc.subject
17    Invalid ISO 639-3 language  2019-07-29                                          chi
18    Invalid language            2019-07-29                                          Span
19    Invalid AGROVOC subject     2019-07-29                                                           FOREST
20    Newline                     2019-07-30                                                           TANZA NIA