


Ashishkr007/Data-Cleaner


Data-Cleaner

How do I remove newlines (CR, LF, i.e. \r, \n) embedded in data?

If you have a CSV file with CRLF or LF characters embedded in the data and you create a Hive table from it, you will find that some columns contain NULL values.

This happens because the line terminator in Hive is \n. If a \n or \r appears inside the data, Hive treats it as the end of the row before the actual line terminator is reached, and the remaining columns of that row are set to NULL.
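To make the effect concrete, here is a small Python simulation of a reader that splits rows on \n and pads missing columns. It is only an illustration of the behaviour described above, not Hive itself. The function name `naive_rows` and the use of `None` to stand in for NULL are assumptions made for this sketch.

```python
def naive_rows(text, n_columns, delimiter=","):
    """Split text into rows on "\n" and pad short rows with None.

    This imitates a reader that treats every "\n" as the end of a row,
    which is the behaviour the paragraph above attributes to Hive.
    """
    rows = []
    for line in text.split("\n"):
        if not line:
            continue  # skip the empty piece after the final terminator
        fields = line.rstrip("\r").split(delimiter)
        # Columns that are missing from a broken row come out as None (NULL).
        fields += [None] * (n_columns - len(fields))
        rows.append(fields[:n_columns])
    return rows


# The second record has a stray "\n" inside its third field.
data = (
    "1407233498,1407233515,foo\n"
    "mingstats&fmt=n\n"
    "1407233499,1407233516,foobar\n"
)

for row in naive_rows(data, 3):
    print(row)
```

The fragment `mingstats&fmt=n` ends up as a row of its own, with the second and third columns empty, which is the NULL symptom described above.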

I tried several options, including Spark and Hive SerDes, but got the best results with Perl. Here I am sharing my Perl script, which removes all embedded newlines and special characters.

Input:

1407233497,1407233514,bar

1407233498,1407233515,foo

mingstats&fmt=n

1407233499,1407233516,foobar

Expected output:

1407233497,1407233514,bar

1407233498,1407233515,foomingstats&fmt=n

1407233499,1407233516,foobar
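The transformation from the input to the expected output above can be sketched in Python as follows. This is not the repository's NewlineCleaner.pl, whose logic is not shown here. It is a minimal illustration that assumes every record has a known, fixed number of fields and that fields are not quoted. The name `merge_broken_records` and its parameters are hypothetical.

```python
def merge_broken_records(lines, expected_fields, delimiter=","):
    """Yield records with embedded CR/LF removed.

    Assumption: a physical line that contains exactly the full number of
    delimiters starts a new record; any other line is a fragment that
    belongs to the record currently being assembled.
    """
    needed = expected_fields - 1  # delimiters in a complete record
    buffer = ""
    for raw in lines:
        line = raw.replace("\r", "").replace("\n", "")
        looks_complete = line.count(delimiter) == needed
        if buffer and looks_complete and buffer.count(delimiter) >= needed:
            # The buffered record is finished; this line begins the next one.
            yield buffer
            buffer = line
        else:
            # Fragment (or the very first line): glue it onto the buffer.
            buffer += line
    if buffer:
        yield buffer


sample = [
    "1407233497,1407233514,bar\n",
    "1407233498,1407233515,foo\r\n",
    "mingstats&fmt=n\n",
    "1407233499,1407233516,foobar\n",
]

for record in merge_broken_records(sample, expected_fields=3):
    print(record)
```

When reading from a real file, opening it with `open(path, newline="")` keeps any \r characters intact so that the function can strip them itself. The heuristic fails if a fragment happens to contain the full number of delimiters on its own, or if quoted fields contain the delimiter, so treat it as a starting point rather than a general CSV repair tool.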

How to execute the script?

perl E:\Ashish\DataCleaner\NewlineCleaner.pl "E:\Ashish\DataCleaner\Sample Data.csv" > E:\Ashish\DataCleaner\OutPut.csv

How to install Perl?

https://www.perl.org/get.html

http://strawberryperl.com/releases.html
