Convert a text file between UTF-8, ISO-8859 and ASCII

Problem

I want to change the encoding of a text file from UTF-8 to ISO-8859.

Solution

The iconv utility converts the character encoding of text from one codeset to another.
To list all the supported encodings, run:

iconv -l
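The list printed by iconv -l is long and its layout differs between implementations, so searching it by eye is error-prone. As a rough alternative, the sketch below asks iconv to convert empty input and looks at the exit status. The function name encoding_supported is made up for this example, and the check assumes that your iconv exits with a non-zero status for an unknown encoding name.

```shell
# Hypothetical helper: succeeds if this iconv accepts the given encoding name.
# It converts empty input, so no data is read or written.
encoding_supported() {
    iconv -f "$1" -t UTF-8 < /dev/null > /dev/null 2>&1
}

# Report on a few encoding names.
for name in UTF-8 ISO-8859-15 ASCII; do
    if encoding_supported "$name"; then
        echo "$name: supported"
    else
        echo "$name: not supported"
    fi
done
```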

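Check that the encodings you need are listed, then use iconv -f to name the source encoding and -t to name the target. If -f is omitted, iconv assumes the encoding of the current locale. The output is written to standard output, so redirect it to the destination file. Do not redirect to the input file itself, because the shell truncates it before iconv reads it.

$ iconv -f fromcodeset -t tocodeset file > outputfile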

Example:

$ file test.txt 
test.txt: UTF-8 Unicode text
$ iconv -t ISO_8859-15 test.txt > test2.txt
$ file test2.txt 
test2.txt: ISO-8859 text
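If the file utility is not available, comparing byte counts is another way to see that the conversion happened. The following is a minimal sketch, not part of the original recipe: it builds its own small sample file in a scratch directory (the file names are arbitrary) and names the source encoding explicitly with -f instead of relying on the locale.

```shell
# Work in a scratch directory so no existing file is overwritten.
workdir=$(mktemp -d)

# "café" plus a newline in UTF-8: the é is the two bytes 0xC3 0xA9 (octal 303 251).
printf 'caf\303\251\n' > "$workdir/utf8.txt"

# Name the source encoding with -f as well as the target with -t.
iconv -f UTF-8 -t ISO-8859-15 "$workdir/utf8.txt" > "$workdir/latin9.txt"

# In ISO-8859-15 the é is a single byte, so the converted file is one byte shorter.
# The $(( )) wrapper strips the padding that some wc implementations print.
utf8_bytes=$(( $(wc -c < "$workdir/utf8.txt") ))
latin9_bytes=$(( $(wc -c < "$workdir/latin9.txt") ))
echo "UTF-8: $utf8_bytes bytes, ISO-8859-15: $latin9_bytes bytes"
```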

If you try to convert to ASCII, iconv fails at the first character that has no ASCII equivalent:

$ iconv -t ASCII test.txt > test2.txt
iconv: illegal input sequence at position 19

The -c option tells iconv to drop the characters it cannot convert instead of stopping:

$ iconv -c -t ASCII test.txt > test2.txt
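To see exactly what -c throws away, it helps to try it on a tiny input first. This is a small sketch using a sample string of my own rather than test.txt. Some iconv builds may still return a non-zero exit status when -c drops something, so the status is deliberately ignored here.

```shell
# "café" in UTF-8; the é (octal bytes 303 251) has no ASCII equivalent.
# "|| true" ignores the exit status, which can be non-zero when characters are dropped.
converted=$(printf 'caf\303\251\n' | iconv -c -f UTF-8 -t ASCII || true)

echo "$converted"
```

With GNU iconv this prints caf: the é is silently removed, so check that this loss is acceptable before using -c on real data.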

As an extra, to convert CRLF line endings to LF, use dos2unix, which rewrites the file in place:

$ dos2unix file
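On a machine where dos2unix is not installed, tr can do a similar job. This is a different technique, shown only as a sketch with sample data of my own: it deletes every carriage return, including any that are not part of a CRLF pair, and it writes to a new file instead of converting in place.

```shell
# Build a two-line sample file with CRLF line endings in a scratch directory.
workdir=$(mktemp -d)
printf 'one\r\ntwo\r\n' > "$workdir/dos.txt"

# Delete every carriage return, leaving plain LF line endings.
tr -d '\r' < "$workdir/dos.txt" > "$workdir/unix.txt"

# Two CR bytes were removed, so the result is two bytes shorter than the input.
dos_bytes=$(( $(wc -c < "$workdir/dos.txt") ))
unix_bytes=$(( $(wc -c < "$workdir/unix.txt") ))
echo "before: $dos_bytes bytes, after: $unix_bytes bytes"
```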