Operating systems such as Linux and Windows use different text file formats that are not fully compatible with each other. Editing and reading the same text file on both platforms can therefore cause problems. If we want to use a text file that was created or edited in a Windows or MS-DOS environment on Linux, we need to convert it into Linux format with the `dos2unix` tool.
`dos2unix` is a very simple tool that we can easily install on different Linux distributions as shown below.
Ubuntu, Debian, Mint, Kali
We will use the `apt` command and the `dos2unix` package name for installation.
$ sudo apt install dos2unix
Fedora, CentOS, RedHat
We will use the `dnf` command like below.
$ sudo dnf install dos2unix
Hidden Characters Problem
As stated previously, Windows text files contain some characters that are not used in Linux text files.
- The end of a line is marked with a Carriage Return `CR` followed by a Line Feed `LF` in Windows, but in Linux only a Line Feed `LF` is used.
- Windows tools often save text as UTF-16 or in a legacy code page, whereas Linux typically uses UTF-8.
- Binary files are skipped automatically by dos2unix.
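One way to make these hidden characters visible is to dump the raw bytes of a file. The following is only a sketch: it assumes a POSIX shell with the standard `printf`, `od`, and `grep` utilities, and the scratch directory and the `sample_crlf.txt` name are hypothetical.

```shell
# Create a scratch directory and a small file with Windows-style CRLF endings.
# The file name sample_crlf.txt is hypothetical and used only for this sketch.
workdir=$(mktemp -d)
printf 'first line\r\nsecond line\r\n' > "$workdir/sample_crlf.txt"

# od -c prints every byte; each carriage return appears as \r right before \n.
od -c "$workdir/sample_crlf.txt"

# Count the lines that end with a carriage return.
# grep -c exits non-zero when the count is 0, hence the `|| true`.
cr=$(printf '\r')
crlf_lines=$(grep -c "${cr}\$" "$workdir/sample_crlf.txt" || true)
echo "lines ending in CR: $crlf_lines"
```

A file that already uses Linux line endings would show only `\n` in the `od` output and a count of 0.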
Print Given File Encoding and Type
We will start by printing the encoding and line terminator type of the given file. We will use the Linux `file` command, which provides this information.
$ file a.txt
We can see that the file `a.txt` is encoded with ISO-8859 and has CRLF, that is Windows, line terminators.
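To try this without a real Windows file, a CRLF sample can be generated on the spot. This is a hedged sketch: the `sample.txt` name and the scratch directory are hypothetical, and the exact wording printed by `file` can vary between versions, although it typically mentions CRLF line terminators for such a file.

```shell
# Build a hypothetical CRLF sample; the name sample.txt is not from the article.
workdir=$(mktemp -d)
printf 'hello\r\nworld\r\n' > "$workdir/sample.txt"

if command -v file >/dev/null 2>&1; then
    # Capture the description so it can be inspected or logged.
    description=$(file "$workdir/sample.txt")
    echo "$description"
else
    description=""
    echo "the file command is not installed" >&2
fi
```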
Convert From DOS To Unix Format
We will start with a simple example where we convert a file named `a.txt`, which was created in a Windows or MS-DOS environment, into a file that can be used in a Linux environment. `a.txt` will be converted in place and keep the same file name.
$ dos2unix a.txt
After checking with the `file` command we can see that it is now reported as a text file without CRLF line terminators.
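The result of a conversion can also be checked by counting the lines that still end with a carriage return. The sketch below assumes a POSIX shell and works on a hypothetical scratch copy of `a.txt`; it skips the conversion when `dos2unix` is not installed.

```shell
# Hypothetical scratch file with CRLF endings; not the article's real a.txt.
workdir=$(mktemp -d)
printf 'alpha\r\nbeta\r\n' > "$workdir/a.txt"
cr=$(printf '\r')
remaining=""

if command -v dos2unix >/dev/null 2>&1; then
    # Convert the file in place.
    dos2unix "$workdir/a.txt"
    # grep -c exits non-zero when nothing matches, hence the `|| true`.
    remaining=$(grep -c "${cr}\$" "$workdir/a.txt" || true)
    echo "lines still ending in CR: $remaining"
else
    echo "dos2unix is not installed; skipping the conversion" >&2
fi
```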
Specify Conversion Mode
There are different conversion modes that control how characters are converted. We can use the following options and conversion modes.
- `-iso` will convert from the default DOS code page to Unix Latin-1 (ISO-8859-1)
- `-850` will convert from DOS CP850 to Unix Latin-1
- `-1252` will convert from Windows CP1252 to Unix Latin-1
In this example, we will convert in ISO mode with the `-iso` option.
$ dos2unix -iso a.txt
We can see that detailed information is provided during conversion.
Keep Date Stamp Of Old File
Timestamps are file attributes that hold information such as creation time, modification time, and access time. Converting a text file from DOS format to Linux format rewrites the file, so its modification time changes. We can preserve the date stamp with the `-k` option like below.
$ dos2unix -k a.txt
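A quick way to confirm that the date stamp survived is to compare the converted file against a reference file carrying the same modification time. This is a sketch under assumptions: the scratch directory, the `reference` file, and the chosen date are hypothetical, and the conversion is skipped when `dos2unix` is not installed.

```shell
# Hypothetical scratch file with CRLF endings.
workdir=$(mktemp -d)
printf 'one\r\ntwo\r\n' > "$workdir/a.txt"

# Give the file an old modification time, then copy that time to a reference file.
touch -t 202001010000 "$workdir/a.txt"
touch -r "$workdir/a.txt" "$workdir/reference"
newer=""

if command -v dos2unix >/dev/null 2>&1; then
    dos2unix -k "$workdir/a.txt"
    # find -newer prints the file only if it was modified after the reference.
    newer=$(find "$workdir/a.txt" -newer "$workdir/reference")
    if [ -z "$newer" ]; then
        echo "modification time was preserved"
    else
        echo "modification time changed"
    fi
else
    echo "dos2unix is not installed; skipping the conversion" >&2
fi
```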
Specify New File Name
Up to now we have converted files in place, replacing the original. We can also specify a new file name so that the converted content is written to the new file and the old file is kept unchanged. We will use the `-n` option and specify the input file name followed by the new file name.
In this example, we will create the newly converted file named `b.txt`.
$ dos2unix -n a.txt b.txt
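To check that the original really stays untouched, it can be compared byte for byte with a backup taken before the conversion. The sketch below is illustrative only: the scratch directory and the `a.orig` backup name are hypothetical, and the conversion is skipped when `dos2unix` is not installed.

```shell
# Hypothetical scratch input plus a backup copy for comparison.
workdir=$(mktemp -d)
printf 'one\r\ntwo\r\n' > "$workdir/a.txt"
cp "$workdir/a.txt" "$workdir/a.orig"

if command -v dos2unix >/dev/null 2>&1; then
    # Write the converted content to b.txt and leave a.txt alone.
    dos2unix -n "$workdir/a.txt" "$workdir/b.txt"

    # cmp -s exits with status 0 when the two files are byte-for-byte identical.
    if cmp -s "$workdir/a.txt" "$workdir/a.orig"; then
        echo "a.txt was left untouched"
    fi

    # Show the bytes of the new file; no \r should appear before \n.
    od -c "$workdir/b.txt"
else
    echo "dos2unix is not installed; skipping the conversion" >&2
fi
```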
Recursive Convert with find Command
If there are a lot of files to convert to Unix or Linux text file format, converting them one by one is tedious work. We need to run a bulk operation on all files. We can use the `find` command to locate the files and then run the `dos2unix` command on every file found.
In this example, we will find all text files, specified with `-name "*.txt"`, under the current working directory, specified with `.`, and then run the `dos2unix` command on all of these files with the `xargs` command.
$ find . -name "*.txt" | xargs dos2unix
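Plain `xargs` splits its input on whitespace, so the pipeline above can fail on file names that contain spaces. A more defensive variant, sketched here with a hypothetical directory layout, lets `find` pass the names to `dos2unix` directly through `-exec ... {} +`. The conversion is skipped when `dos2unix` is not installed.

```shell
# Hypothetical directory tree, including a path with spaces.
workdir=$(mktemp -d)
mkdir -p "$workdir/docs/sub dir"
printf 'a\r\nb\r\n' > "$workdir/docs/plain.txt"
printf 'c\r\nd\r\n' > "$workdir/docs/sub dir/with space.txt"
cr=$(printf '\r')
leftover=""

if command -v dos2unix >/dev/null 2>&1; then
    # -exec ... {} + hands the file names straight to dos2unix, so spaces are safe.
    find "$workdir/docs" -type f -name "*.txt" -exec dos2unix {} +

    # Pipeline form with NUL-separated names, supported by GNU and BSD find/xargs:
    # find "$workdir/docs" -type f -name "*.txt" -print0 | xargs -0 dos2unix

    # List any text files that still contain lines ending in CR.
    leftover=$(find "$workdir/docs" -type f -name "*.txt" \
        -exec grep -l "${cr}\$" {} + || true)
    if [ -z "$leftover" ]; then
        echo "all files converted"
    else
        echo "still in DOS format: $leftover"
    fi
else
    echo "dos2unix is not installed; skipping the conversion" >&2
fi
```

Adding `-type f` restricts the search to regular files, which avoids handing directories that happen to match the pattern to `dos2unix`.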