Monday, October 27, 2008

The date command in unix

In unix, the date command shows the current date and time.

If you are root, you can also use the date command to set the system date (i.e. the current date as reported by the unix machine). Since most users are not root, the command is mostly used for display.

Typed without any parameter, the date command displays the current date and time in the default format.

You can also set the display format as you like, by passing a format string that starts with a plus sign.

I often need to create a file whose name contains the current date and time. The date command does the job. Here it is:

touch my_file.d`date +"%Y%m%d%H%M%S"`

Assuming the current time is 2008-12-31 23:58:59, this creates a file named my_file.d20081231235859.
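The same idea can be wrapped in a small shell function, so that the name is built once and can be reused. This is only a sketch: the function name timestamped_name is made up for illustration, and the prefix my_file.d is the one from the example above.

```shell
#!/bin/sh
# Build a file name made of a prefix plus the current date-time.
# timestamped_name is a hypothetical helper name, not a standard command.
timestamped_name() {
    # $1 = prefix; the format string is the same one used with touch above
    printf '%s%s\n' "$1" "$(date +"%Y%m%d%H%M%S")"
}

name=$(timestamped_name my_file.d)
echo "$name"
# To actually create the file: touch "$name"
```

Because the fields run from year down to second, sorting such names alphabetically also sorts them by time.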


Thursday, October 23, 2008

What is a line?

When a human being reads a text file on a computer, the file appears neatly formatted line by line. A COBOL program on mainframe MVS is displayed as neat 80-column lines. A C program in unix is also laid out line by line, and the same goes for many DOS files.

Human beings see a text file as a sequence of lines. How does the computer see this file?

Actually, the computer does NOT see the file in the human line-by-line way.

In fact, the file is stored as a STREAM of bytes, i.e. one byte after another until the end of the file. Some special handling (e.g. a line delimiter) is then used to identify which portion of that long stream is line 1 and which portion is line 2.
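A quick way to see the stream idea in a unix shell is to count the bytes and the delimiter bytes separately. This is a minimal sketch, assuming the single-byte unix delimiter [ 0A ] and a made-up 6-byte stream; the function name stream is only for illustration.

```shell
#!/bin/sh
# A made-up unix-style stream: 2 lines, each ended by one LF byte.
stream() {
    printf 'ab\ncd\n'
}

# wc -c counts bytes; $(( )) strips the padding some wc versions print.
bytes=$(( $(stream | wc -c) ))
# Keep only the LF bytes and count them: that count is the number of lines.
lines=$(( $(stream | tr -cd '\n' | wc -c) ))

echo "$bytes bytes in the stream, $lines line delimiters"
```

The "lines" exist only because some program decides to treat the LF bytes as boundaries.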

For example, consider this 2-line file (named C:\example.txt) in DOS:

This is first line.
This is second line.

This file has 2 lines. The first line [ This is first line. ] has 19 characters from T to the last period. Similarly, the second line [ This is second line. ] has 20 characters. One can count a total of 39 characters.

However, when you use the DOS command [ dir ] to examine the file size:

C:\> dir example.txt
2006-03-20  13:39                41 example.txt
1 File(s)             41 bytes

The [ dir ] command reports a file size of 41 bytes, not 39 bytes. Why are there 2 more bytes?

So how is the file actually stored in DOS? The [ debug ] program shows the following:

C:\> debug example.txt
-d
0B1A:0100  54 68 69 73 20 69 73 20-66 69 72 73 74 20 6C 69   This is first li
0B1A:0110  6E 65 2E 0D 0A 54 68 69-73 20 69 73 20 73 65 63   ne...This is sec
0B1A:0120  6F 6E 64 20 6C 69 6E 65-2E 6B 6A 6B 6A 65 72 6A   ond line.kjkjerj
0B1A:0130  20 64 67 6B 3B 6C 64 73-20 6C 6B 6A 66 67 6C 73    dgk;lds lkjfgls
0B1A:0140  20 64 68 6A 73 64 6B 68-6A 6B 73 68 20 73 20 68    dhjsdkhjksh s h
0B1A:0150  6B 73 20 68 20 73 68 20-73 20 68 20 64 73 68 20   ks h sh s h dsh
0B1A:0160  66 64 73 20 68 20 73 66-64 68 20 6B 20 73 68 20   fds h sfdh k sh
0B1A:0170  6B 68 6A 6B 73 68 6B 20-73 68 20 73 66 64 68 20   khjkshk sh sfdh
-q

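As one can see, the file is stored as a STREAM of characters on the hard disk. After the string [ This is first line. ], there are 2 characters [ 0D 0A ]. Then the second line follows. The [ d ] command dumps 128 bytes of memory, so everything after the final period of the second line (from [ 6B 6A ] onward) is leftover memory content, not part of the 41-byte file.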

These [ 0D 0A ] characters are called the line delimiter. The 39 bytes of text plus the 2-byte line delimiter make a 41-byte file.
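The arithmetic can be checked in a unix shell by rebuilding the same bytes with printf. This is a sketch that assumes, as the debug dump suggests, one [ 0D 0A ] pair between the two lines and nothing after the last period; make_example is a made-up helper name.

```shell
#!/bin/sh
# Recreate the example: two lines joined by one CR LF pair,
# with no delimiter after the last line.
make_example() {
    printf 'This is first line.\r\nThis is second line.'
}

# wc -c counts bytes; $(( )) strips the padding some wc versions print.
total=$(( $(make_example | wc -c) ))
# Delete the CR and LF bytes to count only the visible characters.
visible=$(( $(make_example | tr -d '\r\n' | wc -c) ))

echo "total bytes:     $total"
echo "visible chars:   $visible"
echo "delimiter bytes: $(( total - visible ))"
```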

This line delimiter [ 0D 0A ] also tells the editor to display the file as 2 lines.

In the ASCII encoding, hexadecimal [ 0D ] is decimal 13. This [ 0D ] is called [ carriage return ], abbreviated [ CR ]. Similarly, [ 0A ] is decimal 10, called [ line feed ], abbreviated [ LF ]. Together, [ 0D 0A ] is written as CRLF.
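On a unix machine, the od command plays a role similar to the DOS debug dump. The sketch below dumps a short piece of the example around the delimiter and converts the two hex values to decimal; the variable names are only for illustration, and the spacing of od output differs between systems, so it is squeezed out here.

```shell
#!/bin/sh
# Dump the bytes around the delimiter as hex, one byte per pair of digits.
# tr removes the spaces and newlines that od uses for layout.
hex=$(printf 'line.\r\nThis' | od -An -tx1 | tr -d ' \n')
echo "$hex"

# Shell arithmetic accepts hexadecimal constants written with 0x.
cr=$(( 0x0D ))
lf=$(( 0x0A ))
echo "CR = $cr, LF = $lf"
```

The 0d0a pair sits right between the 2e of the period and the 54 of the capital T.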

In the unix world, the line delimiter is usually just [ 0A ].

As a result, when using FTP to transfer text files between DOS and unix, it is better to use the [ ascii ] option, which turns on the line delimiter conversion between [ 0D 0A ] and [ 0A ]. If someone forgets to do so, then after a DOS file is uploaded to unix there will be a ^M character at the end of each line (when the file is opened in the vi editor). This ^M is actually the [ 0D ] character, which is NOT treated as part of the line delimiter in unix and is displayed as a normal character.
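If the file has already been uploaded without the conversion, the extra [ 0D ] bytes can be removed afterwards on the unix side. This is a minimal sketch using tr; dos_to_unix is a made-up function name, and note that it deletes every [ 0D ] in the input, not only those in front of [ 0A ].

```shell
#!/bin/sh
# Turn DOS line endings (CR LF) into unix line endings (LF)
# by deleting every CR byte from the stream.
dos_to_unix() {
    tr -d '\r'
}

# Show the bytes after conversion; only 0a is left as the delimiter.
printf 'one\r\ntwo\r\n' | dos_to_unix | od -An -tx1
```

For a real file, one would write something like dos_to_unix < dosfile.txt > unixfile.txt, where both file names are placeholders.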

A similar mistake can occur when downloading a unix file to DOS. Without the [ ascii ] option, there is no line delimiter conversion. The received file in DOS will be delimited by [ 0A ] only, not by the usual DOS line delimiter [ 0D 0A ]. At the time of writing, the Notepad application CANNOT recognize a lone [ 0A ] as a line delimiter, so all the lines run together into one very long line. Another application, Wordpad, is more intelligent. It knows that [ 0A ] is also a line delimiter and opens the file normally for a human being to read.
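The opposite conversion can be done on the unix side before the file is sent to DOS. The sketch below uses awk to write each line back with [ 0D 0A ]; unix_to_dos is a made-up function name, and the sketch assumes the input has plain [ 0A ] delimiters (a line that already ends in [ 0D ] would end up with two of them).

```shell
#!/bin/sh
# Turn unix line endings (LF) into DOS line endings (CR LF).
unix_to_dos() {
    # awk reads LF-delimited lines and prints each one followed by CR LF.
    awk '{ printf "%s\r\n", $0 }'
}

# Show the bytes after conversion; each line now ends with 0d 0a.
printf 'one\ntwo\n' | unix_to_dos | od -An -tx1
```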

In the mainframe MVS world, if the dataset is a QSAM file, there is no need for a line delimiter. The dataset organization already tells the exact number of characters in each line.
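The idea of lines without delimiters can be imitated in a unix shell. This is only an analogy, not how MVS does it: the record length of 5 and the names RECLEN and split_records are assumptions made up for the sketch.

```shell
#!/bin/sh
# Hypothetical fixed record length, standing in for the length
# that the dataset definition would supply on the mainframe.
RECLEN=5

split_records() {
    # fold -b -w N starts a new output line every N bytes.
    fold -b -w "$RECLEN"
}

# 15 bytes with no delimiter anywhere: three 5-byte records.
printf 'AAAAABBBBBCCCCC' | split_records
echo
```

Since every record has the same known length, the boundaries can be found by counting alone.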


This is the first article

This is the first article.

Just for testing.


Duplicate Open Current Folder in a New Window

Sometimes after I opened a folder in Win7, I would like to duplicate open the same folder again in another explorer window. Then, I can ope...