Check (Rough) Progress of Your CSV Import to MySQL

If you are importing large CSV or SQL dumps to MySQL, chances are you were looking for ways to see how far the import has gone. If you know how many rows there are from the file being imported, you can do a SELECT COUNT(*) but that would take sometime for the query to finish especially on really big imports.

Using lsof, you can monitor the current file offset to which a process is reading from using the -o option. Knowing the size of the file and some snapshots of the offset, you can get a somewhat rough idea of how fast the import goes. Note though that this is only file-read-pace not actual import speed as MySQL import can vary depending on a number of conditions i.e. table growth, secondary indexes, etc.

Let’s say I am importing a 1.1G CSV file into a table.

[revin@forge msb_5_5_300]$ ls -al /wok/dta/samples/ft_history.csv 
-rw-rw-r--. 1 revin revin 1075456654 Nov 8 23:25 /wok/dta/samples/ft_history.csv

mysql [localhost] {msandbox} (test) > LOAD DATA INFILE '/wok/dta/samples/ft_history.csv' INTO TABLE ft_history;

I’d try to sample the OFFSET column from the lsof output at some interval, in my case I used a 6seconds interval:

[revin@forge msb_5_5_300]$ date && lsof -o | grep ft_history
Tue Apr 16 02:53:50 EDT 2013
...
mysqld 2178 revin 107r REG 9,3 0t34865152 3029595 /wok/dta/samples/ft_history.csv

[revin@forge msb_5_5_300]$ date && lsof -o | grep ft_history
Tue Apr 16 02:53:56 EDT 2013
...
mysqld 2178 revin 107r REG 9,3 0t48234496 3029595 /wok/dta/samples/ft_history.csv

The 7th column is my OFFSET, a simple formula:

((csv_size / (offset_1 - offset_2)) * interval_seconds) / 60 = Estimate in minutes

In my case I have:

((1075456654 / (48234496 - 34865152)) * 6) / 60 = 8mins

In reality, my import took about 10mins, not bad considering the size of the table in the end.

mysql [localhost] {msandbox} (test) > LOAD DATA INFILE '/wok/dta/samples/ft_history.csv' INTO TABLE ft_history;
Query OK, 18294744 rows affected (9 min 52.91 sec)
Records: 18294744 Deleted: 0 Skipped: 0 Warnings: 0