
We have a large dataset that's currently residing in many, many spreadsheets.
I've been tasked with getting part of that data into our MySQL DB.
When all is said and done, I will probably be inserting somewhere around 3 million rows.

I intend to collect the inserts into batches of roughly a quarter-million rows each (parsing through the data with PHP).
Initially, my plan was to just run/send those batch INSERTs straight from my script.
More recently, though, I thought about actually writing out the INSERTs to .sql files for later importing.
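
To make the first option concrete, here's roughly what I have in mind for the script-side batching (a minimal sketch; the connection details, table name, and column names are placeholders). Each quarter-million-row batch would actually be sent as a series of smaller multi-row INSERTs, since a single statement that large could exceed MySQL's max_allowed_packet:

    <?php
    // Minimal sketch: buffer parsed spreadsheet rows and flush them as
    // multi-row INSERTs. Connection details and table/column names are
    // placeholders.
    $pdo = new PDO('mysql:host=localhost;dbname=mydb;charset=utf8', 'user', 'pass');
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    $rowsPerStatement = 1000;            // rows per INSERT statement
    $buffer = [];

    foreach ($parsedRows as $row) {      // $parsedRows comes from the spreadsheet parser
        $buffer[] = $row;                // e.g. ['Alice', 42]
        if (count($buffer) >= $rowsPerStatement) {
            flushBuffer($pdo, $buffer);
            $buffer = [];
        }
    }
    if ($buffer) {
        flushBuffer($pdo, $buffer);      // insert whatever is left over
    }

    function flushBuffer(PDO $pdo, array $rows)
    {
        // Build "(?, ?), (?, ?), ..." for one multi-row INSERT.
        $placeholders = implode(', ', array_fill(0, count($rows), '(?, ?)'));
        $stmt = $pdo->prepare("INSERT INTO my_table (name, amount) VALUES $placeholders");
        $stmt->execute(array_merge(...$rows));   // flatten the rows into one parameter list
    }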

I tried to google around on this a little ... didn't find much.
Are there any real pros/cons one way or the other?
If not, I'm inclined to stick with my original plan.
Otherwise, I'm completely open to suggestions.

  • Bear in mind that a PHP script running via a webserver usually has an execution time limit, so if you are importing a large amount of data you may hit timeout issues. There is no such limit, though, when the script is run directly from the command line with the php CLI. – Commented May 23, 2014 at 12:56
  • Your database can do this better than any PHP script you come up with.
    – Pieter B
    Commented May 23, 2014 at 13:21

2 Answers


The only major plus I see one way or the other is that by making the .sql files, you don't have to re-run your scripts against the spreadsheets if some problem occurs and you have to blow away the database and redo the import. Or, looking at it another way, you can test the process on a test server before doing the actual import.

But if this is just a one-time thing, ultimately either would suffice.

  • Agreed. If you were going to do it multiple times, I'd look at writing some logic to detect failed inserts and report them, and perhaps give the option of retrying or throwing away the entire import - but for a one-off, it's not worth the extra effort. – Commented May 23, 2014 at 11:16

I don't think importing millions of records through a PHP script is a good idea. Instead, prepare multiple .sql files, each containing a batch of records sized to fit your server's execution-time and memory settings, and put them in one fixed directory. Give the .sql files names in a logical sequence so they can be read programmatically.

Then write a small script that picks up each file in that directory. On UNIX/Linux this means a shell script; on a Windows server it means a batch file.

Within that script, run the following command for each file:

mysql -u userName -ppassword databaseName < FileName.sql > logfile.log

Now let's assume I have created 10 .sql files, named 1.sql, 2.sql, and so on. The shell script/batch file then imports them in a loop.
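
For example, a minimal shell-script sketch of that loop might look like this (the user name, password, database name, and file paths are all placeholders):

    #!/bin/sh
    # Minimal sketch: import 1.sql .. 10.sql in order, logging each run.
    # Credentials, database name, and paths are placeholders.
    for i in $(seq 1 10); do
        mysql -u userName -ppassword databaseName < "/path/to/sql/$i.sql" >> import.log 2>&1
        if [ $? -ne 0 ]; then
            echo "Import of $i.sql failed; stopping." >> import.log
            exit 1
        fi
    done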
