Say I am given a csv file that I want to import into an sqlite3 database. The first two columns in the csv file contain unique, important information while the rest of the columns are really a list of things. So the csv file with the header row could look like:
cucumber.csv
'important1', 'important2', 'listItem_1', 'listItem_2', 'listItem_3'
'123', '9876', 'apple', 'orange', 'banana'
'456', '7890', 'cabbage', 'onion', 'carrot'
So, when I import into the database, I want to shove all but the first two columns into one column. The table schema and the script that does the import look like this:
import csv
import sqlite3


def main():
    data_filename = 'cucumber.csv'
    db_filename = 'spam_mart.sqlite'

    SQL_create_table = """DROP TABLE IF EXISTS cucumber;
                          CREATE TABLE cucumber (
                              important1 NUMBER PRIMARY KEY,
                              important2 NUMBER,
                              item_list TEXT
                          );
                       """

    SQL = """insert into cucumber(important1, important2, item_list)
             values (:important1, :important2, :item_list)
          """

    with open(data_filename) as f, sqlite3.connect(db_filename) as conn:
        conn.executescript(SQL_create_table)
        cursor = conn.cursor()
        reader_object = csv.reader(f)
        next(reader_object, None)  # Skip the headers
        row_count = 0
        for row in reader_object:
            row_count += 1
            row_dict = {}
            important1, important2, *item_list = row  # unpack the row
            row_dict['important1'] = important1
            row_dict['important2'] = important2
            row_dict['item_list'] = repr(item_list)  # convert list to string first
            cursor.execute(SQL, row_dict)
        print('Loaded {} of {} records'.format(str(row_count),
                                               str(reader_object.line_num - 1)))


if __name__ == '__main__':
    main()
I would normally use a csv.DictReader() object to transfer a csv file to a database, but since I was creating a list from some of the columns first, I am going with a normal csv.reader() object.
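For comparison, here is a minimal sketch of what a DictReader version might look like. It assumes the restkey parameter of csv.DictReader is used to gather the extra columns into a list, and it reads from a made-up in-memory sample with plain unquoted values rather than the real file. It is not part of the script above.

```python
import csv
import io

# Hypothetical in-memory stand-in for cucumber.csv (plain, unquoted values).
sample = io.StringIO(
    "important1,important2,listItem_1,listItem_2,listItem_3\n"
    "123,9876,apple,orange,banana\n"
    "456,7890,cabbage,onion,carrot\n"
)

next(sample)  # skip the header line, since fieldnames are passed explicitly

# Only the first two columns are named; any extra fields in a row are
# collected into a list stored under the restkey name.
reader = csv.DictReader(
    sample,
    fieldnames=['important1', 'important2'],
    restkey='item_list',
)
rows = [dict(row) for row in reader]

for row in rows:
    print(row['important1'], row['important2'], row['item_list'])
```

Each dictionary would still need its item_list value converted to a string before being passed to cursor.execute().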
I used repr() so that I can easily get the list back again using eval() if need be. The script works as expected. However, the whole technique seems a little clumsy to me. All honest critiques welcomed.
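For reference, a minimal sketch of the repr() round trip mentioned above. Note that it swaps eval() for ast.literal_eval(), which is an assumption on my part and not something the script does; literal_eval() only parses Python literals, so it seems a safer way to read the stored string back.

```python
import ast

item_list = ['apple', 'orange', 'banana']

stored = repr(item_list)             # the text that would go into the item_list column
restored = ast.literal_eval(stored)  # parses Python literals only, unlike eval()

print(restored == item_list)
```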