I have a function below, store_in_matchid_file, that appends strings to a .txt file as long as each string doesn't already exist in the file. However, that file is now millions of lines long, and the check is taking too long.
I was hoping someone could suggest a way to speed this up by changing how I've coded it.
    def store_in_matchid_file(distinct_matchids_func):
        # Ids stored in the text file
        with open('MatchIds.txt') as f:
            current_id_list = f.read().splitlines()

        # Ids recently collected (Previously a set, converted to list to iterate over)
        distinct_matchids_list = list(distinct_matchids_func)

        # Adding Ids that don't exist in the file
        with open('MatchIds.txt', 'a') as file:
            for match in distinct_matchids_list:
                if match not in current_id_list:
                    file.write(match + '\n')
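The slow part is most likely `match not in current_id_list`: a membership test on a list scans the whole list, so every new ID costs a pass over millions of entries. One direction that might help is loading the stored IDs into a set, where membership checks take roughly constant time on average, and taking a set difference. This is only a minimal sketch under some assumptions: the `path` parameter, the returned set, and the handling of a missing file are hypothetical additions that are not in the function above, and it assumes the IDs are plain strings without embedded newlines.

```python
def store_in_matchid_file(distinct_matchids, path='MatchIds.txt'):
    """Append IDs not already in the file; return the set of IDs written.

    `path` and the return value are illustrative additions (assumptions).
    """
    # Read stored IDs into a set, one line at a time, so that later
    # membership checks do not rescan the whole collection.
    try:
        with open(path) as f:
            current_ids = {line.rstrip('\n') for line in f}
    except FileNotFoundError:
        # Assumption: a missing file is treated as "no IDs stored yet".
        current_ids = set()

    # Set difference keeps only the IDs that are not in the file.
    new_ids = set(distinct_matchids) - current_ids

    # Append the new IDs; sorting just makes the output order predictable.
    with open(path, 'a') as f:
        f.writelines(match + '\n' for match in sorted(new_ids))

    return new_ids
```

This still reads the whole file on every call, so if the function runs often, keeping the set in memory between calls or moving the IDs into a database table with a unique key may be worth considering.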