I'm trying to apply a long set of conditions and operations to a pandas dataframe (see the dataframe below, with columns VTI, upper, lower, etc.). I tried to use apply, but had a lot of trouble getting it to work. My current solution, which works correctly, relies on a for loop that iterates through the dataframe, but my sense is that this is an inefficient way to run my simulation. I'd appreciate help with the design of my code.

      VTI  upper  lower   sell    buy     AU     BU     BL        date  Tok  order
    44.58    NaN    NaN  False  False  False  False  False  2001-06-15    5      0
    44.29    NaN    NaN  False  False  False  False  False  2001-06-18    5      1
    44.42    NaN    NaN  False  False  False  False  False  2001-06-19    5      2
    44.88    NaN    NaN  False  False  False  False  False  2001-06-20    5      3
    45.24    NaN    NaN  False  False  False  False  False  2001-06-21    5      4

If I wanted to run a set of conditions and for loops like the ones below, and call the function below (get_row_data) only when a row meets those conditions, how would I do so? My intuition says to use .apply(), but I'm not clear on how to do that in this scenario. With all the ifs and for loops combined, it's a lot of rows. The code below actually outputs an entirely new dataframe. I'm wondering whether there are more efficient or better ways to think about the design of this simulation / stock backtesting process.
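
To make the "run the function only on rows that meet the conditions" idea concrete, here is a minimal sketch using a boolean mask. It is an illustration under assumptions, not the actual simulation: the sell flags in the sample data are made up, the 44.0 threshold is arbitrary, and summarize_row is a hypothetical stand-in for get_row_data that ignores capital and tokens.

```python
import pandas as pd

# Sample rows shaped like the dataframe above; the sell flags are invented for illustration.
df = pd.DataFrame({
    "VTI": [44.58, 44.29, 44.42, 44.88, 45.24],
    "sell": [False, True, False, True, False],
    "date": pd.to_datetime(
        ["2001-06-15", "2001-06-18", "2001-06-19", "2001-06-20", "2001-06-21"]
    ),
})


def summarize_row(row):
    """Hypothetical stand-in for get_row_data: builds a list from one row."""
    return [row.VTI, row.date, "sold"]


# Evaluate the conditions once for whole columns instead of testing them row by row.
mask = df["sell"] & (df["VTI"] > 44.0)

# Only the rows that pass the mask are visited.
results = [summarize_row(row) for row in df[mask].itertuples(index=False)]
```

This only covers conditions that depend on the row itself. Conditions that depend on running state (such as how many tokens are left) still have to be evaluated in order.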
get_row_data simply pulls key values from the dataframe, computes certain information based on globals (such as how much capital I already have and how many stocks I already hold), and returns a list. I append all of these lists into a dataframe that I call the portfolio.
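
Because the output of each row depends on running totals, one possible design is to keep that state in a small object rather than in globals. The sketch below is an assumption-laden illustration: PortfolioState, record_buy, the starting capital, and the layout of each output row are all hypothetical and are not taken from the real get_row_data.

```python
from dataclasses import dataclass, field


@dataclass
class PortfolioState:
    """Hypothetical container replacing the globals (capital, totalstocks, tokens)."""
    capital: float
    totalstocks: int = 0
    tokens: int = 5
    rows: list = field(default_factory=list)

    def record_buy(self, price, date, shares):
        # Update the running totals first, then store one output row as a list.
        self.capital -= price * shares
        self.totalstocks += shares
        self.tokens -= 1
        self.rows.append(
            [price, date, shares, self.capital, self.totalstocks, "bought"]
        )


state = PortfolioState(capital=1000.0)
state.record_buy(44.58, "2001-06-15", 2)
state.record_buy(44.29, "2001-06-18", 1)
```

The accumulated state.rows list can be turned into a dataframe once at the end, which avoids growing a dataframe row by row.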
I've included a snippet of the code I've already written using a for loop.

    ## This is only for the sell portion of the algorithm
    if val['sell'] == True and tokens == maxtokens:
        print('nothing to sell')

    if val['sell'] == True and tokens < maxtokens:
        print('sellprice', price)

        # CHOOSE THE MOST EXPENSIVE POSITION AND SELL IT
        portfolio = sorted(portfolio, key=lambda x: x[0], reverse=True)
        soldpositions = [position for position in portfolio if position[7] == 'sold']
        soldpositions = sorted(portfolio, key=lambda x: x[1], reverse=True)

        for position in portfolio:
            # Position must exist
            if position[9] == True:
                print('position b4 sold', position)

                # Position's price must be LOWER than current price
                # AND the difference between the position's price and current price
                # must be greater than sellbuybuffer
                if abs(position[0] - price) <= sellbuybuffer:
                    print('does not meet sellbuybuffer')

                if position[0] < price and abs(position[0] - price) >= sellbuybuffer:
                    status = 'sold'
                    # This is if we have no sold positions yet
                    if len(soldpositions) == 0:
                        ## This is the function that is applied only if the row
                        ## meets a large number of conditions
                        get_row_data(price, date, position, totalstocks, capital, status, tokens)
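
The nested conditions in the sell branch can also be read as a single filter over the portfolio. The following is a rough sketch of that reading, not a drop-in replacement: the Position namedtuple and its field names (price, is_open) are hypothetical labels for position[0] and position[9], and the sample prices and buffer are invented.

```python
from collections import namedtuple

# Hypothetical field names; the original code indexes position[0] and position[9].
Position = namedtuple("Position", ["price", "is_open"])


def sellable_positions(portfolio, price, sellbuybuffer):
    """Return open positions priced below `price` by at least `sellbuybuffer`,
    most expensive first."""
    candidates = [
        p for p in portfolio
        if p.is_open and p.price < price and price - p.price >= sellbuybuffer
    ]
    return sorted(candidates, key=lambda p: p.price, reverse=True)


portfolio = [
    Position(44.29, True),
    Position(44.88, True),
    Position(44.42, False),  # closed position, skipped
    Position(45.24, True),   # too close to the current price, skipped
]
to_sell = sellable_positions(portfolio, price=45.30, sellbuybuffer=0.4)
```

Separating the selection of candidates from the bookkeeping keeps the conditions in one place, so the call that records the sale only has to run over to_sell.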