Look back: I don't know of "look back" as a hyperparameter per se, but in an LSTM, when you try to predict the next step, you need to arrange your data by "looking back" a certain number of time steps to prepare the dataset for training. For example, suppose you want to estimate the next value of a series that is sampled at every time t. You need to rearrange your data into a shape like:

{t1, t2, t3} -> t4
{t2, t3, t4} -> t5
{t3, t4, t5} -> t6

The network will learn this mapping and will be able to predict tx based on the previous time steps.
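As a minimal sketch of that windowing in plain NumPy (the function name and look_back value are just placeholders for illustration):

```python
import numpy as np

def make_windows(series, look_back=3):
    """Re-arrange a 1-D series into (samples, look_back) inputs and next-step targets."""
    X, y = [], []
    for i in range(len(series) - look_back):
        X.append(series[i:i + look_back])   # e.g. {t1, t2, t3}
        y.append(series[i + look_back])     # e.g. t4
    return np.array(X), np.array(y)

series = np.arange(10, dtype=float)         # toy data: 0, 1, 2, ..., 9
X, y = make_windows(series, look_back=3)
print(X[0], "->", y[0])                     # [0. 1. 2.] -> 3.0
```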
Batch size (this is not specific to LSTMs) is roughly how many samples are processed per training step; the bigger the batch size, the faster the training, but the more memory it needs. On a GPU it is better to use bigger batch sizes, because copying values between host memory and the GPU is slow.
LSTM units refers to how many "smart" neurons (memory cells) the layer will have. This is highly dependent on your dataset; usually you choose it based on the dimensionality of your input vectors.
No. of epochs: how many times the algorithm passes over the training data to approximate the observations. Too many epochs will usually overfit your model, and too few will leave you with an underfitted one.
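To show where the last three settings actually appear, here is a rough sketch assuming Keras (the answer isn't tied to any library, and the unit count, batch size, and epoch values below are arbitrary placeholders, not recommendations):

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

look_back = 3                                  # time steps per input window
X = np.random.rand(100, look_back, 1)          # toy data: (samples, time steps, features)
y = np.random.rand(100, 1)                     # next-step targets

model = Sequential([
    LSTM(32, input_shape=(look_back, 1)),      # 32 "units" -> size of the LSTM's hidden state
    Dense(1),                                  # one output: the predicted next value
])
model.compile(optimizer="adam", loss="mse")

# batch_size: samples per gradient step; epochs: full passes over the training set
model.fit(X, y, batch_size=16, epochs=20, verbose=0)
```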