I am running the same Python script several times, redirecting the output in different ways, and getting broken output. Sometimes lines are missing and sometimes the order is reversed. The Python script includes a number of print statements whose output I would like to write to an output file.

First of all, let me show you what the output should look like:

Number of haplotypes: 400000,400000,400000
Number of trees: 2946715
Number of mutations: 3454912
Sequence length: 62949445.0
True PRS read from file: ../data/prs_true/prs.prs_true. 600000 lines imported.
Defining cases/controls [2020-07-27 23:26:09]
Case control variables saved to ../data/case_control/c_c.case_control.pickle.
End

Run A: I am running the script with default output: my screen. The order of print statements is correct, but the last two print statements are missing:

(genPy2) [user@vm code]$ python simulate_prs_py2.py --tree ../data/tree/tree.hdf5 --true_prs ../data/prs_true/prs.prs_true --ncausal=200 --h2=0.33 --out ../data/case_control/sim_full
Number of haplotypes: 400000,400000,400000
Number of trees: 2946715
Number of mutations: 3454912
Sequence length: 62949445.0
True PRS read from file: ../data/prs_true/prs.prs_true. 600000 lines imported.
Defining cases/controls [2020-07-27 23:24:48]

Run B: Now I am running the same script and redirecting the output to a file "output.txt". It prints the first few lines to my screen and the last two lines to the file. Why doesn't everything go to the file? Furthermore, the order is now mixed up: the first line of the file (True PRS...) should come before the last line of the screen output (Defining cases...).

(genPy2) [user@vm code]$ python simulate_prs_py2.py --tree ../data/tree/tree.hdf5 --true_prs ../data/prs_true/prs.prs_true --ncausal=200 --h2=0.33 --out ../data/case_control/sim_full > output.txt
Number of haplotypes: 400000,400000,400000
Number of trees: 2946715
Number of mutations: 3454912
Sequence length: 62949445.0
Defining cases/controls [2020-07-27 23:25:22]
(genPy2) [user@vm code]$ cat output.txt
True PRS read from file: ../data/prs_true/prs.prs_true. 600000 lines imported.
Case control variables saved to ../data/case_control/c_c.case_control.pickle.
End

Run C: I am now using nohup and saving the output to a file "../data/case_control/output.txt". Now all the output is redirected to the output file but the order of the two statements "True PRS..." and "Defining cases..." is still reversed.

(genPy2) [user@vm code]$ nohup python simulate_prs_py2.py --tree ../data/tree/tree.hdf5 --true_prs ../data/prs_true/prs.prs_true --ncausal=200 --h2=0.33 --out ../data/case_control/sim_full > ../data/case_control/output.txt
nohup: ignoring input and redirecting stderr to stdout
(genPy2) [user@vm code]$ cat ../data/case_control/output.txt
Number of haplotypes: 400000,400000,400000
Number of trees: 2946715
Number of mutations: 3454912
Sequence length: 62949445.0
Defining cases/controls [2020-07-27 23:26:09]
True PRS read from file: ../data/prs_true/prs.prs_true. 600000 lines imported.
Case control variables saved to ../data/case_control/c_c.case_control.pickle.
End

I am 80% certain that this is a problem with the shell and not with my Python script. If that is the case, everything is fine. However, it is crucial that the Python script itself runs properly.

Any suggestions as to why this occurs and how to fix it are greatly appreciated.

  • Your script is printing some things to stdout (output.txt in case B) and some to stderr (the rest).
    – Panki
    Commented Jul 28, 2020 at 9:54
  • @Panki found it. Thank you. I didn't know that was a thing.
    Commented Jul 28, 2020 at 10:05

1 Answer

I recycled a Python script without knowing one detail:

The script runs in Python 2 and imports the print function from the __future__ module, using it as eprint. Everything printed with eprint went to stderr, while everything printed with print (the Python 2 default) went to stdout. This caused


  • the print statements to be missing in Run A while the eprint statements showed on screen
  • the print statements to be saved to "output.txt" in Run B while the eprint statements were shown on screen
  • everything to end up in the output file in Run C, because nohup redirects stderr to stdout and so both streams went into the file. However, I still can't explain the reversed order.
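
The reversed order in Run C is consistent with output buffering rather than a shell problem: when stdout is redirected to a file or pipe, Python buffers it in blocks and typically writes it out only when the buffer fills or the script exits, whereas stderr is written promptly. The following is a minimal sketch, not taken from the original script; the child program `CHILD` and the helper `run_merged` are hypothetical names. It merges both streams into one pipe, roughly as nohup did here, and shows the stderr line overtaking the stdout line unless buffering is disabled with `-u`.

```python
import os
import subprocess
import sys

# Hypothetical child program: writes to stdout first, then to stderr.
CHILD = r"""
import sys
sys.stdout.write('first (stdout)\n')
sys.stderr.write('second (stderr)\n')
sys.stderr.flush()
"""


def run_merged(extra_args=()):
    """Run CHILD with stderr merged into stdout and return the lines received."""
    env = dict(os.environ)
    env.pop("PYTHONUNBUFFERED", None)  # make sure default buffering applies
    cmd = [sys.executable] + list(extra_args) + ["-c", CHILD]
    out = subprocess.check_output(cmd, stderr=subprocess.STDOUT, env=env)
    return out.decode().splitlines()


if __name__ == "__main__":
    # Default: stdout is block-buffered when piped, so the stderr line arrives first.
    print(run_merged())
    # With -u, buffering is disabled and the lines arrive in the order written.
    print(run_merged(["-u"]))
```

Under these assumptions, running the original script with `python -u`, or calling `sys.stdout.flush()` after each print, should keep the lines in the order they were written.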

Using only eprint solved all my problems.
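
For readers who have not seen such a helper, here is a sketch of how an eprint function is commonly written. The body is an assumption and not the code from the recycled script; only the name eprint comes from it. It is written so that it runs as-is on Python 3; on Python 2 it needs the `__future__` import mentioned in the first comment.

```python
# On Python 2, add `from __future__ import print_function` as the very first
# statement of the script so that print() accepts the `file` argument.
import sys


def eprint(*args, **kwargs):
    """Print like print(), but to stderr unless a `file` argument is given."""
    kwargs.setdefault("file", sys.stderr)
    print(*args, **kwargs)


if __name__ == "__main__":
    # Both messages go to the same stream, so redirection cannot split them
    # between the screen and a file.
    eprint("True PRS read from file:", "../data/prs_true/prs.prs_true.")
    eprint("Defining cases/controls")
```

Because every message then goes through one stream, a redirect either captures all of them or none of them. To capture them in a file, the redirect has to target stderr (for example `2> output.txt`) rather than stdout.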

Thanks to @Panki for pointing me in the right direction.
