6
$\begingroup$

I'm running a lot of experiments that give their output as CSV files. An experiment might be running for hours, with a new line being added to the CSV every 10 seconds.

Right now I'm opening these CSV files in a text editor, which isn't too convenient. I'm looking for a better way.

Here are some features I want:

  1. View CSV files as a table.
  2. Automatically display numbers in a reasonable way (e.g. show 2 digits after the decimal point instead of 10).
  3. Automatically tail the file, i.e. show new lines as the file is updated.
  4. Allow me to hide columns, and remember my selection for new CSV files of the same format.
  5. Allow me to show the data in a plot, and remember the plot format I used for the new CSV files of the same format.

Does anything like that exist?

$\endgroup$
3
  • $\begingroup$May I suggest using 'less' if you are on Linux or Mac? It is a command-line tool that lets you view any text file a page or a line at a time. Handy for quick viewing.$\endgroup$
    – mccurcio
    Commented Dec 21, 2022 at 16:06
  • $\begingroup$@mccurcio Never have I been so offended by a comment on StackExchange, good job ;) On a serious note, I meant to tail the file while showing it as a table, not a raw text file.$\endgroup$ Commented Dec 22, 2022 at 7:25
  • $\begingroup$I take nothing for granted anymore. ;)$\endgroup$
    – mccurcio
    Commented Dec 22, 2022 at 20:48

4 Answers

1
$\begingroup$

Alongside pandas for formatting your file the way you want, VisiData (https://github.com/saulpw/visidata) is a must-have!

It is an open-source project that runs in the terminal, and it rocks! More info here: https://www.visidata.org

$\endgroup$
3
  • $\begingroup$Use the # key on a column to set its type to int, or % for float. See the docs!$\endgroup$ Commented Dec 21, 2022 at 21:10
  • $\begingroup$I got that working, but it's only a scatter plot, while I would like a line plot with possible smoothing... And I don't want to mark the columns every time. Not to mention I want it to work on Windows. I think I'd better just write my own Python scripts.$\endgroup$ Commented Dec 22, 2022 at 11:51
  • $\begingroup$Update: I've learned to use VisiData using jsvine.github.io/intro-to-visidata and it's absolutely great! I love it and I'm going to use it regularly.$\endgroup$ Commented Dec 29, 2022 at 23:00
1
$\begingroup$

This is my first answer, and it already starts a holy war. Please don't kill me.

I don't like Excel much, but it may meet all of the requirements.

  1. Open a blank Excel file (.xlsx).
  2. Go to Data -> From Text, select the CSV file, and choose comma as the delimiter.
  3. Select the numeric columns, right-click, and change the Format Cells category to Number with 2 decimal places.
  4. Hide any columns you want. Save the file; your selection will be remembered the next time you open it.
  5. Create a plot. When creating it, select a range that includes the last filled row. When you save the file, the plot's format will be remembered.
  6. To update the information (or load another CSV in the same format), press Alt+F5. The sheet contents, along with the plot, will be refreshed and will contain all of the CSV's data (including the most recently added rows).
$\endgroup$
2
  • $\begingroup$I won't kill you, but for the amount of effort involved in your answer, I might as well write code. Consider that I have to do this 100 times per day. Also, I don't like Excel's interface for this.$\endgroup$ Commented Dec 20, 2022 at 7:47
  • $\begingroup$After the initial setup, the only thing needed is to hit Alt+F5 and Enter. I agree that Excel's interface is not the best.$\endgroup$
    – griko
    Commented Dec 20, 2022 at 9:19
1
$\begingroup$

If you do not like Excel, consider using the pandas library. You can use pandas.read_csv to read the file into a DataFrame. After importing the data, set pd.options.display.float_format = "{:,.2f}".format to show only 2 digits after the decimal point. If you want the script to show new records, you can write a simple loop that re-reads the CSV at whatever interval you want. Hiding columns is also easy in pandas: select the ones you want with selected_columns = all_columns[["selected_1", "selected_2"]] (note the double brackets, which pass a list of column names). The library can also draw simple plots, which might do the job depending on how complex you want your diagrams to be. To handle different CSV formats, just write if/else branches.
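As a rough illustration of the steps above, here is a minimal sketch. The helper names load_view and follow, the max_polls parameter, and the choice to print only the last 5 rows are my own assumptions for the example, not part of pandas. It simply re-reads the whole file whenever its size changes, which should be fine for small experiment logs.

```python
import time
from pathlib import Path

import pandas as pd

# Print floats with two digits after the decimal point.
pd.options.display.float_format = "{:,.2f}".format


def load_view(path, columns=None):
    """Read the CSV and keep only the requested columns (all if None)."""
    frame = pd.read_csv(path)
    if columns is not None:
        # Indexing with a list of names selects several columns at once.
        frame = frame[list(columns)]
    return frame


def follow(path, columns=None, interval=10.0, max_polls=None):
    """Re-read the file whenever its size changes and print the newest rows.

    max_polls is a hypothetical knob so the loop can stop; None runs forever.
    """
    last_size = -1
    polls = 0
    while max_polls is None or polls < max_polls:
        size = Path(path).stat().st_size
        if size != last_size:
            last_size = size
            print(load_view(path, columns).tail(5).to_string(index=False))
        polls += 1
        if max_polls is None or polls < max_polls:
            time.sleep(interval)
```

For example, follow("experiment.csv", ["step", "loss"]) would poll a hypothetical experiment.csv every 10 seconds. One caveat: if the experiment is in the middle of writing a line when the file is read, the last row may come out incomplete.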

$\endgroup$
1
  • $\begingroup$Sure, but then I need to write a script. I might do that, but I was hoping this is a common enough need that I wouldn't have to do that.$\endgroup$ Commented Dec 20, 2022 at 18:26
0
$\begingroup$

Why don't you try an interpreted language, something like Octave, R, Julia, Python etc. with an IDE? Below is a series of screenshots showing the use of Octave to do the equivalent of what you want.

The background to this is that every 10 minutes data is appended to CSV files via a cronjob, which is analogous to your experiment writing to file every 10 seconds. A script, "rolling_h1_cyclic_plot," is then run. Its "Select Instrument" prompt would in your case be "Select Relevant file," and "Number of 10 min Bars" selects the last N lines of that file. The script loads the last N lines into the workspace and plots the data; you can then view the raw data in the variable editor.
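For readers who would rather use Python than Octave, the "load the last N lines" step could be sketched roughly as follows. This is not the rolling_h1_cyclic_plot script itself; the function name last_rows is hypothetical, and the sketch assumes the first line of the file is a header.

```python
import csv
from collections import deque


def last_rows(path, n):
    """Return the header and the last n data rows of a CSV file."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        # A bounded deque keeps only the newest n rows while streaming the file.
        rows = list(deque(reader, maxlen=n))
    return header, rows
```

The returned rows are lists of strings, so they would still need converting to numbers before plotting.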

This approach meets most of your requirements and with a bit of refinement could probably meet all of them.

[Screenshots: screen1, screen2, screen3, screen4, screen5]

$\endgroup$
