Questions tagged [bioinformatics]
Use this tag for questions about common bioinformatics tasks performed on a *nix system, such as manipulating or converting between standard biological text formats and extracting data of interest from such formats.
322 questions
1 vote
3 answers
66 views
edit all the values in a specific column based on row numbers range
I have a PDB file (coordinates of atoms in a protein) on a Linux machine: ATOM 1 N GLY A 1 0.535 51.766 5.682 1.00 0.00 ATOM 2 CA GLY A 1 -0.712 50....
2 votes
1 answer
350 views
how to pass environment variables to singularity exec
I have a Bash pipeline that at one point runs a Singularity container with singularity exec as follows: singularity exec --bind `pwd`:/folder --bind $d:/results <image>.sif <tool_command> -...
4 votes
3 answers
220 views
Add columns from variable number of files to base file
I'm dealing with a series of bed files, which look like this: chr1 100 110 0.5 chr1 150 175 0.2 chr1 200 300 1.5 With the columns being chromosome, start, end, score. I have multiple different files ...
1 vote
4 answers
117 views
Find lines in Vim that start one way and that don't end in another way
I'm trying to use Vim to find, via /, lines that start and end in specific ways. In particular, I'm looking for lines that start with the character > and do not have the string RNA at the very end. ...
6 votes
3 answers
927 views
bash script quoting frustration
This problem is driving me crazy. From the command prompt I can enter this command and it works as expected (records where the INFO/RegionType tag contains the value Core are emitted in the output ...
2 votes
5 answers
321 views
Grouping rows by categories avoiding repetition
I have a tab-separated file with two columns on a Linux machine. The first column contains names, the second column contains GO IDs (these are always of the format GO: followed by seven digits) ...
5 votes
6 answers
284 views
subset columns from the 1st file using column names in 2nd file
I have two text files. The first is a tab-delimited file that looks like this: chrom pos ref alt a1 a2 a3 a4 10 12345 C T aa bb cc dd 10 12345 C T aa bb cc dd 10 12345 C ...
2 votes
5 answers
1k views
Replace new lines with spaces using awk
I have a text file that I generated of all files in a directory. I'd like to use this file as input into a script that I have, but I need the text file to be formatted in a particular way to be parsed ...
1 vote
5 answers
123 views
sed command to replace a word within a line following a pattern
I'm working with a file that looks like the following, containing over 50,000 lines of gene IDs followed by their sequence: gene_A:3342234 CTCTTTCTTTTACGCCT gene_A:1244-5205 CTCTTTCTTTTACGCCT ...
2 votes
5 answers
545 views
How to split a given column's string values in a text file
I have a text file on a Linux machine with two columns: Column 1 = id_no (most are 5 digits long, some are 6); Column 2 = genetic_markers (all are 50674 digits long); 12345 0102010205 54322 ...
1 vote
4 answers
158 views
Remove everything in a third column but only keep specific text
I have a data set with three columns: https://drive.google.com/file/d/1gtCssfAXHxRjGfX8uTAaimGPWCA2cnci/view?usp=sharing Here are the first few lines: ID transcript_id go_description ...
1 vote
4 answers
485 views
Retrieve the 1st and 5th column of a tab-separated file, convert the spaces in the 5th to tabs
I have a TSV file with tab-separated columns. I want to extract the 5th column, which has space-separated values, convert the spaces to tabs, and save the result as a new file. Attempt: cut -d&...
1 vote
5 answers
89 views
How to count words in a column when consecutive cells in a different column are equal, using a shell script
I'm trying to count the number of C_R and S_R in column 9 when consecutive cells in column 2, column 3, and column 1 are the same. The file is in bed format (tab-separated format). The original file ...
1 vote
4 answers
202 views
Duplicate part of a line to another part
I would like to copy the first part (IxoscaEVm****t1_, without the '.p[number]') of the line starting with ">" and paste it before the last ":" of the same line. Input: >...
0 votes
1 answer
54 views
Counting characters between grep searches
Is there a way I can use the grep command in conjunction with a series of other commands to find a character sequence (e.g. 'GAATTC' in a FASTA file) and count how many characters are between each match?...