
Questions tagged [bioinformatics]

Use this tag for questions about common bioinformatics tasks performed on a *nix system, such as manipulating or converting between standard biological text formats, or extracting data of interest from such formats.

1 vote
3 answers
66 views

edit all the values in a specific column based on row numbers range

I have a PDB file (coordinates of atoms in a protein) on a Linux machine: ATOM 1 N GLY A 1 0.535 51.766 5.682 1.00 0.00 ATOM 2 CA GLY A 1 -0.712 50....
Paolo Lorenzini

2 votes
1 answer
350 views

how to pass environment variables to singularity exec

I have a BASH pipeline which at a point runs a Singularity container with singularity exec as follows: singularity exec --bind `pwd`:/folder --bind $d:/results <image>.sif <tool_command> -...
Matteo

4 votes
3 answers
220 views

Add columns from variable number of files to base file

I'm dealing with a series of bed files, which look like this: chr1 100 110 0.5 chr1 150 175 0.2 chr1 200 300 1.5 With the columns being chromosome, start, end, score. I have multiple different files ...
Whitehot

1 vote
4 answers
117 views

Find lines in Vim that start one way and that don't end in another way

I'm trying to use Vim to find, via /, lines that start and end in specific ways. In particular, I'd be looking for lines that start with the character > and without the string RNA at the very end. ...
Mark Pauley

6 votes
3 answers
927 views

bash script quoting frustration

This problem is driving me crazy. From the command prompt I can enter this command and it works as expected (records where the INFO/RegionType tag contains the value Core are emitted in the output ...
mcrepeau

2 votes
5 answers
321 views

Grouping rows by categories avoiding repetition

I have a tab-separated file with two columns on a Linux machine. The first column contains names, the second column contains GO IDs (these are always of the format GO: followed by seven digits) ...
rseg

5 votes
6 answers
284 views

subset columns from the 1st file using column names in 2nd file

I have two text files: 1st file is a Tab delimited file which looks like this: chrom pos ref alt a1 a2 a3 a4 10 12345 C T aa bb cc dd 10 12345 C T aa bb cc dd 10 12345 C ...
user3138373

2 votes
5 answers
1k views

Replace new lines with spaces using awk

I have a text file that I generated of all files in a directory. I'd like to use this file as input into a script that I have, but I need the text file to be formatted in a particular way to be parsed ...
lovelyrubbish

1 vote
5 answers
123 views

sed command to replace a word within a line following a pattern

I'm working with a file that looks like the following, containing over 50,000 lines of gene IDs followed by their sequence: gene_A:3342234 CTCTTTCTTTTACGCCT gene_A:1244-5205 CTCTTTCTTTTACGCCT ...
bryophyta

2 votes
5 answers
545 views

How to split a given column's string values in a text file

I have a text file on a Linux machine with two columns: Column 1 = id_no (most are 5, with some 6 digits long); Column 2 = genetic_markers (all are 50674 digits long); 12345 0102010205 54322 ...
Michiel Van Niekerk

1 vote
4 answers
158 views

Remove everything in a third column but only keep specific text

I have a data set with three columns: https://drive.google.com/file/d/1gtCssfAXHxRjGfX8uTAaimGPWCA2cnci/view?usp=sharing Here are the first few lines: ID transcript_id go_description ...
Muahammad Ahmad

1 vote
4 answers
485 views

Retrieve the 1st and 5th column of a tab-separated file, convert the spaces in the 5th to tabs

I have a tsv file with tab-separated columns. I want to obtain the 5th column, which has space-separated values. Convert the space-separation to tab-separation and save as a new file. Attempt: cut -d&...
Anon

1 vote
5 answers
89 views

How to count word from a column when consecutive cells are equal in a different column using shell script!

I'm trying to count the number of C_R and S_R in column 9 when consecutive cells in column 2, column 3, and column 1 are the same. The file is in bed format (tab-separated format). The original file ...
Debajyoti Kabiraj

1 vote
4 answers
202 views

Duplicate part of a line to another part

I would like to copy the first part (IxoscaEVm****t1_, without the '.p[number]') of the line starting with ">" and paste it before the last ":" of the same line. Input: >...
alex kiarie

0 votes
1 answer
54 views

Counting characters between grep searches

Is there a way I can use the grep command in conjunction with a series of other commands to find a character sequence (e.g. 'GAATTC' in a fasta file) and count how many characters are between each match?...
Alina
