Questions tagged [bioinformatics]
Bioinformatics is the use of software tools to analyse biological data.
152 questions
1 vote
1 answer
58 views
Needleman-Wunsch algorithm with affine gap cost
Needleman-Wunsch is a bioinformatics algorithm used to align 2 sequences. The algorithm outputs the score of the alignment and a Vec containing all operations to reconstruct the alignment. I do not ...
3 votes
1 answer
86 views
Finding repeats in DNA in R
I recently wrote some R code, which I would not normally do, to find repeats in DNA FASTA files. Here is an example FASTA file: ...
6 votes
2 answers
226 views
Calculating probability of offspring having dominant phenotype given a random mating - Mendel's First Law
I'm a beginner in Python and have been working through the Rosalind problems. If you're unfamiliar with Rosalind, it's a website where you can practice bioinformatics coding through problem solving. ...
2 votes
1 answer
172 views
First order hidden Markov model with Viterbi algorithm in Java
Introduction A first order HMM (hidden Markov model) is a tuple \$(H, \Sigma, T, E, \mathbb{P})\$, where \$H = \{1, \ldots, \vert H \vert\}\$ is the set of hidden states, \$\Sigma\$ is the set of ...
2 votes
1 answer
94 views
Semi-dynamic range minimum query (RMQ) tree in Java
Introduction I have this semi-dynamic range minimum query (RMQ) tree in Java. It is called semi-dynamic due to the fact that it cannot be modified after it is constructed. However, the values ...
6 votes
1 answer
93 views
Optimizing __getitem__ for a bioinformatics script in Python
I'm writing a script for a bioinformatics application in Python that iterates through several sequences looking for amino acids in specific positions to calculate their frequency in relation to a ...
4 votes
1 answer
95 views
Random FASTA file generator
This is my code to generate a FASTA file containing multiple records with randomized DNA sequences of different lengths. I am looking for feedback on how to write this script better. ...
2 votes
2 answers
104 views
Slow Bioinformatics algorithm - Clump finding algorithm in Haskell
I'm working on the famous clump finding problem to learn Haskell. Part of the problem involves breaking nucleotide sequences into subsequences, called k-mers, as follows: ...
5 votes
1 answer
279 views
Converting an RNA sequence into the amino acid code and introducing a mutation
I have recently started coding in Python and this is my first proper project. The user enters an RNA sequence and the program converts the code into the corresponding amino acid sequence. The user can ...
1 vote
1 answer
100 views
Show different concentrations on scatter plots with different colors [closed]
I have an app that parses data files in the clients' browsers and displays a scatter chart. This is an example of a scatter chart with 10,000 data points: This is an example of a scatter chart with 1....
1 vote
1 answer
61 views
Sensibly using fastq crate to modify fastq files
I'm learning Rust, and I have a simple program that I hope to use as a learning exercise. My goal here is to get a better idea of the proper way of doing things. I'm trying to use the fastq crate to ...
5 votes
1 answer
402 views
Rust program to one hot encode genetic sequences from .fa files
I wanted to write some code that reads in a FASTA file and one-hot encodes the sequence, which is subsequently saved to a file. A FASTA file is a text-based file format commonly used in ...
2 votes
1 answer
469 views
Python FASTA parser using dictionaries without using BioPython or other external libraries
I am writing my own parser for the FASTA format. I can't use BioPython or anything else, because it's part of an assignment and our teacher wants us to try to do it manually. For now, I have done this: <...
4 votes
1 answer
694 views
Function to calculate the GC content variation in a sequence
I came across this BMC Genomics paper: "Analysis of intra-genomic GC content homogeneity within prokaryotes". I implemented some Python functions to make this available as part of a personal project. ...
8 votes
3 answers
1k views
Filter out ambiguous bases from a DNA sequence
I have this function: ...