
Questions tagged [bioinformatics]

Bioinformatics is the use of software tools to analyse biological data.

1 vote
1 answer
58 views

Needleman-Wunsch algorithm with affine gap cost

Needleman-Wunsch is a bioinformatics algorithm used to align 2 sequences. The algorithm outputs the score of the alignment and a Vec containing all operations to reconstruct the alignment. I do not ...
RomainL.
3 votes
1 answer
86 views

Finding repeats in DNA in R

I recently wrote some R code, which I would normally not do, to find repeats in DNA fasta files. Here is an example fasta file: ...
con
6 votes
2 answers
226 views

Calculating probability of offspring having dominant phenotype given a random mating - Mendel's First Law

I'm a beginner at Python and have been working through the Rosalind problems. If you're unfamiliar with Rosalind, it's a website where you can practice bioinformatics coding through problem solving. ...
bio_boy
2 votes
1 answer
172 views

First order hidden Markov model with Viterbi algorithm in Java

Introduction: A first-order HMM (hidden Markov model) is a tuple \$(H, \Sigma, T, E, \mathbb{P})\$, where \$H = \{1, \ldots, \vert H \vert\}\$ is the set of hidden states, \$\Sigma\$ is the set of ...
coderodde
2 votes
1 answer
94 views

Semi-dynamic range minimum query (RMQ) tree in Java

Introduction: I have this semi-dynamic range minimum query (RMQ) tree in Java. It is called semi-dynamic because it cannot be modified after it is constructed. However, the values ...
coderodde
6 votes
1 answer
93 views

Optimizing __getitem__ for a bioinformatics script in Python

I'm writing a script for a bioinformatics application in Python that iterates through several sequences looking for amino acids in specific positions to calculate their frequency in relation to a ...
Eduardo Menotti
4 votes
1 answer
95 views

Random FASTA file generator

This is my code to generate a FASTA file containing multiple records with randomized DNA sequences of distinct lengths. I am looking for feedback on how to write this script better. ...
Supertech
2 votes
2 answers
104 views

Slow Bioinformatics algorithm - Clump finding algorithm in Haskell

I'm working on the famous clump finding problem to learn Haskell. Part of the problem involves breaking nucleotide sequences into subsequences, called k-mers, as follows: ...
plaffont
5 votes
1 answer
279 views

Converting an RNA sequence into the amino acid code and introducing a mutation

I have recently started coding in Python and this is my first proper project. The user enters an RNA sequence and the program converts the code into the corresponding amino acid sequence. The user can ...
user260750
1 vote
1 answer
100 views

Show different concentrations on scatter plots with different colors [closed]

I have an app which parses data files in the client's browser and displays a scatter chart. This is an example of a scatter chart with 10,000 data points: This is an example of a scatter chart with 1....
Mark
1 vote
1 answer
61 views

Sensibly using fastq crate to modify fastq files

I'm learning Rust, and I have a simple program that I hope to use as a learning exercise. My goal here is to get a better idea of the proper way of doing things. I'm trying to use the fastq crate to ...
Tsaari
5 votes
1 answer
402 views

Rust program to one hot encode genetic sequences from .fa files

I wanted to write some code which reads in a FASTA file and one-hot encodes the sequence, which is subsequently saved to a file. A FASTA file is a text-based file format commonly used in ...
dry-leaf
2 votes
1 answer
469 views

Python FASTA parser using dictionaries without using BioPython or other external libraries

I am writing my own parser for FASTA format. I can't use BioPython or anything else, because it's a part of an assignment and our teacher wants us to try to do it manually. For now, I have done this: <...
CitronWorld
4 votes
1 answer
694 views

Function to calculate the GC content variation in a sequence

I came across this BMC Genomics paper, "Analysis of intra-genomic GC content homogeneity within prokaryotes", and I implemented some Python functions to make this available as part of a personal project. ...
Paulo Sergio Schlogl
8 votes
3 answers
1k views

Filter out ambiguous bases from a DNA sequence

I have this function: ...
Paulo Sergio Schlogl
