What is BiasAway?
BiasAway is a software tool for generating nucleotide composition-matched DNA sequences. It is available as open-source code on Bitbucket.
The tool provides users with four approaches to generate synthetic or genomic background sequences matching the mono- and dinucleotide composition of user-provided foreground sequences:
- 1. synthetic k-mer shuffled sequences,
- 2. synthetic k-mer shuffled sequences in a sliding window,
- 3. genomic mononucleotide distribution matched sequences,
- 4. genomic mononucleotide distribution within a sliding window matched sequences.
The 1st and 2nd approaches shuffle the user-provided sequences while preserving their k-mer composition. The 1st approach shuffles each sequence independently as a whole; the 2nd approach applies the same method within a sliding window along each user-provided sequence. For the 3rd and 4th approaches, the background sequences are selected from a pool of provided genomic sequences to match the mononucleotide distribution of each target sequence. The 4th approach considers the mean and standard deviation of %GC computed within a sliding window along the user-provided sequences, so as to match the distribution of each user-provided sequence as closely as possible.
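The idea behind the 4th approach can be sketched in a few lines of Python. This is only an illustration and not BiasAway's actual code: the function names, the window and step sizes, and the distance used to rank candidates (sum of absolute differences in mean and standard deviation) are all assumptions made for this example.

```python
from statistics import mean, pstdev


def gc_fraction(seq):
    """Return the fraction of G and C bases in seq (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc_stats(seq, window=100, step=50):
    """Return (mean, standard deviation) of %GC over sliding windows.

    The window and step defaults are illustrative assumptions.
    Sequences no longer than one window are treated as a single window.
    """
    if len(seq) <= window:
        values = [gc_fraction(seq)]
    else:
        values = [
            gc_fraction(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)
        ]
    return mean(values), pstdev(values)


def pick_background(target, pool, window=100, step=50):
    """Pick the pool sequence whose windowed %GC profile is closest to target.

    Hypothetical ranking rule: |mean difference| + |std difference|.
    """
    t_mean, t_sd = windowed_gc_stats(target, window, step)

    def distance(candidate):
        c_mean, c_sd = windowed_gc_stats(candidate, window, step)
        return abs(c_mean - t_mean) + abs(c_sd - t_sd)

    return min(pool, key=distance)
```

For example, `pick_background("GGGGAAAA", ["GAGAGAGA", "AAAAGGGG"], window=4, step=4)` prefers the second candidate: both candidates have the same overall %GC as the target, but only the second shows the same window-to-window variation.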
The sliding-window approaches were included because evolutionary changes such as insertion of repetitive sequences, local rearrangements, or biochemical missteps can leave the target sequences with sub-regions of distinct nucleotide composition.
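A toy example shows why window-based shuffling matters for such sequences. The sketch below is a simplification under stated assumptions: it shuffles single nucleotides only (k = 1) and uses non-overlapping blocks in place of a true sliding window; the function names are hypothetical and are not part of BiasAway.

```python
import random


def shuffle_sequence(seq, rng):
    """Shuffle the letters of seq, preserving its mononucleotide counts."""
    letters = list(seq)
    rng.shuffle(letters)
    return "".join(letters)


def shuffle_in_windows(seq, window, rng):
    """Shuffle seq block by block so each block keeps its own composition.

    Assumption: non-overlapping blocks stand in for a sliding window.
    """
    return "".join(
        shuffle_sequence(seq[i:i + window], rng)
        for i in range(0, len(seq), window)
    )


rng = random.Random(0)          # fixed seed for reproducibility
seq = "GGGGGGGG" + "AAAAAAAA"   # GC-rich half followed by AT-rich half

whole = shuffle_sequence(seq, rng)           # G and A may mix across halves
windowed = shuffle_in_windows(seq, 8, rng)   # each half keeps its composition
```

Both outputs have the same overall composition as the input, but only the windowed shuffle is guaranteed to keep the first half GC-rich and the second half AT-rich.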
You can find the complete documentation for BiasAway on Read the Docs.
The source code for BiasAway is available on Bitbucket.