A tool to generate nucleotide composition-matched DNA sequences

Citing BiasAway

A. Khan, R. Riudavets Puig, P. Boddie, and A. Mathelier. BiasAway: command-line and web server to generate nucleotide composition-matched DNA background sequences. Bioinformatics btaa928 (2020). https://doi.org/10.1093/bioinformatics/btaa928

Worsley Hunt, R., Mathelier, A., del Peso, L. et al. Improving analysis of transcription factor binding sites within ChIP-Seq data based on topological motif enrichment. BMC Genomics 15, 472 (2014). https://doi.org/10.1186/1471-2164-15-472

What is BiasAway?

BiasAway is a software tool that generates nucleotide composition-matched DNA sequences. It is available as open-source code on Bitbucket.

The tool provides users with four approaches to generate synthetic or genomic background sequences matching mono- and dinucleotide composition of user-provided foreground sequences:

  • synthetic k-mer shuffled sequences
  • synthetic k-mer shuffled sequences in a sliding window
  • genomic mononucleotide distribution matched sequences
  • genomic mononucleotide distribution within a sliding window matched sequences
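
The first two approaches in the list above can be illustrated with a minimal Python sketch. It is not BiasAway's implementation: it assumes k = 1 (mononucleotide shuffling) and consecutive, non-overlapping windows, and the function names `shuffle_mono` and `shuffle_mono_windowed` are hypothetical names chosen for this example.

```python
import random
from collections import Counter

def shuffle_mono(seq, rng):
    """Shuffle a sequence; the mononucleotide counts are preserved."""
    letters = list(seq)
    rng.shuffle(letters)
    return "".join(letters)

def shuffle_mono_windowed(seq, window, rng):
    """Shuffle each consecutive, non-overlapping window independently
    (a simplifying assumption made for this sketch)."""
    return "".join(
        shuffle_mono(seq[start:start + window], rng)
        for start in range(0, len(seq), window)
    )

rng = random.Random(0)  # fixed seed so the example is repeatable
foreground = "ACGTACGTGGGGCCCCAAAATTTT"

whole = shuffle_mono(foreground, rng)
windowed = shuffle_mono_windowed(foreground, 8, rng)

# Both backgrounds keep the overall composition of the foreground;
# the windowed one also keeps the composition of every 8-nt window.
print(Counter(whole) == Counter(foreground))
print(Counter(windowed[:8]) == Counter(foreground[:8]))
```

Shuffling within windows keeps local composition (for example a GC-rich stretch stays GC-rich), which whole-sequence shuffling does not guarantee.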

The 1st approach shuffles each user-provided sequence independently while preserving its k-mer composition (for example, mononucleotide or dinucleotide). The 2nd approach applies the same method within a sliding window along the user-provided sequences. For the 3rd and 4th approaches, the background sequences are selected from a pool of provided genomic sequences to match the mononucleotide distribution of each target sequence. The 4th approach considers the mean and standard deviation of %GC computed within the sliding window along the user-provided sequences, to match the distribution of each user-provided sequence as closely as possible.
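
To make the genomic, sliding-window matching more concrete, here is a minimal Python sketch under stated assumptions. It computes the mean and standard deviation of %GC over sliding windows and picks, from a pool, the sequence whose two values are closest to those of the target. The nearest-match rule, the distance used, and the names `gc_percent`, `windowed_gc`, and `pick_background` are assumptions made for illustration; BiasAway's actual selection procedure may differ.

```python
from statistics import mean, pstdev

def gc_percent(seq):
    """Percentage of G and C bases in a sequence (0.0 for an empty one)."""
    seq = seq.upper()
    return 100.0 * sum(seq.count(b) for b in "GC") / len(seq) if seq else 0.0

def windowed_gc(seq, window, step):
    """Mean and standard deviation of %GC over sliding windows."""
    last = max(len(seq) - window, 0)
    values = [gc_percent(seq[i:i + window]) for i in range(0, last + 1, step)]
    return mean(values), pstdev(values)

def pick_background(target, pool, window, step):
    """Return the pool sequence whose windowed %GC mean and standard
    deviation are closest to the target's (hypothetical distance)."""
    t_mean, t_sd = windowed_gc(target, window, step)

    def distance(candidate):
        c_mean, c_sd = windowed_gc(candidate, window, step)
        return abs(c_mean - t_mean) + abs(c_sd - t_sd)

    return min(pool, key=distance)

target = "GGGGCCCCAAAATTTT"
pool = ["ACGTACGTACGTACGT", "CCCCGGGGTTTTAAAA", "AAAAAAAATTTTTTTT"]
chosen = pick_background(target, pool, window=4, step=4)
print(chosen)
```

In this toy pool, the first candidate has the same overall %GC as the target but a uniform local profile, so a whole-sequence comparison could not tell it apart from the second candidate; the windowed mean and standard deviation can.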

The sliding-window approaches were included because evolutionary changes, such as insertion of repetitive sequences, local rearrangements, or biochemical missteps, can leave target sequences with sub-regions of distinct nucleotide composition.

You can find the complete documentation for BiasAway at Read the Docs.
The source code for BiasAway is available on Bitbucket.