UDACHA - help

Coordinate system

Genome positions of ASVs and SNVs in the online database are VCF/GTF-like (1-based) and refer to the hg38 genome assembly. The downloads are provided in a BED-like format (0-based start, 1-based end).
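The relation between the two conventions can be illustrated with a minimal sketch. The function names below are hypothetical and are not part of any UDACHA tool; the sketch only assumes a single-nucleotide variant unless a different length is given.

```python
def vcf_to_bed(chrom, pos, length=1):
    """Convert a 1-based (VCF/GTF-like) position to a BED-like interval.

    The BED-like start is 0-based and the end is 1-based, so a single
    nucleotide at 1-based position `pos` becomes (pos - 1, pos).
    """
    if pos < 1:
        raise ValueError("1-based positions start at 1")
    start = pos - 1
    return chrom, start, start + length


def bed_to_vcf(chrom, start, end):
    """Convert a BED-like interval back to the 1-based position of its first base."""
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    return chrom, start + 1


print(vcf_to_bed("chr1", 100))      # ('chr1', 99, 100)
print(bed_to_vcf("chr1", 99, 100))  # ('chr1', 100)
```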

BAD

Background Allelic Dosage (BAD) is the expected ratio of major to minor allelic frequencies in a particular genomic region.
For example, if the copy numbers of the two alternative alleles are the same (e.g. 1:1 (diploid), 2:2, or 3:3), then the respective region has BAD=1, i.e. the expected ratio of reads mapped to the alternative alleles at heterozygous SNVs is 1. All triploid regions have BAD=2, and the expected allelic read ratio is either 2 or ½. In general, if the BAD of a particular region is known, then the expected frequencies of allelic reads are 1/(BAD + 1) and BAD/(BAD + 1). Whole-genome BAD maps are obtained with BABACHI. More details can be found in the ADASTRA paper.
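As a small worked example of the formulas above, the following sketch computes BAD from a pair of allelic copy numbers and the corresponding expected allelic read frequencies. The function names are hypothetical and chosen for illustration only; BABACHI itself estimates BAD from read counts, which is not shown here.

```python
from fractions import Fraction


def bad_from_copy_numbers(copies_a, copies_b):
    """BAD is the ratio of the major to the minor allelic copy number."""
    if copies_a < 1 or copies_b < 1:
        raise ValueError("both alleles must be present at a heterozygous site")
    return Fraction(max(copies_a, copies_b), min(copies_a, copies_b))


def expected_allelic_frequencies(bad):
    """Return the expected (minor, major) allelic read frequencies for a given BAD."""
    bad = Fraction(bad)
    if bad < 1:
        raise ValueError("BAD is at least 1 by definition")
    return 1 / (bad + 1), bad / (bad + 1)


print(bad_from_copy_numbers(1, 1))      # 1  (diploid)
print(bad_from_copy_numbers(2, 1))      # 2  (triploid)
print(expected_allelic_frequencies(2))  # (Fraction(1, 3), Fraction(2, 3))
```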

ADASTRA coverage filters

The ADASTRA pipeline applies several SNV filters before the statistical evaluation of ASVs:

Heterozygous biallelic SNVs (according to GATK ‘GT’ annotation);
Having ≥ 5 reads at each of the alleles;
Present in dbSNP build151 ‘common’ collection;
Belonging to the regions with BABACHI-estimated BAD;
Having a total read coverage of ≥ 20 in at least one of the experiments (for a particular TF or cell type).

More details can be found in the ADASTRA paper.
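The filters listed above can be sketched as follows. This is an illustration under stated assumptions, not the ADASTRA implementation: the record layout and all field and function names are hypothetical, total coverage is assumed to be the sum of reference and alternative reads, and the coverage threshold is assumed to be checked among the experiments that pass the other filters.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: heterozygous biallelic genotypes as they appear in a 'GT' field.
HET_GENOTYPES = {"0/1", "1/0", "0|1", "1|0"}


@dataclass
class SnvObservation:
    """One SNV observed in one experiment (hypothetical record layout)."""
    genotype: str            # 'GT' annotation
    ref_reads: int           # reads supporting the reference allele
    alt_reads: int           # reads supporting the alternative allele
    in_dbsnp_common: bool    # present in the dbSNP 'common' collection
    bad: Optional[float]     # estimated BAD, None if the region has no estimate


def passes_experiment_filters(obs: SnvObservation, min_allele_reads: int = 5) -> bool:
    """Per-experiment filters: heterozygous, covered on both alleles, known SNP, known BAD."""
    return (
        obs.genotype in HET_GENOTYPES
        and obs.ref_reads >= min_allele_reads
        and obs.alt_reads >= min_allele_reads
        and obs.in_dbsnp_common
        and obs.bad is not None
    )


def passes_all_filters(observations: List[SnvObservation], min_total_coverage: int = 20) -> bool:
    """Keep the SNV if at least one passing experiment reaches the total coverage threshold."""
    kept = [obs for obs in observations if passes_experiment_filters(obs)]
    return any(obs.ref_reads + obs.alt_reads >= min_total_coverage for obs in kept)
```

In this sketch, `observations` would hold all experiments for one SNV within a particular TF or cell type.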

ASV properties

ASV significance: ASV calling is done separately for each ChIP-Seq experiment. For each candidate ASV site, the P-values for the Reference and Alternative alleles are calculated separately according to the fitted Negative Binomial Mixture model, which accounts for the different assignment of the alleles to the higher or lower DNA copies in genomic regions with BAD > 1. Prior mixture weights obtained with the global fit across SNVs are updated with Bayesian estimation separately for each SNV.
For a particular SNV, the P-values from individual data sets are aggregated with the logit (Mudholkar-George) method for each TF (using ChIP-Seq data from all cell types) and for each cell type (using ChIP-Seq data from all TFs). The aggregated P-values are then FDR-corrected across SNVs with the Benjamini-Hochberg procedure, separately for each TF and each cell type.
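The multiple-testing step can be illustrated with a minimal, dependency-free sketch of the Benjamini-Hochberg adjustment applied separately within each group (a TF or a cell type). The input layout and the function names are hypothetical. The logit aggregation step is not reproduced here; SciPy offers it as `scipy.stats.combine_pvalues(..., method='mudholkar_george')`, although whether the pipeline uses that particular function is not stated in this help page.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def fdr_per_group(aggregated):
    """Apply the correction separately for each group (e.g. each TF).

    `aggregated` maps a group name to a dict {snv_id: aggregated P-value}.
    """
    result = {}
    for group, snv_pvalues in aggregated.items():
        snv_ids = list(snv_pvalues)
        adjusted = benjamini_hochberg([snv_pvalues[s] for s in snv_ids])
        result[group] = dict(zip(snv_ids, adjusted))
    return result
```

For example, `fdr_per_group({"TF_A": {"snv1": 0.01, "snv2": 0.04}})` corrects the two P-values of the hypothetical group `TF_A` independently of any other group.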
ASV effect size: The Effect Size of an ASV event is calculated separately for the Reference and Alternative alleles and is defined as the weighted mean of the log-ratios of observed to expected allelic read counts, with the weights being -log₁₀ of the respective P-values. The expected read counts are estimated from the fitted Negative Binomial Mixture model, which accounts for the different assignment of the alleles to the higher or lower DNA copies in genomic regions with BAD > 1. Prior mixture weights obtained with the global fit across SNVs are updated with Bayesian estimation separately for each SNV.
The Effect Size is not assigned (n/a) if all of the raw individual P-values of an SNV on a particular genome position are equal to 1, considering Ref- and Alt-ASVs separately.
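The definition above, including the n/a rule, can be sketched as a short function for one allele of one SNV. This is an illustration only: the function name is hypothetical, the expected counts are taken as given rather than derived from the mixture model, and the base-2 logarithm of the ratio is an assumption, since the base is not specified here.

```python
import math


def effect_size(observed, expected, pvalues):
    """Weighted mean of log-ratios of observed to expected read counts.

    Weights are -log10 of the per-data-set P-values. Returns None (n/a)
    when every P-value equals 1, because all weights are then zero.
    Assumption: log2 is used for the ratio; P-values must be in (0, 1].
    """
    weights = [-math.log10(p) for p in pvalues]
    total_weight = sum(weights)
    if total_weight == 0:
        return None
    weighted_sum = sum(
        w * math.log2(obs / exp)
        for w, obs, exp in zip(weights, observed, expected)
    )
    return weighted_sum / total_weight


# Two data sets: log-ratios of 1 and 2 with weights of 1 and 2.
print(effect_size([20, 40], [10, 10], [0.1, 0.01]))
# All P-values equal to 1: the Effect Size is not assigned.
print(effect_size([10, 12], [10, 10], [1.0, 1.0]))  # None
```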

Browser compatibility

OS	Version	Chrome	Firefox	Microsoft Edge	Opera	Safari
Linux	Ubuntu 18.04	93	92	n/a	not tested	n/a
MacOS	Catalina, Mojave	93	92	n/a	79	11.1
Windows	10	93	92	93	79	n/a
