updog - Flexible Genotyping for Polyploids
Implements empirical Bayes approaches to genotype polyploids from next generation sequencing data while accounting for allele bias, overdispersion, and sequencing error. The main functions are flexdog() and multidog(), which allow the specification of many different genotype distributions. Also provided are functions to simulate genotypes, rgeno(), and read-counts, rflexdog(), as well as functions to calculate oracle genotyping error rates, oracle_mis(), and correlation with the true genotypes, oracle_cor(). These latter two functions are useful for read depth calculations. Run browseVignettes(package = "updog") in R for example usage. See Gerard et al. (2018) <doi:10.1534/genetics.118.301468> and Gerard and Ferrao (2020) <doi:10.1093/bioinformatics/btz852> for details on the implemented methods.
Last updated 1 year ago
openblas cpp openmp
8.41 score 26 stars 2 dependents 82 scripts 401 downloads
tensr - Covariance Inference and Decompositions for Tensor Datasets
A collection of functions for Kronecker structured covariance estimation and testing under the array normal model. For estimation, maximum likelihood and Bayesian equivariant estimation procedures are implemented. For testing, a likelihood ratio testing procedure is available. This package also contains additional functions for manipulating and decomposing tensor data sets. This work was partially supported by NSF grant DMS-1505136. Details of the methods are described in Gerard and Hoff (2015) <doi:10.1016/j.jmva.2015.01.020> and Gerard and Hoff (2016) <doi:10.1016/j.laa.2016.04.033>.
Last updated 2 years ago
6.48 score 5 stars 4 dependents 50 scripts 145 downloads
seqgendiff - RNA-Seq Generation/Modification for Simulation
Generates/modifies RNA-seq data for use in simulations. We provide a suite of functions that will add a known amount of signal to a real RNA-seq dataset. The advantage of this approach over simulating under a theoretical distribution is that common/annoying aspects of the data are better preserved, giving a more realistic evaluation of your method. The main functions are select_counts(), thin_diff(), thin_lib(), thin_gene(), thin_2group(), thin_all(), and effective_cor(). See Gerard (2020) <doi:10.1186/s12859-020-3450-9> for details on the implemented methods.
Last updated 7 months ago
5.85 score 10 stars 71 scripts 199 downloads
ldsep - Linkage Disequilibrium Shrinkage Estimation for Polyploids
Estimate haplotypic or composite pairwise linkage disequilibrium (LD) in polyploids, using either genotypes or genotype likelihoods. Support is provided to estimate the popular measures of LD: the LD coefficient D, the standardized LD coefficient D', and the Pearson correlation coefficient r. All estimates are returned with corresponding standard errors. These estimates and standard errors can then be used for shrinkage estimation. The main functions are ldfast(), ldest(), mldest(), sldest(), plot.lddf(), format_lddf(), and ldshrink(). Details of the methods are available in Gerard (2021a) <doi:10.1111/1755-0998.13349> and Gerard (2021b) <doi:10.1038/s41437-021-00462-5>.
Last updated 2 years ago
openblas cpp openmp
5.23 score 9 stars 19 scripts 203 downloads
segtest - Tests for Segregation Distortion in Polyploids
Provides a suite of tests for segregation distortion in F1 polyploid populations (for now, just tetraploids) under different assumptions about meiosis. Details of these methods are described in Gerard et al. (2024) <doi:10.1101/2024.02.07.579361>. This material is based upon work supported by the National Science Foundation under Grant No. 2132247. The opinions, findings, and conclusions or recommendations expressed are those of the author and do not necessarily reflect the views of the National Science Foundation.
Last updated 2 months ago
cpp
4.78 score 1 star 3 scripts 130 downloads