Title: | Automatic Shift Detection using a Phylogenetic EM |
Version: | 1.8.0 |
Description: | Implementation of the automatic shift detection method for Brownian Motion (BM) or Ornstein–Uhlenbeck (OU) models of trait evolution on phylogenies. Some tools to handle equivalent shifts configurations are also available. See Bastide et al. (2017) <doi:10.1111/rssb.12206> and Bastide et al. (2018) <doi:10.1093/sysbio/syy005>. |
Depends: | ape (≥ 5.3), Matrix (≥ 1.2.18), R (≥ 3.6.0), |
Imports: | capushe (≥ 1.1.1), foreach (≥ 1.4.3), gglasso (≥ 1.4), glmnet (≥ 2.0.5), graphics (≥ 3.6.0), grDevices (≥ 3.6.0), LINselect (≥ 1.1.1), MASS (≥ 7.3.45), methods (≥ 3.6.0), plyr (≥ 1.8.4), Rcpp (≥ 1.0.2), robustbase (≥ 0.92.6), stats (≥ 3.6.0), utils (≥ 3.6.0) |
Suggests: | combinat (≥ 0.0.8), doParallel (≥ 1.0.10), phytools (≥ 0.5.38), testthat (≥ 1.0.2), TreeSim (≥ 2.2), knitr, rmarkdown |
LinkingTo: | Rcpp, RcppArmadillo |
License: | GPL-2 | GPL-3 | file LICENSE [expanded from: GPL (≥ 2) | file LICENSE] |
Encoding: | UTF-8 |
LazyData: | true |
RoxygenNote: | 7.3.2 |
URL: | https://github.com/pbastide/PhylogeneticEM, https://pbastide.github.io/PhylogeneticEM/ |
BugReports: | https://github.com/pbastide/PhylogeneticEM/issues |
VignetteBuilder: | knitr |
NeedsCompilation: | yes |
Packaged: | 2025-02-14 17:47:49 UTC; bastidep |
Author: | Paul Bastide [aut, cre], Mahendra Mariadassou [ctb] |
Maintainer: | Paul Bastide <paul.bastide@m4x.org> |
Repository: | CRAN |
Date/Publication: | 2025-02-14 23:10:02 UTC |
Model Estimation with Detection of Shifts
Description
PhyloEM
is the main function of the package. It uses maximum likelihood
methods to fit a BM or an OU process for several traits evolving along a
phylogenetic tree, with automatic shift detection on the branches of the tree.
This function can handle missing data.
Usage
PhyloEM(
phylo,
Y_data,
process = c("BM", "OU", "scOU", "rBM"),
check_postorder = TRUE,
independent = FALSE,
K_max = max(floor(sqrt(length(phylo$tip.label))), 10),
use_previous = FALSE,
order = TRUE,
method.selection = c("LINselect", "DDSE", "Djump"),
C.BM1 = 0.1,
C.BM2 = 2.5,
C.LINselect = 1.1,
method.variance = c("upward_downward", "simple"),
method.init = "lasso",
method.init.alpha = "estimation",
method.init.alpha.estimation = c("regression", "regression.MM", "median"),
methods.segmentation = c("lasso", "best_single_move"),
alpha_grid = TRUE,
nbr_alpha = 10,
random.root = TRUE,
stationary.root = random.root,
alpha = NULL,
check.tips.names = TRUE,
progress.bar = TRUE,
estimates = NULL,
save_step = FALSE,
rescale_OU = TRUE,
parallel_alpha = FALSE,
Ncores = 3,
K_lag_init = 5,
light_result = TRUE,
tol_tree = .Machine$double.eps^0.5,
allow_negative = FALSE,
option_is.ultrametric = 1,
trait_correlation_threshold = 0.9,
...
)
Arguments
phylo |
A phylogenetic tree of class |
Y_data |
Matrix of data at the tips, size p x ntaxa. Each line is a trait, and each column is a tip. The column names are checked against the tip names of the tree. |
process |
The model used for the fit. One of "BM" (for a full BM model, univariate or multivariate); "OU" (for an OU with independent traits, univariate or multivariate); or "scOU" (for a "scalar OU" model, see details). |
check_postorder |
Re-order the tree in post-order. If the Upward-Downward algorithm is used, the tree need to be in post-order. Default to TRUE if the upward-downward is used, otherwise automatically set to FALSE. |
independent |
Are the trait assumed to be independent from one another? Default to FALSE. OU in a multivariate setting only works if TRUE. |
K_max |
The maximum number of shifts to be considered. Default to
|
use_previous |
Should the initialization for K+1 shifts use the estimation for $K$ shifts already obtained? Default to FALSE. |
order |
Should the estimations be done for K increasing (TRUE) or K decreasing (FALSE)? If use_previous=FALSE, this has no influence, except if one initialization fails. Default to TRUE. |
method.selection |
Method selection to be used. Several ones can be used at the same time. One of "LINselect" for the Baraud Giraud Huet LINselect method; "DDSE" for the Slope Heuristic or "Djump" for the Jump Heuristic, last two based the Birgé Massart method. |
C.BM1 |
Multiplying constant to be used for the BigeMassart1 method. Need to be positive. Default to 0.1. |
C.BM2 |
Multiplying constant to be used for the BigeMassart2 method. Default to 2.5. |
C.LINselect |
Multiplying constant to be used for the LINselect method. Need to be greater than 1. Default to 1.1. |
method.variance |
Algorithm to be used for the moments computations at the E step. One of "simple" for the naive method; of "upward_downward" for the Upward Downward method (usually faster). Default to "upward_downward". |
method.init |
The initialization method. One of "lasso" for the LASSO base initialization method; or "default" for user-specified initialization values. Default to "lasso". |
method.init.alpha |
For OU model, initialization method for the selection
strength alpha. One of "estimation" for a cherry-based initialization, using
|
method.init.alpha.estimation |
If method.init.alpha="estimation",
choice of the estimation(s) methods to be used. Choices among "regression",
(method="M" is passed to |
methods.segmentation |
For OU, method(s) used at the M step to find new candidate shifts positions. Choices among "lasso" for a LASSO-based algorithm; and "best_single_move" for a one-move at a time based heuristic. Default to both of them. Using only "lasso" might speed up the function a lot. |
alpha_grid |
whether to use a grid for alpha values. Default to TRUE. This
is the only available method for scOU. This method is not available for OU with
multivariate traits. OU with univariate traits can take both TRUE or FALSE. If
TRUE, a grid based on the branch length of the tree is automatically computed,
using function |
nbr_alpha |
If |
random.root |
whether the root is assumed to be random (TRUE) of fixed (FALSE). Default to TRUE |
stationary.root |
whether the root is assumed to be in the stationary state. Default to TRUE. |
alpha |
If the estimation is done with a fixed alpha (either known, or on a grid), the possible value for alpha. Default to NULL. |
check.tips.names |
whether to check the tips names of the tree against the column names of the data. Default to TRUE. |
progress.bar |
whether to display a progress bar of the computations. Default to TRUE. |
estimates |
The result of a previous run of this same function. This function can be re-run for other model election method. Default to NULL. |
save_step |
If alpha_grid=TRUE, whether to save the intermediate results for each value of alpha (in a temporary file). Useful for long computations. Default to FALSE. |
rescale_OU |
For the Univariate OU, should the tree be re-scaled to use a BM ? This can speed up the computations a lot. However, it can make it harder for the EM to explore the space of parameters, and hence lead to a sub-optimal solution. Default to TRUE. |
parallel_alpha |
If alpha_grid=TRUE, whether to run the estimations with different values of alpha on separate cores. Default to FALSE. If TRUE, the log is written as a temporary file. |
Ncores |
If parallel_alpha=TRUE, number of cores to be used. |
K_lag_init |
Number of extra shifts to be considered at the initialization step. Increases the accuracy, but can make computations quite slow of taken too high. Default to 5. |
light_result |
if TRUE (the default), the object returned is made light, without easily computable quantities. If FALSE, the object can be very heavy, but its subsequent manipulations can be faster (especially for plotting). |
tol_tree |
tolerance to consider a branch length significantly greater than zero, or
two lineages lengths to be different, when checking for ultrametry.
(Default to .Machine$double.eps^0.5). See |
allow_negative |
whether to allow negative values for alpha (Early Burst). See details. Default to FALSE. |
option_is.ultrametric |
option for |
trait_correlation_threshold |
the trait correlation threshold to stop the analysis. Default to 0.9. |
... |
Further arguments to be passed to |
Details
Several models can be used:
BM with fixed root, univariate or multivariate.
OU with fixed or stationary root, univariate or multivariate.
For the OU in the multivariate setting, two assumptions can be made:
Independent traits. This amounts to diagonal rate and selection matrices.
"Scalar OU" (scOU): the rate matrix can be full, but the selection strength matrix is assumed to be scalar, i.e. all the traits are supposed to go to their optimum values with the same speed.
Note that the "scalar OU" model can also be seen as a re-scaling of the tree. The selection strength parameter alpha can then be interpreted as a measure of the "phylogenetic signal":
If alpha is close to 0, then the process is similar to a BM on the original tree, and the signal is strong.
If alpha is large, then the re-scaled tree is similar to a star-tree, and the signal is weak.
When there are no shifts, and the root is taken to be constant, this model is actually equivalent to an AC model (Uyeda et al. 2015). With this interpretation in mind, one might want to explore negative values of alpha, in order to fit a DC (or Early Burst) model. With no shift and a fixed root, the same proof shows that the scOU with alpha negative is equivalent to the DC model. There are two strong caveats in doing that.
The interpretation of the OU as modeling the dynamic of a trait undergoing stabilizing selection is lost. In this case, the scOU can only be seen as a re-scaling of the tree, similar to Pagel's delta.
The values of the "optimal values", and of the shifts on them, cannot be interpreted as such (the process is actually going away from this values, instead of being attracted). When looking at these values, one should only use the un-normalized values happening of the underlying BM. You can extract those using the
params_process
function withrBM = TRUE
.
Value
An object of class PhyloEM
. Relevant quantities can be extracted from it
using helper functions params_process.PhyloEM
,
imputed_traits.PhyloEM
See Also
plot.PhyloEM
, params_process.PhyloEM
,
imputed_traits.PhyloEM
Examples
## Not run:
## Load Data
data(monkeys)
## Run method
# Note: use more alpha values for better results.
res <- PhyloEM(Y_data = monkeys$dat, ## data
phylo = monkeys$phy, ## phylogeny
process = "scOU", ## scalar OU
random.root = TRUE, ## root is stationary
stationary.root = TRUE,
K_max = 10, ## maximal number of shifts
nbr_alpha = 4, ## number of alpha values
parallel_alpha = TRUE, ## parallelize on alpha values
Ncores = 2)
## Plot selected solution (LINselect)
plot(res) # three shifts
## Plot selected solution (DDSE)
plot(res, method.selection = "DDSE") # no shift
## Extract and solution with 5 shifts
params_5 <- params_process(res, K = 5)
plot(res, params = params_5)
## Show all equivalent solutions
eq_sol <- equivalent_shifts(monkeys$phy, params_5)
plot(eq_sol)
## End(Not run)
Add several entries, when only one is not NA.
Description
add_complementary
return the only non NA value of a given vector.
Usage
add_complementary(z)
Arguments
z |
a vector containing at most only one non-NA value. |
Details
This function is used in function merge_complementary_vectors
Value
row vector of the same size as entry matrix.
Allocation of regimes to nodes.
Description
allocate_regimes_from_shifts
allocate a number (from 0 to the number
of shifts) to each node, corresponding to its regime : all nodes below shift
i are numbered by i.
Usage
allocate_regimes_from_shifts(phylo, shifts_edges)
Arguments
phylo |
a phylogenetic tree, class |
shifts_edges |
edges were the shifts are. |
Value
Vector of size (ntaxa + Nnode) of the regimes of each node and tip.
Allocation of shifts to edges
Description
allocate_shifts_from_regimes
returns the position of the shifts induced
by the allocation of the regimes. Only works in an "infinite site" model.
Usage
allocate_shifts_from_regimes(phylo, regimes)
Arguments
phylo |
a phylogenetic tree, class |
regimes |
: vector of size (ntaxa + Nnode) of the regimes of each node and tip. |
Value
Vector of edges numbers where the shifts are.
Iteration allocation
Description
allocate_subset_node.simulate
slice the data correctly and allocate the right part.
To be used in function UpdateDown.
Usage
allocate_subset_node.simulate(node, array, value)
Arguments
node |
node on which to slice |
array |
structure to be sliced |
value |
value to be attributed to the slice |
Value
array: array p x Nnode x 2 (BM), with slice corresponding to node filled with value
Check selection strength
Description
check.selection.strength
checks, if the process is an OU, if the
selection strength is not too low, in which case the process is replaced
with a BM.
Usage
check.selection.strength(process, selection.strength = NA, eps = 10^(-6), ...)
Arguments
process |
: "BM" or "OU" |
selection.strength |
the selection strength parameter (if OU) |
eps |
the tolerance for the selection strength |
Details
This function return a process in a form of a character. If the entry process is "BM", then nothing is done. If it is "OU", then the verification of the selection strength is done.
Value
character : "BM" or "OU"
Test the format of data entry.
Description
check_data
tests if the data matrix has the right format, and if it is
correctly ordered to match the tips names.
Usage
check_data(phylo, Y_data, check.tips.names)
Arguments
phylo |
a phylogenetic tree, class |
Y_data |
matrix of data at the tips (pxntaxa) |
check.tips.names |
(bool) whether to check the tips names or not |
Value
Y_data a re-ordered matrix of data (if necessary)
Check dimensions of the parameters
Description
check_dimensions
checks dimensions of the parameters.
If wrong, throw an error.
Usage
check_dimensions(
p,
root.state,
shifts,
variance,
selection.strength = NULL,
optimal.value = NULL
)
Arguments
p |
dimension of the trait simulated |
root.state |
(list) state of the root, with: random : random state (TRUE) or deterministic state (FALSE) value.root : if deterministic, value of the character at the root exp.root : if random, expectation of the character at the root var.root : if random, variance of the character at the root |
shifts |
(list) position and values of the shifts : edges : vector of the K id of edges where the shifts are values : matrix p x K of values of the shifts on the edges (one column = one shift) relativeTimes : vector of dimension K of relative time of the shift from the parent node of edges |
variance |
variance-covariance matrix size p x p |
selection.strength |
matrix of selection strength size p x p (OU) |
optimal.value |
vector of p optimal values at the root (OU) |
Value
Nothing
Check Parsimony, assuming no homoplasy
Description
check_parsimony
take a vector of shifts edges, and check whether the
number of groups of the tips induced by this allocation is exactly the number of
shifts plus one. This is equivalent to parsimony when there is no homoplasy (i.e. no
convergent regimes).
Usage
check_parsimony(tree, edges, ...)
Arguments
tree |
phylogenetic tree |
edges |
a vector of edges of the tree, where the shifts are |
... |
possibly, a list giving the descendant tips of each edge |
Details
This function computes explicitly the clustering of the tips, using
function clusters_from_shifts
.
By default, this function uses enumerate_tips_under_edges
to compute
the list of tips under each edge, but a list can be provided (to avoid extra
computation, if many tests on the same tree are done).
Value
boolean : TRUE if the allocation is parsimonious.
Check whether an allocation of the shifts is parsimonious, in the "infinite site model".
Description
check_parsimony_clusters
takes a vector clusters of the tips, and
checks whether the number of groups of the tips induced by this allocation is
exactly the number of shifts plus one.
Usage
check_parsimony_clusters(tree, edges, clusters)
Arguments
tree |
phylogenetic tree. |
clusters |
a vector of clusters of the tips of the tree (result of
function |
Details
This function computes explicitly the clustering of the tips, using
function check_parsimony
.
By default, this function uses enumerate_tips_under_edges
to compute
the list of tips under each edge, but a list can be provided (if many tests are done).
Value
boolean : TRUE if the allocation is parsimonious.
Check range of alpha
Description
Check that the chosen values of alpha are not too large or too small, in order to avoid numerical instabilities.
Usage
check_range_alpha(alpha, h_tree)
Arguments
alpha |
a vector of alpha values. |
h_tree |
the total height of the tree. |
Clustering associated to a shift allocation, assuming no homoplasy.
Description
clusters_from_shifts
take a vector of shifts edges, and gives the
clustering of the tips induced by them, in a "no homoplasy" model (i.e. no
convergence is allowed).
Usage
clusters_from_shifts(tree, edges, part.list = enumerate_tips_under_edges(tree))
Arguments
tree |
phylogenetic tree |
edges |
a vector of edges of the tree, where the shifts are |
part.list |
a list giving the descendant tips of each edge |
Details
By default, this function uses enumerate_tips_under_edges
to compute
the list of tips under each edge.
Value
list of size n+m-1, entry i is the vector of tips bellow edge i.
E step
Description
compute_E.simple
computes the E step in the simple case where the invert matrix Sigma_YY_inv is given
Usage
compute_E.simple(
phylo,
times_shared,
distances_phylo,
process,
params_old,
masque_data = c(rep(TRUE, attr(params_old, "p_dim") * length(phylo$tip.label)),
rep(FALSE, attr(params_old, "p_dim") * phylo$Nnode)),
F_moments,
Y_data_vec_known,
miss = rep(FALSE, attr(params_old, "p_dim") * length(phylo$tip.label)),
Y_data,
U_tree,
...
)
Arguments
phylo |
Input tree. |
Details
This function takes parameters sim, Sigma and Sigma_YY_inv from
compute_mean_variance.simple
. It uses functions
extract_variance_covariance
, extract_covariance_parents
, and
extract_simulate_internal
to extract the needed quantities from these objects.
Value
conditional_law_X (list) : list of conditional statistics : "expectation" : matrix of size p x (ntaxa+Nnode), with ntaxa first columns set to Y_data (tips), and from ntaxa+1 to conditional expectation of the nodes conditioned to the tips E[Z_j|Y] "variances" : array of size p x p x (ntaxa+Nnode) with ntaxa first matrices of zeros (tips) and conditional variance of the nodes conditioned to the tips Var[Z_j|Y] "covariances" : array of size p x p x (ntaxa+Nnode) with ntaxa first matrices of zeros (tips) and conditional covariance of the nodes and their parents conditioned to the tips Cov[Z_j,Z_pa(j)|Y], with NA for the root. "optimal.values" : matrix of size p x ntaxa+Nnode of optimal values beta(t_j)
Compute Matrix W of actualization (Ultrametric case)
Description
compute_actualization_matrix_ultrametric
computes a squares p*Nedge bloc diagonal
matrix of the (I_p - exp(-A * (h - t_pa(j))))_{j node}.
Usage
compute_actualization_matrix_ultrametric(
tree,
selection.strength,
times_shared = compute_times_ca(tree)
)
Arguments
tree |
a phylogenetic tree. |
selection.strength |
the selection strength of the process. |
times_shared |
a matrix, result of function |
Details
Careful: the root is not taken into account in this function.
Value
Matrix of size p*Nedge
Computation of the optimal values at nodes and tips.
Description
compute_betas_from_shifts
computes the optimal values at the nodes and tips of the
tree, given the value at the root and the list of shifts occurring in the tree.
It assumes an OU model.
Usage
compute_betas_from_shifts(phylo, optimal.value, shifts)
Arguments
phylo |
a phylogenetic tree, class |
optimal.value |
the optimal value at the root of the tree. |
shifts |
position and values of the shifts . |
Details
Note that this is intended to be an internal function, and should not be used.
In general, use node_optimal_values
to get optimal values
from a set of parameters.
Value
Vector of size (ntaxa + Nnode) of the optimal values at the tips of the tree.
Compute differences of expectations between node and parent.
Description
compute_diff_exp
compute the differences of conditional expectations between all the
nodes and their parents.
Usage
compute_diff_exp.BM(phylo, conditional_law_X)
Arguments
phylo |
a phylogenetic tree |
conditional_law_X |
result of function |
Value
matrix p x Nedge containing, for each edge e finishing at node i, the quantity E[Z_i|Y]-E[Z_pa(i)|Y].
Phylogenetic Distances
Description
compute_dist_phy
computes the phylogenetic distances d_ij between all the
tips i, j.
Usage
compute_dist_phy(phy)
Arguments
phy |
a phylogenetic tree of class |
Details
This function relies on ape
function
dist.nodes
.
Value
a matrix of phylogenetic distances, ordered as the tips of the
tree. The matrix is of type symmetricMatrix-class
.
Compute the expected states of a BM
Description
compute_expectations.BM
use the matrix formulation to compute the
expected values at all the nodes.
Usage
compute_expectations.BM(phylo, root.state, shifts, U_tree = NULL)
Arguments
phylo |
Input tree. |
root.state |
(list): state of the root, with: random : random state (TRUE) or deterministic state (FALSE) value.root : if deterministic, value of the character at the root exp.root : if random, expectation of the character at the root var.root : if random, variance of the character at the root (pxp matrix) |
shifts |
(list) position and values of the shifts : edges : vector of the K id of edges where the shifts are values : matrix p x K of values of the shifts on the edges (one column = one shift) relativeTimes : vector of dimension K of relative time of the shift from the parent node of edges |
Value
paramSimu: array p x Nnode x 2 (BM). For each trait t, 1 <= t <= p, paramSimu[t, , ] has two columns, both containing the expected values for all the nodes.
Compute the expected states of a scOU
Description
compute_expectations.scOU
use the matrix formulation to compute the
expected values at all the nodes. Assumes a stationary root.
Usage
compute_expectations.scOU(
phylo,
root.state,
shifts,
alpha,
U_tree = NULL,
times_shared = NULL
)
Arguments
phylo |
Input tree. |
root.state |
(list): state of the root, with: random : random state (TRUE) or deterministic state (FALSE) value.root : if deterministic, value of the character at the root exp.root : if random, expectation of the character at the root var.root : if random, variance of the character at the root (pxp matrix) |
shifts |
(list) position and values of the shifts : edges : vector of the K id of edges where the shifts are values : matrix p x K of values of the shifts on the edges (one column = one shift) relativeTimes : vector of dimension K of relative time of the shift from the parent node of edges |
Value
paramSimu: array p x Nnode x 2 (BM). For each trait t, 1 <= t <= p, paramSimu[t, , ] has two columns, both containing the expected values for all the nodes.
Compute fixed moments for E step.
Description
compute_fixed_moments
compute the fixed matrices used in E step when
there is no missing data.
Usage
compute_fixed_moments(times_shared, ntaxa)
Arguments
times_shared |
matrix of the tree, result of function
|
Value
F_means the matrix to use for actualization of means.
F_vars the matrix to use for the actualization of variances.
Do a lm on top of a lasso regression.
Description
compute_gauss_lasso
takes the variables selected by a lasso procedure, and
uses them to do a simple linear least square regression. Function used is
lm
for non-transformed data (root = NULL), and lm.fit
for
transformed data (root = an integer).
Usage
compute_gauss_lasso(
Ypt,
Xp,
delta,
root,
projection = which(rowSums(delta) != 0)
)
Arguments
Ypt |
(transformed) data |
Xp |
(transformed) matrix of regression |
delta |
regression coefficients obtained with a lasso regression |
root |
the position of the root (intercept) in delta |
Details
Depending on the value of root, the behavior is different. If root is null, then we fit a linear regression with an intercept. If root is equal to an integer, then the "intercept" column of the matrix Xp (that has possibly been trough a multiplication with a Cholesky decomposition of the variance) is included, rather than the intercept.
Value
Named list, with "E0.gauss" the intercept (value at the root); "shifts.gauss" the list of shifts found on the branches; and "residuals" the residuals of the regression
Log Likelihood
Description
compute_log_likelihood.simple
computes the log-likelihood of the data
in the simple case where the inverse of the variance matrix is given.
Usage
compute_log_likelihood.simple(
phylo,
Y_data_vec,
sim,
Sigma,
Sigma_YY_chol_inv,
miss = rep(FALSE, dim(sim)[1] * length(phylo$tip.label)),
masque_data = c(rep(TRUE, dim(sim)[1] * length(phylo$tip.label)), rep(FALSE,
dim(sim)[1] * phylo$Nnode)),
...
)
Arguments
phylo |
Input tree. |
Sigma |
: variance-covariance matrix, result of function |
Details
This function takes parameters sim, Sigma and Sigma_YY_inv from
compute_mean_variance.simple
. It uses functions
extract_variance_covariance
, extract_covariance_parents
, and
extract_simulate_internal
to extract the needed quantities from these objects.
Value
log likelihood of the data
Squared Mahalanobis Distance
Description
compute_mahalanobis_distance.simple
computes the squared Mahalanobis distance
between the data and mean at tips of the data in the simple case where the inverse
of the variance matrix is given.
Usage
compute_mahalanobis_distance.simple(
phylo,
Y_data_vec,
sim,
Sigma_YY_chol_inv,
miss = rep(FALSE, dim(sim)[1] * length(phylo$tip.label)),
...
)
Arguments
phylo |
Input tree. |
Y_data_vec |
: vector indicating the data at the tips. |
sim |
(list) : result of function |
Sigma_YY_chol_inv |
: invert of the cholesky variance-covariance matrix of the data. |
Details
This function takes parameters sim and Sigma_YY_inv from
compute_mean_variance.simple
. It uses function extract_simulate_internal
to extract the needed quantities from these objects.
Value
squared Mahalanobis distance between data and mean at the tips.
Compute moments of params_old
Description
compute_mean_variance.simple
computes the quantities needed to compute
mean and variance matrix with parameters params_old.
Usage
compute_mean_variance.simple(
phylo,
times_shared,
distances_phylo,
process = c("BM", "OU", "rBM", "scOU"),
params_old,
masque_data = c(rep(TRUE, attr(params_old, "p_dim") * length(phylo$tip.label)),
rep(FALSE, attr(params_old, "p_dim") * phylo$Nnode)),
sim = NULL,
U_tree = NULL,
...
)
Arguments
phylo |
Input tree. |
times_shared |
(matrix) : times of shared ancestry, result of function
|
distances_phylo |
(matrix) : phylogenetic distance, result of function
|
process |
a two letter string indicating the process to consider |
params_old |
a list of parameters to be used in the computations |
Details
This function is used by functions compute_E.simple
and
compute_log_likelihood.simple
.
Value
sim (list) : result of function simulate
with the appropriate
parameters
Sigma matrix of variance covariance, result of function
compute_variance_covariance
Sigma_YY_chol_inv invert of Cholesky matrix of Sigma_YY: (Sigma_YY)^(-1) = tcrossprod(Sigma_YY_chol_inv)
Residuals
Description
compute_residuals.simple
computes the residuals after the fit of the
data in the simple case where the inverse of the variance matrix is given.
Usage
compute_residuals.simple(phylo, Y_data_vec, sim, Sigma_YY_chol_inv, miss)
Arguments
phylo |
Input tree. |
Y_data_vec |
: vector indicating the data at the tips. |
sim |
(list) : result of function |
Sigma_YY_chol_inv |
: invert of the Cholesky variance-covariance matrix of the data. |
Details
This function takes parameters sim and Sigma_YY_inv from
compute_mean_variance.simple
. It uses function extract_simulate_internal
to extract the needed quantities from these objects.
Computation of shifts from the vector of optimal values
Description
compute_shifts_from_betas
computes the list of shifts corresponding
to the vector of optimal values on nodes.
Usage
compute_shifts_from_betas(phylo, betas)
Arguments
phylo |
a phylogenetic tree, class |
betas |
vector of size (ntaxa + Nnode) of the optimal values at each node and tip. |
Details
This function uses function fun on each row of matrix of edges.
Value
vector of shifts.
List of potential daughter states when parent is in state k.
Description
compute_state_filter
compute the admissible daughters states, i.e. states that
realize the minimum cost for the tree parent -> daughter -> subtree(daughter), when
the parent node is in state k.
Usage
compute_state_filter(cost, k)
Arguments
cost |
a (ndaughters) x (nclus) matrix of the cost of each state for the daughters nodes. |
k |
the parental state considered. |
Details
This function is used in functions parsimonyNumber
and enumerate_parsimony
.
Value
A (ndaughters) x (nclus) binary matrix indicating the admissible states for the daughters node when parent node is in state k.
Compute the stationary variance matrix
Description
compute_stationary_variance
computes the stationary variance matrix of
an OU process.
Usage
compute_stationary_variance(variance, selection.strength)
Arguments
variance |
the variance (rate matrix) of the process. |
selection.strength |
the selection strength (alpha) matrix of the process. |
Value
A positive definite Matrix of class dpoMatrix-class
.
Compute weighted sum of var_diff
Description
compute_sum_var_diff
computes sum_{e edge} ell_j * Var[X_j - X_pa(j) | Y]
Usage
compute_sum_var_diff(phylo, var_diff)
Arguments
phylo |
a phylogenetic tree |
var_diff |
result of function |
Value
matrix p x p
Common Ancestors Times
Description
compute_times_ca
computes the times t_ij between the root and the common
ancestor of two tips i, j.
Usage
compute_times_ca(phy)
Arguments
phy |
a phylogenetic tree of class |
Details
This function relies on ape
functions
node.depth.edgelength
and mrca
.
Value
a matrix of times of shared evolution, ordered as the tips of the
tree. The matrix is of type symmetricMatrix-class
.
Matrix of tree-induced correlations for the BM
Description
compute_tree_correlations_matrix.BM
returns times_shared its provided argument.
Usage
compute_tree_correlations_matrix.BM(times_shared, params_old, ...)
Arguments
times_shared |
times of shared ancestry of all nodes and tips, result of function
|
Value
times_shared
Matrix of tree-induced correlations for the scOU
Description
compute_tree_correlations_matrix.scOU
computes the (n+m)x(m+n) matrix of correlations
induced by the tree. It takes two cases in consideration: root fixed, or root in stationary
state.
Usage
compute_tree_correlations_matrix.scOU(
times_shared,
distances_phylo,
params_old,
...
)
Arguments
times_shared |
times of shared ancestry of all nodes and tips, result of function
|
distances_phylo |
(matrix) : phylogenetic distance, result of function
|
params_old |
(list) : old parameters to be used in the E step |
Value
matrix of variance covariance for the scOU
Computation of the variance.
Description
compute_var_M.BM
finds the variance that is the maximum of likelihood
Usage
compute_var_M.BM(phylo, var_diff, diff_exp, edges_max, random.root, mu)
Arguments
phylo |
Tree |
var_diff |
variances of differences result of function |
diff_exp |
differences of expectations result of function |
edges_max |
Edges where the shifts occur result of function |
Details
Given the variances, the costs0 and the edges where the shifts occurs, the computation of the maximum of likelihood in the variance is simple.
Value
a p x p matrix : the computed variance
Compute variances of differences between nodes and parents.
Description
compute_var_diff
computes variances of differences between all the
nodes and their parents.
Usage
compute_var_diff.BM(phylo, conditional_law_X)
Arguments
phylo |
a phylogenetic tree |
conditional_law_X |
result of function |
Value
array p x p x Nedge containing, for each edge e finishing at node i, the quantity Var[Z_i-Z_pa(i)|Y].
Tips Variances for the BM
Description
compute_variance_block_diagonal.BM
computes the n p*p variance
matrices of each tip vector.
Usage
compute_variance_block_diagonal.BM(times_shared, params_old, ntaxa, ...)
Arguments
times_shared |
times of shared ancestry of all nodes and tips, result
of function |
params_old |
(list) : old parameters to be used in the E step |
Value
p * p * ntaxa array with ntaxa variance matrices
Tips Variances for the OU
Description
compute_variance_block_diagonal.OU
computes the n p*p variance
matrices of each tip vector.
Usage
compute_variance_block_diagonal.OU(times_shared, params_old, ntaxa, ...)
Arguments
times_shared |
times of shared ancestry of all nodes and tips, result
of function |
params_old |
(list) : old parameters to be used in the E step |
Value
p * p * ntaxa array with ntaxa variance matrices
Complete variance covariance matrix for BM
Description
compute_variance_covariance.BM
computes the (n+m)*p squared variance covariance
matrix of vec(X).
Usage
compute_variance_covariance.BM(times_shared, params_old, ...)
Arguments
times_shared |
times of shared ancestry of all nodes and tips, result of function
|
params_old |
(list) : old parameters to be used in the E step |
Value
matrix of variance covariance for the BM
Complete variance covariance matrix for OU
Description
compute_variance_covariance.OU
computes the (n+m)*p squared variance
covariance matrix of vec(X).
Usage
compute_variance_covariance.OU(times_shared, distances_phylo, params_old, ...)
Arguments
times_shared |
times of shared ancestry of all nodes and tips, result
of function |
params_old |
(list) : old parameters to be used in the E step |
Value
matrix of variance covariance for the OU
Complete variance covariance matrix for OU
Description
compute_variance_covariance.OU
computes the (n+m)*p squared variance
covariance matrix of vec(X).
Usage
compute_variance_covariance.OU.nonsym(
times_shared,
distances_phylo,
params_old,
...
)
Arguments
times_shared |
times of shared ancestry of all nodes and tips, result
of function |
params_old |
(list) : old parameters to be used in the E step |
Value
matrix of variance covariance for the OU
Complete variance covariance matrix for scOU
Description
compute_variance_covariance.scOU
computes the (n+m)*p squared variance covariance
matrix of vec(X).
Usage
compute_variance_covariance.scOU(
times_shared,
distances_phylo,
params_old,
...
)
Arguments
times_shared |
times of shared ancestry of all nodes and tips, result of function
|
distances_phylo |
(matrix) : phylogenetic distance, result of function
|
params_old |
(list) : old parameters to be used in the E step |
Value
matrix of variance covariance for the scOU
Correspondence between edges numbers
Description
correspondenceEdges
takes edges numbers on an input tree, and gives back
their corresponding numbers on the output tree.
Usage
correspondenceEdges(edges, from, to)
Arguments
edges |
vector of index of edges in the tree "from" |
from |
initial input tree (format " |
to |
aimed output tree (format " |
Value
vector of index of edges in the tree "to"
Make the result lighter
Description
enlight.PhyloEM
takes an object of class PhyloEM
,
and returns the same object, without saving the quantities that can be easily
re-computed using function imputed_traits.PhyloEM
.
Usage
enlight(x)
## S3 method for class 'PhyloEM'
enlight(x)
Arguments
x |
an object of class |
Details
The resulting object can be much lighter, saving a lot of memory space, but each
call to the function imputed_traits.PhyloEM
will be longer. As
function plot.PhyloEM
relies on this function, this makes the
plotting also longer.
This has the same effect as setting the option "light_result=TRUE
" in the
call of PhyloEM
.
Value
Same as entry, lighter.
Methods (by class)
-
enlight(PhyloEM)
:PhyloEM
object
See Also
PhyloEM
, imputed_traits.PhyloEM
,
plot.PhyloEM
Enumerate all the possible regime allocations, given a clustering of the tips.
Description
enumerate_parsimony
enumerate all the equivalent allocation of the
regimes in the tree, a clustering of the tips being given. The number of such
equivalent regimes is given by parsimonyNumber
(which is faster).
Usage
enumerate_parsimony(phylo, clusters = rep(1, length(phylo$tip.label)))
Arguments
phylo |
a phylogenetic tree, class |
clusters |
a vector representing the group of each tip. (Default to only one group with all the tips.) |
Details
Function extract.enumerate_parsimony
furnishes the result in a
human readable form (for any subtree).
Function plot.enumerate_parsimony
plots all the solutions found on
the tree.
Value
an S3 object of class "enumerate_parsimony
", with:
- nbrReconstructions
an object of class "
parsimonyCost
", result of functionparsimonyCost
.- allocations
a list of size Nnode + ntaxa. Each entry i of the list represents the solutions for the subtree starting at node i. It is a list with nclus entries, each entry being a matrix. A line of the kth matrix for the ith node is one possible allocation of the shifts, starting with regime k for node i.
- phylo
the entry phylogenetic tree
See Also
extract.enumerate_parsimony
,
plot.enumerate_parsimony
, parsimonyCost
,
parsimonyNumber
, partitionsNumber
,
equivalent_shifts
Examples
tree <- read.tree(text="(((A,B),C),D);")
plot(tree)
clusters <- c(0, 1, 2, 2)
sols <- enumerate_parsimony(tree, clusters)
plot(sols)
## Extract the parsimonious solutions from the root
extract(sols) # each line is a solution, with states of each node
## Extract the number of solutions from the root
extract(sols, what = "number")
extract(parsimonyNumber(tree, clusters)) # same result, more efficient
## Extract the cost of the solutions from the root
extract(sols, what = "cost")
extract(parsimonyCost(tree, clusters)) # same result, more efficient:
## Extract for the sub-tree below node 7
extract(sols, 7) # NAs: non-existing nodes in the sub-tree
Tips descendants of nodes.
Description
enumerate_tips_under_edges
gives, for each edge of the tree, the labels
of the tips that have this edge as an ancestor.
Usage
enumerate_tips_under_edges(tree)
Arguments
tree |
phylogenetic tree, class |
Details
This function uses function prop.part
from package ape
.
Value
list of size Nedge, entry i is the vector of tips bellow edge i.
Find all equivalent shifts allocations and values.
Description
equivalent_shifts
computes the equivalent shifts positions and their
corresponding values, assuming an ultrametric tree.
Usage
equivalent_shifts(
phylo,
params,
T_tree = incidence.matrix(phylo),
part.list = enumerate_tips_under_edges(phylo),
times_shared = NULL
)
Arguments
phylo |
a phylogenetic tree, of class |
params |
an object of class |
T_tree |
(optional) matrix of incidence of the tree, result of function
|
part.list |
(optional) list of partition of the tree, result of function
|
times_shared |
(optional) a matrix, result of function
|
Details
This function is only valid for ultrametric trees, and for models: BM, OU with fixed root or stationary root. It assumes that there are no homoplasies.
Value
object of class equivalent_shifts
, with entries:
- eq_shifts_edges
matrix of equivalent shifts
- shifts_and_betas
matrix of corresponding shifts values
- phylo
the entry phylogenetic tree
- p
the dimension
See Also
plot.equivalent_shifts
,
extract.equivalent_shifts
, params_BM
,
params_OU
, enumerate_parsimony
Examples
if (requireNamespace("TreeSim", quietly = TRUE)) {
## Simualte a tree
set.seed(17920902)
ntaxa = 20
phylo <- TreeSim::sim.bd.taxa.age(n = ntaxa, numbsim = 1, lambda = 0.1,
mu = 0, age = 1, mrca = TRUE)[[1]]
## Define parameters (BM, fixed root)
params <- params_BM(p = 4, edges = c(6, 17, 31),
values = cbind(1:4, -(1:4), rep(1, 4)))
## Find equivalent solutions and plot them
eq_shifts <- equivalent_shifts(phylo, params)
eq_shifts
plot(eq_shifts)
## Extract the values
# Shifts values for trait 2, for the three shifts (rows), and three solutions (columns)
extract(eq_shifts, trait = 2, what = "shifts_values")
# Root values for trait 4, for the tree solutions (columns)
extract(eq_shifts, trait = 4, what = "root_values")
## Define parameters (OU, stationary root
params <- params_OU(p = 4, edges = c(6, 17, 31),
selection.strength = 0.1,
values = cbind(1:4, -(1:4), rep(1, 4)),
random = TRUE)
## Find equivalent solutions and plot them
eq_shifts <- equivalent_shifts(phylo, params)
eq_shifts
plot(eq_shifts)
## Extract the values
# Shifts values for trait 2, for the three shifts (rows), and three solutions (columns)
extract(eq_shifts, trait = 2, what = "shifts_values")
# Root values for trait 4, for the three solutions (columns)
extract(eq_shifts, trait = 4, what = "root_values")
}
Find all the equivalent shift edges allocations.
Description
equivalent_shifts_edges
uses function enumerate_parsimony
to find all the shifts positions that are equivalent to a given one.
Usage
equivalent_shifts_edges(
phylo,
shifts_edges,
part.list = enumerate_tips_under_edges(phylo)
)
Arguments
phylo |
a phylogenetic tree, class |
shifts_edges |
a vector of shifts positions on the edges of the tree. |
part.list |
(optional) list of partition of the tree, result of function
|
Details
This function is uses functions enumerate_parsimony
for the
actual computation of equivalent regimes,
clusters_from_shifts
for the clustering of the tips induced
by the original set of shifts given, and
allocate_shifts_from_regimes
to convert back a parametrization
in term of regimes to a parametrization in term of shifts.
Value
a matrix with as many columns as equivalent allocation, each column representing a possible parsimonious allocation of shifts on the tree.
See Also
equivalent_shifts
, enumerate_parsimony
Find values given edges. OU stationary case. Ultrametric tree.
Description
equivalent_shifts_values
computes the values of the shifts given all
the possible allocations computed by function
equivalent_shifts_edges
.
Usage
equivalent_shifts_values(phylo, eq_shifts_edges, T_tree_ac, vect_Y, p)
Arguments
phylo |
a phylogenetic tree, class |
eq_shifts_edges |
matrix (optional) result of function
|
T_tree_ac |
matrix of incidence of the tree, result of function
|
Details
This function uses the linear representation of the problem. It fist compute
the mean at the tips given by the original shifts positions and values, and
then uses function qr.solve
to find back the values of the shifts, given their various positions,
and the means at the tips. Function compute_actualization_factors
is
used to compute the actualization factor that multiplies the shifts values at
the tips. Careful, only works for ULTRAMETRIC trees.
Value
Named list, with "shifts_values" a matrix of shifts values corresponding to the shifts positions in eq_shifts_edges; and "betas_0" a vector of corresponding root optimal values.
Function to estimate alpha
Description
optimize
to maximize the
conditional expectation log likelihood in alpha. The interval is
set to [alpha_old/2, 2*alpha_old], supposing that the previous guess of
alpha_old is not far from reality.
Usage
estimate.alpha(
phylo,
conditional_law_X,
sigma2,
mu,
shifts,
alpha_old,
max_selection.strength
)
Arguments
phylo |
Input tree. |
conditional_law_X |
result of function |
sigma2 |
variance of params |
mu |
mean of the root state |
shifts |
list of shifts on the tree |
alpha_old |
previous estimation of the selection strength |
max_selection.strength |
the maximal value of alpha authorized by the user |
Details
This function uses functions compute_var_diff.OU
and compute_diff_exp.OU
in the process. Careful : only works if the
root is stationary, and shifts at nodes.
Value
double : estimation of alpha
Perform One EM
Description
EstimateEM
performs one EM for one given number of shifts. It is called
from function PhyloEM
. Its use is mostly internal, and most user
should not need it.
Usage
estimateEM(
phylo,
Y_data,
Y_data_imp = Y_data,
process = c("BM", "OU", "scOU", "rBM"),
independent = FALSE,
tol_EM = list(variance = 10^(-2), value.root = 10^(-2), exp.root = 10^(-2), var.root =
10^(-2), selection.strength = 10^(-2), normalized_half_life = 10^(-2), log_likelihood
= 10^(-2)),
Nbr_It_Max = 500,
method.variance = c("simple", "upward_downward"),
method.init = c("default", "lasso"),
method.init.alpha = c("default", "estimation"),
method.init.alpha.estimation = c("regression", "regression.MM", "median"),
nbr_of_shifts = 0,
random.root = TRUE,
stationary.root = TRUE,
alpha_known = FALSE,
eps = 10^(-3),
known.selection.strength = 1,
init.selection.strength = 1,
max_selection.strength = 100,
use_sigma_for_lasso = TRUE,
max_triplet_number = 10000,
min_params = list(variance = 0, value.root = -10^(5), exp.root = -10^(5), var.root = 0,
selection.strength = 0),
max_params = list(variance = 10^(5), value.root = 10^(5), exp.root = 10^(5), var.root =
10^(5), selection.strength = 10^(5)),
var.init.root = diag(1, nrow(Y_data)),
variance.init = diag(1, nrow(Y_data), nrow(Y_data)),
methods.segmentation = c("lasso", "same_shifts", "best_single_move"),
check.tips.names = FALSE,
times_shared = NULL,
distances_phylo = NULL,
subtree.list = NULL,
T_tree = NULL,
U_tree = NULL,
h_tree = NULL,
F_moments = NULL,
tol_half_life = TRUE,
warning_several_solutions = TRUE,
convergence_mode = c("relative", "absolute"),
check_convergence_likelihood = TRUE,
sBM_variance = FALSE,
method.OUsun = c("rescale", "raw"),
K_lag_init = 0,
allow_negative = FALSE,
trait_correlation_threshold = 0.9,
...
)
Arguments
phylo |
A phylogenetic tree of class |
Y_data |
Matrix of data at the tips, size p x ntaxa. Each line is a trait, and each column is a tip. The column names are checked against the tip names of the tree. |
Y_data_imp |
(optional) imputed data if previously computed, same format as
|
process |
The model used for the fit. One of "BM" (for a full BM model, univariate or multivariate); "OU" (for an OU with independent traits, univariate or multivariate); or "scOU" (for a "scalar OU" model, see details). |
independent |
Are the trait assumed to be independent from one another? Default to FALSE. OU in a multivariate setting only works if TRUE. |
tol_EM |
the tolerance for the convergence of the parameters. A named list, with items:
|
Nbr_It_Max |
the maximal number of iterations of the EM allowed. Default to 500 iterations. |
method.variance |
Algorithm to be used for the moments computations at the E step. One of "simple" for the naive method; of "upward_downward" for the Upward Downward method (usually faster). Default to "upward_downward". |
method.init |
The initialization method. One of "lasso" for the LASSO base initialization method; or "default" for user-specified initialization values. Default to "lasso". |
method.init.alpha |
For OU model, initialization method for the selection
strength alpha. One of "estimation" for a cherry-based initialization, using
|
method.init.alpha.estimation |
If method.init.alpha="estimation",
choice of the estimation(s) methods to be used. Choices among "regression",
(method="M" is passed to |
nbr_of_shifts |
the number of shifts allowed. |
random.root |
whether the root is assumed to be random (TRUE) of fixed (FALSE). Default to TRUE |
stationary.root |
whether the root is assumed to be in the stationary state. Default to TRUE. |
alpha_known |
is the selection strength assumed to be known ? Default to FALSE. |
eps |
tolerance on the selection strength value before switching to a BM. Default to 10^(-3). |
known.selection.strength |
if |
init.selection.strength |
(optional) a starting point for the selection strength value. |
max_selection.strength |
the maximal value allowed of the selection strength. Default to 100. |
use_sigma_for_lasso |
whether to use the first estimation of the variance matrix in the lasso regression. Default to TRUE. |
max_triplet_number |
for the initialization of the selection strength value (when estimated), the maximal number of triplets of tips to be considered. |
min_params |
a named list containing the minimum allowed values for the parameters. If the estimation is smaller, then the EM stops, and is considered to be divergent. Default values:
|
max_params |
a named list containing the maximum allowed values for the parameters. If the estimation is larger, then the EM stops, and is considered to be divergent. Default values:
|
var.init.root |
optional initialization value for the variance of the root. |
variance.init |
optional initialization value for the variance. |
methods.segmentation |
For OU, method(s) used at the M step to find new candidate shifts positions. Choices among "lasso" for a LASSO-based algorithm; and "best_single_move" for a one-move at a time based heuristic. Default to both of them. Using only "lasso" might speed up the function a lot. |
check.tips.names |
whether to check the tips names of the tree against the column names of the data. Default to TRUE. |
times_shared |
(optional) times of shared ancestry of all nodes and tips,
result of function |
distances_phylo |
(optional) phylogenetic distances, result of function
|
subtree.list |
(optional) tips descendants of all the edges, result of
function |
T_tree |
(optional) matrix of incidence of the tree, result of function
|
U_tree |
(optional) full matrix of incidence of the tree, result of function
|
h_tree |
(optional) total height of the tree. |
F_moments |
(optional, internal) |
tol_half_life |
should the tolerance criterion be applied to the phylogenetic half life (TRUE, default) or to the raw selection strength ? |
warning_several_solutions |
whether to issue a warning if several equivalent solutions are found (default to TRUE). |
convergence_mode |
one of "relative" (the default) or "absolute". Should the tolerance be applied to the raw parameters, or to the renormalized ones ? |
check_convergence_likelihood |
should the likelihood be taken into consideration for convergence assessment ? (default to TRUE). |
sBM_variance |
Is the root of the BM supposed to be random and "stationary"? Used for BM equivalent computations. Default to FALSE. |
method.OUsun |
Method to be used in univariate OU. One of "rescale" (rescale the tree to fit a BM) or "raw" (directly use an OU, only available for univariate processes). |
K_lag_init |
Number of extra shifts to be considered at the initialization step. Increases the accuracy, but can make computations quite slow of taken too high. Default to 5. |
allow_negative |
whether to allow negative values for alpha (Early Burst).
See documentation of |
trait_correlation_threshold |
the trait correlation threshold to stop the analysis. Default to 0.9. |
... |
Further arguments to be passed to |
Details
See documentation of PhyloEM
for further details.
All the parameters monitoring the EM (like tol_EM
, Nbr_It_Max
, etc.)
can be called from PhyloEM
.
Value
An object of class EstimateEM
.
See Also
Extraction function
Description
extract
the needed quantities out of an S3 object.
Usage
extract(x, ...)
Arguments
x |
an S3 object. |
... |
further arguments to be passed to the specific method. |
Value
An integer giving the number of equivalent parsimonious solutions.
See Also
extract.parsimonyNumber
,
extract.parsimonyCost
,
extract.enumerate_parsimony
,
extract.partitionsNumber
Extract the result of enumerate_parsimony
at a node.
Description
extract.enumerate_parsimony
returns a matrix containing all the
possible regime allocations for the nodes of a given subtree.
Usage
## S3 method for class 'enumerate_parsimony'
extract(
x,
node = attr(x$allocations, "ntaxa") + 1,
what = c("solutions", "number", "cost"),
...
)
Arguments
x |
an object of class " |
node |
the node where to retrieve the parsimony number. Default to the root of the tree. |
what |
the quantity to retrieve. Either "solutions" for the full solutions, "number" for the number of solutions, or "cost" for the minimal cost of a solution. Default to "solutions" |
... |
unused |
Value
A matrix with ntaxa + Nnode columns, and as many rows as the number of possible parsimonious reconstructions.
See Also
enumerate_parsimony
, plot.enumerate_parsimony
Extract the shifts values for one trait.
Description
extract.equivalent_shifts
takes an object of class
equivalent_shifts
, result of function equivalent_shifts
,
and returns the shifts of root values for a given trait.
Usage
## S3 method for class 'equivalent_shifts'
extract(x, trait = 1, what = c("shifts_values", "root_values"), ...)
Arguments
x |
an object of class |
trait |
the number of the trait to be extracted. Default to 1. |
what |
one of "shifts_values" or "root_values". |
... |
unused. |
Value
A matrix with the values of the shifts (what = "shifts_values"
) or
the root (what = "root_values"
) for the trait for each equivalent
configuration. Each column is one configuration.
See Also
equivalent_shifts
, plot.equivalent_shifts
,
equivalent_shifts_edges
Extraction of the actual number of solutions.
Description
extract.parsimonyCost
takes an object of class "parsimonyCost
",
result of function parsimonyCost
, and computes the minimum cost
at the given node.
Usage
## S3 method for class 'parsimonyCost'
extract(x, node = attr(x, "ntaxa") + 1, ...)
Arguments
x |
an object of class " |
node |
the root node of the subtree. By default, the root of the tree. |
... |
unused |
Value
An integer giving the minimum cost of the subtree.
See Also
Extraction of the actual number of solutions.
Description
extract.parsimonyNumber
takes the two matrices computed by
parsimonyNumber
, and compute the actual number of parsimonious
solution for any subtree starting from a given node.
Usage
## S3 method for class 'parsimonyNumber'
extract(
x,
node = attr(x$nbrReconstructions, "ntaxa") + 1,
what = c("number", "cost"),
...
)
Arguments
x |
an object of class " |
node |
the root node of the subtree. By default, the root of the tree. |
what |
the quantity to retrieve. Either "number" for the number of solutions, or "cost" for the minimal cost of a solution. Default to "number". |
... |
unused |
Details
The parsimonious solutions are the one with the minimum number of shifts (that are given by matrix costReconstructions). This function sums the number of solutions (given in matrix nbrReconstructions) that have the minimum number of shifts.
Value
An integer giving the number of equivalent parsimonious solutions.
See Also
Extract from object partitionsNumber
Description
extract.partitionsNumber
extracts the number of partitions for a
given sub-tree, either marked or non-marked.
Usage
## S3 method for class 'partitionsNumber'
extract(
x,
node = attr(x, "ntaxa") + 1,
npart = attr(x, "npart"),
marked = FALSE,
...
)
Arguments
x |
an object of class |
node |
the root node of the subtree where to get the result. Default to the root of the tree. |
npart |
the number of partitions (colors) allowed at the tips.
Default to the value used in the call of function
|
marked |
whether to extract the marked (TRUE) or un-marked (FALSE) partitions. The number of models is the number of un-marked partitions. Default to FALSE. |
... |
unused. |
Value
the number of partitions with npart colors, on the sub-tree starting at node, marked or not.
See Also
Extraction of simulated traits
Description
extract.simul_process
takes an object of class "simul_process
",
result of function simul_process
, and extracts the traits values,
expectations or optimal values at the tips or the internal nodes.
Usage
## S3 method for class 'simul_process'
extract(
x,
where = c("tips", "nodes"),
what = c("states", "expectations", "optimal.values"),
...
)
Arguments
x |
an object of class " |
where |
one of "tips" (the default) or "nodes". Where to extract the results. |
what |
one of "states" (the default), "expectation", or "optimal.values". |
... |
unused |
Details
##
Value
A matrix giving the selected quantities at the selected nodes or tips. If the tips or nods are labeled, then the colnames of the matrix are set accordingly.
See Also
Extract sub-matrices of variance.
Description
extract_variance_covariance
return the adequate sub-matrix.
Usage
extract_variance_covariance(
struct,
what = c("YY", "YZ", "ZZ"),
masque_data = c(rep(TRUE, attr(struct, "ntaxa") * attr(struct, "p_dim")), rep(FALSE,
(dim(struct)[1] - attr(struct, "ntaxa")) * attr(struct, "p_dim")))
)
Arguments
struct |
structural matrix of size (ntaxa+Nnode)*p, result
of function |
what |
sub-matrix to be extracted: "YY" : sub-matrix of tips (p*ntaxa first lines and columns) "YZ" : sub matrix tips x nodes (p*Nnode last rows and p*ntaxa first columns) "ZZ" : sub matrix of nodes (p*Nnode last rows and columns) |
masque_data |
Mask of missing data |
Value
sub-matrix of variance covariance.
Find a reasonable grid for alpha
Description
Grid so that 2*ln(2)*quantile(d_ij)/factor_up_alpha < t_1/2 < factor_down_alpha * ln(2) * h_tree, with t_1/2 the phylogenetic half life: t_1/2 = log(2)/alpha. Ensures that for alpha_min, it is almost a BM, and for alpha_max, almost all the tips are decorrelated.
Usage
find_grid_alpha(
phy,
alpha = NULL,
nbr_alpha = 10,
factor_up_alpha = 2,
factor_down_alpha = 3,
quantile_low_distance = 1e-04,
log_transform = TRUE,
allow_negative = FALSE,
...
)
Arguments
phy |
phylogenetic tree of class " |
alpha |
fixed vector of alpha values if already known. Default to NULL. |
nbr_alpha |
the number of elements in the grid |
factor_up_alpha |
factor for up scalability |
factor_down_alpha |
factor for down scalability |
quantile_low_distance |
quantile for min distance |
log_transform |
whether to take a log scale for the spacing of alpha values. Default to TRUE. |
allow_negative |
whether to allow negative values for alpha (Early Burst).
See documentation of |
... |
not used. |
Details
If quantile_low_distance=0
, then quantile(d_ij)=min(d_ij)
, and, for any
two tips i,j, the correlation between i and j is bounded by exp(-factor_up_alpha/2).
Those values of alpha will be used for the re-scaling of the tree, which has an
exponential term in exp(2*alpha*h). The function makes sure that this number is
below the maximal float allowed (equals to .Machine$double.xmax
).
Value
A grid of alpha values
See Also
transform_branch_length
, .Machine
Given a regularization path, find K selected independent variables.
Description
find_independent_regression_vectors
tries to find a situation where K variables
are selected, so that the selected columns of matrix Xp are independent.
Usage
find_independent_regression_vectors.glmnet_multivariate(Xp, K, fit, root)
Arguments
Xp |
(transformed) matrix of regression |
K |
number of non-zero components allowed |
root |
integer, position of the root column (intercept) excluded from the fit. null if no root column. |
Details
To do that, if a set of selected is not independent, we go back to the previous selected variables, and forbid the moves that led to non-independence for the rest of the path.
Value
delta a vector of regression with K non-zero coefficients.
Test for rotation invariant datasets
Description
find_rotation
takes two fits from from PhyloEM
,
and test if their datasets are equal up to a rotation.
Usage
find_rotation(res1, res2, tol = NULL)
Arguments
res1 |
an object of class |
res2 |
an object of class |
tol |
relative numerical tolerance. Default to |
Value
If appropriate, the rotation matrix rot such that dat1 = rot
Find values given edges. OU stationary case. Ultrametric tree.
Description
find_actualized_shift_values
computes the values of the shifts their
positions, and the mean values at the tips.
Warning : this function does not check for consistency. Please make sure that the
shifts positions and the mean values are compatible.
Usage
find_shift_values(shifts_edges, T_tree_ac, vect_Y, p, ntaxa)
Arguments
shifts_edges |
a vector of positions of shifts on the tree. |
T_tree_ac |
matrix of incidence of the tree, result of function
|
Details
This function uses qr.solve
for rectangular linear system solving.
Value
vector, with first entry the values at the root, and other entries the values of the shifts.
Run the EM for several values of K
Description
estimateEM_several_K.OUsr
uses function estimateEM
on the data,
for all values of K between 0 and K_max.
Usage
format_output(results_estim_EM, phylo, time = NA)
Arguments
results_estim_EM |
output of function |
time |
to run the function |
Details
The EM is first launched for K=0, with alpha and gamma estimated. The
estimated values of alpha, gamma and beta_0 found by this first EM are then
used as initialization parameters for all the other runs of the EM for other
K.
The EMs are parallelized thanks to packages foreach
and
doParallel
.
WARNING : this code only work of OU with stationary root, on an ultrametric
tree.
Value
summary a data frame with K_max lines, and columns: - alpha_estim the estimated selection strength - gamma_estim the estimated root variance - beta_0_estim the estimated value of root optimum - EM_steps number of iterations needed before convergence - DV_estim has the EM diverged ? - CV_estim has the EM converged ? - log_likelihood log likelihood of the data using the estimated parameters - mahalanobis_distance_data_mean the Mahalanobis distance between the data and the estimated means at the tips - least_squares the Mahalanobis distance, renormalized by gamma^2: mahalanobis_distance_data_mean * gamma_estim. - mean_number_new_shifts the mean number of shifts that changed over the iterations of the EM - number_equivalent_solutions the number of equivalent solutions to the solution found. - K_try the number of shifts allowed. - complexity the complexity for K_try - time the CPU time needed.
params a list of inferred parameters
params_init a list of initial parameters
alpha_0 initial values of alpha
gamma_0 initial values of gamma
Zhat reconstructed node states
m_Y_estim reconstructed tip states
edge.quality for each edge, relative number of iterations in which they were present.
Get Model Selection Criterion
Description
This function takes an object of class PhyloEM
, result of function
PhyloEM
, and return the values of the model selection criterion
for each value of K.
Usage
get_criterion(res, method.selection = NULL)
Arguments
res |
an object of class |
method.selection |
select the parameters to plot. One of "LINselect", "DDSE",
"Djump" or "likelihood" (for un-penalized likelihood). Default to "LINselect". See
|
Value
A named vector with the values of the criterion for each number of shift K.
See Also
params_process.PhyloEM
, plot.PhyloEM
, plot_criterion
Get variance matrix of a node
Description
get_variance_node
returns the conditional variance of a node, or the conditional
covariance of a node and its parent.
Usage
get_variance_node(node, vars)
Arguments
node |
for which to extract the matrix. |
vars |
matrix of size p x p*(ntaxa+Nnode) result of function |
Value
sub-matrix of variance for the node.
Scale the parameters back to the original process
Description
go_back_to_original_process
takes the inferred parameters with a BM
on a rescaled tree, and gives back the equivalent parameters of the OU on
the original process.
Usage
go_back_to_original_process(
phy_original,
known.selection.strength,
sBM_variance,
params
)
Arguments
phy_original |
the original phylogenetic tree |
known.selection.strength |
the known selection strength of the original OU. |
sBM_variance |
boolean. Is the root random ? |
params |
the inferred parameters of the BM on the re-scaled tree. |
Value
params_scOU the equivalent parameters of the OU on the original tree.
Ancestral State Reconstruction
Description
imputed_traits.PhyloEM
takes an object of class PhyloEM
,
and returns the imputed traits values, either at the internal nodes (ancestral
state reconstruction) or at the tips (data imputation)
Usage
imputed_traits(x, ...)
## S3 method for class 'PhyloEM'
imputed_traits(
x,
trait = 1,
save_all = FALSE,
where = c("nodes", "tips"),
what = c("imputed", "variances", "expectations"),
params = NULL,
method.selection = NULL,
reconstructed_states = NULL,
...
)
Arguments
x |
an object of class |
... |
further arguments to be passed on to
|
trait |
an integer giving the trait to extract. Default to 1. |
save_all |
if TRUE, arguments |
where |
either "nodes" for ancestral state reconstruction, or "tips" for data imputation. |
what |
the quantity to retrieve. Either the imputed traits (default), their conditional variances, or the simple expectations under the selected process. |
params |
(optional) some user-specified parameters.
Must be of class |
method.selection |
(optional) the method selection to be used. One of "LINselect", "DDSE", "Djump". Default to "LINselect". |
reconstructed_states |
if the reconstructed states have already been
computed (by a previous call of the function, with |
Value
A matrix or array with the computed quantities.
Methods (by class)
-
imputed_traits(PhyloEM)
:PhyloEM
object
See Also
params_process.PhyloEM
, PhyloEM
Incidence matrix of a tree.
Description
incidence.matrix
computes the incidence matrix T of a tree : for a
lineage i and a branch b, T[i,b]=1 if b is in the lineage i, and 0
otherwise.
Usage
incidence.matrix(phylo)
Arguments
phylo |
a phylogenetic tree, class |
Value
Matrix of incidence, size Nedge x ntaxa.
See Also
Incidence matrix of a tree.
Description
incidence.matrix.full
computes the incidence matrix U of a tree : for a
node i and a branch b, U[i,b]=1 if b is in the lineage i, and 0
otherwise.
Usage
incidence.matrix.full(phylo)
Arguments
phylo |
a phylogenetic tree, class |
Value
Matrix of incidence, size ntaxa + Nnode.
See Also
Compute the actualization factors to apply to the incidence matrix.
Description
incidence_matrix_actualization_factors
computes a ntaxa x Nedge matrix of the
(1 - exp(-alpha * (t_i - t_pa(j) - nu_j * l_j)))_{i tip, j node}.
This matrix is to be multiplied to the incidence matrix with an outer product.
Usage
incidence_matrix_actualization_factors(
tree,
selection.strength,
relativeTimes_tree = 0,
times_shared = compute_times_ca(tree)
)
Arguments
tree |
a phylogenetic tree. |
selection.strength |
the selection strength of the process. |
relativeTimes_tree |
a Nedge vector of relative times associated with the branches. |
times_shared |
a matrix, result of function |
Value
Matrix of size ntaxa x Nedge
Initialization of the shifts using Lasso.
Description
init.EM.lasso
does the following regression :
||Y_data-T.delta||_(Sigma_YY^(-1)) + lambda |delta|_1
using the function glmnet::glmnet
of package glmnet
,
through function lasso_regression_K_fixed
.
T is the incidence matrix of the tree, and
delta the vectorial representation of the shifts (see functions
incidence.matrix
and shifts.list_to_vector
for further details).
Usage
init.EM.lasso(
phylo,
Y_data,
Y_data_imp = Y_data,
Y_data_vec_known = as.vector(Y_data),
process,
times_shared = compute_times_ca(phylo),
distances_phylo,
nbr_of_shifts,
K_lag_init = 0,
use_sigma = TRUE,
params_sigma = NULL,
variance.init = diag(1, p, p),
random.init = FALSE,
value.root.init = rep(0, p),
exp.root.init = rep(1, p),
var.root.init = diag(1, p, p),
edges.init = NULL,
values.init = matrix(0, p, length(edges.init)),
relativeTimes.init = NULL,
selection.strength.init = 1,
optimal.value.init = rep(0, p),
T_tree = incidence.matrix(phylo),
subtree.list = NULL,
miss = FALSE,
sBM_variance = FALSE,
stationary.root.init = FALSE,
masque_data,
independent = FALSE,
...
)
Arguments
Y_data |
data at the tips. |
times_shared |
(matrix) : times of shared ancestry, result of function
|
distances_phylo |
(matrix) : phylogenetic distance, result of function
|
nbr_of_shifts |
number of shifts used in the EM algorithm |
Details
A Cholesky decomposition of function Sigma_YY^(-1) is used. lambda is chosen so that delta has the right number of non zero components.
Value
params_init the list of initial parameters to be used, in the right format.
Initialization for the allocation of shifts.
Description
init.allocate_regimes_from_shifts
initialize the vector of regimes
at nodes and tips, with the regime (0) at the root.
Usage
init.allocate_regimes_from_shifts(phy, ...)
Arguments
phy |
Input tree. |
Details
This function is used in function allocate_regimes_from_shifts
and is
designed to furnish function update.allocate_regimes_from_shifts
with
the right structure of data.
Value
Matrix of size (Nnode + ntaxa)x1 of NAs, with the 0 at the root.
Initialization the selection strength alpha using robust estimation
Description
init.alpha.estimation
fits (Y_i-Y_j)^2 ~ gamma^2(1-exp(-alpha*d_ij))
for all couples of tips (i,j) that have the same mean, i.e than are not
separated by a shift. Shifts are initialized thanks to a lasso
(function init.EM.lasso
).
Usage
init.alpha.gamma.estimation(
phylo,
Y_data,
nbr_of_shifts,
times_shared,
distances_phylo,
T_tree,
subtree.list,
max_triplet_number,
alpha_known,
method.init.alpha.estimation,
tol_EM,
h_tree,
miss,
masque_data,
independent,
...
)
Arguments
phylo |
a phylogenetic tree, class |
Y_data |
data at the tips. |
nbr_of_shifts |
number of shifts used in the EM algorithm |
distances_phylo |
(matrix) : phylogenetic distance, result of function
|
Details
Function robustbase::nlrob
is used for the robust fit.
Value
params_init the list of initial parameters to be used, in the right format.
Initialization for the computation of the optimal values
Description
init.compute_betas_from_shifts
initialize the vector of optimal values at nodes and
tips, with the value at the root.
Usage
init.compute_betas_from_shifts(phy, optimal.value, ...)
Arguments
phy |
Input tree. |
optimal.value |
the optimal value at the root of the tree |
Details
This function is used in function compute_betas_from_shifts
and is designed to
furnish function update.compute_betas_from_shifts
with the right structure of data.
Value
Matrix of size (Nnode + ntaxa)x1 of NAs, with the optimal value at the root.
Initialization for the enumeration of parsimonious solutions.
Description
init.enumerate_parsimony
is used in function enumerate_parsimony
,
and initialize the correct data structure.
Usage
init.enumerate_parsimony(phy, clusters, pos)
Arguments
phy |
Input tree. |
clusters |
a vector representing the group of each tip. |
Details
This function returns a list with Nnode + ntaxa entries. Entries corresponding to the tips are initialized with a list of nclus matrices. For tip i of group k, all matrices are set to NULL, except for the kth, set to a vector of size Nnode + ntaxa, with entry i set to k, and all the others to NA.
Value
A list of size Nnode + ntaxa, as described above.
Initialization for incidence matrix
Description
init.incidence.matrix
initialize the matrix updated in
update.incidence.matrix
for the computation of the incidence matrix
in incidence.matrix
.
Usage
init.incidence.matrix(phy)
Arguments
phy |
Input tree. |
Details
The initialized matrix has ntaxa column and Nnode rows. Each node represent its parental branch. A row corresponding to a tip i is initialized to a vector of zeros, with only entry i equal to one. (Branch ending at tip i is only in the i^th lineage)
Value
Matrix with Nnode rows and ntaxa column.
Initialization for incidence matrix (full tree)
Description
init.incidence.matrix.full
initialize the matrix updated in
update.incidence.matrix.full
for the computation of the incidence
matrix of the full tree in incidence.matrix.full
.
Usage
init.incidence.matrix.full(phy)
Arguments
phy |
Input tree. |
Details
The initialized matrix is squared of size ntaxa + Nnode. Each node represent its parental branch. A row corresponding to a tip i is initialized to a vector of zeros, with only entry i equal to one. (Branch ending at tip i is only in the i^th lineage)
Value
Matrix of size ntaxa + Nnode.
Initialization for parsimonyCost.
Description
init.parsimonyCost
initialize a (ntaxa + Nnode) x (nclus) matrix with
NAs everywhere, except for the tips.
Usage
init.parsimonyCost(phy, clusters)
Arguments
phy |
a phylogenetic tree, class |
clusters |
the vector of the clusters of the tips. |
Details
At a tip i in state k, the line-vector is initialized as follow : (1 - Ind(k=p)_{1<=p<=nclus})*Inf (where Inf * 0 = 0)
Value
A (ntaxa + Nnode)x(nclus) matrix, with ntaxa first lines initialized as described.
Initialization for parsimonyNumber.
Description
init.parsimonyNumber
initialize a (ntaxa + Nnode)x(nclus) matrix with
NAs everywhere, except for the tips.
Usage
init.parsimonyNumber(phy, clusters)
Arguments
phy |
phylogenetic tree. |
clusters |
the vector of the clusters of the tips. |
Details
At a tip i in state k, the line-vector is initialized as follow : Ind(k=p)_{1<=p<=nclus}
Value
A (ntaxa + Nnode)x(nclus) matrix, with ntaxa first lines initialized as described.
Initialize BM
Description
Function used in simulate
for BM initialization.
Usage
init.simulate.BM(phy, p, root.state, simulate_random, ...)
Arguments
phy |
Input tree. |
root.state |
(list) State of the root, with: random : random state (TRUE) or deterministic state (FALSE) value.root : if deterministic, value of the character at the root exp.root : if random, expectation of the character at the root var.root : if random, variance of the character at the root (pxp matrix) |
Value
paramSimu Array p x Nnode x 2 (BM), filled with NAs. Slice paramSimu[, ntaxa + 1, ] (array p x 2) is initialized with simulated states and root expectations for all the traits.
Initialize state and expectation matrices
Description
Function used in simulate
for OU initialization.
Usage
init.simulate.OU(phy, p, root.state, optimal.value, simulate_random, ...)
Arguments
phy |
Input tree. |
p |
dimension of the trait simulated |
root.state |
(list): state of the root, with: random : random state (TRUE) or deterministic state (FALSE) value.root : if deterministic, value of the character at the root exp.root : if random, expectation of the character at the root var.root : if random, variance of the character at the root (pxp matrix) |
Value
paramSimu: array p x Nnode x 3, filled with NAs. Slice paramSimu[, ntaxa + 1, ] (array p x 3) is initialized with simulated states, root expectations, and optimal values for all the traits.
Initialize state and expectation matrices
Description
Function used in simulate
for BM/OU initializations.
Usage
init.simulate.StateAndExp(phy, p, root.state, simulate_random)
Arguments
phy |
Input tree. |
p |
dimension of the trait simulated |
root.state |
(list): state of the root, with: random : random state (TRUE) or deterministic state (FALSE) value.root : if deterministic, value of the character at the root exp.root : if random, expectation of the character at the root var.root : if random, variance of the character at the root (pxp matrix) |
Value
paramSimu: array p x Nnode x 2 (BM), filled with NAs. Slice paramSimu[, ntaxa + 1, ] (array p x 2) is initialized with simulated states and root expectations for all the traits.
Check whether parameters are in ranges.
Description
is.in.ranges.params
checks whether calculated parameters in the EM are
in the defined ranges.
Usage
is.in.ranges(p, min, max)
Arguments
p |
list of parameters with the correct structure |
min |
list of minimum values for the parameters |
max |
list of maximum values for the parameters |
Details
This function is used to test the convergence of the algorithm.
Value
boolean
Do a lasso regression with the number of non-zero variables fixed.
Description
lasso_regression_K_fixed
does the following regression :
||Yp-Xp.delta|| + lambda |delta|_1 using the function glmnet::glmnet
of
package glmnet
, where delta is a vector representing the shifts
occurring on the branches. It does a Gauss lasso regression using function
lm
on top of it. This function is used in functions
init.EM.lasso
, segmentation.OU.specialCase.lasso
, ...
Usage
lasso_regression_K_fixed.glmnet_multivariate(
Yp,
Xp,
K,
root = NULL,
penscale = rep(1, ncol(Xp)),
K_lag = 0
)
Arguments
Yp |
(transformed) data |
Xp |
(transformed) matrix of regression |
K |
number of non-zero components allowed |
Details
lambda is chosen so that delta has the right number of non zero components. If not possible, either temporarily raise the number of shifts and then select only the shifts with the highest modulus, or if not possible, throw an error.
Value
E0.gauss the intercept (value at the root)
shifts.gauss the list of shifts found on the branches
Log Likelihood of a fitted object
Description
log_likelihood
computes the log likelihood of some parameters.
Usage
log_likelihood(x, ...)
## S3 method for class 'params_process'
log_likelihood(x, Y_data, phylo, ...)
## S3 method for class 'PhyloEM'
log_likelihood(x, ...)
Arguments
x |
an object of class |
... |
for a |
Y_data |
matrix of data at the tips, size p x ntaxa. Each line is a trait, and each column is a tip. The column names are checked against the tip names of the tree. |
phylo |
a phylogenetic tree, class |
Value
The log likelihood of the data with the provided parameters on the tree.
Methods (by class)
-
log_likelihood(params_process)
:params_process
object -
log_likelihood(PhyloEM)
:PhyloEM
object
See Also
Compute parent matrix from possibles daughter matrices.
Description
matrix_of_possibles
is used in function update.enumerate_parsimony
to compute, from the list of possible matrices for the daughters, the matrix for
the node (a group for the parent being fixed).
Usage
matrix_of_possibles(matrices)
Arguments
matrices |
a list of matrices with ndaughters entries. |
Details
This function select all possible combinations of rows from all the daughters, and
merge then into one using function merge_complementary_vectors
.
Value
Matrix of all possible regimes for the subtree bellow node parent.
Merge PhyloEM fits on various grids of alpha values
Description
merge_alpha_grids
takes several fits from PhyloEM
, and
merge them so as to take into account all alpha values.
This can be used to break down computations into smaller chunks to be run independently.
Usage
merge_alpha_grids(...)
Arguments
... |
objects of class |
Value
An object of class PhyloEM
, result of the merge.
Examples
## Not run:
## Load Data
data(monkeys)
## First fit with coarse grid
res1 <- PhyloEM(Y_data = monkeys$dat,
phylo = monkeys$phy,
process = "scOU",
random.root = TRUE,
stationary.root = TRUE,
K_max = 10,
alpha = c(0.2, 0.3),
parallel_alpha = TRUE,
Ncores = 2)
## Second fit with finer grid
res2 <- PhyloEM(Y_data = monkeys$dat,
phylo = monkeys$phy,
process = "scOU",
random.root = TRUE,
stationary.root = TRUE,
K_max = 10,
alpha = c(0.22, 0.24),
parallel_alpha = TRUE,
Ncores = 2)
## Thrid fit with redundancies
res3 <- PhyloEM(Y_data = monkeys$dat,
phylo = monkeys$phy,
process = "scOU",
random.root = TRUE,
stationary.root = TRUE,
K_max = 10,
alpha = c(0.26, 0.3),
parallel_alpha = TRUE,
Ncores = 2)
## Merge the three
res_merge <- merge_alpha_grids(res1, res2, res3)
## Plot the selected result
plot(res_merge)
## Plot the model selection criterion
plot_criterion(res_merge)
## End(Not run)
Merge several complementary vectors into one.
Description
merge_complementary_vectors
take several vectors with complementary entries
(i.e, vector of same length, that are such that only one vector has a non NA value
for each entry), and merge them into one.
Usage
merge_complementary_vectors(comb, mat)
Arguments
comb |
vector giving the rows to be kept. |
mat |
matrix containing the vectors as rows. |
Details
The vectors are selected from the entry matrix by comb. At each entry, the vectors
are added using function add_complementary
.
Value
row vector of the same size as entry matrix.
Merge a list of independent parameters into into one parameter
Description
merge_params_independent
merges a list of p params
objects into one param object of dimension p
The reverse operation is done by split_params_independent
Usage
merge_params_independent(params_split)
Arguments
params_split |
a list of parameters |
Value
A parameter object
Merge fits from independent runs of PhyloEM.
Description
merge_rotations
takes several fits from PhyloEM
, and
merge them according to the best score (maximum likelihood or least squares).
For each number of shifts,
The datasets needs to be equal up to a rotation. This is tested thanks to a QR
decomposition, see function find_rotation
.
Usage
merge_rotations(..., method.selection = NULL, tol = NULL)
Arguments
... |
objects of class |
method.selection |
(optional) selection method to be applied to the merged fit.
See |
tol |
(optional) relative numerical tolerance. See |
Value
An object of class PhyloEM
, result of the merge.
Examples
## Not run:
## Load Data
data(monkeys)
## Run method
# Note: use more alpha values for better results.
res <- PhyloEM(Y_data = monkeys$dat, ## data
phylo = monkeys$phy, ## phylogeny
process = "scOU", ## scalar OU
random.root = TRUE, ## root is stationary
stationary.root = TRUE,
K_max = 10, ## maximal number of shifts
nbr_alpha = 4, ## number of alpha values
parallel_alpha = TRUE, ## parallelize on alpha values
Ncores = 2)
## Rotate dataset
rot <- matrix(c(cos(pi/4), -sin(pi/4), sin(pi/4), cos(pi/4)), nrow= 2, ncol = 2)
Yrot <- t(rot) %*% monkeys$dat
rownames(Yrot) <- rownames(monkeys$dat)
## Fit rotated dataset
# Note: use more alpha values for better results.
res_rot <- PhyloEM(Y_data = Yrot, ## rotated data
phylo = monkeys$phy,
process = "scOU",
random.root = TRUE,
stationary.root = TRUE,
K_max = 10,
nbr_alpha = 4,
parallel_alpha = TRUE,
Ncores = 2)
## Merge the two
res_merge <- merge_rotations(res, res_rot)
## Plot the selected result
plot(res_merge)
## Plot the model selection criterion
plot_criterion(res_merge)
## End(Not run)
Model Selection of a fitted object
Description
model_selection
does the model selection on a fitted PhyloEM
object, and returns the same fitted object.
Usage
model_selection(x, ...)
## S3 method for class 'PhyloEM'
model_selection(
x,
method.selection = c("LINselect", "DDSE", "Djump"),
C.BM1 = 0.1,
C.BM2 = 2.5,
C.LINselect = 1.1,
independent = FALSE,
...
)
Arguments
x |
a fitted |
... |
Further arguments to be passed to |
method.selection |
Method selection to be used. Several ones can be used at the same time. One of "LINselect" for the Baraud Giraud Huet LINselect method; "DDSE" for the Slope Heuristic or "Djump" for the Jump Heuristic, last two based the Birgé Massart method. |
C.BM1 |
Multiplying constant to be used for the BigeMassart1 method. Need to be positive. Default to 0.1. |
C.BM2 |
Multiplying constant to be used for the BigeMassart2 method. Default to 2.5. |
C.LINselect |
Multiplying constant to be used for the LINselect method. Need to be greater than 1. Default to 1.1. |
independent |
Are the trait assumed to be independent from one another? Default to FALSE. OU in a multivariate setting only works if TRUE. |
Value
The same object, but with a slot corresponding to the model selection used. See
function params_process.PhyloEM
to retrieve the selected parameters.
Methods (by class)
-
model_selection(PhyloEM)
:PhyloEM
object
See Also
PhyloEM
, params_process.PhyloEM
,
imputed_traits.PhyloEM
New World Monkeys dataset
Description
Morphometric dataset and phylogeny for brain shape variation of 50 species of New World monkeys (platyrrhine).
Usage
monkeys
Format
A list containing two objects:
- phy
The Phylogenetic tree for the platyrrhine species, pruned to match the species in the morphometric dataset
- dat
First two PC scores from a PCA of the species-averaged Procrustes coordinates
References
Aristide, L., dos Reis, S. F., Machado, A. C., Lima, I., Lopes, R. T. & Perez, S. I. (2016). Brain shape convergence in the adaptive radiation of New World monkeys. Proceedings of the National Academy of Sciences, 113(8), 2158–2163. http://doi.org/10.1073/pnas.1514473113
Computation of the optimal values at nodes and tips.
Description
compute_betas_from_shifts
computes the optimal values at the nodes and tips of the
tree, given the value at the root and the list of shifts occurring in the tree.
It assumes an OU model.
Usage
node_optimal_values(param, phylo)
Arguments
param |
an object of class |
phylo |
a phylogenetic tree, class |
Value
Matrix of size ntraits x (ntaxa + Nnode) of the optimal values at the node and tips of the tree. Column names correspond to the number of the node in the phylo object.
Examples
set.seed(1792)
ntaxa = 10
tree <- rphylo(ntaxa, 1, 0.1)
# parameters of the process
par <- params_process("BM", ## Process
p = 2, ## Dimension
variance = diag(0.5, 2, 2) + 0.5, ## Rate matrix
edges = c(4, 10, 15), ## Positions of the shifts
values = cbind(c(5, 4), ## Values of the shifts
c(-4, -5),
c(5, -3)))
plot(par, phylo = tree, traits = 1, value_in_box = TRUE,
shifts_bg = "white", root_bg = "white", ancestral_as_shift = TRUE, root_adj = 5)
nodelabels()
node_optimal_values(par, tree)
Create an object params_process
for a BM
Description
params_BM
creates a coherent object params_process from user
provided values of the parameters. Non specified parameters are set to
default values.
Usage
params_BM(
p = 1,
variance = diag(1, p, p),
random = FALSE,
value.root = rep(0, p),
exp.root = rep(0, p),
var.root = diag(1, p, p),
edges = NULL,
values = matrix(0, p, length(edges)),
relativeTimes = NULL,
nbr_of_shifts = length(edges),
phylo = NULL,
sBM_variance = FALSE,
trait_names = NULL,
...
)
Arguments
p |
the dimension (number of traits) of the parameters. Default to 1. |
variance |
the variance (rate matrix) of the BM. Default to
|
random |
whether the root of the BM is random (TRUE) or fixed (FALSE). Default to FALSE. |
value.root |
if random=FALSE, the root value. Default to 0. |
exp.root |
if random=TRUE, the root expectation. Default to 0. |
var.root |
if random=TRUE, the root variance. Default to
|
edges |
a vector of edges where the shifts occur. Default to NULL (no shift). |
values |
a matrix of shift values, with p lines and as many columns as
the number of shifts. Each column is the p values for one shift. Default to
|
relativeTimes |
(unused) the relative position of the shift on the branch, between 0 (beginning of the branch) and 1 (end of the branch). Default to 0. |
nbr_of_shifts |
the number of shifts to use (randomly drawn). Use only
if |
phylo |
a phylogenetic tree of class |
sBM_variance |
if the root is random, does it depend on the length of the root edge ? (For equivalent purposes with a rescaled OU). Default to FALSE. If TRUE, a phylogenetic tree with root edge length must be provided. |
trait_names |
vector of trait names values. Must be of length p. |
... |
unused. |
Value
an object of class params_process
.
See Also
Create an object params_process
for an OU
Description
params_OU
creates a coherent object params_process from user
provided values of the parameters. Non specified parameters are set to
default values.
Usage
params_OU(
p = 1,
variance = diag(1, p, p),
selection.strength = diag(1, p, p),
optimal.value = rep(0, p),
random = TRUE,
stationary.root = TRUE,
value.root = rep(0, p),
exp.root = rep(0, p),
var.root = diag(1, p, p),
edges = NULL,
values = matrix(0, p, length(edges)),
relativeTimes = NULL,
nbr_of_shifts = length(edges),
phylo = NULL,
trait_names = NULL,
...
)
Arguments
p |
the dimension (number of traits) of the parameters. Default to 1. |
variance |
the variance (rate matrix) of the BM. Default to
|
selection.strength |
the selection strength matrix. Default to
|
optimal.value |
the vector of the optimal values at the root. Default
to |
random |
whether the root of the OU is random (TRUE) or fixed (FALSE). Default to TRUE. |
stationary.root |
whether the root of the OU is stationary (TRUE) or not. Default to TRUE. |
value.root |
if random=FALSE, the root value. Default to 0. |
exp.root |
if random=TRUE, the root expectation. Default to 0. If
stationary.root=TRUE, default to |
var.root |
if random=TRUE, the root variance. Default to
|
edges |
a vector of edges where the shifts occur. Default to NULL (no shift). |
values |
a matrix of shift values, with p lines and as many columns as
the number of shifts. Each column is the p values for one shift. Default to
|
relativeTimes |
(unused) the relative position of the shift on the branch, between 0 (beginning of the branch) and 1 (end of the branch). Default to 0. |
nbr_of_shifts |
the number of shifts to use (randomly drawn). Use only
if |
phylo |
a phylogenetic tree of class |
trait_names |
vector of trait names values. Must be of length p. |
... |
unused. |
Value
an object of class params_process
.
See Also
Create an object params_process
Description
params_process
creates or extracts a set of parameters of class
params_process
.
Usage
params_process(x, ...)
Arguments
x |
an S3 object. |
... |
further arguments to be passed to the specific method. |
Value
An S3 object of class params_process
. This is essentially a list containing the following entries:
- process
The model used. One of "BM" (for a full BM model, univariate or multivariate); "OU" (for a full OU model, univariate or multivariate); or "scOU" (for a "scalar OU" model).
- p
Dimension of the trait.
- root.state
List describing the state of the root, with:
- random
random state (TRUE) or deterministic state (FALSE)
- value.root
if deterministic, value of the character at the root
- exp.root
if random, expectation of the character at the root
- var.root
if random, variance of the character at the root (pxp matrix)
- shifts
List with position and values of the shifts:
- edges
vector of the K id of edges where the shifts are
- values
matrix p x K of values of the shifts on the edges (one column = one shift)
- relativeTimes
vector of dimension K of relative time of the shift from the parent node of edges
- variance
Variance-covariance matrix size p x p.
- selection.strength
Matrix of selection strength size p x p (OU).
- optimal.value
Vector of p optimal values at the root (OU).
See Also
params_process.character
,
params_process.PhyloEM
,
params_BM
, params_OU
simul_process.params_process
Parameter estimates
Description
params
takes an object of class PhyloEM
, and returns the
inferred parameters of the process.
Usage
## S3 method for class 'PhyloEM'
params_process(
x,
method.selection = NULL,
K = NULL,
alpha = NULL,
rBM = FALSE,
init = FALSE,
...
)
Arguments
x |
an object of class |
method.selection |
(optional) the method selection to be used. One of "LINselect", "DDSE", "Djump". Default to "LINselect". |
K |
(optional) an integer giving the number of shifts for which to retrieve
the parameters. Default to NULL (automatically selected number of shifts, see
|
alpha |
(optional) a value of alpha for which to retrieve the parameters. Can
be an (un-ambiguous) estimation of the true value. If
specified, then |
rBM |
(optional) if TRUE, and if the process is "scOU", returns the raw
parameters of the BM on the re-scaled tree. Default to FALSE, except if
the selection strength is negative (see doc of |
init |
(optional) if TRUE, gives the parameters from the initialization of
the EM. Default to FALSE. This has no effect if |
... |
unused. |
Value
An object of class params_process
.
See Also
PhyloEM
, imputed_traits.PhyloEM
Create an object params_process
Description
params_process
creates a coherent object params_process from user
provided values of the parameters.
Usage
## S3 method for class 'character'
params_process(x, ...)
Arguments
x |
one of "BM" or "OU" |
... |
specified parameters, see functions |
Value
an object of class params_process
.
See Also
Minimal number of shifts needed to get a clustering.
Description
parsimonyCost
is an implementation of the Sankoff algorithm,
when the cost of transition between two state is always one. It is used
in functions parsimonyNumber
and enumerate_parsimony
to count or enumerate all the parsimonious solutions given one clustering of the
tips.
Usage
parsimonyCost(phylo, clusters = rep(1, length(phylo$tip.label)))
Arguments
phylo |
a phylogenetic tree, class |
clusters |
the vector of the clusters of the tips. (Default to all the tips in a single group). |
Value
An S3 class "parsimonyCost
" containing a
(ntaxa + Nnode) x (nclus) matrix of the total number of shifts needed to
get the clustering, if starting from a node in state k. The cost can be
extract from any subtree with function extract.parsimonyCost
.
See Also
extract.parsimonyCost
, parsimonyNumber
,
enumerate_parsimony
, partitionsNumber
,
equivalent_shifts
Examples
tree <- read.tree(text="(((1,1),2),2);")
plot(tree); nodelabels()
clusters <- c(1, 1, 2, 2)
costs <- parsimonyCost(tree, clusters)
costs
## Extract the parsimony cost at the root
extract(costs)
## Extract the cost for the sub-tree below node 7
extract(costs, 7)
Number of equivalent parsimonious allocations.
Description
parsimonyNumber
aims at finding the number of equivalent allocations of
the shifts on the tree, i.e allocations that are parsimonious and compatible
with a given clustering of the tips.
Usage
parsimonyNumber(phylo, clusters = rep(1, length(phylo$tip.label)))
Arguments
phylo |
phylogenetic tree, class |
clusters |
the vector of the clusters of the tips. Default to all the tips in one single cluster. |
Details
This function does a recursion up the tree.
The function extract.parsimonyNumber
gives the result sought for
any subtree.
The matrix of costs of the states (number of shifts) is also required, it is
computed by function parsimonyCost
.
Value
an object of S3 class "parsimonyNumber
" with:
- nbrReconstructions
a (ntaxa + Nnode) x (nclus) matrix of locally parsimonious solutions starting from a cluster k at a given node
- costReconstructions
an object of class "
parsimonyCost
", result of functionparsimonyCost
.
See Also
extract.parsimonyNumber
, parsimonyCost
,
enumerate_parsimony
, partitionsNumber
,
equivalent_shifts
Examples
tree <- read.tree(text="(((0,1),2),2);")
plot(tree); nodelabels()
clusters <- c(0, 1, 2, 2)
n_sols <- parsimonyNumber(tree, clusters)
n_sols
## Extract the number of parsimonious solutions at the root
extract(n_sols)
## Extract the cost of the solutions from the root
extract(n_sols, what = "cost")
extract(parsimonyCost(tree, clusters)) # same, more efficient
## Extract for the sub-tree below node 7
extract(n_sols, 7) # Result: 2 (the ancestral state is either "0" or "1").
Number of different models
Description
partitionsNumber
computes the number of different models with a given
number of shifts K. It is also the number of colorings of the tips to the
tree in npart = K + 1 colors.
Usage
partitionsNumber(phylo, npart)
Arguments
phylo |
a phylogenetic tree of class |
npart |
the numbers of partitions (colors) allowed at the tips. This is the number of shifts plus one (npart = K + 1). |
Value
an object of class partitionsNumber
. This is made of a matrix
with (Nnode + ntaxa) rows and (2*npart) columns. Each column contains two vectors:
for k=1:npart it contains the number of partitions with k groups compatible
with the tree and the shift process; and for k=(npart+1):2*npart, it contains
the number of "marked" partitions with (k-npart) groups compatible with the
tree and the shift process. The actual number can be extracted with function
extract.partitionsNumber
(see examples below).
See Also
extract.partitionsNumber
, parsimonyNumber
,
equivalent_shifts
Examples
if (requireNamespace("combinat", quietly = TRUE)) {
npart <- 8 # number of colors at the tips allowed
tree <- read.tree(text="(A,(A,(A,A,A),A,A));") # a tree with polytomies
plot(tree)
parts_num <- partitionsNumber(tree, npart)
parts_num
## Number of possible colorings of the tips in npart colors
extract(parts_num)
## Get all the solutions for colorings with 1 to nparts colors
extract(parts_num, npart = 1:npart)
## Number of possible colorings of the tips in npart colors
## For the sub-tree starting at node 17
extract(parts_num, node = 10)
## Number of possible colorings of the tips in npart colors
## with one marked color
extract(parts_num, marked = TRUE)
}
Penalty function type Baraud Giraud Huet.
Description
penalty_BaraudGiraudHuet_likelihood
is the penalty defined by :
pen' = ntaxa * log(1 + pen/(ntaxa - K)) with
pen = C * (ntaxa - K)/(ntaxa - K - 1) * EDkhi[K + 1; ntaxa - K - 1; exp(-Delta_K)/(K + 1)]
and Delta = log(model_complexity) + log(K + 1)
such that sum(exp(-Delta_K)) < infty.
Usage
penalty_BaraudGiraudHuet_likelihood(K, model_complexity, ntaxa, C = 1.1)
Arguments
K |
the dimension of the model. |
model_complexity |
the complexity of the set of models with dimension K. |
ntaxa |
the number of tips. |
C |
a constant, C > 1. Default is C = 1.1 (as suggested in Baraud Giraud Huet (2009)) |
Details
See Baraud Giraud Huet (2009, 2011).
Must be applied to log-likelihood criterion.
Function pen is computed using function penalty
from package
LINselect
.
Value
value of the penalty.
See Also
penalty_BirgeMassart_shape1
,
penalty_BirgeMassart_shape2
Penalty function type Birgé-Massart 1
Description
penalty_BirgeMassart_shape1
is the penalty shape defined by :
pen_shape = (sqrt(K) + sqrt(2 * K * L_K))^2 with sum(exp(- K * L_K)) < infty :
L_K = B + 1/K * log(model_complexity).
Usage
penalty_BirgeMassart_shape1(K, p, model_complexity, B = 0.1)
Arguments
K |
the number of shifts |
p |
the dimension of the data |
model_complexity |
the complexity of the set of models with dimension K |
B |
a non-negative constant. Default is 0.1 (as suggested in Cleynen & Lebarbier 2015) |
Details
See Birgé Massart (2001). Must be applied to least-square criterion. This penalty should be calibrated using the slope heuristic.
Value
value of the penalty
See Also
penalty_BaraudGiraudHuet_likelihood
,
penalty_BirgeMassart_shape2
Penalty function type Birgé-Massart 2
Description
penalty_BirgeMassart_shape2
is the penalty shape defined by :
pen_shape = C*K_try + log(model_complexity).
It dominates the penalty defined by penalty_BirgeMassart_shape1
.
Usage
penalty_BirgeMassart_shape2(K, p, model_complexity, C = 2.5)
Arguments
K |
the number of shifts |
p |
the dimension of the data |
model_complexity |
the complexity of the set of models with dimension K. |
C |
a non-negative constant. Default is 2.5 (as suggested in Lebarbier 2005) |
Details
See Birgé Massart (2001). Must be applied to least-square criterion. This penalty should be calibrated using the slope heuristic.
Value
value of the penalty.
See Also
penalty_BirgeMassart_shape1
,
penalty_BaraudGiraudHuet_likelihood
Penalty function type pBIC
Description
penalty_pBIC_scalarOU
is the pBIC.
Usage
penalty_pBIC(
all_params,
model_complexity,
independent,
tree,
times_shared,
distances_phylo,
T_tree,
p,
K,
ntaxa,
process
)
Arguments
model_complexity |
the complexity of the set of models with dimension K. |
K |
the dimension of the model. |
ntaxa |
the number of tips. |
Value
value of the penalty.
See Also
penalty_BirgeMassart_shape1
,
penalty_BirgeMassart_shape2
Plot for class PhyloEM
Description
This function takes an object of class PhyloEM
, result of function
PhyloEM
, and plots the result of the inference.
Usage
## S3 method for class 'PhyloEM'
plot(
x,
traits = 1:(x$p),
params = NULL,
method.selection = NULL,
automatic_colors = TRUE,
color_characters = "black",
color_edges = "black",
plot_ancestral_states = FALSE,
name_trait = "Trait Value",
imposed_scale,
ancestral_cex = 2,
ancestral_pch = 19,
value_in_box = FALSE,
ancestral_as_shift = FALSE,
shifts_cex = 0.6,
shifts_bg = "chocolate4",
root_bg = "chocolate4",
shifts_adj = 0,
root_adj = 1,
color_shifts_regimes = FALSE,
regime_boxes = FALSE,
alpha_border = 70,
show.tip.label = FALSE,
label_cex = 0.5,
label_font = 1,
label_offset = 0,
axis_cex = 0.7,
axis_las = 0,
show_axis_traits = TRUE,
edge.width = 1,
margin_plot = NULL,
gray_scale = FALSE,
root.edge = TRUE,
...
)
Arguments
x |
an object of class |
traits |
a vector of integers giving the numbers of the trait to be plotted. Default to 1:p (all the traits). |
params |
(optional) some user-specified parameters.
Must be of class |
method.selection |
select the parameters to plot. One of "LINselect", "DDSE",
"Djump". Default to "LINselect". See
|
automatic_colors |
whether to color the edges automatically according to
their regimes. Default to TRUE. If FALSE, colors can be manually specified through
arguments |
color_characters |
if |
color_edges |
if |
plot_ancestral_states |
whether to plot the ancestral traits inferred at the internal nodes of the tree. Only available if only one trait is plotted. Default to FALSE. |
name_trait |
name of the trait scale bar for the ancestral states plotting. Default to "Trait Value". |
imposed_scale |
if |
ancestral_cex |
if |
ancestral_pch |
if |
value_in_box |
whether to plot the value of the shift in a box on the edges. Only available when only one trait is plotted. Can be difficult to read on big trees. The size of the text in the boxes is controlled by parameter. Default to FALSE. |
ancestral_as_shift |
whether to represent the ancestral value at the root
as an ancestral shift on the root edge. Default to FALSE.
|
shifts_cex |
if |
shifts_bg |
if |
root_bg |
if |
shifts_adj |
the adj parameter for the shifts position on the edges. Default to 0 (beginning of the edge). |
root_adj |
if |
color_shifts_regimes |
whether to color each shift according to its regime (default to the same color of the edge it's on). Default to FALSE. |
regime_boxes |
whether to draw a box showing all the tips below a given.
The transparency of the border of the box is controlled by parameter
|
alpha_border |
if |
show.tip.label |
whether to show the tip labels. Default to FALSE. |
label_cex |
if |
label_font |
if |
label_offset |
if |
axis_cex |
cex for the label values of the plot. Default to 0.7. |
axis_las |
las for the label values of the plot. Default to 0 (see par). |
show_axis_traits |
control whether the trait values axis is plotted (default to TRUE). |
edge.width |
width of the edge. Default to 1. |
margin_plot |
vector giving the margin to around the plot.
Default to |
gray_scale |
if TRUE, the colors are replaced by a gray scale. Default to FALSE. |
root.edge |
a logical indicating whether to draw the root edge (defaults to TRUE) |
... |
further arguments to be passed to |
See Also
params_process.PhyloEM
, imputed_traits.PhyloEM
Plot all the equivalent solutions.
Description
plot.enumerate_parsimony
plots a representation of all the equivalent
solutions.
Usage
## S3 method for class 'enumerate_parsimony'
plot(x, numbering = FALSE, nbr_col = 3, ...)
Arguments
x |
an object of class |
numbering |
whether to number the solutions. Default to FALSE. |
nbr_col |
the number of columns on which to display the plot. Default to 3. |
... |
further arguments to be passed to |
Details
This function uses functions plot.phylo
and
nodelabels
for the actual plotting of the trees.
Value
A plot of the equivalent shifts allocations.
See Also
plot.phylo
, enumerate_parsimony
,
plot.equivalent_shifts
, nodelabels
Plot all the equivalent solutions.
Description
plot.equivalent_shifts
plots a representation of all the equivalent
shifts allocations, with a representation of the shifts and their values,
and a coloration of the branches in term of regimes.
Usage
## S3 method for class 'equivalent_shifts'
plot(
x,
trait = 1,
show_shifts_values = TRUE,
numbering = FALSE,
colors_tips = NULL,
nbr_col = 3,
gray_scale = FALSE,
edge.width = 2,
shifts_cex = 1.2,
...
)
Arguments
x |
an object of class |
trait |
(integer) the trait to be plotted, if multivariate. Default to 1. |
show_shifts_values |
whether to show the equivalent shifts values or not. Default to FALSE. |
numbering |
whether to number the solutions. Default to FALSE. |
colors_tips |
user-provided colors for the tips of the tree. A vector vector with as many colors as there are tips. Will be automatically computed if not provided. |
nbr_col |
the number of columns on which to display the plot. Default to 3. |
gray_scale |
if TRUE, the colors are replaced by a gray scale. Default to FALSE. |
edge.width |
width of the edge. Default to 1. |
shifts_cex |
if |
... |
further arguments to be passed to |
Details
This function uses function plot.phylo
for the actual
plotting of the trees.
Value
A plot of the equivalent shifts allocations.
See Also
Plot for class simul_process
Description
This function takes an object of class params_process
, and plots them along
with some data at the tips of the tree.
Usage
## S3 method for class 'params_process'
plot(
x,
phylo,
data = NULL,
traits,
automatic_colors = TRUE,
color_characters = "black",
color_edges = "black",
plot_ancestral_states = FALSE,
ancestral_states = NULL,
imposed_scale,
ancestral_cex = 2,
ancestral_pch = 19,
value_in_box = FALSE,
ancestral_as_shift = FALSE,
shifts_cex = 0.6,
shifts_bg = "chocolate4",
root_bg = "chocolate4",
shifts_adj = 0,
root_adj = 1,
color_shifts_regimes = FALSE,
regime_boxes = FALSE,
alpha_border = 70,
show.tip.label = FALSE,
label_cex = 0.5,
label_offset = 0,
axis_cex = 0.7,
edge.width = 1,
margin_plot = NULL,
gray_scale = FALSE,
...
)
Arguments
x |
an object of class |
phylo |
a phylogenetic tree. |
data |
a matrix of data at the tips of the tree. Must have p rows and
ntaxa columns. If these are simulated, use the |
traits |
a vector of integers giving the numbers of the trait to be plotted. Default to 1:p (all the traits). |
automatic_colors |
whether to color the edges automatically according to
their regimes. Default to TRUE. If FALSE, colors can be manually specified through
arguments |
color_characters |
if |
color_edges |
if |
plot_ancestral_states |
whether to plot the ancestral traits inferred at the internal nodes of the tree. Only available if only one trait is plotted. Default to FALSE. |
ancestral_states |
if |
imposed_scale |
if |
ancestral_cex |
if |
ancestral_pch |
if |
value_in_box |
whether to plot the value of the shift in a box on the edges. Only available when only one trait is plotted. Can be difficult to read on big trees. The size of the text in the boxes is controlled by parameter. Default to FALSE. |
ancestral_as_shift |
whether to represent the ancestral value at the root
as an ancestral shift on the root edge. Default to FALSE.
|
shifts_cex |
if |
shifts_bg |
if |
root_bg |
if |
shifts_adj |
the adj parameter for the shifts position on the edges. Default to 0 (beginning of the edge). |
root_adj |
if |
color_shifts_regimes |
whether to color each shift according to its regime (default to the same color of the edge it's on). Default to FALSE. |
regime_boxes |
whether to draw a box showing all the tips below a given.
The transparency of the border of the box is controlled by parameter
|
alpha_border |
if |
show.tip.label |
whether to show the tip labels. Default to FALSE. |
label_cex |
if |
label_offset |
if |
axis_cex |
cex for the label values of the plot. Default to 0.7. |
edge.width |
width of the edge. Default to 1. |
margin_plot |
vector giving the margin to around the plot.
Default to |
gray_scale |
if TRUE, the colors are replaced by a gray scale. Default to FALSE. |
... |
further arguments to be passed to |
See Also
simul_process
, plot.PhyloEM
,
params_BM
, params_OU
Plot Model Selection Criterion
Description
This function takes an object of class PhyloEM
, result of function
PhyloEM
, and plots a model selection criterion.
Usage
plot_criterion(
res,
method.selection = NULL,
add = FALSE,
select.col = "red",
...
)
Arguments
res |
an object of class |
method.selection |
select the parameters to plot. One of "LINselect", "DDSE",
"Djump" or "likelihood" (for un-penalized likelihood). Default to "LINselect". See
|
add |
boolean: should the points be added to a current plot (default to FALSE). |
select.col |
the color of the point selected by the criterion. Default to "red". |
... |
further argument to be passed to base |
See Also
params_process.PhyloEM
, plot.PhyloEM
, get_criterion
Product of elements of a matrix
Description
prod_index
return the product of chosen elements of a matrix.
Usage
prod_index(X, Id)
Arguments
X |
a matrix with p rows and K column. Each row contains the number of partition in 1<=k<=K groups for one of the p children of a given node |
Id |
a vector of length p, result of the function |
Details
This function is to be used in sum_simplex
to be applied to all the
elements given by xsimplex
.
Performs the product : X[1,Id[1]]*X[2,Id[2]]*...*X[p,Id[p]] if all the
elements of Id are positive. Otherwise, just return 0.
Value
double : the result of the product.
exact qr.solve
Description
This is the same function as qr.solve
, but it throws an error if an exact fit cannot
be found (instead of returning a least square fitted value, which is the default behavior
of qr.solve
).
Usage
qr_solve_exact(a, b, tol = 1e-07)
Arguments
a |
a QR decomposition or a rectangular matrix. |
b |
a vector or matrix of right-hand sides of equations.
|
tol |
the tolerance for detecting linear dependencies in the columns of x. Only used if LAPACK is false and x is real. |
Generic recursion down the tree.
Description
recursionDown
uses the function updateDown
to compute
daughters rows of matrix param.
Usage
recursionDown(phy, params, updateDown, ...)
Arguments
phy |
Input tree, in cladewise order. |
params |
Matrix of parameters to update by the recursion |
updateDown |
Function to be used for the update |
... |
Arguments to be used by the function updateDown |
Details
This function is to be used in other more complex function that need to update a quantity from the root to the tips of a tree. Note that the input tree must be in cladewise order.
Value
Matrix of parameters updated.
Residuals of a fitted object
Description
residuals
computes the residuals of some parameters.
Usage
## S3 method for class 'PhyloEM'
residuals(object, ...)
Arguments
object |
an object of class |
... |
for a |
Value
The log likelihood of the data with the provided parameters on the tree.
See Also
Sample equally spaced edges.
Description
sample_edges_intervals
samples K shifts, each in one of the K intervals
regularly spaced on the height of the tree.
Usage
sample_edges_intervals(tree, K)
Arguments
tree |
: input tree |
K |
: number of edges to be sampled. |
Details
In case where the tree is not ultrametric, its "height" is defined as the minimum tip height.
Value
vector of edges
Sample shifts edges in a parsimonious way.
Description
sample_shifts_edges
attempts to find K shifts in the tree which allocations
are parsimonious. The actual generation of edges is done by function
sample_edges_intervals
.
Usage
sample_shifts_edges(tree, K, part.list = enumerate_tips_under_edges(tree))
Arguments
tree |
: input tree |
K |
: number of edges to be sampled. |
Details
This function uses function enumerate_tips_under_edges
to generate a list
of tips under each edge, and function check_parsimony
to check for
parsimony of a given solution, under the assumption of an "infinite site model".
Value
vector of edges
Scale variance and selection strength from a linear transform
Description
Used for process equivalencies on re-scaled trees.
Usage
scale_params(params, f)
Arguments
params |
Parameters list |
f |
Factor of the linear transform. If t' = f * t, the function takes parameters from phylo' back to phylo. |
Value
re-scaled parameters
Segmentation in the BM case
Description
segmentation.BM
performs the segmentation algorithm described.
Usage
segmentation.BM(nbr_of_shifts, costs0, diff_exp)
Arguments
nbr_of_shifts |
Number of shifts on the phylogeny allowed |
costs0 |
Cost of each edge |
diff_exp |
Difference of expectations |
Details
This function takes the largest values of costs0, and make them null, thanks to delta and tau.
Value
List containing : edges_max : array of nbr_of_shifts edges where costs0 is maximal. shifts:list containing the computed tau and delta
Segmentation in the OU special case, using lasso regression
Description
segmentation.OU.specialCase.lasso
performs the segmentation using a
lasso regression to select for the edges where the shifts are added.
Usage
segmentation.OU.specialCase.lasso(
phylo,
nbr_of_shifts,
D,
Xp,
penscale = rep(1, (nrow(phylo$edge) + 1)),
...
)
Arguments
phylo |
a phylogenetic tree |
nbr_of_shifts |
Number of shifts on the phylogeny allowed |
Details
This function re-write the sum of costs to be minimized as a least squares
regression problem, and uses a lasso regression to solve it. It uses
functions incidence.matrix.full
to express the problem as a
linear model.
Value
List containing : beta_0 : the optimal value at the root shifts : list containing the computed tau and delta costs : vector of costs
Segmentation in the OU special case, conserving the same shifts position.
Description
segmentation.OU.specialCase.same_shifts
keeps the same shifts position,
and optimize the sum of costs using function
best_scenario
Usage
segmentation.OU.specialCase.same_shifts(phylo, shifts_old, D, Xp, ...)
Arguments
phylo |
a phylogenetic tree |
shifts_old |
the previous list of shifts (only position is used) |
Details
This is the best move if keeping the previous shifts positions.
Value
List containing : beta_0 : the optimal value at the root shifts : list containing the computed tau and delta costs : vector of costs
Compute the matrix of shifts.
Description
shifts.list_to_matrix
takes the list description of the shifts
to give the matrix representation of the shifts : the b th element of
the lth line has the value of the shift on character l occurring on that branch b
Usage
shifts.list_to_matrix(phy, shifts, p = nrow(shifts$values))
Arguments
phy |
Input tree. |
shifts |
list description of the shifts : shifts$edges, shifts$values. |
p |
number of traits (optional, needed when shifts = NULL). |
Value
Matrix p x Nedge of length nbranch.
See Also
Compute the vector of shifts.
Description
shifts.list_to_vector
takes the list description of the shifts
to give the vectorial representation of the shifts : the b th element of
the vector has the value of the shift occurring on that branch b.
Usage
shifts.list_to_vector(phy, shifts)
Arguments
phy |
Input tree. |
shifts |
: list description of the shifts : shifts$edges, shifts$values |
Value
Vector of length nbranch.
See Also
Compute the list of shifts.
Description
shifts.matrix_to_list
takes the vectorial description of the shifts
to create the list description of the shifts.
Usage
shifts.matrix_to_list(delta)
Arguments
delta |
matrix description of the shift. |
Value
List describing shifts.
See Also
Compute the list of shifts.
Description
shifts.vector_to_list
takes the vectorial description of the shifts
to create the list description of the shifts.
Usage
shifts.vector_to_list(delta)
Arguments
delta |
: vector description of the shift. |
Value
Vector of length nbranch.
See Also
shifts.list_to_vector
Simmap format mapping from list of edges
Description
shifts_to_simmap
takes a vector of edges where the shifts occur, and return a
simmap formatted tree, mapped with corresponding regimes.
Usage
shifts_to_simmap(tree, shifts_edges)
Arguments
tree |
input tree in |
shifts_edges |
shifts positions on the edges |
Details
Ancestral state is always 0, and other states are consecutive integers.
Value
tree a simmap object
Simulate a Stochastic Process on a tree
Description
simulate
simulate a stochastic process on a tree.
Usage
simul_process(x, ...)
## S3 method for class 'params_process'
simul_process(
x,
phylo,
simulate_random = TRUE,
checks = TRUE,
U_tree = NULL,
times_shared = NULL,
...
)
## S3 method for class 'PhyloEM'
simul_process(
x,
simulate_random = TRUE,
checks = TRUE,
U_tree = NULL,
times_shared = NULL,
...
)
Arguments
x |
an object of class |
... |
for a |
phylo |
a phylogenetic tree, class |
simulate_random |
set to FALSE if only the expected values are needed (and not the random sample). Default to TRUE. |
checks |
whether to check the entry parameters for consistency. Default to TRUE. |
U_tree |
optional, full incidence matrix of the tree, result of function
|
times_shared |
optional, times of shared ancestry of all nodes and tips,
result of function |
Value
An S3 object of class simul_process
. This contains:
- sim_traits
an array with dimensions p x Nnode x 2 (BM) or p x Nnode x 3 (OU). For each trait t, 1 <= t <= p, sim_traits[t, , ] has tree columns, containing respectively the simulated state, expected value and optimal value for all the nodes.
- phylo
the phylogenetic tree used for the simulations (class
phylo
).- params
the parameters used for the simulations (class
params_proces
).
Methods (by class)
-
simul_process(params_process)
:params_process
object -
simul_process(PhyloEM)
:PhyloEM
object
See Also
params_process
, PhyloEM
, extract.simul_process
Simulate a Stochastic Process on a tree
Description
simulate_internal
simulate a stochastic process on a tree.
Usage
simulate_internal(
phylo,
process = c("BM", "OU", "scOU", "OUBM", "StudentOU"),
p = 1,
root.state = list(random = FALSE, stationary.root = FALSE, value.root = NA, exp.root =
NA, var.root = NA),
shifts = list(edges = NULL, values = NULL, relativeTimes = NULL),
eps = 10^(-6),
selection.strength = NULL,
variance = NULL,
optimal.value = NULL,
checks = TRUE,
simulate_random = TRUE,
U_tree = NULL,
times_shared = NULL,
df = 1
)
Arguments
phylo |
a phylogenetic tree, class |
process |
The model used for the simulation. One of "BM" (for a full BM model, univariate or multivariate); "OU" (for a full OU model, univariate or multivariate); or "scOU" (for a "scalar OU" model). |
p |
Dimension of the simulated trait |
root.state |
List describing the state of the root, with:
|
shifts |
List with position and values of the shifts :
|
eps |
Tolerance for the value of the norm 1 of the selection strength matrix for OU |
selection.strength |
Matrix of selection strength size p x p (OU) |
variance |
Variance-covariance matrix size p x p |
optimal.value |
Vector of p optimal values at the root (OU) |
checks |
whether to check the entry parameters for consistency. Default to TRUE. |
simulate_random |
set to FALSE if only the expected values are needed (and not the random sample). Default to TRUE. |
U_tree |
optional, full incidence matrix of the tree, result of function
|
times_shared |
optional, times of shared ancestry of all nodes and tips,
result of function |
df |
if the process is "StudentOU", the number of degree of freedom of the chosen student law. default to 1. |
Value
paramSimu An array with dimensions p x Nnode x 2 (BM) or p x Nnode x 3 (OU). For each trait t, 1 <= t <= p, paramSimu[t, , ] has tree columns, containing respectively the simulated state, expected value and optimal value for all the nodes.
Split independent parameters into a list of parameters
Description
split_params_independent
split a params object for a
process with p independent traits into p params objects.
The reverse operation is done by merge_params_independent
Usage
split_params_independent(params)
Arguments
params |
parameters |
Value
A list of p parameters
Sum on all subsets.
Description
sum_partitions
returns the sum on all I subset of [1,p], with |I|>m.
Usage
sum_partitions(A, N, K, p, m)
Arguments
A |
a matrix with p rows and K column. Each row contains the number of marked partition in 1<=k<=K groups for one of the p children of a given node. |
N |
a matrix with p rows and K column. Each row contains the number of partition in 1<=k<=K groups for one of the p children of a given node |
K |
an integer. The number of groups wanted. |
p |
an integer. The number of daughters of a node. |
m |
an integer. The minimum cardinal of a subset allowed. |
Details
This function applies sum_partitions.cardFixed
to all integer between
m and p, and sum the results.
Value
double : the result of the sum.
Sum on subsets of a given cardinal.
Description
sum_partitions.cardFixed
returns the sum on I subset of [1,p], |I|
fixed, of the sums computed by sum_prod.comb
.
Usage
sum_partitions.cardFixed(A, N, K, p, cardI)
Arguments
A |
a matrix with p rows and K column. Each row contains the number of marked partition in 1<=k<=K groups for one of the p children of a given node. |
N |
a matrix with p rows and K column. Each row contains the number of partition in 1<=k<=K groups for one of the p children of a given node |
K |
an integer. The number of groups wanted. |
p |
an integer. The number of daughters of a node. |
cardI |
an integer. The cardinal of the subset wanted. |
Details
This function uses combn
to enumerate all the subsets I of [1,p] of
a given cardinal, and then performs the wanted sum on these subsets.
Value
double : the result of the sum.
Sum on a simplex
Description
sum_prod.comb
returns the sum on k_1+...+k_p=K+|I|-1, k_i>0 of the
products of prod(A[i,k_i], i in I)*prod(N[i,k_i], i not in I).
Usage
sum_prod.comb(I, A, N, K, p)
Arguments
I |
a vector of integers representing a subset of [1,p] |
A |
a matrix with p rows and K column. Each row contains the number of marked partition in 1<=k<=K groups for one of the p children of a given node. |
N |
a matrix with p rows and K column. Each row contains the number of partition in 1<=k<=K groups for one of the p children of a given node |
K |
an integer. The number of groups wanted. |
p |
an integer. The number of daughters of a node. |
Details
This function uses sum_simplex
to perform the wanted sum on a ad-hoc
matrix, combination of rows of A and N.
Value
double : the result of the sum.
Sum on a simplex
Description
sum_simplex
returns the sum on k_1+...+k_p=K, k_i>0 of the products
of NN[i,k_i].
Usage
sum_simplex(NN, K, p)
Arguments
NN |
a matrix with p rows and K column. Each row contains the number of partition in 1<=k<=K groups for one of the p children of a given node |
K |
an integer. The number of groups wanted. |
p |
an integer. The number of daughters of a node. |
Details
This function uses xsimplex
to perform the product of NN[i,k_i] for
all combination of k_i such that k_1+...+k_p=K, k_i>0, using function
prod_index
. Then sum all the products.
Value
double : the result of the sum.
Test state of root.
Description
test.root.state
test whether the parameters of root.state given
by the user are coherent. If not, it returns a new corrected list to
define root.state.
Usage
test.root.state(root.state, process = c("BM", "OU", "scOU", "OUBM"), ...)
Arguments
root.state |
A list giving the root state |
process |
"BM", "OU" or "scOU" |
... |
parameters of the process (if OU) |
Details
To test coherence, the following priorities are applied: random > stationary.root > values.root = exp.root = var.root
Value
Coherent list root.state.
Transform branch length for a re-scaled BM
Description
Re-scale the branch length of the tree so that a BM running on the new tree produces the same observations at the tips than an OU with parameter alpha.
Usage
transform_branch_length(phylo, alp)
Arguments
phylo |
A phylogenetic tree of class |
alp |
Value of the selection strength. |
Value
phylo The same phylogenetic tree, with transformed branch lengths.
Transform the shift values
Description
transform_shifts_values
takes the shifts generating a given expectation structure
given an OU with alpha = from, and gives back the equivalent shifts values that produce the
same structure with an OU with alpha = to. If from or to is 0, then the process is supposed
to be a BM.
Usage
transform_shifts_values(shifts, from = 0, to, phylo)
Arguments
shifts |
the shifts on the original process |
from |
alpha value of the original process. If equals 0, then the original process is taken to be a BM. |
to |
alpha value of the destination process |
phylo |
the phylogenetic tree (un-scaled) |
Update function for regime allocation.
Description
update.allocate_regimes_from_shifts
computes the regime of a daughter
node, knowing the regime at the parent node and the vector of shifts positions
Usage
update.allocate_regimes_from_shifts(edgeNbr, ancestral, shifts_edges, ...)
Arguments
edgeNbr |
: Number of the edge considered |
ancestral |
: regime of the parent node |
shifts_edges |
positions on edges |
Details
This function is used in function allocate_regimes_from_shifts
and is
designed to furnish function recursionDown
with the right structure of
data.
Value
regime of the daughter node.
Update function for optimal value computation
Description
update.compute_betas_from_shifts
computes the optimal value at a daughter node,
knowing the optimal value at the parent node and the vector of shifts.
Usage
update.compute_betas_from_shifts(edgeNbr, ancestral, shifts, ...)
Arguments
edgeNbr |
: Number of the edge considered |
ancestral |
: Computed vector for the parental node |
shifts |
position and values of the shifts |
Details
This function is used in function compute_betas_from_shifts
and is designed to
furnish function recursionDown
with the right structure of data.
Value
Updated matrix of size (Nnode + ntaxa)x1.
Actualization of the enumeration.
Description
update.enumerate_parsimony
is used in function enumerate_parsimony
,
and compute the solution for the parent node, given its children.
Usage
update.enumerate_parsimony(
daughters,
daughtersParams,
parent,
cost,
clus,
pos,
...
)
Arguments
daughters |
vector of daughters nodes. |
daughtersParams |
list with length(daughters) entries, each entry being a list of k matrices representing the possible allocations starting from daughter. |
parent |
the parent node. |
Details
This function takes a list with L entries corresponding to the children of a node,
and compute, for all the regimes, the possible allocations starting with parent
node in that regime. It uses functions select.matrices
to select the
possible states of the children, and matrix_of_possibles
to find the
possible states.
Value
A list of size nclus, each entry being a matrix representing the possible allocations starting with node parent in state k.
Update function for incidence matrix
Description
update.incidence.matrix
updates the matrix initialized in
init.incidence.matrix
for the computation of the incidence matrix
in incidence.matrix
.
Usage
update.incidence.matrix(daughtersParams, ...)
Arguments
daughtersParams |
: rows of updated matrix corresponding to the daughters of the current node. |
Details
A node belongs to all the lineages of its daughters.
Value
Vector of length ntaxa, indicating to which lineages the branch above the current node belongs to.
Update function for incidence matrix
Description
update.incidence.matrix.full
updates the matrix initialized in
init.incidence.matrix.full
for the computation of the incidence matrix
in incidence.matrix.full
.
Usage
update.incidence.matrix.full(daughtersParams, parent, ...)
Arguments
daughtersParams |
: rows of updated matrix corresponding to the daughters of the current node. |
Details
A node belongs to all the lineages of its daughters.
Value
Vector of length ntaxa + Nnode, indicating to which lineages the branch above the current node belongs to.
Actualization for parsimonyCost.
Description
update.parsimonyCost
compute the line vector of a parent node, given
the vectors of its daughters.
Usage
update.parsimonyCost(daughtersParams, ...)
Arguments
daughtersParams |
a (ndaughters) x (nclus) matrix with the line vectors of the cost for the daughters node. |
Details
This function computes the cost of putting the parent in a state k, as the minimum number of shifts needed to get the given clustering of the trees bellow parental node.
Value
A line vector corresponding to the parent node.
Actualization for parsimonyNumber.
Description
update.parsimonyNumber
compute the line vector of a parent node, given
the vectors of its daughters.
Usage
update.parsimonyNumber(daughters, daughtersParams, cost, ...)
Arguments
daughters |
the identifiers of the daughters nodes. |
daughtersParams |
a ndaughters x (nclus) matrix with the line vectors of the number of solutions for the daughters nodes. |
cost |
the (ntaxa + Nnode) x nclus matrix of costs (computed by |
Details
This function uses function compute_state_filter
to find all the admissible
states of the daughters, given a starting state for the parent.
Value
A line vector corresponding to the parent node.
Update formula in the general case
Description
update.partitionsNumber.gen
apply the actualization formula to get
Nk and Ak of a node given its daughters.
Usage
update.partitionsNumber.gen(daughtersParams, ...)
Arguments
daughtersParams |
: matrix with 2*npart columns. Each row contains the result of partitionsNumber for the children of a given node |
Details
Uses functions sum_partitions
and sum_simplex
to compute
the needed sums.
Value
vector of size 2*npart. For k=1:npart it contains the number of partitions with k groups compatible with the sub tree starting at the current node ; and for k=(npart+1):2*npart, it contains the number of "marked" partitions with (k-npart) groups compatible with the sub tree starting at the current node.
Wrapper for E step in EM
Description
wrapper_E_step
is used in the EM algorithm. It calls itself
recursively in case of independent parameters.
Usage
wrapper_E_step(
phylo,
times_shared,
distances_phylo,
process,
params_old,
masque_data,
F_moments,
independent,
Y_data_vec_known,
miss,
Y_data,
U_tree,
compute_E
)