Visualization of ColocBoost Results

This vignette demonstrates how to visualize and interpret ColocBoost results.

library(colocboost)

Causal variants (simulated)

The dataset features two causal variants with indices 194 and 589.

# Loading the Dataset
data(Ind_5traits)
# Run colocboost 
res <- colocboost(X = Ind_5traits$X, Y = Ind_5traits$Y)
#> Validating input data.
#> Starting gradient boosting algorithm.
#> Gradient boosting for outcome 4 converged after 40 iterations!
#> Gradient boosting for outcome 5 converged after 59 iterations!
#> Gradient boosting for outcome 1 converged after 61 iterations!
#> Gradient boosting for outcome 3 converged after 91 iterations!
#> Gradient boosting for outcome 2 converged after 94 iterations!
#> Performing inference on colocalization events.
#> Extracting colocalization results with pvalue_cutoff = 0.001, cos_npc_cutoff = 0.2, and npc_outcome_cutoff = 0.2.
#> Keep only CoS with cos_npc >= 0.2. For each CoS, keep the outcomes configurations that pvalue of variants for the outcome < 0.001 and npc_outcome >0.2.

1. Default plot function

The default plot of the colocboost results provides a visual representation of the colocalization events.

colocboost_plot(res)

Parameters to adjust plot

2. Advanced options

Several advanced options are available for customizing the plot and exploring the colocboost results in more detail.

2.1. Plot with a zoom-in region

You can zoom in on a region by providing the grange argument, a vector of the indices of the region to display.

colocboost_plot(res, grange = c(1:400), outcome_idx = c(1:4))
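When the region of interest is defined around a particular variant rather than by fixed bounds, the index vector for grange can be built programmatically. The following is a minimal sketch in base R only. The helper name window_around and the toy values (a centre index of 194, a half-width of 100, and 1000 variants in total) are assumptions for illustration and are not part of colocboost.

```r
# Hypothetical helper (not part of colocboost): build a contiguous index
# window around a variant, clipped to the valid range 1..n_variants.
window_around <- function(center, half_width, n_variants) {
  lower <- max(1L, as.integer(center) - as.integer(half_width))
  upper <- min(as.integer(n_variants), as.integer(center) + as.integer(half_width))
  seq.int(lower, upper)
}

# Toy values chosen for illustration only.
zoom_idx <- window_around(center = 194, half_width = 100, n_variants = 1000)
range(zoom_idx)

# The window is clipped at the left edge when the variant is near the start.
edge_idx <- window_around(center = 30, half_width = 100, n_variants = 1000)
range(edge_idx)
```

The resulting vector could then be passed to the plot call, for example colocboost_plot(res, grange = zoom_idx).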

2.2. Plot with marked top variants

You can highlight the top variants in the plot by setting show_top_variables = TRUE. This marks the variants with the highest VCP in each CoS with a red circle.

colocboost_plot(res, show_top_variables = TRUE)
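To make "highest VCP for each CoS" concrete, the selection can be sketched on toy data in base R. The VCP values, the variant names, and the layout of the CoS membership list below are all made up for illustration. They are assumptions, not the actual structure of a colocboost result object.

```r
# Toy VCP values for six variants (made-up numbers for illustration).
vcp <- c(v1 = 0.02, v2 = 0.55, v3 = 0.30, v4 = 0.01, v5 = 0.70, v6 = 0.25)

# Toy CoS membership: each CoS is a vector of variant indices (assumed layout).
cos_members <- list(cos1 = c(2, 3), cos2 = c(5, 6))

# For each CoS, pick the member index with the largest VCP.
top_idx <- vapply(
  cos_members,
  function(idx) idx[which.max(vcp[idx])],
  numeric(1)
)
top_idx
```

In this toy setting, the variants selected in top_idx are the ones that would be circled.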

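2.3. Plot CoS variants to uncolocalized traits to diagnose the colocalization

There are three options available for plotting the CoS variants to uncolocalized traits. The example below sets show_cos_to_uncoloc = TRUE, which shows all CoSs on the uncolocalized outcomes.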

colocboost_plot(res, show_cos_to_uncoloc = TRUE)
#> Show all CoSs to uncolocalized outcomes.

2.4. Plot with added highlight points

You can highlight specific variants in the plot by setting add_highlight = TRUE and passing their indices to add_highlight_idx. By default (add_highlight_style = "vertical_lines"), this draws red dashed vertical lines at the specified indices. Alternatively, add_highlight_style = "star" marks the specified variants with red stars. For example, to mark the true causal variants, set add_highlight_idx = unique(unlist(Ind_5traits$true_effect_variants)). The following plot also shows the top variants.

colocboost_plot(
  res, show_top_variables = TRUE, 
  add_highlight = TRUE, 
  add_highlight_idx = unique(unlist(Ind_5traits$true_effect_variants)),
  add_highlight_style = "star"
)
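The expression unique(unlist(...)) used for add_highlight_idx collapses a per-outcome list of indices into a single vector without repeats. A minimal sketch on a toy list follows. The list below is an assumed stand-in for illustration and is not the actual contents of Ind_5traits$true_effect_variants.

```r
# Toy stand-in for a per-outcome list of causal variant indices.
# This structure is an assumption for illustration only.
toy_effect_variants <- list(
  Outcome1 = 194,
  Outcome2 = c(194, 589),
  Outcome3 = 589
)

# unlist() flattens the list into one vector (with repeats);
# unique() then keeps each index once, in order of first appearance.
highlight_idx <- unique(unlist(toy_effect_variants))
highlight_idx
```

Deduplicating first means each shared variant is passed to the plot only once, even when it is causal for several outcomes.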

2.5. Plot with trait-specific sets if they exist

There are two options available for plotting the trait-specific (uncolocalized) variants. The example below uses plot_ucos = TRUE.

Important Note: You should use colocboost(..., output_level = 2) to obtain the trait-specific (uncolocalized) information.

# Create a mixed dataset
data(Ind_5traits)
data(Heterogeneous_Effect)
X <- Ind_5traits$X[1:3]
Y <- Ind_5traits$Y[1:3]
X1 <- Heterogeneous_Effect$X
Y1 <- Heterogeneous_Effect$Y[, 1, drop = FALSE]

# Run colocboost
res <- colocboost(X = c(X, list(X1)), Y = c(Y, list(Y1)), output_level = 2)
#> Validating input data.
#> Starting gradient boosting algorithm.
#> Gradient boosting for outcome 1 converged after 86 iterations!
#> Gradient boosting for outcome 3 converged after 99 iterations!
#> Gradient boosting for outcome 4 converged after 103 iterations!
#> Gradient boosting for outcome 2 converged after 113 iterations!
#> Performing inference on colocalization events.
#> Extracting colocalization results with pvalue_cutoff = 0.001, cos_npc_cutoff = 0.2, and npc_outcome_cutoff = 0.2.
#> Keep only CoS with cos_npc >= 0.2. For each CoS, keep the outcomes configurations that pvalue of variants for the outcome < 0.001 and npc_outcome >0.2.
colocboost_plot(res, plot_ucos = TRUE)

In this example, there are two colocalized sets (blue and orange) and two trait-specific sets for trait 4 only (green and purple). For comprehensive tutorials on result interpretation, please visit our tutorials portal at Interpret ColocBoost Output.
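The mixed dataset in this section is assembled with c(X, list(X1)) and a column selected with drop = FALSE. A short base R sketch on toy matrices shows why both details matter. The matrices below are arbitrary placeholders and are not the package data.

```r
# Toy matrices standing in for genotype matrices (values are arbitrary).
X <- list(matrix(0, 2, 3), matrix(0, 2, 3), matrix(0, 2, 3))
X1 <- matrix(1, 2, 3)

# Wrapping X1 in list() appends it as a single fourth element.
combined <- c(X, list(X1))
length(combined)

# Without list(), c() turns every matrix entry into its own list element,
# so the result is no longer a list of matrices.
spliced <- c(X, X1)
length(spliced)

# drop = FALSE keeps a single selected column as a one-column matrix
# instead of collapsing it to a plain vector.
Y_full <- matrix(1:6, nrow = 3, ncol = 2)
Y1 <- Y_full[, 1, drop = FALSE]
dim(Y1)
```

Keeping every element a matrix means each outcome is supplied in the same form as in the original call.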

2.6. Plot with focal trait for disease-prioritized colocalization

There are three options available for plotting the results from disease-prioritized colocalization with a focal trait. The examples below use plot_focal_only = TRUE and plot_focal_cos_outcome_only = TRUE.

# Create a mixed dataset
data(Ind_5traits)
data(Sumstat_5traits) 
X <- Ind_5traits$X[1:3]
Y <- Ind_5traits$Y[1:3]
sumstat <- Sumstat_5traits$sumstat[4]
LD <- get_cormat(Ind_5traits$X[[1]])

# Run colocboost
res <- colocboost(X = X, Y = Y, 
                  sumstat = sumstat, LD = LD, 
                  focal_outcome_idx = 4)
#> Validating input data.
#> Starting gradient boosting algorithm.
#> Gradient boosting for focal outcome 4 converged after 25 iterations!
#> Gradient boosting for outcome 1 converged after 45 iterations!
#> Gradient boosting for outcome 3 converged after 66 iterations!
#> Gradient boosting for outcome 2 converged after 77 iterations!
#> Performing inference on colocalization events.
#> Extracting colocalization results with pvalue_cutoff = 0.001, cos_npc_cutoff = 0.2, and npc_outcome_cutoff = 0.2.
#> Keep only CoS with cos_npc >= 0.2. For each CoS, keep the outcomes configurations that pvalue of variants for the outcome < 0.001 and npc_outcome >0.2.

# Only plot CoS with focal trait
colocboost_plot(res, plot_focal_only = TRUE)

# Plot all CoS including at least one trait colocalized with the focal trait
colocboost_plot(res, plot_focal_cos_outcome_only = TRUE)