---
title: "Comparison with other packages"
author: "Johan Larsson"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Comparison with other packages}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
bibliography: eulerr.bib
---

```{r}
#| label: setup
#| include: false
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

library(eulerr)

# The benchmark numbers are precomputed by `data-raw/benchmarks.R` (which needs
# the competitor packages and is not part of the package build) and stored in
# `benchmark_results.rds`. This vignette only reads and displays them, so it
# builds with eulerr, knitr, and lattice alone.
results_path <- "benchmark_results.rds"
have_results <- file.exists(results_path)
if (have_results) {
  results <- readRDS(results_path)
}
```

This vignette compares **eulerr** with the other R packages that genuinely fit
*area-proportional* Euler and Venn diagrams, both quantitatively (accuracy and
speed) and qualitatively (features and scope).

## What counts as a competitor

An area-proportional diagram is one in which each region's *area* is made
proportional to the quantity it represents. Surprisingly few R packages
actually solve this geometric fitting problem. Most Venn/Euler packages draw
fixed, schematic shapes and encode quantities through labels or color instead
of area.

On CRAN and Bioconductor, the packages that really fit area-proportional
diagrams are:

- **eulerr** --- circles, ellipses, or axis-aligned rectangles/squares via
  numerical optimization; reports `stress` and `diagError` goodness-of-fit
  statistics.
- **venneuler** --- Wilkinson's circle algorithm [@wilkinson_exact_2012];
  circles only; Java-based (depends on rJava).
- **nVennR** --- the nVenn algorithm [@Perez-Silva_2018]; quasi-proportional,
  n-dimensional diagrams built from irregular polygons. Distributed via GitHub.
- **BioVenn** --- accurate 2--3 circle diagrams from identifier lists
  [@Hulsen_2021].
- **vennplot** --- 2D circles and 3D spheres for 2--3 sets [@Xu_2017];
  currently dormant on CRAN.
- **VennDiagram** --- a drawing package whose *scaled* mode is genuinely
  area-proportional for two circles only [@Chen_2011].

The [last section](#excluded-packages) lists the many packages that are *not*
area-proportional fitters and explains why they are excluded.

## How accuracy is measured

Comparing fitters fairly is subtle because they optimize different objectives.
A package that minimizes overall squared error will look bad if scored on the
single worst region, and vice versa. We therefore compare **per objective**:
for each competitor we configure eulerr to optimize the *same* objective and
score both packages on the *same* metric.

To make the metric identical across packages, we ignore each package's
self-reported diagnostics and instead recompute the fit from its *realized
geometry*. We turn the fitted shapes into polygons, compute the area of every
disjoint region with `polyclip`, and evaluate eulerr's own statistics on those
areas:

- **stress** --- the normalized residual sum of squares from venneuler,
  $\sum_i (A_i - \beta\omega_i)^2 \big/ \sum_i A_i^2$, with
  $\beta = \sum_i A_i\omega_i \big/ \sum_i \omega_i^2$.
- **diagError** --- the largest absolute difference between a region's
  realized and target *proportion*,
  $\max_i |A_i/\sum_k A_k - \omega_i/\sum_k \omega_k|$ (from eulerAPE,
  [@Micallef_2014]).

Here $\omega_i$ is the input size of region $i$ and $A_i$ its realized area.
Both statistics are scale-invariant, so packages that work in different
coordinate systems remain comparable.

We benchmark on a mix of small hand-built configurations and harder cases
derived from eulerr's bundled `fruits` and `organisms` datasets, plus a
five-set example.

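
As an illustration of the two statistics defined above, the following base R
sketch scores a fit from a vector of realized region areas and a vector of
target sizes. The helper names `stress_from_areas()` and
`diag_error_from_areas()` and the toy numbers are hypothetical: they are not
part of eulerr's API, and the actual benchmark derives the areas from polygons
in `data-raw/benchmarks.R`.

```r
# Hypothetical helpers (not eulerr functions): score a fit given realized
# region areas `A` and target region sizes `omega`, in the same order.
stress_from_areas <- function(A, omega) {
  # Least-squares scale factor for a regression of A on omega through the origin.
  beta <- sum(A * omega) / sum(omega^2)
  sum((A - beta * omega)^2) / sum(A^2)
}

diag_error_from_areas <- function(A, omega) {
  # Largest gap between realized and target proportions.
  max(abs(A / sum(A) - omega / sum(omega)))
}

# Toy input: three disjoint regions of a two-set diagram (made-up numbers).
omega <- c(A = 10, B = 5, "A&B" = 3)

# A perfect fit expressed in a different unit system: both statistics are zero
# because they are scale-invariant.
exact <- 2.5 * omega
stress_from_areas(exact, omega)
diag_error_from_areas(exact, omega)

# An imperfect fit: the intersection is drawn too large.
off <- c(A = 10, B = 5, "A&B" = 4)
stress_from_areas(off, omega)
diag_error_from_areas(off, omega)
```

Rescaling the realized areas leaves both values unchanged, which is what allows
packages with different coordinate systems to be scored side by side.
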

The three comparisons are:

| Comparison | Competitor optimizes         | eulerr setting                 | Scored on |
| ---------- | ---------------------------- | ------------------------------ | --------- |
| A          | overall fit (stress)         | `loss = "stress"`, circles     | stress    |
| B          | region proportionality       | `loss = "diag_error"`, circles | diagError |
| C          | overlap areas (2--3 circles) | `loss = "stress"`, circles     | stress    |

In each comparison eulerr's default **ellipse** fit is included as a reference
(`eulerr (ellipse)`): ellipses have more degrees of freedom than circles and
typically fit better, but the head-to-head against each circle-based
competitor uses eulerr's circle mode.

```{r}
#| label: helpers
#| include: false
# Reshape a long accuracy/timing slice to a dataset-by-package matrix, dropping
# competitor columns that were never run (all NA) so the scaffolded state (only
# eulerr available) still produces a clean table.
to_matrix <- function(df, value_col) {
  m <- tapply(
    df[[value_col]],
    list(df$dataset, df$package),
    FUN = function(x) x[1]
  )
  m <- m[, colSums(!is.na(m)) > 0, drop = FALSE]
  # Order datasets as first seen, eulerr columns first. as.character() keeps
  # the row lookup by name even if `dataset` is stored as a factor.
  m <- m[as.character(unique(df$dataset)), , drop = FALSE]
  pkg_order <- order(!grepl("^eulerr", colnames(m)), colnames(m))
  m[, pkg_order, drop = FALSE]
}

# Format numbers to two significant digits; show missing values as "---".
fmt <- function(x) {
  ifelse(is.na(x), "---", formatC(x, format = "g", digits = 2))
}
```

```{r}
#| label: no-results
#| echo: false
#| results: asis
if (!have_results) {
  cat(
    "> **Benchmark results are not available.** Run",
    "`Rscript data-raw/benchmarks.R` to generate `benchmark_results.rds`,",
    "then rebuild this vignette.\n"
  )
}
```

## Accuracy

```{r}
#| label: accuracy-note
#| echo: false
#| results: asis
#| eval: !expr have_results
run <- results$meta$competitors_run
if (length(run) == 0) {
  cat(
    "> The stored results were generated **without any competitor package",
    "installed**, so only eulerr's own numbers are shown below. Install",
    "venneuler, BioVenn, and nVennR and re-run `data-raw/benchmarks.R` for the",
    "full comparison.\n"
  )
} else {
  cat(
    "> Competitors benchmarked:",
    paste(run, collapse = ", "),
    paste0("(generated ", results$meta$generated, ").\n")
  )
}
```

```{r}
#| label: accuracy-tables
#| echo: false
#| results: asis
#| eval: !expr have_results
acc <- results$accuracy
for (cmp in unique(acc$comparison)) {
  sub <- acc[acc$comparison == cmp, ]
  metric <- sub$metric[1]
  m <- to_matrix(sub, "value")
  cat("\n**", cmp, "** --- lower ", metric, " is better.\n\n", sep = "")
  print(knitr::kable(
    apply(m, 2, fmt),
    align = "r",
    caption = NULL
  ))
  cat("\n")
}
```

```{r}
#| label: accuracy-plot
#| echo: false
#| fig-width: 7
#| fig-height: 3
#| fig-cap: "Fit error by dataset and package, faceted by comparison. Lower is better. Each competitor is matched against eulerr configured for the same objective."
#| eval: !expr have_results
library(lattice)

acc <- results$accuracy
acc$package <- factor(acc$package, levels = unique(acc$package))

lattice::barchart(
  value ~ dataset | comparison,
  groups = package,
  data = acc,
  scales = list(x = list(rot = 45), y = list(relation = "free")),
  layout = c(3, 1),
  ylab = "fit error",
  auto.key = list(columns = 1, space = "right"),
  par.settings = lattice::simpleTheme(col = eulerr_options()$fills$fill(8))
)
```

## Speed

Runtimes are the median wall-clock time of repeated fits on the machine that
generated the results (see `sessionInfo` in the stored `meta`). They are
indicative rather than definitive --- venneuler pays a fixed JVM cost, and
absolute numbers depend on hardware --- but they show the broad picture.

```{r}
#| label: timing-table
#| echo: false
#| results: asis
#| eval: !expr have_results
tim <- results$timing
# Reshape the timings to a dataset-by-package matrix for a compact view.
m <- to_matrix(tim, "time_ms")
cat("Median fit time (ms), by dataset and package:\n\n")
knitr::kable(
  apply(m, 2, function(x) formatC(x, format = "f", digits = 1)),
  align = "r"
)
```

```{r}
#| label: timing-plot
#| echo: false
#| fig-width: 7
#| fig-height: 3
#| fig-cap: "Median fit time (ms, log scale) by dataset and package."
#| eval: !expr have_results
tim <- results$timing
tim$package <- factor(tim$package, levels = unique(tim$package))

lattice::barchart(
  time_ms ~ dataset,
  groups = package,
  # Keep one row per dataset/package pair so each bar is drawn once.
  data = tim[!duplicated(tim[c("dataset", "package")]), ],
  scales = list(x = list(rot = 45), y = list(log = 10)),
  ylab = "median time (ms)",
  auto.key = list(columns = 1, space = "right"),
  par.settings = lattice::simpleTheme(col = eulerr_options()$fills$fill(8))
)
```

## Feature comparison

Accuracy and speed are only part of the story. The table below summarizes how
the packages differ in scope and capabilities.

```{r}
#| label: qualitative-table
#| echo: false
qual <- data.frame(
  Package = c(
    "eulerr",
    "venneuler",
    "nVennR",
    "BioVenn",
    "vennplot",
    "VennDiagram"
  ),
  Shapes = c(
    "circle, ellipse, rectangle, square",
    "circle",
    "irregular polygon",
    "circle",
    "circle (2D), sphere (3D)",
    "circle"
  ),
  `Max sets` = c("many", "many", "many", "2--3", "2--3", "4 (2 scaled)"),
  Proportional = c(
    "approximate",
    "approximate",
    "quasi",
    "accurate (2--3)",
    "approximate",
    "2 sets only"
  ),
  Input = c(
    "vectors, data frames, matrices, tables, lists",
    "named vector, data frame",
    "lists",
    "ID lists",
    "counts, lists",
    "counts"
  ),
  `Fit reported` = c(
    "stress, diagError, regionError",
    "stress",
    "none",
    "none",
    "none",
    "none"
  ),
  `Key dependency` = c(
    "none (Rust)",
    "rJava / Java",
    "C++",
    "none",
    "Rcpp, rgl",
    "none"
  ),
  Source = c(
    "CRAN",
    "CRAN",
    "GitHub",
    "CRAN",
    "CRAN (dormant)",
    "CRAN"
  ),
  check.names = FALSE,
  stringsAsFactors = FALSE
)
knitr::kable(qual, align = "l")
```

A few points worth drawing out:

- **eulerr is the only package that fits ellipses** (and rectangles/squares).
  Ellipses can exactly represent many three-set configurations that circles
  cannot, which is why eulerr's ellipse fits typically score better than the
  circle-based packages on the harder datasets above.
- **eulerr and venneuler are the only packages that report a numeric
  goodness-of-fit.** Without one, it is impossible to know how much a diagram
  distorts the data --- and area-proportional diagrams with three or more sets
  almost always distort it to some degree.
- **eulerr has no system dependencies beyond a Rust toolchain at build time**,
  whereas venneuler requires a working Java installation through rJava, which
  is a common source of installation trouble.
- **nVennR achieves closer proportionality only by abandoning smooth shapes**
  for irregular polygons. This is the central trade-off: smooth, interpretable
  shapes (eulerr, venneuler) versus exact topology with jagged regions
  (nVennR).

## Excluded packages {#excluded-packages}

Many widely cited Venn/Euler packages are *not* area-proportional fitters and
are therefore outside the scope of this comparison.

```{r}
#| label: excluded-table
#| echo: false
excl <- data.frame(
  Package = c(
    "venn, ggVennDiagram, ggvenn, RVenn, gplots",
    "UpSetR",
    "colorfulVennPlot",
    "VennMaster",
    "Vennerable"
  ),
  Reason = c(
    "Draw fixed, schematic shapes; quantity shown via labels or color, not area",
    "Not a Venn/Euler diagram at all (UpSet matrix/bar charts)",
    "Archived from CRAN; only a 2-set helper, never a general fitter",
    "Area-proportional, but a standalone Java application --- not an R package",
    "Area-weighted, but hosted on R-Forge/GitHub, not CRAN or Bioconductor"
  ),
  check.names = FALSE,
  stringsAsFactors = FALSE
)
knitr::kable(excl, align = "l")
```

## Caveats

- **"Area-proportional" is a spectrum.** For three or more sets, exact
  solutions with circles or ellipses are often geometrically impossible, so
  eulerr and venneuler produce *approximate* diagrams; nVennR reaches closer
  proportionality only by giving up smooth shapes.
  Any honest comparison should report fit error, not merely whether a diagram
  was produced --- which is exactly what the accuracy section above does.
- **Self-description bias.** The BioVenn documentation claims it is "the only
  R package that can automatically generate an accurate area-proportional Venn
  diagram," which is not literally true; eulerr, venneuler, and vennplot also
  produce accurate area-proportional circle diagrams.
- **Reproducibility.** The numbers here come from one machine and one set of
  package versions. To regenerate them, install the competitors (`venneuler`
  and `BioVenn` from CRAN, `nVennR` from GitHub) and run
  `data-raw/benchmarks.R`.

## References