---
title: "Comparison with other packages"
author: "Johan Larsson"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Comparison with other packages}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
bibliography: eulerr.bib
---

```{r}
#| label: setup
#| include: false

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

library(eulerr)

# The benchmark numbers are precomputed by `data-raw/benchmarks.R` (which needs
# the competitor packages and is not part of the package build) and stored in
# `benchmark_results.rds`. This vignette only reads and displays them, so it
# builds with eulerr, knitr, and lattice alone.
results_path <- "benchmark_results.rds"
have_results <- file.exists(results_path)
if (have_results) {
  results <- readRDS(results_path)
}
```

This vignette compares **eulerr** with the other R packages that genuinely fit
*area-proportional* Euler and Venn diagrams, both quantitatively (accuracy and
speed) and qualitatively (features and scope).

## What counts as a competitor

An area-proportional diagram is one in which each region's *area* is made
proportional to the quantity it represents. Surprisingly few R packages actually
solve this geometric fitting problem. Most Venn/Euler packages draw fixed,
schematic shapes and encode quantities through labels or color instead of area.
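
For two sets the fitting problem can be solved exactly, which makes it a useful illustration of what a fitter has to do. The sketch below is not eulerr's implementation. It assumes hypothetical set sizes (`set_a`, `set_b`, `overlap`) and a helper named `lens_area()` introduced here, and uses base R's `uniroot()` to find the centre distance at which the overlap of two circles has the required area.

```r
# Area of the lens formed by two circles with radii r1 and r2 whose centres
# are a distance d apart.
lens_area <- function(d, r1, r2) {
  if (d >= r1 + r2) return(0)                          # disjoint circles
  if (d <= abs(r1 - r2)) return(pi * min(r1, r2)^2)    # one inside the other
  a1 <- r1^2 * acos((d^2 + r1^2 - r2^2) / (2 * d * r1))
  a2 <- r2^2 * acos((d^2 + r2^2 - r1^2) / (2 * d * r2))
  a3 <- 0.5 * sqrt((-d + r1 + r2) * (d + r1 - r2) * (d - r1 + r2) * (d + r1 + r2))
  a1 + a2 - a3
}

# Hypothetical input: total set sizes and the size of their intersection.
set_a <- 10
set_b <- 6
overlap <- 3

# Circle areas equal the set sizes, so the radii follow directly.
r <- sqrt(c(set_a, set_b) / pi)

# Solve for the centre distance that yields the target overlap area.
d <- uniroot(
  function(d) lens_area(d, r[1], r[2]) - overlap,
  lower = abs(r[1] - r[2]),
  upper = sum(r),
  tol = 1e-10
)$root

c(radius_a = r[1], radius_b = r[2], distance = d)
```

With three or more sets a closed-form route like this is generally not available, so the fit has to be found by numerical optimization and is usually only approximate.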

The packages that really fit area-proportional diagrams, available from CRAN or
Bioconductor unless noted otherwise, are:

- **eulerr** --- circles, ellipses, or axis-aligned rectangles/squares via
  numerical optimization; reports `stress` and `diagError` goodness-of-fit
  statistics.
- **venneuler** --- Wilkinson's circle algorithm [@wilkinson_exact_2012];
  circles only; Java-based (depends on rJava).
- **nVennR** --- the nVenn algorithm [@Perez-Silva_2018]; quasi-proportional,
  n-dimensional diagrams built from irregular polygons. Distributed via GitHub.
- **BioVenn** --- accurate 2--3 circle diagrams from identifier lists
  [@Hulsen_2021].
- **vennplot** --- 2D circles and 3D spheres for 2--3 sets [@Xu_2017]; currently
  dormant on CRAN.
- **VennDiagram** --- a drawing package whose *scaled* mode is genuinely
  area-proportional for two circles only [@Chen_2011].

The [last section](#excluded-packages) lists the many packages that are *not*
area-proportional fitters and explains why they are excluded.

## How accuracy is measured

Comparing fitters fairly is subtle because they optimize different objectives. A
package that minimizes overall squared error will look bad if scored on the
single worst region, and vice versa. We therefore compare **per objective**: for
each competitor we configure eulerr to optimize the *same* objective and score
both packages on the *same* metric.

To make the metric identical across packages, we ignore each package's
self-reported diagnostics and instead recompute the fit from its *realized
geometry*. We turn the fitted shapes into polygons, compute the area of every
disjoint region with `polyclip`, and evaluate eulerr's own statistics on those
areas:

- **stress** --- the normalized residual sum of squares from venneuler,
  $\sum_i (A_i - \beta\omega_i)^2 \big/ \sum_i A_i^2$, with
  $\beta = \sum_i A_i\omega_i \big/ \sum_i \omega_i^2$.
- **diagError** --- the largest absolute difference between a region's realized
  and target *proportion*, $\max_i |A_i/\sum_k A_k - \omega_i/\sum_k \omega_k|$
  [from eulerAPE, @Micallef_2014].

Here $\omega_i$ is the input size of region $i$ and $A_i$ its realized area.
Both statistics are scale-invariant, so packages that work in different
coordinate systems remain comparable. We benchmark on a mix of small hand-built
configurations and harder cases derived from eulerr's bundled `fruits` and
`organisms` datasets, plus a five-set example.
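
As a rough illustration of this recomputation (not the code in `data-raw/benchmarks.R`), the sketch below polygonizes two hypothetical unit circles, extracts the three disjoint regions with `polyclip::polyclip()`, and evaluates both statistics against made-up target sizes. The helper names `circle_polygon()`, `polygon_area()`, `stress()`, and `diag_error()` are assumptions introduced here for the example.

```r
library(polyclip)

# Approximate a circle (centre h, k; radius r) by a regular n-gon.
circle_polygon <- function(h, k, r, n = 500) {
  theta <- seq(0, 2 * pi, length.out = n + 1)[-1]
  list(x = h + r * cos(theta), y = k + r * sin(theta))
}

# Total area of the polygons returned by polyclip() (a list of list(x, y)),
# using the shoelace formula. Assumes the pieces contain no holes.
polygon_area <- function(polys) {
  one <- function(p) {
    j <- c(seq_along(p$x)[-1], 1)
    abs(sum(p$x * p$y[j] - p$x[j] * p$y)) / 2
  }
  sum(vapply(polys, one, numeric(1)))
}

# stress and diagError as defined in the text.
stress <- function(area, omega) {
  beta <- sum(area * omega) / sum(omega^2)
  sum((area - beta * omega)^2) / sum(area^2)
}

diag_error <- function(area, omega) {
  max(abs(area / sum(area) - omega / sum(omega)))
}

# Two overlapping unit circles standing in for a fitted diagram.
a <- list(circle_polygon(0, 0, 1))
b <- list(circle_polygon(1, 0, 1))

areas <- c(
  "A"   = polygon_area(polyclip(a, b, op = "minus")),
  "B"   = polygon_area(polyclip(b, a, op = "minus")),
  "A&B" = polygon_area(polyclip(a, b, op = "intersection"))
)

# Hypothetical target sizes for the same three disjoint regions.
omega <- c("A" = 3, "B" = 3, "A&B" = 2)

c(stress = stress(areas, omega), diagError = diag_error(areas, omega))
```

Because both functions depend only on the vector of region areas, the same two calls can score any package once its shapes have been reduced to polygons.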

The three comparisons are:

  | Comparison | Competitor optimizes         | eulerr setting                 | Scored on |
  | ---------- | ---------------------------- | ------------------------------ | --------- |
  | A          | overall fit (stress)         | `loss = "stress"`, circles     | stress    |
  | B          | region proportionality       | `loss = "diag_error"`, circles | diagError |
  | C          | overlap areas (2--3 circles) | `loss = "stress"`, circles     | stress    |

In each comparison eulerr's **ellipse** fit is included as a reference
(`eulerr (ellipse)`): ellipses have more degrees of freedom than circles and
typically fit better, but the head-to-head against each circle-based competitor
uses eulerr's circle mode.
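
A minimal sketch of the circle-versus-ellipse reference, assuming a hypothetical three-set input: it uses only the `shape` argument of `euler()` and the `stress` and `diagError` components of the fitted object, and leaves every other setting (including the loss function) at its default.

```r
library(eulerr)

# Hypothetical sizes of the disjoint regions, in eulerr's "A&B" notation.
sizes <- c(
  "A" = 10, "B" = 8, "C" = 6,
  "A&B" = 4, "A&C" = 3, "B&C" = 2,
  "A&B&C" = 1
)

set.seed(1)  # in case the optimizer relies on random starting values
fit_circle <- euler(sizes, shape = "circle")
fit_ellipse <- euler(sizes, shape = "ellipse")

# Side-by-side goodness of fit for the two shapes.
data.frame(
  shape = c("circle", "ellipse"),
  stress = c(fit_circle$stress, fit_ellipse$stress),
  diagError = c(fit_circle$diagError, fit_ellipse$diagError)
)
```

The values printed here are eulerr's self-reported diagnostics; the benchmark instead recomputes them from the realized geometry so that every package is scored identically.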

```{r}
#| label: helpers
#| include: false

# Reshape a long accuracy/timing slice to a dataset-by-package matrix, dropping
# competitor columns that were never run (all NA) so the scaffolded state (only
# eulerr available) still produces a clean table.
to_matrix <- function(df, value_col) {
  m <- tapply(
    df[[value_col]],
    list(df$dataset, df$package),
    FUN = function(x) x[1]
  )
  m <- m[, colSums(!is.na(m)) > 0, drop = FALSE]
  # Order datasets as first seen, eulerr columns first.
  m <- m[unique(df$dataset), , drop = FALSE]
  pkg_order <- order(!grepl("^eulerr", colnames(m)), colnames(m))
  m[, pkg_order, drop = FALSE]
}

fmt <- function(x) {
  ifelse(is.na(x), "---", formatC(x, format = "g", digits = 2))
}
```

```{r}
#| label: no-results
#| echo: false
#| results: asis

if (!have_results) {
  cat(
    "> **Benchmark results are not available.** Run",
    "`Rscript data-raw/benchmarks.R` to generate `benchmark_results.rds`,",
    "then rebuild this vignette.\n"
  )
}
```

## Accuracy

```{r}
#| label: accuracy-note
#| echo: false
#| results: asis
#| eval: !expr have_results

run <- results$meta$competitors_run
if (length(run) == 0) {
  cat(
    "> The stored results were generated **without any competitor package",
    "installed**, so only eulerr's own numbers are shown below. Install",
    "venneuler, BioVenn, and nVennR and re-run `data-raw/benchmarks.R` for the",
    "full comparison.\n"
  )
} else {
  cat(
    "> Competitors benchmarked:",
    paste(run, collapse = ", "),
    paste0("(generated ", results$meta$generated, ").\n")
  )
}
```

```{r}
#| label: accuracy-tables
#| echo: false
#| results: asis
#| eval: !expr have_results

acc <- results$accuracy
for (cmp in unique(acc$comparison)) {
  sub <- acc[acc$comparison == cmp, ]
  metric <- sub$metric[1]
  m <- to_matrix(sub, "value")
  cat("\n**", cmp, "** --- lower ", metric, " is better.\n\n", sep = "")
  print(knitr::kable(
    apply(m, 2, fmt),
    align = "r",
    caption = NULL
  ))
  cat("\n")
}
```

```{r}
#| label: accuracy-plot
#| echo: false
#| fig-width: 7
#| fig-height: 3
#| fig-cap: "Fit error by dataset and package, faceted by comparison. Lower is better. Each competitor is matched against eulerr configured for the same objective."
#| eval: !expr have_results

library(lattice)

acc <- results$accuracy
acc$package <- factor(acc$package, levels = unique(acc$package))

lattice::barchart(
  value ~ dataset | comparison,
  groups = package,
  data = acc,
  scales = list(x = list(rot = 45), y = list(relation = "free")),
  layout = c(3, 1),
  ylab = "fit error",
  auto.key = list(columns = 1, space = "right"),
  par.settings = lattice::simpleTheme(col = eulerr_options()$fills$fill(8))
)
```

## Speed

Runtimes are the median wall-clock time of repeated fits on the machine that
generated the results (see `sessionInfo` in the stored `meta`). They are
indicative rather than definitive --- venneuler pays a fixed JVM cost, and
absolute numbers depend on hardware --- but they show the broad picture.
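
The timing procedure can be sketched in base R as follows. This is an assumed, simplified version rather than the benchmark script itself: `median_ms()` is a hypothetical helper, and a small `optim()` problem stands in for a diagram fit.

```r
# Median elapsed time, in milliseconds, of `reps` calls to a zero-argument
# function `f`.
median_ms <- function(f, reps = 11) {
  elapsed <- vapply(
    seq_len(reps),
    function(i) system.time(f())[["elapsed"]],
    numeric(1)
  )
  stats::median(elapsed) * 1000
}

# Stand-in for a model fit: minimize the Rosenbrock function with optim().
rosenbrock <- function(p) (1 - p[1])^2 + 100 * (p[2] - p[1]^2)^2
median_ms(function() optim(c(-1.2, 1), rosenbrock))
```

To time a diagram fit, pass a closure such as `function() eulerr::euler(sizes)` instead. `system.time()` is too coarse for very fast fits, so a dedicated timing package is a better choice when runtimes fall well below a millisecond.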

```{r}
#| label: timing-table
#| echo: false
#| results: asis
#| eval: !expr have_results

tim <- results$timing
# Reshape to a dataset-by-package matrix. Where a dataset/package pair occurs
# more than once, to_matrix() keeps the first value (no averaging is done).
m <- to_matrix(tim, "time_ms")
cat("Median fit time (ms), by dataset and package:\n\n")
knitr::kable(
  apply(m, 2, function(x) {
    # Show missing timings as "---", matching the accuracy tables.
    ifelse(is.na(x), "---", formatC(x, format = "f", digits = 1))
  }),
  align = "r"
)
```

```{r}
#| label: timing-plot
#| echo: false
#| fig-width: 7
#| fig-height: 3
#| fig-cap: "Median fit time (ms, log scale) by dataset and package."
#| eval: !expr have_results

tim <- results$timing
tim$package <- factor(tim$package, levels = unique(tim$package))

lattice::barchart(
  time_ms ~ dataset,
  groups = package,
  data = tim[!duplicated(tim[c("dataset", "package")]), ],
  scales = list(x = list(rot = 45), y = list(log = 10)),
  ylab = "median time (ms)",
  auto.key = list(columns = 1, space = "right"),
  par.settings = lattice::simpleTheme(col = eulerr_options()$fills$fill(8))
)
```

## Feature comparison

Accuracy and speed are only part of the story. The table below summarizes how
the packages differ in scope and capabilities.

```{r}
#| label: qualitative-table
#| echo: false

qual <- data.frame(
  Package = c(
    "eulerr",
    "venneuler",
    "nVennR",
    "BioVenn",
    "vennplot",
    "VennDiagram"
  ),
  Shapes = c(
    "circle, ellipse, rectangle, square",
    "circle",
    "irregular polygon",
    "circle",
    "circle (2D), sphere (3D)",
    "circle"
  ),
  `Max sets` = c("many", "many", "many", "2--3", "2--3", "5 (2 scaled)"),
  Proportional = c(
    "approximate",
    "approximate",
    "quasi",
    "accurate (2--3)",
    "approximate",
    "2 sets only"
  ),
  Input = c(
    "vectors, data frames, matrices, tables, lists",
    "named vector, data frame",
    "lists",
    "ID lists",
    "counts, lists",
    "lists, counts"
  ),
  `Fit reported` = c(
    "stress, diagError, regionError",
    "stress",
    "none",
    "none",
    "none",
    "none"
  ),
  `Key dependency` = c(
    "none (Rust)",
    "rJava / Java",
    "C++",
    "none",
    "Rcpp, rgl",
    "none"
  ),
  Source = c(
    "CRAN",
    "CRAN",
    "GitHub",
    "CRAN",
    "CRAN (dormant)",
    "CRAN"
  ),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

knitr::kable(qual, align = "l")
```

A few points worth drawing out:

- **eulerr is the only package that fits ellipses** (and rectangles/squares).
  Ellipses can represent exactly many three-set configurations that circles
  cannot, which is why eulerr's ellipse fits tend to score better than the
  circle-based packages on the harder datasets above.
- **eulerr and venneuler are the only packages that report a numeric
  goodness-of-fit.** Without one, it is impossible to know how much a diagram
  distorts the data --- and area-proportional diagrams with three or more sets
  almost always distort it to some degree.
- **eulerr has no system dependencies beyond a Rust toolchain at build time**,
  whereas venneuler requires a working Java installation through rJava, which is
  a common source of installation trouble.
- **nVennR achieves closer proportionality only by abandoning smooth shapes**
  for irregular polygons. This is the central trade-off: smooth, interpretable
  shapes (eulerr, venneuler) versus exact topology with jagged regions (nVennR).

## Excluded packages {#excluded-packages}

Many widely cited Venn/Euler packages are *not* area-proportional fitters and
are therefore outside the scope of this comparison.

```{r}
#| label: excluded-table
#| echo: false

excl <- data.frame(
  Package = c(
    "venn, ggVennDiagram, ggvenn, RVenn, gplots",
    "UpSetR",
    "colorfulVennPlot",
    "VennMaster",
    "Vennerable"
  ),
  Reason = c(
    "Draw fixed, schematic shapes; quantity shown via labels or color, not area",
    "Not a Venn/Euler diagram at all (UpSet matrix/bar charts)",
    "Archived from CRAN; only a 2-set helper, never a general fitter",
    "Area-proportional, but a standalone Java application --- not an R package",
    "Area-weighted, but hosted on R-Forge/GitHub, not CRAN or Bioconductor"
  ),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

knitr::kable(excl, align = "l")
```

## Caveats

- **"Area-proportional" is a spectrum.** For three or more sets, exact solutions
  with circles or ellipses are often geometrically impossible, so eulerr and
  venneuler produce *approximate* diagrams; nVennR reaches closer
  proportionality only by giving up smooth shapes. Any honest comparison should
  report fit error, not merely whether a diagram was produced --- which is
  exactly what the accuracy section above does.
- **Self-description bias.** The BioVenn documentation claims it is "the only R
  package that can automatically generate an accurate area-proportional Venn
  diagram," which is not literally true; eulerr, venneuler, and vennplot also
  produce accurate area-proportional circle diagrams.
- **Reproducibility.** The numbers here come from one machine and one set of
  package versions. To regenerate them, install the competitors (`venneuler` and
  `BioVenn` from CRAN, `nVennR` from GitHub) and run `data-raw/benchmarks.R`.
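
Before regenerating the numbers it can help to confirm which competitors are actually installed. The snippet below is a small base-R sketch for that check; it is not part of `data-raw/benchmarks.R`, and the variable names are illustrative.

```r
competitors <- c("venneuler", "BioVenn", "nVennR")

# TRUE for each competitor that can be loaded from the current library.
installed <- vapply(competitors, requireNamespace, logical(1), quietly = TRUE)

missing_pkgs <- competitors[!installed]
if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
}
```

Any package reported as missing will be absent from the stored results, and the tables above then show only the packages that were run.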

## References