Bas Ketsman ; Dan Suciu ; Yufei Tao - A Near-Optimal Parallel Algorithm for Joining Binary Relations

lmcs:6944 - Logical Methods in Computer Science, May 5, 2022, Volume 18, Issue 2 - https://doi.org/10.46298/lmcs-18(2:6)2022
A Near-Optimal Parallel Algorithm for Joining Binary Relations

Authors: Bas Ketsman; Dan Suciu; Yufei Tao

    We present a constant-round algorithm in the massively parallel computation (MPC) model for evaluating a natural join where every input relation has two attributes. Our algorithm achieves a load of $\tilde{O}(m/p^{1/\rho})$, where $m$ is the total size of the input relations, $p$ is the number of machines, $\rho$ is the join's fractional edge covering number, and $\tilde{O}(\cdot)$ hides a polylogarithmic factor. The load matches a known lower bound up to a polylogarithmic factor. At the core of the proposed algorithm is a new theorem (which we name the "isolated cartesian product theorem") that provides fresh insight into the problem's mathematical structure. Our result implies that the subgraph enumeration problem, where the goal is to report all the occurrences of a constant-sized subgraph pattern, can be settled optimally (up to a polylogarithmic factor) in the MPC model.
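
    To make the load bound concrete, the following is a minimal sketch (not the paper's algorithm) that computes the fractional edge covering number $\rho$ of a small join graph and evaluates $m/p^{1/\rho}$ with the polylogarithmic factor ignored. The function names `fractional_edge_cover_number` and `load_bound` are hypothetical. The brute-force search assumes that, for graphs, an optimal fractional edge cover exists with weights in {0, 1/2, 1}; it is only practical for patterns with a handful of edges.

```python
from fractions import Fraction
from itertools import product


def fractional_edge_cover_number(edges):
    """Minimum total edge weight such that every vertex receives weight >= 1.

    Assumption: searching weights in {0, 1/2, 1} is enough for graphs
    (binary relations). Runs in time 3^len(edges), so use small patterns only.
    """
    vertices = sorted({v for e in edges for v in e})
    candidates = (Fraction(0), Fraction(1, 2), Fraction(1))
    best = None
    for weights in product(candidates, repeat=len(edges)):
        # A vertex is covered when its incident edges sum to at least 1.
        covered = all(
            sum(w for w, e in zip(weights, edges) if v in e) >= 1
            for v in vertices
        )
        if covered:
            total = sum(weights)
            if best is None or total < best:
                best = total
    return best


def load_bound(m, p, rho):
    """Evaluate m / p^(1/rho), dropping the polylogarithmic factor."""
    return m / p ** (1 / float(rho))


# Triangle join R(A,B), S(B,C), T(A,C): one edge per binary relation.
triangle = [("A", "B"), ("B", "C"), ("A", "C")]
rho_triangle = fractional_edge_cover_number(triangle)

# Path join R(A,B), S(B,C), T(C,D).
path = [("A", "B"), ("B", "C"), ("C", "D")]
rho_path = fractional_edge_cover_number(path)

print("triangle rho:", rho_triangle, "load:", load_bound(1000, 8, rho_triangle))
print("path rho:", rho_path, "load:", load_bound(1000, 8, rho_path))
```

    For the triangle, each vertex touches two edges, so weight 1/2 on every edge is a cover of total weight 3/2; the path needs weight 1 on both end edges, giving 2. A larger $\rho$ means a smaller exponent $1/\rho$ and therefore a higher per-machine load for the same $m$ and $p$.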


    Volume: Volume 18, Issue 2
    Published on: May 5, 2022
    Accepted on: February 3, 2022
    Submitted on: December 1, 2020
    Keywords: Computer Science - Databases
    Funding:
      Source : OpenAIRE Graph
    • NSF-BSF: III: Small: Data Driven Schema; Funder: National Science Foundation; Code: 2109922
    • III:Small: Optimal Query Processing meets Information Theory: from Proofs to Algorithms; Funder: National Science Foundation; Code: 1907997
