## Algorithms & Complexity Seminar, MIT : Spring 2024
The Algorithms & Complexity Seminar will usually (unless otherwise stated) meet on Wednesdays from 4-5pm in 32-G575.

## Talks
**February 28, 2024**: **D Ellis Hershkowitz** (Brown University). [Time: 4-5pm, Room: 32-G575]
**Topic.** Polylogarithmic Universal Steiner Trees and Strong Sparse Partition Hierarchies

**Abstract.** This talk concerns universal Steiner trees (USTs). Informally, a UST is a single spanning tree that is a good approximation for every instance of Steiner tree. More formally, an α-approximate universal Steiner tree (UST) of a graph G for root vertex r is a spanning tree T such that, for any vertex subset S containing r, the cost of the minimal subtree of T connecting S is within an α factor of the cost of the minimum-cost subgraph of G connecting S. α-approximate USTs immediately give α-approximations for well-studied variants of Steiner tree such as online or oblivious Steiner tree. Sub-linear-approximate USTs are known, but neither the existence of, nor poly-time algorithms for computing, poly-logarithmic-approximate USTs were previously known. In this talk, I will discuss the first construction of poly-logarithmic-approximate USTs, which (nearly) exponentially improves the approximation guarantees of previous constructions and (nearly) matches logarithmic lower bounds. The result is based on new constructions of poly-logarithmic-quality graph hierarchies called strong sparse partition hierarchies, which may be interesting in their own right. Roughly, a strong sparse partition (i.e., each level of this hierarchy) is a vertex partition that provides deterministic guarantees on the number of parts that balls of appropriate radii intersect. Joint work with Costas Busch, Da Qi Chen, Arnold Filtser, Daniel Hathcock and Rajmohan Rajaraman.
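The α-approximation guarantee described in the abstract can be stated compactly as follows (the notation cost(·) for total edge weight is ours, not the speaker's):

```latex
\forall S \subseteq V(G) \text{ with } r \in S:\qquad
\mathrm{cost}\bigl(T[S]\bigr) \;\le\; \alpha \cdot \mathrm{cost}\bigl(\mathrm{OPT}_G(S)\bigr),
```

where T[S] denotes the minimal subtree of T connecting S and OPT_G(S) denotes the minimum-cost connected subgraph of G containing S.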
**March 6, 2024**: **Brian Zhang** (CMU). [Time: 4-5pm, Room: 32-G575]
**Topic.** Efficient Φ-Regret Minimization with Linear and Low-Degree Swap Deviations in Extensive-Form Games

**Abstract.** Recent breakthrough results by Dagan, Daskalakis, Fishelson and Golowich [2023] and Peng and Rubinstein [2023] established an efficient algorithm attaining at most \epsilon swap regret over extensive-form strategy spaces of dimension N in N^{\tilde O(1/\epsilon)} rounds. On the other extreme, Farina and Pipis [2023] developed an efficient algorithm for minimizing the weaker notion of linear-swap regret in \mathsf{poly}(N)/\epsilon^2 rounds. In this talk, we will discuss several new results that extend and expand upon these recent works. The first result is an interpretation of linear-swap deviations in extensive-form games. We show a connection between linear deviations and a generalization of communication deviations in which the player can make queries to a "mediator" who replies with action recommendations, and, critically, the player is not constrained to match the timing of the game as would be the case for communication deviations. We coin this latter set the untimed communication (UTC) deviations. We show that the UTC deviations coincide precisely with the linear deviations, and therefore that any player minimizing UTC regret also minimizes linear-swap regret. Next, we take a step toward bridging the gap between linear and swap deviations. We introduce the set of k-mediator deviations, which generalize the untimed communication deviations to the case of multiple mediators. We develop parameterized algorithms for minimizing the regret with respect to this set of deviations in N^{O(k)}/\epsilon^2 rounds. This closes the gap in the sense that k=1 recovers linear-swap regret, while k=N recovers swap regret. Moreover, by relating k-mediator deviations to low-degree polynomials, we show that regret minimization against degree-k polynomial swap deviations is achievable in N^{O(kd)^3}/\epsilon^2 rounds, where d is the depth of the game, assuming a constant branching factor. For a fixed degree k, this is polynomial for Bayesian games and quasipolynomial more broadly when d = \mathsf{polylog}(N), the usual balancedness assumption on the game tree. This talk covers joint work with Ioannis Anagnostides, Gabriele Farina, and Tuomas Sandholm.
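As background for the regret notions above: the simplest member of the Φ-regret family is external regret, which the classical multiplicative-weights method drives to o(T). A minimal sketch (the 2-action game, learning rate, and horizon are illustrative assumptions, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)
payoff = np.array([[1.0, 0.0], [0.0, 1.0]])  # illustrative 2-action matching game
T, eta = 2000, 0.05
w = np.ones(2)                # multiplicative weights over the learner's actions
cum = np.zeros(2)             # cumulative payoff of each fixed action in hindsight
earned = 0.0                  # cumulative payoff of the algorithm
for t in range(T):
    p = w / w.sum()                 # current mixed strategy
    opp = rng.integers(2)           # opponent plays uniformly at random
    gains = payoff[:, opp]          # payoff of each of our actions this round
    earned += p @ gains
    cum += gains
    w *= np.exp(eta * gains)        # multiplicative-weights update
external_regret = (cum.max() - earned) / T   # average regret vs. best fixed action
```

Swap regret strengthens this benchmark from "best fixed action" to "best action-swapping function", and linear-swap/Φ-regret interpolate between the two over extensive-form strategy spaces.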
**March 13, 2024**: **Ilya Volkovich** (Boston College). [Time: 4-5pm, Room: 32-G575]
**Topic.** Synergy between Circuit Obfuscation and Circuit Minimization

**Abstract.** We study close connections between Indistinguishability Obfuscation (IO) and the Minimum Circuit Size Problem (MCSP), and argue that algorithms for one of MCSP or IO would empower the other. Some of our main results are:

- If there exists a perfect (resp. imperfect) IO that is computationally secure against non-uniform polynomial-size circuits, then we obtain fixed-polynomial lower bounds against NP (resp. MA). In addition, computationally secure IO against non-uniform polynomial-size circuits implies super-polynomial lower bounds against NEXP.
- If MCSP is in BPP, then statistical security and computational security for IO are equivalent.
- If computationally secure perfect IO exists, then MCSP is in BPP iff NP = ZPP. As a consequence, we separate ZPEXP from BPP.

To the best of our knowledge, this is the first consequence of strong circuit lower bounds from the existence of an IO. The results are obtained via a construction of an optimal universal distinguisher, computable in randomized polynomial time with access to an MCSP oracle, that distinguishes any two circuit-samplable distributions with advantage equal to the statistical distance between the two distributions minus a negligible error term. This is our main technical contribution. As another application, we get a simple proof of the result of Allender and Das (Inf. Comput., 2017) that SZK is contained in BPP^MCSP.
**March 20, 2024**: **Kai Zheng** (MIT). [Time: 4-5pm, Room: 32-G575]
**Topic.** Near Optimal Alphabet-Soundness Tradeoff PCPs

**Abstract.** We show a nearly optimal alphabet-soundness tradeoff for NP-hardness of 2-Prover-1-Round Games (2P1R). More specifically, we show that for all \epsilon > 0 and all sufficiently large M, it is NP-hard to decide whether a 2P1R instance of alphabet size M has value nearly 1 or at most M^{-1+\epsilon}. 2P1R games are equivalent to 2-query PCPs, and are widely used in obtaining hardness-of-approximation results. As such, our result implies the following: 1) hardness of approximating Quadratic Programming within a factor of nearly log(n), 2) hardness of approximating d-bounded-degree 2-CSP within a factor of nearly d/2, and 3) improved hardness-of-approximation results for various k-vertex-connectivity problems. For the first two applications, our results nearly match the performance of the best known algorithms. Joint work with Dor Minzer.
**April 11, 2024**: **Vahab Mirrokni** (Google Research). [Time: 2-3pm, Room: 32-D507]
**Topic.** ML Efficiency for Large Models: Faster Transformers, Sparsity, and Beyond

**Abstract.** Scaling large models efficiently for faster training and inference is a fundamental challenge. In this talk, we present a number of algorithmic challenges and potential solutions, from theory to practice. First, we discuss data-efficiency and model-efficiency problems that can be formalized as subset selection problems. For model efficiency, we present sequential attention for feature selection and sparsification [ICLR'23, arXiv]. For data efficiency, we present a sensitivity sampling technique for improved quality and efficiency of the models. Furthermore, we discuss the intrinsic quadratic complexity of attention models as well as token generation. We first discuss HyperAttention, a technique to develop linear-time attention algorithms under mild assumptions [ICLR'24]. We then present PolySketchFormer, a technique to bypass the hardness results for sub-quadratic attention by applying sketching of polynomial functions [arXiv]. Finally, we show how to address the complexity of token generation via clustering techniques [arXiv].
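The quadratic bottleneck mentioned above comes from materializing the n×n softmax matrix. The generic "linear attention" trick, which is related to but not the same as the talk's HyperAttention or PolySketchFormer, replaces softmax with a feature map φ so the cost becomes linear in sequence length. A minimal sketch (feature map and sizes are our illustrative choices):

```python
import numpy as np

def softmax_attention(Q, K, V):
    """Standard attention: materializes an (n x n) score matrix."""
    S = Q @ K.T / np.sqrt(Q.shape[1])
    P = np.exp(S - S.max(axis=1, keepdims=True))
    P /= P.sum(axis=1, keepdims=True)
    return P @ V

def linear_attention(Q, K, V, phi=lambda X: np.maximum(X, 0) + 1e-6):
    """Kernelized attention: O(n) in sequence length via associativity,
    (phi(Q) phi(K)^T) V == phi(Q) (phi(K)^T V)."""
    Qf, Kf = phi(Q), phi(K)
    KV = Kf.T @ V                      # (d x d) summary, independent of n
    Z = Qf @ Kf.sum(axis=0)            # per-query normalization
    return (Qf @ KV) / Z[:, None]

rng = np.random.default_rng(1)
n, d = 64, 8
Q, K, V = rng.normal(size=(3, n, d))
ref = softmax_attention(Q, K, V)       # O(n^2 d) work
out = linear_attention(Q, K, V)        # O(n d^2) work
```

The two outputs differ (they use different kernels); the point is only the asymptotic cost of the factored computation.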
**April 17, 2024**: **Peter Manohar** (CMU). [Time: 4-5pm, Room: 32-G575]
**Topic.** An Exponential Lower Bound for Linear 3-Query Locally Correctable Codes

**Abstract.** A locally correctable code (LCC) is an error-correcting code where one can recover any symbol of the original codeword by querying only a small number of randomly chosen symbols from the received corrupted codeword. In this talk, we present a proof that the blocklength n of a linear 3-query LCC C: {0,1}^k -> {0,1}^n must be at least \exp(k^{1/8}), i.e., it grows exponentially with k. This improves on the prior best lower bound of k^3 shown in our recent work, which holds even for the weaker setting of 3-query locally decodable codes (LDCs), and comes close to matching the best known construction of 3-query LCCs based on Reed-Muller codes, which achieves a blocklength of n = \exp(\sqrt{k}). Because there is a 3-query LDC with a strictly subexponential blocklength, as a corollary we obtain the first separation between q-query LCCs and LDCs for any constant q. We subsequently show that any "design 3-LCC", an LCC whose parity-check matrix is a block design, must have blocklength at least 2^{(1-o(1))\sqrt{k}}. As Reed-Muller codes are design 3-LCCs with blocklength n = 2^{2\sqrt{2k}}, this is tight up to a factor of 2\sqrt{2} in the exponent, and as a corollary resolves the Hamada conjecture for 4-designs up to this constant factor. For nonlinear LCCs, we give a generalization of our approach that proves a superpolynomial lower bound of k^{\log k}, improving on the prior best lower bound of k^3.
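To make the local-correction model concrete, here is a toy demonstration using the classical Hadamard code, which is 2-query locally correctable (not one of the 3-query codes the talk studies): each codeword position can be recovered from a corrupted word by XORing a random pair of positions, and a majority vote over pairs corrects a small fraction of errors.

```python
def hadamard_encode(x_bits):
    """Encode k bits as all 2^k parities <x, a> mod 2."""
    k = len(x_bits)
    x = sum(b << i for i, b in enumerate(x_bits))
    return [bin(x & a).count("1") % 2 for a in range(1 << k)]

def locally_correct(word, a, k):
    """Recover position a via 2-query pairs (r, a ^ r), majority-voted over r.

    In an uncorrupted word, word[r] ^ word[a ^ r] equals word[a] by linearity."""
    votes = [word[r] ^ word[a ^ r] for r in range(1 << k)]
    return int(sum(votes) > len(votes) // 2)

k = 4
code = hadamard_encode([1, 0, 1, 1])
corrupted = code[:]
corrupted[3] ^= 1                      # flip one symbol (corruption rate 1/16)
# every position is still recoverable despite the corruption
recovered = [locally_correct(corrupted, a, k) for a in range(1 << k)]
```

The tradeoff the talk addresses is quantitative: the Hadamard code has blocklength 2^k, and the question is how small n can be for a given number of queries.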
**April 18, 2024**: **Dmitrii Avdiukhin** (Northwestern University). [Time: 4-5pm, Room: 32-D507]
**Topic.** Optimal Sample Complexity of Contrastive Learning

**Abstract.** Contrastive learning is a highly successful technique for learning representations of data from labeled tuples specifying the distance relations within the tuple. We study the sample complexity of contrastive learning, i.e., the minimum number of labeled tuples sufficient for high generalization accuracy. We give tight bounds on the sample complexity in a variety of settings, covering arbitrary distance functions, general ℓp-distances, and tree metrics. Our main result is an (almost) optimal bound on the sample complexity of learning ℓp-distances for integer p. For any p ≥ 1, we show that Θ̃(min(nd, n²)) labeled tuples are necessary and sufficient for learning d-dimensional representations of n-point datasets. Our results hold for an arbitrary distribution of the input samples and are based on giving the corresponding bounds on the Vapnik-Chervonenkis/Natarajan dimension of the associated problems. We further show that the theoretical bounds on sample complexity obtained via VC/Natarajan dimension can have strong predictive power for experimental results, in contrast with the folklore belief about a substantial gap between statistical learning theory and the practice of deep learning.
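For readers unfamiliar with the setting, a "labeled tuple" in contrastive learning is simply a distance comparison. A minimal sketch of one such label under an ℓp-distance (the function name and the toy points are ours):

```python
import numpy as np

def triplet_label(anchor, y, z, p=2):
    """A contrastive label: True iff y is closer to the anchor than z in l_p."""
    return np.linalg.norm(anchor - y, ord=p) < np.linalg.norm(anchor - z, ord=p)

# one labeled tuple over a 2-dimensional representation
anchor = np.array([0.0, 0.0])
y, z = np.array([1.0, 0.0]), np.array([3.0, 4.0])
label = triplet_label(anchor, y, z, p=2)   # ||y - anchor||_2 = 1 < 5 = ||z - anchor||_2
```

The sample-complexity question in the abstract asks how many such labels are needed to pin down a good d-dimensional embedding of n points.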
**April 24, 2024**: **Felix Zhou** (Yale University). [Time: 4-5pm, Room: 32-G575]
**Topic.** Lipschitz Continuous Graph Algorithms via Proximal Gradient Analysis

**Abstract.** We study the class of Lipschitz continuous graph algorithms, as introduced in the recent work of Kumabe and Yoshida [FOCS'23]. The Lipschitz constant of an algorithm, intuitively, bounds the ratio of the change in its output to the perturbation of its input. Our approach consists of three main steps. First, we consider a natural convex relaxation of the underlying graph problem, with the addition of a smooth and strongly convex regularizer. Then, we give upper bounds on the ℓ1 distance between the optimal solutions of the convex programs under small perturbations of the weights, via a stability analysis of the trajectory of the proximal gradient method. Finally, we present new problem-specific rounding techniques that obtain integral solutions to several graph problems while approximately maintaining the stability guarantees of the fractional solutions. We apply our framework to a number of problems, including minimum s-t cut, multiway cut, densest subgraph, maximum b-matching, and packing integer programs, and we show the tightness of our results for certain problems by establishing matching lower bounds.
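The stability phenomenon behind the proximal-gradient analysis can be seen in the simplest proximal operator: soft-thresholding (the prox of the ℓ1 norm) is non-expansive, so perturbing the input moves the regularized optimum by no more than the perturbation. A minimal sketch (the problem instance is our illustration, not the paper's):

```python
import numpy as np

def soft_threshold(w, lam):
    """Prox of lam*||.||_1: argmin_x 0.5*||x - w||^2 + lam*||x||_1."""
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)

rng = np.random.default_rng(2)
w = rng.normal(size=100)                     # "weights" of the instance
delta = 1e-3 * rng.normal(size=100)          # small perturbation
x = soft_threshold(w, 0.5)                   # optimum for the original weights
x_pert = soft_threshold(w + delta, 0.5)      # optimum for the perturbed weights
drift = np.abs(x_pert - x)                   # coordinatewise change in the output
```

Because soft-thresholding is 1-Lipschitz coordinatewise, `drift` is bounded by `|delta|` in every coordinate; the paper's contribution is tracking such bounds through the whole proximal-gradient trajectory and through rounding.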
**April 25, 2024**: **Alex Psomas** (Purdue University). [Time: 4-5pm, Room: 32-D507]
**Topic.** Theory and Practice of Fair Food Allocation

**Abstract.** Food rescue organizations worldwide are leading programs aimed at addressing food waste and food insecurity. Food Drop is one such program, run by the non-governmental organization Indy Hunger Network (IHN) in the state of Indiana, in the United States. Food Drop matches truck drivers carrying rejected truckloads of food (food that would otherwise be discarded at a landfill) to food banks. Matching decisions are currently made by the Food Assistance Programs Manager of IHN. In this talk, I will discuss a partnership with IHN with the goal of completely automating Food Drop. Motivated by this collaboration, I will present a series of theoretical models and results for the fair division of indivisible goods, with each model refining our understanding of the essence of IHN's problem. These results directly informed our choice of algorithm for matching drivers to food banks in the platform we built for IHN.
**May 1, 2024**: **Ce Jin and Yinzhan Xu** (MIT). [Time: 4-5pm, Room: 32-G575]
**Topic.** Shaving Logs via Large Sieve Inequality: Faster Algorithms for Sparse Convolution and More

**Abstract.** In sparse convolution-type problems, a common technique is to hash the input integers modulo a random prime p \in [Q/2, Q] for some parameter Q, which reduces the range of the input integers while preserving their additive structure. However, this hash family suffers from two drawbacks, which led to bottlenecks in many state-of-the-art algorithms: (1) the collision probability of two elements from [N] is O(\frac{\log N}{Q}) rather than O(\frac{1}{Q}); (2) it is difficult to derandomize the choice of p, and known derandomization techniques lead to super-logarithmic overhead [Chan, Lewenstein STOC'15]. In this paper, we partially overcome these drawbacks in certain scenarios, via novel applications of the large sieve inequality from analytic number theory. Consequently, we obtain the following improved algorithms for various problems (in the standard word RAM model):

- Sparse Nonnegative Convolution: We obtain an O(t\log t)-time Las Vegas algorithm that computes the convolution A \star B of two nonnegative integer vectors A, B, where t is the output sparsity \|A \star B\|_0. Moreover, our algorithm terminates in O(t\log t) time with 1 - 1/\mathrm{poly}(t) probability. This simultaneously improves the O(t\log t\log\log t)-time Las Vegas algorithm [Bringmann, Fischer, Nakos SODA'22] and the Monte Carlo O(t\log t)-time algorithm with failure probability 2^{-\sqrt{\log t}} [Bringmann, Fischer, Nakos STOC'21].
- Text-to-Pattern Hamming Distances: Given a length-m pattern P and a length-n text T, we obtain an O(n\sqrt{m\log\log m})-time deterministic algorithm that exactly computes the Hamming distance between P and every length-m substring of T. This improves the previous O(n\sqrt{m}(\log m\log\log m)^{1/4})-time deterministic algorithm [Chan, Jin, Vassilevska Williams, Xu FOCS'23] and nearly matches their O(n\sqrt{m})-time Las Vegas algorithm.
- Sparse General Convolution: For sparse convolution with possibly negative input, all previous approaches required \Omega(t\log^2 t) time, where t is the maximum of the input and output sparsity, and an important question left open by [Bringmann, Fischer, Nakos STOC'21] is whether this can be improved. We make partial progress on this question by giving a Monte Carlo O(t\log t)-time algorithm in the restricted case where the length N of the input vectors satisfies N \le t^{1.99}.

To appear in STOC 2024. Link: https://arxiv.org/abs/2403.20326
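A minimal sketch of the hashing idea in the abstract: sparse vectors can be convolved via their nonzero entries, and mapping indices mod a random prime p preserves additive structure, since (i + j) mod p = ((i mod p) + (j mod p)) mod p. (Toy parameters and helper names are ours; this is not the paper's algorithm.)

```python
import random

def random_prime(lo, hi, rng):
    """Pick a random prime in [lo, hi] by rejection sampling + trial division."""
    def is_prime(n):
        return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))
    while True:
        p = rng.randrange(lo, hi + 1)
        if is_prime(p):
            return p

def sparse_convolution(A, B):
    """Convolve sparse vectors given as {index: value} dicts."""
    C = {}
    for i, a in A.items():
        for j, b in B.items():
            C[i + j] = C.get(i + j, 0) + a * b
    return C

rng = random.Random(0)
p = random_prime(50, 100, rng)             # the random prime "p in [Q/2, Q]"
A, B = {3: 1, 40: 2}, {5: 1, 17: 3}
C = sparse_convolution(A, B)
# additive structure survives hashing mod p:
ok = all((i + j) % p == ((i % p) + (j % p)) % p for i in A for j in B)
```

The drawbacks the paper attacks are exactly the collision probability and derandomization cost of this index-hashing step.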
**May 6, 2024**: **Meghal Gupta** (UC Berkeley). [Time: 4-5pm, Room: 32-D507]
**Topic.** Optimal Quantile Estimation: Beyond the Comparison Model

**Abstract.** Estimating quantiles is one of the most basic problems in data sketching. In this problem, a stream x_1, x_2, x_3, …, x_n of elements from some universe of size U, a rank r, and an accuracy ε are given. The goal is to give a space-efficient algorithm that outputs an element with rank between r - εn and r + εn. For example, this captures median and 99th-percentile estimation. Unsurprisingly, a quantile sketch can be made more space-efficient than storing every element individually (which would take n log U bits of memory). In this talk, we'll describe a deterministic quantile sketch using O(ε^{-1}(log n + log U)) bits of memory. This is optimal and improves upon the previous best deterministic algorithm (Greenwald and Khanna 2003) and even upon the best randomized algorithm (Karnin, Lang, and Liberty 2016). This is joint work with Mihir Singhal and Hongxun Wu.
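An offline toy showing the guarantee format (an answer within ±εn of the requested rank) with roughly ε^{-1} stored elements. Real quantile sketches process a one-pass stream; this sketch of a sketch only illustrates the interface (names and parameters are ours):

```python
import math

def build_summary(sorted_data, eps):
    """Keep every ceil(eps*n)-th element of the sorted data (~1/eps elements)."""
    n = len(sorted_data)
    step = max(1, math.ceil(eps * n))
    return sorted_data[::step], step

def query(summary, step, r):
    """Return an element whose rank is within +-step of the requested rank r."""
    return summary[min(r // step, len(summary) - 1)]

data = sorted(range(1000))         # toy universe: each element equals its rank
summary, step = build_summary(data, eps=0.01)
ans = query(summary, step, r=500)  # rank-500 query (the median)
```

Here `len(summary)` is about 1/ε = 100 elements rather than n = 1000, and every answer is within εn = 10 ranks of the target.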
**May 8, 2024**: **Andrea Lincoln** (Boston University). [Time: 4-5pm, Room: 32-G575]
**Topic.** A Biased Introduction to Average-Case Fine-Grained Complexity

**Abstract.** I will introduce the techniques and approaches of fine-grained average-case complexity, and explain the centrality of low-degree polynomials. I will highlight results from many papers and the different techniques they use. We will end with a result showing that, for any constant-sized subgraph H, counting the copies of H in a worst-case graph and counting the copies of H in an Erdős–Rényi random graph take the same time up to polylog factors.
**May 15, 2024**: **Gavin Brown** (Boston University). [Time: 4-5pm, Room: 32-G575]
**Topic.** Stable Estimators for Fast Private Statistics

**Abstract.** A long line of work establishes statistical algorithms that satisfy differential privacy, a strong notion of privacy protection. Most existing algorithms have significant drawbacks, requiring prior bounds on the data or suffering in high dimensions; for many basic tasks, the best known algorithms require exponential time. In this talk, we will discuss a new approach that leads to fast and near-optimal private algorithms for mean estimation, covariance estimation, and linear regression. The analysis focuses on stabilizing a greedy, but careful, outlier-removal process. Based on work with Sam Hopkins, Adam Smith, Jon Hayase, Xiyang Liu, Weihao Kong, Sewoong Oh, and Juan Perdomo. arXiv links: https://arxiv.org/abs/2301.12250, https://arxiv.org/abs/2404.15409
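For context, the textbook baseline that the talk's algorithms improve upon is the Laplace mechanism on bounded data: clipping to a known range and adding calibrated noise gives ε-differential privacy, but requires prior bounds on the data. A minimal sketch (this is the standard baseline, not the speaker's stability-based method):

```python
import numpy as np

def private_mean(x, eps, lo=0.0, hi=1.0, rng=None):
    """eps-DP mean of data clipped to [lo, hi] via the Laplace mechanism.

    Changing one record moves the clipped mean by at most (hi - lo) / n,
    so Laplace noise of scale (hi - lo) / (n * eps) yields eps-DP."""
    rng = rng or np.random.default_rng()
    x = np.clip(np.asarray(x, dtype=float), lo, hi)
    n = len(x)
    noise = rng.laplace(scale=(hi - lo) / (n * eps))
    return x.mean() + noise

rng = np.random.default_rng(3)
data = rng.uniform(size=10_000)          # toy data already in [0, 1]
est = private_mean(data, eps=1.0, rng=rng)
```

The drawback is the required a-priori range [lo, hi]; the talk's stable outlier-removal approach avoids such bounds while staying computationally fast.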
## Theory Calendar

Add this calendar to yours to receive details about upcoming events in the Theory group.
## Earlier Editions of A&C Seminar