Algorithms & Complexity Seminar, MIT : Spring 2024

Organizers: Noah Golowich (nzg AT mit DOT edu), Rahul Ilango (rilango AT mit DOT edu)

The Algorithms & Complexity Seminar will usually (unless otherwise stated) meet on Wednesdays 4pm-5pm at MIT. The style and format of these meetings are variable. Please feel free to contact the organizers for more details. To receive announcements of upcoming talks, subscribe to the mailing list either by visiting the mailman page or by sending an empty email to compalgsem-subscribe@lists.csail.mit.edu.

Talks

  • February 28, 2024: D Ellis Hershkowitz (Brown University). [Time: 4-5pm: 32-G575]
    Topic. Polylogarithmic Universal Steiner Trees and Strong Sparse Partition Hierarchies
    Abstract. This talk concerns universal Steiner trees (USTs). Informally, a UST is a single spanning tree that is a good approximation for every instance of Steiner tree. More formally, an α-approximate UST of a graph G for a root vertex r is a spanning tree T such that, for any vertex subset S containing r, the cost of the minimal subtree of T connecting S is within an α factor of the cost of the minimum-cost subgraph of G connecting S. α-approximate USTs immediately give α-approximations for well-studied variants of Steiner tree, such as online and oblivious Steiner tree. Sub-linear-approximate USTs were known, but neither the existence of, nor polynomial-time algorithms for computing, poly-logarithmic-approximate USTs were previously known. In this talk, I will discuss the first construction of poly-logarithmic-approximate USTs, which (nearly) exponentially improves the approximation guarantees of previous constructions and (nearly) matches logarithmic lower bounds. The result is based on new constructions of poly-logarithmic-quality graph hierarchies called strong sparse partition hierarchies, which may be interesting in their own right. Roughly, a strong sparse partition (i.e., each level of this hierarchy) is a vertex partition that provides deterministic guarantees on the number of parts that balls of appropriate radii intersect. Joint work with Costas Busch, Da Qi Chen, Arnold Filtser, Daniel Hathcock and Rajmohan Rajaraman.
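    Example. A compact restatement of the UST guarantee above (notation ours, a sketch rather than the authors' exact formalism): a spanning tree T of G is an α-approximate UST for root r if

      \[ \forall S \subseteq V(G),\, r \in S: \quad \mathrm{cost}\big(T_S\big) \;\le\; \alpha \cdot \mathrm{cost}\big(\mathrm{OPT}_G(S)\big), \]

    where T_S is the minimal subtree of T connecting S and OPT_G(S) is a minimum-cost subgraph of G connecting S.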
  • March 6, 2024: Brian Zhang (CMU). [Time: 4-5pm: 32-G575]
    Topic. Efficient Φ-Regret Minimization with Linear and Low-Degree Swap Deviations in Extensive-Form Games
    Abstract. Recent breakthrough results by Dagan, Daskalakis, Fishelson and Golowich [2023] and Peng and Rubinstein [2023] established an efficient algorithm attaining at most \epsilon swap regret over extensive-form strategy spaces of dimension N in N^{\tilde O(1/\epsilon)} rounds. On the other extreme, Farina and Pipis [2023] developed an efficient algorithm for minimizing the weaker notion of linear-swap regret in \mathsf{poly}(N)/\epsilon^2 rounds. In this talk, we will discuss several new results that extend and expand upon these recent works. The first result is an interpretation of linear-swap deviations in extensive-form games. We show a connection between linear deviations and a generalization of communication deviations in which the player can make queries to a "mediator" who replies with action recommendations, and, critically, the player is not constrained to match the timing of the game as would be the case for communication deviations. We coin this latter set the untimed communication (UTC) deviations. We show that the UTC deviations coincide precisely with the linear deviations, and therefore that any player minimizing UTC regret also minimizes linear-swap regret. Next, we take a step toward bridging the gap between linear and swap deviations. We introduce the set of k-mediator deviations, which generalize the untimed communication deviations to the case of having multiple mediators. We develop parameterized algorithms for minimizing the regret with respect to this set of deviations in N^{O(k)}/\epsilon^2 rounds. This closes the gap in the sense that k=1 recovers linear-swap regret, while k=N recovers swap regret. Moreover, by relating k-mediator deviations to low-degree polynomials, we show that regret minimization against degree-k polynomial swap deviations is achievable in N^{O(kd)^3}/\epsilon^2 rounds, where d is the depth of the game, assuming a constant branching factor. For a fixed degree k, this is polynomial for Bayesian games and quasipolynomial more broadly when d = \mathsf{polylog} N, the usual balancedness assumption on the game tree. This talk covers joint work with Ioannis Anagnostides, Gabriele Farina, and Tuomas Sandholm.
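    Example. For orientation, the generic Φ-regret notion underlying the talk, in our notation (not the paper's exact formalism): for strategies x_1, ..., x_T in a set X, utilities u_1, ..., u_T, and a set Φ of deviation maps \phi : X \to X,

      \[ \mathrm{Reg}_\Phi(T) \;=\; \max_{\phi \in \Phi} \sum_{t=1}^{T} \Big( u_t\big(\phi(x_t)\big) - u_t(x_t) \Big). \]

    Taking Φ to be all functions X \to X gives swap regret; restricting Φ to linear maps preserving X gives linear-swap regret.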
  • March 13, 2024: Ilya Volkovich (Boston College). [Time: 4-5pm: 32-G575]
    Topic. Synergy between Circuit Obfuscation and Circuit Minimization
    Abstract. We study close connections between Indistinguishability Obfuscation (IO) and the Minimum Circuit Size Problem (MCSP), and argue that algorithmic progress on either problem would empower the other. Some of our main results are: If there exists a perfect (resp. imperfect) IO that is computationally secure against non-uniform polynomial-size circuits, then we obtain fixed-polynomial lower bounds against NP (resp. MA). In addition, computationally secure IO against non-uniform polynomial-size circuits implies super-polynomial lower bounds against NEXP. If MCSP is in BPP, then statistical security and computational security for IO are equivalent. If computationally secure perfect IO exists, then MCSP is contained in BPP iff NP = ZPP; as a consequence, we separate ZPEXP from BPP. To the best of our knowledge, this is the first derivation of strong circuit lower bounds from the existence of IO. The results are obtained via a construction of an optimal universal distinguisher, computable in randomized polynomial time with access to an MCSP oracle, that distinguishes any two circuit-samplable distributions with advantage equal to the statistical distance between them, up to a negligible error term. This is our main technical contribution. As another application, we get a simple proof of the result of Allender and Das (Inf. Comput., 2017) that SZK is contained in BPP^MCSP.
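    Example. In symbols, our paraphrase of the universal-distinguisher guarantee: writing \Delta(D_1, D_2) = \frac{1}{2} \sum_x \big| \Pr[D_1 = x] - \Pr[D_2 = x] \big| for statistical distance, the distinguisher U (randomized polynomial time with an MCSP oracle) satisfies, for any two circuit-samplable distributions D_1, D_2,

      \[ \Big| \Pr_{x \sim D_1}[U(x) = 1] - \Pr_{x \sim D_2}[U(x) = 1] \Big| \;\ge\; \Delta(D_1, D_2) - \mathrm{negl}(n). \]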
  • March 20, 2024: Kai Zheng (MIT). [Time: 4-5pm: 32-G575]
    Topic. Near Optimal Alphabet-Soundness Tradeoff PCPs
    Abstract. We show a nearly optimal alphabet-soundness tradeoff for NP-hardness of 2-Prover-1-Round Games (2P1R). More specifically, we show that for all \eps > 0 and sufficiently large M, it is NP-hard to decide whether a 2P1R instance of alphabet size M has value nearly 1 or at most M^{-1+\eps}. 2P1R games are equivalent to 2-query PCPs and are widely used in obtaining hardness-of-approximation results. As such, our result implies the following: 1) hardness of approximating Quadratic Programming within a factor of nearly log(n), 2) hardness of approximating d-bounded-degree 2-CSP within a factor of nearly d/2, and 3) improved hardness-of-approximation results for various k-vertex-connectivity problems. For the first two applications, our results nearly match the performance of the best known algorithms. Joint work with Dor Minzer.
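    Example. Phrased as a gap problem (a sketch; the precise completeness parameter "nearly 1" is as in the paper): for every \eps > 0 and all sufficiently large M, given a 2P1R game G with alphabet size M, it is NP-hard to distinguish

      \[ \mathrm{val}(G) \approx 1 \qquad \text{vs.} \qquad \mathrm{val}(G) \le M^{-1+\eps}, \]

    where val(G) is the provers' maximum acceptance probability.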
  • April 11, 2024: Vahab Mirrokni (Google Research). [Time: 2-3pm: 32-D507]
    Topic. ML Efficiency for Large Models: Faster Transformers, Sparsity, and beyond
    Abstract. Scaling large models efficiently for faster training and inference is a fundamental challenge. In this talk, we present a number of algorithmic challenges and potential solutions from theory to practice. First, we discuss data-efficiency and model-efficiency problems that can be formalized as subset-selection problems. For model efficiency, we present sequential attention for feature selection and sparsification [ICLR'23, arXiv]. For data efficiency, we present a sensitivity-sampling technique for improved quality and efficiency of the models. Furthermore, we discuss the intrinsic quadratic complexity of attention models as well as of token generation. We first discuss HyperAttention, a technique to develop linear-time attention algorithms under mild assumptions [ICLR'24]. We then present PolySketchFormer, a technique to bypass the hardness results of achieving sub-quadratic attention by applying sketching of polynomial functions [arXiv]. Finally, we show how to address the complexity of token generation via clustering techniques [arXiv].
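    Example. To make the quadratic bottleneck concrete, here is a small NumPy sketch of ours contrasting standard softmax attention, which materializes an n x n score matrix, with a generic kernelized variant that uses associativity to avoid it. This illustrates the general idea only; it is not the HyperAttention or PolySketchFormer algorithm, and the feature map phi is an assumed stand-in.

      import numpy as np

      def softmax_attention(Q, K, V):
          # Standard attention: forms the full n x n score matrix, so time
          # and memory scale quadratically with the sequence length n.
          S = Q @ K.T / np.sqrt(Q.shape[1])                # (n, n)
          P = np.exp(S - S.max(axis=1, keepdims=True))     # row-wise softmax
          return (P / P.sum(axis=1, keepdims=True)) @ V    # (n, d_v)

      def linearized_attention(Q, K, V, phi=lambda X: np.maximum(X, 0.0) + 1e-6):
          # Kernelized variant (phi is an assumed positive feature map):
          # replacing exp(<q, k>) by <phi(q), phi(k)> lets associativity
          # reorder the products, so no n x n matrix is ever formed and the
          # cost drops to O(n * d * d_v).
          Qf, Kf = phi(Q), phi(K)
          KV = Kf.T @ V                    # (d, d_v) summary of keys and values
          Z = Qf @ Kf.sum(axis=0)          # per-query normalizers, shape (n,)
          return (Qf @ KV) / Z[:, None]

      # Tiny smoke test on random data.
      rng = np.random.default_rng(0)
      n, d = 8, 4
      Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))
      print(softmax_attention(Q, K, V).shape, linearized_attention(Q, K, V).shape)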
  • April 17, 2024: Peter Manohar (CMU). [Time: 4-5pm: 32-G575]
    Topic. An Exponential Lower Bound for Linear 3-Query Locally Correctable Codes
    Abstract. A locally correctable code (LCC) is an error correcting code where one can recover any symbol of the original codeword by querying only a small number of randomly chosen symbols from the received corrupted codeword. In this talk, we present a proof that the blocklength n of a linear 3-query LCC C: {0,1}^k -> {0,1}^n must be at least \exp(k^{1/8}), i.e., it grows exponentially with k. This improves on the prior best lower bound of k^3 shown in our recent work, which holds even for the weaker setting of 3-query locally decodable codes (LDCs), and comes close to matching the best-known construction of 3-query LCCs based on Reed-Muller codes, which achieve a blocklength of n = \exp(\sqrt{k}). Because there is a 3-query LDC with a strictly subexponential blocklength, as a corollary we obtain the first separation between q-query LCCs and LDCs for any constant q. We subsequently show that any “design 3-LCC”, an LCC with a parity check matrix that is a block design, must have blocklength at least 2^{(1-o(1))\sqrt{k}}. As Reed-Muller codes are design 3-LCCs with blocklength n = 2^{2\sqrt{2k}}, this is tight up to a factor of 2\sqrt{2} in the exponent, and as a corollary resolves the Hamada conjecture for 4-designs up to this constant factor. For nonlinear LCCs, we give a generalization of our approach that proves a superpolynomial lower bound of k^{\log k}, improving on the prior best lower bound of k^3.
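    Example. A toy illustration of local correctability (the classical 2-query Hadamard LCC, not the 3-query codes of the talk): by linearity, any codeword bit can be recovered from a corrupted word with two random queries. A minimal Python sketch:

      import itertools, random

      def hadamard_encode(x):
          # Codeword of length 2^k, indexed by a in {0,1}^k: c(a) = <x, a> mod 2.
          return {a: sum(xi * ai for xi, ai in zip(x, a)) % 2
                  for a in itertools.product((0, 1), repeat=len(x))}

      def locally_correct(word, a):
          # 2-query local correction: c(a) = c(r) + c(r + a) mod 2 for any r,
          # so two uniformly random queries recover position a with probability
          # at least 1 - 2*delta when a delta fraction of positions is corrupted.
          r = tuple(random.randint(0, 1) for _ in a)
          r_plus_a = tuple((ri + ai) % 2 for ri, ai in zip(r, a))
          return (word[r] + word[r_plus_a]) % 2

      x = (1, 0, 1, 1)
      c = hadamard_encode(x)
      corrupted = dict(c)
      corrupted[random.choice(list(c))] ^= 1          # corrupt one position
      a = (0, 1, 1, 0)
      print(locally_correct(corrupted, a), c[a])      # agree with high probability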
  • April 18, 2024: Dmitrii Avdiukhin (Northwestern University). [Time: 4-5pm: 32-D507]
    Topic. Optimal Sample Complexity of Contrastive Learning
    Abstract. Contrastive learning is a highly successful technique for learning representations of data from labeled tuples that specify the distance relations within the tuple. We study the sample complexity of contrastive learning, i.e., the minimum number of labeled tuples sufficient for achieving high generalization accuracy. We give tight bounds on the sample complexity in a variety of settings, covering arbitrary distance functions, general ℓp-distances, and tree metrics. Our main result is an (almost) optimal bound on the sample complexity of learning ℓp-distances for integer p. For any p ≥ 1, we show that Θ̃(min(nd, n^2)) labeled tuples are necessary and sufficient for learning d-dimensional representations of n-point datasets. Our results hold for an arbitrary distribution of the input samples and are based on corresponding bounds on the Vapnik-Chervonenkis/Natarajan dimension of the associated problems. We further show that the theoretical bounds on sample complexity obtained via VC/Natarajan dimension can have strong predictive power for experimental results, in contrast with the folklore belief about a substantial gap between statistical learning theory and the practice of deep learning.
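    Example. For context, the standard uniform-convergence bound behind the VC-dimension approach (generic, not specific to this paper): a hypothesis class of VC dimension D can be learned to accuracy \epsilon with confidence 1 - \delta from

      \[ m \;=\; O\!\left( \frac{D + \log(1/\delta)}{\epsilon^2} \right) \]

    samples, so the stated sample-complexity bounds follow from matching upper and lower bounds on the VC/Natarajan dimension of the associated class.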

Theory Calendar

Add this calendar to yours to receive details about upcoming events in the Theory group (includes A&C Seminars, TOC Colloquium, CIS Seminars, Theory Lunch, TOC tea, and more!).



Earlier Editions of A&C Seminar