Algorithms & Complexity Seminar, MIT : Spring 2023

Organizers: Noah Golowich (nzg AT mit DOT edu), Rahul Ilango (rilango AT mit DOT edu)

The Algorithms & Complexity Seminar will usually (unless otherwise stated) meet on Wednesdays 4pm-5pm at MIT. If you contact us (the organizers) beforehand, we can arrange so that the talk can be viewed at this zoom link (password is 348416). The style and format of these meetings are variable. Please feel free to contact the organizers to find out more details. To receive announcements of upcoming talks, subscribe to the mailing list by either visiting the mailman page or send an empty email to compalgsem-subscribe@lists.csail.mit.edu.

Upcoming Talks

Tuesday, May 30, 2023: Or Zamir (IAS, Princeton). [Time: 4-5pm: 32-D507]
Topic. Algorithmic Applications of Hypergraph and Partition Containers
Abstract. We present a general method to convert algorithms into faster algorithms for almost-regular input instances. Informally, an almost-regular input is an input in which the maximum degree is larger than the average degree by at most a constant factor. This family of inputs vastly generalizes several families of inputs for which we commonly have improved algorithms, including bounded-degree inputs and random inputs. It also generalizes families of inputs for which we don't usually have faster algorithms, including regular-inputs of arbitrarily high degree and all very dense inputs. We apply our method to achieve breakthroughs in exact algorithms for several central NP-Complete problems including k-SAT, Graph Coloring, and Maximum Independent Set. Our main tool is the first algorithmic application of the relatively new Hypergraph Container Method (Saxton and Thomason 2015, Balogh, Morris and Samotij 2015). This recent breakthrough, which generalizes an earlier version for graphs (Kleitman and Winston 1982, Sapozhenko 2001), has been used extensively in recent years in extremal combinatorics. An important component of our work is a new generalization of (hyper-)graph containers to Partition Containers.

Past Talks

Wednesday, February 8, 2023: Ziv Scully (MIT & Harvard). [Time: 4-5pm: 32-G575]
Topic. Local Hedging for Combinatorial Pandora's Box Problems with Optional Inspection
Abstract. Weitzman's Pandora's box problem is a classic model of sequential search. A decision maker has a finite number of boxes, each with contents of random value. The decision maker sequentially inspects any subset of them, paying a cost for each inspection, before ultimately stopping and selecting one box's value to keep. Weitzman shows that the optimal search rule is an index policy. That is, the policy assigns each box a rating "in isolation", then always inspects or selects the box of best rating. Because of this, Weitzman's Pandora's rule naturally extends to combinatorial optimization versions of the problem, in which now a larger set of boxes may be selected, but the selected set must satisfy some constraints. An implicit assumption in Weitzman's model is that conditional on stopping, the decision maker can only select one of the already inspected boxes. As it turns out, this assumption underlies the simplicity of Weitzman's rule. Indeed, the more natural optional inspection version of Pandora's problem does not admit an optimal index policy, even in the single-item selection case. We introduce a new randomized index policy, called local hedging, for Pandora's box problems with optional inspection. Like Weitzman's original index policy, local hedging treats each box in isolation when assigning it its random rating. Local hedging enables us to translate policies for a wide range of mandatory-inspection Pandora's box problems into policies for optional-inspection problems, at the cost of an extra factor in the policies' approximation ratios. The factor is instance-dependent, but it is at most 4/3 for cost-minimization settings and at least 1/2 for reward-maximization settings. Thus, we obtain the first approximation algorithms for a variety of combinatorial settings, including matroid optimization, matching, and facility location problems. Based on joint work with Laura Doval.
Wednesday, February 15, 2023: Jason Li (UC Berkeley). [Time: 4-5pm: 32-G575]
Topic. Minimum Isolating Cuts: A new tool for solving minimum cut problems
Abstract. Minimum cut problems are among the most well-studied questions in combinatorial optimization. In this talk, I will introduce a simple but powerful new tool for solving minimum cut problems called the minimum isolating cuts. I will show how this tool can be employed to obtain faster algorithms for several fundamental min-cut problems, namely global min-cut, Steiner min-cut, and all-pairs min-cut. For these problems, the new results represent the first improvement in their runtimes in several decades. These results are in collaboration with Amir Abboud, Robert Krauthgamer, Danupon Nanongkai, Debmalya Panigrahi, Thatchaphol Saranurak, and Ohad Trabelsi.
Thursday, February 16, 2023: Rajesh Jayaram (Google NYC). [Time: 4-5pm: 32-G575]
Topic. Streaming Euclidean MST to a Constant Factor
Abstract. We consider the problem of computing the cost of the Euclidean Minimum Spanning Tree (MST) of an n-point set X in R^d, where the points in X are presented in an arbitrary order stream of insertions and deletions. In the streaming setting, the goal is to maintain an approximation to the MST cost in small space. In low dimensions, (1+ε) approximations are possible in sublinear (in n) space [Frahling, Indyk, Sohler, SoCG '05]. However, for high dimensional spaces the best known approximation was Õ(log n), due to [Chen, Jayaram, Levi, Waingarten, STOC '22], and prior to that it was O(log^2 n) due to [Indyk, STOC '04] and [Andoni, Indyk, Krauthgamer, SODA '08]. In this talk, we describe the first algorithm which achieves a constant factor approximation to the Euclidean MST in sublinear space. Specifically, for any ε > 0, our algorithm obtains a Õ(1/ε^2) approximation in n^{O(ε)} space. We show the approximation is tight up to a constant, by proving an Omega(sqrt{n}) space lower bound for any algorithm that beats a 1.1 approximation. Furthermore, the algorithm can be modified to give a (1+ε) approximation in O(1/ε)-passes over the stream, which separates the complexity for single vs multi-pass streaming. Joint work with Xi Chen, Vincent Cohen-Addad, Amit Levi, and Erik Waingarten (to appear in STOC '23). Full paper available at: https://arxiv.org/abs/2212.06546
Wednesday, February 22, 2023: Roie Levin (Tel Aviv University). [Time: 4:15-5:15pm: 32-G575]
Topic. Online Covering: Secretaries, Prophets and Universal Maps
Abstract. We give a polynomial-time algorithm for online covering IPs with a competitive ratio of O(log mn) when the constraints are revealed in random order, essentially matching the best possible offline bound of O(log n) and circumventing the Ω(log m log n) lower bound known in adversarial order. We then use this result to give an O(log mn) competitive algorithm for the prophet version of this problem, where constraints are sampled from a sequence of known distributions (in fact, our algorithm works even when only a single sample from each of the distributions is given). Since our algorithm is universal, as a byproduct we establish that only O(n) samples are necessary to build a universal map for online covering IPs with competitive ratio O(log mn) on input sequences of length n. This talk is based on joint work with Anupam Gupta and Gregory Kehne, the first half of which appeared at FOCS 2021.
Wednesday, March 1, 2023: Huy Le Nguyen (Northeastern). [Time: 4-5pm: 32-G575]
Topic. Efficient locally private algorithms for estimation problems
Abstract. In federated computation, user data is distributed over many devices which each communicate to some central server for downstream analytics and/or machine learning tasks. Already in the classical setting, there are many desiderata to be balanced in the system from accurate aggregation to low communication and fast runtime for the users and the server. Adding to the already delicate balancing, privacy concern necessitates new methods that are efficient and accurate while preserving the privacy of individual data. In this talk we will focus on two basic tasks underlying many applications: computing the histogram of user data and estimating their mean. While the problems are well-studied, a lot of recent works have been devoted to developing efficient and optimally accurate algorithms beyond asymptotic behavior. We will describe new algorithms achieving near optimal error, fast runtime, and low communication for these tasks. This is based on joint work with Hilal Asi, Vitaly Feldman, Jelani Nelson, and Kunal Talwar.
Wednesday, March 8, 2023: Misha Khodak (CMU). [Time: 4-5pm: 32-G575]
Topic. New Directions in Algorithms with Predictions: Learning and Privacy
Abstract. A burgeoning paradigm in algorithm design is learning-augmented algorithms, or algorithms with predictions, where methods can take advantage of a (possibly imperfect) prediction about their instance. While past work has focused on using predictions to improve competitive ratios and runtime, this talk addresses a different, salient question: how do we learn the predictions themselves? We introduce an approach for co-designing learning-augmented algorithms with their own custom learning algorithms, with the crucial step being to optimize nice surrogate losses bounding the algorithms’ costs. This leads to improved sample complexity bounds for several learning-augmented graph algorithms and the first learning-theoretic guarantees for page migration with predictions, among other contributions. We also instantiate these ideas on the new direction of learning-augmented private algorithms, where the goal is to reduce utility loss due to privacy rather than runtime. Our approach drives numerous insights on how to robustly incorporate external information to release better statistics of sensitive datasets, which we verify empirically on the task of multiple quantile release.
Thursday, March 9, 2023: Ludwig Schmidt (University of Washington). [Time: 3-4pm: 32-G449]
Topic. A data-centric view on reliable generalization: From ImageNet to LAION-5B
Abstract. Researchers have proposed many algorithms to make neural networks more reliable under distribution shift, yet there is still large room for improvement. Are better training algorithms or training data the more promising way forward? In this talk, we study this question in the context of OpenAI’s CLIP model for learning from image-text data. First, we survey the current robustness landscape based on a large-scale experimental study involving more than 200 different models and test conditions. The CLIP models stand out with unprecedented robustness on multiple challenging distribution shifts. To further improve CLIP, we then introduce new algorithms for reliably fine-tuning models by interpolating the weights of multiple models. Next, we investigate the cause of CLIP’s robustness via controlled experiments to disentangle the influence of language supervision and training distribution. While CLIP leveraged large scale language supervision for the first time, its robustness actually comes from the pre-training dataset. We conclude with a brief overview of ongoing work to improve pre-training datasets: LAION-5B, the largest public image-text dataset, and initial experiments to increase the robustness induced by pre-training data.
Wednesday, March 15, 2023: Khashayar Gatmiry (MIT). [Time: 4-5pm: 32-G575]
Topic. Sampling with Barriers: Faster Mixing via Lewis Weights
Abstract. We analyze Riemannian Hamiltonian Monte Carlo (RHMC) for sampling a polytope defined by m inequalities in R^n endowed with the metric defined by the Hessian of a self-concordant convex barrier function. We use a hybrid of the p-Lewis weight barrier and the standard logarithmic barrier and prove that the mixing rate is bounded by \tilde{O}(m1/3n 4/3), improving on the previous best bound of \tilde{O}(mn2/3), based on the log barrier. Our analysis overcomes several technical challenges to establish this result, in the process deriving smoothness bounds on Hamiltonian curves and extending self-concordance notions to the infinity norm; both properties appear to be of independent interest.
Thursday, March 23, 2023: Lydia Zakynthinou (Northeastern). [Time: 4-5pm: 32-D507]
Topic. From Robustness to Privacy and Back
Abstract. We study the relationship between two desiderata of algorithms in statistical inference and machine learning—differential privacy and robustness to adversarial data corruptions. Their conceptual similarity was first observed by Dwork and Lei (STOC 2009), who observed that private algorithms satisfy robustness, and gave a general method for converting robust algorithms to private ones. However, all general methods for transforming robust algorithms into private ones lead to suboptimal error rates. Our work gives the first black-box transformation that converts any adversarially robust algorithm into one that satisfies pure differential privacy. Moreover, we show that for any low-dimensional estimation task, applying our transformation to an optimal robust estimator results in an optimal private estimator. Thus, we conclude that for any low-dimensional task, the optimal error rate for ε-differentially private estimators is essentially the same as the optimal error rate for estimators that are robust to adversarially corrupting 1/ε training samples. We apply our transformation to obtain new optimal private estimators for several high-dimensional tasks, including Gaussian (sparse) linear regression and PCA. Finally, we present an extension of our transformation that leads to approximate differentially private algorithms whose error does not depend on the range of the output space, which is impossible under pure differential privacy. Joint work with Hilal Asi and Jonathan Ullman.
Wednesday, April 12, 2023: Chin Ho Lee (Harvard). [Time: 4-5pm: 32-G575]
Topic. Bounded independence plus noise and derandomization
Abstract. k-wise independent distributions have been a fundamental object in algorithm design and derandomization since the late 70s. They are defined as distributions on n bits that are uniform on every k coordinates. Despite their importance, they have limitations as a pseudorandom object. A recent line of work has shown that adding noise to k-wise independent distributions can be unexpectedly powerful, leading to new constructions of pseudorandom generators (PRGs) for various classes of functions. In this talk, I will present an overview of the latest research on the topic, including various constructions and analyses of PRGs, and how it is related to coding theory and Boolean function analysis.
Wednesday, April 19, 2023: Matheus Venturyne Xavier Ferreira (Harvard). [Time: 4-5pm: 32-G575]
Topic. Credible Decentralized Exchange Design via Verifiable Sequencing Rules
Abstract. Trading on decentralized exchanges has been one of the primary use cases for permissionless blockchains with daily trading volume exceeding billions of U.S.~dollars. In the status quo, users broadcast transactions they wish to execute in the exchange and miners are responsible for composing a block of transactions and picking an execution ordering --- the order in which transactions execute in the exchange. Due to the lack of a regulatory framework, it is common to observe miners exploiting their privileged position by front-running transactions and obtaining risk-fee profits. Indeed, the Flashbots service institutionalizes this exploit, with miners auctioning the right to front-run transactions. In this work, we propose to modify the interaction between miners and users and initiate the study of {\em verifiable sequencing rules}. As in the status quo, miners can determine the content of a block; however, they commit to respecting a sequencing rule that constrains the execution ordering and is verifiable (there is a polynomial time algorithm that can verify if the execution ordering satisfies such constraints). Thus in the event a miner deviates from the sequencing rule, anyone can generate a proof of non-compliance. We ask if there are sequencing rules that limit price manipulation from miners in a two-token liquidity pool exchange. Our first result is an impossibility theorem: for any sequencing rule, there is an instance of user transactions where the miner can obtain non-zero risk-free profits. In light of this impossibility result, our main result is a verifiable sequencing rule that provides execution price guarantees for users. In particular, for any user transaction $A$, it ensures that either (1) the execution price of $A$ is at least as good as if $A$ was the only transaction in the block, or (2) the execution price of $A$ is worse than this ``standalone'' price and the miner does not gain when including $A$ in the block. Our framework does not require users to use countermeasures against predatory trading strategies, for example, set limit prices or split large transactions into smaller ones. This is likely to improve user experience relative to the status quo. Joint work with David C. Parkes. To appear at STOC 2023.
Wednesday, April 26, 2023: Maryam Aliakbarpour (Boston University & Northeastern University). [Time: 4-5pm: 32-G575]
Topic. Statistical inference with computational constraints
Abstract. The vast amount of digital data we create and collect has revolutionized many scientific fields and industrial sectors. However, despite our success in harnessing this transformative power of data, computational and societal trends emerging from the current practices of data science necessitate upgrading our toolkit for data analysis. In this talk, we discuss how practical considerations such as time and memory limits affect statistical inference tasks, with a particular focus on the problem of entropy estimation of a distribution by streaming over i.i.d. samples from it. We determine how bounded memory affects the number of samples we need to solve this problem. Recent work by Acharya et al. (NeurIPS 2019) showed how to estimate the entropy of a distribution D over an alphabet of size k up to \epsilon additive error by streaming over (k/\epsilon^3)⋅polylog(1/\epsilon) i.i.d. samples and using only O(1) words of memory. In this work, we propose a new constant memory scheme that reduces the sample complexity to (k/\epsilon^2)⋅polylog(1/\epsilon). We conjecture that this is optimal up to polylog(1/\epsilon) factors. Based on: https://arxiv.org/abs/2205.09804 Joint work with: Andrew McGregor, Jelani Nelson, Erik Waingarten
Thursday, April 27, 2023: Ewin Tang (University of Washington). [Time: 4-5pm: 32-G449]
Topic. Query-optimal estimation of unitary channels in diamond distance
Abstract. We will present an algorithm to learn an unknown unitary channel acting on a d-dimensional qudit to diamond-norm error ε, using O(d²/ε) applications of the unknown channel and only one qudit. This algorithm uses the optimal number of qudits and number of queries, even if one has access to the inverse or controlled versions of the unknown unitary. This improves over prior work, which achieves entanglement infidelity δ using O(d²/√δ) applications in parallel, thereby requiring Ω(d²) qudits. Based on joint work with Jeongwan Haah, Robin Kothari, and Ryan O'Donnell.

Theory Calendar

Add this calendar to yours to receive details about upcoming events in the Theory group
(includes A&C Seminars, TOC Colloquium, CIS Seminars, Theory Lunch, TOC tea and more!) :

Click on the events above to get more description about the same (title/abstract for talks, etc.)

Algorithms & Complexity Seminar, MIT : Spring 2023

Upcoming Talks

Past Talks

Theory Calendar

Earlier Editions of A&C Seminar