Four Deep Learning Papers to Read in September 2021
From Auto-ML to Vision Transformer Training & Representations and Catastrophic Fisher Explosion
Welcome to the September edition of the ‘Machine-Learning-Collage’ series, where I provide an overview of the different Deep Learning research streams. So what is an ML collage? Simply put, I draft a one-slide visual summary of one of my favourite recent papers every single week. At the end of the month, all of the resulting visual collages are collected in a summary blog post. With this, I hope to give you a visual and intuitive deep dive into some of the coolest trends. So without further ado: here are my four favourite papers that I read in August 2021 and why I believe them to be important for the future of Deep Learning.
‘Auto-Sklearn 2.0: Hands-free AutoML via Meta-Learning’
One Paragraph Summary: Auto-ML holds the promise of eliminating the tedious manual tuning of hyperparameters and model selection. One example is the Auto-Sklearn API, which provides a simple high-level interface to automatically evaluate multiple preprocessing and model fitting pipelines. A key ingredient of previous Auto-ML systems has been the use of so-called meta-features, which are computed up front for the dataset at hand. These features are then used to select a “policy” for sequentially searching through the solution space. The policy selection is based on the meta-feature distance to a meta-dataset of representative datasets. This can generalise poorly if your dataset differs substantially from the datasets in the meta-dataset. Furthermore, it can be hard to design representative meta-features and to tune the hyperparameters of the Auto-ML algorithm itself. Auto-Sklearn 2.0 aims to overcome both of these challenges by introducing two modifications: First, instead of relying on meta-features, the authors use a meta-learned portfolio of initial pipelines. These portfolio candidates are evaluated first in order to warmstart the Bayesian Optimization inner loop. Second, they introduce a meta-learned policy selector, which prescribes a model selection strategy (e.g. cross-validation versus simple holdout evaluation) and a budget allocation strategy (full budget versus more aggressive successive halving) based on the number of samples and features in the considered dataset. Hence, the system comes closer to a hierarchical meta-meta approach. The authors validate their proposed modifications on the OpenML Benchmark and report a new state of the art for both 10 and 60 minute time budgets.
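To make the two ingredients a bit more concrete, here is a minimal toy sketch in plain Python. It is not the Auto-Sklearn 2.0 implementation: the function names `select_policy` and `successive_halving`, the size threshold, the `eta` and budget values, and the toy loss are all assumptions made for illustration. In particular, the paper meta-learns the policy selector, whereas this sketch simply hard-codes a rule on the number of samples and features.

```python
def select_policy(n_samples, n_features, size_threshold=1_000_000):
    """Pick an evaluation and budget strategy from the dataset shape.

    Hypothetical hand-written rule: the real system meta-learns this
    mapping. The threshold is an arbitrary illustrative value.
    """
    large = n_samples * n_features > size_threshold
    return {
        "model_selection": "holdout" if large else "cross-validation",
        "budget_allocation": "successive-halving" if large else "full-budget",
    }


def successive_halving(candidates, evaluate, min_budget=1, max_budget=27, eta=3):
    """Generic successive halving over a fixed list of candidates.

    Each round scores all survivors at the current budget, keeps the best
    1/eta of them (lower loss is better) and multiplies the budget by eta.
    """
    survivors = list(candidates)
    budget = min_budget
    while True:
        scored = sorted(survivors, key=lambda c: evaluate(c, budget))
        if budget >= max_budget or len(scored) == 1:
            return scored[0]
        keep = max(1, len(scored) // eta)
        survivors = scored[:keep]
        budget = min(budget * eta, max_budget)


def toy_loss(config, budget):
    """Made-up loss: best at lr = 0.1, and improves with a larger budget."""
    return (config["lr"] - 0.1) ** 2 + 1.0 / budget


# A stand-in for the meta-learned portfolio: a fixed list of starting configs.
portfolio = [{"lr": lr} for lr in (0.001, 0.01, 0.1, 0.3, 1.0, 0.05, 0.2, 0.5, 0.03)]

policy = select_policy(n_samples=100_000, n_features=100)
if policy["budget_allocation"] == "successive-halving":
    best = successive_halving(portfolio, toy_loss)
else:
    # Full budget: evaluate every portfolio member at the maximum budget.
    best = min(portfolio, key=lambda c: toy_loss(c, 27))
```

In the actual system the portfolio evaluations are used to warmstart Bayesian Optimization, which then keeps proposing new pipelines. The sketch stops after the portfolio stage to stay short.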