Awesome - Neural Architecture Search

This repo provides an up-to-date list of progress made in neural architecture search (NAS), including but not limited to papers, datasets, codebases, and frameworks. Please feel free to open an issue to add new progress.

Note: The papers are grouped by publication year. Within each group, the papers are sorted by citation count. Papers with underlined titles are milestones in the field. Third-party code links prefer PyTorch implementations. If you are interested in manually designed architectures, please refer to my other repo awesome-vision-architecture.

  • FairNAS: Rethinking Evaluation Fairness of Weight Sharing Neural Architecture Search Cited by 167 ICCV 2021 Xiaomi AI Lab FairNAS Supernet Training PDF Official Code (Stars 297) TL;DR: Motivated by the inherent unfairness in supernet training, the authors propose two levels of constraints: expectation fairness and strict fairness. In particular, strict fairness ensures equal optimization opportunities for all choice blocks throughout training, neither overestimating nor underestimating their capacity.
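
A minimal PyTorch-style sketch of the strict-fairness training step described above, assuming a hypothetical supernet(x, path) interface that executes one chosen block per layer (the interface and names are assumptions, not the official code):

```python
import random

import torch

def strict_fairness_step(supernet, optimizer, images, labels,
                         num_layers, num_choices):
    criterion = torch.nn.CrossEntropyLoss()
    # One random permutation of the choice indices per layer.
    perms = [random.sample(range(num_choices), num_choices)
             for _ in range(num_layers)]
    optimizer.zero_grad()
    # Train num_choices single-path models; column k of the permutations
    # gives one path, so every choice block is updated exactly once per
    # step (strict fairness).
    for k in range(num_choices):
        path = [perms[layer][k] for layer in range(num_layers)]
        loss = criterion(supernet(images, path), labels)
        loss.backward()          # gradients accumulate across the paths
    optimizer.step()             # single parameter update per step
```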

  • Zero-Cost Proxies for Lightweight NAS Cited by 51 ICLR 2021 Samsung AI Center, Cambridge Zero-Cost NAS PDF Official Code (Stars 80) TL;DR: In this paper, the authors evaluate conventional reduced-training proxies and quantify how well they preserve ranking between neural network models during search when compared with the rankings produced by final trained accuracy.
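
The evaluation protocol above boils down to rank correlation between cheap proxy scores and final trained accuracy; a small sketch with made-up numbers (the scores and accuracies below are purely illustrative):

```python
from scipy.stats import spearmanr

# Hypothetical measurements for a pool of candidate architectures:
# proxy_scores[i] is a cheap score (e.g., a zero-cost metric from one
# minibatch) and final_accuracies[i] is the accuracy after full training.
proxy_scores = [0.12, 0.34, 0.29, 0.08, 0.41]
final_accuracies = [71.3, 74.8, 74.1, 70.2, 75.5]

# Rank correlation quantifies how well the proxy preserves the ranking
# produced by final trained accuracy.
rho, p_value = spearmanr(proxy_scores, final_accuracies)
print(f"Spearman rank correlation: {rho:.3f} (p={p_value:.3g})")
```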

  • AutoFormer: Searching Transformers for Visual Recognition Cited by 32 ICCV 2021 Stony Brook University Microsoft Research Asia AutoFormer PDF Official Code (Stars 554) TL;DR: The authors propose a new one-shot architecture search framework, namely AutoFormer, dedicated to vision transformer search. AutoFormer entangles the weights of different blocks in the same layers during supernet training. The performance of these subnets with weights inherited from the supernet is comparable to those retrained from scratch.
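
A minimal sketch of the weight-entanglement idea for a single linear projection, assuming smaller sub-blocks reuse a leading slice of the largest block's weights (the class name and dimensions are illustrative, not the official implementation):

```python
import torch
import torch.nn as nn

class EntangledLinear(nn.Module):
    def __init__(self, max_in: int, max_out: int):
        super().__init__()
        # Only the weights of the largest choice are stored.
        self.weight = nn.Parameter(torch.empty(max_out, max_in))
        self.bias = nn.Parameter(torch.zeros(max_out))
        nn.init.trunc_normal_(self.weight, std=0.02)

    def forward(self, x, in_dim: int, out_dim: int):
        # Smaller sub-blocks are "entangled" with the leading slice of the
        # largest block's weights instead of owning separate copies.
        w = self.weight[:out_dim, :in_dim]
        b = self.bias[:out_dim]
        return nn.functional.linear(x, w, b)

layer = EntangledLinear(max_in=640, max_out=640)
x = torch.randn(4, 384)                 # sampled embedding dim = 384
y = layer(x, in_dim=384, out_dim=512)   # sampled hidden dim = 512
```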

  • Vision Transformer Architecture Search Cited by 10 arXiv 2021 The University of Sydney SenseTime Research Vision Transformer Superformer PDF Official Code (Stars 40) TL;DR: This paper presents a new cyclic weight-sharing mechanism for the token embeddings of Vision Transformers, which enables each channel to contribute more evenly to all candidate architectures.

  • Searching the Search Space of Vision Transformer Cited by 0 NeurIPS 2021 Institute of Automation, CAS Microsoft Research AutoFormerV2 S3 PDF Official Code (Stars 554) TL;DR: The authors propose to use neural architecture search to automate the design of the search space itself, searching not only the architecture but also the search space. The central idea is to gradually evolve different search dimensions guided by their E-T Error computed using a weight-sharing supernet.

  • Once-for-All: Train One Network and Specialize it for Efficient Deployment Cited by 508 ICLR 2020 Massachusetts Institute of Technology Once-for-All OFA PDF Official Code (Stars 1.5k) TL;DR: Conventional NAS approaches find a specialized neural network and need to train it from scratch for each deployment case. The authors propose to train a once-for-all (OFA) network that supports diverse architectural settings by decoupling training and search: a specialized sub-network is quickly obtained by selecting from the OFA network without additional training.
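
A toy sketch of the decoupled search stage, assuming an already-trained OFA supernet exposed through a set_active_subnet-style interface; the elastic choices and all callables below are assumptions for illustration, not the official API:

```python
import random

KERNEL_SIZES, DEPTHS, EXPAND_RATIOS = [3, 5, 7], [2, 3, 4], [3, 4, 6]

def sample_config(num_stages=5):
    # One elastic kernel size, depth, and width-expansion ratio per stage.
    return {
        "ks": [random.choice(KERNEL_SIZES) for _ in range(num_stages)],
        "d": [random.choice(DEPTHS) for _ in range(num_stages)],
        "e": [random.choice(EXPAND_RATIOS) for _ in range(num_stages)],
    }

def specialize(set_active_subnet, measure_latency_ms, evaluate_accuracy,
               latency_budget_ms, num_samples=500):
    best_cfg, best_acc = None, float("-inf")
    for _ in range(num_samples):
        cfg = sample_config()
        set_active_subnet(**cfg)              # select a sub-network, no retraining
        if measure_latency_ms() > latency_budget_ms:
            continue
        acc = evaluate_accuracy()             # or a cheap accuracy predictor
        if acc > best_acc:
            best_cfg, best_acc = cfg, acc
    return best_cfg, best_acc
```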

  • Designing Network Design Spaces Cited by 476 CVPR 2020 FAIR RegNet PDF Third-party Code (Stars 162) TL;DR: Instead of focusing on designing individual network instances, the authors design network design spaces that parametrize populations of networks. The overall process is analogous to classic manual design of networks, but elevated to the design space level.
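
A hedged sketch of the kind of low-dimensional parameterization such a design space uses: per-block widths follow a linear rule and are then quantized to a geometric progression. The constants below are illustrative, not a published RegNet configuration:

```python
import numpy as np

def regnet_widths(depth=13, w0=24, wa=36.0, wm=2.5, divisor=8):
    u = w0 + wa * np.arange(depth)               # linear per-block widths
    s = np.round(np.log(u / w0) / np.log(wm))    # quantization exponents
    widths = w0 * np.power(wm, s)                # geometric quantization
    widths = (np.round(widths / divisor) * divisor).astype(int)
    # Consecutive blocks with equal width form one stage.
    stage_widths, stage_depths = np.unique(widths, return_counts=True)
    return widths.tolist(), stage_widths.tolist(), stage_depths.tolist()

per_block, stages, depths = regnet_widths()
print(stages, depths)   # a few stages of increasing width and their depths
```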

  • Single Path One-Shot Neural Architecture Search with Uniform Sampling Cited by 436 ECCV 2020 MEGVII Technology SPOS Supernet Training PDF Third-party Code (Stars 208) TL;DR: The authors seek to construct a simplified supernet in which all architectures are single paths, so that the weight co-adaptation problem is alleviated. Training is performed by uniform path sampling; all architectures (and their weights) are trained fully and equally.
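
A minimal sketch of one uniform single-path training step, again assuming a hypothetical supernet(x, path) interface that executes only the chosen block in each layer:

```python
import random

import torch

def spos_training_step(supernet, optimizer, images, labels, num_choices_per_layer):
    criterion = torch.nn.CrossEntropyLoss()
    # Uniformly sample one single-path architecture for this step.
    path = [random.randrange(n) for n in num_choices_per_layer]
    optimizer.zero_grad()
    loss = criterion(supernet(images, path), labels)
    loss.backward()
    optimizer.step()
    return path, loss.item()
```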

  • FBNetV2: Differentiable Neural Architecture Search for Spatial and Channel Dimensions Cited by 151 CVPR 2020 UC Berkeley Facebook Inc. FBNetV2 PDF Official Code (Stars 724) TL;DR: The authors propose a memory and computationally efficient DNAS variant: DMaskingNAS. This algorithm expands the search space by up to 10^14x over conventional DNAS, supporting searches over spatial and channel dimensions that are otherwise prohibitively expensive: input resolution and number of filters.
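
A minimal sketch of the channel-masking trick behind the spatial/channel search: a single shared full-width convolution whose output channels are softly masked by a Gumbel-softmax-weighted sum of binary masks. The class name and channel choices are illustrative assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedChannelConv(nn.Module):
    def __init__(self, in_ch, max_out_ch, channel_choices=(8, 16, 24, 32)):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, max_out_ch, kernel_size=3, padding=1)
        masks = torch.zeros(len(channel_choices), max_out_ch)
        for i, c in enumerate(channel_choices):
            masks[i, :c] = 1.0                      # keep the first c channels
        self.register_buffer("masks", masks)
        self.alpha = nn.Parameter(torch.zeros(len(channel_choices)))

    def forward(self, x, tau: float = 1.0):
        g = F.gumbel_softmax(self.alpha, tau=tau)   # architecture weights
        mask = (g[:, None] * self.masks).sum(dim=0) # soft channel mask
        return self.conv(x) * mask[None, :, None, None]

block = MaskedChannelConv(in_ch=16, max_out_ch=32)
out = block(torch.randn(2, 16, 28, 28))
```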

  • EcoNAS: Finding Proxies for Economical Neural Architecture Search Cited by 56 CVPR 2020 The University of Sydney SenseTime Computer Vision Research Group EcoNAS PDF TL;DR: The authors observe that most existing proxies exhibit different behaviors in maintaining the rank consistency among network candidates. In particular, some proxies can be more reliable. Inspired by these observations, the authors present a reliable proxy and further formulate a hierarchical proxy strategy that spends more computations on candidate networks that are potentially more accurate.

  • FBNetV3: Joint Architecture-Recipe Search using Neural Acquisition Function Cited by 41 arXiv 2020 Facebook Inc. UC Berkeley UNC Chapel Hill FBNetV3 PDF TL;DR: Previous NAS methods search for architectures under one set of training hyper-parameters (i.e., a training recipe), overlooking superior architecture-recipe combinations. To address this, this paper presents Neural Architecture-Recipe Search (NARS) to search both architectures and their corresponding training recipes, simultaneously.

  • Semi-Supervised Neural Architecture Search Cited by 27 NeurIPS 2020 University of Science and Technology of China Microsoft Research Asia Semi-Supervised NAS PDF TL;DR: Neural architecture search (NAS) relies on a good controller to generate promising architectures. However, training the controller requires both abundant and high-quality pairs of architectures and their accuracy, which is costly. In this paper, the authors propose SemiNAS, a semi-supervised NAS approach that leverages numerous unlabeled architectures (without evaluation and thus nearly no cost).
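
A hedged sketch of the semi-supervised data flow, assuming a generic accuracy predictor with fit/predict methods; the paper's actual predictor is an encoder-predictor-decoder, which is omitted here:

```python
def seminas_train_predictor(Predictor, labeled_archs, labeled_accs, unlabeled_archs):
    # 1) Train an accuracy predictor on a small set of (architecture, accuracy) pairs.
    predictor = Predictor()
    predictor.fit(labeled_archs, labeled_accs)

    # 2) Pseudo-label a large pool of unlabeled architectures (nearly free:
    #    none of them has to be trained or evaluated).
    pseudo_accs = predictor.predict(unlabeled_archs)

    # 3) Retrain the predictor on labeled + pseudo-labeled data and use it
    #    to guide the architecture search.
    predictor = Predictor()
    predictor.fit(list(labeled_archs) + list(unlabeled_archs),
                  list(labeled_accs) + list(pseudo_accs))
    return predictor
```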

  • DARTS: Differentiable Architecture Search Cited by 2.5k ICLR 2019 Carnegie Mellon University Google DeepMind DARTS PDF Official Code (Stars 3.6k) TL;DR: This paper addresses the scalability challenge of architecture search by formulating the task in a differentiable manner. Unlike conventional approaches of applying evolution or reinforcement learning over a discrete and non-differentiable search space, the proposed method is based on the continuous relaxation of the architecture representation, allowing efficient search of the architecture using gradient descent.
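
A minimal PyTorch sketch of the continuous relaxation on one edge: candidate operations are mixed with softmax-normalized architecture weights, which then receive gradients like ordinary parameters. The candidate set below is illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    def __init__(self, candidate_ops):
        super().__init__()
        self.ops = nn.ModuleList(candidate_ops)
        self.alpha = nn.Parameter(1e-3 * torch.randn(len(candidate_ops)))

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)
        # Softmax-weighted sum of all candidate operations on this edge.
        return sum(w * op(x) for w, op in zip(weights, self.ops))

C = 16
ops = [
    nn.Conv2d(C, C, 3, padding=1),
    nn.Conv2d(C, C, 5, padding=2),
    nn.MaxPool2d(3, stride=1, padding=1),
    nn.Identity(),
]
edge = MixedOp(ops)
y = edge(torch.randn(2, C, 32, 32))
```

After search, each edge keeps the operation with the largest architecture weight; in the full algorithm the architecture weights are updated on validation data while the operation weights are updated on training data.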

  • Regularized Evolution for Image Classifier Architecture Search Cited by 2.0k AAAI 2019 Google Brain Evolution AmoebaNet PDF Official Code (Stars 23.4k) TL;DR: The authors evolve an image classifier---AmoebaNet-A---that surpasses hand-designs for the first time. To do this, they modify the tournament selection evolutionary algorithm by introducing an age property to favor the younger genotypes.
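
A compact sketch of aging evolution, assuming problem-specific random_arch, mutate, and train_and_eval callables: tournament selection picks a parent from a random sample, the mutated child joins the population, and the oldest individual is removed, which favors younger genotypes:

```python
import collections
import random

def regularized_evolution(random_arch, mutate, train_and_eval,
                          population_size=100, sample_size=25, cycles=1000):
    population = collections.deque()
    history = []
    while len(population) < population_size:           # random initialization
        arch = random_arch()
        individual = {"arch": arch, "acc": train_and_eval(arch)}
        population.append(individual)
        history.append(individual)
    for _ in range(cycles):
        sample = random.sample(list(population), sample_size)
        parent = max(sample, key=lambda ind: ind["acc"])  # tournament winner
        child_arch = mutate(parent["arch"])
        child = {"arch": child_arch, "acc": train_and_eval(child_arch)}
        population.append(child)
        population.popleft()                            # discard the oldest
        history.append(child)
    return max(history, key=lambda ind: ind["acc"])
```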

  • MnasNet: Platform-Aware Neural Architecture Search for Mobile Cited by 1.8k CVPR 2019 Google Brain Google Inc. MNASNet PDF Official Code (Stars 4.8k) TL;DR: The authors propose an automated mobile neural architecture search (MNAS) approach that explicitly incorporates model latency into the main objective, where latency is directly measured as real-world inference latency by executing the model on mobile phones.
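
A hedged sketch of the latency-aware objective: validation accuracy is scaled by a soft latency penalty relative to a target latency. The exponent values follow the commonly cited soft-constraint setting and are stated here as an assumption:

```python
def mnas_reward(accuracy, latency_ms, target_ms, alpha=-0.07, beta=-0.07):
    # Accuracy weighted by (latency / target) ** w; w switches between
    # alpha and beta depending on whether the target latency is met.
    w = alpha if latency_ms <= target_ms else beta
    return accuracy * (latency_ms / target_ms) ** w

# Example: a model slightly over an 80 ms target is mildly penalized.
print(mnas_reward(accuracy=0.752, latency_ms=90.0, target_ms=80.0))
```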

  • ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware Cited by 1.2k ICLR 2019 Massachusetts Institute of Technology ProxylessNAS PDF Official Code (Stars 1.3k) TL;DR: This paper presents ProxylessNAS, which directly learns architectures for large-scale target tasks and target hardware platforms. The proposed method addresses the high memory consumption of differentiable NAS and reduces the computational cost (GPU hours and GPU memory) to the same level as regular training, while still allowing a large candidate set.

  • FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search Cited by 811 CVPR 2019 UC Berkeley Princeton University Facebook Inc. FBNet Latency Table PDF Official Code (Stars 724) TL;DR: The authors propose a differentiable neural architecture search (DNAS) framework that uses gradient-based methods to optimize ConvNet architectures, by directly considering latency on target devices.
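
A minimal sketch of a differentiable expected-latency term built from a lookup table of per-block latencies measured on the target device; the table values are placeholders and the paper combines this estimate with the task loss in a somewhat different functional form:

```python
import torch
import torch.nn.functional as F

# Per-layer, per-candidate-block latencies (made-up placeholder values).
latency_table_ms = torch.tensor([
    [1.2, 2.5, 0.4],   # layer 0
    [1.8, 3.1, 0.6],   # layer 1
    [2.0, 3.9, 0.7],   # layer 2
])
# One vector of architecture logits per layer.
thetas = torch.nn.Parameter(torch.zeros(3, 3))

def expected_latency(thetas, latency_table_ms, tau=1.0):
    probs = F.gumbel_softmax(thetas, tau=tau, dim=-1)   # soft block choices
    return (probs * latency_table_ms).sum()

# Combined objective: task loss plus a weighted, differentiable latency cost.
task_loss = torch.tensor(1.0)            # placeholder for the cross-entropy term
loss = task_loss + 0.1 * expected_latency(thetas, latency_table_ms)
loss.backward()                          # gradients flow into the architecture logits
```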

  • Efficient Neural Architecture Search via Parameters Sharing Cited by 1.9k ICML 2018 Google Brain Carnegie Mellon University ENAS Reinforcement Learning PDF Third-party Code (Stars 2.6k) TL;DR: The proposed method (ENAS) constructs a large computational graph (a supernet), where each subgraph represents a neural network architecture, hence forcing all architectures to share their parameters. Evaluating candidate architectures through these subgraphs and their shared parameters requires far fewer GPU hours (1000x less expensive than existing methods).

  • Neural Architecture Search with Reinforcement Learning Cited by 4.0k ICLR 2017 Google Brain Reinforcement Learning PDF Third-party Code (Stars 395) TL;DR: This pioneering work exploits the paradigm of reinforcement learning (RL) to solve the NAS problem. Specifically, the authors use a recurrent network to generate model descriptions of neural networks and train this RNN with reinforcement learning to maximize the expected accuracy of the generated architectures on a validation set.
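
A hedged sketch of such an RNN controller trained with REINFORCE, assuming an external evaluate_architecture callable that trains the child network and returns its validation accuracy; the vocabulary size and decision count are illustrative:

```python
import torch
import torch.nn as nn

class Controller(nn.Module):
    def __init__(self, vocab_size=6, num_decisions=12, hidden=64):
        super().__init__()
        self.num_decisions = num_decisions
        self.embed = nn.Embedding(vocab_size, hidden)
        self.cell = nn.LSTMCell(hidden, hidden)
        self.head = nn.Linear(hidden, vocab_size)

    def sample(self):
        tokens, log_probs = [], []
        inp = torch.zeros(1, dtype=torch.long)
        h = torch.zeros(1, self.cell.hidden_size)
        c = torch.zeros(1, self.cell.hidden_size)
        for _ in range(self.num_decisions):
            h, c = self.cell(self.embed(inp), (h, c))
            dist = torch.distributions.Categorical(logits=self.head(h))
            token = dist.sample()
            tokens.append(token.item())
            log_probs.append(dist.log_prob(token))
            inp = token                      # feed the choice back in
        return tokens, torch.stack(log_probs).sum()

def train_controller(evaluate_architecture, steps=1000, lr=3e-4):
    controller = Controller()
    optimizer = torch.optim.Adam(controller.parameters(), lr=lr)
    baseline = 0.0
    for _ in range(steps):
        tokens, log_prob = controller.sample()
        reward = evaluate_architecture(tokens)    # child network's val accuracy
        baseline = 0.9 * baseline + 0.1 * reward  # moving-average baseline
        loss = -(reward - baseline) * log_prob    # REINFORCE with baseline
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return controller
```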

  • Designing Neural Network Architectures using Reinforcement Learning Cited by 1.2k ICLR 2017 Massachusetts Institute of Technology Q-learning Reinforcement Learning PDF Official Code (Stars 127) TL;DR: The authors introduce MetaQNN, a meta-modeling algorithm based on reinforcement learning to automatically generate high-performing CNN architectures for a given learning task. The learning agent is trained to sequentially choose CNN layers using Q-learning with an ε-greedy exploration strategy and experience replay.
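
A simplified sketch of the tabular Q-learning update and ε-greedy choice behind this approach; experience replay and the real state/action encoding are omitted, and the reward (validation accuracy) arrives only at the terminal state:

```python
import random

def metaqnn_update(q_table, trajectory, reward, lr=0.1, gamma=1.0):
    # trajectory is a list of (state, action) pairs for one sampled CNN.
    for i, (state, action) in enumerate(reversed(trajectory)):
        if i == 0:
            target = reward                                   # terminal reward
        else:
            next_state = trajectory[len(trajectory) - i][0]
            target = gamma * max(q_table[next_state].values())
        q = q_table[state][action]
        q_table[state][action] = q + lr * (target - q)

def epsilon_greedy(q_table, state, epsilon):
    if random.random() < epsilon:
        return random.choice(list(q_table[state]))            # explore
    return max(q_table[state], key=q_table[state].get)        # exploit
```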

  • Neural Architecture Search: A Survey Cited by 1.5k JMLR 2019 University of Freiburg Survey PDF TL;DR: The authors provide an overview of existing work in this field of research and categorize it according to three dimensions: search space, search strategy, and performance estimation strategy.

  • NAS-Bench-101 Download Link TL;DR: This dataset contains 423,624 unique neural networks exhaustively generated and evaluated from a fixed graph-based search space. Each network is trained and evaluated multiple times on CIFAR-10 at various training budgets, and the metrics are presented in a queryable API (a hedged query example is shown after this list). The current release contains over 5 million trained and evaluated models. How to cite: NAS-Bench-101: Towards Reproducible Neural Architecture Search Cited by 340 ICML 2019 Google Brain NAS-Bench-101 PDF

  • D-X-Y/AutoDL-Projects Stars 1.4k AutoDL TL;DR: Automated Deep Learning Projects (AutoDL-Projects) is an open-source, lightweight, but useful project for everyone. It implements several neural architecture search (NAS) and hyper-parameter optimization (HPO) algorithms.
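
As referenced in the NAS-Bench-101 entry above, a hedged example of querying the dataset through the official nasbench API, following the usage shown in the google-research/nasbench repository; the file path is a placeholder and field names may vary by release:

```python
from nasbench import api

nasbench = api.NASBench("/path/to/nasbench_only108.tfrecord")

# A cell is a 7x7 upper-triangular adjacency matrix plus one op label per node.
model_spec = api.ModelSpec(
    matrix=[[0, 1, 1, 1, 0, 1, 0],
            [0, 0, 0, 0, 0, 0, 1],
            [0, 0, 0, 0, 0, 0, 1],
            [0, 0, 0, 0, 1, 0, 0],
            [0, 0, 0, 0, 0, 0, 1],
            [0, 0, 0, 0, 0, 0, 1],
            [0, 0, 0, 0, 0, 0, 0]],
    ops=["input", "conv3x3-bn-relu", "maxpool3x3", "conv3x3-bn-relu",
         "conv3x3-bn-relu", "conv1x1-bn-relu", "output"],
)
data = nasbench.query(model_spec)
print(data["validation_accuracy"], data["test_accuracy"])
```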
