Weight‑Sharing Neural Architecture Search: Challenges, Methods, and Future Directions
This article reviews three core challenges in AI (data, model, and knowledge), explains why automated machine learning (AutoML) and neural architecture search (NAS) matter, analyzes weight-sharing NAS algorithms and the sources of their instability, surveys improved DARTS-based methods, and discusses experimental results and future research directions.
The next generation of AI models relies on automated techniques to design better deep‑learning architectures, addressing three core challenges: data efficiency, model design, and knowledge representation.
AutoML, particularly Neural Architecture Search (NAS), automates the discovery of network structures, reducing manual effort and computational cost. Weight‑sharing NAS reuses computations across sampled sub‑networks, dramatically improving search efficiency but introducing instability and optimization gaps.
NAS pipelines typically consist of three components: a search space (defining possible architectures), a search strategy (sampling architectures), and an evaluation method (assessing performance). Choices between open vs. closed search spaces, cell‑based vs. whole‑network search, and operation set size affect both flexibility and stability.
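The three components above can be made concrete with a minimal, self-contained sketch. Everything here is a hypothetical toy illustration: the operation names and the scalar "network" are stand-ins, not a real search space, and the strategy shown is simple uniform random sampling rather than any specific published algorithm.

```python
import random

# Search space: each of 3 layers picks one operation from a fixed set.
# These toy scalar ops stand in for real candidates like conv3x3 or pooling.
OPS = {
    "identity": lambda x: x,
    "double":   lambda x: 2 * x,
    "negate":   lambda x: -x,
}

def sample_architecture(num_layers=3, rng=random):
    """Search strategy: uniform random sampling of one op per layer."""
    return [rng.choice(list(OPS)) for _ in range(num_layers)]

def evaluate(arch, x):
    """Evaluation: run the sampled sub-network on an input. In weight-sharing
    NAS, every sampled sub-network would reuse one supernet's weights here
    instead of being trained from scratch."""
    for op_name in arch:
        x = OPS[op_name](x)
    return x

# Example: a fixed architecture applied to the input 1.
result = evaluate(["double", "double", "identity"], 1)
```

A closed, cell-based search space corresponds to keeping `OPS` small and fixed; enlarging the operation set or searching the whole network increases flexibility but, as the article notes, also hurts stability.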
Weight-sharing methods such as DARTS suffer from performance collapse when trained for longer or with deeper networks, owing to gradient-approximation error and over-fitting of the supernet. Several enhanced algorithms have been proposed to mitigate these issues:
P-DARTS (Progressive DARTS): gradually increases network depth during search to reduce depth-related optimization error.
PC-DARTS (Partial-Channel DARTS): randomly samples a subset of channels, improving regularization and search speed.
Stabilized-DARTS: refines gradient estimation to keep the angle between the estimated and true gradients below 90°, enhancing stability.
LA-DARTS: incorporates latency prediction for hardware-aware architecture search.
Scalable-DARTS: expands the operation set via factorized channel search, improving accuracy on CIFAR-10 and ImageNet.
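All of the variants above build on the core DARTS idea: relax the discrete choice among candidate operations into a softmax-weighted mixture, so the architecture parameters can be optimized by gradient descent alongside the shared weights. A minimal sketch of that mixed operation follows; the candidate ops are toy scalar functions for illustration, not the actual DARTS operation set.

```python
import math

def softmax(alphas):
    """Normalize architecture parameters into mixing weights."""
    exps = [math.exp(a) for a in alphas]
    total = sum(exps)
    return [e / total for e in exps]

def mixed_op(x, ops, alphas):
    """DARTS-style mixed operation: a softmax-weighted sum of all candidate
    ops applied to the same input. After search, the discrete architecture
    is derived by keeping the op with the largest alpha on each edge."""
    weights = softmax(alphas)
    return sum(w * op(x) for w, op in zip(weights, ops))

# With equal alphas, the mixture averages the candidates:
ops = [lambda x: x, lambda x: 2 * x]
y = mixed_op(1.0, ops, [0.0, 0.0])  # 0.5*1 + 0.5*2 = 1.5
```

PC-DARTS modifies exactly this step: rather than routing all channels through the mixture, it sends only a randomly sampled fraction of channels through the candidate ops and bypasses the rest, which is where its regularization and speed-up come from.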
Extensive experiments on CIFAR‑10/100 and ImageNet demonstrate that these methods achieve lower error rates and significantly lower GPU‑day consumption compared with the original DARTS (e.g., P‑DARTS: 2.55% error on CIFAR‑10 with 0.3 GPU‑days; PC‑DARTS: 2.57% error with 0.06 GPU‑days).
The article concludes by highlighting two main NAS paradigms—discrete search and weight‑sharing search—emphasizing the need for stable, scalable, and hardware‑friendly methods, and outlines open questions about optimal search strategies, basic search units, and real‑world deployment.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.