Early Bird is an energy-efficient method for training deep neural networks (DNNs), the artificial intelligence behind self-driving vehicles, facial recognition, and similar applications. It was developed by researchers from Texas A&M University and Rice University.
A study by lead authors Haoran You and Chaojian Li of Rice’s Efficient and Intelligent Computing (EIC) Lab showed Early Bird could use 10.7 times less energy to train a DNN to the same or better accuracy than typical training. EIC Lab director Yingyan Lin led the research along with Rice’s Richard Baraniuk and Texas A&M’s Zhangyang Wang.
“A major driving force in recent AI breakthroughs is the introduction of bigger, more expensive DNNs,” Lin said. “But training these DNNs demands considerable energy. For more innovations to be unveiled, it is imperative to find ‘greener’ training methods that both address environmental concerns and reduce financial barriers of AI research.”
The reason for Early Bird
Training cutting-edge DNNs is costly. A 2019 study by the Allen Institute for AI in Seattle found that the number of computations needed to train a top-flight deep neural network increased 300,000-fold between 2012 and 2018. Another 2019 study, by researchers at the University of Massachusetts Amherst, found the carbon footprint of training a single elite DNN was roughly equivalent to the lifetime carbon dioxide emissions of five U.S. automobiles.
DNNs comprise billions of artificial neurons that learn to perform specific tasks. Without any explicit programming, deep networks of artificial neurons can learn to make human-like decisions, and even outperform human experts, by “studying” a large number of previous examples. One example is AlphaGo, a deep network trained to play the board game Go, which beat a professional human player in 2015 after studying tens of thousands of previously played games.
“The state-of-the-art way to perform DNN training is called progressive prune and train,” said Lin. “First, you train a dense, giant network, then remove parts that don’t look important, like pruning a tree. Then you retrain the pruned network to restore performance, because performance degrades after pruning. And in practice you need to prune and retrain many times to get good performance.”
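The progressive prune-and-train workflow Lin describes can be sketched on a toy problem. The following is a minimal illustration, not the authors’ implementation: it uses plain gradient descent on a small linear model, magnitude-based pruning, and a few prune/retrain rounds. The model, data, and pruning ratios are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: noiseless linear regression with a mostly sparse ground truth,
# so most weights can be pruned without hurting accuracy.
n, d = 200, 20
w_true = np.zeros(d)
w_true[:5] = [1.0, -2.0, 0.5, 1.5, -1.0]
X = rng.normal(size=(n, d))
y = X @ w_true

def train(w, mask, steps=500, lr=0.1):
    """Gradient descent on mean squared error; pruned weights stay zero."""
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / n
        w = (w - lr * grad) * mask
    return w

# Progressive prune and train: train dense, prune, retrain, repeat.
mask = np.ones(d)
w = train(np.zeros(d), mask)            # costly dense training
for _ in range(3):                      # several prune/retrain rounds
    keep = int(mask.sum() * 0.7)        # prune 30% of remaining weights
    thresh = np.sort(np.abs(w[mask == 1]))[-keep]
    mask = (np.abs(w) >= thresh).astype(float) * mask
    w = train(w, mask)                  # retrain to restore performance

mse = np.mean((X @ w - y) ** 2)
print(f"weights kept: {int(mask.sum())}/{d}, final MSE: {mse:.2e}")
```

Each round removes the smallest-magnitude weights and then retrains the survivors; the repeated retraining is exactly the cost that Early Bird aims to avoid.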
How it works
Training strengthens the connections between the most useful neurons and reveals those that can be pruned away. Because specialized tasks are performed by only a fraction of a network’s artificial neurons, it is possible to “prune” the rest. Pruning plays a key role in reducing model size and computational cost, making DNN training more affordable.
“Our idea in this work is to identify the final, fully functional pruned network, which we call the ‘early-bird ticket,’ in the beginning stage of this costly first step,” Lin said. Lin and her team searched for early-bird tickets by studying key network connectivity patterns in the initial stage of training. They found that early-bird tickets could emerge one-tenth or less of the way through the initial phase of training.
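One way to make “studying key network connectivity patterns” concrete is to compare the pruning masks a network would produce at successive training epochs and declare a ticket found once the mask stops changing. The sketch below is an illustrative simplification under assumed parameters (the keep ratio, stability threshold `eps`, and `patience` window are all invented), using simple magnitude-based masks and a normalized Hamming distance rather than the exact criterion in the published method.

```python
import numpy as np

def prune_mask(weights, keep_ratio=0.3):
    """Binary mask keeping the largest-magnitude fraction of weights."""
    k = max(1, int(len(weights) * keep_ratio))
    thresh = np.sort(np.abs(weights))[-k]
    return (np.abs(weights) >= thresh).astype(int)

def mask_distance(m1, m2):
    """Normalized Hamming distance: fraction of mask entries that differ."""
    return np.mean(m1 != m2)

def find_early_bird(weight_snapshots, keep_ratio=0.3, eps=0.02, patience=3):
    """Return the first epoch at which the last `patience` consecutive
    mask distances all fall below eps, i.e. the mask has stabilized."""
    masks = [prune_mask(w, keep_ratio) for w in weight_snapshots]
    recent = []
    for epoch in range(1, len(masks)):
        recent.append(mask_distance(masks[epoch - 1], masks[epoch]))
        recent = recent[-patience:]
        if len(recent) == patience and max(recent) < eps:
            return epoch
    return None

# Simulated training: per-epoch weight snapshots that settle toward
# final values, so the pruning mask stabilizes long before training ends.
rng = np.random.default_rng(1)
w_final = rng.normal(size=500)
snapshots = [w_final + rng.normal(size=500) * 0.9**t for t in range(60)]
print("early-bird ticket found at epoch:", find_early_bird(snapshots))
```

Once the mask stabilizes, the full dense training can stop early: the stabilized subnetwork is the ticket, and only it needs to be trained to completion.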
This project is a crucial step toward making AI techniques more environmentally friendly and inclusive. If everything goes well, it could open the door to AI innovation on a single laptop or with limited computational resources.
“Our method can automatically identify early-bird tickets within the first 10% or less of the training of the dense, giant networks,” Lin added. “This means you can train a DNN to achieve the same or even better accuracy for a given task in about 10% or less of the time needed for traditional training, which can lead to more than an order of magnitude of savings in both computation and energy.”