SqueezeBERT promises faster mobile NLP while maintaining BERT levels of accuracy


Former DeepScale CEO Forrest Iandola left Tesla to focus on NLP research, he told VentureBeat in a phone interview. Computer vision startup DeepScale was acquired by Tesla in fall 2019 for an undisclosed amount. Iandola said he left Tesla because he wants to explore questions beyond autonomous driving and engage with the kind of accidental discovery that comes with broader forms of AI research.

In research circles, Iandola is perhaps best known for his work in computer vision and as lead author of a 2017 paper on SqueezeNet, a model that achieved AlexNet-like levels of image classification accuracy with 50 times fewer parameters.

In his first piece of NLP research since leaving Tesla, he worked with a team that included DeepScale cofounder and UC Berkeley professor Kurt Keutzer and Tesla senior machine learning engineer Albert Shaw. On Monday, they published a paper detailing SqueezeBERT, a mobile NLP neural network architecture that they say is 4.3 times faster than BERT on a Pixel 3 smartphone while achieving accuracy similar to MobileBERT in GLUE benchmark tasks. A key difference between MobileBERT and SqueezeBERT, Iandola told VentureBeat in an interview, is the use of grouped convolutions to increase speed and efficiency, a technique first introduced in 2012.

“[W]e didn’t really change the size of the layers or how many of them there are, but we sort of grouped convolutions. It’s not really sparsity in the sense that you just delete random parameters, but there are blocks of parameters intentionally missing from the beginning of training, and that’s where the speed-up in our case came from,” he said.
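To make the grouped-convolution idea concrete, here is a minimal sketch in plain Python. It is not the SqueezeBERT code: the function names, the 768-channel layer width, and the choice of 4 groups are assumptions made for illustration. It shows how splitting channels into groups leaves whole blocks of weights absent from the start, while the layer's input and output sizes stay the same.

```python
def conv1d_weight_count(in_channels, out_channels, kernel_size=1, groups=1):
    """Number of weights (biases excluded) in a 1D convolution layer."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output channel only connects to the inputs of its own group.
    return (in_channels // groups) * out_channels * kernel_size


def grouped_pointwise_conv(x, weights, groups):
    """Apply a kernel-size-1 grouped convolution to one token vector.

    x: list of in_channels floats.
    weights: one matrix per group, each shaped
             [out_channels // groups][in_channels // groups].
    """
    in_per_group = len(x) // groups
    out = []
    for g in range(groups):
        # Slice out the input channels that belong to this group.
        chunk = x[g * in_per_group:(g + 1) * in_per_group]
        for row in weights[g]:
            out.append(sum(w * v for w, v in zip(row, chunk)))
    return out


# Illustrative sizes (assumed): a 768-to-768 layer, dense vs. 4 groups.
dense = conv1d_weight_count(768, 768, groups=1)
grouped = conv1d_weight_count(768, 768, groups=4)

# Tiny worked example: 4 input channels, 2 groups, 4 output channels.
demo = grouped_pointwise_conv(
    [1.0, 2.0, 3.0, 4.0],
    [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0], [1.0, -1.0]]],
    groups=2,
)
```

With these assumed sizes, the grouped layer holds a quarter of the dense layer's weights, which is the kind of structured saving Iandola describes, as opposed to deleting random parameters after training.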


SqueezeBERT also relies on techniques derived from SqueezeNAS, a neural architecture search (NAS) model developed last year by former DeepScale employees, including Shaw and Iandola.

Iandola said he chose to commit to NLP research because of advances enabled by Transformer-based networks in recent years. He’s also interested in mobile and edge use cases of NLP that can run locally without data leaving a device.

“I guess I’m not completely backing away from doing vision, but I think NLP feels like where computer vision was in maybe 2013, where AlexNet had just happened and people are going ‘Okay, so what are all the things we want to do over again using this new technology?’ And I feel like in some sense, self-attention networks are that big of a disruption to NLP and people are kind of starting over in designing NLP algorithms,” he said.

Since the open source release of BERT in 2018, Transformer-based models and variations of BERT like Facebook’s RoBERTa, Baidu’s ERNIE, and Google’s XLNet have achieved state-of-the-art results for language models. A group of experts VentureBeat spoke with last year called advances in NLP a major trend in machine learning in 2019.

SqueezeBERT is the latest piece of research at the convergence of computer vision and NLP. Last week, Facebook and UC Berkeley researchers including Keutzer introduced Visual Transformers for finding relationships between visual concepts. Last month, Facebook AI Research released DETR, the first object detection system created using the Transformer neural network architecture that has been at the forefront of advances in NLP.

One potential next step for SqueezeBERT is to attempt downsampling to cut sentences the way computer vision models like EfficientNet or AlexNet cut the height and width of images for speed improvements.

“The notion of treating a sentence like an image that you can upsample or downsample is something that I think could become a popular thing in NLP — we’ll have to see,” Iandola said.
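As a rough illustration of what downsampling a sentence could look like, the sketch below average-pools neighboring token vectors, much as a vision model pools neighboring pixels. This is a hypothetical example, not something taken from the SqueezeBERT paper: the function name, the stride of 2, and the toy vectors are all assumptions.

```python
def downsample_tokens(token_vectors, stride=2):
    """Average each run of `stride` consecutive token vectors into one."""
    pooled = []
    for start in range(0, len(token_vectors), stride):
        # The last window may be shorter than `stride`.
        window = token_vectors[start:start + stride]
        dim = len(window[0])
        pooled.append(
            [sum(vec[i] for vec in window) / len(window) for i in range(dim)]
        )
    return pooled


# A toy "sentence" of 5 tokens, each a 2-dimensional vector (assumed values).
sentence = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [7.0, 6.0], [9.0, 8.0]]
shorter = downsample_tokens(sentence)  # 5 token vectors become 3
```

Because self-attention compares every position with every other position, a shorter sequence means far fewer comparisons, which is where a speed improvement from this kind of pooling would be expected to come from.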

He said SqueezeBERT code will be released for review this summer.


