Google open-sources BERT, a state-of-the-art pretraining technique for natural language processing

Nitin Naresh November 2, 2018

Natural language processing (NLP) — the subcategory of artificial intelligence (AI) that spans language translation, sentiment analysis, semantic search, and dozens of other linguistic tasks — is easier said than done. Procuring diverse datasets large enough to train text-parsing AI systems is an ongoing challenge for researchers; modern deep learning models, which mimic the behavior of neurons in the human brain, improve when trained on millions, or even billions, of annotated examples.
One popular solution is pretraining, which refines general-purpose language models trained on unlabeled text to perform specific tasks. Google this week open-sourced its cutting-edge take on the technique — Bidirectional Encoder Representations from Transformers, or BERT — which it claims enables developers to train a “state-of-the-art” NLP model in 30 minutes on a single Cloud TPU (tensor processing unit, Google’s cloud-hosted accelerator hardware) or a few hours on a single graphics processing unit.
The release is available on GitHub, and includes pretrained language representation models (in English) and source code built on top of the Mountain View company’s TensorFlow machine learning framework. Additionally, there’s a corresponding notebook on Colab, Google’s free cloud service for AI developers.
As Jacob Devlin and Ming-Wei Chang, research scientists at Google AI, explained, BERT is unique in that it’s both bidirectional, allowing it to access context from both past and future directions, and unsupervised, meaning it can ingest data that’s neither classified nor labeled. That’s as opposed to conventional NLP models such as word2vec and GloVe, which generate a single, context-free word embedding (a mathematical representation of a word) for each word in their vocabularies.
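
The contrast between a context-free embedding and one that draws on both directions can be sketched with a toy example. Everything here is an assumption made for illustration: the two-dimensional vectors are invented, and the neighbour-averaging rule is a stand-in, not how BERT actually computes its representations.

```python
# Toy illustration only: the vectors and the averaging rule are made up,
# and this is not how BERT computes its representations.
STATIC_VECTORS = {
    "bank": [1.0, 0.0],
    "river": [0.0, 1.0],
    "account": [1.0, 1.0],
    "the": [0.0, 0.0],
}


def context_free(tokens, index):
    """Look up a fixed vector; the surrounding words are ignored."""
    return STATIC_VECTORS[tokens[index]]


def bidirectional(tokens, index, window=1):
    """Average the word's vector with its neighbours on both sides."""
    lo = max(0, index - window)
    hi = min(len(tokens), index + window + 1)
    neighbours = [STATIC_VECTORS[t] for t in tokens[lo:hi]]
    dims = len(neighbours[0])
    return [sum(v[d] for v in neighbours) / len(neighbours) for d in range(dims)]


sentence_a = ["the", "river", "bank"]
sentence_b = ["the", "bank", "account"]

# Same vector for "bank" in both sentences.
same = context_free(sentence_a, 2) == context_free(sentence_b, 1)
# Different vectors once the words to the left and right are taken into account.
differs = bidirectional(sentence_a, 2) != bidirectional(sentence_b, 1)
```

In the second sentence the disambiguating word ("account") comes after "bank", which is why access to the future direction matters.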
BERT learns to model relationships between sentences by pretraining on a task that can be generated from any corpus, Devlin and Chang wrote. It builds on Google’s Transformer, an open source neural network architecture based on a self-attention mechanism that’s optimized for NLP. (In a paper published last year, Google showed that Transformer outperformed conventional models on English to German and English to French translation benchmarks while requiring less computation to train.)
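
One task of this kind, described in the BERT paper as next-sentence prediction, asks whether a second sentence really follows the first in the corpus or was picked at random. The sketch below shows how such labeled pairs could be generated from plain, unlabeled text. The function name, the 50/50 split, and the sample sentences are illustrative assumptions, not code from Google’s release.

```python
import random


def make_sentence_pairs(sentences, rng):
    """Build (first, second, is_next) examples from an ordered list of sentences.

    Roughly half the pairs keep the true following sentence (is_next=True);
    the rest swap in a randomly chosen other sentence (is_next=False).
    """
    if len(sentences) < 3:
        raise ValueError("need at least three sentences to sample a negative")
    pairs = []
    for i in range(len(sentences) - 1):
        first, true_next = sentences[i], sentences[i + 1]
        if rng.random() < 0.5:
            pairs.append((first, true_next, True))
        else:
            # Any sentence except the current one and its true successor.
            candidates = [s for j, s in enumerate(sentences) if j not in (i, i + 1)]
            pairs.append((first, rng.choice(candidates), False))
    return pairs


# Hypothetical four-sentence corpus; any ordered text would do.
corpus = [
    "The man went to the store.",
    "He bought a gallon of milk.",
    "Penguins are flightless birds.",
    "They live mostly in the Southern Hemisphere.",
]
pairs = make_sentence_pairs(corpus, random.Random(0))
```

No human annotation is needed: the labels come from the order of the sentences themselves.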
When tested on the Stanford Question Answering Dataset (SQuAD), a reading comprehension dataset comprising questions posed on a set of Wikipedia articles, BERT achieved a 93.2 percent F1 score, besting the previous state-of-the-art and human-level scores of 91.6 percent and 91.2 percent, respectively. And on the General Language Understanding Evaluation (GLUE) benchmark, a collection of resources for training and evaluating NLP systems, it scored 80.4 percent.
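
SQuAD results are conventionally reported as a token-overlap F1 score alongside exact match. The following is a minimal sketch of that measure for a single predicted answer, assuming simple lowercasing and whitespace tokenization. The official evaluation script also strips punctuation and articles, which is omitted here, and the function name is hypothetical.

```python
from collections import Counter


def token_f1(prediction, reference):
    """Harmonic mean of token precision and recall between two answer strings."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Two empty answers match; one empty answer scores zero.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often as it
    # appears in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


exact = token_f1("Paris", "Paris")
partial = token_f1("in Paris", "Paris")
wrong = token_f1("London", "Paris")
```

A partially correct answer such as "in Paris" earns partial credit, which an exact-match measure would not give.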
The release of BERT follows on the heels of the debut of Google’s AdaNet, an open source tool for combining machine learning algorithms to achieve better predictive insights, and ActiveQA, a research project that investigates the use of reinforcement learning to train AI agents for question answering.
Source: VentureBeat