Improving task-agnostic BERT distillation with layer mapping search

Authors
Keywords
Pre-trained language models, BERT, Knowledge distillation, Task-agnostic, Layer mapping
Journal
Neurocomputing
Volume 461, Pages 194-203
Publisher
Elsevier BV
Online
2021-07-22
DOI
10.1016/j.neucom.2021.07.050
