Improving task-agnostic BERT distillation with layer mapping search

Authors
Keywords
Pre-trained language models, BERT, Knowledge distillation, Task-agnostic, Layer mapping
Journal
Neurocomputing
Volume 461, Pages 194-203
Publisher
Elsevier BV
Online
2021-07-22
DOI
10.1016/j.neucom.2021.07.050
