In this project, we contrasted two approaches to Named Entity Recognition (NER) on three benchmark datasets. The first was the biLSTM-CRF neural architecture that has long been the standard for NER; the second used pretrained language models, namely BERT and ELMo. We used the contextual word embeddings from ELMo and a pretrained BERT model for NER. The datasets were the CoNLL 2003 dataset, the MIT Movie dataset, and the GMB NER dataset. In our analysis, we find that the language models do better than the neural architecture, with ELMo outperforming both BERT and biLSTM-CRF.
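To make the biLSTM-CRF side concrete: the CRF layer on top of the biLSTM decodes the most likely tag sequence with the Viterbi algorithm, scoring each sequence by per-token emission scores plus tag-transition scores. The sketch below is a hypothetical, framework-free illustration of that decoding step with made-up scores, not the implementation used in the project.

```python
def viterbi_decode(emissions, transitions, tags):
    """Return the highest-scoring tag sequence.

    emissions: list (one dict per token) mapping tag -> emission score
    transitions: dict mapping (prev_tag, tag) -> transition score
    tags: list of possible tags
    """
    # Initialise with the first token's emission scores.
    scores = {t: emissions[0][t] for t in tags}
    backpointers = []
    for emission in emissions[1:]:
        new_scores, pointers = {}, {}
        for t in tags:
            # Pick the best previous tag for reaching tag t at this position.
            prev = max(tags, key=lambda p: scores[p] + transitions[(p, t)])
            new_scores[t] = scores[prev] + transitions[(prev, t)] + emission[t]
            pointers[t] = prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final tag.
    best = max(tags, key=lambda t: scores[t])
    path = [best]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return list(reversed(path))


# Toy example: three tokens, illustrative scores (a biLSTM would produce the
# emissions; the transition table is learned by the CRF).
tags = ["B-PER", "I-PER", "O"]
emissions = [
    {"B-PER": 2.0, "I-PER": 0.1, "O": 0.5},
    {"B-PER": 0.2, "I-PER": 1.5, "O": 0.8},
    {"B-PER": 0.1, "I-PER": 0.2, "O": 2.0},
]
transitions = {(p, t): 0.0 for p in tags for t in tags}
transitions[("O", "I-PER")] = -5.0   # forbid I-PER after O
transitions[("B-PER", "I-PER")] = 1.0  # encourage continuing an entity

print(viterbi_decode(emissions, transitions, tags))  # ['B-PER', 'I-PER', 'O']
```

The transition scores are what let the CRF rule out invalid tag sequences (such as I-PER directly after O) that a softmax over per-token scores alone would allow.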