In this project, we contrasted two approaches to Named Entity Recognition (NER) on three benchmark datasets. The first was the biLSTM-CRF neural architecture that has long been the standard for NER; the second used pretrained language models, namely BERT and ELMo. We used the contextual word embeddings from ELMo and a pretrained BERT model for NER. The datasets were the CoNLL 2003 dataset, the MIT Movie dataset, and the GMB NER dataset. In our analysis, we find that the language models do better than the neural architecture, with ELMo outperforming both BERT and biLSTM-CRF.
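To make the biLSTM-CRF side concrete: the CRF layer on top of the biLSTM decodes the most likely tag sequence with the Viterbi algorithm, scoring each sequence by per-token emission scores plus tag-transition scores. The sketch below is a hypothetical, framework-free illustration of that decoding step with made-up scores, not the implementation used in the project.

```python
def viterbi_decode(emissions, transitions, tags):
    """Return the highest-scoring tag sequence.

    emissions: list (one dict per token) mapping tag -> emission score
    transitions: dict mapping (prev_tag, tag) -> transition score
    tags: list of possible tags
    """
    # Initialise with the first token's emission scores.
    scores = {t: emissions[0][t] for t in tags}
    backpointers = []
    for emission in emissions[1:]:
        new_scores, pointers = {}, {}
        for t in tags:
            # Pick the best previous tag for reaching tag t at this position.
            prev = max(tags, key=lambda p: scores[p] + transitions[(p, t)])
            new_scores[t] = scores[prev] + transitions[(prev, t)] + emission[t]
            pointers[t] = prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final tag.
    best = max(tags, key=lambda t: scores[t])
    path = [best]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return list(reversed(path))


# Toy example: three tokens, illustrative scores (a biLSTM would produce the
# emissions; the transition table is learned by the CRF).
tags = ["B-PER", "I-PER", "O"]
emissions = [
    {"B-PER": 2.0, "I-PER": 0.1, "O": 0.5},
    {"B-PER": 0.2, "I-PER": 1.5, "O": 0.8},
    {"B-PER": 0.1, "I-PER": 0.2, "O": 2.0},
]
transitions = {(p, t): 0.0 for p in tags for t in tags}
transitions[("O", "I-PER")] = -5.0   # forbid I-PER after O
transitions[("B-PER", "I-PER")] = 1.0  # encourage continuing an entity

print(viterbi_decode(emissions, transitions, tags))  # ['B-PER', 'I-PER', 'O']
```

The transition scores are what let the CRF rule out invalid tag sequences (such as I-PER directly after O) that a softmax over per-token scores alone would allow.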