Examining the performance of several deep learning architectures for Persian Named Entity Recognition (NER)

As part of my final project to obtain my bachelor’s degree, I undertook a research project on Named Entity Recognition (NER) using deep learning algorithms. NER is a core task in text mining that involves locating entity mentions in text and classifying them into categories such as person, location, and organization. The project compared the performance of several deep learning models on Persian text, using the open-source ArmanPersoNERCorpus dataset.

Background and Motivation

Traditionally, NER systems have been divided into classic and statistical approaches. Classic approaches rely on handcrafted rules, while statistical approaches learn from annotated data and, in recent work, are typically built on deep neural networks. With recent advances in deep learning, interest in statistical approaches to NER has grown. This project set out to explore deep learning algorithms for NER in the Persian language.

Dataset Description

The dataset used in this project consists of 250,015 tokens in 7,682 Persian sentences. It is divided into three folds for training and testing. Each file in the dataset contains one token per line along with its manually annotated named-entity tag. The NER tags follow the IOB format, and the dataset includes annotations for person, organization, location, facility, product, event, and other categories.
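To make the file layout concrete, here is a minimal Python sketch of how such a file could be read and how entity spans could be recovered from IOB tags. It rests on several assumptions that the description above does not confirm: that token and tag are separated by whitespace, that a blank line separates sentences, and that tags look like B-pers, I-pers, B-loc, and O. The function names read_iob and extract_entities are hypothetical, and the sample lines are invented for illustration rather than taken from the corpus.

```python
from typing import Iterable, List, Optional, Tuple

Sentence = List[Tuple[str, str]]  # list of (token, tag) pairs


def read_iob(lines: Iterable[str]) -> List[Sentence]:
    """Group 'token tag' lines into sentences.

    Assumption: a blank line marks the end of a sentence.
    """
    sentences: List[Sentence] = []
    current: Sentence = []
    for raw in lines:
        line = raw.strip()
        if not line:
            if current:
                sentences.append(current)
                current = []
            continue
        # Assumption: the tag is the last whitespace-separated field.
        token, tag = line.rsplit(maxsplit=1)
        current.append((token, tag))
    if current:
        sentences.append(current)
    return sentences


def extract_entities(sentence: Sentence) -> List[Tuple[str, str]]:
    """Collect (entity_text, entity_type) spans from IOB tags."""
    entities: List[Tuple[str, str]] = []
    words: List[str] = []
    etype: Optional[str] = None
    for token, tag in sentence:
        starts_span = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != etype
        )
        if starts_span:
            # Close any open span, then open a new one.
            if words:
                entities.append((" ".join(words), etype))
            words, etype = [token], tag[2:]
        elif tag.startswith("I-"):
            # Continuation of the currently open span.
            words.append(token)
        else:
            # An "O" tag closes any open span.
            if words:
                entities.append((" ".join(words), etype))
            words, etype = [], None
    if words:
        entities.append((" ".join(words), etype))
    return entities


# Invented sample in the assumed layout (not real corpus lines).
sample = """علی B-pers
دایی I-pers
به O
تهران B-loc
رفت O

سازمان B-org
ملل I-org
""".splitlines()

sentences = read_iob(sample)
first_entities = extract_entities(sentences[0])
second_entities = extract_entities(sentences[1])
```

With this sample, read_iob returns two sentences, and extract_entities yields one person span and one location span for the first sentence and a single two-token organization span for the second.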

Methodology

To compare the performance of different deep learning models, I implemented and evaluated the following models:

  1. Bi-directional LSTM-softmax
  2. Bi-directional LSTM-CRF
  3. Bi-directional LSTM-CNNs-CRF

These models were chosen because they are commonly used for NER and have shown promising results in various languages. By training and testing each model on the Persian dataset, I aimed to determine which architecture achieves the best accuracy.
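The first two models in the list differ only in their output layer. A softmax layer picks the best tag for each token independently, while a CRF layer scores whole tag sequences by adding tag-to-tag transition scores and is decoded with the Viterbi algorithm. The Python sketch below illustrates that difference on hand-written numbers. It is not the project's implementation: the per-token scores stand in for what a Bi-directional LSTM would produce, the transition scores stand in for learned CRF parameters, the tag set is reduced to three tags, and the start and end transitions of a full CRF are omitted for brevity. All names and values are hypothetical.

```python
from typing import Dict, List

TAGS = ["O", "B-pers", "I-pers"]

Scores = Dict[str, float]  # tag -> score for one token


def softmax_decode(emissions: List[Scores]) -> List[str]:
    """Pick the highest-scoring tag for each token independently."""
    return [max(TAGS, key=lambda t: scores[t]) for scores in emissions]


def viterbi_decode(
    emissions: List[Scores], transitions: Dict[str, Scores]
) -> List[str]:
    """Find the tag sequence with the highest total score.

    Total score = sum of per-token scores + sum of transition scores
    between consecutive tags (start/end transitions omitted here).
    """
    best = dict(emissions[0])  # best path score ending in each tag
    backpointers: List[Dict[str, str]] = []
    for scores in emissions[1:]:
        new_best: Scores = {}
        pointers: Dict[str, str] = {}
        for cur in TAGS:
            # Best previous tag for reaching `cur` at this position.
            prev = max(TAGS, key=lambda p: best[p] + transitions[p][cur])
            new_best[cur] = best[prev] + transitions[prev][cur] + scores[cur]
            pointers[cur] = prev
        best = new_best
        backpointers.append(pointers)
    # Walk the backpointers from the best final tag to the start.
    path = [max(TAGS, key=lambda t: best[t])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical scores for a two-token name.
emissions = [
    {"O": 1.0, "B-pers": 0.8, "I-pers": 0.1},
    {"O": 0.2, "B-pers": 0.3, "I-pers": 1.5},
]

# Hypothetical transition scores: all zero except two entries.
transitions = {p: {c: 0.0 for c in TAGS} for p in TAGS}
transitions["O"]["I-pers"] = -10.0     # I-pers must not follow O in IOB
transitions["B-pers"]["I-pers"] = 1.0  # I-pers naturally follows B-pers

independent = softmax_decode(emissions)           # ['O', 'I-pers']
joint = viterbi_decode(emissions, transitions)    # ['B-pers', 'I-pers']
```

In this toy case the independent decoder outputs O followed by I-pers, which is not a valid IOB sequence, whereas the joint decoder returns B-pers followed by I-pers because the transition scores penalize the invalid pair.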