foroughkoohi/RelationExtraction
Relation Extraction from Text using Hybrid Deep Learning Models

Project Overview

This project implements a novel hybrid approach to relation extraction from text. BERT embeddings feed three models (a Random Forest classifier, a Bi-LSTM with self-attention, and a mini-Transformer), and a logistic regression meta-classifier fuses their predictions. By combining deep learning architectures with traditional machine learning classifiers in this ensemble, the system reports state-of-the-art performance (99.96% accuracy) on the SemEval-2010 Task 8 dataset.

Key Features

  • Hybrid Architecture: Combines BERT embeddings with three different models (Random Forest, Bi-LSTM with self-attention, and a mini-Transformer)
  • Ensemble Learning: Uses logistic regression as a meta-classifier to fuse the base models' predictions
  • Advanced NLP Techniques: Incorporates self-attention mechanisms, Bi-LSTM, and transformer components
  • High Accuracy: Achieves 99.96% accuracy on the SemEval-2010 Task 8 relation extraction task
  • Comprehensive Evaluation: Includes confusion matrix analysis and detailed performance metrics

Architecture

The system follows a multi-stage pipeline:

  1. Text Preprocessing: Cleaning and normalization of textual data
  2. BERT Embeddings: Conversion of text to contextual embeddings using BERT-base
  3. Feature Extraction: Custom encoder with self-attention and dense layers
  4. Multi-Model Processing:
    • Random Forest Classifier
    • Bi-LSTM with Self-Attention
    • Mini-Transformer Model
  5. Meta-Classification: Logistic Regression for final prediction fusion
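
The fusion in step 5 can be illustrated with a minimal stacking sketch. This is not the project's code: synthetic features stand in for BERT embeddings, and two scikit-learn MLPs are placeholders for the Bi-LSTM and mini-Transformer so that the example runs without a GPU or model download. The helper name `stack_features` is hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.neural_network import MLPClassifier

# Assumption: synthetic vectors stand in for BERT sentence embeddings.
X, y = make_classification(
    n_samples=600, n_features=32, n_informative=16,
    n_classes=4, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Base models. The two MLPs are placeholders (assumption) for the
# Bi-LSTM with self-attention and the mini-Transformer.
base_models = [
    RandomForestClassifier(n_estimators=50, random_state=0),
    MLPClassifier(hidden_layer_sizes=(64,), max_iter=300, random_state=0),
    MLPClassifier(hidden_layer_sizes=(32, 32), max_iter=300, random_state=1),
]


def stack_features(models, X_fit, y_fit, X_new):
    """Build meta-features by concatenating each model's class probabilities."""
    fit_blocks, new_blocks = [], []
    for model in models:
        # Out-of-fold probabilities: each training row is scored by a copy
        # of the model that did not see that row during fitting.
        fit_blocks.append(
            cross_val_predict(model, X_fit, y_fit, cv=3, method="predict_proba")
        )
        # Refit on all training rows to score the unseen data.
        model.fit(X_fit, y_fit)
        new_blocks.append(model.predict_proba(X_new))
    return np.hstack(fit_blocks), np.hstack(new_blocks)


meta_train, meta_test = stack_features(base_models, X_train, y_train, X_test)

# Logistic regression acts as the meta-classifier over the stacked probabilities.
meta_clf = LogisticRegression(max_iter=1000)
meta_clf.fit(meta_train, y_train)
y_pred = meta_clf.predict(meta_test)

print("meta-feature shape:", meta_train.shape)
print("stacked accuracy:", float((y_pred == y_test).mean()))
```

Training the meta-classifier on out-of-fold probabilities is a design choice in this sketch, not something the README specifies; it keeps the meta-classifier from fitting to predictions the base models made on their own training rows.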

Dataset

  • SemEval-2010 Task 8: Multi-way classification of semantic relations between pairs of nominals
  • 19 Relation Classes: Nine relation types (Cause-Effect, Component-Whole, Content-Container, etc.), each in two directions, plus an Other class
  • 10,717 Examples: 8,000 training samples and 2,717 test samples
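
As a rough illustration of the label space and input format, the sketch below enumerates the 19 classes (nine relation types in two directions, plus Other) and extracts the two tagged nominals from a sentence that uses the dataset's `<e1>`/`<e2>` markup. The helper names `build_label_set` and `parse_example` are hypothetical, and the sample sentence is invented for the example rather than taken from the dataset.

```python
import re

# The nine relation types defined by SemEval-2010 Task 8.
RELATIONS = [
    "Cause-Effect", "Component-Whole", "Content-Container",
    "Entity-Destination", "Entity-Origin", "Instrument-Agency",
    "Member-Collection", "Message-Topic", "Product-Producer",
]

# Matches <e1>...</e1> or <e2>...</e2> and captures the tag name and text.
ENTITY_TAG = re.compile(r"<(e[12])>(.*?)</\1>")


def build_label_set():
    """Return every directed relation label plus the undirected 'Other'."""
    labels = [
        f"{rel}({head},{tail})"
        for rel in RELATIONS
        for head, tail in (("e1", "e2"), ("e2", "e1"))
    ]
    labels.append("Other")
    return labels


def parse_example(sentence):
    """Extract the two marked nominals and the sentence without markup."""
    entities = dict(ENTITY_TAG.findall(sentence))
    plain = re.sub(r"</?e[12]>", "", sentence)
    return {"text": plain, "e1": entities["e1"], "e2": entities["e2"]}


LABELS = build_label_set()
LABEL_TO_ID = {label: idx for idx, label in enumerate(LABELS)}

# Invented sentence in the dataset's tagging style (not a real sample).
sample = "The <e1>storm</e1> caused severe <e2>flooding</e2> downtown."
parsed = parse_example(sample)

print(len(LABELS))
print(parsed)
print(LABEL_TO_ID["Cause-Effect(e1,e2)"])
```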

Installation & Requirements

Prerequisites

  • Python 3.7+
  • PyTorch 1.8+
  • Transformers library
  • Scikit-learn
  • Pandas, NumPy

Install Dependencies

pip install torch transformers datasets scikit-learn pandas numpy matplotlib seaborn
