Experimental open source NLP project for similarities and differences retrieval
Updated May 26, 2020 - Python
Repository for the LREC-COLING 2024 Paper: Persona-Based Corpus in the Diabetes Mellitus Domain – Applying a Human-Centered Approach to a Low-Resource Context
DSFSI South African Terminology Lists and Lexicon Project
Comparing the residual stream and the highway stream in transformers (BERT).
Embedding Evaluation Data for South African Languages
This repository is an initial pipeline for reading, processing, labelling and classifying unstructured annual reports of South African (SA) banks with the aim of identifying financial risk. It leveraged work by the Corporate Financial Information Environment-Final Report Structure Extractor (CFIE–FRSE) of El-Haj et al. which created a corpus of …
CNEP (Contrastive Notes Events Pre-training), Contrastive Learning with Clinical Notes and Events Data Pre-training from MIMIC-III
Clickbait detector for English tweets, trained on Webis-17 dataset
Generalizing Knowledge Acquisition with unsupervised automatic labeling for Special Cargo Domain.
A RoBERTa-based language model specially designed for Setswana, using the new PuoData dataset.
Implementation of a Master's thesis on "Belief State for Visually Grounded, Task-Oriented Neural Dialogue Model"
Code base for the EMNLP 2021 paper, "Multi-granularity Textual Adversarial Attack with Behavior Cloning".
Pythonic wrappers for the CIDEr/CIDEr-D evaluation metrics. Provides CIDEr as well as CIDEr-D (CIDEr Defended), which is more robust to gaming effects. We also add the option to replace the original PTBTokenizer with the spaCy tokenizer (no Java dependency, but slower).
Grapheme-to-phoneme rule-based converter for Polish in Go.
An always-a-work-in-progress combination of documentation and demo notebooks for working with the LatinCy models
The Structured Weighted Violation MIRA
Mapping research capabilities using contextual text embeddings
The data set contains cabinet statements from the South African government. Data was scraped from the government's website: https://www.gov.za/cabinet-statements
Repository for QA-based event detection and extraction from news and social media.