[ACL 2024] An Easy-to-use Knowledge Editing Framework for LLMs.
[ICML 2024] TrustLLM: Trustworthiness in Large Language Models
A curated list of awesome academic research, books, code of ethics, data sets, institutes, newsletters, principles, podcasts, reports, tools, regulations and standards related to Responsible AI, Trustworthy AI, and Human-Centered AI.
🐢 Open-Source Evaluation & Testing for LLMs and ML models
Moonshot - A simple and modular tool to evaluate and red-team any LLM application.
AI-HCI research project studying the key factors that affect trust in an AI system's recommendations.
Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams
Evaluation & testing framework for computer vision models
An open-source Python toolbox for backdoor attacks and defenses.
Code from PLDI '23 paper "Architecture-Preserving Provable Repair of Deep Neural Networks."
Code and data for the PoisonedRAG paper.
Code for paper "FreezeAsGuard: Mitigating Illegal Adaptation of Diffusion Models via Selective Tensor Freezing"
We make Generative AI accessible to Federal agencies and businesses. Easy-to-use ezGPT™ platform eliminates the need for in-house expertise and delivers pre-built solutions for rapid innovation. With security and privacy at its core, we unlock the potential of AI. Our innovative chatbot guides users, ensuring a smooth and successful experience.
Birhanu Eshete is an Associate Professor of Computer Science at the University of Michigan, Dearborn. His main research focus is trustworthy machine learning, with emphasis on security, safety, privacy, interpretability, fairness, and their interplay. He also studies online cybercrime and advanced persistent threats (APTs).
Breaking the Trilemma of Privacy, Utility, Efficiency via Controllable Machine Unlearning
In the dynamic landscape of medical artificial intelligence, this study explores the vulnerabilities of the Pathology Language-Image Pretraining (PLIP) model, a vision-language foundation model, under targeted attacks such as the PGD (projected gradient descent) adversarial attack.
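The PGD attack mentioned above iterates gradient-sign ascent on the loss while projecting the perturbed input back into an L-infinity ball around the clean input. A minimal sketch of the idea on a toy logistic-regression model (not the PLIP model from the study; the weights, step size, and radius here are illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_loss(w, x, y):
    """Binary cross-entropy loss of a logistic model p = sigmoid(w.x)."""
    p = sigmoid(w @ x)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def pgd_attack(w, x0, y, eps=0.1, alpha=0.02, steps=10):
    """PGD: repeatedly step in the sign of the input gradient to
    increase the loss, then project back into the eps-ball around x0."""
    x = x0.copy()
    for _ in range(steps):
        # Analytic input gradient of the BCE loss: (p - y) * w
        grad = (sigmoid(w @ x) - y) * w
        x = x + alpha * np.sign(grad)        # ascend the loss
        x = np.clip(x, x0 - eps, x0 + eps)   # L-infinity projection
    return x
```

For a deep model the analytic gradient is replaced by autograd (e.g. backpropagation through the network), but the step-and-project loop is unchanged.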
Optimization-based deep learning models can provide explainability with output guarantees and certificates of trustworthiness.