Safeguarding AI: Effectiveness of Guardrails in Controlling Malicious Output from Locally Hosted LLMs

This paper evaluates how effectively open-source guardrails, added to LLM-based conversational applications, mitigate the risk of misuse.

By Jared McWherter

August 21, 2024

All papers are copyrighted. No re-posting of papers is permitted.
