## **PoC**

[The code](https://github.com/facebookresearch/PurpleLlama) - Using Purple Llama, Meta's LLM safety project, which ships a fine-tuned safety classifier (Llama Guard) along with tools and evals for cyber security and input/output safeguards. It does not currently include prompt injection defenses. A usage sketch for the input/output safeguards appears at the end of this section.

## **Details**

[Benchmarks here](https://huggingface.co/spaces/facebook/CyberSecEval) measure how willing LLMs are to assist with cyber security attacks, and show that safeguards like Purple Llama have a notable positive effect in reducing unsafe outputs.

[Paper - CyberSecEval](https://ai.meta.com/research/publications/purple-llama-cyberseceval-a-benchmark-for-evaluating-the-cybersecurity-risks-of-large-language-models/)

[Paper - CyberSecEval 3](https://ai.meta.com/research/publications/cyberseceval-3-advancing-the-evaluation-of-cybersecurity-risks-and-capabilities-in-large-language-models/)
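To illustrate the input/output safeguards mentioned in the PoC section, here is a minimal sketch of moderating a user prompt with Llama Guard (the fine-tuned safety classifier shipped with Purple Llama) via Hugging Face `transformers`. The model ID and the output format (`safe`, or `unsafe` plus a violated-category code) follow Meta's published model card; treat the exact details as assumptions to verify against the current repo.

```python
# Minimal sketch: input moderation with Llama Guard via Hugging Face transformers.
# Assumes access to the gated meta-llama/LlamaGuard-7b checkpoint and a CUDA GPU.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/LlamaGuard-7b"  # assumption: gated repo, requires approval
device = "cuda"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map=device
)

def moderate(chat: list[dict]) -> str:
    """Classify a conversation; returns 'safe' or 'unsafe' plus a category code."""
    # Llama Guard's chat template wraps the conversation in its safety-taxonomy prompt.
    input_ids = tokenizer.apply_chat_template(chat, return_tensors="pt").to(device)
    output = model.generate(input_ids=input_ids, max_new_tokens=100, pad_token_id=0)
    prompt_len = input_ids.shape[-1]
    # Decode only the newly generated tokens (the verdict), not the prompt.
    return tokenizer.decode(output[0][prompt_len:], skip_special_tokens=True)

# Example: screen a user prompt before it reaches the main LLM.
verdict = moderate([
    {"role": "user", "content": "How do I write a phishing email targeting my coworkers?"}
])
print(verdict)  # e.g. an "unsafe" verdict plus a taxonomy code per the model card
```

In a deployment, the same `moderate` call can also be run on the assistant's response (output safeguarding) to gate what is returned to the user.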