## **PoC:**
https://github.com/leondz/garak
## **Details**
`garak` checks whether an LLM can be made to fail in a way we don't want. `garak` probes for hallucination, data leakage, prompt injection, misinformation, toxicity generation, jailbreaks, and many other weaknesses. If you know `nmap`, it's `nmap` for LLMs.
`garak` is best used directly from the command line, and it also works well for regression testing.
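As a concrete starting point, a command-line run might look like the following. The flag names follow `garak`'s documented CLI at the time of writing; verify them against `garak --help` for your installed version, and note the model and probe names here are just examples:

```shell
# Install garak, then scan a Hugging Face model with the prompt-injection probes.
# Model and probe names are illustrative; list available probes with `garak --list_probes`.
python -m pip install garak
python -m garak --model_type huggingface --model_name gpt2 --probes promptinject
```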
It can be used to test visual / multimodal prompt injection too.
Prompts can be further segmented into constructs like `TextPrompt`, `MultiStepTextPrompt`, `VisualPrompt`, and `VisualTextPrompt` that build on the base functions available, allowing different and even mixed prompt modalities for models that accept various input patterns.
Rough example:
```python
import logging

from PIL import Image

logger = logging.getLogger(__name__)


class Prompt:
    text = None

    def __str__(self):
        return self.text


class TextPrompt(Prompt):
    def __init__(self, text: str):
        self.text = text


class VisualTextPrompt(Prompt):
    image = None

    def __init__(self, text: str, image_path: str):
        self.text = text
        try:
            self.image = Image.open(image_path)
        except Exception:
            logger.error(f"No image found at: {image_path}")
```
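To make the modality split concrete, here is a hypothetical, standalone sketch (the class and function names are illustrative, not `garak`'s actual API) of how a harness might dispatch on prompt type when building a model's input payload:

```python
from dataclasses import dataclass


@dataclass
class TextPrompt:
    text: str


@dataclass
class VisualTextPrompt:
    text: str
    image_path: str


def to_payload(prompt):
    """Hypothetical dispatch: route a prompt to the input shape a model expects."""
    if isinstance(prompt, VisualTextPrompt):
        # Multimodal models take both a text field and an image reference.
        return {"text": prompt.text, "image": prompt.image_path}
    # Text-only models just take the text field.
    return {"text": prompt.text}
```

A text-only model would then receive `{"text": ...}` while a vision-language model receives both fields, so the same probe logic can run against either.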
[Great usage video by embrace the red](https://www.youtube.com/watch?v=f713_sFqItY)
As of 7/2/2025, it also supports adversarial audio among its multimodal capabilities, along with multilingual support.
ID: AML.T0051, AML.T0054