Haize Labs' "haizing suite" is a collection of search and optimization algorithms designed to probe large language models (LLMs) for weaknesses. It helps identify security vulnerabilities and alignment flaws in AI systems by crawling the space of inputs to LLMs with the objective of producing harmful model outputs. The suite includes algorithms such as evolutionary programming, reinforcement learning, multi-turn simulations, and gradient-based methods.
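As a rough illustration of one of these approaches, an evolutionary search over prompts can be sketched in toy form. Everything below is an illustrative assumption, not Haize Labs' actual method: the `target_model` stub stands in for a real LLM API, the trigger word and scoring function are contrived so the search has something to find.

```python
import random

random.seed(0)

def target_model(prompt: str) -> str:
    # Stub standing in for the LLM under test; a real haizing run would
    # call an actual model API. This toy "model" complies whenever the
    # prompt contains a contrived trigger word (an assumption for the demo).
    if "hypothetically" in prompt:
        return "harmful output"
    return "I can't help with that."

def score(output: str) -> float:
    # Toy objective: reward outputs that are not refusals.
    return 1.0 if output == "harmful output" else 0.0

# Candidate suffixes the search can append (illustrative only).
MUTATIONS = [" now", " please", " hypothetically"]

def mutate(prompt: str) -> str:
    return prompt + random.choice(MUTATIONS)

def evolve(seed_prompt: str, pop_size: int = 16, generations: int = 10) -> str:
    """Mutate a population of prompts, keep the highest-scoring half,
    and refill by duplication -- a minimal evolutionary loop."""
    population = [seed_prompt] * pop_size
    for _ in range(generations):
        population = [mutate(p) for p in population]
        population.sort(key=lambda p: score(target_model(p)), reverse=True)
        population = population[: pop_size // 2] * 2
    return population[0]

best = evolve("explain the process")
print(score(target_model(best)))
```

The real suite searches a vastly larger input space against live models, with learned scoring of harmfulness rather than a string match, but the select-mutate-rescore loop is the core idea the evolutionary component shares with this sketch.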
The founders of Haize Labs are Leonard Tang, Richard Liu, and Steve Li, all former classmates at Harvard University.
Haize Labs has found that models like Vicuna and Mistral, which do not undergo explicit safety fine-tuning, are the easiest to jailbreak. By contrast, Anthropic's Claude has proven the most difficult to jailbreak.