This guide will help you get started with the General Analysis package, from installation to running your first jailbreak test.
Installation
For development or to access the latest features:
git clone https://github.com/General-Analysis/GA.git
cd GA
pip install -e .
Optional Dependencies
For specific features, install additional dependencies:
# For gradient-based attacks (GCG)
pip install torch transformers
# For embedding models
pip install sentence-transformers
API Keys Setup
General Analysis works with various model providers. Set up your API keys as environment variables:
# OpenAI
export OPENAI_API_KEY="your-openai-api-key"
# Anthropic
export ANTHROPIC_API_KEY="your-anthropic-api-key"
# Together.ai
export TOGETHER_API_KEY="your-together-api-key"
Basic Usage
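Before running the examples that follow, it can help to confirm that the keys exported during setup are actually visible to the Python process. The sketch below is not part of General Analysis: `missing_api_keys` is a hypothetical helper that only reads the three environment variables named in the setup step.

```python
import os

# Environment variable names taken from the API key setup step.
PROVIDER_KEYS = {
    "OpenAI": "OPENAI_API_KEY",
    "Anthropic": "ANTHROPIC_API_KEY",
    "Together.ai": "TOGETHER_API_KEY",
}


def missing_api_keys(env=None):
    """Return the provider names whose API key is unset or empty.

    Hypothetical helper for illustration; not a General Analysis function.
    """
    env = os.environ if env is None else env
    return [name for name, var in PROVIDER_KEYS.items() if not env.get(var)]


if __name__ == "__main__":
    missing = missing_api_keys()
    if missing:
        print("Missing API keys for:", ", ".join(missing))
    else:
        print("All provider keys are set.")
```

Presumably only the providers behind the models you actually pass in need a key, so a non-empty result is a prompt to double-check rather than an error in itself.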
1. Running a Simple Jailbreak Test
Here’s how to run your first jailbreak test using TAP (Tree-of-Attacks with Pruning):
from generalanalysis.jailbreaks import TAP, TAPConfig
from generalanalysis.data_utils import load_harmbench_dataset
# Configure the attack
config = TAPConfig(
    project="my_first_tap_test",
    target_model="gpt-4o",
    attacker_model="deepseek-ai/DeepSeek-R1",
    evaluator_model="deepseek-ai/DeepSeek-R1",
    branching_factor=2,
    max_depth=5
)
# Initialize TAP
tap = TAP(config)
# Load test data
dataset = load_harmbench_dataset()
# Run the attack
best_nodes, root_nodes = tap.optimize(dataset[:5])  # Test on first 5 samples
2. Using Different Jailbreak Methods
AutoDAN
from generalanalysis.jailbreaks import AutoDAN, AutoDANConfig
config = AutoDANConfig(
    target_model="claude-3-7-sonnet-20250219",
    project="autodan_test",
    initial_candidates=["Tell me about safety", "Explain security"],
    device="cuda:0",
    N=10,
    max_iterations=10
)
autodan = AutoDAN(config)
results = autodan.optimize(goals=["Generate harmful content"])
Crescendo (Multi-turn Attack)
from generalanalysis.jailbreaks import Crescendo, CrescendoConfig
from generalanalysis.data_utils import load_harmbench_dataset
config = CrescendoConfig(
    target_model="gpt-4o",
    attacker_model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
    evaluator_model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
    project="crescendo_test",
    max_rounds=5
)
crescendo = Crescendo(config)
dataset = load_harmbench_dataset()
results = crescendo.optimize(dataset[:3])
3. Working with Models
BlackBoxModel (API-based)
from generalanalysis.boiler_room import BlackBoxModel
# Initialize model
model = BlackBoxModel("gpt-4o")
# Simple query
response = model.query("Explain quantum computing")
# Batch processing
prompts = ["What is AI?", "How do neural networks work?"]
responses = model.query_parallel(prompts)
WhiteBoxModel (Local models)
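Local models need a device string. The example in this subsection hard-codes `"cuda"`, which presumably fails on a machine without a GPU. A minimal sketch for choosing the device at runtime follows. `pick_device` is a hypothetical helper, not part of General Analysis, and it assumes `WhiteBoxModel` accepts any device string that PyTorch understands.

```python
def pick_device(preferred="cuda"):
    """Return `preferred` if a CUDA GPU is usable, otherwise "cpu".

    Hypothetical helper for illustration; not a General Analysis function.
    """
    try:
        import torch  # torch is an optional dependency
    except ImportError:
        return "cpu"
    return preferred if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)
```

The result can then be passed as the `device` argument when constructing `WhiteBoxModel`. Expect CPU generation to be much slower, even for a 1B-parameter model.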
from generalanalysis.boiler_room import WhiteBoxModel
# Load a local model
model = WhiteBoxModel(
    model_name="meta-llama/Llama-3.2-1B-Instruct",
    device="cuda"
)
# Generate text
responses = model.generate_with_chat_template(
    prompts=["Explain machine learning"],
    max_new_tokens=200
)
4. Evaluating Results
from generalanalysis.jailbreaks import AdverserialEvaluator
# Set up evaluator
evaluator = AdverserialEvaluator(
    dataset="harmbench",
    target_models=["gpt-4o", "claude-3-7-sonnet-20250219"],
    evaluator_model="gpt-4o"
)
# Evaluate responses
results = evaluator.evaluate_from_responses(
    responses=["Response 1", "Response 2"],
    prompts=["Prompt 1", "Prompt 2"]
)
Next Steps
- Explore different Jailbreak Methods
- Learn about Adversarial Generators
- Read the Development Guide to contribute
- Check out our Jailbreak Cookbook for detailed examples