Genetic Algorithm
Generate adversarial prompts using evolutionary algorithms
The GACandidateGenerator implements an evolutionary approach to adversarial prompt generation. It uses genetic algorithms to evolve a population of prompts through selection, crossover, and mutation operations.
Class Definition
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
helper_llm | str | (Required) | Model name to use for mutations |
elitism_rate | float | 0.1 | Fraction of top performers to preserve unchanged (0.1 keeps the top 10%) |
crossover_rate | float | 0.5 | Probability of crossover at each potential crossover point |
mutation_rate | float | 0.5 | Probability of mutation for each prompt |
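The three rate parameters are fractions or probabilities, so each is expected to lie between 0 and 1. As a minimal sketch of the defaults in the table, the hypothetical `GASettings` dataclass below mirrors the four parameters and checks those ranges. The class name and the validation are assumptions made for illustration. They are not part of GACandidateGenerator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GASettings:
    """Hypothetical container mirroring the parameter table (not the library's class)."""

    helper_llm: str               # model name used for mutations (required)
    elitism_rate: float = 0.1     # fraction of top performers kept unchanged
    crossover_rate: float = 0.5   # probability of crossover at each crossover point
    mutation_rate: float = 0.5    # probability of mutating each prompt

    def __post_init__(self) -> None:
        if not self.helper_llm:
            raise ValueError("helper_llm must be a non-empty model name")
        # Assumption: all three rates are meant to be values in [0, 1].
        for name in ("elitism_rate", "crossover_rate", "mutation_rate"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0, 1], got {value}")
```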
Methods
generate_candidates
Generates a new population of prompts using genetic operations.
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
jailbreak_method_instance | JailbreakMethod | (Required) | The jailbreak method being used |
prompts | List[str] | (Required) | Current population of prompts |
fitness_scores | List[float] | (Required) | Fitness scores for each prompt |
N | int | 10 | Target population size to generate |
Returns
A list of generated prompts that form the next generation.
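How elitism combines with the other operators to reach the target size N is not spelled out here. One plausible arrangement, sketched under assumptions, is to copy the best prompts over first and then fill the remaining slots with newly bred children. The function `next_generation`, the `make_child` callable (a stand-in for selection, crossover, and mutation), and the rounding of the elite count are all hypothetical.

```python
from typing import Callable, List


def next_generation(
    prompts: List[str],
    fitness_scores: List[float],
    make_child: Callable[[], str],
    N: int = 10,
    elitism_rate: float = 0.1,
) -> List[str]:
    """Keep the top prompts unchanged, then fill up to N with new children."""
    if len(prompts) != len(fitness_scores):
        raise ValueError("prompts and fitness_scores must have the same length")
    if N <= 0:
        return []

    # Rank prompts from highest to lowest fitness.
    ranked = sorted(zip(fitness_scores, prompts), key=lambda pair: pair[0], reverse=True)

    # Assumption: the elite count is N * elitism_rate rounded down, with at least one elite.
    num_elites = min(len(ranked), max(1, int(N * elitism_rate)))
    population = [prompt for _, prompt in ranked[:num_elites]]

    # make_child stands in for selection + crossover + mutation.
    while len(population) < N:
        population.append(make_child())
    return population
```

With the default values (N = 10, elitism_rate = 0.1) this keeps exactly one elite and breeds nine children.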
Internal Operation
Selection
Selection is probabilistic: prompts with higher fitness scores are more likely to be chosen as parents for crossover.
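The exact weighting scheme is not given here. As an illustration only, the sketch below turns fitness scores into softmax weights and samples two parents with replacement. The function name `select_parents` and the choice of softmax are assumptions. Plain fitness-proportionate (roulette-wheel) weights would be an equally reasonable reading.

```python
import math
import random
from typing import List, Tuple


def select_parents(
    prompts: List[str],
    fitness_scores: List[float],
    rng: random.Random,
) -> Tuple[str, str]:
    """Sample two parents, favoring prompts with higher fitness."""
    if not prompts or len(prompts) != len(fitness_scores):
        raise ValueError("prompts and fitness_scores must be non-empty and equal in length")

    # Softmax weights (an assumption); subtracting the max keeps exp() numerically stable.
    top = max(fitness_scores)
    weights = [math.exp(score - top) for score in fitness_scores]

    # Sampling is with replacement, so the same prompt can be picked twice.
    first, second = rng.choices(prompts, weights=weights, k=2)
    return first, second
```

Passing a seeded `random.Random` instance keeps runs reproducible, which helps when comparing generations.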
Crossover
The crossover operation combines parts of two parent prompts to create new variations.
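Since crossover_rate is described as a probability applied at each potential crossover point, one way to picture the operation is a sentence-level swap. The sketch below assumes that crossover points are sentence boundaries, which the text does not confirm. The helper names `split_sentences` and `crossover` are hypothetical.

```python
import random
import re
from typing import List, Tuple


def split_sentences(text: str) -> List[str]:
    """Split on whitespace that follows '.', '!' or '?' (a simplifying assumption)."""
    return [part for part in re.split(r"(?<=[.!?])\s+", text.strip()) if part]


def crossover(
    parent_a: str,
    parent_b: str,
    crossover_rate: float,
    rng: random.Random,
) -> Tuple[str, str]:
    """Swap aligned sentences between two parents, each with probability crossover_rate."""
    child_a = split_sentences(parent_a)
    child_b = split_sentences(parent_b)

    # Only positions present in both parents can be exchanged.
    for i in range(min(len(child_a), len(child_b))):
        if rng.random() < crossover_rate:
            child_a[i], child_b[i] = child_b[i], child_a[i]

    return " ".join(child_a), " ".join(child_b)
```

A rate of 0 returns the parents unchanged, and a rate of 1 swaps every aligned sentence. Sentences beyond the shorter parent's length stay with their original parent.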
Mutation
The mutation operation uses an LLM to generate variations of prompts.
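How the helper model is prompted is not shown here, so the sketch below covers only the control flow: each prompt is rewritten with probability mutation_rate and otherwise passed through untouched. The `rewrite` callable is a hypothetical stand-in for a call to the model named by helper_llm, and `mutate_population` is likewise an illustrative name.

```python
import random
from typing import Callable, List


def mutate_population(
    prompts: List[str],
    mutation_rate: float,
    rewrite: Callable[[str], str],
    rng: random.Random,
) -> List[str]:
    """Rewrite each prompt with probability mutation_rate; keep the rest as-is."""
    mutated = []
    for prompt in prompts:
        if rng.random() < mutation_rate:
            # rewrite() stands in for asking the helper LLM for a variation.
            mutated.append(rewrite(prompt))
        else:
            mutated.append(prompt)
    return mutated
```

Injecting the rewriting step as a callable keeps the sketch testable without a model: any string-to-string function, such as `str.upper`, can be substituted.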
Example Usage
Integration with Jailbreak Methods
The genetic algorithm generator is used in several jailbreak methods, particularly in AutoDAN.