The Boiler Room module provides a unified interface for interacting with various language models, abstracting away the complexities of different model APIs and allowing for consistent querying, batching, and error handling. Key features include parallel querying, embedding generation, and system prompt management.

Overview

Boiler Room offers two primary model classes:

  • BlackBoxModel: For interacting with API-based models like OpenAI GPT models, Anthropic Claude models, and Together.ai hosted models.
  • WhiteBoxModel: For loading and interacting with locally-hosted models using Hugging Face Transformers.

Both classes provide standardized methods for generating text, handling errors, and managing retries.
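
Both classes are importable from the same package and are constructed by model name; a minimal orientation sketch (constructor arguments mirror the detailed examples later on this page):

from generalanalysis.boiler_room import BlackBoxModel, WhiteBoxModel
import torch

# API-backed model, queried remotely
api_model = BlackBoxModel("gpt-4o")

# Locally-hosted Hugging Face model
local_model = WhiteBoxModel(
    model_name="meta-llama/Llama-3.2-1B-Instruct",
    device="cuda",
    dtype=torch.bfloat16
)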

Supported Models

The Boiler Room module supports a wide range of models, including:

OpenAI Models

  • GPT-4o, GPT-4o-mini, GPT-4-turbo
  • GPT-4.5-preview-2025-02-27
  • GPT-4-0125-preview, GPT-4-0613
  • GPT-3.5-turbo
  • o1, o1-mini, o3-mini, o3-mini-2025-01-31
  • Text embedding models (text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002)

Anthropic Models

  • claude-3-7-sonnet-20250219
  • claude-3-5-sonnet-20241022, claude-3-5-sonnet-20240620
  • claude-3-5-haiku-20241022
  • claude-3-sonnet-20240229

Together.ai Hosted Models

  • meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo
  • meta-llama/Llama-3.3-70B-Instruct-Turbo
  • meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo
  • mistralai/Mistral-Small-24B-Instruct-2501
  • mistralai/Mixtral-8x22B-Instruct-v0.1
  • deepseek-ai/DeepSeek-R1, deepseek-ai/DeepSeek-R1-Distill-Llama-70B
  • databricks/dbrx-instruct
  • Qwen/Qwen2.5-7B-Instruct-Turbo
  • google/gemma-2-27b-it
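
Any of these API-served models (OpenAI, Anthropic, or Together.ai) is selected by passing its identifier to BlackBoxModel; for example (a sketch following the same pattern shown in full under Basic Usage below):

from generalanalysis.boiler_room import BlackBoxModel

llama = BlackBoxModel("meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo")
response = llama.query("Summarize the transformer architecture in two sentences")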

Basic Usage

from generalanalysis.boiler_room import BlackBoxModel

# Initialize a model
model = BlackBoxModel("gpt-4o")

# Simple query
response = model.query("Explain quantum computing in simple terms")

# Query with system prompt
response = model.query(
    prompt="Write a tutorial on quantum computing",
    system_prompt="You are a quantum physics expert who explains concepts simply."
)

# Generate embeddings
embeddings = model.embed("This is a text to encode into vector space")
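
The embed call can also target one of the dedicated embedding models listed above; a minimal sketch, assuming the same constructor and embed signature as in the example above:

embedding_model = BlackBoxModel("text-embedding-3-small")
embeddings = embedding_model.embed("This is a text to encode into vector space")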

Parallel Querying

For batch processing or higher throughput, you can send multiple prompts to a model in parallel:

from generalanalysis.boiler_room import BlackBoxModel

model = BlackBoxModel("claude-3-7-sonnet-20250219")

prompts = [
    "What is machine learning?",
    "How do neural networks work?",
    "Explain deep learning in simple terms"
]

# Process all prompts in parallel
responses = model.query_parallel(
    prompts=prompts,
    max_threads=10,
    temperature=0.5
)

for prompt, response in zip(prompts, responses):
    print(f"Prompt: {prompt}\nResponse: {response}\n")

Working with Locally-Hosted Models

For gradient-based techniques or direct model access, use the WhiteBoxModel:

from generalanalysis.boiler_room import WhiteBoxModel
import torch

model = WhiteBoxModel(
    model_name="meta-llama/Llama-3.2-1B-Instruct",
    device="cuda",
    dtype=torch.bfloat16
)

# Generate with the model's chat template
responses = model.generate_with_chat_template(
    prompts=["Explain neural networks"],
    max_new_tokens=200,
    temperature=0.7
)

print(responses[0])
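
Because prompts is a list, several prompts can be generated in a single call; a short sketch, assuming the return value is a list of decoded strings in the same order (as the single-prompt example above implies):

responses = model.generate_with_chat_template(
    prompts=["Explain neural networks", "What is backpropagation?"],
    max_new_tokens=200,
    temperature=0.7
)

for r in responses:
    print(r)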

Error Handling

The module manages retries for transient API failures internally; calls that still fail can be wrapped in standard exception handling:

try:
    response = model.query("Tell me about quantum computing")
except Exception as e:
    print(f"Error querying model: {e}")

Next Steps