Job Description:
• Develop and run adversarial test suites—both manual and scripted—for LLMs and image/video models.
• Craft multilingual prompts, jailbreaks, and escalation chains targeting policy edge cases.
• Analyze outputs, triage failures, and write concise vulnerability reports.
• Contribute to internal tooling (e.g., prompt libraries, scenario generators, dashboards).
Requirements:
• Bachelor’s degree—or equivalent experience—in CS, data science, linguistics, international studies, or security.
• Basic proficiency with Python and command-line tools.
• Demonstrated interest in AI safety, adversarial ML, or abuse detection.
• Strong writing skills for short vulnerability reports and long-form analyses.
• Ability to rapidly context-switch across domains, modalities, and abuse areas.
• Comfort working in a fast-paced, ambiguous space.
Benefits:
• Salary range: $60K–$70K depending on experience.
• Opportunity for spot bonuses and annual performance-based bonus.
• Fully remote (U.S.-based) with flexible hours.
• Comprehensive health, dental, and vision.
• Generous PTO and paid holidays.
• 401(k) plan.
• Professional-development stipend for courses, conferences, or language study.
• We reward excellence with growth—team members who excel have clear paths for promotion and skill development.
Apply Now