Expression of Interest - Red Team
About the AI Security Institute

The AI Security Institute is the world's largest and best-funded team dedicated to understanding advanced AI risks and translating that knowledge into action. We're in the heart of the UK government with direct lines to No. 10 (the Prime Minister's office), and we work with frontier developers and governments globally.

We're here because governments are critical for advanced AI going well, and UK AISI is uniquely positioned to mobilise them. With our resources, unique agility and international influence, this is the best place to shape both AI development and government action.

Expression of Interest - Red Team

Our Red Team is open to expressions of interest from talented research engineers and scientists who want to work at the forefront of frontier AI safety and security but do not see a role listed that matches their skills and experience. Our talent team will reach out should a role arise that may be a good match.

Team Description

Our Red Team conducts cutting-edge research to identify, evaluate and stress-test vulnerabilities in frontier AI systems, sharing our findings directly with leading AI companies, UK officials and allied governments to inform deployment, research and policy decisions.

The Red Team is composed of three specialised sub-teams, each tackling a distinct set of challenges. While each sub-team has its own focus, we work closely together, sharing methodologies, tooling, research insights and findings across the wider Red Team. Many of our most impactful projects draw on expertise from across all three sub-teams, and we actively encourage collaboration in tackling the most pressing challenges in frontier AI safety and security.

Alignment

The Alignment sub-team focuses on detecting, evaluating and understanding misalignment in frontier AI systems. Our work centres on loss-of-control risks, including deceptive alignment, research sabotage and self-exfiltration attempts.

We carry out novel research to develop techniques for finding misalignment, investigate how to attribute misaligned behaviour to more fundamental concerns such as instrumental convergence, and conduct pre- and post-deployment evaluations. We share our findings with frontier AI companies and with the UK and allied governments to inform deployment, research and policy making, and we work directly with safety teams at frontier labs to help improve their alignment training and monitoring methodologies.

Misuse

The Misuse sub-team focuses on stress-testing frontier AI safeguards for dangerous capabilities, researching novel attack vectors and developing advanced automated attack tooling. Our work probes the robustness of safeguards against real-world threats, helping to identify where frontier systems may be vulnerable to exploitation and where defences need to be strengthened.

We share our findings with frontier AI companies, including Anthropic, OpenAI and DeepMind, as well as with key UK officials and other governments, to inform their deployment, research and policy decisions and to support the development of more resilient safeguards across the frontier AI ecosystem.

Control

The Control sub-team partners with leading frontier AI companies to stress-test control measures designed to prevent AI systems from causing harm. Drawing on techniques from adversarial machine learning, we develop algorithms to uncover a wide range of failures in control measures and use these findings to assess and strengthen them.

These partnerships allow us to directly influence vital control measures, while our position within government enables us to bring our understanding of the current state of control measures to wider government as critical deployment, research and policy decisions are made.

Salary

£65,000 - £145,000 GBP