Andrei Baroian

From pre-training to post-training to attacking LLMs.

I'm an MSc Computer Science (AI) student at Leiden University, currently doing my thesis at SPY Lab (ETH Zurich) on prompt injection attacks, supervised by Jie Zhang and Florian Tramèr. I'm also part of the Robotics Safety Division at the ETH Robotics Club, working on adversarial attacks against VLA models in humanoid robots.

My past research includes LLM pre-training (exploring architectural variants) and LLM post-training (speeding up GRPO training with Prompt Replay), with smaller projects in LLM quantization and mechanistic interpretability. I also work as a data engineer at Akida and was a Teaching Assistant at Leiden University.

Excited about automated research; scared of the cybersecurity capabilities of agents.

Research & Projects

MSc Thesis: (Image) Prompt Injection (In Progress)
SPY Lab, ETH Zurich — Supervised by Jie Zhang and Florian Tramèr
Feb 2026 – Present
Studying transferability of image prompt injections: adversarial images optimized on open-source VLMs that transfer to closed-source models (GPT, Gemini, Claude). Concurrently exploring prompt injection attacks on OpenClaw agents, designing self-propagating worm attacks that modify agents' internal goal files and spread to other agents.
Adversarial Attacks on VLA Models in Humanoid Robots (In Progress)
ETH Robotics Club — Robotics Safety Division
Jan 2026 – Present
Creating adversarial attacks against Vision-Language-Action (VLA) models in humanoid robots. Investigating how visual perturbations can override task instructions and induce harmful behaviors. Concurrently working on defenses.
Prompt Replay: Speeding Up GRPO with On-Policy Reuse of High-Signal Prompts
arXiv:2603.21177
Sep 2025 – Mar 2026
An overhead-free online data selection method for GRPO that reuses prompts (not trajectories) to preserve on-policy optimization. Buffers medium-difficulty prompts near a 50% pass rate to maximize learning signal, combining reused prompts with fresh samples using cooldown periods and reuse limits. Tested on Llama-3.2-3B and Qwen3-8B, reducing zero-variance prompts and accelerating early training gains.
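The reuse logic described above can be sketched in a few lines. This is an illustrative assumption of how such a buffer might look, not the paper's actual implementation; the class name, the pass-rate band, and the `cooldown`/`max_reuse` parameters are all hypothetical.

```python
# Hypothetical sketch of a prompt-replay buffer: keep medium-difficulty
# prompts (pass rate near 50%), reuse them only after a cooldown and at
# most a fixed number of times. All names and defaults are illustrative.

class PromptReplayBuffer:
    def __init__(self, low=0.25, high=0.75, cooldown=2, max_reuse=3):
        self.low, self.high = low, high   # medium-difficulty band around 50%
        self.cooldown = cooldown          # steps to wait before reusing a prompt
        self.max_reuse = max_reuse        # cap on reuses per prompt
        self.buffer = {}                  # prompt -> (last_used_step, reuse_count)

    def observe(self, prompt, pass_rate, step):
        # Prompts near 50% pass rate give non-zero advantage variance within
        # a GRPO group, hence high learning signal; buffer only those.
        if self.low <= pass_rate <= self.high:
            self.buffer.setdefault(prompt, (step, 0))
        else:
            # Zero-variance prompts (all-pass or all-fail) carry no signal.
            self.buffer.pop(prompt, None)

    def sample(self, step):
        # Return prompts eligible for reuse at this step; they are mixed
        # with fresh prompts by the caller.
        eligible = []
        for p, (last, count) in list(self.buffer.items()):
            if step - last >= self.cooldown and count < self.max_reuse:
                eligible.append(p)
                self.buffer[p] = (step, count + 1)
        return eligible

buf = PromptReplayBuffer()
buf.observe("prompt_a", pass_rate=0.5, step=0)   # buffered
buf.observe("prompt_b", pass_rate=1.0, step=0)   # zero-variance, dropped
```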
Crown, Frame, Reverse: Layer-Wise Scaling Variants for LLM Pre-Training
arXiv:2509.06518
Apr – Jul 2025
Explored architectural variants that redistribute capacity across transformer layers during pre-training. Introduced three layer-wise scaling patterns using linear interpolation of FFN widths and attention head counts. Pre-trained 180M parameter models on 5B tokens; all variants converged to better performance than an equal-cost isotropic baseline without loss of training throughput.
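The interpolation idea above is simple to illustrate. The sketch below is a hedged assumption of what linearly interpolated FFN widths across layers could look like (here, a decreasing pattern); the function name, endpoints, and layer count are hypothetical, not taken from the paper.

```python
# Illustrative sketch: layer-wise scaling by linear interpolation of FFN
# hidden widths from the first to the last transformer layer. Endpoints
# and layer count are made-up example values.

def interpolated_ffn_widths(n_layers, start_width, end_width):
    """Linearly interpolate the FFN width assigned to each layer."""
    if n_layers == 1:
        return [start_width]
    step = (end_width - start_width) / (n_layers - 1)
    return [round(start_width + i * step) for i in range(n_layers)]

# Example: wider FFNs in early layers, narrower in late layers.
widths = interpolated_ffn_widths(n_layers=12, start_width=4096, end_width=1024)
```

The same interpolation can be applied to attention head counts, with the total parameter budget matched to an isotropic (constant-width) baseline for a fair comparison.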

Experience

Robotics Safety Researcher, ETH Robotics Club
ETH Zurich, Zurich

Member of the Robotics Safety Division of the ETH Robotics Club (ETHRC), exploring adversarial attacks on, and defenses for, VLA models in humanoid robots.

Teaching Assistant, Automated Machine Learning
Leiden University
  • Grade assignments and provide feedback.
  • Guide students in selecting, understanding, and presenting research papers.
Data Engineer
Akida, The Hague
  • Develop filtering logic to detect construction projects in public-sector sources using heuristics & GenAI.
  • Build the extraction pipeline for summarization and structured information retrieval with LLMs.
  • Collaborate on annotation workflows and quality evaluation of LLMs.
  • Test, deploy, and monitor Azure applications.

Education

MSc Computer Science: Artificial Intelligence
Leiden University, The Netherlands — GPA: 8.5/10

Notable grades: Seminar in Deep Reinforcement Learning (10), Deep Learning (9.0), Seminar in Deep Learning (9.0), Natural Language Processing (9.0).

Exchange Semester — MSc Thesis at SPY Lab
ETH Zurich, Switzerland

Adversarial attacks on vision-language models. Supervised by Jie Zhang and Florian Tramèr.

BSc Entrepreneurship & Business Innovation
Tilburg University, The Netherlands