Defending Against Prompt Injection with StruQ and SecAlign
Overview of StruQ and SecAlign defenses to mitigate prompt injection in LLM-powered apps, with Secure Front-End concepts and evaluation results.
Items tagged with “Berkeley”.
Recent advances in Large Language Models (LLMs) enable exciting LLM-integrated applications. However, as LLMs have improved, so have the attacks against them. Prompt injection is listed by OWASP as the #1 threat to LLM-integrated applications, where an LLM input contains a trusted prompt (ins
PLAID enables simultaneous generation of protein sequences and 3D structures by sampling the latent space of folding models, leveraging large sequence databases and diffusion on embeddings.
PLAID jointly generates protein 1D sequences and 3D structures by learning the latent space of protein folding models. It enables function- and organism-guided prompts and decodes structure with frozen folding-model weights.
Berkeley researchers deployed 100 RL-controlled vehicles on a live highway to dampen stop-and-go waves, improving traffic flow and cutting energy use for all drivers.
100 RL-controlled cars deployed on I-24 during rush hour to dampen stop-and-go waves, improve throughput, and reduce fuel use for all road users. Decentralized controllers rely on basic radar sensors and local observations.
Anthology conditions language models on richly detailed backstories to simulate representative, consistent, and diverse virtual personas for surveys and social science research.
A method to steer LLMs toward representative, consistent virtual personas by generating naturalistic backstories and using them as conditioning context, enabling individualized simulations and scalable user studies.
A Berkeley AI study finds ChatGPT favors Standard American English, shows poorer comprehension and more stereotyping for non‑standard English varieties, and can amplify dialect discrimination in GPT‑3.5 and GPT‑4.
Analysis of how ChatGPT responds to different English dialects, highlighting biases against non-standard varieties and implications for global users.
StrongREJECT advances jailbreak evaluation by pairing a high-quality forbidden-prompt dataset with automated evaluators aligned to human judgments, delivering more reliable measurements of jailbreak effectiveness against frontier LLMs.
Overview of a high-quality jailbreak benchmark with dual automated evaluators, a 313-prompt dataset, and findings that many jailbreaks underperform claims from earlier work.
A new MIQA benchmark tests Large Multimodal Models on visual retrieval and reasoning across 1–10K images, revealing key limitations and introducing MIRAGE, a single-stage approach to scale LMMs.
Benchmark for long-context visual reasoning across large, uncorrelated image sets; introduces MIRAGE to extend LMMs beyond single-image VQA.
TinyAgent shows small language models can be fine-tuned for reliable function calling and edge deployment, using curated synthetic data, an LLMCompiler planner, and a Tool RAG approach to power private, low-latency agentic workflows.
A study demonstrating how small language models can perform accurate function calling at the edge using a curated data pipeline, an LLMCompiler-based planner, and on-device execution with macOS integrations.
xT enables end-to-end modeling of gigapixel-scale images on modern GPUs using nested tokenization, region encoders, and long-context vision, delivering high fidelity and context on images up to 29,000×25,000 pixels.
End-to-end modeling of extremely large images on contemporary GPUs via nested tokenization and region/context encoders, delivering richer context with lower memory footprints.
Overview of BAIR Lab's 2024 AI PhD graduates, their research areas, advisors, and contact links, with profiles, research blurbs, and URLs for recruiting and collaboration.
Directory of BAIR Lab PhD graduates featuring research interests, advisor(s), and contact details to facilitate collaboration and recruitment.