FlowersML_conf
Conference Programme

Two days
Three tracks

4 Keynotes · 24 Talks · 6 Lightning Sessions · 2 Panels · 200 Attendees
Day 01

Saturday, 11 July 2026

8:30 am·Registration & Morning Coffee·30 min
9:00 am·Keynote·Plenary
20 min

Opening Remarks

Dr. Leah Ramirez
9:20 am·Keynote·Plenary
45 min

The Bitter Lesson Revisited: What Scaling Actually Taught Us

Prof. Bernard Okafor
10:05 am·Short Break·10 min
Talk·Research

Toward Mechanistic Interpretability at Scale: Sparse Autoencoders Beyond Toy Models

Dr. Jason Lee
Talk·Applied

Shipping LLM Products Without Breaking Everything: Lessons from 18 Months in Production

Samantha Patel
Talk·Ethics

Auditing Foundation Models: Methodologies, Gaps, and Who Should Pay for It

Malik Mensah
Talk·Research

Data Mixture Laws: How Corpus Composition Predicts Downstream Capability

Dr. Silvia Rossi
Talk·Applied

Evaluation Pipelines That Don't Lie to You: Building Ground Truth at Scale

Jonathan Kim
Talk·Ethics

Environmental Accounting in ML: Why Current Carbon Estimates Are Almost Certainly Wrong

Dr. Emma Park
Lightning Talks·Research

Lightning Talks - Research Track

  • RoPE Embeddings Don't Generalise the Way You Think
    Michael Chen
  • Emergent Structure in Latent Space: A Geometric Perspective
    Priyanka Das
  • Forgetting as a Feature: Controlled Unlearning in Fine-Tuned Models
    Dr. Tomas Alvarez
Lightning Talks·Applied

Lightning Talks - Applied Track

  • Context Window Management for Long-Running Agents
    Grace Liu
  • When RAG Goes Wrong: A Postmortem Anthology
    Daniel Gomez
  • Prompt Versioning in the Real World
    Maya Kaur
Lightning Talks·Ethics

Lightning Talks - Ethics Track

  • Consent at Scale: Can Users Actually Opt Out?
    Dr. Lucia Bianchi
  • Regulatory Divergence and What It Means for Model Deployment
    Kevin O'Neil
  • Red-Teaming as an Ethics Practice, Not Just a Safety One
    Sofia Nordin
12:00 pm·Lunch·75 min
1:15 pm·Keynote·Plenary
45 min

Agents in the Wild: What Six Months of Real Deployments Actually Looked Like

Angela Rivera
Talk·Research

Superposition and Polysemanticity: New Evidence from Activation Patching at Depth

Dr. Marcus Evans
Talk·Applied

The Latency Tax: Real Costs of Chain-of-Thought in Customer-Facing Products

Lauren Schneider
Talk·Ethics

Differential Privacy for Fine-Tuning: What Practitioners Actually Need to Know

Dr. Aisha Rahman
Talk·Research

Long-Context Faithfulness: Measuring How Well Models Actually Use What You Give Them

Noah Bekele
Talk·Applied

Structured Generation Without the Footguns: A Practitioner's Honest Assessment

Viktor Petrov
Talk·Ethics

Disparate Impact Across Language: Benchmarking Multilingual Models for Fairness

Dr. Heidi Sørensen
3:10 pm·Afternoon Break·25 min
3:35 pm·Panel·Plenary
60 min

Is the Benchmark Era Over? How We Know Whether Models Are Actually Getting Better

Dr. Natalie Brooks, Prof. David Ibeh, Dr. Lila Ahmed, Marcus Valdez, Caroline Dubois
4:35 pm·Keynote·Plenary
25 min

Day One Close & Community Announcements

Dr. Brian Choi
5:00 pm·Evening Reception - Foyer & Rooftop Terrace·90 min
Day 02

Sunday, 12 July 2026

8:45 am·Morning Coffee·15 min
9:00 am·Keynote·Plenary
45 min

Reasoning Without Shortcuts: Building Models That Fail Gracefully

Dr. Hannah McAllister
9:45 am·Short Break·10 min
Talk·Research

Reward Hacking at Deployment: Characterising Sycophancy in RLHF-Trained Systems

Dr. Amir Khan
Talk·Applied

Multi-Modal Pipelines in Production: The Failure Modes Nobody Blogs About

Ken Tanaka
Talk·Ethics

Model Transparency Reports: What They Should Contain and Why They Don't

Astrid Larsen
Talk·Research

Chain-of-Thought Is Not Reasoning: A Philosophical and Empirical Challenge

Prof. Julia Kostova
Talk·Applied

Observability for LLM Applications: Tracing What Actually Matters

Sebastian Ortega
Talk·Ethics

Labor, Annotation, and the Workers Behind Every "Clean" Dataset

Dr. Zainab Musa
Talk·Research

Speculative Decoding at the Edge: Accuracy, Latency, and the Real Tradeoffs

Tanveer Shah
Talk·Applied

Fine-Tuning vs. Prompting vs. RAG: A Framework for Actually Deciding

Bridget Weiss
Talk·Ethics

Concentration Risk in AI Infrastructure: What Happens When Three Providers Go Down

Dr. Felix Mwangi
11:40 am·Lunch·80 min
1:00 pm·Keynote·Plenary
45 min

The Quiet Infrastructure: Who Owns the Compute That Shapes What Gets Built

Prof. Chong Park
Talk·Research

Measuring Calibration in Instruction-Tuned Models Under Distribution Shift

Dr. Pablo Serrano
Talk·Applied

Cost-Optimal Inference: A Practical Guide to Routing, Batching, and Caching

Mei Chan
Talk·Ethics

Meaningful Human Oversight: What It Requires in Practice

Dr. Rachel Mensah
Talk·Research

Towards Formal Verification of Safety Properties in Neural Networks

Dr. Artem Volkov
Talk·Applied

The Prompt Injection Problem Is Not Solved: A Survey of What Works and What Doesn't

Yosef Azulay
Talk·Ethics

Child Safety and Foundation Models: Gaps in Current Practice

Dr. Claire Reid
2:55 pm·Afternoon Break·20 min
3:15 pm·Panel·Plenary
60 min

Where Is This All Going? A Disagreement About the Next Five Years

Prof. Olivia Grant, Dr. Eric Wu, Prof. Maria Delgado, Dr. Deepak Mehta, Grace Zhu, Thomas Winkler
4:15 pm·Keynote·Plenary
35 min

Closing Keynote: Staying Serious in a Hype Cycle

Dr. Elise Nguyen
4:50 pm·Conference Close·10 min