ICML 2026 Workshop

From Frames to Stories (F2S)

Toward Reliable, Controllable and Trustworthy Long-Horizon Video Generation
Location: Seoul, South Korea
Date: Friday, July 10, 2026

Overview

Video generation has advanced rapidly for short clips, yet generating long, multi-shot videos that remain coherent, controllable, and reliable is still an open challenge. Across minutes of generation, current systems often suffer from identity drift, scene inconsistency, narrative breakdown, and weak responsiveness to user intent. These challenges make long-horizon video generation a compelling testbed for long-context multimodal modeling, structured generation, interactive systems, and evaluation.

F2S brings together researchers working on the core scientific and practical questions behind this transition from frames to stories. We are broadly interested in methods that maintain consistency over time, support richer forms of control and revision, and enable rigorous evaluation of long-form generation. The workshop welcomes work spanning model design, memory and state tracking, planning, editing, multimodal interaction, datasets, benchmarks, and real-world systems for long-horizon video creation.

Our goal is to foster a shared research agenda around reliable, controllable, and trustworthy long-horizon video generation, while creating space for perspectives from generative modeling, multimodal learning, interactive machine learning, and creative applications.

Key Questions

  • Q1 — Persistent State: What minimal, compact state representation (entities, relations, events) must be carried and updated across minutes of generation, and how can models maintain it under tight compute and memory budgets?
  • Q2 — Interactive Control: How can we enable multimodal, creator-facing interaction with rich, compositional control signals (shot plans, localized edits, multimodal constraints, actions) over minutes-long generation?
  • Q3 — Evaluation: What kinds of benchmarks and protocols can separate long-horizon state/narrative consistency from short-term visual quality, and measure drift and constraint/control satisfaction robustly and reproducibly?

Call for Papers

We invite submissions on all aspects of long-horizon video generation, with a focus on reliability, controllability, and evaluation. Topics include but are not limited to:

Models, Memory & Long-Context Generation

Control, Editing & Interactive Systems

Benchmarks, Data & Trustworthy Evaluation

Submission URL: OpenReview

Format: All submissions must be in PDF format and anonymized. Submissions are limited to four content pages, including all figures and tables; unlimited additional pages containing references and supplementary materials are allowed. Reviewers may choose to read the supplementary materials but will not be required to. Camera-ready versions may go up to five content pages.

Style file: You must format your submission using the ICML 2026 LaTeX style file. Please include the references and supplementary materials in the same PDF as the main paper. The maximum file size for submissions is 50MB. Submissions that violate the ICML style (e.g., by decreasing margins or font sizes) or page limits may be rejected without further review.

Dual-submission policy: We welcome ongoing and unpublished work. We will also accept papers that are under review at the time of submission, or that have been recently accepted without published proceedings.

Non-archival: The workshop is a non-archival venue and will not have official proceedings. Workshop submissions can be subsequently or concurrently submitted to other venues.

Visibility: Submissions and reviews will not be public. Only accepted papers will be made public.

Important Dates

Submission deadline: April 30, 2026 (AoE)
Notification to authors: May 15, 2026 (AoE)
Camera-ready deadline: June 1, 2026
Workshop date: July 10, 2026 (Friday; Seoul, South Korea)

Schedule

The schedule below is tentative. All times are in Korea Standard Time (KST, GMT+9).

Time Session
08:00 – 08:10 Opening Remarks
08:10 – 08:55 Keynote / Invited Talk 1
08:55 – 09:25 Invited Talk 2
09:25 – 09:45 Coffee Break
09:45 – 10:15 Invited Talk 3
10:15 – 10:45 Invited Talk 4
10:45 – 11:25 Oral Presentations (Selected Papers)
11:25 – 12:10 Poster Session 1
12:10 – 13:40 Lunch Break
13:40 – 14:10 Invited Talk 5
14:10 – 14:50 Panel Discussion + Audience Q&A
14:50 – 15:50 Poster Session 2
15:50 – 16:05 Coffee Break
16:05 – 16:35 Invited Talk 6
16:35 – 16:55 Breakout Session + Report-back
16:55 – 17:00 Closing Remarks

Invited Speakers

Vincent Sitzmann
Massachusetts Institute of Technology

Xihui Liu
University of Hong Kong

Daquan Zhou
Peking University

Pinar Yanardag
Virginia Tech

Bohyung Han
Seoul National University

Accepted Papers

To be announced after the notification date.

Organizers

Yu Lu
Zhejiang University

Junhao Dong
Nanyang Technological University

Enis Simsar
ETH Zurich & Google

Hila Chefer
Black Forest Labs & Tel Aviv University

Ismini Lourentzou
University of Illinois Urbana-Champaign

Piotr Koniusz
The University of New South Wales

Yi Yang
Zhejiang University

Program Committee

To be announced.

Contact

Contact (Google Group): f2s-workshop@googlegroups.com