mllv-uds

Seminar: Machine Learning for Language and Vision

News

May 9, 2023. Our grading standard is online.

Apr 26, 2023. Our tentative schedule is online. Please contact us ASAP if you have any questions.

Apr 17, 2023. Our kickoff meeting will take place on April 18, 2023, from 12:15 to 13:45, in C7 3 - Seminarraum 1.12. Because demand is very high, we have added more papers to our list. Please attend the first two meetings to secure your spot. If you cannot attend but would still like to participate, please send us an email and we will arrange remote attendance for you.

Mar 27, 2023. If you are interested, please send Xudong Hong an email to register for this seminar.

Schedule

Date	Topic	Paper Title	Presenter
May 9, 2023	Contrastive Pre-training - Image	Learning transferable visual models from natural language supervision	Raj Mohan Tumarada
May 9, 2023	Contrastive Pre-training - Image	BLIP: Bootstrapping language-image pre-training for unified vision-language understanding and generation	Larisa Ivanova
May 16, 2023	Seq2seq Pre-training - Image	Simple Visual Language Model Pretraining with Weak Supervision	Julian Schlenker
May 16, 2023	Seq2seq Pre-training - Image	FLAVA: A foundational language and vision alignment model	Mehrad Zamani
May 23, 2023	Pre-training - Video	MERLOT: Multimodal neural script knowledge models	Yage Zhang
May 23, 2023	Pre-training - Video	MERLOT Reserve: Neural script knowledge through vision and language and sound	-
May 30, 2023	Multitask Learning	Unifying Vision-and-Language Tasks via Text Generation	Nitish Juttu
May 30, 2023	Multitask Learning	UniT: Multimodal multitask learning with a unified transformer	Zixuan Liu
June 6, 2023	Multitask Learning	OFA: Unifying architectures, tasks, and modalities through a simple sequence-to-sequence learning framework	Jakob Gürtler
June 6, 2023	Parameter Efficiency - Prompting	An empirical study of GPT-3 for few-shot knowledge-based VQA	Raphael Maximilian Stephan Maser
June 13, 2023	Parameter Efficiency - Prompting	Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language	Karen Li
June 13, 2023	Parameter Efficiency - Prompt Tuning	Multimodal few-shot learning with frozen language models	Muhammad Anas Tahir
June 20, 2023	Parameter Efficiency - Prompt Tuning	Transitional adaptation of pretrained models for visual storytelling	Mahnoor Shahid
June 20, 2023	Parameter Efficiency - Prefix-Tuning	HyperPELT: Unified parameter-efficient language model tuning for both language and vision-and-language tasks	Sijie Wu
June 27, 2023	Parameter Efficiency - Prefix-Tuning	Visual Prompt Tuning	Prathvish Mithare
June 27, 2023	Parameter Efficiency - Adapters	VL-Adapter: Parameter-efficient transfer learning for vision-and-language tasks	Muhammed Saeed
July 4, 2023	Parameter Efficiency - Adapters	LST: Ladder Side-Tuning for Parameter and Memory Efficient Transfer Learning	Rajkumar Anilkumar Vaghashiya
July 4, 2023	Generative Model - Text-to-Image	High-resolution image synthesis with latent diffusion models	Shreyash Arya
July 11, 2023	Generative Model - GPT	GPT-4, GPT-4 Technical Report	Abdul Rafay
July 11, 2023	Generative Model - GPT	Visual ChatGPT: Talking, Drawing, and Editing with Visual Foundation Models	-
July 18, 2023	Reinforcement Learning	No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling	-
July 18, 2023	Reinforcement Learning	What Makes a Good Story? Designing Composite Rewards for Visual Storytelling	-
July 25, 2023	Summary		Xudong Hong and Ruitao Feng

Introduction

Please find it here.

Topics

Please find them here.

Grading

10% draft presentation (due each Wednesday)
10% questions about the papers (due each Friday)
35% final talk
5% attendance at all talks and giving feedback
5% discussion with the others during the talks
35% term paper on your understanding of the paper (5 pages, using the ACL 2023 template)
(optional) 10% Demo

Representations

Requirements

Students should have a basic understanding of deep learning, natural language processing and computer vision concepts. Students are expected to actively engage in discussions and critically analyze the papers presented during the seminar. They are also encouraged to share their own insights and perspectives on the topics covered.

Discussion Format

Each paper is first presented by its assigned participant. A group discussion follows, in which the others share their thoughts and insights on the research.

Date and Time

Every Tuesday, 12:15-13:45

Kick-off meeting: Apr 18, 2023

Location: Building C7 3 - Seminarraum 1.12

Contact

If you have any questions or concerns, please contact us via email. We look forward to seeing you at the discussion!

Xudong Hong: xhong@coli.uni-saarland.de

Ruitao Feng: fruitao@coli.uni-saarland.de

(The following is under construction. Please stay tuned.)
