Description
Vision-language models can analyze images, extract information, and answer complex questions, but getting reliable results requires thoughtful prompt design. In this hands-on workshop, we extend Prompt by Design to practical image-based tasks using modern vision-language models.
Participants will learn how to design prompts for core vision-language tasks such as image classification, description generation, multimodal question answering, visual information extraction, and multimodal document understanding. We will focus on practical techniques for improving accuracy, reducing hallucination, directing the model’s attention to relevant parts of an image, and producing consistent, well-formatted responses suitable for downstream use.
Through guided exercises, attendees will build working solutions for practical image-driven tasks programmatically. No prior coding experience is required.
Prompt by Design: Processing Real-World Multimodal Data with AI
Location
San Pedro I - Weston Conference Center
Category
Campus Events, Students