Abstract
Large Language Models (LLMs) have become ubiquitous across domains, transforming the way we interact with information and conduct research. While the proliferation of LLMs has enhanced numerous applications, many high-performing models remain proprietary, impeding scientific exploration. LLMs are also susceptible to hallucinations, generating seemingly credible yet factually inaccurate information, which can hinder their broad acceptance and integration. In this seminar, I will begin by introducing one of our open-source XGen LLMs, delving into its pre-training process and presenting its results on standard benchmarks. I will then discuss our work on reasoning with LLMs, democratizing them for low-resource languages, and distilling knowledge from a larger (175B) proprietary LLM into a smaller (7B) model in a personalized manner. Finally, I will conclude by addressing some limitations of LLMs, emphasizing that scaling alone may not suffice and that new innovations are needed to tackle these challenges.
Bio
Dr. Shafiq Joty (https://raihanjoty.github.io/) is currently a Research Director at Salesforce Research (Palo Alto, USA), where he oversees the NLP group's efforts in large language models (LLMs) and generative AI. He also holds the position of a tenured Associate Professor (currently on leave) in the School of Computer Science and Engineering (SCSE) at NTU, Singapore, and was a founding manager of the Salesforce Research Asia (Singapore) lab. His research has contributed to more than 30 patents and 140 papers in top-tier NLP and ML conferences and journals. He has served as the Program Chair of SIGDIAL-2023, as a member of the best paper award committees for ICLR-23 and NAACL-22, and as a (senior) area chair for many leading NLP and ML conferences.
Hackerman Hall B17 @ 3400 N. Charles Street, Baltimore, MD 21218