Overview
The “Gen AI for E-commerce” workshop explores the role of Generative Artificial Intelligence in transforming e-commerce through enhanced user experience and operational efficiency. E-commerce companies grapple with challenges such as a lack of quality product content, subpar user experience, and sparse datasets. Gen AI offers significant potential to address these challenges. Yet deploying these technologies at scale presents its own difficulties, including hallucination, excessive cost, increased response latency, and limited generalization in sparse-data environments. This workshop will bring together experts from academia and industry to discuss these challenges and opportunities, aiming to showcase case studies, breakthroughs, and insights into practical implementations of Gen AI in e-commerce.
Call for papers
We welcome papers that leverage Generative Artificial Intelligence (Gen AI) in e-commerce. Detailed topics are listed in the CFP. Papers can be submitted via EasyChair.
Information for the day of the workshop
Workshop at CIKM 2024
- Paper submission deadline: 16 August 2024
- Paper acceptance notification: 30 August 2024
- Workshop: 25 October 2024
Keynote Speakers
Himabindu Lakkaraju
Harvard University
Title of the talk: TBD
Manisha Verma
Amazon
Title of the talk: TBD
Xia Ning
Ohio State University
Title of the talk: Generalizing Large Language Models for E-commerce from Large-scale, High-quality Instruction Data
Vinodh Kumar Sunkara
Meta
Title of the talk: LLM integrated Meta Ad Promotion Sourcing
Accepted Papers
- Multimodal Arabic Negotiation Bots
Samah Albast, Wassim El-Hajj, Hazem Hajj, Khaled Shaban and Shady Elbassuoni
Abstract: Negotiation is a fundamental aspect of human interaction. With recent advancements in chatbots, leveraging artificial intelligence for negotiation has emerged as an ideal application. Despite significant progress in English negotiation bots, such advancements are notably absent in Arabic. Furthermore, while previous research has focused on developing high-performing neural response generation systems for negotiation bots, the integration of multi-modality remains unexplored. This work presents the first Arabic multi-modal negotiation bot presented by a seller agent capable of engaging in negotiations with buyers in the context of item sales. This seller agent is designed to understand the buyer's Arabic utterances and to interpret the negotiation context through images provided by the buyer. To achieve this, we fine-tuned a generative pre-trained transformer (GPT-2) model on an Arabic dataset, integrating it with reinforcement learning for more coherent and persuasive responses, and a convolutional neural network to support multimodality. To evaluate our model, we relied on both automatic evaluation using established metrics such as cross-entropy loss and the BLEU score, as well as human evaluation in terms of fluency, consistency and persuasion. Our evaluation results reveal both the successes and limitations of the designed multi-modal Arabic negotiation bot, offering insights into the inherent challenges and setting directions for future research.
- Decoding Style: Efficient Fine-Tuning of LLMs for Image-Guided Outfit Recommendation with Preference Feedback
Najmeh Forouzandehmehr, Nima Farrokhsiar, Ramin Giahi, Evren Korpeoglu and Kannan Achan
Abstract: Personalized outfit recommendation remains a complex challenge, demanding both fashion compatibility understanding and trend awareness. This paper presents a novel framework that harnesses the expressive power of large language models (LLMs) for this task, mitigating their "black box" and static nature through fine-tuning and direct feedback integration. We bridge the item visual-textual gap in item descriptions by employing image captioning with a Multimodal Large Language Model (MLLM). This enables the LLM to extract style and color characteristics from human-curated fashion images, forming the basis for personalized recommendations. The LLM is efficiently fine-tuned on the open-source Polyvore dataset of curated fashion images, optimizing its ability to recommend stylish outfits. A direct preference mechanism using negative examples is employed to enhance the LLM’s decision-making process. This creates a self-enhancing AI feedback loop that continuously refines recommendations in line with seasonal fashion trends. Our framework is evaluated on the Polyvore dataset, demonstrating its effectiveness in two key tasks: fill-in-the-blank, and complementary item retrieval. These evaluations underline the framework’s ability to generate stylish, trend-aligned outfit suggestions, continuously improving through direct feedback. The evaluation results demonstrated that our proposed framework significantly outperforms the base LLM, creating more cohesive outfits. The improved performance in these tasks underscores the proposed framework’s potential to enhance the shopping experience with accurate suggestions, proving its effectiveness over the vanilla LLM based outfit generation.
- Cross-Modal Zero-Shot Product Attribute Value Generation
Jiaying Gong, Ming Cheng, Hongda Shen, Pierre-Yves Vandenbussche and Hoda Eldardiry
Abstract: Existing zero-shot product attribute value (aspect) extraction aims at using open-mining, graph, or large language models to predict unseen product attribute values. These approaches rely on uni-modal or multi-modal models, where the sellers should provide detailed textual inputs (product descriptions) for the products. However, manually providing (typing) the product descriptions is time-consuming and frustrating for the users. Thus, we propose a cross-modal zero-shot attribute value generation framework (ViOC-AG) based on CLIP, which only requires product images as the inputs. In other words, users only need to take photos of the products they want to sell to generate unseen attribute values. ViOC-AG follows a text-only training process, where a task-customized text decoder with a projection layer is trained with the frozen CLIP text encoder to alleviate the modality gap and task disconnection. During the zero-shot inference, product aspects are generated by the frozen CLIP image encoder connected with the trained task-customized text decoder. OCR tokens and outputs from a frozen prompt-based LLM correct the decoded outputs for out-of-domain attribute values. Extensive experiments with ablation studies conducted on the public dataset MAVE demonstrate that our proposed model significantly outperforms other fine-tuned vision-language models for zero-shot attribute value generation.
- Learning variant product relationship and variation attributes from e-commerce website structures
Pedro Herrero-Vidal, You-Lin Chen, Cris Liu, Prithviraj Sen and Lichao Wang
Abstract: We introduce VARM, variant relationship matcher model, to identify pairs of variant products in e-commerce catalogs. Traditional definitions of entity resolution are concerned with whether two mentions refer to the same underlying product. However, this fails to capture product relationships that are critical for e-commerce applications, such as listing similar, but not identical, products on the same webpage or review sharing. Here, we formulate a new type of entity resolution in variant product relationships. In contrast with the traditional definition, the new definition requires identifying both whether two products are variant matches of each other and which attributes vary between them. To overcome these challenges, we developed a model that leverages the strengths of both encoding and generative AI models. First, we construct a dataset that captures webpage product links, and therefore variant product relationships, to train an encoding LLM to predict variant matches for any given pair of products. Second, we use RAG prompted generative LLMs to extract variant and common attributes amongst groups of variant products. To validate our strategy, we evaluated model performance using real data from one of the world's leading e-commerce retailers. The results showed that our model outperforms alternative solutions and paves the way to exploiting this new type of product relationship.
- Towards More Relevant Product Search Ranking Via Large Language Models: An Empirical Study
Qi Liu, Atul Singh, Jingbo Liu, Cun Mu and Zheng Yan
Abstract: Training Learning-to-Rank models for e-commerce product search ranking can be challenging due to the lack of a gold standard of ranking relevance. In this paper, we decompose ranking relevance into content-based and engagement-based aspects, and we propose to leverage Large Language Models (LLMs) for both label and feature generation in model training, primarily aiming to improve the model's predictive capability for content-based relevance. Additionally, we introduce different sigmoid transformations on the LLM outputs to polarize relevance scores, enhancing the model's ability to balance between content-based and engagement-based relevances and thus prioritize highly relevant items overall. Comprehensive online tests and offline evaluations are also conducted for the proposed design. Our work sheds light on advanced strategies for integrating Language Models into e-commerce product search ranking model training, offering a pathway to more effective and balanced models with improved ranking relevance.
- Hierarchical Knowledge Graph Construction from Images for Scalable E-Commerce
Zhantao Yang, Han Zhang, Fangyi Chen, Anudeepsekhar Bolimera and Marios Savvides
Abstract: Knowledge Graph (KG) is playing an increasingly important role in various AI systems. For e-commerce, an efficient and low-cost automated knowledge graph construction method is the foundation of enabling various successful downstream applications. In this paper, we propose a novel method for constructing structured product knowledge graphs from raw product images. The method cooperatively leverages recent advances in the vision-language model (VLM) and large language model (LLM), fully automating the process and allowing timely graph updates. We also present a human-annotated e-commerce product dataset for benchmarking product property extraction in knowledge graph construction. Our method outperforms our baseline in all metrics and evaluated properties, demonstrating its effectiveness and bright usage potential.
- PAE: LLM-based Product Attribute Extraction for E-Commerce Fashion Trends
Apurva Sinha and Ekta Gujral
Abstract: Product attribute extraction is a growing field in e-commerce business, with several applications including product ranking, product recommendation, future assortment planning and improving online shopping customer experiences. Understanding the customer needs is a critical part of online business, specifically fashion products. Retailers use assortment planning to determine the mix of products to offer in each store and channel, stay responsive to market dynamics and to manage inventory and catalogs. The goal is to offer the right styles, in the right sizes and colors, through the right channels to foster customer loyalty. In this paper we present PAE, a product attribute extraction algorithm for future trend reports consisting of text and images in PDF format. Most existing methods focus on attribute extraction from titles or product descriptions or utilize visual information from existing product images. Compared to the prior works, our work focuses on attribute extraction from PDF files where upcoming fashion trends are explained. Our contributions are three-fold: (a) We develop PAE, an efficient framework to extract attributes from unstructured data (text and images); (b) We provide catalog matching methodology based on BERT representations to discover the existing attributes using upcoming attribute values; (c) We conduct extensive experiments with several baselines and show that PAE is an effective, flexible and on par or superior (avg 92.5% F1-Score) framework to existing state-of-the-art for attribute value extraction task.
- LLM-Modulo-Rec: Leveraging Approximate World-Knowledge of LLMs to Improve eCommerce Search Ranking Under Data Paucity
Ali El Sayed, Sathappan Muthiah and Nikhil Muralidhar
Abstract: Effective ranking of products relevant to a user's query and interest is the main goal of e-commerce product ranking. In this context, ranking irrelevant products or those mismatched with the intent of the user query results in sub-optimal user experience. Providing high-quality, relevant search rankings requires large labelled datasets for training powerful deep learning (DL) based ranking pipelines. However, such large datasets are costly and time-consuming to obtain. Another important facet that influences search ranking quality is the intent and ambiguity in the user's search query. Hence, data paucity and query ambiguity are two ever-present challenges impeding the success of modern deep learning (DL) based e-commerce ranking models. In this work, we present the first ever investigation of employing large-language models (LLMs) as approximate knowledge sources to counter these challenges and improve the performance of off-the-shelf ranking models, under data paucity and query ambiguity. Specifically, we undertake the first ever investigation of developing an LLM-Modulo method to improve the search ranking performance of off-the-shelf ranking models. Our experiments demonstrate notable performance improvements in ranking quality of these off-the-shelf models, when employed in an LLM-Modulo manner.
- ReScorer: An Aggregation and Alignment Technique for Building Trust into LLM Reasons
Brian de Silva, Jay Mohta, Sugumar Murugesan, Dantong Liu, Yan Xu and Mingwei Shen
Abstract: Large language models (LLMs) offer substantial potential for automating labeling tasks, showcasing robust zero-shot performance across diverse classification tasks. The LLM-generated reasons that accompany these classifications contain signals about the quality of the classifications. Estimates of quality of these reasons can, in essence, be used to detect potentially incorrect predictions. Conventional metrics for scoring reasons such as ROUGE-L and BLEU scores depend on ground truth reference reasons, which are challenging and expensive to acquire, and are not available at inference time for new examples. In this paper, we use a product classification dataset to evaluate two reasoning scoring strategies that do not rely on reference reasons, one involving an LLM-based scorer and another using recently proposed ROSCOE metrics. Our analysis reveals that LLM-based approaches are computationally intensive, while aligning ROSCOE metrics with human judgment presents challenges. Consequently, we propose an extension to the ROSCOE framework called ReScorer, which achieves 7% better alignment with human judgment compared to LLM-based evaluation and 59% better than ROSCOE, while being 89% cheaper compared to LLM-based scoring.
- Label with Confidence: Effective Confidence Calibration and Ensembles in LLM-Powered Classification
Karen Hovsepian, Dantong Liu and Sugumar Murugesan
Abstract: Large Language Models (LLMs) have been employed as crowdsourced annotators to alleviate the burden of human labeling. However, the broader adoption of LLM-based automated labeling systems encounters two main challenges: 1) LLMs are prone to producing unexpected and unreliable predictions, and 2) no single LLM excels at all labeling tasks. To address these challenges, we first develop fast and effective logit-based confidence score calibration pipelines, aiming to leverage calibrated LLM confidence scores to accurately estimate the LLM’s level of confidence. We propose a novel calibration error based sampling strategy to efficiently select labeled data for calibration, leading to a reduction of calibration error by 46%, compared with uncalibrated scores. Leveraging calibrated confidence scores, we then design a cost-aware cascading LLM ensemble policy which achieves improved accuracy, while reducing inference cost by more than 2 times compared with the conventional weighted majority voting ensemble policy.
- E-Commerce Product Categorization with LLM-based Dual-Expert Classification Paradigm
Zhu Cheng, Wen Zhang, Chih-Chi Chou, You-Yi Jau, Archita Pathak, Peng Gao and Umit Batur
Abstract: Accurate product categorization in e-commerce is critical for delivering a satisfactory online shopping experience to customers. With the vast number of products available and the numerous potential categories, it becomes crucial to develop a classification system capable of assigning products to their correct categories with high accuracy. We present a novel dual-expert classification system that utilizes the power of large language models (LLMs). This framework integrates domain-specific knowledge and pre-trained LLM’s general knowledge through effective model fine-tuning and prompting techniques. First, the fine-tuned domain-specific expert recommends top K candidate categories for a given input product. Then, the more general LLM-based expert, through prompting techniques, analyzes the nuanced differences between candidate categories and selects the most suitable target category. Experiments on e-commerce datasets demonstrate the effectiveness of our LLM-based Dual-Expert classification system.
Organizers
Mansi Mane
Walmart Global Tech
Djordje Gligorijevic
eBay
Dingxian Wang
Upwork
Behzad Shahrasbi
Amazon
Topojoy Biswas
Walmart Global Tech
Evren Korpeoglu
Walmart Global Tech
Marios Savvides
CMU, UltronAI
Program Committee
- ABC (XYZ University)