Abstract
We propose INTENT-O-METER, a perceived human intent prediction model for multimodal (image and text) social media posts. In addition to visual and textual features, INTENT-O-METER draws on ideas from the psychology and cognitive modeling literature to improve perceived intent prediction. Specifically, it leverages the Theory of Reasoned Action (TRA), factoring in (i) the creator's attitude towards sharing a post, and (ii) the social norm or perception towards the post in determining the creator's intention. We also introduce INTENTGRAM, a dataset of $55$K social media posts scraped from public Instagram profiles. We compare INTENT-O-METER with state-of-the-art intent prediction approaches on four perceived intent prediction datasets: Intentonomy, MDID, MET-Meme, and INTENTGRAM. We observe that leveraging TRA in addition to visual and textual features, as opposed to using only the latter, improves prediction performance by up to $7.5\%$ in Top-$1$ accuracy and $8\%$ in AUC on INTENTGRAM.
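To make the two TRA factors concrete, the following is a minimal sketch of the classical textbook form of TRA, in which behavioral intention is a weighted sum of attitude and subjective norm. It is not the INTENT-O-METER architecture: the names `TRAFactors` and `behavioral_intention`, the score ranges, and the weights are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class TRAFactors:
    # Hypothetical scalar scores in [0, 1]; a real model would have to
    # estimate such quantities from the post rather than receive them.
    attitude: float          # creator's attitude towards sharing the post
    subjective_norm: float   # perceived social norm towards the post


def behavioral_intention(factors: TRAFactors,
                         w_attitude: float = 0.5,
                         w_norm: float = 0.5) -> float:
    """Classical TRA: intention as a weighted sum of the two factors."""
    return w_attitude * factors.attitude + w_norm * factors.subjective_norm


# Toy example with illustrative (assumed) weights.
post = TRAFactors(attitude=0.8, subjective_norm=0.4)
score = behavioral_intention(post, w_attitude=0.75, w_norm=0.25)
```

In this toy setting, a post with a strongly positive attitude but a weaker social norm still receives a fairly high intention score because the assumed weights favor attitude. In a learned model, the relative influence of the two factors would presumably be fit from data rather than fixed by hand.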
Dataset
The dataset can be found here.