I am a Ph.D. candidate (soon to graduate) at Michigan State University, majoring in natural language processing. My research focuses on combining reasoning tools with deep neural networks and on integrating background knowledge. It is most closely related to spatial-temporal reasoning, procedural reasoning, Large Language Models, Question-Answering, and information extraction. I have been a reviewer for top-tier NLP venues such as ACL, NAACL, and EMNLP, and for broader conferences such as IJCAI and AAAI. I have also been involved in organizing neuro-symbolic workshops such as CLeaR (Combining Learning and Reasoning). I obtained my M.Sc. at Sharif University of Technology on the integration of Machine Learning and Symbolic knowledge in the scope of IoT. My B.Sc. degree is from Amirkabir University of Technology.
Computer Science - NLP
Michigan State University
Thesis: Exploiting Semantic Structures Toward Procedural Reasoning
Focus: NLP, AI, Machine Learning, Neuro-Symbolic AI, LLM, Neural Logical Reasoning
Computer Engineering
Sharif University of Technology
Thesis: Hybrid Learning Approach Toward Situation Recognition and Handling
Computer Science
Amirkabir University of Technology
GPA: 3.94
Michigan State University
Apple
Dataminr
Vestaak
Vanda Andisheh Amirkabir
Declarative Neuro-Symbolic Learning and Reasoning Framework
AI Symposium
University of Michigan
SpartQA: A textual question answering benchmark for spatial reasoning
Graduate Symposium
Michigan State University
SpartQA: A textual question answering benchmark for spatial reasoning
AI Symposium
University of Michigan
DomiKnowS: A library for integration of symbolic domain knowledge in deep learning
My research focus is on enhancing the reasoning capabilities of deep learning models, especially models for natural language understanding.
During my time at Michigan State University, I have published many research papers on reasoning and deep learning, on topics including Procedural Reasoning, Spatial Reasoning, Question-Answering, Semantic Parsing, Neuro-Symbolic AI, Domain Knowledge Integration, and more.
In addition to my ongoing research at MSU, I have been involved in application-focused research at both Apple and Dataminr. At Apple, I contributed to improving the natural language understanding underlying Siri. At Dataminr, I published a research paper and introduced novel techniques and benchmarks for extracting coherent disaster stories from social media.
The goal of this project is to facilitate the development of custom architectures by simply prompting Large Language Models. We explore a series of techniques for mapping prompts into declarative programming frameworks for designing custom neural architectures that can be aligned with a symbolic knowledge space.
Machine Learning AI LLM Prompt Engineering Dynamic Retrieval Neuro-Symbolic AI NLP
In this research project, which led to multiple publications in top-tier conferences, we explore the capability of deep learning models to understand the evolution of events and actions in a process. This understanding includes, but is not limited to, identifying important actions, the effects of actions on the involved entities, and the goal of the process. Within this project, we mainly focus on improving the capability of Language Models through knowledge augmentation, connecting to symbolic knowledge, incorporating domain knowledge, and similar techniques.
Machine Learning AI Procedural Reasoning Reasoning Neuro-Symbolic AI NLP Vision
DomiKnowS is a library for integrating domain knowledge in deep learning architectures. Using this library, the data structure is expressed symbolically via graph declarations, and the logical constraints over outputs or latent variables can be seamlessly added to the deep models. The domain knowledge can be defined explicitly, improving the models' explainability and the performance and generalizability in the low-data regime. Several approaches for integrating symbolic and sub-symbolic models have been introduced; however, there is no library to facilitate the programming for such integration in a generic way while various underlying algorithms can be used. Our library aims to simplify programming for such integration in both training and inference phases while separating the knowledge representation from learning algorithms. We showcase various NLP benchmark tasks and beyond. The framework is publicly available on GitHub (https://github.com/HLR/DomiKnowS).
Research Machine Learning AI Ontology Domain Knowledge Knowledge Graph Neuro-Symbolic AI NLP Vision
In this project, which led to a publication at EMNLP 2022, we propose a novel approach to extracting relevant information on disaster events from Twitter. We go beyond topic detection and extract coherent stories at a local scale from tweets. We further provide a dataset for understanding the relevance and importance of the content, and for summarizing the disaster stories extracted from Twitter.
Machine Learning AI LLM Social Media Clustering Neuro-Symbolic AI NLP
In this project, which led to a paper published at NAACL 2021, we explore the capabilities of SOTA models in understanding complex multi-hop spatial relationships described in natural language. We provide synthetic and human-generated datasets for multi-hop spatial reasoning, evaluate the capability of models to transfer their learning from the synthetic set to other datasets, and further analyse a series of baselines on their spatial understanding.
Machine Learning AI Spatial Reasoning Reasoning Neuro-Symbolic AI NLP Vision
We propose a novel hybrid learning approach to gain situation awareness in smart environments by introducing a new situation identifier that combines an expert system and a machine learning approach. Traditionally, expert systems and machine learning approaches have been widely used independently to detect ongoing situations as the main functionality in smart environments in various domains. Expert systems lack the functionality to adapt the system to each user and are expensive to design based on each setting. On the other hand, machine learning approaches fail in the challenge of cold start and making explainable decisions. Using both of these approaches enables the system to use user’s feedback and capture environmental changes while exploiting the initial expert knowledge to solve the mentioned challenges. We use decision trees and situation templates as the core structure to interpret sensor data. To evaluate the proposed method, we generate a new human-annotated dataset simulating a smart environment. Our experiments show superior results compared with the initial expert system and the machine learning approach while preserving the initial expert system’s interpretability.
Research IOT Game Theory Machine Learning Ontology XML Hybrid Learning Neuro-Symbolic AI
The project collected papers by professors from selected universities, integrated the professors' data, and stored it in data warehouses. It was implemented using Python, Scrapy, and Selenium.
Python Crawler Data Science
The project analysed received resumes and assigned a matching score to each one, then invited the top job seekers to an interview using the Telegram CLI.
Python Crawler Data Science Telegram CLI
The project is a Telegram crawler that identifies the topic of Telegram channels. The robot discovers channels on its own from groups and other channels.
Python Machine Learning Data Science Telegram CLI
The project was about extracting information from PDF resume files into categories.
Python Crawler Data Science Machine Learning
The project was built to help students estimate their chances of admission to universities. The process compared a user's data with previous data from universities, which included a mapping between universities, and drew inferences from that data.
Python Crawler Data Science Machine Learning Data Mining
The first challenges were cleaning the data and extracting historical data to detect noise points. We then made predictions of counts for professors and suggested partners based on previous papers.
Research Data mining
In this paper, we discussed previous solutions for opinion mining and improved them using a domain-specific ontology. The ontology helped in many cases, such as aspect analysis and handling similar words.
Research Ontology Opinion Mining NLP
The mission was to design an e-learning platform where students and teachers collaborate on courses and classes. Teachers' questions were crowdsourced, and the point-assignment process was crowdsourced among students as well. Students were encouraged by feedback to continue solving problems.
Thesis Research Project Web Developing Laravel Crowd sourcing
The project was about coding a complete social media platform, including administration. The stack was HTML, CSS, and JavaScript for front-end development and PHP and MySQL for back-end development.
Programming HTML CSS Javascript PHP Database Mysql Social Media
The project was about designing an efficient database structure for a social media platform, including query building for several common requests, designing complex queries, and creating trigger events.
Programming Mysql Postgres PHP Database PHP Mysql Social Media
The project was about designing and coding a minimal Java compiler. The work covered the compiler phases from lexical analysis through semantic analysis. The compiler itself was written in C++.
Programming C++ Compiler
The project was a course selection system for student registration. Each student had an access code to log in, followed by commands to list the courses, select a course, get the list of selected courses, remove lists, and so on. Each teacher had an access code as well and could register a new course and get the list of students registered in a course. The user interface was a terminal command line.
Programming C++ System Design Object oriented Programming
In this paper, we investigate whether symbolic semantic representations, extracted from deep semantic parsers, can help reasoning over the states of involved entities in a procedural text. We consider a deep semantic parser (TRIPS) and semantic role labeling as two sources of semantic parsing knowledge. First, we propose PROPOLIS, a symbolic parsing-based procedural reasoning framework. Second, we integrate semantic parsing information into state-of-the-art neural models to conduct procedural reasoning. Our experiments indicate that explicitly incorporating such semantic knowledge improves procedural understanding. This paper presents new metrics for evaluating procedural reasoning tasks that clarify the challenges and identify differences among neural, symbolic, and integrated models.
Recent research has shown that integrating domain knowledge into deep learning architectures is effective: it helps reduce the amount of required data, improves the accuracy of the models' decisions, and improves the interpretability of models. However, the research community lacks a convened benchmark for systematically evaluating knowledge integration methods. In this work, we create a benchmark that is a collection of nine tasks in the domains of natural language processing and computer vision. In all cases, we model external knowledge as constraints, specify the sources of the constraints for each task, and implement various models that use these constraints. We report the results of these models using a new set of extended evaluation criteria in addition to the task performances for a more in-depth analysis. This effort provides a framework for a more comprehensive and systematic comparison of constraint integration techniques and for identifying related research challenges. It will facilitate further research for alleviating some problems of state-of-the-art neural models.
Social media has increasingly played a key role in emergency response: first responders can use public posts to better react to ongoing crisis events and deploy the necessary resources where they are most needed. Timeline extraction and abstractive summarization are critical technical tasks to leverage large numbers of social media posts about events. Unfortunately, there are few datasets for benchmarking technical approaches for those tasks. This paper presents the largest dataset of local crisis event timelines available to date. The dataset contains 1,000 crisis event timelines across four domains: wildfires, local fires, traffic, and storms. We built it using a semi-automated cluster-then-refine approach to collect data from the public Twitter stream. Our initial experiments indicate a significant gap between the performance of strong baselines and human performance on both tasks. Our dataset, code, and models are publicly available (https://github.com/CrisisLTLSum/CrisisTimelines).
We demonstrate a library for the integration of domain knowledge in deep learning architectures. Using this library, the structure of the data is expressed symbolically via graph declarations and the logical constraints over outputs or latent variables can be seamlessly added to the deep models. The domain knowledge can be defined explicitly, which improves the explainability of the models in addition to their performance and generalizability in the low-data regime. Several approaches for such integration of symbolic and sub-symbolic models have been introduced; however, there is no library to facilitate the programming for such integration in a generic way while various underlying algorithms can be used. Our library aims to simplify programming for such integration in both training and inference phases while separating the knowledge representation from learning algorithms. We showcase various NLP benchmark tasks and beyond. The framework is publicly available on GitHub (https://github.com/HLR/DomiKnowS).
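As a rough illustration of keeping a logical constraint over model outputs separate from the model itself, here is a minimal plain-Python sketch. It does not use the DomiKnowS API; the `Constraint` class, the `violations` helper, and the example labels are all hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# One prediction maps label names to booleans for a single candidate (e.g. one entity).
Prediction = Dict[str, bool]


@dataclass
class Constraint:
    """A named logical rule that every prediction is expected to satisfy."""
    name: str
    rule: Callable[[Prediction], bool]


def violations(preds: List[Prediction], constraints: List[Constraint]) -> List[Tuple[int, str]]:
    """Return (prediction index, constraint name) for every violated rule."""
    return [
        (i, c.name)
        for i, p in enumerate(preds)
        for c in constraints
        if not c.rule(p)
    ]


# Hypothetical domain knowledge: an entity cannot be both a person and a location.
constraints = [
    Constraint(
        "not_both_person_and_location",
        lambda p: not (p["person"] and p["location"]),
    ),
]

preds = [
    {"person": True, "location": False},
    {"person": True, "location": True},  # violates the rule
]
result = violations(preds, constraints)
```

The knowledge lives in the `constraints` list, so the same rules could be checked against the outputs of any model without changing the model code.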
This paper proposes a question-answering (QA) benchmark for spatial reasoning on natural language text which contains more realistic spatial phenomena not covered by prior work and is challenging for state-of-the-art language models (LMs). We propose a distant supervision method to improve on this task. Specifically, we design grammar and reasoning rules to automatically generate a spatial description of visual scenes and corresponding QA pairs. Experiments show that further pretraining LMs on these automatically generated data significantly improves LMs’ capability on spatial understanding, which in turn helps to better solve two external datasets, bAbI and BoolQ. We hope that this work can foster investigations into more sophisticated models for spatial reasoning over text.
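The idea of generating a spatial description and QA pairs from a scene can be sketched with a toy example. This is not the benchmark's generator: it assumes a one-dimensional scene with a single "left of" relation, and the functions `describe` and `qa_pairs` are hypothetical. The description states only adjacent relations, so some questions require chaining them.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def describe(scene: Dict[str, int]) -> List[str]:
    """State only the relation between horizontally adjacent objects."""
    ordered = sorted(scene, key=scene.get)
    return [
        f"The {a} is to the left of the {b}."
        for a, b in zip(ordered, ordered[1:])
    ]


def qa_pairs(scene: Dict[str, int]) -> List[Tuple[str, str]]:
    """Ask about every object pair; answers are read off the scene coordinates."""
    pairs = []
    for a, b in combinations(sorted(scene), 2):
        answer = "yes" if scene[a] < scene[b] else "no"
        pairs.append((f"Is the {a} to the left of the {b}?", answer))
    return pairs


# Object name -> x coordinate (assumed distinct).
scene = {"circle": 2, "square": 0, "triangle": 1}
story = describe(scene)
questions = qa_pairs(scene)
```

Here the question about the circle and the square is never answered directly by a sentence in `story`; it follows from combining the two stated relations.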
Tracking entities throughout a procedure described in a text is challenging due to the dynamic nature of the world described in the process. Firstly, we propose to formulate this task as a question answering problem. This enables us to use transformer-based language models pre-trained on other QA benchmarks by adapting them to procedural text understanding. Secondly, since transformer-based language models cannot encode the flow of events by themselves, we propose a Time-Stamped Language Model (TSLM) that encodes event information in the LM architecture by introducing a timestamp encoding. Our model, evaluated on the ProPara dataset, improves on the published state-of-the-art results with a 3.1% increase in F1 score. Moreover, our model yields better results on the location prediction task on the NPN-Cooking dataset. This result indicates that our approach is effective for procedural text understanding in general.
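One way to picture a timestamp encoding is to label every token by where its step falls relative to the step being queried. The sketch below is an assumption about the general idea, not the published model: the label ids, the function `timestamp_ids`, and the example steps are hypothetical.

```python
from typing import List

# Hypothetical label ids; 0 is left free for padding.
PAST, CURRENT, FUTURE = 1, 2, 3


def timestamp_ids(steps: List[List[str]], query_step: int) -> List[int]:
    """Label every token by whether its step is before, at, or after query_step."""
    if not 0 <= query_step < len(steps):
        raise ValueError("query_step must index an existing step")
    ids: List[int] = []
    for i, tokens in enumerate(steps):
        if i < query_step:
            label = PAST
        elif i == query_step:
            label = CURRENT
        else:
            label = FUTURE
        ids.extend([label] * len(tokens))
    return ids


steps = [
    ["roots", "absorb", "water"],
    ["water", "moves", "up"],
    ["leaves", "release", "vapor"],
]
ids = timestamp_ids(steps, query_step=1)
```

In a transformer, such ids would typically index a learned embedding that is added to the token embeddings; that part is omitted here.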
We propose a novel alignment mechanism to deal with procedural reasoning on a newly released multimodal QA dataset named RecipeQA. Our model solves the textual cloze task, a reading comprehension task on a recipe containing images and instructions. We exploit the power of attention networks, cross-modal representations, and a latent alignment space between instructions and candidate answers to solve the problem. We introduce constrained max-pooling, which refines the max-pooling operation on the alignment matrix to impose disjoint constraints among the outputs of the model. Our evaluation result indicates a 19% improvement over the baselines.
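The effect of a disjointness constraint on max-pooling can be shown with a small sketch. It assumes a greedy selection over an alignment matrix, which is a stand-in chosen for illustration and not necessarily the operation used in the paper; `constrained_max_pool` and the scores are hypothetical.

```python
from typing import List


def constrained_max_pool(scores: List[List[float]]) -> List[int]:
    """Pick one column per row so that no column is chosen twice (greedy)."""
    n_rows, n_cols = len(scores), len(scores[0])
    if n_rows > n_cols:
        raise ValueError("need at least as many columns as rows")
    choice = [-1] * n_rows
    used_cols = set()
    for _ in range(n_rows):
        # Take the best remaining (row, column) cell, then retire both.
        _, row, col = max(
            (scores[r][c], r, c)
            for r in range(n_rows) if choice[r] == -1
            for c in range(n_cols) if c not in used_cols
        )
        choice[row] = col
        used_cols.add(col)
    return choice


# Rows are slots to fill, columns are candidate answers.
scores = [
    [0.9, 0.8, 0.1],
    [0.95, 0.2, 0.3],
]
plain = [max(range(len(row)), key=row.__getitem__) for row in scores]
disjoint = constrained_max_pool(scores)
```

Plain row-wise max-pooling assigns the same candidate to both slots, while the constrained version gives each slot a different candidate.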
Structured learning algorithms usually involve an inference phase that selects the best global assignment of output variables based on the local scores of all possible assignments. We extend deep neural networks with structured learning to combine the power of learning representations with the use of domain knowledge in the form of output constraints during training. Introducing a non-differentiable inference module to gradient-based training is a critical challenge. Compared to using conventional loss functions that penalize every local error independently, we propose an inference-masked loss that takes into account the effect of inference and does not penalize the local errors that can be corrected by the inference. We empirically show that the inference-masked loss combined with the negative log-likelihood loss improves the performance on different tasks, namely entity relation recognition on the CoNLL04 and ACE2005 corpora, and spatial role labeling on the CLEF 2017 mSpRL dataset. We show that the proposed approach helps to achieve better generalizability, particularly in the low-data regime.
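The masking idea can be sketched as follows, under one possible reading: a variable contributes to the loss only if the assignment produced by inference is still wrong for it. The function `inference_masked_nll`, the probabilities, and the inference output are hypothetical, and the inference procedure itself is taken as given.

```python
import math
from typing import Sequence


def inference_masked_nll(
    gold_probs: Sequence[float],
    inferred: Sequence[int],
    gold: Sequence[int],
) -> float:
    """Sum -log p(gold) only over variables that inference still gets wrong."""
    if not (len(gold_probs) == len(inferred) == len(gold)):
        raise ValueError("all inputs must have the same length")
    loss = 0.0
    for p, y_hat, y in zip(gold_probs, inferred, gold):
        if y_hat != y:  # inference did not fix this variable, so penalize it
            loss += -math.log(p)
    return loss


# Three binary variables; gold_probs are the local model's probabilities of the gold labels.
gold = [1, 0, 0]
gold_probs = [0.4, 0.45, 0.9]  # variables 0 and 1 are local errors (p < 0.5)
inferred = [1, 1, 0]           # hypothetical inference output: fixes variable 0 only

masked = inference_masked_nll(gold_probs, inferred, gold)
plain = sum(-math.log(p) for p in gold_probs)
```

Only variable 1 is penalized in `masked`, whereas `plain` penalizes every variable independently.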
We propose a novel hybrid learning approach to gain situation awareness in smart environments by introducing a new situation identifier that combines an expert system and a machine learning approach. Traditionally, expert systems and machine learning approaches have been widely used independently to detect ongoing situations as the main functionality in smart environments in various domains. Expert systems lack the functionality to adapt the system to each user and are expensive to design based on each setting. On the other hand, machine learning approaches fail in the challenge of cold start and making explainable decisions. Using both of these approaches enables the system to use user’s feedback and capture environmental changes while exploiting the initial expert knowledge to solve the mentioned challenges. We use decision trees and situation templates as the core structure to interpret sensor data. To evaluate the proposed method, we generate a new human-annotated dataset simulating a smart environment. Our experiments show superior results compared with the initial expert system and the machine learning approach while preserving the initial expert system’s interpretability.