Education

  • Ph.D. 2019-2023

    Computer Science - NLP

    Michigan State University

    Thesis: Exploiting Semantic Structures Toward Procedural Reasoning

    Focus: NLP, AI, Machine Learning, Neuro-Symbolic AI, LLM, Neural Logical Reasoning

  • MSc 2016-2018

    Computer Engineering

    Sharif University of Technology

    Thesis: Hybrid Learning Approach Toward Situation Recognition and Handling

  • BSc 2012-2016

    Computer Science

    Amirkabir University of Technology

    GPA: 3.94

Recent Experiences

  • 08/2019 - Current
    Graduate Research Assistant

    Michigan State University

    • Led a team of more than 12 researchers in designing and implementing a neuro-symbolic framework in PyTorch, leveraging a Declarative Learning-based Programming language to integrate domain knowledge into neural networks.
    • Developed cutting-edge models to enhance the comprehension of procedural texts by pretrained language models, resulting in improved procedural reasoning capabilities.
    • Created and curated challenging tasks and datasets to evaluate language models' spatial reasoning capabilities.
    • Employed language models (RoBERTa, T5, GPT-X) and adapted them to solve complex reasoning tasks.
    • Published more than 6 lead-authored research papers at major conferences, such as NAACL 2021, EMNLP 2021, ACL 2020, AAAI 2023, and EACL 2023.
  • 05/2022 - 08/2022
    Research Scientist Intern

    Apple

    • Improved the performance of existing models for Siri intent detection by 30% by addressing data imbalance, applying data clustering based on sentence representations, and leveraging language models.
    • Contributed to developing novel data augmentation techniques to enhance the performance and robustness of Siri intent detection.
  • 05/2021 - 08/2021
    Research Scientist Intern

    Dataminr

    • Led the design and implementation of a knowledge-based clustering pipeline for timeline extraction and summarization of local crisis events on Twitter.
    • Published a lead-authored paper at EMNLP 2022 introducing a novel timeline extraction method and benchmark.
  • 08/2016 - 06/2019
    CEO

    Vestaak

    • Founded a company specializing in training young talent in programming and machine learning and introducing them to companies that needed new engineers.
    • Created a two-sided business plan: developing and maintaining projects for startups, and introducing trained engineers to large companies.
  • 11/2017 - 07/2019
    CTO

    Vanda Andisheh Amirkabir

    • Led the technical team responsible for web application and mobile development.
    • Created technical plans and identified the software requirements for implementing e-education services.

Awards

  • Jackson Data & Impact Award 2022

    Declarative Neuro-Symbolic Learning and Reasoning Framework

  • Best Poster 2022

    AI Symposium

    University of Michigan

    SPARTQA: A Textual Question Answering Benchmark for Spatial Reasoning

  • Best Poster 2022

    Graduate Symposium

    Michigan State University

    SPARTQA: A Textual Question Answering Benchmark for Spatial Reasoning

  • Best Poster 2021

    AI Symposium

    University of Michigan

    DomiKnowS: A Library for Integration of Symbolic Domain Knowledge in Deep Learning

Research Projects

  • Prompt 2 Model 2023

    Generating Complex Neural Architectures from Natural Language Prompts

    Research

    The goal of this project is to facilitate the development of custom architectures by simply prompting Large Language Models. We explore a series of techniques for mapping prompts into declarative programming frameworks for designing custom neural architectures that can be aligned with a symbolic knowledge space.

    Machine Learning AI LLM Prompt Engineering Dynamic Retrieval Neuro-Symbolic AI NLP
  • Procedural Reasoning 2020 - 2023

    Understanding world models and their evolution in procedural text and multimodal data.

    Research

    In this research project, which led to multiple publications in top-tier conferences, we explore the capability of deep learning models to understand the evolution of events and actions in a process. This understanding includes, but is not limited to, identifying important actions, the effects of actions on the involved entities, and the goal of the process. Within this project, we mainly focus on improving the capability of language models through knowledge augmentation, connections to symbolic knowledge, incorporation of domain knowledge, and similar techniques.

    Machine Learning AI Procedural Reasoning Reasoning Neuro-Symbolic AI NLP Vision
  • DomiKnowS 2019 - 2023

    Declarative Learning-based Programming

    Research

    DomiKnowS is a library for integrating domain knowledge in deep learning architectures. Using this library, the data structure is expressed symbolically via graph declarations, and the logical constraints over outputs or latent variables can be seamlessly added to the deep models. The domain knowledge can be defined explicitly, improving the models' explainability and the performance and generalizability in the low-data regime. Several approaches for integrating symbolic and sub-symbolic models have been introduced; however, there is no library to facilitate the programming for such integration in a generic way while various underlying algorithms can be used. Our library aims to simplify programming for such integration in both training and inference phases while separating the knowledge representation from learning algorithms. We showcase various NLP benchmark tasks and beyond. The framework is publicly available on GitHub (https://github.com/HLR/DomiKnowS).

    Research Machine Learning AI Ontology Domain Knowledge Knowledge Graph Neuro-Symbolic AI NLP Vision
  • Social Media Crisis Stories 2021

    Automatic Extraction of disaster events from Twitter

    Research

    In this project, which led to a publication at EMNLP 2022, we propose a novel approach to extract relevant information on disaster events from Twitter. We go beyond topic detection and extract coherent stories at a local scale from tweets. We further provide a dataset for assessing the relevance and importance of content and for summarizing the disaster stories extracted from Twitter.

    Machine Learning AI LLM Social Media Clustering Neuro-Symbolic AI NLP
  • Spatial Reasoning 2021

    Understanding spatial relations from natural language

    Research

    In this project, which led to a paper published at NAACL 2021, we explore the capabilities of SOTA models in understanding complex multi-hop spatial relationships described in natural language. We provide synthetic and human-generated datasets for multi-hop spatial reasoning, evaluate the capability of models to transfer their learning from the synthetic set to other datasets, and further analyze a series of baselines on their spatial understanding.

    Machine Learning AI Spatial Reasoning Reasoning Neuro-Symbolic AI NLP Vision
  • M.Sc. Thesis 2017 - 2019

    A Situation-Aware Framework for IoT Environments

    Research

    We propose a novel hybrid learning approach to gain situation awareness in smart environments by introducing a new situation identifier that combines an expert system and a machine learning approach. Traditionally, expert systems and machine learning approaches have been widely used independently to detect ongoing situations as the main functionality in smart environments in various domains. Expert systems lack the functionality to adapt the system to each user and are expensive to design based on each setting. On the other hand, machine learning approaches fail in the challenge of cold start and making explainable decisions. Using both of these approaches enables the system to use user’s feedback and capture environmental changes while exploiting the initial expert knowledge to solve the mentioned challenges. We use decision trees and situation templates as the core structure to interpret sensor data. To evaluate the proposed method, we generate a new human-annotated dataset simulating a smart environment. Our experiments show superior results compared with the initial expert system and the machine learning approach while preserving the initial expert system’s interpretability.

    Research IoT Game Theory Machine Learning Ontology XML Hybrid Learning Neuro-Symbolic AI
  • Paper Crawler & Integrator 2018

    Crawler and integrator for papers and professors from university websites and paper libraries.

    Project

    The project found papers related to professors at selected universities, integrated them with the professors' data, and stored the results in data warehouses. It was implemented with Python, Scrapy, and Selenium.

    Python Crawler Data Science
  • Automatic Resume Analyzer and Interview Caller 2018

    Built with Python and Telegram CLI.

    Project

    Analyzes received resumes and assigns a matching score to each one, then invites the top job seekers to an interview using Telegram CLI.

    Python Crawler Data Science Telegram CLI
  • Social Media Topic Detector 2017

    Detecting Telegram channel topics from text

    Project

    The project is a Telegram crawler that identifies the topics of Telegram channels. The crawler discovers channels on its own from groups and other channels.

    Python Machine Learning Data Science Telegram CLI
  • Enhanced Resume Analyzer 2017

    Resume analyzer for PDF files using machine learning techniques.

    Project Research

    The project extracted information from PDF resume files into categories.

    Python Crawler Data Science Machine Learning
  • Smart Apply Suggester 2018

    Suggesting application destinations for master's and Ph.D. programs

    Project Research

    The project was built to help students estimate their chances of admission to universities. It compared user data with historical data from universities, which included a mapping between universities, and drew inferences from the comparison.

    Python Crawler Data Science Machine Learning Data Mining
  • Academic Partnership Recommender 2018

    Data extraction from a paper and author dataset

    Project Research

    The first challenges were cleaning the data and extracting historical data to detect noise points. We then made count predictions for professors and suggested partners based on previous papers.

    Research Data mining
  • Semantic Web Survey and Improvement - NLP 2017

    Domain-specific ontology for aspect-based opinion mining

    Research

    In this paper, we discussed previous solutions for opinion mining and improved them using a domain-specific ontology. The ontology helped in many cases, such as aspect analysis and similar words.

    Research Ontology Opinion Mining NLP
  • Bachelor of Science Thesis 2016

    Designing an e-learning system with encouragement tactics and crowdsourced questions

    Research Project

    The mission was to design an e-learning platform for students and teachers collaborating on courses and classes. Teacher questions were crowdsourced, and the point-assignment process was also crowdsourced among students. Students were encouraged through feedback to continue solving problems.

    Thesis Research Project Web Developing Laravel Crowd sourcing
  • Designing Google Plus (Mock) 2015

    Designing and coding a social media platform like Google Plus

    Project

    The project was about coding a complete social media platform, including the administrator side. The stack was HTML, CSS, and JavaScript for front-end development, and PHP and MySQL for back-end development.

    Programming HTML CSS Javascript PHP Database Mysql Social Media
  • Database Design 2015

    Designing social media database structure and administrator phase

    Project

    The project was about designing an efficient database structure for social media, including building queries for several common requests, designing complex queries, and creating trigger events.

    Programming Mysql Postgres PHP Database Social Media
  • Compiler Fundamentals - Java Compiler 2014

    Designing and coding a Mini Java compiler

    Project Research

    The project was about designing and coding a minimal Java compiler. The process ran from the lexical phase to the semantic phase of the compiler. The compiler itself was written in C++.

    Programming C++ Compiler
  • Advanced Programming - Student Registration Service using C++ 2013

    Designing and Coding a functional student, teacher and course registration system.

    Project

    The project was a course selection system for student registration. Each student had an access code to log in, followed by commands to list the courses, select a course, get the list of selected courses, remove lists, and so on. Each teacher also had an access code and could register a new course and get the list of students registered in a course. The user interface was a terminal command line.

    Programming C++ System Design Object oriented Programming

Publications

The Role of Semantic Parsing in Understanding Procedural Text

Hossein Rajaby Faghihi, Parisa Kordjamshidi, Choh Man Teng, James Allen
Conference Paper, EACL 2023

Abstract

In this paper, we investigate whether symbolic semantic representations, extracted from deep semantic parsers, can help reasoning over the states of involved entities in a procedural text. We consider a deep semantic parser (TRIPS) and semantic role labeling as two sources of semantic parsing knowledge. First, we propose PROPOLIS, a symbolic parsing-based procedural reasoning framework. Second, we integrate semantic parsing information into state-of-the-art neural models to conduct procedural reasoning. Our experiments indicate that explicitly incorporating such semantic knowledge improves procedural understanding. This paper presents new metrics for evaluating procedural reasoning tasks that clarify the challenges and identify differences among neural, symbolic, and integrated models.

GLUECons: A Generic Benchmark for Learning under Constraints

Hossein Rajaby Faghihi, Aliakbar Nafar, Chen Zheng, Roshanak Mirzaee, Yue Zhang, Andrzej Uszok, Alexander Wan, Tanawan Premsri, Dan Roth, Parisa Kordjamshidi
Conference Paper, AAAI 2023

Abstract

Recent research has shown that integrating domain knowledge into deep learning architectures is effective: it helps reduce the amount of required data, improves the accuracy of the models' decisions, and improves the interpretability of models. However, the research community lacks a convened benchmark for systematically evaluating knowledge integration methods. In this work, we create a benchmark that is a collection of nine tasks in the domains of natural language processing and computer vision. In all cases, we model external knowledge as constraints, specify the sources of the constraints for each task, and implement various models that use these constraints. We report the results of these models using a new set of extended evaluation criteria in addition to the task performances for a more in-depth analysis. This effort provides a framework for a more comprehensive and systematic comparison of constraint integration techniques and for identifying related research challenges. It will facilitate further research for alleviating some problems of state-of-the-art neural models.

CrisisLTLSum: A Benchmark for Local Crisis Event Timeline Extraction and Summarization

Hossein Rajaby Faghihi, Bashar Alhafni, Ke Zhang, Shihao Ran, Joel Tetreault, Alejandro Jaimes
Conference Paper, EMNLP 2022

Abstract

Social media has increasingly played a key role in emergency response: first responders can use public posts to better react to ongoing crisis events and deploy the necessary resources where they are most needed. Timeline extraction and abstractive summarization are critical technical tasks to leverage large numbers of social media posts about events. Unfortunately, there are few datasets for benchmarking technical approaches for those tasks. This paper presents CrisisLTLSum, the largest dataset of local crisis event timelines available to date. CrisisLTLSum contains 1,000 crisis event timelines across four domains: wildfires, local fires, traffic, and storms. We built CrisisLTLSum using a semi-automated cluster-then-refine approach to collect data from the public Twitter stream. Our initial experiments indicate a significant gap between the performance of strong baselines and human performance on both tasks. Our dataset, code, and models are publicly available (https://github.com/CrisisLTLSum/CrisisTimelines).

DomiKnowS: A Library for Integration of Symbolic Domain Knowledge in Deep Learning

Hossein Rajaby Faghihi, Quan Guo, Andrzej Uszok, Aliakbar Nafar, Parisa Kordjamshidi
Conference Paper, EMNLP 2021

Abstract

We demonstrate a library for the integration of domain knowledge in deep learning architectures. Using this library, the structure of the data is expressed symbolically via graph declarations and the logical constraints over outputs or latent variables can be seamlessly added to the deep models. The domain knowledge can be defined explicitly, which improves the explainability of the models in addition to their performance and generalizability in the low-data regime. Several approaches for such integration of symbolic and sub-symbolic models have been introduced; however, there is no library to facilitate the programming for such integration in a generic way while various underlying algorithms can be used. Our library aims to simplify programming for such integration in both training and inference phases while separating the knowledge representation from learning algorithms. We showcase various NLP benchmark tasks and beyond. The framework is publicly available on GitHub (https://github.com/HLR/DomiKnowS).

SPARTQA: A Textual Question Answering Benchmark for Spatial Reasoning

Roshanak Mirzaee, Hossein Rajaby Faghihi, Qiang Ning, Parisa Kordjamshidi
Conference Paper, NAACL 2021

Abstract

This paper proposes a question-answering (QA) benchmark for spatial reasoning on natural language text that contains more realistic spatial phenomena not covered by prior work and is challenging for state-of-the-art language models (LMs). We propose a distant supervision method to improve on this task. Specifically, we design grammar and reasoning rules to automatically generate a spatial description of visual scenes and corresponding QA pairs. Experiments show that further pretraining LMs on these automatically generated data significantly improves LMs’ capability on spatial understanding, which in turn helps to better solve two external datasets, bAbI and boolQ. We hope that this work can foster investigations into more sophisticated models for spatial reasoning over text.

Time-Stamped Language Model: Teaching Language Models to Understand the Flow of Events

Hossein Rajaby Faghihi, Parisa Kordjamshidi
Conference Paper, NAACL 2021

Abstract

Tracking entities throughout a procedure described in a text is challenging due to the dynamic nature of the world described in the process. First, we propose to formulate this task as a question answering problem. This enables us to use transformer-based language models pre-trained on other QA benchmarks by adapting them to procedural text understanding. Second, since transformer-based language models cannot encode the flow of events by themselves, we propose a Time-Stamped Language Model (TSLM) that encodes event information in the LM architecture by introducing a timestamp encoding. Our model, evaluated on the Propara dataset, improves on the published state-of-the-art results with a 3.1% increase in F1 score. Moreover, our model yields better results on the location prediction task on the NPN-Cooking dataset. This result indicates that our approach is effective for procedural text understanding in general.

Latent Alignment of Procedural Concepts in Multimodal Recipes

Hossein Rajaby Faghihi, Roshanak Mirzaee, Sudarshan Paliwal, Parisa Kordjamshidi
Workshop Paper, ALVR (ACL) 2020

Abstract

We propose a novel alignment mechanism to deal with procedural reasoning on a newly released multimodal QA dataset named RecipeQA. Our model solves the textual cloze task, a reading comprehension task on a recipe containing images and instructions. We exploit the power of attention networks, cross-modal representations, and a latent alignment space between instructions and candidate answers to solve the problem. We introduce constrained max-pooling, which refines the max-pooling operation on the alignment matrix to impose disjoint constraints among the outputs of the model. Our evaluation result indicates a 19% improvement over the baselines.

Inference-Masked Loss for Deep Structured Output Learning

Quan Guo, Hossein Rajaby Faghihi, Yue Zhang, Andrzej Uszok, Parisa Kordjamshidi
Conference Paper, IJCAI 2020

Abstract

Structured learning algorithms usually involve an inference phase that selects the best global output variables assignments based on the local scores of all possible assignments. We extend deep neural networks with structured learning to combine the power of learning representations and leveraging the use of domain knowledge in the form of output constraints during training. Introducing a nondifferentiable inference module to gradient-based training is a critical challenge. Compared to using conventional loss functions that penalize every local error independently, we propose an inference-masked loss that takes into account the effect of inference and does not penalize the local errors that can be corrected by the inference. We empirically show the inference-masked loss combined with the negative log-likelihood loss improves the performance on different tasks, namely entity relation recognition on CoNLL04 and ACE2005 corpora, and spatial role labeling on CLEF 2017 mSpRL dataset. We show the proposed approach helps to achieve better generalizability, particularly in the low-data regime.

Hybrid Learning Approach Toward Situation Recognition and Handling

Hossein Rajaby Faghihi, MohammadAmin Fazli, Jafar Habibi
Journal Paper, Computer Journal

Abstract

We propose a novel hybrid learning approach to gain situation awareness in smart environments by introducing a new situation identifier that combines an expert system and a machine learning approach. Traditionally, expert systems and machine learning approaches have been widely used independently to detect ongoing situations as the main functionality in smart environments in various domains. Expert systems lack the functionality to adapt the system to each user and are expensive to design based on each setting. On the other hand, machine learning approaches fail in the challenge of cold start and making explainable decisions. Using both of these approaches enables the system to use user’s feedback and capture environmental changes while exploiting the initial expert knowledge to solve the mentioned challenges. We use decision trees and situation templates as the core structure to interpret sensor data. To evaluate the proposed method, we generate a new human-annotated dataset simulating a smart environment. Our experiments show superior results compared with the initial expert system and the machine learning approach while preserving the initial expert system’s interpretability.